CN113497935A - Video coding and decoding method and device


Info

Publication number
CN113497935A
CN113497935A
Authority
CN
China
Prior art keywords
palette
mode
tree
chroma
palette table
Prior art date
Legal status
Pending
Application number
CN202110363246.5A
Other languages
Chinese (zh)
Inventor
朱弘正
陈漪纹
修晓宇
马宗全
陈伟
王祥林
于冰
Current Assignee
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Publication of CN113497935A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96: Tree coding, e.g. quad-tree coding
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124: Quantisation
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present disclosure provides a video encoding and decoding method and apparatus. The video encoding method includes: dividing a video image into a plurality of coding units (CUs); predicting, based on a shared palette table, a palette table of at least one CU divided under the same parent node among the plurality of CUs; wherein the shared palette table is a palette table derived at one of all ancestor nodes of the at least one CU. The method and apparatus can improve coding and decoding efficiency.

Description

Video coding and decoding method and device
This application claims priority to U.S. patent application No. US 63/005,300, entitled "Video coding using palette mode", filed with the U.S. Patent Office on April 4, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates generally to the field of video encoding and compression processing, and more particularly, to a video encoding method, a video decoding method, and an apparatus.
Background
In the field of video encoding and compression processing technology, video data may be compressed using various video coding techniques, and video coding may be performed according to at least one video coding standard, which may include, for example: Versatile Video Coding (VVC), the Joint Exploration Model (JEM), High Efficiency Video Coding (H.265/HEVC), Advanced Video Coding (H.264/AVC), Moving Picture Experts Group (MPEG) coding, and the like. An important goal of video coding techniques is to avoid or minimize degradation of video quality while compressing the video data into a form that uses a lower bit rate.
The first version of the HEVC standard, finalized in October 2013, saves about 50% of the bit rate of the previous-generation video coding standard H.264/MPEG AVC at equivalent perceptual quality. Although this version of the HEVC standard provides significant coding improvements over its predecessor, there is evidence that coding efficiency superior to HEVC can be achieved with additional coding tools. On this basis, both the Video Coding Experts Group (VCEG) and MPEG began the quest for new coding technologies for future video coding standardization. ITU-T VCEG and ISO/IEC MPEG established the Joint Video Exploration Team (JVET) in October 2015 to begin significant research into advanced technologies that could greatly improve coding efficiency. JVET maintains a reference software called the Joint Exploration Model (JEM) by integrating a number of additional coding tools on top of the HEVC test model (HM).
In October 2017, ITU-T and ISO/IEC published a joint Call for Proposals (CfP) for video compression with performance exceeding HEVC. In April 2018, 23 CfP responses were received and evaluated at the 10th JVET meeting, and the results indicated a compression efficiency gain of about 40% over HEVC. On this basis, JVET started a new project to develop the new-generation video coding standard, named Versatile Video Coding (VVC), and in the same month a reference software codebase called the VVC Test Model (VTM) was established to demonstrate a reference implementation of the VVC standard.
In general, the basic intra prediction scheme applied in VVC is the same as that of HEVC, except that several modules are further extended and/or improved, e.g., the matrix weighted intra prediction (MIP) coding mode, the intra sub-partition (ISP) coding mode, extended intra prediction with wide-angle intra directions, position-dependent intra prediction combination (PDPC), and 4-tap intra interpolation.
The intra prediction schemes applied in VVC mainly include: intra prediction modes with wide-angle intra directions, position-dependent intra prediction combination, multiple transform selection and block-size-adaptive transform selection, the intra sub-partition coding mode, cross-component linear model prediction, the palette mode, and quantized residual differential pulse code modulation. When only a few dominant colors exist in a picture, use of the palette mode can improve coding efficiency, but coding efficiency and coded image quality still need to be improved.
Disclosure of Invention
Exemplary embodiments of the present disclosure provide a video encoding method, a video decoding method, and an apparatus that improve the existing palette mode, thereby improving codec efficiency and facilitating efficient codec hardware implementation.
According to a first aspect of the embodiments of the present disclosure, there is provided a video encoding method, including: dividing a video image into a plurality of coding units (CUs); predicting, based on a shared palette table, a palette table of at least one CU divided under the same parent node among the plurality of CUs; wherein the shared palette table is a palette table derived at one of all ancestor nodes of the at least one CU.
Optionally, the one ancestor node is: an ancestor node of all ancestor nodes of the at least one CU that meets a predetermined condition with respect to a predetermined size threshold.
Optionally, the one ancestor node is: a largest ancestor node of all ancestor nodes of the at least one CU that is equal to or less than the predetermined size threshold; or the one ancestor node is: a smallest ancestor node of all ancestor nodes of the at least one CU that is equal to or greater than the predetermined size threshold.
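The two alternatives above for selecting the shared-palette ancestor node can be sketched as follows. This is an illustrative helper only; the function name, the size representation, and the interface are assumptions, not taken from this disclosure:

```python
def select_shared_palette_ancestor(ancestor_sizes, threshold, mode="largest_leq"):
    """Pick the ancestor node whose size satisfies the predetermined
    condition relative to the predetermined size threshold.

    ancestor_sizes: sizes (in samples) of the CU's ancestor nodes,
                    ordered from root to immediate parent.
    mode: "largest_leq"  -> largest ancestor whose size <= threshold
          "smallest_geq" -> smallest ancestor whose size >= threshold
    Returns the index of the chosen ancestor, or None if no
    ancestor satisfies the condition.
    """
    if mode == "largest_leq":
        candidates = [(size, i) for i, size in enumerate(ancestor_sizes) if size <= threshold]
        return max(candidates)[1] if candidates else None
    if mode == "smallest_geq":
        candidates = [(size, i) for i, size in enumerate(ancestor_sizes) if size >= threshold]
        return min(candidates)[1] if candidates else None
    raise ValueError(mode)
```

The palette table derived at the selected ancestor would then serve as the shared palette table for all leaf CUs below it.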
Optionally, the method further comprises: the chroma palette table of a CU is predicted based on the luma palette table of the CU using a cross-component linear model.
Optionally, the method further comprises: in palette mode, signaling a scanning direction of a CU based on different contexts of a shape of the CU; or in the palette mode, when a shape of a CU satisfies a predetermined shape condition, determining a scanning direction of the CU based on the shape of the CU without signaling the scanning direction of the CU.
Optionally, the step of determining the scanning direction of the CU based on the shape of the CU comprises: determining a same direction as a longer side of the CU as a scanning direction of the CU, or determining a same direction as a shorter side of the CU as a scanning direction of the CU.
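The shape-based inference of the scan direction described above can be sketched as follows; treating "non-square" as the predetermined shape condition is an assumption for illustration:

```python
def palette_scan_direction(width, height, along_longer_side=True):
    """Infer the palette scan direction from the CU shape when the
    shape satisfies the predetermined condition (assumed here to be
    "non-square"), so that no scan-direction flag needs to be
    signalled. Returns "horizontal" or "vertical", or None for
    square CUs, for which the direction would still be signalled."""
    if width == height:
        return None  # shape condition not met: direction must be signalled
    longer_is_horizontal = width > height
    if along_longer_side:
        return "horizontal" if longer_is_horizontal else "vertical"
    return "vertical" if longer_is_horizontal else "horizontal"
```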
Optionally, the method further comprises: in palette mode, a CU is divided into multiple segments and palette related syntax for each segment of the same CU is encoded independently.
Optionally, the method further comprises: in the palette mode, a CU is divided into a plurality of segments, and palette related data of the respective segments of the same CU are independently cached.
Optionally, the step of dividing the CU into a plurality of segments comprises: dividing the CU into the plurality of segments according to the scanning direction of the CU; or, the CU is divided into the plurality of segments according to a binary tree or quadtree partitioning structure.
Optionally, the multiple segments of the same CU share one palette table, or the segments of the same CU use respective palette tables separately.
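A minimal sketch of dividing a CU into segments along the scan direction, so that each segment's palette-related syntax and data can be coded and cached independently. The even row/column split below is an assumption for illustration; the text also allows binary-tree or quadtree partitioning of the CU into segments:

```python
def split_cu_into_segments(width, height, num_segments, scan="horizontal"):
    """Split a CU's sample grid into segments along the scan direction.
    Returns a list of (y0, y1, x0, x1) half-open sample ranges, one
    per segment. The last segment absorbs any remainder rows/columns."""
    segments = []
    if scan == "horizontal":               # horizontal scan: bands of rows
        step = height // num_segments
        for k in range(num_segments):
            y1 = height if k == num_segments - 1 else (k + 1) * step
            segments.append((k * step, y1, 0, width))
    else:                                  # vertical scan: bands of columns
        step = width // num_segments
        for k in range(num_segments):
            x1 = width if k == num_segments - 1 else (k + 1) * step
            segments.append((0, height, k * step, x1))
    return segments
```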
Optionally, the method further comprises: in the case of a separate block tree structure, the palette index map of the corresponding chroma CU is predicted based on the palette index map of the luma CU in a cross-component manner.
Optionally, the method further comprises: signaling delta QP for a CU encoded in palette mode.
Optionally, the step of signaling a delta QP for a CU encoded in palette mode comprises: signaling delta QP for a CU encoded in palette mode if there are escape samples; or always signal delta QP for a CU encoded in palette mode; or signaling delta QP for a luma component and delta QP for a chroma component of a CU encoded in palette mode, respectively.
Optionally, the method further comprises: signaling information of a QP to indicate whether a current palette mode is a lossless palette mode, wherein in case a value of the QP is equal to or less than a predetermined threshold, indicating that the current palette mode is a lossless palette mode; indicating that the current palette mode is not a lossless palette mode if the value of the QP is greater than a predetermined threshold.
Alternatively, the operation of quantizing the escape samples in the palette mode is the same as the operation of quantizing the samples in the other modes.
Optionally, the other modes include at least one of: transform skip mode, transform mode, quantized residual differential pulse code modulation, RDPCM, mode.
Optionally, the operation of lossless coding of escape samples in palette mode is the same as the operation of lossless coding of samples in other modes.
Optionally, the other mode is a transform skip mode.
Alternatively, the operation of lossless coding of escape samples in palette mode does not include a quantization operation, or the quantization operation in the operation of lossless coding of escape samples in palette mode is performed based on a QP less than or equal to a predetermined threshold.
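The relation between the QP and quantization of escape samples can be illustrated with a uniform scalar quantizer using the HEVC/VVC-style step size 2^((QP-4)/6); real codecs use integer scaling tables rather than floating point, so this is only an approximate sketch. At QP = 4 the step size is 1, which is why a QP at or below such a threshold can act as a lossless indicator:

```python
def quantize_escape(sample, qp):
    """Uniform scalar quantization of an escape sample with
    step = 2**((QP - 4) / 6) (simplified floating-point sketch)."""
    step = 2.0 ** ((qp - 4) / 6.0)
    return int(round(sample / step))

def dequantize_escape(level, qp):
    """Matching inverse quantization of an escape sample level."""
    step = 2.0 ** ((qp - 4) / 6.0)
    return int(round(level * step))
```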
Optionally, the method further comprises: performing one of the following binarization processes on the escape sample: fixed length binarization processing, k-th order exponential golomb binarization processing, truncated binary codeword binarization processing, wherein parameters of the binarization processing are determined based on parameters of a CU currently being processed.
Optionally, the fixed length is determined based on a size of the QP and/or bit depth; or k is determined based on the size of the QP and/or bit depth; or the maximum value of the truncated binary codeword is determined based on the size of the QP and/or bit depth.
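Two of the binarizations named above can be sketched directly. How k or the fixed length would be derived from the QP and/or bit depth is left open here, since that mapping is not specified in the text:

```python
def exp_golomb_k(value, k):
    """k-th order Exp-Golomb binarization of a non-negative value.
    Returns the codeword as a bit string: value + 2**k written in
    binary, preceded by (bit_length - 1 - k) leading zeros."""
    v = value + (1 << k)
    prefix_len = v.bit_length() - 1 - k
    return "0" * prefix_len + format(v, "b")

def fixed_length(value, n_bits):
    """Fixed-length binarization of a non-negative value on n_bits."""
    return format(value, "b").zfill(n_bits)
```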
Alternatively, in the lossless palette mode, fixed-length binarization is applied to the escape samples.
Optionally, the method further comprises: the way of binarization processing of the escape samples is adaptively selected at different coding levels.
Optionally, the method further comprises: signaling palette mode related information; wherein the palette mode related information comprises at least one of: information indicating a maximum allowed palette size; information indicating a maximum allowed palette area; information indicating a maximum allowed palette predictor size; information indicating a difference between the maximum allowed palette predictor size and the maximum allowed palette size; information indicating that syntax for initializing a sequence palette predictor is to be transmitted; information indicating the number of entries of the palette predictor initializer minus 1; information indicating a component value used to initialize the i-th palette entry of the palette predictor array; information indicating the bit depth value of the luma component of an entry of the palette predictor initializer minus 8; information indicating the bit depth value of the chroma component of an entry of the palette predictor initializer minus 8; information indicating the bit depth value of the luma component of an entry of the palette minus 8; information indicating the bit depth value of the chroma component of an entry of the palette minus 8.
Optionally, the method further comprises: rate-distortion analysis is performed on the palette mode based on the precision of the internal bit depth.
Optionally, the step of rate-distortion analyzing the palette mode based on the accuracy of the internal bit depth comprises: performing a rate-distortion analysis for deriving a palette in a palette mode based on the precision of the internal bit depth; or the accuracy of the distortion calculation for selecting the index of the nearest palette entry is equal to the accuracy of the internal bit depth.
Optionally, the method further comprises: performing a palette-mode correlation rate-distortion analysis, wherein different weights are used for the luma component cost and the chroma component cost in the correlation rate-distortion analysis, or different weights are used for the L1 norm difference and the L2 norm difference in the correlation rate-distortion analysis.
Optionally, chroma component distortion is reduced in a total cost calculation of the correlation rate-distortion analysis.
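A sketch of a distortion term with separate luma/chroma weights and separate L1/L2 weights, as described above. The particular weight values (e.g. halving the chroma contribution) are illustrative assumptions, not values taken from this disclosure:

```python
def weighted_palette_distortion(orig, recon, luma_w=1.0, chroma_w=0.5,
                                l1_w=0.0, l2_w=1.0):
    """Distortion for palette-mode rate-distortion analysis.

    orig, recon: dicts {"Y": [...], "Cb": [...], "Cr": [...]} of
    original and reconstructed sample values. Luma and chroma
    differences are weighted separately, and the L1 and L2 norm
    contributions are weighted separately."""
    total = 0.0
    for comp in ("Y", "Cb", "Cr"):
        w = luma_w if comp == "Y" else chroma_w
        for o, r in zip(orig[comp], recon[comp]):
            d = abs(o - r)
            total += w * (l1_w * d + l2_w * d * d)
    return total
```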
Optionally, the method further comprises: deriving the actual quantization parameter value QP_esca used for the quantization process of the escape samples as:
QP_esca = MIN(((MAX(4, QP_cu) - 2) / 6) * 6 + 4, 61)
where QP_cu is the QP value of the current CU.
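The derivation above can be written directly as code; taking "/" as integer division is an assumption here, following the usual convention of codec specifications:

```python
def derive_escape_qp(qp_cu):
    """Escape-sample QP derivation:
    QP_esca = MIN(((MAX(4, QP_cu) - 2) / 6) * 6 + 4, 61),
    with "/" interpreted as integer division (an assumption)."""
    return min(((max(4, qp_cu) - 2) // 6) * 6 + 4, 61)
```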
Alternatively, the number of distortion thresholds for the number of palette entries when deriving the palette table of the current CU in lossy palette mode is extended to 64, where the increased threshold is related to the quantization error.
Optionally, the method further comprises: signaling at least one of a maximum palette table size and a palette predictor size, wherein at least one of the maximum palette table size and the palette predictor size is variable.
Optionally, the method further comprises: disabling the palette mode for CUs having a size less than a first predetermined threshold; or in the dual-tree case, disabling the palette mode for chroma CUs having a size less than a second predetermined threshold; or in the single tree case, disabling the palette mode for CUs having chroma samples with a total size less than a third predetermined threshold; or in the case of local dual-tree, disabling the palette mode for CUs having a size less than a fourth predetermined threshold; or in the case of local dual-tree, disabling the palette mode; or in the local dual-tree case, the palette mode is disabled for chroma CUs.
Optionally, the method further comprises: in the local dual-tree case, the update process of the palette prediction is performed on both the luminance CU and the chrominance CU, where the luminance CU is encoded while the palette prediction is updated, and then the chrominance CU is encoded under the same local dual-tree.
Optionally, the method further comprises: in the case of local dual-tree, disabling palette table updates for the shared palette table; or in the case of local dual-tree, disabling palette table updating; or in the local dual-tree case, disabling palette table updates for CUs having luma samples with a total size less than a fifth predetermined threshold; or in the local dual-tree case, the palette table update is disabled for chroma CUs.
Optionally, the method further comprises: independently performing palette prediction update processing for different color components in the local dual tree; where palette mode coding is performed on the luma component and the chroma component in parallel, or the luma CU is coded while palette prediction is updated, and then the chroma CU is coded under the same local dual-tree.
Optionally, the step of performing the update process of the palette prediction independently for different color components in the local dual tree includes: when the palette prediction is updated while coding a CU of one color component under the local dual tree, the value of another color component of a previously available candidate in the palette is used, where a luma CU is coded while updating the palette prediction and then a chroma CU is coded under the same local dual tree.
Optionally, the method further comprises: when encoding a luma CU and/or a chroma CU under a local dual tree, when updating a palette predictor using entries copied from the palette predictor, using all component values of the palette entries copied from the palette predictor; alternatively, when encoding a luma CU and/or a chroma CU under a local dual-tree, when updating a palette predictor using entries copied from the palette predictor, component values missing from the palette entries copied from the palette predictor are replaced with default values.
Optionally, the default value is related to an internal bit depth.
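A sketch of filling in missing colour components when reusing a predictor entry under a local dual tree (e.g. when only the luma component of the entry was coded). Tying the default to the internal bit depth as mid-grey at 10-bit depth (512 = 1 << 9) is an illustrative assumption:

```python
def update_predictor_entry(copied_entry, default_value=512):
    """Complete a palette entry copied from the palette predictor.
    copied_entry: dict of the component values that are present,
    e.g. {"Y": 100}. Missing components are replaced with
    default_value (assumed to be derived from the internal bit
    depth, here mid-grey at 10 bits)."""
    return {comp: copied_entry.get(comp, default_value)
            for comp in ("Y", "Cb", "Cr")}
```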
According to a second aspect of the embodiments of the present disclosure, there is provided a video decoding method, including: receiving and parsing a bitstream; obtaining, from the parsed bitstream, a plurality of coding units (CUs) into which a video image is divided; predicting, based on a shared palette table, a palette table of at least one CU divided under the same parent node among the plurality of CUs; wherein the shared palette table is a palette table derived at one of all ancestor nodes of the at least one CU.
Optionally, the one ancestor node is: an ancestor node of all ancestor nodes of the at least one CU that meets a predetermined condition with respect to a predetermined size threshold.
Optionally, the one ancestor node is: a largest ancestor node of all ancestor nodes of the at least one CU that is equal to or less than the predetermined size threshold; or the one ancestor node is: a smallest ancestor node of all ancestor nodes of the at least one CU that is equal to or greater than the predetermined size threshold.
Optionally, the method further comprises: the chroma palette table of a CU is predicted based on the luma palette table of the CU using a cross-component linear model.
Optionally, the method further comprises: in palette mode, receiving signaling about a scanning direction of a CU, wherein the signaling is sent based on different contexts of a shape of the CU; or in the palette mode, when a shape of a CU satisfies a predetermined shape condition, determining a scanning direction of the CU based on the shape of the CU without receiving signaling regarding the scanning direction of the CU.
Optionally, the step of determining the scanning direction of the CU based on the shape of the CU comprises: determining a same direction as a longer side of the CU as a scanning direction of the CU, or determining a same direction as a shorter side of the CU as a scanning direction of the CU.
Optionally, the method further comprises: in palette mode, a CU is divided into multiple segments and palette-related syntax for the individual segments of the same CU is decoded independently.
Optionally, the method further comprises: in the palette mode, a CU is divided into a plurality of segments, and palette related data of the respective segments of the same CU are independently cached.
Optionally, the step of dividing the CU into a plurality of segments comprises: dividing the CU into the plurality of segments according to the scanning direction of the CU; or, the CU is divided into the plurality of segments according to a binary tree or quadtree partitioning structure.
Optionally, the multiple segments of the same CU share one palette table, or the segments of the same CU use respective palette tables separately.
Optionally, the method further comprises: in the case of a separate block tree structure, the palette index map of the corresponding chroma CU is predicted based on the palette index map of the luma CU in a cross-component manner.
Optionally, the method further comprises: signaling is received regarding delta QP for a CU encoded in palette mode.
Optionally, the step of receiving signaling of delta QP for a CU encoded in palette mode comprises: receiving signaling of delta QP for a CU encoded in palette mode, if there are escape samples; or always receive signaling of delta QP for a CU encoded in palette mode; or receiving signaling regarding delta QP for a luma component of the CU encoded in the palette mode and signaling regarding delta QP for a chroma component of the CU encoded in the palette mode, respectively.
Optionally, the method further comprises: receiving information of a QP and determining whether a current palette mode is a lossless palette mode based on the information of the QP, wherein in the case that the value of the QP is equal to or less than a predetermined threshold, the current palette mode is determined to be the lossless palette mode; determining that the current palette mode is not a lossless palette mode if the value of the QP is greater than a predetermined threshold.
Alternatively, the operation of inverse quantizing the escape samples in the palette mode is the same as the operation of inverse quantizing the samples in the other modes.
Optionally, the other modes include at least one of: transform skip mode, transform mode, quantized residual differential pulse code modulation, RDPCM, mode.
Optionally, the operation of lossless decoding of escape samples in palette mode is the same as the operation of lossless decoding of samples in other modes.
Optionally, the other mode is a transform skip mode.
Alternatively, the operation of lossless decoding of escape samples in the palette mode does not include an inverse quantization operation, or an inverse quantization operation among the operations of lossless decoding of escape samples in the palette mode is performed based on a QP less than or equal to a predetermined threshold.
Optionally, the method further comprises: performing one of the following inverse binarization processes on the escape sample: fixed length inverse binarization processing, k-th order exponential golomb inverse binarization processing, truncated binary codeword inverse binarization processing, wherein parameters of the inverse binarization processing are determined based on parameters of a CU currently being processed.
Optionally, the fixed length is determined based on a size of the QP and/or bit depth; or k is determined based on the size of the QP and/or bit depth; or the maximum value of the truncated binary codeword is determined based on the size of the QP and/or bit depth.
Alternatively, in lossless palette mode, the escape samples are subjected to fixed length inverse binarization processing.
Optionally, the method further comprises: the manner in which the inverse binarization processing is performed on the escape samples is adaptively selected at different decoding levels.
Optionally, the method further comprises: receiving signaling regarding palette mode related information; wherein the palette mode related information comprises at least one of: information indicating a maximum allowed palette size; information indicating a maximum allowed palette area; information indicating a maximum allowed palette predictor size; information indicating a difference between the maximum allowed palette predictor size and the maximum allowed palette size; information indicating that syntax for initializing a sequence palette predictor is to be transmitted; information indicating the number of entries of the palette predictor initializer minus 1; information indicating a component value used to initialize the i-th palette entry of the palette predictor array; information indicating the bit depth value of the luma component of an entry of the palette predictor initializer minus 8; information indicating the bit depth value of the chroma component of an entry of the palette predictor initializer minus 8; information indicating the bit depth value of the luma component of an entry of the palette minus 8; information indicating the bit depth value of the chroma component of an entry of the palette minus 8.
Optionally, the method further comprises: deriving the actual quantization parameter value QP_esca used for the inverse quantization process of the escape samples as:
QP_esca = MIN(((MAX(4, QP_cu) - 2) / 6) * 6 + 4, 61)
where QP_cu is the QP value of the current CU.
Alternatively, the number of distortion thresholds for the number of palette entries when deriving the palette table of the current CU in lossy palette mode is extended to 64, where the increased threshold is related to the quantization error.
Optionally, the method further comprises: receiving signaling regarding at least one of a maximum palette table size and a palette predictor size, wherein the at least one of the maximum palette table size and the palette predictor size is variable.
Optionally, the method further comprises: disabling the palette mode for CUs having a size less than a first predetermined threshold; or in the dual-tree case, disabling the palette mode for chroma CUs having a size less than a second predetermined threshold; or in the single tree case, disabling the palette mode for CUs having chroma samples with a total size less than a third predetermined threshold; or in the case of local dual-tree, disabling the palette mode for CUs having a size less than a fourth predetermined threshold; or in the case of local dual-tree, disabling the palette mode; or in the local dual-tree case, the palette mode is disabled for chroma CUs.
Optionally, the method further comprises: in the local dual-tree case, the update process of the palette prediction is performed on both the luminance CU and the chrominance CU, where the luminance CU is decoded while the palette prediction is updated, and then the chrominance CU is decoded under the same local dual-tree.
Optionally, the method further comprises: under the condition of local dual-tree, forbidding updating of the palette table aiming at the shared palette table; or in the case of local dual-tree, disabling palette table updating; or in the local dual-tree case, disabling palette table updates for CUs having luma samples with a total size less than a fifth predetermined threshold; or in the local dual-tree case, the palette table update is disabled for chroma CUs.
Optionally, the method further comprises: independently performing palette prediction update processing for different color components in the local dual tree; where palette mode decoding is performed in parallel for the luma component and the chroma components, or the luma CU is decoded while the palette prediction is updated, and then the chroma CU is decoded under the same local dual-tree.
Optionally, the step of performing the update process of the palette prediction independently for different color components in the local dual tree includes: when updating the palette prediction while decoding a CU of one color component under the local dual tree, the value of another color component of a previously available candidate in the palette is used, wherein a luma CU is decoded while updating the palette prediction, and then a chroma CU is decoded under the same local dual tree.
Optionally, the method further comprises: when decoding a luma CU and/or a chroma CU under a local dual tree, when updating a palette predictor using entries copied from the palette predictor, using all component values of the palette entries copied from the palette predictor; alternatively, when decoding a luma CU and/or a chroma CU under a local dual-tree, when updating a palette predictor using entries copied from the palette predictor, component values missing from the palette entries copied from the palette predictor are replaced with default values.
Optionally, the default value is related to an internal bit depth.
According to a third aspect of the embodiments of the present disclosure, there is provided a video encoding apparatus, including: a CU partition unit configured to divide a video image into a plurality of coding units (CUs); and a palette table prediction unit configured to predict, based on a shared palette table, a palette table of at least one CU divided under the same parent node among the plurality of CUs; wherein the shared palette table is a palette table derived at one of all ancestor nodes of the at least one CU.
Optionally, the one ancestor node is: an ancestor node of all ancestor nodes of the at least one CU that meets a predetermined condition with respect to a predetermined size threshold.
Optionally, the one ancestor node is: a largest ancestor node of all ancestor nodes of the at least one CU that is equal to or less than the predetermined size threshold; or the one ancestor node is: a smallest ancestor node of all ancestor nodes of the at least one CU that is equal to or greater than the predetermined size threshold.
Optionally, the apparatus further comprises: a chroma palette table prediction unit configured to: the chroma palette table of a CU is predicted based on the luma palette table of the CU using a cross-component linear model.
Optionally, the apparatus further comprises: a signal transmitting unit configured to: in palette mode, signaling a scanning direction of a CU based on different contexts of a shape of the CU; or a scanning direction determination unit configured to: in the palette mode, when a shape of a CU satisfies a predetermined shape condition, a scanning direction of the CU is determined based on the shape of the CU without signaling the scanning direction of the CU.
Alternatively, the scanning direction determining unit determines a same direction as a longer side of the CU as the scanning direction of the CU, or determines a same direction as a shorter side of the CU as the scanning direction of the CU.
Optionally, the apparatus further comprises: a segment encoding unit configured to: in palette mode, a CU is divided into multiple segments and palette related syntax for each segment of the same CU is encoded independently.
Optionally, the apparatus further comprises: a cache unit configured to: in the palette mode, a CU is divided into a plurality of segments, and palette related data of the respective segments of the same CU are independently cached.
Optionally, the process of dividing the CU into a plurality of segments includes: dividing the CU into the plurality of segments according to the scanning direction of the CU; or, the CU is divided into the plurality of segments according to a binary tree or quadtree partitioning structure.
Optionally, the multiple segments of the same CU share one palette table, or the segments of the same CU use respective palette tables separately.
Optionally, the apparatus further comprises: a palette index map prediction unit configured to: in the case of a separate block tree structure, the palette index map of the corresponding chroma CU is predicted based on the palette index map of the luma CU in a cross-component manner.
Optionally, the apparatus further comprises: a signal transmitting unit configured to: signaling delta QP for a CU encoded in palette mode.
Optionally, the signaling unit signals delta QP for the CU encoded in palette mode in the presence of escape samples; or the signaling unit always signals the delta QP of the CU encoded in palette mode; or the signaling unit signals a delta QP for a luminance component and a delta QP for a chrominance component of the CU encoded in the palette mode, respectively.
Optionally, the apparatus further comprises: a signal transmitting unit configured to: signaling information of a QP to indicate whether the current palette mode is a lossless palette mode, wherein, when the value of the QP is equal to or less than a predetermined threshold, the current palette mode is indicated to be the lossless palette mode; when the value of the QP is greater than the predetermined threshold, the current palette mode is indicated not to be the lossless palette mode.
Alternatively, the operation of quantizing the escape samples in the palette mode is the same as the operation of quantizing the samples in the other modes.
Optionally, the other modes include at least one of: transform skip mode, transform mode, quantized residual differential pulse code modulation, RDPCM, mode.
Optionally, the operation of lossless coding of escape samples in palette mode is the same as the operation of lossless coding of samples in other modes.
Optionally, the other mode is a transform skip mode.
Alternatively, the operation of lossless coding of escape samples in palette mode does not include a quantization operation, or the quantization operation in the operation of lossless coding of escape samples in palette mode is performed based on a QP less than or equal to a predetermined threshold.
Optionally, the apparatus further comprises: a binarization processing unit configured to: performing one of the following binarization processes on the escape sample: fixed length binarization processing, k-th order exponential golomb binarization processing, truncated binary codeword binarization processing, wherein parameters of the binarization processing are determined based on parameters of a CU currently being processed.
Optionally, the fixed length is determined based on a size of the QP and/or bit depth; or k is determined based on the size of the QP and/or bit depth; or the maximum value of the truncated binary codeword is determined based on the size of the QP and/or bit depth.
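As a sketch of two of the binarization schemes named above: generic fixed-length and k-th order Exp-Golomb codeword construction. How the length n and the order k would be derived from the QP and/or bit depth is not specified in the text, so they are left as parameters here:

```python
def fixed_length_bin(x, n):
    """Fixed-length binarization of non-negative x using n bits
    (n would be derived from QP and/or bit depth as described above)."""
    return format(x, f"0{n}b")

def exp_golomb_k(x, k):
    """k-th order Exp-Golomb codeword for non-negative x:
    the EG0 code of (x >> k) followed by the k low-order bits of x."""
    q = (x >> k) + 1
    m = q.bit_length() - 1
    prefix = "0" * m + format(q, "b")                    # EG0 prefix part
    suffix = format(x & ((1 << k) - 1), f"0{k}b") if k else ""
    return prefix + suffix
```

For example, with k = 0 the codewords for 0, 1, 2 are "1", "010", "011"; larger k shortens codewords for large values at the cost of small ones.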
Alternatively, the binarization processing unit performs fixed-length binarization processing on the escape samples in a lossless palette mode.
Optionally, the apparatus further comprises: a binarization processing unit configured to: the way of binarization processing of the escape samples is adaptively selected at different coding levels.
Optionally, the apparatus further comprises: a signal transmitting unit configured to: signaling palette mode related information; wherein the palette mode related information comprises at least one of: information indicating the maximum allowed palette size, information indicating the maximum allowed palette area, information indicating the maximum allowed palette predictor size, information indicating the difference between the maximum allowed palette predictor size and the maximum allowed palette size, information indicating that syntax for initializing the sequence palette predictor will be transmitted, information indicating the number of entries of the palette predictor initializer minus 1, information indicating the component value used to initialize the i-th palette entry of the palette predictor array, information indicating the bit depth value of the luma component of an entry of the palette predictor initializer minus 8, information indicating the bit depth value of the chroma component of an entry of the palette predictor initializer minus 8, information indicating the bit depth value of the luma component of an entry of the palette minus 8, and information indicating the bit depth value of the chroma component of an entry of the palette minus 8.
Optionally, the apparatus further comprises: a rate-distortion analysis unit configured to: rate-distortion analysis is performed on the palette mode based on the precision of the internal bit depth.
Optionally, the rate-distortion analysis unit performs rate-distortion analysis for deriving the palette in the palette mode based on the precision of the internal bit depth; or the accuracy of the distortion calculation for selecting the index of the nearest palette entry is equal to the accuracy of the internal bit depth.
Optionally, the apparatus further comprises: a correlation rate distortion analysis unit configured to: performing a palette-mode correlation rate-distortion analysis, wherein different weights are used for the luma component cost and the chroma component cost in the correlation rate-distortion analysis, or different weights are used for the L1 norm difference and the L2 norm difference in the correlation rate-distortion analysis.
Optionally, chroma component distortion is reduced in a total cost calculation of the correlation rate-distortion analysis.
Optionally, the apparatus further comprises: a QP actual value determination unit configured to: derive the actual value QPesca of the quantization parameter QP used for the quantization process of the escape samples by:

QPesca = MIN(((MAX(4, QPcu) – 2) / 6) * 6 + 4, 61)

where QPcu is the QP value of the current CU.
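Under the assumption that the division in the formula above is integer division (typical for QP derivations, though the text does not say so explicitly), the mapping can be sketched as:

```python
def qp_escape(qp_cu):
    """Escape-sample QP per the formula in the text:
    QPesca = MIN(((MAX(4, QPcu) - 2) / 6) * 6 + 4, 61),
    with '/' taken as integer (floor) division."""
    return min(((max(4, qp_cu) - 2) // 6) * 6 + 4, 61)
```

With this reading, the result is clamped into a stepped range: very small CU QPs map to 4, and values are capped at 61.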
Alternatively, when deriving the palette table of the current CU in the lossy palette mode, the number of distortion thresholds used for determining the number of palette entries is extended to 64, wherein the added thresholds are related to the quantization error.
Optionally, the apparatus further comprises: a signal transmitting unit configured to: signaling at least one of a maximum palette table size and a palette predictor size, wherein at least one of the maximum palette table size and the palette predictor size is variable.
Optionally, the apparatus further comprises: a palette mode disabling unit configured to: disabling the palette mode for CUs having a size less than a first predetermined threshold; or in the dual-tree case, disabling the palette mode for chroma CUs having a size less than a second predetermined threshold; or in the single tree case, disabling the palette mode for CUs having chroma samples with a total size less than a third predetermined threshold; or in the case of local dual-tree, disabling the palette mode for CUs having a size less than a fourth predetermined threshold; or in the case of local dual-tree, disabling the palette mode; or in the local dual-tree case, the palette mode is disabled for chroma CUs.
Optionally, the apparatus further comprises: a palette prediction update unit configured to: in the local dual-tree case, the update process of the palette prediction is performed on both the luminance CU and the chrominance CU; a palette mode encoding unit configured to: the luma CU is coded while the palette prediction is updated, and then the chroma CU is coded under the same local dual-tree.
Optionally, the apparatus further comprises: a palette table update disabling unit configured to: under the condition of local dual-tree, forbidding updating of the palette table aiming at the shared palette table; or in the case of local dual-tree, disabling palette table updating; or in the local dual-tree case, disabling palette table updates for CUs having luma samples with a total size less than a fifth predetermined threshold; or in the local dual-tree case, the palette table update is disabled for chroma CUs.
Optionally, the apparatus further comprises: a palette prediction update unit configured to: independently performing palette prediction update processing for different color components in the local dual tree; a palette mode encoding unit configured to: palette mode coding is performed for the luma component and the chroma component in parallel, or the luma CU is coded while the palette prediction is updated, and then the chroma CU is coded under the same local dual tree.
Alternatively, the palette mode encoding unit encodes the luminance CU while updating the palette prediction, and then encodes the chrominance CU under the same local dual tree; the palette prediction update unit uses a value of another color component of a previously available candidate in the palette when updating the palette prediction while encoding the CU of one color component under the local dual tree.
Optionally, the apparatus further comprises: a palette predictor update unit configured to: when encoding a luma CU and/or a chroma CU under a local dual tree, when updating a palette predictor using entries copied from the palette predictor, using all component values of the palette entries copied from the palette predictor; alternatively, when encoding a luma CU and/or a chroma CU under a local dual-tree, when updating a palette predictor using entries copied from the palette predictor, component values missing from the palette entries copied from the palette predictor are replaced with default values.
Optionally, the default value is related to an internal bit depth.
According to a fourth aspect of embodiments of the present disclosure, there is provided a video decoding apparatus comprising: a reception parsing unit configured to: receive and parse a bitstream; a CU partition unit configured to: obtain, from the parsed bitstream, a plurality of coding units (CUs) into which the video image is divided; a palette table prediction unit configured to: predict a palette table of at least one CU partitioned under the same parent node among the plurality of CUs based on a shared palette table; wherein the shared palette table is a palette table derived at one of all ancestor nodes of the at least one CU.
Optionally, the one ancestor node is: an ancestor node of all ancestor nodes of the at least one CU that meets a predetermined condition with respect to a predetermined size threshold.
Optionally, the one ancestor node is: a largest ancestor node of all ancestor nodes of the at least one CU that is equal to or less than the predetermined size threshold; or the one ancestor node is: a smallest ancestor node of all ancestor nodes of the at least one CU that is equal to or greater than the predetermined size threshold.
Optionally, the apparatus further comprises: a chroma palette table prediction unit configured to: the chroma palette table of a CU is predicted based on the luma palette table of the CU using a cross-component linear model.
Optionally, the apparatus further comprises: a signal receiving unit configured to: in palette mode, receiving signaling about a scanning direction of a CU, wherein the signaling is sent based on different contexts of a shape of the CU; or a scanning direction determination unit configured to: in the palette mode, when a shape of a CU satisfies a predetermined shape condition, a scanning direction of the CU is determined based on the shape of the CU without receiving signaling regarding the scanning direction of the CU.
Alternatively, the scanning direction determining unit determines a same direction as a longer side of the CU as the scanning direction of the CU, or determines a same direction as a shorter side of the CU as the scanning direction of the CU.
Optionally, the apparatus further comprises: a slice decoding unit configured to: in palette mode, a CU is divided into multiple segments and palette-related syntax for the individual segments of the same CU is decoded independently.
Optionally, the apparatus further comprises: a cache unit configured to: in the palette mode, a CU is divided into a plurality of segments, and palette related data of the respective segments of the same CU are independently cached.
Optionally, the process of dividing the CU into a plurality of segments includes: dividing the CU into the plurality of segments according to the scanning direction of the CU; or, the CU is divided into the plurality of segments according to a binary tree or quadtree partitioning structure.
Optionally, the multiple segments of the same CU share one palette table, or the segments of the same CU use respective palette tables separately.
Optionally, the apparatus further comprises: a palette index map prediction unit configured to: in the case of a separate block tree structure, the palette index map of the corresponding chroma CU is predicted based on the palette index map of the luma CU in a cross-component manner.
Optionally, the apparatus further comprises: a signal receiving unit configured to: signaling is received regarding delta QP for a CU encoded in palette mode.
Optionally, the signal receiving unit receives signaling of delta QP for a CU encoded in palette mode in the presence of escape samples; or the signal receiving unit always receives signaling of delta QP for a CU encoded in palette mode; or the signal receiving unit receives signaling regarding delta QP for a luma component of the CU encoded in the palette mode and signaling regarding delta QP for a chroma component of the CU encoded in the palette mode, respectively.
Optionally, the apparatus further comprises: a lossless palette mode determination unit configured to: receiving information of a QP and determining whether a current palette mode is a lossless palette mode based on the information of the QP, wherein in the case that the value of the QP is equal to or less than a predetermined threshold, the current palette mode is determined to be the lossless palette mode; determining that the current palette mode is not a lossless palette mode if the value of the QP is greater than a predetermined threshold.
Alternatively, the operation of inverse quantizing the escape samples in the palette mode is the same as the operation of inverse quantizing the samples in the other modes.
Optionally, the other modes include at least one of: transform skip mode, transform mode, quantized residual differential pulse code modulation, RDPCM, mode.
Optionally, the operation of lossless decoding of escape samples in palette mode is the same as the operation of lossless decoding of samples in other modes.
Optionally, the other mode is a transform skip mode.
Alternatively, the operation of lossless decoding of escape samples in the palette mode does not include an inverse quantization operation, or an inverse quantization operation among the operations of lossless decoding of escape samples in the palette mode is performed based on a QP less than or equal to a predetermined threshold.
Optionally, the apparatus further comprises: an inverse binarization processing unit configured to: performing one of the following inverse binarization processes on the escape sample: fixed length inverse binarization processing, k-th order exponential golomb inverse binarization processing, truncated binary codeword inverse binarization processing, wherein parameters of the inverse binarization processing are determined based on parameters of a CU currently being processed.
Optionally, the fixed length is determined based on a size of the QP and/or bit depth; or k is determined based on the size of the QP and/or bit depth; or the maximum value of the truncated binary codeword is determined based on the size of the QP and/or bit depth.
Alternatively, the inverse binarization processing unit performs fixed-length inverse binarization processing on the escape samples in a lossless palette mode.
Optionally, the apparatus further comprises: an inverse binarization processing unit configured to: the manner in which the inverse binarization processing is performed on the escape samples is adaptively selected at different decoding levels.
Optionally, the apparatus further comprises: a signal receiving unit configured to: receiving signaling regarding palette mode related information; wherein the palette mode related information comprises at least one of: information indicating the maximum allowed palette size, information indicating the maximum allowed palette area, information indicating the maximum allowed palette predictor size, information indicating the difference between the maximum allowed palette predictor size and the maximum allowed palette size, information indicating that syntax for initializing the sequence palette predictor will be transmitted, information indicating the number of entries of the palette predictor initializer minus 1, information indicating the component value used to initialize the i-th palette entry of the palette predictor array, information indicating the bit depth value of the luma component of an entry of the palette predictor initializer minus 8, information indicating the bit depth value of the chroma component of an entry of the palette predictor initializer minus 8, information indicating the bit depth value of the luma component of an entry of the palette minus 8, and information indicating the bit depth value of the chroma component of an entry of the palette minus 8.
Optionally, the apparatus further comprises: a QP actual value determination unit configured to: derive the actual value QPesca of the quantization parameter QP used for the quantization process of the escape samples by:

QPesca = MIN(((MAX(4, QPcu) – 2) / 6) * 6 + 4, 61)

where QPcu is the QP value of the current CU.
Alternatively, when deriving the palette table of the current CU in the lossy palette mode, the number of distortion thresholds used for determining the number of palette entries is extended to 64, wherein the added thresholds are related to the quantization error.
Optionally, the apparatus further comprises: a signal receiving unit configured to: receiving signaling regarding at least one of a maximum palette table size and a palette predictor size, wherein the at least one of the maximum palette table size and the palette predictor size is variable.
Optionally, the apparatus further comprises: a palette mode disabling unit configured to: disabling the palette mode for CUs having a size less than a first predetermined threshold; or in the dual-tree case, disabling the palette mode for chroma CUs having a size less than a second predetermined threshold; or in the single tree case, disabling the palette mode for CUs having chroma samples with a total size less than a third predetermined threshold; or in the case of local dual-tree, disabling the palette mode for CUs having a size less than a fourth predetermined threshold; or in the case of local dual-tree, disabling the palette mode; or in the local dual-tree case, the palette mode is disabled for chroma CUs.
Optionally, the apparatus further comprises: a palette prediction update unit configured to: in the local dual-tree case, the update process of the palette prediction is performed on both the luminance CU and the chrominance CU; a palette mode decoding unit configured to: the luma CU is decoded while the palette prediction is updated, and then the chroma CU is decoded under the same local dual-tree.
Optionally, the apparatus further comprises: a palette table update disabling unit configured to: under the condition of local dual-tree, forbidding updating of the palette table aiming at the shared palette table; or in the case of local dual-tree, disabling palette table updating; or in the local dual-tree case, disabling palette table updates for CUs having luma samples with a total size less than a fifth predetermined threshold; or in the local dual-tree case, the palette table update is disabled for chroma CUs.
Optionally, the apparatus further comprises: a palette prediction update unit configured to: independently performing palette prediction update processing for different color components in the local dual tree; a palette mode decoding unit configured to: performing palette mode decoding in parallel for the luma component and the chroma components; alternatively, the luma CU is decoded while the palette prediction is updated, and then the chroma CU is decoded under the same local dual-tree.
Optionally, the palette mode decoding unit decodes the luma CU while updating the palette prediction, and then decodes the chroma CU under the same local dual tree; the palette prediction update unit uses a value of another color component of a previously available candidate in the palette when updating the palette prediction while decoding the CU of the one color component under the local dual tree.
Optionally, the apparatus further comprises: a palette predictor update unit configured to: when decoding a luma CU and/or a chroma CU under a local dual tree, when updating a palette predictor using entries copied from the palette predictor, using all component values of the palette entries copied from the palette predictor; alternatively, when decoding a luma CU and/or a chroma CU under a local dual-tree, when updating a palette predictor using entries copied from the palette predictor, component values missing from the palette entries copied from the palette predictor are replaced with default values.
Optionally, the default value is related to an internal bit depth.
According to a fifth aspect of embodiments of the present disclosure, there is provided an electronic apparatus including: at least one processor; at least one memory storing computer-executable instructions, wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform a video encoding method as described above or a video decoding method as described above.
According to a sixth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform a video encoding method as described above or a video decoding method as described above.
According to a seventh aspect of embodiments of the present disclosure, there is provided a computer program product comprising computer instructions which, when executed by at least one processor, implement the video encoding method as described above or the video decoding method as described above.
In the video encoding method, the video decoding method, and the corresponding apparatuses according to the exemplary embodiments of the present disclosure, the palette mode is optimized and improved, so that coding and decoding efficiency and quality can be improved, and an efficient hardware implementation of the encoder and decoder is facilitated.
Additional aspects and/or advantages of the present general inventive concept will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the general inventive concept.
Drawings
The above and other objects and features of the exemplary embodiments of the present disclosure will become more apparent from the following description taken in conjunction with the accompanying drawings which illustrate exemplary embodiments, wherein:
fig. 1 shows a block diagram of a generic block-based hybrid video coding system according to an exemplary embodiment of the present disclosure;
FIG. 2 illustrates an example of block partitioning in a multi-type tree structure according to an example embodiment of the present disclosure;
fig. 3 shows a block diagram of a general block-based video decoding system according to an example embodiment of the present disclosure;
fig. 4 illustrates an example of a block encoded in a palette mode according to an exemplary embodiment of the present disclosure;
fig. 5 illustrates an example of signaling palette entries using a palette predictor according to an exemplary embodiment of the present disclosure;
FIG. 6 illustrates an example of horizontal and vertical scanning according to an exemplary embodiment of the present disclosure;
fig. 7 illustrates a sub-block based index map scan for a palette according to an example embodiment of the present disclosure;
fig. 8 illustrates a flowchart of a video encoding method according to an exemplary embodiment of the present disclosure;
fig. 9 illustrates an example of determining shared palette nodes according to an exemplary embodiment of the present disclosure;
fig. 10 illustrates a flowchart of a video decoding method according to an exemplary embodiment of the present disclosure;
fig. 11 illustrates a block diagram of a video encoding apparatus according to an exemplary embodiment of the present disclosure;
fig. 12 illustrates a block diagram of a video decoding apparatus according to an exemplary embodiment of the present disclosure;
fig. 13 shows a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The embodiments described in the following examples do not represent all embodiments consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Herein, the expression "at least one of the items" covers three parallel cases: "any one of the items", "any combination of several of the items", and "all of the items". For example, "including at least one of A and B" covers the following three parallel cases: (1) including A; (2) including B; (3) including A and B. As another example, "performing at least one of step one and step two" covers the following three parallel cases: (1) performing step one; (2) performing step two; (3) performing step one and step two.
Fig. 1 to 3 illustrate examples of implementation scene diagrams of a video encoding method, a video decoding method, and apparatuses according to the present disclosure.
Fig. 1 shows a block diagram of a general block-based hybrid video coding system.
Referring to fig. 1, an input video signal is processed block by block, each block being referred to as a coding unit (CU). In VTM-1.0, a CU may be up to 128 × 128 pixels. However, unlike HEVC, which partitions blocks using only quadtrees, VVC partitions a coding tree unit (CTU) into CUs using quadtrees, binary trees, and ternary trees, so as to accommodate the local features of a video frame. In addition, VVC eliminates the multiple partition unit types of HEVC, i.e., VVC no longer separates CU, prediction unit (PU), and transform unit (TU); instead, each CU is always used as the basic unit for both prediction and transform, without further partitioning. In the multi-type tree structure, a CTU is first divided based on the quadtree structure; each leaf node of the quadtree may then be further divided based on the binary tree and ternary tree structures. As shown in fig. 2, there are five partition types: a quaternary partition (as shown in (a)), a horizontal binary partition (as shown in (c)), a vertical binary partition (as shown in (b)), a horizontal ternary partition (as shown in (e)), and a vertical ternary partition (as shown in (d)), where W denotes the width of the CTU and H denotes its height. As shown in fig. 1, spatial prediction and/or temporal prediction may be performed, wherein spatial prediction (or "intra prediction") predicts the current video block using pixels from samples (referred to as reference samples) of already coded neighboring blocks in the same video frame/slice, to reduce the spatial redundancy inherent in the video frame; temporal prediction (also referred to as "inter prediction" or "motion compensated prediction") predicts the current video block using reconstructed pixels from already coded video frames, to reduce the temporal redundancy inherent in the video frame.
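The five multi-type-tree split shapes described above can be sketched as a function that returns the sub-block dimensions for a W × H block (the mode names here are illustrative labels, not VVC syntax):

```python
def split_block(w, h, mode):
    """Sub-block (width, height) list for the five split types."""
    if mode == "QT":       # quaternary: four (w/2, h/2) blocks
        return [(w // 2, h // 2)] * 4
    if mode == "BT_H":     # horizontal binary: two (w, h/2) blocks
        return [(w, h // 2)] * 2
    if mode == "BT_V":     # vertical binary: two (w/2, h) blocks
        return [(w // 2, h)] * 2
    if mode == "TT_H":     # horizontal ternary: h/4, h/2, h/4 strips
        return [(w, h // 4), (w, h // 2), (w, h // 4)]
    if mode == "TT_V":     # vertical ternary: w/4, w/2, w/4 strips
        return [(w // 4, h), (w // 2, h), (w // 4, h)]
    raise ValueError(f"unknown split mode: {mode}")
```

Note that every split partitions the block exactly: the sub-block areas always sum to w × h.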
The temporal prediction signal of a CU typically requires the transmission of one or more motion vector (MV) signals, where an MV indicates the amount and direction of motion between the current CU and its temporal reference; in addition, if there are multiple reference video frames, a reference video frame index is transmitted to identify from which reference video frame in the reference video frame store the temporal prediction signal comes. After spatial prediction and/or temporal prediction, a mode decision module in the encoder selects the optimal prediction mode according to a rate-distortion optimization method, subtracts the prediction block obtained with that prediction mode from the current CU to obtain a prediction residual, decorrelates and quantizes the prediction residual using a transform unit and a quantization unit, inverse-quantizes and inverse-transforms the quantized residual coefficients to form a reconstructed residual, and then adds the reconstructed residual back to the prediction block to form the reconstructed signal of the current CU. Before the reconstructed CU is placed in the reference video frame store, loop filters such as a deblocking filter, a sample adaptive offset (SAO) filter, and an adaptive in-loop filter (ALF) may also be applied to it. Finally, the coding mode (inter or intra), the prediction mode information, the motion information, and the quantized residual coefficients are sent to the entropy coding unit for further compression and packing, to obtain the final output video bitstream.
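The residual path of the hybrid coding loop described above (predict, subtract, quantize, inverse-quantize, reconstruct) can be illustrated with a toy sketch, in which the transform is omitted and quantization is plain uniform rounding so that the data flow stays visible:

```python
def encode_reconstruct(block, pred, qstep):
    """Toy residual path of one block: subtract the prediction, quantize
    the residual with uniform step qstep, then rebuild the reconstructed
    signal the way the decoder would (inverse quantize + add back)."""
    residual = [b - p for b, p in zip(block, pred)]        # prediction residual
    levels = [round(r / qstep) for r in residual]          # quantize
    recon = [p + l * qstep for p, l in zip(pred, levels)]  # inverse quantize + add
    return levels, recon
```

The `levels` are what would be entropy-coded; `recon` is what both encoder and decoder keep, so their reference pictures stay in sync.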
Fig. 3 shows a block diagram of a general block-based video decoding system.
Referring to fig. 3, the entropy decoding unit entropy-decodes the video bitstream. If the coding mode is intra coding, the prediction information is sent to the spatial prediction unit to form a prediction block; if the coding mode is inter coding, the prediction information is sent to the temporal prediction unit to form a prediction block. The residual coefficients are sent to the inverse quantization unit and the inverse transform unit to reconstruct a residual block, and the prediction block and the residual block are then added to obtain a reconstructed block. The reconstructed block may also be loop filtered before it is stored in the reference video frame store. The reconstructed video in the reference video frame store may be used both to drive the display device and to predict subsequent video blocks.
Regarding the palette mode for intra prediction in the VVC standard, the basic idea of the palette mode is that the samples in a CU are represented by a small set of representative color values, and this set is called a palette. Color values that do not belong to the palette can be classified as escape colors, for which the three color component values are sent directly in the bitstream. As shown in fig. 4, the palette size is 4, the first 3 samples are reconstructed using palette entries 2, 0, and 3, respectively, and the black sample represents the escape symbol. The CU-level flag palette_escape_val_present_flag indicates whether an escape symbol exists in the CU; if an escape symbol exists, the palette size is increased by 1 and the last index is used to indicate the escape symbol. Thus, as shown in fig. 4, an index of 4 is assigned to the escape symbol.
In order to decode a palette coded block, the decoder needs to know the following information: a palette table and palette indices. If the palette index corresponds to an escape symbol, the color value of the sample needs to be signaled additionally. Furthermore, on the encoder side, the palette used by the CU needs to be derived.
To derive a palette for lossy coding, a modified k-means clustering algorithm is used. The first sample value of the block is added to the palette, and then, for each subsequent sample from the block, the sum of absolute differences (SAD) between that sample and each current palette color is calculated. The sample is added to the cluster belonging to the palette entry with the minimum SAD if, for that entry, the distortion of each component is less than a threshold; otherwise, the sample is added as a new palette entry. When the number of samples mapped to a cluster exceeds a threshold, the center of the cluster is updated and becomes the palette entry for the cluster.
In the next step, the clusters are sorted in descending order of usage. Then, the palette entry corresponding to each cluster is updated. Typically, the cluster center is used as the palette entry. However, when the cost of encoding the palette entries is taken into account, a rate-distortion analysis is performed to check whether any entry in the palette predictor is more suitable than the center as the updated palette entry. This process continues until all clusters are processed or the maximum palette size is reached. Finally, if a cluster has only a single sample and the corresponding palette entry is not in the palette predictor, that sample is converted to an escape symbol. In addition, duplicate palette entries are removed and their clusters are merged.
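The clustering procedure described above can be sketched as follows. This is a simplified illustration (the cluster center is updated on every insertion, and the palette-predictor rate-distortion check is omitted), not the actual VTM implementation; the function name and data layout are illustrative.

```python
def derive_palette(samples, threshold, max_palette_size):
    """Simplified sketch of the modified k-means palette derivation.

    samples: list of per-sample component tuples, e.g. (Y, Cb, Cr).
    A sample joins the cluster whose center has the minimum SAD, provided
    every per-component difference is below `threshold`; otherwise it
    starts a new cluster.
    """
    clusters = []  # each cluster: {"center": tuple, "members": [tuples]}
    for s in samples:
        best, best_sad = None, None
        for c in clusters:
            sad = sum(abs(a - b) for a, b in zip(s, c["center"]))
            if best_sad is None or sad < best_sad:
                best, best_sad = c, sad
        if best is not None and all(
            abs(a - b) < threshold for a, b in zip(s, best["center"])
        ):
            best["members"].append(s)
            # update the cluster center to the (integer) mean of its members
            n = len(best["members"])
            best["center"] = tuple(
                sum(m[k] for m in best["members"]) // n for k in range(len(s))
            )
        else:
            clusters.append({"center": s, "members": [s]})
    # sort clusters by usage (descending) and truncate to the palette size
    clusters.sort(key=lambda c: len(c["members"]), reverse=True)
    return [c["center"] for c in clusters[:max_palette_size]]
```

For example, two groups of near-identical samples collapse into two palette entries, ordered by how many samples each cluster absorbed.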
After palette derivation, each sample in the block is assigned the index of the nearest palette entry (in terms of SAD). The samples are then assigned either the "INDEX" or the "COPY_ABOVE" mode. For each sample for which either "INDEX" or "COPY_ABOVE" mode is possible, the run of each mode is determined; then the encoding cost of each mode is calculated, and the mode with the lower cost is selected.
To encode palette entries, a palette predictor is maintained. The maximum size of the palette and the maximum size of the palette predictor are signaled in the SPS. The palette predictor is initialized at the beginning of each CTU row, each slice, and each tile.
For each entry in the palette predictor, a reuse flag is signaled to indicate whether it is part of the current palette. As shown in fig. 5, the reuse flags are transmitted using run-length coding of zeros. After this, the number of new palette entries is sent using a 0th-order exponential Golomb code. Finally, the component values of the new palette entries are signaled.
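The two signalling steps above can be sketched as follows; this is an illustrative simplification (the function names are hypothetical, and the actual bitstream binarization in the standard differs in detail).

```python
def reuse_flags_to_zero_runs(flags):
    """Run-length coding of zeros: for every reuse flag equal to 1, emit
    the number of zero flags that precede it (a sketch of the scheme in
    fig. 5, not the exact normative binarization)."""
    runs, zeros = [], 0
    for f in flags:
        if f:
            runs.append(zeros)
            zeros = 0
        else:
            zeros += 1
    return runs

def exp_golomb0(value):
    """0th-order exponential Golomb codeword for a non-negative integer,
    as used above for the number of new palette entries."""
    bits = bin(value + 1)[2:]               # binary of value + 1
    return "0" * (len(bits) - 1) + bits     # prefix zeros + info bits
```

For instance, the reuse-flag vector [1, 0, 0, 1, 1, 0] becomes the zero-run list [0, 2, 0], and a count of 3 new entries is the EG0 codeword "00100".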
The palette indices are encoded using horizontal and vertical traverse scans, as shown in fig. 6, with the scan order explicitly signaled in the bitstream using palette_transpose_flag.
To encode the palette indices, line-based coefficient groups (CGs) are used in palette mode, dividing the CU into multiple segments of 16 samples based on the traverse scan pattern, as shown in fig. 7. The index runs, the palette index values, and the quantized colors of escape mode are sequentially encoded/parsed for each CG.
Two main palette sample modes are used to encode the palette indices: "INDEX" and "COPY_ABOVE". As described above, the escape symbol is specified as an index equal to the maximum palette size. In "COPY_ABOVE" mode, the palette index of the sample in the row above is copied. In "INDEX" mode, the palette index is explicitly signaled. The coding order of the palette run coding in each segment is as follows: for each pixel, one context-coded binary value run_copy_flag is signaled as 1 if the pixel has the same mode as the previous pixel, i.e., if both the previously scanned pixel and the current pixel have the run type COPY_ABOVE, or if both have the run type INDEX and the same index value. Otherwise, run_copy_flag is signaled as 0.
If the pixel and the previous pixel have different modes, a context-coded binary value copy_above_palette_indices_flag indicating the run type of the pixel (i.e., INDEX or COPY_ABOVE) is signaled. If the sample is in the first row (horizontal traverse scan) or in the first column (vertical traverse scan), the decoder does not have to parse the run type, because the INDEX mode is used by default. Furthermore, if the previously parsed run type is COPY_ABOVE, the decoder does not have to parse the run type.
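The decoder-side rules above can be sketched as two small decision functions; the names and the dictionary representation of a pixel's mode are illustrative, not taken from the standard.

```python
INDEX, COPY_ABOVE = "INDEX", "COPY_ABOVE"

def run_copy_flag(prev, cur):
    """1 when the current pixel repeats the previous pixel's mode: the
    same run type, and for INDEX runs also the same index value."""
    if prev is None or prev["type"] != cur["type"]:
        return 0
    if cur["type"] == INDEX and prev["index"] != cur["index"]:
        return 0
    return 1

def run_type_to_parse(in_first_line, prev_run_type):
    """Return (must_parse, inferred_type) for copy_above_palette_indices_flag:
    the flag is skipped for samples in the first row/column and after a
    COPY_ABOVE run, INDEX being inferred in both cases."""
    if in_first_line or prev_run_type == COPY_ABOVE:
        return False, INDEX
    return True, None
```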
After the palette run coding of the pixels in a segment, the index values (palette_idx_idc) of INDEX mode and the quantized escape colors (palette_escape_val) are bypass coded.
Fig. 8 illustrates a flowchart of a video encoding method according to an exemplary embodiment of the present disclosure.
Referring to fig. 8, in step S101, a video image is divided into a plurality of coding units CU.
In step S102, a palette table of at least one CU partitioned under the same parent node among the plurality of CUs is predicted based on the shared palette table. The shared palette table is a palette table derived at one of all ancestor nodes of the at least one CU.
In the present disclosure, in order to reduce the amount of palette predictor derivation processing, it is proposed to predict at least one palette table based on the same shared palette table. As an example, for a plurality of small CUs, one palette table may be derived at a common ancestor node along their quadtree/binary tree/ternary tree, and for all CUs under that ancestor node, their respective palette tables are no longer derived individually; instead, the same palette table derived at the ancestor node is shared for the palette table predictions of those CUs under that ancestor node. This ancestor node may be referred to as a "shared palette node".
As an example, the one ancestor node may be: an ancestor node of all ancestor nodes of the at least one CU that meets a predetermined condition with respect to a predetermined size threshold.
As an example, the one ancestor node may be: a largest ancestor node of all ancestor nodes of the at least one CU that is equal to or less than the predetermined size threshold; alternatively, the one ancestor node may be: a smallest ancestor node of all ancestor nodes of the at least one CU that is equal to or greater than the predetermined size threshold.
As an example, various suitable methods may be used to determine a shared palette node for several small CUs. In a first approach, the shared palette node may be determined as the largest node, among all ancestors of the CUs, having a size equal to or less than a predetermined size threshold. In a second approach, the shared palette node may be determined as the smallest node, among all ancestors of the CUs, having a size equal to or greater than the predetermined size threshold. Fig. 9 illustrates an example comparing the differences between the two methods when determining a shared palette node. As shown in fig. 9, one parent node of 128 samples is split via a ternary tree into three CUs of 32, 64, and 32 samples, respectively. Assuming that the predetermined size threshold is 64: if the first method is used, the three leaf nodes (i.e., the three CUs) are the shared palette nodes; in other words, in the case where no ancestor node among all the ancestor nodes of the at least one CU satisfies the predetermined condition, the at least one CU itself may be determined to be a shared palette node. If the second method is used, the parent node may be determined to be the shared palette node, and the palette tables of the three CUs may be predicted based on the palette table derived at the parent node.
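The two selection methods can be sketched as a single function over a CU's ancestor chain; the function name and the representation of the tree as a list of ancestor sizes are illustrative assumptions.

```python
def shared_palette_node(ancestor_sizes, cu_size, threshold, method):
    """Pick the shared-palette-node size for a CU.

    ancestor_sizes: sizes (in samples) of the CU's proper ancestors along
    the quadtree/binary tree/ternary tree, root first.
    method 1: largest ancestor with size <= threshold.
    method 2: smallest ancestor with size >= threshold.
    When no ancestor qualifies, the CU itself becomes the shared node.
    """
    if method == 1:
        qualifying = [s for s in ancestor_sizes if s <= threshold]
    else:
        qualifying = [s for s in ancestor_sizes if s >= threshold]
    if not qualifying:
        return cu_size
    return max(qualifying) if method == 1 else min(qualifying)
```

Replaying the fig. 9 example (a 128-sample parent split by ternary tree into 32/64/32, threshold 64): method 1 makes each leaf its own shared node, while method 2 selects the 128-sample parent.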
As an example, the predetermined size threshold is predetermined by the encoding side and the decoding side together or is determined at the encoding side and signaled to the decoding side.
As described above, different thresholds for block sizes may be used to identify shared palette nodes. In one embodiment, both the encoder and decoder may share one fixed threshold (i.e., the predetermined size threshold) without signaling. In another embodiment, the threshold may be sent in the bitstream by a syntax element. As an example, the syntax elements may be sent in different levels of the bitstream (e.g., Sequence Parameter Set (SPS), Picture Parameter Set (PPS), tile group header, and slice header) to provide different tradeoffs between coding efficiency and parallel processing capability.
As an example, the video encoding method according to an exemplary embodiment of the present disclosure may further include: the chroma palette table of a CU is predicted based on the luma palette table of the CU using a cross-component linear model.
The present disclosure proposes that a cross-component linear model (CCLM) can be used to predict the chroma palette.
In one embodiment, the linear model may be computed using neighboring luma and chroma samples as in CCLM. After determining the linear model, the chroma palette table may be predicted based on the luma palette table of the same CU along with the linear model. For example, the chroma palette prediction may be derived as follows:
predC(i, j) = α · recL′(i, j) + β    (1)
wherein predC(i, j) represents the predicted chroma palette in the CU, and recL′(i, j) denotes the reconstructed luma palette samples of the same CU. The linear model parameters α and β may be derived using various suitable derivation methods. For example, one exemplary method uses the linear relationship between the luma and chroma values of two samples: the minimum-luma sample A(XA, YA) and the maximum-luma sample B(XB, YB) in the palette table. Here, XA and YA are the x-coordinate value (i.e., luma value) and the y-coordinate value (i.e., chroma value) of sample A, and XB and YB are the luma and chroma values of sample B. The linear model parameters α and β can be obtained according to the following equations:
α = (YB − YA) / (XB − XA),  β = YA − α · XA    (2)
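The two-point derivation and the prediction of equation (1) can be sketched as follows; the function names and the (luma, chroma) tuple representation are illustrative assumptions.

```python
def cclm_parameters(palette_pairs):
    """Derive alpha and beta from the minimum-luma and maximum-luma
    samples A and B of the palette table (a sketch of the two-point
    method above; each pair is (luma, chroma))."""
    xa, ya = min(palette_pairs)   # sample A: minimum luma
    xb, yb = max(palette_pairs)   # sample B: maximum luma
    alpha = (yb - ya) / (xb - xa) if xb != xa else 0.0
    beta = ya - alpha * xa
    return alpha, beta

def predict_chroma_palette(luma_palette, alpha, beta):
    """Equation (1): predC = alpha * recL' + beta, entry by entry."""
    return [alpha * l + beta for l in luma_palette]
```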
According to embodiments of the present disclosure, for chroma palette prediction, a flag may be signaled to indicate whether the predicted chroma palettes are all generated using the cross-component linear model. As an example, if the flag is true, it indicates that the predicted chroma palettes are all generated using the cross-component linear model based on the luma palette table of the same CU, as described above; otherwise, it indicates that the predicted chroma palette is generated from neighboring block palettes, e.g., as in VVC. As another example, if the flag is true, it indicates that the predicted chroma palettes are all generated using the cross-component linear model described above; otherwise, it indicates that the predicted chroma palette is jointly generated based on the neighboring block palettes and the cross-component linear model described above.
According to an embodiment of the present disclosure, when a predicted chroma palette is jointly generated based on a neighboring block palette and the cross-component linear model described above, entries from each of the two palettes (i.e., the neighboring block palette and the palette generated from the cross-component linear model) may be sequentially selected based on their indices in each palette. As an example, for each entry in the predicted chroma palette, flags may be signaled to indicate whether the entry was generated using the cross-component linear model described above, which flags are sent to the decoder via the bitstream. It should be understood that the flags may be encoded using different methods, for example, run-length encoding of zeros may be used.
It is noted that with the above approach, new palette entries may still be sent in a similar manner as in current VVC designs. For example, after signaling these palette entries from the predicted chroma palette, the number of new palette entries may be signaled, and then the component values of the new palette entries may be signaled.
In one embodiment, the video encoding method according to an exemplary embodiment of the present disclosure may further include: in palette mode, the scanning direction of a CU is signaled based on different contexts of the shape of the CU.
In palette mode coding in HEVC, the traverse scan direction (i.e., horizontal or vertical) is signaled for each CU. In newer coding standards such as VVC, where the CU shape is no longer always square, this signaling of the traverse scan direction can be further improved based on the shape (e.g., the aspect ratio) of the current coding unit.
According to an embodiment of the present disclosure, different contexts based on the shape of the current coding unit are used to send the traversal scan direction in palette mode. In other words, depending on the shape of the current coding unit, different CABAC contexts may be selected such that different CABAC probabilities are used. For example, the context may also depend on the traversal scan direction of the neighboring blocks.
In another embodiment, the video encoding method according to an exemplary embodiment of the present disclosure may further include: in the palette mode, when a shape of a CU satisfies a predetermined shape condition, a scanning direction of the CU is determined based on the shape of the CU. Further, on this basis, the CU's scanning direction may not be signaled. As an example, the same direction as the longer side of the CU may be determined as the scanning direction of the CU, or the same direction as the shorter side of the CU may be determined as the scanning direction of the CU.
According to embodiments of the present disclosure, signaling to traverse the scan direction may be conditionally omitted depending on the shape of the current coding unit. In this case, as an example, at the decoder side, the traversal direction may be inferred based on the shape of the current block. In one embodiment, if a block has an aspect ratio above a certain threshold, its traversal scan direction may not be sent in palette mode; at the decoder side, the traversal scan direction can be inferred to be the same direction as the longer side of the block, otherwise its traversal scan direction is normally signaled. In another embodiment, if a block has an aspect ratio above a certain threshold, then in palette mode, its traversal scan direction is not sent; at the decoder side, the traversal scan direction is inferred to be the same direction as the shorter side of the block, otherwise its traversal scan direction is normally signaled.
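The decoder-side inference described in the two embodiments above can be sketched as one function; the function name, string return values, and the `use_longer_side` switch between the two variants are illustrative.

```python
def infer_scan_direction(width, height, ratio_threshold, use_longer_side=True):
    """Return the inferred traverse scan direction, or None when the
    aspect ratio does not exceed the threshold and the direction must be
    parsed from the bitstream as usual."""
    ratio = max(width, height) / min(width, height)
    if ratio <= ratio_threshold:
        return None                      # signaled normally
    longer_is_horizontal = width >= height
    if use_longer_side:
        return "horizontal" if longer_is_horizontal else "vertical"
    return "vertical" if longer_is_horizontal else "horizontal"
```

For a 32 × 4 block with a threshold of 4, the first variant infers a horizontal scan (the longer side) without any signalling, while a square 16 × 16 block still has its direction signaled.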
In palette mode coding in HEVC, the traversal scan direction (i.e., horizontal or vertical) is determined by the rate-distortion cost for each CU. More specifically, the rate-distortion costs associated with the horizontal and vertical traversal scan directions are calculated separately, and the direction with the lower rate-distortion cost value is selected as the traversal scan direction of a given CU and signaled to the decoder; these costs need to be calculated at the encoder side. To save this computational complexity at the encoder side, in some cases such a traversal scan direction can be decided without rate-distortion cost calculations.
According to an embodiment of the present disclosure, calculating a rate-distortion cost for traversing a scan direction may be conditionally omitted depending on a shape of a current block. In this case, at the encoder side, the traversal scan direction may be determined based on the shape of the current block. As an example, if a block has an aspect ratio above a certain threshold, then in palette mode, its rate-distortion cost of traversing the scan direction is not calculated; at the encoder side, the traversal scan direction is simply selected to be the same direction as the longer side of the block, otherwise, the rate-distortion cost is calculated for each of its traversal scan directions and the direction with the lower cost value is selected. As another example, if a block has an aspect ratio above a certain threshold, then in palette mode, its rate-distortion cost of traversing the scan direction is not calculated; on the encoder side, the traversal scan direction is simply selected to be the same direction as the shorter side of the block, otherwise, the rate-distortion cost is typically calculated for each of its traversal scan directions, and the direction with the lower cost value is selected.
As an example, the video encoding method according to an exemplary embodiment of the present disclosure may further include: in palette mode, a CU is divided into multiple segments and palette related syntax for each segment of the same CU is encoded independently.
According to one embodiment of the present disclosure, in order to improve CABAC encoding throughput in palette mode, a segment-based palette mode is proposed. According to this embodiment, in palette mode, a CU may be divided into multiple segments, each segment containing multiple samples (e.g., M samples). In one example, M is a positive number, which may be 16 or 32, for example. As an example, for each segment, CABAC parsing and/or encoding of the palette-related syntax (e.g., index runs, palette index values, and quantized colors for escape mode) is independent of the other segments of the same CU. To achieve this, all CABAC parsing dependencies (e.g., context modeling) and decoding dependencies (e.g., copy-above mode) in palette mode are disallowed across neighboring segments.
According to one embodiment of the present disclosure, in order to improve the throughput of CABAC encoding and pixel value reconstruction in palette mode, a segment-based palette mode is proposed. According to this embodiment, a CU may be divided into multiple segments in palette mode, each segment containing multiple samples (e.g., M samples). M is a positive number, which may be, for example, 128, 256, 512, or 1024. As an example, the value of M may be selected based on throughput requirements; e.g., the smaller the value of M, the better the throughput, but the greater the impact on coding performance. As an example, for each segment, CABAC parsing of the palette-related syntax (e.g., index runs, palette index values, etc.) may be independent of the other segments of the same CU; in other words, the CABAC engine may be initialized independently in each segment, and the syntax in one segment may be encoded without using any information from the other segments as context.
According to an embodiment of the present disclosure, a CU may be divided into multiple segments in palette mode using a suitable method. As an example, the CU may be divided into the plurality of segments according to a scanning direction of the CU; or, the CU is divided into the plurality of segments according to a binary tree or quadtree partitioning structure. In one embodiment, a CU in palette mode may be divided into multiple segments based on traversal scan order, i.e., the first M samples along the scan order are divided into segment 1, and the next M samples along the scan order are divided into segment 2, and so on. In another embodiment, a CU in palette mode may be divided into multiple segments based on a binary tree or quadtree partitioning structure. As an example, within each segment, the traversal scan order may still be used for palette coding.
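The scan-order-based division described above can be sketched as follows: first generate the serpentine traverse scan of fig. 6, then cut it into runs of M samples. The function names are illustrative.

```python
def traverse_scan_order(width, height, horizontal=True):
    """Serpentine (traverse) scan positions for a width x height block,
    as in fig. 6: alternate rows (or columns) are scanned in reverse."""
    order = []
    if horizontal:
        for y in range(height):
            xs = range(width) if y % 2 == 0 else range(width - 1, -1, -1)
            order += [(x, y) for x in xs]
    else:
        for x in range(width):
            ys = range(height) if x % 2 == 0 else range(height - 1, -1, -1)
            order += [(x, y) for y in ys]
    return order

def split_into_segments(scan_order, m):
    """The first M samples along the scan order form segment 1, the next
    M samples segment 2, and so on (the last segment may be shorter)."""
    return [scan_order[i:i + m] for i in range(0, len(scan_order), m)]
```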
As an example, the multiple segments of the same CU may share one palette table, or each segment of the same CU may use its own palette table. According to embodiments of the present disclosure, the palette may be jointly encoded and shared by all the different segments in the CU, or may be encoded separately, with a palette sent for each segment. It should be understood that different methods may be used to encode the palette indices. For example, the number of index values of a segment may be signaled first, followed by the actual index values of the entire segment encoded using truncated binary codes. Both the number of indices and the index values may be encoded in bypass mode, which groups the index-related bypass bins together. The runs are then signaled, and finally, the component escape values corresponding to the escape samples of the entire segment are grouped together and encoded in bypass mode.
As an example, the video encoding method according to an exemplary embodiment of the present disclosure may further include: in the palette mode, a CU is divided into a plurality of segments, and palette related data of the respective segments of the same CU are independently cached.
According to an embodiment of the present disclosure, a segment-based palette mode is proposed to reduce the size of a buffer required to store index values and an index map in the palette mode. According to this embodiment, in palette mode, a CU may be divided into multiple segments, each of which may include multiple samples (e.g., N samples). In one example, N is a positive number, e.g., N may be 128, 256, 512, or 1024, etc. As an example, the buffer of palette related data (e.g., index map and palette index values) for each segment is independent of other segments of the same CU. To achieve this, decoding dependencies in palette mode (e.g., copy-over mode) are prohibited from crossing neighboring segments.
According to an embodiment of the present disclosure, a CU may be divided into multiple segments in palette mode using an appropriate method. As an example, the CU may be divided into the plurality of segments according to the scanning direction of the CU; alternatively, the CU may be divided into the plurality of segments according to a binary tree or quadtree partitioning structure. In one embodiment, each CU in palette mode is divided into multiple segments based on the traversal scan order, i.e., the first N samples along the scan order are divided into segment 1, the next N samples along the scan order are divided into segment 2, and so on. In another embodiment, a CU may be divided into multiple segments based on a binary tree or quadtree partitioning structure. As an example, within each segment, the traversal scan order may still be used for palette coding. For example, the number of index values of a segment is signaled first, followed by the actual index values of the entire segment encoded using truncated binary codes. Both the number of indices and the index values are encoded in bypass mode, which groups the index-related bypass bins together. The runs are then signaled. Finally, the component escape values corresponding to the escape samples of the entire segment are grouped together and encoded in bypass mode.
As an example, the video encoding method according to an exemplary embodiment of the present disclosure may further include: in the case of a separate block tree structure, the palette index map of the corresponding chroma CU is predicted based on the palette index map of the luma CU in a cross-component manner.
In the current VVC standard, the coding tree partitioning scheme supports the case where luminance and chrominance have separate block tree structures. When the separate block tree mode is applied, the luma component of the CTU is divided into CUs by one coding tree structure, and the chroma component of the CTU is divided into chroma CUs by another coding tree structure. Accordingly, for the palette mode, there are two index maps, one for the luma component and the other for the two chroma components. According to an embodiment of the present disclosure, when a separate block tree structure is used, the index maps of different color components in the palette mode may be predicted in a cross-component manner. In one example, a chroma palette index map may be predicted using a corresponding luma palette index map. It should be appreciated that different methods may be used for this cross-component index map prediction. In one approach, a special mode may be introduced in which the chroma palette index map is assumed to be identical to the corresponding luma palette index map, such that the chroma palette index map need not be explicitly signaled. In another method, a corresponding luma palette index map may be used as a predictor of a chroma palette index map, and a difference between the predictor and an actual chroma palette index map may be encoded and transmitted in a bitstream.
According to embodiments of the present disclosure, a flag may be used to indicate whether cross-component palette index map prediction is used in palette mode. As an example, when the flag is true, it indicates that the palette index map of the two chroma components is predicted using the luma palette index map; when the flag is false, the palette index maps for luma and chroma are signaled separately without mutual prediction.
As an example, the video encoding method according to an exemplary embodiment of the present disclosure may further include: a delta QP (Quantization Parameter) for a CU encoded in palette mode is signaled.
According to an embodiment of the present disclosure, it is proposed to signal delta QP for a CU encoded in palette mode. As an example, delta QP may be signaled for a CU encoded in palette mode in the presence of escape samples. According to the present embodiment, the modification of the decoding process of the VVC draft is as shown in table 1, in which deleted portions are marked with a strikethrough.
TABLE 1 syntax example
[Table 1 is provided as an image in the original filing and is not reproduced here.]
As another example, delta QP may always be signaled for a CU encoded in palette mode, i.e., when a block is coded in palette mode, delta QP is always sent for the palette-coded block. The modification of the decoding process of the VVC draft is as shown in table 2, in which the deleted portions are marked with strikethrough.
Table 2 syntax example
[Table 2 is provided as an image in the original filing and is not reproduced here.]
As another example, the delta QP for the luma component and the delta QP for the chroma components of a CU encoded in palette mode may be signaled separately, i.e., when a block is encoded in palette mode, the delta QPs for luma and chroma are always sent separately for the palette-coded block. The modification of the decoding process of the VVC draft is as shown in table 3, in which the deleted portions are marked with strikethrough.
Table 3 syntax example
[Table 3 is provided as an image in the original filing and is not reproduced here.]
As an example, the video encoding method according to an exemplary embodiment of the present disclosure may further include: signaling information of a QP to indicate whether a current palette mode is a lossless palette mode, wherein in case a value of the QP is equal to or less than a predetermined threshold, indicating that the current palette mode is a lossless palette mode; indicating that the current palette mode is not a lossless palette mode if the value of the QP is greater than a predetermined threshold.
According to an embodiment of the present disclosure, it is proposed to use information of the QP to indicate whether the current palette mode is lossless. In accordance with the present disclosure, such QP information may come from different levels; for example, it may come from the video sequence level, the picture level, the tile group level, the tile level, or an individual CU level. Note that when the value of the QP is equal to or less than a certain threshold (e.g., 4), a CU within or below the level at which the QP information is sent may be indicated to be encoded in lossless palette mode; when the value of the QP is greater than the threshold, lossless palette mode is not used for CUs within or below the level at which the QP information is sent.
As an example, the operation of quantizing the escape samples in palette mode may be the same as the operation of quantizing the samples in other modes. As an example, the other modes may include at least one of: transform skip mode, transform mode, and quantized residual differential pulse code modulation (RDPCM) mode.
In existing quantization designs for escape samples, given the quantization parameter (QP), the quantization scale of the escape samples is the same as the scale used for conventional quantization of samples coded with other coding tools, e.g., samples in the transform skip and/or transform cases. However, the actual quantization operation for escape samples is defined differently from the conventional quantization operation; for example, the quantization of escape samples involves different shift and/or offset operations than conventional quantization. According to an embodiment of the present disclosure, a unified quantization process is proposed. As an example, the conventional quantization process may be used for the quantization of escape samples in palette mode; that is, the quantization design for the escape samples would be the same as the quantization process for samples in transform skip mode and/or transform mode. Equations (3) and (4) describe the corresponding quantization and dequantization processes applied at the encoder and the decoder, respectively, when the palette escape colors are encoded using the quantization/dequantization of the transform skip mode.
The encoder side:
[Equation (3) is provided as an image in the original filing and is not reproduced here.]
on the decoder side:
[Equation (4) is provided as an image in the original filing and is not reproduced here.]
wherein pResi and pResi′ represent the original residual coefficients and the reconstructed residual coefficients, respectively; pLevel represents the quantized value; transformShift represents a shift that compensates for the increase in dynamic range caused by the 2D transform and is equal to 15 − bitDepth − (log2(W) + log2(H))/2, where W and H represent the width and height of the current transform unit and bitDepth represents the coding bit depth; and encScale[·] and decScale[·] represent the quantization and dequantization look-up tables, which have 14-bit and 6-bit precision, respectively, and are defined as:
QP%6            0      1      2      3      4      5
encScale[QP%6]  26214  23302  20560  18396  16384  14564
decScale[QP%6]  40     45     51     57     64     72
When the size of the transform block is not a power of 4, another look-up table is defined as:

QP%6             0      1      2      3      4      5
encScale[QP%6]   18396  16384  14564  13107  11651  10280
decScale[QP%6]   57     64     72     80     90     102
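To make the unified design concrete, the following Python sketch mirrors a transform-skip-style quantization/dequantization of escape samples using the first scaling table, for blocks whose size is a power of 4. The function names, the sign handling, and the half-step rounding offsets are illustrative assumptions, not text from the standard.

```python
# Illustrative sketch of unified escape-sample quantization reusing the
# transform-skip scaling tables above. Rounding offsets (half a step) and
# function names are assumptions for illustration.

ENC_SCALE = [26214, 23302, 20560, 18396, 16384, 14564]  # 14-bit precision
DEC_SCALE = [40, 45, 51, 57, 64, 72]                    # 6-bit precision

def transform_shift(bit_depth, width, height):
    # transformShift = 15 - bitDepth - (log2(W) + log2(H)) / 2
    log2_w, log2_h = width.bit_length() - 1, height.bit_length() - 1
    return 15 - bit_depth - (log2_w + log2_h) // 2

def quantize_escape(p_resi, qp, bit_depth, w, h):
    """Encoder side: map an original residual pResi to a level pLevel."""
    q_bits = 14 + qp // 6 + transform_shift(bit_depth, w, h)
    offset = 1 << (q_bits - 1)                 # assumed rounding offset
    sign = -1 if p_resi < 0 else 1
    return sign * ((abs(p_resi) * ENC_SCALE[qp % 6] + offset) >> q_bits)

def dequantize_escape(p_level, qp, bit_depth, w, h):
    """Decoder side: reconstruct the residual pResi' from pLevel."""
    iq_bits = 6 - transform_shift(bit_depth, w, h)   # assumed >= 1 here
    offset = 1 << (iq_bits - 1)
    sign = -1 if p_level < 0 else 1
    scaled = (abs(p_level) * DEC_SCALE[qp % 6]) << (qp // 6)
    return sign * ((scaled + offset) >> iq_bits)
```

For example, with QP = 22, 8-bit depth, and a 4 × 4 block, a level of 3 reconstructs to 768, and quantizing 768 returns level 3, so the pair round-trips.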
In existing quantization designs for RDPCM (quantized residual differential pulse code modulation) samples, given the quantization parameter (QP), the scale used for quantization of RDPCM samples is the same as the scale used for conventional quantization of samples coded with other tools when the size of the transform block is a power of 4. According to an embodiment of the present disclosure, it is proposed to unify the quantization process by using the same shift and/or offset operations as RDPCM quantization. As an example, the RDPCM quantization process is used for quantization of escape samples in palette mode, so the quantization design for escape samples will be the same as the quantization process for samples in RDPCM mode.
As an example, the operation of lossless coding of escape samples in palette mode may be the same as the operation of lossless coding of samples in other modes. As an example, the other mode may be a transform skip mode. As an example, the operation of lossless coding of escape samples in palette mode does not include a quantization operation. As an example, the quantization operation in the operation of lossless coding of escape samples in palette mode may be performed based on a QP less than or equal to a predetermined threshold.
According to embodiments of the present disclosure, lossless encoding of escape samples may follow lossless encoding of samples in other encoding tools (e.g., samples in transform skip mode). In one embodiment, for lossless encoding, the quantization process may be bypassed. In another embodiment, the quantization process may be performed using a QP equal to or less than a certain threshold (e.g., 4) to achieve lossless encoding.
As an example, the video encoding method according to an exemplary embodiment of the present disclosure may further include: performing one of the following binarization processes on the escape sample: fixed length binarization processing, k-order exponential Golomb binarization processing and truncated binary codeword binarization processing. For example, the parameter of the binarization process may be determined based on the parameter of the CU currently being processed.
As an example, the fixed length may be determined based on the size of the QP and/or bit depth. As an example, K may be determined based on the size of the QP and/or bit depth. As an example, the maximum value of the truncated binary codeword may be determined based on the size of the QP and/or bit depth.
In the current design, the binarization of the escape samples is derived by invoking a third-order exponential Golomb binarization process. According to an embodiment of the present disclosure, it is proposed to use a fixed-length binarization process for escape samples. As an example, the codeword length of the fixed-length binarization may depend on certain parameters of a given block. For example, the length of the fixed-length binarization may differ for different QPs and bit depths. As an example, the length can be derived as in equation (5):

length = bitDepth - (QP / 6)    (5)
As another example, the length len of the fixed-length binarization can be derived as in equation (6):

len = bitDepth - floor((QP - 4) / 6)    (6)
wherein QP represents an actual QP value for the current block encoded in palette mode.
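A minimal Python sketch of this QP-dependent fixed-length binarization, assuming integer division and MSB-first bit output (the helper names are hypothetical):

```python
# Sketch of the fixed-length binarization of equations (5)/(6); integer
# division and MSB-first bit order are assumptions.
def fl_length_eq5(bit_depth, qp):
    return bit_depth - qp // 6                 # length = bitDepth - QP/6

def fl_length_eq6(bit_depth, qp):
    return bit_depth - (qp - 4) // 6           # len = bitDepth - floor((QP-4)/6)

def fixed_length_bins(value, length):
    """Binarize `value` into `length` bits, most significant bit first."""
    return [(value >> (length - 1 - i)) & 1 for i in range(length)]
```

For a 10-bit block at QP 22, both derivations give a 7-bit codeword.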
According to another embodiment of the present disclosure, it is proposed to use k-th order exponential Golomb binarization for escape sample coding. As an example, the value of k may depend on certain parameters of a given block. As an example, the value of k for the exponential Golomb binarization of escape samples may depend on the QP value and the internal bit depth. For example, the value of k can be derived by:
k=(a–floor(QP/b)) (7)
where QP denotes the actual QP value of the current block encoded in palette mode, and a and b are constants, e.g., a = 6 and b = 10.
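The k-th order Exp-Golomb binarization with the QP-adaptive order of equation (7) might be sketched as below. The prefix-of-ones construction (as used for coefficient escape coding in HEVC) is an assumption; the text does not specify the exact codeword form.

```python
# Sketch of k-th order Exp-Golomb binarization with QP-adaptive order k.
# The unary prefix of ones is an assumed HEVC-style construction.
def eg_order(qp, a=6, b=10):
    return a - qp // b                         # equation (7): k = a - floor(QP/b)

def exp_golomb_k(value, k):
    """Return the EG-k codeword of a non-negative value as a bit list."""
    bits = []
    while value >= (1 << k):
        bits.append(1)                         # unary prefix bit
        value -= 1 << k
        k += 1
    bits.append(0)                             # prefix terminator
    bits.extend((value >> i) & 1 for i in range(k - 1, -1, -1))
    return bits
```

With a = 6 and b = 10, a block at QP 22 would use order k = 4.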
According to another embodiment of the present disclosure, it is proposed to use truncated binary codewords for binarization of escape samples. As an example, the maximum value of the truncated binary codeword may depend on certain parameters of a given block. As an example, the maximum value of the truncated binary codeword may depend on the value of QP and the internal bit depth.
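A sketch of truncated binary binarization with a maximum value cMax is given below; how cMax is derived from QP and the internal bit depth is left open in the text, so it is passed in as a parameter here.

```python
# Sketch of truncated binary (TB) binarization for a value in [0, cMax].
# The first u = 2^(k+1) - (cMax + 1) codewords use k bits, the rest k + 1.
def truncated_binary(value, c_max):
    n = c_max + 1
    k = n.bit_length() - 1                     # floor(log2(n))
    u = (1 << (k + 1)) - n                     # number of short codewords
    if value < u:
        length = k
    else:
        value += u
        length = k + 1
    return [(value >> (length - 1 - i)) & 1 for i in range(length)]
```

For cMax = 5, values 0 and 1 get 2-bit codes while values 2 through 5 get 3-bit codes.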
As an example, the video encoding method according to an exemplary embodiment of the present disclosure may further include: adaptively selecting the binarization process of the escape samples at different coding levels. That is, the present disclosure also proposes that the binarization method for escape samples can be adaptively switched at different coding levels, such as the Sequence Parameter Set (SPS), Picture Parameter Set (PPS), slice, or coding block group. In this case, the encoder has the flexibility to dynamically select the binarization method and signal it in the bitstream.
As an example, the escape samples may be binarized with a fixed length in lossless palette mode. That is, according to embodiments of the present disclosure, a fixed-length binarization process may be used for escape samples when the palette mode is lossless. As an example, when the palette mode is lossless, escape samples may be encoded directly from their binary representation, with each bit encoded as a CABAC bypass bin.
As an example, the video encoding method according to an exemplary embodiment of the present disclosure may further include: signaling palette mode related information. As an example, the palette mode related information may include at least one of: information indicating a maximum allowed palette size, information indicating a maximum allowed palette area, information indicating a maximum allowed palette predictor size, information indicating the difference between the maximum allowed palette predictor size and the maximum allowed palette size, information indicating that syntax for initializing the sequence palette predictor is to be transmitted, information indicating the number of entries of the palette predictor initializer minus 1, information indicating the component value used to initialize the i-th palette entry of the palette predictor array, information indicating the bit depth value of the luma component of an entry of the palette predictor initializer minus 8, information indicating the bit depth value of the chroma component of an entry of the palette predictor initializer minus 8, information indicating the bit depth value of the luma component of an entry of the palette minus 8, and information indicating the bit depth value of the chroma component of an entry of the palette minus 8.
In other words, according to an embodiment of the present disclosure, several syntaxes are proposed to inform information of palette mode in VVC.
According to one embodiment of the present disclosure, it is proposed to signal a syntax element, referred to in the following description as "palette_max_size", to specify the maximum allowed palette size. As an example, the syntax may be signaled at different levels, e.g., at the video sequence level, picture level, tile group level, tile level, or single CU level.

According to one embodiment of the present disclosure, it is proposed to signal a syntax element, referred to in the following description as "palette_max_area", to specify the maximum allowed palette area. As an example, the syntax may be signaled at different levels, e.g., at the video sequence level, picture level, tile group level, tile level, or single CU level.

According to one embodiment of the present disclosure, it is proposed to signal a syntax element, referred to in the following description as "palette_max_predictor_size", to specify the maximum allowed palette predictor size. As an example, the syntax may be signaled at different levels, e.g., at the video sequence level, picture level, tile group level, tile level, or single CU level.

According to one embodiment of the present disclosure, it is proposed to signal a syntax element, referred to in the following description as "delta_palette_max_predictor_size", to specify the difference between the maximum allowed palette predictor size and the maximum allowed palette size. As an example, the syntax may be signaled at different levels, e.g., at the video sequence level, picture level, tile group level, tile level, or single CU level.

According to one embodiment of the present disclosure, it is proposed to signal a syntax element, referred to in the following description as "palette_predictor_initializer_present_flag", to indicate that another syntax element, "sps_palette_predictor_initializers", is to be signaled and used for initializing the sequence palette predictor. As an example, the syntax may be signaled at different levels, e.g., at the video sequence level, picture level, tile group level, tile level, or single CU level.

According to one embodiment of the present disclosure, it is proposed to signal a syntax element, referred to in the following description as "num_palette_predictor_initializer_minus1", to indicate the number of entries in the palette predictor initializer minus 1. As an example, the syntax may be signaled at different levels, e.g., at the video sequence level, picture level, tile group level, tile level, or single CU level.

According to one embodiment of the present disclosure, it is proposed to signal a syntax element, referred to in the following description as "palette_predictor_initializers[component][i]", to indicate the component value used to initialize the i-th entry of the palette predictor entry array "PredictorPaletteEntries". As an example, the syntax may be signaled at different levels, e.g., at the video sequence level, picture level, tile group level, tile level, or single CU level.

According to one embodiment of the present disclosure, it is proposed to signal a syntax element, referred to in the following description as "luma_bit_depth_entry_minus8_initializers", to indicate the bit-depth value of the luma component of an entry of the palette predictor initializer minus 8. As an example, the syntax may be signaled at different levels, e.g., at the video sequence level, picture level, tile group level, tile level, or single CU level.

According to one embodiment of the present disclosure, it is proposed to signal a syntax element, referred to in the following description as "chroma_bit_depth_entry_minus8_initializers", to indicate the bit-depth value of the chroma component of an entry of the palette predictor initializer minus 8. As an example, the syntax may be signaled at different levels, e.g., at the video sequence level, picture level, tile group level, tile level, or single CU level.

According to one embodiment of the present disclosure, it is proposed to signal a syntax element, referred to in the following description as "luma_bit_depth_entry_minus8", to indicate the bit-depth value of the luma component of an entry of the palette minus 8. As an example, the syntax may be signaled at different levels, e.g., at the video sequence level, picture level, tile group level, tile level, or single CU level.

According to one embodiment of the present disclosure, it is proposed to signal a syntax element, referred to in the following description as "chroma_bit_depth_entry_minus8", to indicate the bit-depth value of the chroma component of an entry of the palette minus 8. As an example, the syntax may be signaled at different levels, e.g., at the video sequence level, picture level, tile group level, tile level, or single CU level.
As an example, the video encoding method according to an exemplary embodiment of the present disclosure may further include: rate-distortion analysis is performed on the palette mode based on the precision of the internal bit depth.
According to an embodiment of the present disclosure, it is proposed to rate-distortion analyze a palette mode at an encoder using the precision of the inner bit depth. As an example, the rate-distortion analysis may include computation of distortion and rate.
As an example, the precision of the distortion calculation for selecting the index of the nearest palette entry may be equal to the precision of the internal bit depth. That is, the precision of the distortion calculation (e.g., Sum of Absolute Differences (SAD)) used to select the index of the nearest palette entry may be equal to the precision of the internal bit depth.
As an example, a rate-distortion analysis for deriving a palette in palette mode may be performed based on the precision of the internal bit depth. That is, rate-distortion analysis for deriving a palette in palette mode may be performed based on the internal bit depth precision. When deriving the palette on the encoder side, the cluster center may typically be used as a palette entry, but when considering the cost of encoding the palette entry, a rate-distortion analysis is performed to analyze whether any entry from the palette predictor may be more suitable as an updated palette entry relative to the center.
As an example, the video encoding method according to an exemplary embodiment of the present disclosure may further include: performing a related rate-distortion analysis of the palette mode. As an example, the chroma component distortion may be down-weighted in the total cost calculation of the related rate-distortion analysis.

As an example, different weights may be used for the luma component cost and the chroma component cost in the related rate-distortion analysis. According to an embodiment of the present disclosure, it is proposed to use different weights for the luma component cost and the chroma component cost in the related rate-distortion analysis of the palette mode. As an example, the related rate-distortion analysis may be used to derive the palette for a given CU, and it may also be used to determine the palette mode (e.g., index coding mode, copy-above mode, or index run mode, etc.) for a given sample value. As an example, the chroma component distortion may be down-weighted in the total cost calculation of such a related rate-distortion analysis for the palette mode. More specifically, the distortion of the chroma components, such as the Sum of Absolute Differences (SAD) and/or the Sum of Squared Differences (SSD), may be multiplied by a constant less than 1, such as 0.8 or 0.7. As another example, the chroma rate-distortion cost may be multiplied by a constant less than 1, such as 0.8 or 0.7, in the total cost calculation of such a related rate-distortion analysis for the palette mode.

As another example, different weights may be used for the L1-norm difference and the L2-norm difference in the related rate-distortion analysis. According to an embodiment of the present disclosure, it is proposed to use different weights for the L1-norm difference and the L2-norm difference in the related rate-distortion analysis of the palette mode. This analysis may be used to derive the palette for a given CU, or alternatively, it may also be used to determine the palette mode (e.g., index coding mode, copy-above mode, or index run mode, etc.) for a given sample value. As an example, the chroma component distortion may be down-weighted in the total cost calculation of such an analysis. As an example, the L1-norm distortion of the chroma components, e.g., the Sum of Absolute Differences (SAD), may be multiplied by a constant less than 1 (e.g., 0.8 or 0.7), while the L2-norm distortion, e.g., the Sum of Squared Differences (SSD), may be multiplied by a constant less than 1 such as 0.64 or 0.49.
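A hedged sketch of how such weighting could enter the total cost: the chroma distortion is scaled by a constant less than 1 before being added to the luma distortion and the rate term. The Lagrangian cost form and the names below are assumptions for illustration; the weights follow the examples in the text.

```python
# Sketch of a weighted palette-mode RD cost with down-weighted chroma
# distortion. The Lagrangian form dist + lambda * rate and the parameter
# names are assumptions; the weights follow the text's examples.
def palette_rd_cost(luma_dist, chroma_dist, rate_bits, lam,
                    chroma_weight=0.8):
    """Total RD cost with chroma distortion scaled by a constant < 1."""
    return luma_dist + chroma_weight * chroma_dist + lam * rate_bits
```

For SAD-based (L1) distortion a weight such as 0.8 could be used, and for SSD-based (L2) distortion a correspondingly smaller weight such as 0.64 could be passed instead.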
According to one embodiment of the present disclosure, it is proposed to remap Quantization Parameter (QP) values for the quantization process of escape samples. As an example, the QP range may be defined differently for escape sample quantization from quantization for other coding modes. In one embodiment, the minimum allowed QP for escape sample quantization may be defined as 4, since when QP equals 4, the quantization step size becomes 1. In another embodiment, the maximum allowed QP for escape sample quantization may be defined as 61.
According to the present disclosure, the remapping process may be derived by a specific equation. As an example, the actual quantization parameter value QPesca used for the quantization process of the escape samples may be derived by equation (8):

QPesca = MIN(((MAX(4, QPcu) - 2) / 6) * 6 + 4, 61)    (8)

where QPcu is the QP of the given CU; if the CU is coded in palette mode, the actual QP used for quantizing the escape samples of the CU is QPesca.
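Equation (8) can be checked with a one-line Python sketch (the function name is illustrative):

```python
# Sketch of the escape-sample QP remapping of equation (8): snap to the
# QP grid {4, 10, 16, ...} and clamp to the range [4, 61].
def remap_escape_qp(qp_cu):
    return min(((max(4, qp_cu) - 2) // 6) * 6 + 4, 61)
```

For instance, CU QPs of 0, 23, and 63 map to 4, 22, and 61, respectively.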
As an example, the number of distortion thresholds for the number of palette entries may be extended to 64 when deriving the palette table for the current CU in lossy palette mode. For example, an increased threshold may be associated with quantization error.
In the prior art, on the encoder side, the number of thresholds is 52, defined by a look-up table that provides one threshold per quantization parameter when deriving the palette for a CU. More specifically, to derive the palette for lossy coding, a modified k-means clustering algorithm is used. The first sample of the block is added to the palette; then, for each subsequent sample of the block, the Sum of Absolute Differences (SAD) between the sample and each current palette color is calculated. If, for the palette entry with the minimum SAD, the distortion of each component is less than the threshold, the sample is added to the cluster belonging to that palette entry; otherwise, the sample is added as a new palette entry.

According to an embodiment of the present disclosure, it is proposed to extend the number of thresholds to 64. As an example, the additional values may be generated by multiplying by a given scale (e.g., 1.05, 2^(1/6), or 2^(1/12)). As an example, the scale may relate to the quantization error. In one embodiment, the thresholds may be defined as the following table: g_paletteQuant[64] = {0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 7, 7, 8, 9, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 21, 22, 23, 24, 25, 26, 28, 29, 31, 32, 34, 36, 37, 39, 41, 42, 45, 48, 51, 54, 57, 60, 64, 67, 71, 76, 80, 85, 90}. In another embodiment, the thresholds may be defined as the following table: g_paletteQuant[64] = {0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 7, 7, 8, 9, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 21, 22, 23, 24, 25, 26, 28, 29, 31, 32, 34, 36, 37, 39, 41, 42, 45, 41, 57, 64, 71, 80, 90, 101, 113, 127, 143, 160, 180}.
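The modified k-means derivation described above might be sketched as follows, using the first extended threshold table. Keeping running cluster means and comparing per-component distortion with "<=" against the threshold are assumptions made for illustration.

```python
# Sketch of the modified k-means palette derivation with per-QP distortion
# thresholds (first extended table from the text). Running cluster means
# and the <= comparison against the threshold are assumptions.
G_PALETTE_QUANT = [
    0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 7,
    7, 8, 9, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 21, 22, 23, 24, 25,
    26, 28, 29, 31, 32, 34, 36, 37, 39, 41, 42, 45, 48, 51, 54, 57, 60,
    64, 67, 71, 76, 80, 85, 90,
]

def derive_palette(samples, qp):
    """samples: iterable of (Y, Cb, Cr) tuples; returns palette colors."""
    thr = G_PALETTE_QUANT[qp]
    clusters = []                              # list of (component_sums, count)
    for s in samples:
        best_idx, best_sad, best_center = None, None, None
        for idx, (sums, cnt) in enumerate(clusters):
            center = [t // cnt for t in sums]
            sad = sum(abs(a - b) for a, b in zip(s, center))
            if best_sad is None or sad < best_sad:
                best_idx, best_sad, best_center = idx, sad, center
        if best_idx is not None and all(
                abs(a - b) <= thr for a, b in zip(s, best_center)):
            sums, cnt = clusters[best_idx]     # join the closest cluster
            clusters[best_idx] = ([t + a for t, a in zip(sums, s)], cnt + 1)
        else:
            clusters.append((list(s), 1))      # start a new palette entry
    return [[t // cnt for t in sums] for sums, cnt in clusters]
```

At QP 30 (threshold 14), three identical gray samples collapse into one entry while a distant red sample opens a second entry.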
As an example, the video encoding method according to an exemplary embodiment of the present disclosure may further include: signaling at least one of a maximum palette table size and a palette predictor size, wherein at least one of the maximum palette table size and the palette predictor size is variable.
For the palette mode in VVC, the maximum palette table size and the maximum palette predictor size are fixed at 31 and 63, respectively. According to the present disclosure, the maximum palette table size and the maximum palette predictor size are allowed to change, which provides more flexibility for actual encoder/decoder devices and a variable performance/complexity trade-off. According to the present disclosure, by signaling the palette table size and the palette predictor size, palette coding efficiency can be improved at the cost of increased on-chip memory for the representative colors stored in the palette table and palette predictor.
Furthermore, the palette table generation and palette predictor update processes require multiple in-order checks at the encoder/decoder, and the number of check operations is proportional to the sizes of the palette table and palette predictor. For example, when generating the palette colors for a current CU, the colors in the palette predictor need to be checked (e.g., as indicated by the reuse flags); specifically, the palette predictor colors that are reused in the current CU are placed at the beginning of the palette table, followed by the new palette colors that are not included in the palette predictor. Furthermore, after decoding one palette CU, the palette predictor colors are updated by the following two-step "palette fill" process: 1) first, the palette colors of the current CU are included; 2) second, the colors of the previous palette predictor that are not used in the current palette are added. It can be seen that for both palette table generation and palette predictor update, the encoder and decoder need to perform multiple in-order checks, where the number of check operations is proportional to the sizes of the palette table and palette predictor.
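The two-step "palette fill" update described above might be sketched as below; the in-order membership check is the part whose cost grows with the table sizes. The function and parameter names are illustrative.

```python
# Sketch of the two-step palette predictor update: the current CU's
# palette first, then previous predictor colors not reused, capped at
# the maximum predictor size. Function and parameter names are assumed.
def update_palette_predictor(current_palette, old_predictor, max_pred_size):
    updated = list(current_palette)            # step 1: current palette
    for color in old_predictor:                # step 2: unused old colors
        if len(updated) >= max_pred_size:
            break
        if color not in current_palette:       # in-order reuse check
            updated.append(color)
    return updated
```

With a predictor size limit of 3, a current palette of two colors leaves room for exactly one unused color from the previous predictor.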
The present disclosure, in view of the above, proposes to explicitly signal the maximum palette table size and the maximum palette predictor size from the encoder to the decoder. As an example, the proposed palette signaling may be applied to different coding levels, e.g., Sequence Parameter Set (SPS), Picture Parameter Set (PPS), picture header, and slice header. Table 4 gives an example of signaling this syntax at the SPS level.
Table 4 syntax example
[syntax table shown as an image in the original document]
where palette_max_size represents the maximum allowed palette size, and when it is not present, the value of palette_max_size is inferred to be 0; delta_palette_max_predictor_size represents the difference between the maximum allowed palette predictor size and the maximum allowed palette size, and when it is not present, the value of delta_palette_max_predictor_size is inferred to be 0.
Furthermore, to limit worst-case implementation complexity, upper bounds on the allowed maximum palette table size and maximum palette predictor size may be specified, i.e., maxPaletteSizeUpbound and maxPalettePredSizeUpbound. As an example, a bitstream conformance constraint may be applied such that the decoded maximum palette table size and maximum palette predictor size do not exceed maxPaletteSizeUpbound and maxPalettePredSizeUpbound, respectively. For example, the values of maxPaletteSizeUpbound and maxPalettePredSizeUpbound may be set to 63 and 128, respectively.
As an example, the palette mode may be disabled for CUs having a size less than a first predetermined threshold.
To improve decoding throughput, the present disclosure proposes that the palette mode may be disabled for small blocks. In one embodiment, the palette mode may be disabled for all blocks with a size less than a certain threshold (e.g., 32 samples). In another embodiment, the palette mode may be disabled for all blocks having a size less than or equal to a certain threshold (e.g., 32 samples).
For the palette mode in VVC, the palette mode may be applied to CUs equal to or smaller than 64 × 64 pixels. In an embodiment of the present disclosure, to reduce complexity, it is proposed to disable the palette mode for small blocks. In one embodiment, the palette mode may be disabled for all blocks with a size less than or equal to a certain threshold (e.g., 16 samples).
As an example, in the local dual-tree case, the palette mode may be disabled for chroma CUs having a size less than a second predetermined threshold. Specifically, the chroma components may be considered separately in the dual-tree case, where the palette mode is disabled for chroma CUs that are less than or equal to a certain threshold (e.g., 16 pixels). Table 5 gives an example of signaling the proposed syntax in the VVC Draft, where modifications to the VVC Draft are highlighted and the newly added parts are underlined.
TABLE 5 syntax example
[syntax table shown as an image in the original document]
In another embodiment of the present disclosure, for the single-tree case, it is proposed to disable the palette mode for CUs with small luma blocks. As an example, in the single-tree case, the palette mode may be disabled for CUs whose luma block has a total size less than a third predetermined threshold. As an example, in the single-tree case, the palette mode may be disabled for CUs with a luma block less than or equal to 16 pixels. In one particular example with the YUV420 format, the palette mode may be enabled for an 8 × 4 CU containing 8 × 4 luma samples and two 4 × 2 chroma blocks, since palette enablement is conditioned on the luma sample size without regard to the chroma size.
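Under the thresholds quoted above (16-sample minimum, 64 × 64 cap), the single-tree enabling condition could be sketched as below; the function and its defaults are illustrative, and only the luma size is tested, which is why an 8 × 4 CU in 4:2:0 qualifies.

```python
# Sketch of the single-tree palette enabling rule discussed above: the
# decision depends only on the luma block size. Thresholds follow the
# text's examples; the function itself is illustrative.
def palette_allowed_single_tree(luma_w, luma_h, min_samples=16, max_dim=64):
    if luma_w > max_dim or luma_h > max_dim:
        return False                           # above the 64x64 CU cap
    return luma_w * luma_h > min_samples       # strictly more than 16 samples
```

An 8 × 4 CU (32 luma samples) is allowed, while a 4 × 4 CU (exactly 16 samples) is not.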
In another embodiment of the present disclosure, for the local dual-tree case, it is proposed to disable the palette mode for small blocks. As an example, in the local dual-tree case, the palette mode may be disabled for CUs having a size less than a fourth predetermined threshold. As an example, in the local dual-tree case, the palette mode may be disabled for CUs less than or equal to 32 pixels.
As an example, in the local dual-tree case, the palette mode may be disabled.
In the VVC standard, in the single-tree case, the palette mode is applied to CUs having a luma block equal to or smaller than 64 × 64 pixels and larger than 4 × 4 pixels; in the dual-tree case, the palette mode applies to CUs that are equal to or smaller than 64 × 64 pixels and larger than 4 × 4 pixels for both luma and chroma. In an embodiment of the present disclosure, to reduce complexity, it is proposed to disable the palette mode in the local dual-tree case. An example of signaling the proposed syntax on the VVC draft is given in the following table. Table 6 highlights the modifications to the VVC draft for the case where the modeType of the CU is equal to MODE_TYPE_INTRA, indicating that the CU is in a local dual-tree case; the newly added parts are underlined.
TABLE 6 syntax example
[syntax table shown as an image in the original document]
As an example, in the local dual-tree case, the palette mode may be disabled for chroma CUs. In particular, in another embodiment of the present disclosure, for the local dual-tree case, it is proposed to disable the palette mode only for the chroma components; in other words, in the local dual-tree case, the palette mode applies to luma CUs but not to chroma CUs. The following table gives an example of signaling the proposed syntax on the VVC draft, for the case where the modeType of the CU is equal to MODE_TYPE_INTRA and the treeType of the CU is equal to DUAL_TREE_CHROMA, indicating that the CU is a chroma component in the local dual-tree case. Table 7 highlights the modifications to the VVC draft, where the newly added parts are underlined.
TABLE 7 syntax example
[syntax table shown as an image in the original document]
According to the current VVC standard, in the local dual-tree case, the update process of the palette predictor is performed only for the chroma components. More specifically, under a local dual tree, the palette predictor may not be updated when each palette-mode luma CU is encoded; the palette predictor may be updated after encoding the last chroma component of each palette-mode chroma CU under the local dual tree.
The update process of the palette prediction defined in the VVC standard as described above is not efficient for coding performance. According to one embodiment of the present disclosure, in order to improve coding efficiency, it is proposed to perform an update process of palette prediction on both a luminance CU and a chrominance CU in a local dual-tree case. More specifically, as an example, under a local dual-tree, each luma CU may be encoded while palette prediction is updated, and then each chroma CU may be encoded under the same local dual-tree. The following table gives an example of the proposed syntax signaled based on the VVC draft. In the VVC draft, the variable cIdx specifies the color component of the current CU, 0 for the luma component, 1 for the Cb component, and 2 for the Cr component. Table 8 highlights the modifications to the VVC draft, where the newly added parts are underlined.
TABLE 8 syntax example
[syntax table shown as an image in the original document]
As described above, in the local dual-tree case, the update process of palette prediction may be performed for both luma CU and chroma CU. More specifically, the palette prediction may be updated first while each luma CU is encoded under a local dual-tree, and then each chroma CU is encoded under the same local dual-tree.
As an example, in the local dual-tree case, palette table updates may be disabled and a shared palette table used. The present disclosure considers that CUs under a local dual tree are all small CUs, and performing the palette predictor update process for these CUs sequentially requires many computation cycles. According to one embodiment of the present disclosure, to reduce complexity, it is proposed to use one shared palette table for some or all CUs in the local dual-tree case, without updating the shared palette table.
As an example, in a partial dual tree case, palette table updates may be disabled. In particular, in one embodiment, for the local dual tree case, it is proposed to disable the update process of palette prediction in palette mode. The following table gives an example of the proposed syntax signaled based on the VVC draft. In the VVC draft, the variable cIdx specifies the color component of the current CU, 0 for the luma component, 1 for the Cb component, and 2 for the Cr component. Table 9 highlights the modifications to the VVC draft, where newly added sections are underlined and deleted sections are marked with strikethrough.
TABLE 9 syntax example
[syntax table shown as an image in the original document]
As an example, in the local dual-tree case, palette table updates may be disabled for CUs whose luma samples have a total size less than a fifth predetermined threshold. In another embodiment, in the local dual-tree case, the update process of palette prediction in palette mode may be disabled for CUs with a luma block size less than or equal to 32 pixels. In this case, the update process of palette prediction in palette mode is enabled for CUs that include at least 8 × 8 luma samples.
As an example, in a local dual-tree case, palette table updates may be disabled for chroma CUs. In another embodiment of the present disclosure, for the local dual-tree case, it is proposed to disable the update process of the palette prediction for chroma CUs only. The following table gives one example of signaling the proposed syntax on the VVC draft. In the VVC draft, the variable cIdx specifies the color component of the current CU, 0 for the luma component, 1 for the Cb component, and 2 for the Cr component. Table 10 highlights the modifications to the VVC draft, where the newly added sections are underlined and the deleted sections are marked with strikethrough.
TABLE 10 syntax example
[syntax table shown as an image in the original document]
As an example, the video encoding method according to an exemplary embodiment of the present disclosure may further include: performing the palette prediction update process independently for different color components in the local dual tree, so that palette mode coding can be performed in parallel for the luma component and the chroma components.
As described above, in the case of a local dual tree, the update processing of the palette prediction is performed sequentially. This also means that the decoding of palette mode chroma CUs in the local dual tree cannot start before all luma CUs in the same local dual tree are decoded, which may cause a delay in hardware codec implementations.
To solve this problem, according to one embodiment of the present disclosure, it is proposed to perform update processing independently for different color components (e.g., luminance and chrominance) in a local dual tree, so that palette mode encoding of a chrominance component can be performed in parallel with a luminance component. As an example, under a local dual tree, the palette at the beginning of the local dual tree is used as the starting palette for both luma CU and chroma CU.
According to one embodiment of the present disclosure, in the local dual-tree case, the update process of palette prediction is performed separately for luma and chroma components. More specifically, the palette prediction may be updated first while each luma CU is encoded under a local dual-tree, and then each chroma CU is encoded under the same local dual-tree. Thus, when the palette prediction is updated while the luma CU is encoded under the local dual-tree, the chroma information of the co-located pixels may not be available, and vice versa. In the present disclosure, to improve coding efficiency, it is proposed that when updating the palette prediction while coding a CU of one color component (e.g., luma and/or chroma) under a local dual-tree, another color component (e.g., chroma and/or luma) value of a previously available candidate in the palette may be used. In one example of the local dual tree case, the chroma component of the first available candidate may be used as the chroma component of the newly added palette entry during the update of the palette prediction for the luma component. The following table gives an example of the proposed syntax signaled based on the VVC draft. Table 11 highlights the modifications to the VVC draft, where the newly added parts are underlined.
TABLE 11 syntax example
(The syntax table is reproduced as an image in the original publication and is not shown here.)
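The per-component update described above, where a newly added luma-only entry borrows its chroma components from the first available candidate in the palette, can be sketched as follows. The entry layout `(y, cb, cr)`, the function name, and the fallback default are illustrative assumptions.

```python
# Hedged sketch: while coding luma CUs under a local dual tree, a new
# palette entry carries only a Y value, so its Cb/Cr components are
# borrowed from the first available candidate already in the palette.
def add_luma_entry(palette: list, y_value: int,
                   default_chroma: int = 512) -> None:
    """Append a new entry whose chroma is copied from the first candidate."""
    if palette:  # borrow Cb/Cr from the first available candidate
        _, cb, cr = palette[0]
    else:        # no candidate yet: fall back to a default mid-level value
        cb, cr = default_chroma, default_chroma
    palette.append((y_value, cb, cr))


pal = [(100, 300, 400)]
add_luma_entry(pal, 220)
# the new entry reuses Cb=300, Cr=400 from the first candidate
```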
As an example, when encoding a luma CU and/or a chroma CU under a local dual tree, all component values of the palette entries copied from the palette predictor may be used when updating the palette predictor using the entries copied from the palette predictor.
In the current VVC standard, the palette predictor needs to be maintained in order to encode the palette. For each entry in the palette predictor, a reuse flag is signaled to indicate whether it is part of the current palette of the CU. For example, run-length coding of zeros may be used to transmit the reuse flags. When a reuse flag is set, the corresponding entry in the palette predictor is copied to an entry of the palette table of the currently coded CU. After copying the entries from the palette predictor, the number of new palette entries and the component values of the new palette entries are signaled. However, entries copied from the palette predictor for a luma CU under the local dual tree may not include the chroma values of those entries in the palette predictor, and vice versa.
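The copy-then-append construction described above can be sketched as follows. This is an illustrative sketch: the run-length entropy coding of the reuse flags mentioned above is omitted, and the function and variable names are assumptions.

```python
# Illustrative sketch of building a CU's palette table from the predictor:
# each predictor entry has a signalled reuse flag; flagged entries are
# copied over, then the signalled new entries are appended.
def build_palette(predictor, reuse_flags, new_entries):
    """Copy reused predictor entries, then append newly signalled entries."""
    palette = [entry for entry, used in zip(predictor, reuse_flags) if used]
    palette.extend(new_entries)
    return palette


pred = [(10, 20, 30), (40, 50, 60), (70, 80, 90)]
cur = build_palette(pred, [True, False, True], [(1, 2, 3)])
# cur == [(10, 20, 30), (70, 80, 90), (1, 2, 3)]
```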
After encoding the palette coded CU, the palette predictor will be updated using the current palette table, but for those luma and chroma palette coded CUs under the local dual tree, the entries copied from the palette predictor are incomplete because the entries copied from the palette predictor for luma CUs do not contain chroma values, and vice versa. Incomplete entries result in undefined behavior of the decoding process.
In one embodiment of the present disclosure, to solve the above problem, it is proposed that, under the local dual tree, entries copied from the palette predictor should contain the values of all components, whether for a luma CU or a chroma CU. In other words, it is proposed that, when encoding a luma CU and/or a chroma CU under the local dual tree, all component (e.g., Y, Cb, Cr) values of the palette entries copied from the palette predictor are used when updating the palette predictor with those entries. In one example of the local dual-tree case, during the update of the palette predictor for a luma CU, the chroma components of the entries in the palette predictor may be used as the chroma components of the predicted palette entries. The following table gives an example of the proposed syntax signaled based on the VVC draft. Table 12 highlights the modifications to the VVC draft, where the newly added parts are underlined.
Table 12 syntax example
(The syntax table is reproduced as an image in the original publication and is not shown here.)
As an example, when encoding luma CUs and/or chroma CUs under a local dual tree, component values missing from palette entries copied from the palette predictor may be replaced with default values when updating the palette predictor using the entries copied from the palette predictor. For example, the default value may be related to an internal bit depth.
To solve the above problem, in another embodiment of the present disclosure, under the local dual tree, a default value is used as the chroma value of entries copied from the palette predictor for a luma CU, and as the luma value of entries copied from the palette predictor for a chroma CU. In other words, it is proposed that, when encoding luma CUs and/or chroma CUs under a local dual tree, default values are used as the values of the missing components when updating the palette predictor using entries copied from the palette predictor. Specifically, in the current VVC standard, under the local dual tree, the entries copied from the palette predictor for a luma CU do not contain chroma values; in the method proposed by the present disclosure, a default value is used as the chroma value of such entries. As an example, the default value may depend on the internal bit depth, e.g., 1 << (BitDepth - 1). The following table gives one example of signaling the proposed syntax on the VVC draft. Table 13 highlights changes to the VVC draft, where newly added sections are underlined and deleted sections are marked with strikethrough.
TABLE 13 syntax example
(The syntax table is reproduced as an image in the original publication and is not shown here.)
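The default-value rule above, 1 << (BitDepth - 1), can be worked through with a short sketch. The helper names are illustrative; only the formula comes from the text.

```python
# Sketch of the default-value rule: components missing from entries copied
# out of the predictor (chroma for luma CUs, luma for chroma CUs under a
# local dual tree) are replaced by the mid-level value 1 << (BitDepth - 1).
def default_component(bit_depth: int) -> int:
    return 1 << (bit_depth - 1)


def complete_luma_entry(y_value: int, bit_depth: int):
    """Fill in the missing chroma components of a luma-only entry."""
    mid = default_component(bit_depth)
    return (y_value, mid, mid)


# For 10-bit video the default is 512, so a luma-only entry with Y=700
# becomes (700, 512, 512); for 8-bit content the default is 128.
```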
Fig. 10 illustrates a flowchart of a video decoding method according to an exemplary embodiment of the present disclosure.
Referring to fig. 10, in step S201, a bitstream is received and parsed.
In step S202, a plurality of coding units CU into which the video image is divided are acquired from the parsed bitstream.
In step S203, a palette table of at least one CU partitioned under the same parent node among the plurality of CUs is predicted based on the shared palette table. Wherein the shared palette table is a palette table derived at one of all ancestor nodes of the at least one CU.
As an example, the one ancestor node may be: an ancestor node of all ancestor nodes of the at least one CU that meets a predetermined condition with respect to a predetermined size threshold.
As an example, the one ancestor node may be: a largest ancestor node of all ancestor nodes of the at least one CU that is equal to or less than the predetermined size threshold; or the one ancestor node may be: a smallest ancestor node of all ancestor nodes of the at least one CU that is equal to or greater than the predetermined size threshold.
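The two ancestor-selection rules above (largest ancestor not exceeding the threshold, or smallest ancestor not below it) can be sketched as follows. The node representation as `(size, palette)` pairs and the rule names are assumptions made for illustration.

```python
# Hedged sketch of selecting the ancestor node whose derived palette
# table is shared by the CUs partitioned under the same parent node.
def pick_shared_ancestor(ancestors, threshold, rule="largest_leq"):
    """Return the ancestor whose derived palette table is shared."""
    if rule == "largest_leq":
        # largest ancestor with size <= threshold
        ok = [a for a in ancestors if a[0] <= threshold]
        return max(ok, key=lambda a: a[0]) if ok else None
    # otherwise: smallest ancestor with size >= threshold
    ok = [a for a in ancestors if a[0] >= threshold]
    return min(ok, key=lambda a: a[0]) if ok else None


# Ancestors listed from the root down, as (size, derived palette) pairs.
nodes = [(256, "palA"), (64, "palB"), (16, "palC")]
assert pick_shared_ancestor(nodes, 64) == (64, "palB")
assert pick_shared_ancestor(nodes, 32, rule="smallest_geq") == (64, "palB")
```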
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: the chroma palette table of a CU is predicted based on the luma palette table of the CU using a cross-component linear model.
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: in palette mode, receiving signaling about a scanning direction of a CU, wherein the signaling is sent based on different contexts of a shape of the CU; or in the palette mode, when a shape of a CU satisfies a predetermined shape condition, determining a scanning direction of the CU based on the shape of the CU without receiving signaling regarding the scanning direction of the CU.
As an example, the step of determining the scanning direction of the CU based on the shape of the CU may comprise: determining a same direction as a longer side of the CU as a scanning direction of the CU, or determining a same direction as a shorter side of the CU as a scanning direction of the CU.
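The shape-based inference of the scan direction described above can be sketched as follows. The direction labels and function name are illustrative assumptions.

```python
# Sketch: infer the palette scan direction from the CU's shape instead of
# receiving signaling, choosing either the longer-side or shorter-side rule.
def infer_scan_direction(width: int, height: int,
                         along_longer_side: bool = True) -> str:
    """Pick a horizontal or vertical scan from the CU's aspect ratio."""
    longer_is_horizontal = width >= height
    if along_longer_side:
        return "horizontal" if longer_is_horizontal else "vertical"
    return "vertical" if longer_is_horizontal else "horizontal"


# A 32x8 CU scans horizontally under the longer-side rule,
# and vertically under the shorter-side rule.
```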
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: in palette mode, a CU is divided into multiple segments and palette-related syntax for the individual segments of the same CU is decoded independently.
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: in the palette mode, a CU is divided into a plurality of segments, and palette related data of the respective segments of the same CU are independently cached.
As an example, the step of dividing the CU into a plurality of segments may comprise: dividing the CU into the plurality of segments according to the scanning direction of the CU; or, the CU is divided into the plurality of segments according to a binary tree or quadtree partitioning structure.
As an example, the multiple segments of the same CU may share one palette table, or each segment of the same CU may use a respective palette table separately.
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: in the case of a separate block tree structure, the palette index map of the corresponding chroma CU is predicted based on the palette index map of the luma CU in a cross-component manner.
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: signaling is received regarding delta QP for a CU encoded in palette mode.
As an example, the step of receiving signaling of delta QP for a CU encoded in palette mode may comprise: receiving signaling of delta QP for a CU encoded in palette mode, if there are escape samples; or always receive signaling of delta QP for a CU encoded in palette mode; or receiving signaling regarding delta QP for a luma component of the CU encoded in the palette mode and signaling regarding delta QP for a chroma component of the CU encoded in the palette mode, respectively.
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: receiving information of a QP and determining whether the current palette mode is a lossless palette mode based on the information of the QP, wherein, in the case that the value of the QP is equal to or less than a predetermined threshold, the current palette mode is determined to be the lossless palette mode; and in the case that the value of the QP is greater than the predetermined threshold, the current palette mode is determined not to be the lossless palette mode.
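The QP-based lossless determination described above reduces to a single comparison. The threshold value used here (QP 4, where the quantization step size is effectively 1) is an assumption for illustration; the text only says "a predetermined threshold".

```python
# Minimal sketch of the QP-based lossless test: the decoder compares the
# received QP against a threshold instead of parsing a dedicated flag.
LOSSLESS_QP_THRESHOLD = 4  # illustrative value, not fixed by the text


def is_lossless_palette_mode(qp: int,
                             threshold: int = LOSSLESS_QP_THRESHOLD) -> bool:
    """Lossless palette mode is inferred when QP <= threshold."""
    return qp <= threshold
```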
As an example, the operation of inverse quantizing the escape samples in the palette mode may be the same as the operation of inverse quantizing the samples in the other modes.
As an example, the other modes may include at least one of: transform skip mode, transform mode, quantized residual differential pulse code modulation, RDPCM, mode.
As an example, the operation of lossless decoding of escape samples in palette mode may be the same as the operation of lossless decoding of samples in other modes.
As an example, the other mode may be a transform skip mode.
As an example, the operation of lossless decoding of escape samples in palette mode may not include an inverse quantization operation, or the inverse quantization operation in the operation of lossless decoding of escape samples in palette mode may be performed based on a QP less than or equal to a predetermined threshold.
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: performing one of the following inverse binarization processes on the escape sample: fixed length inverse binarization processing, k-th order exponential golomb inverse binarization processing, truncated binary codeword inverse binarization processing, wherein parameters of the inverse binarization processing are determined based on parameters of a CU currently being processed.
As an example, the fixed length may be determined based on the size of the QP and/or bit depth; or K may be determined based on the size of the QP and/or bit depth; or the maximum value of the truncated binary codeword may be determined based on the size of the QP and/or bit depth.
As an example, the escape samples may be fixed length inverse binarized in lossless palette mode.
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: the manner in which the inverse binarization processing is performed on the escape samples is adaptively selected at different decoding levels.
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: receiving signaling regarding palette mode related information, wherein the palette mode related information comprises at least one of: information indicating a maximum allowed palette size; information indicating a maximum allowed palette area; information indicating a maximum allowed palette predictor size; information indicating a difference between the maximum allowed palette predictor size and the maximum allowed palette size; information indicating that syntax for initializing the sequence palette predictor is to be transmitted; information indicating the number of entries of the palette predictor initializer minus 1; information indicating the component value used to initialize the i-th palette entry of the palette predictor array; information indicating the bit depth value of the luma component of the palette predictor initializer entries minus 8; information indicating the bit depth value of the chroma components of the palette predictor initializer entries minus 8; information indicating the bit depth value of the luma component of the palette entries minus 8; information indicating the bit depth value of the chroma components of the palette entries minus 8.
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: deriving the actual value QPesca of the quantization parameter QP used for the inverse quantization process of the escape samples by:

QPesca = MIN(((MAX(4, QPcu) - 2) / 6) * 6 + 4, 61)
Where QPcu is the QP value for the current CU.
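The clip above can be transcribed directly, with the integer division the formula implies; the function name is illustrative.

```python
# Direct transcription of the escape-sample QP derivation:
# QPesca = MIN(((MAX(4, QPcu) - 2) / 6) * 6 + 4, 61)
def escape_qp(qp_cu: int) -> int:
    """Derive the QP actually used to inverse-quantize escape samples."""
    return min(((max(4, qp_cu) - 2) // 6) * 6 + 4, 61)


# escape_qp(20) -> 22; escape_qp(0) -> 4; escape_qp(63) -> 61
```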
As an example, the number of distortion thresholds for the number of palette entries when deriving the palette table of the current CU in lossy palette mode may be extended to 64, where an increased threshold is related to quantization error.
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: receiving signaling regarding at least one of a maximum palette table size and a palette predictor size, wherein the at least one of the maximum palette table size and the palette predictor size is variable.
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: disabling the palette mode for CUs having a size less than a first predetermined threshold; or in the dual-tree case, disabling the palette mode for chroma CUs having a size less than a second predetermined threshold; or in the single tree case, disabling the palette mode for CUs having chroma samples with a total size less than a third predetermined threshold; or in the case of local dual-tree, disabling the palette mode for CUs having a size less than a fourth predetermined threshold; or in the case of local dual-tree, disabling the palette mode; or in the local dual-tree case, the palette mode is disabled for chroma CUs.
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: in the local dual-tree case, the update process of the palette prediction is performed on both the luminance CU and the chrominance CU, where the luminance CU is decoded while the palette prediction is updated, and then the chrominance CU is decoded under the same local dual-tree.
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: in the local dual-tree case, disabling palette table updates for the shared palette table; or in the local dual-tree case, disabling palette table updates; or in the local dual-tree case, disabling palette table updates for CUs having luma samples with a total size less than a fifth predetermined threshold; or in the local dual-tree case, disabling palette table updates for chroma CUs.
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: independently performing palette prediction update processing for different color components in the local dual tree; where palette mode decoding is performed in parallel for the luma component and the chroma components, or the luma CU is decoded while the palette prediction is updated, and then the chroma CU is decoded under the same local dual-tree.
As an example, the step of independently performing the update process of the palette prediction for different color components in the local dual tree may include: when updating the palette prediction while decoding a CU of one color component under the local dual tree, the value of another color component of a previously available candidate in the palette is used, wherein a luma CU is decoded while updating the palette prediction, and then a chroma CU is decoded under the same local dual tree.
As an example, the video decoding method according to an exemplary embodiment of the present disclosure may further include: when decoding a luma CU and/or a chroma CU under a local dual tree, when updating a palette predictor using entries copied from the palette predictor, using all component values of the palette entries copied from the palette predictor; alternatively, when decoding a luma CU and/or a chroma CU under a local dual-tree, when updating a palette predictor using entries copied from the palette predictor, component values missing from the palette entries copied from the palette predictor are replaced with default values.
As an example, the default value may be related to an internal bit depth.
For a specific processing example of the video decoding method, reference may be made to the specific processing of the video encoding method, which is not described herein again.
Fig. 11 illustrates a block diagram of a video encoding apparatus according to an exemplary embodiment of the present disclosure.
Referring to fig. 11, the video encoding apparatus 10 according to an exemplary embodiment of the present disclosure includes: CU dividing unit 101 and palette table prediction unit 102.
Specifically, the CU dividing unit 101 is configured to: the video image is divided into a plurality of coding units CU.
The palette table prediction unit 102 is configured to: predicting a palette table of at least one CU divided under the same father node in the plurality of CUs based on the shared palette table; wherein the shared palette table is a palette table derived at one of all ancestor nodes of the at least one CU.
As an example, the one ancestor node may be: an ancestor node of all ancestor nodes of the at least one CU that meets a predetermined condition with respect to a predetermined size threshold.
As an example, the one ancestor node may be: a largest ancestor node of all ancestor nodes of the at least one CU that is equal to or less than the predetermined size threshold; or the one ancestor node may be: a smallest ancestor node of all ancestor nodes of the at least one CU that is equal to or greater than the predetermined size threshold.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a chroma palette table prediction unit (not shown) configured to: the chroma palette table of a CU is predicted based on the luma palette table of the CU using a cross-component linear model.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a signal transmitting unit (not shown) or a scanning direction determining unit (not shown), the signal transmitting unit being configured to: in palette mode, the scanning direction of a CU is signaled based on different contexts of the shape of the CU. The scanning direction determination unit is configured to: in the palette mode, when a shape of a CU satisfies a predetermined shape condition, a scanning direction of the CU is determined based on the shape of the CU without signaling the scanning direction of the CU.
As an example, the scanning direction determining unit may determine a same direction as a longer side of the CU as the scanning direction of the CU, or determine a same direction as a shorter side of the CU as the scanning direction of the CU.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a slice encoding unit (not shown) configured to: in palette mode, a CU is divided into multiple segments and palette related syntax for each segment of the same CU is encoded independently.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a cache unit (not shown) configured to: in the palette mode, a CU is divided into a plurality of segments, and palette related data of the respective segments of the same CU are independently cached.
As an example, the process of dividing the CU into a plurality of slices may include: dividing the CU into the plurality of segments according to the scanning direction of the CU; or, the CU is divided into the plurality of segments according to a binary tree or quadtree partitioning structure.
As an example, the multiple segments of the same CU may share one palette table, or each segment of the same CU may use a respective palette table separately.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a palette index map prediction unit (not shown) configured to: in the case of a separate block tree structure, the palette index map of the corresponding chroma CU is predicted based on the palette index map of the luma CU in a cross-component manner.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a signal transmitting unit (not shown) configured to: signaling delta QP for a CU encoded in palette mode.
As an example, the signaling unit may signal delta QP for a CU encoded in palette mode if there are escape samples; or
The signaling unit may always signal a delta QP for a CU encoded in palette mode; or
The signaling unit may signal a delta QP for a luma component and a delta QP for a chroma component of the CU encoded in the palette mode, respectively.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a signal transmitting unit (not shown) configured to: signaling information of a QP to indicate whether a current palette mode is a lossless palette mode, wherein in case a value of the QP is equal to or less than a predetermined threshold, indicating that the current palette mode is a lossless palette mode; indicating that the current palette mode is not a lossless palette mode if the value of the QP is greater than a predetermined threshold.
As an example, the operation of quantizing the escape samples in the palette mode may be the same as the operation of quantizing the samples in the other modes.
As an example, the other modes may include at least one of: transform skip mode, transform mode, quantized residual differential pulse code modulation, RDPCM, mode.
As an example, the operation of lossless coding of escape samples in palette mode may be the same as the operation of lossless coding of samples in other modes.
As an example, the other mode may be a transform skip mode.
As an example, the operation of lossless coding of escape samples in palette mode may not include a quantization operation, or the quantization operation in the operation of lossless coding of escape samples in palette mode may be performed based on a QP less than or equal to a predetermined threshold.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a binarization processing unit (not shown), which may be configured to: performing one of the following binarization processes on the escape sample: fixed length binarization processing, k-th order exponential golomb binarization processing, truncated binary codeword binarization processing, wherein parameters of the binarization processing are determined based on parameters of a CU currently being processed.
As an example, the fixed length may be determined based on the size of the QP and/or bit depth; or K may be determined based on the size of the QP and/or bit depth; or the maximum value of the truncated binary codeword may be determined based on the size of the QP and/or bit depth.
As an example, the binarization processing unit may perform fixed-length binarization processing on the escape samples in the lossless palette mode.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a binarization processing unit (not shown) configured to: the way of binarization processing of the escape samples is adaptively selected at different coding levels.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a signal transmitting unit (not shown) configured to: signaling palette mode related information;
wherein the palette mode related information may include at least one of: information indicating a maximum allowed palette size; information indicating a maximum allowed palette area; information indicating a maximum allowed palette predictor size; information indicating a difference between the maximum allowed palette predictor size and the maximum allowed palette size; information indicating that syntax for initializing the sequence palette predictor is to be transmitted; information indicating the number of entries of the palette predictor initializer minus 1; information indicating the component value used to initialize the i-th palette entry of the palette predictor array; information indicating the bit depth value of the luma component of the palette predictor initializer entries minus 8; information indicating the bit depth value of the chroma components of the palette predictor initializer entries minus 8; information indicating the bit depth value of the luma component of the palette entries minus 8; information indicating the bit depth value of the chroma components of the palette entries minus 8.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a rate-distortion analysis unit (not shown) configured to: rate-distortion analysis is performed on the palette mode based on the precision of the internal bit depth.
As an example, the rate-distortion analysis unit may perform rate-distortion analysis for deriving the palette in the palette mode based on the precision of the internal bit depth; or the accuracy of the distortion calculation for selecting the index of the nearest palette entry may be equal to the accuracy of the internal bit depth.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a correlation rate distortion analysis unit (not shown) configured to: performing a palette-mode correlation rate-distortion analysis, wherein different weights are used for the luma component cost and the chroma component cost in the correlation rate-distortion analysis, or different weights are used for the L1 norm difference and the L2 norm difference in the correlation rate-distortion analysis.
As an example, the chrominance component distortion may be reduced in the overall cost calculation of the correlation rate-distortion analysis.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a QP actual value determination unit (not shown) configured to: derive the actual value QPesca of the quantization parameter QP used for the quantization process of the escape samples by:

QPesca = MIN(((MAX(4, QPcu) - 2) / 6) * 6 + 4, 61)
Where QPcu is the QP value for the current CU.
As an example, the number of distortion thresholds for the number of palette entries when deriving the palette table of the current CU in lossy palette mode may be extended to 64, where an increased threshold is related to quantization error.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a signal transmitting unit (not shown) configured to: signaling at least one of a maximum palette table size and a palette predictor size, wherein at least one of the maximum palette table size and the palette predictor size is variable.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a palette mode disabling unit (not shown) configured to:
disabling the palette mode for CUs having a size less than a first predetermined threshold; or
In the dual-tree case, the palette mode is disabled for chroma CUs that are smaller in size than a second predetermined threshold; or
In a single tree case, disabling palette mode for CUs having chroma samples with a total size less than a third predetermined threshold; or
In the local dual-tree case, disabling the palette mode for CUs having a size less than a fourth predetermined threshold; or
In the case of local dual trees, the palette mode is disabled; or
In the local dual-tree case, the palette mode is disabled for chroma CUs.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a palette prediction update unit (not shown) and a palette mode encoding unit (not shown), the palette prediction update unit being configured to: in the local dual-tree case, the update process of the palette prediction is performed on both the luminance CU and the chrominance CU; the palette mode encoding unit is configured to: the luma CU is coded while the palette prediction is updated, and then the chroma CU is coded under the same local dual-tree.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a palette table update disabling unit (not shown),
the palette table update disabling unit is configured to:
In the local dual-tree case, disabling palette table updates for the shared palette table; or
In the case of local dual-tree, disabling palette table updates; or
In the local dual-tree case, disabling palette table updates for CUs having luma samples with a total size less than a fifth predetermined threshold; or
In the local dual-tree case, palette table updates are disabled for chroma CUs.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a palette prediction update unit (not shown) and a palette mode coding unit (not shown).
The palette prediction update unit is configured to: independently performing palette prediction update processing for different color components in the local dual tree; the palette mode encoding unit is configured to: palette mode coding is performed for the luma component and the chroma component in parallel, or the luma CU is coded while the palette prediction is updated, and then the chroma CU is coded under the same local dual tree.
As an example, the palette mode encoding unit may encode a luma CU while updating the palette prediction, and then encode a chroma CU under the same local dual tree; when encoding a CU of one color component under the local dual tree, the palette prediction update unit reuses the value of the other color component from a previously available candidate in the palette when updating the palette prediction.
As an example, the video encoding apparatus according to an exemplary embodiment of the present disclosure may further include: a palette predictor update unit (not shown) configured to: when encoding a luma CU and/or a chroma CU under a local dual tree, when updating a palette predictor using entries copied from the palette predictor, using all component values of the palette entries copied from the palette predictor; alternatively, when encoding a luma CU and/or a chroma CU under a local dual-tree, when updating a palette predictor using entries copied from the palette predictor, component values missing from the palette entries copied from the palette predictor are replaced with default values.
As an example, the default value may be related to an internal bit depth.
Fig. 12 illustrates a block diagram of a video decoding apparatus according to an exemplary embodiment of the present disclosure.
Referring to fig. 12, the video decoding apparatus 20 according to an exemplary embodiment of the present disclosure includes: a reception parsing unit 201, a CU division unit 202, and a palette table prediction unit 203.
Specifically, the reception parsing unit 201 is configured to: receive and parse a bitstream.
The CU dividing unit 202 is configured to: obtain, from the parsed bitstream, a plurality of coding units (CUs) into which a video image is divided.
The palette table prediction unit 203 is configured to: predict, based on a shared palette table, a palette table of at least one CU divided under the same parent node among the plurality of CUs; wherein the shared palette table is a palette table derived at one of all ancestor nodes of the at least one CU.
As an example, the one ancestor node may be: an ancestor node of all ancestor nodes of the at least one CU that meets a predetermined condition with respect to a predetermined size threshold.
As an example, the one ancestor node may be: a largest ancestor node of all ancestor nodes of the at least one CU that is equal to or less than the predetermined size threshold; or the one ancestor node may be: a smallest ancestor node of all ancestor nodes of the at least one CU that is equal to or greater than the predetermined size threshold.
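Both selection rules above can be sketched as follows; the list-of-dicts representation and the `size` field are hypothetical stand-ins for the codec's real partition-tree node structures:

```python
def pick_shared_palette_node(ancestors, size_threshold):
    """Pick the ancestor node whose derived palette table is shared.

    `ancestors` holds every ancestor of the CU(s), each with its area
    in samples. Returns both alternatives described above:
    option_a - largest ancestor not exceeding the threshold;
    option_b - smallest ancestor not below the threshold.
    """
    at_most = [n for n in ancestors if n["size"] <= size_threshold]
    option_a = max(at_most, key=lambda n: n["size"]) if at_most else None
    at_least = [n for n in ancestors if n["size"] >= size_threshold]
    option_b = min(at_least, key=lambda n: n["size"]) if at_least else None
    return option_a, option_b
```

When an ancestor's size equals the threshold exactly, both rules select the same node; they differ only when the threshold falls between two ancestor sizes.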
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: a chroma palette table prediction unit (not shown) configured to: the chroma palette table of a CU is predicted based on the luma palette table of the CU using a cross-component linear model.
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: a signal receiving unit (not shown) or a scanning direction determining unit (not shown), the signal receiving unit being configured to: in palette mode, signaling is received regarding a scanning direction of a CU, wherein the signaling is sent based on different contexts of a shape of the CU.
The scanning direction determination unit is configured to: in the palette mode, when a shape of a CU satisfies a predetermined shape condition, a scanning direction of the CU is determined based on the shape of the CU without receiving signaling regarding the scanning direction of the CU.
As an example, the scanning direction determining unit may determine a same direction as a longer side of the CU as the scanning direction of the CU, or determine a same direction as a shorter side of the CU as the scanning direction of the CU.
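A minimal sketch of inferring the scan direction from the CU shape; whether the longer or the shorter side governs is a design option, so both variants are parameterized (all names are illustrative assumptions):

```python
def infer_scan_direction(width, height, prefer_longer_side=True):
    """Infer the palette scan direction from the CU shape.

    Returns 'horizontal' or 'vertical', or None for a square CU,
    where the shape condition is not met and the direction would
    still have to be signaled.
    """
    if width == height:
        return None  # predetermined shape condition not satisfied
    longer_is_horizontal = width > height
    if prefer_longer_side:
        return "horizontal" if longer_is_horizontal else "vertical"
    return "vertical" if longer_is_horizontal else "horizontal"
```

For example, a 16x4 CU scans horizontally under the longer-side rule and vertically under the shorter-side rule; either way the bit for the scan flag is saved.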
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: a slice decoding unit (not shown) configured to: in palette mode, a CU is divided into multiple segments and palette-related syntax for the individual segments of the same CU is decoded independently.
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: a cache unit (not shown) configured to: in the palette mode, a CU is divided into a plurality of segments, and palette related data of the respective segments of the same CU are independently cached.
As an example, the process of dividing the CU into a plurality of segments may include: dividing the CU into the plurality of segments according to the scanning direction of the CU; or dividing the CU into the plurality of segments according to a binary-tree or quadtree partitioning structure.
As an example, the multiple segments of the same CU may share one palette table, or each segment of the same CU may use a respective palette table separately.
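Dividing a CU into segments along the scan direction can be sketched as follows; splitting the scan lines as evenly as possible is an assumption, since the disclosure does not fix the segment sizes:

```python
def split_into_segments(width, height, num_segments, scan="horizontal"):
    """Split a CU's scan lines into contiguous segments.

    For a horizontal scan the rows are split; for a vertical scan the
    columns are split. Each (start, end) pair is a half-open line
    range that can be parsed or cached independently.
    """
    lines = height if scan == "horizontal" else width
    base, extra = divmod(lines, num_segments)
    segments, start = [], 0
    for i in range(num_segments):
        count = base + (1 if i < extra else 0)  # spread the remainder
        segments.append((start, start + count))
        start += count
    return segments
```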
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: a palette index map prediction unit (not shown) configured to: in the case of a separate block tree structure, the palette index map of the corresponding chroma CU is predicted based on the palette index map of the luma CU in a cross-component manner.
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: a signal receiving unit (not shown) configured to: signaling is received regarding delta QP for a CU encoded in palette mode.
As an example, the signal receiving unit may receive signaling of delta QP for a CU encoded in palette mode in the presence of escape samples; or the signal receiving unit may always receive signaling of delta QP for a CU encoded in palette mode; or the signal receiving unit may receive signaling regarding delta QP for a luma component of the CU encoded in the palette mode and signaling regarding delta QP for a chroma component of the CU encoded in the palette mode, respectively.
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: a lossless palette mode determination unit (not shown) configured to: receiving information of a QP and determining whether a current palette mode is a lossless palette mode based on the information of the QP, wherein in the case that the value of the QP is equal to or less than a predetermined threshold, the current palette mode is determined to be the lossless palette mode; determining that the current palette mode is not a lossless palette mode if the value of the QP is greater than a predetermined threshold.
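The QP-based lossless decision above reduces to a single comparison. The threshold value of 4 below is an assumption (QP 4 corresponds to a quantization step size of 1 in HEVC/VVC-style designs); the disclosure only requires some predetermined threshold:

```python
def is_lossless_palette_mode(qp, threshold=4):
    """Determine whether the current palette mode is lossless.

    Lossless when the signaled QP is at or below the threshold;
    the threshold value is an illustrative assumption.
    """
    return qp <= threshold
```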
As an example, the operation of inverse quantizing the escape samples in the palette mode may be the same as the operation of inverse quantizing the samples in the other modes.
As an example, the other modes may include at least one of: transform skip mode, transform mode, quantized residual differential pulse code modulation, RDPCM, mode.
As an example, the operation of lossless decoding of escape samples in palette mode may be the same as the operation of lossless decoding of samples in other modes.
As an example, the other mode may be a transform skip mode.
As an example, the operation of lossless decoding of escape samples in palette mode may not include an inverse quantization operation, or the inverse quantization operation in the operation of lossless decoding of escape samples in palette mode may be performed based on a QP less than or equal to a predetermined threshold.
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: an inverse binarization processing unit (not shown) configured to: performing one of the following inverse binarization processes on the escape sample: fixed length inverse binarization processing, k-th order exponential golomb inverse binarization processing, truncated binary codeword inverse binarization processing, wherein parameters of the inverse binarization processing are determined based on parameters of a CU currently being processed.
As an example, the fixed length may be determined based on the size of the QP and/or bit depth; or K may be determined based on the size of the QP and/or bit depth; or the maximum value of the truncated binary codeword may be determined based on the size of the QP and/or bit depth.
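One way the three parameter choices could depend on QP and bit depth is sketched below; the exact mappings (the `(qp - 4) // 6` shift and the cap of 5 on k) are illustrative assumptions, not normative derivations:

```python
def escape_binarization_params(qp, bit_depth):
    """Illustrative parameters for the three binarization options above.

    Rationale (assumed): each increase of QP by 6 roughly doubles the
    quantization step, removing about one bit of escape-sample
    precision.
    """
    shift = max(0, (qp - 4) // 6)
    fixed_length = max(1, bit_depth - shift)  # length for fixed-length codes
    k = min(shift, 5)                         # order of the exp-Golomb code
    tb_max = (1 << fixed_length) - 1          # max symbol of the truncated binary code
    return fixed_length, k, tb_max
```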
As an example, the inverse binarization processing unit may perform fixed-length inverse binarization processing on the escape samples in the lossless palette mode.
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: an inverse binarization processing unit (not shown) configured to: the manner in which the inverse binarization processing is performed on the escape samples is adaptively selected at different decoding levels.
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: a signal receiving unit (not shown) configured to: receiving signaling regarding palette mode related information;
wherein the palette mode related information comprises at least one of: information indicating a maximum allowable palette size, information indicating a maximum allowable palette area, information indicating a maximum allowable palette predictor size, information indicating a difference between the maximum allowable palette predictor size and the maximum allowable palette size, information indicating that syntax for initializing a sequence palette predictor is to be transmitted, information indicating the number of entries of a palette predictor initializer minus 1, information indicating a component value used for initializing the i-th palette entry of the palette predictor array, information indicating the bit depth value of the luma component of an entry of the palette predictor initializer minus 8, information indicating the bit depth value of the chroma component of an entry of the palette predictor initializer minus 8, information indicating the bit depth value of the luma component of an entry of the palette minus 8, and information indicating the bit depth value of the chroma component of an entry of the palette minus 8.
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: a QP actual value determination unit (not shown) configured to: derive the actual quantization parameter value QPesca used for the quantization process of the escape samples by:

QPesca = MIN(((MAX(4, QPcu) − 2) / 6) * 6 + 4, 61)

where QPcu is the QP value of the current CU and "/" denotes integer division.
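The clamping formula can be written directly in code; "/" is taken as integer division, which is the usual reading of QP-group arithmetic of this shape:

```python
def escape_qp(qp_cu):
    """QPesca = MIN(((MAX(4, QPcu) - 2) / 6) * 6 + 4, 61).

    Snaps the CU's QP down to the start of its QP group (period 6,
    offset 4), with a floor of 4 and a cap of 61.
    """
    return min(((max(4, qp_cu) - 2) // 6) * 6 + 4, 61)
```

For example, escape_qp(0) == 4, escape_qp(22) == 22, and escape_qp(63) == 61, so the escape QP always lands on the lattice 4, 10, 16, ..., 58, or the cap 61.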
As an example, the number of distortion thresholds applied to the number of palette entries when deriving the palette table of the current CU in lossy palette mode may be extended to 64, where the added thresholds are related to the quantization error.
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: a signal receiving unit (not shown) configured to: receiving signaling regarding at least one of a maximum palette table size and a palette predictor size, wherein the at least one of the maximum palette table size and the palette predictor size is variable.
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: a palette mode disabling unit (not shown) configured to:
disabling the palette mode for CUs having a size less than a first predetermined threshold; or
In the dual-tree case, the palette mode is disabled for chroma CUs that are smaller in size than a second predetermined threshold; or
In a single tree case, disabling palette mode for CUs having chroma samples with a total size less than a third predetermined threshold; or
In the local dual-tree case, disabling the palette mode for CUs having a size less than a fourth predetermined threshold; or
In the case of local dual trees, the palette mode is disabled; or
In the local dual-tree case, the palette mode is disabled for chroma CUs.
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: a palette prediction update unit (not shown) and a palette mode decoding unit (not shown), the palette prediction update unit configured to: in the local dual-tree case, the update process of the palette prediction is performed on both the luminance CU and the chrominance CU; the palette mode decoding unit is configured to: the luma CU is decoded while the palette prediction is updated, and then the chroma CU is decoded under the same local dual-tree.
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: a palette table update disabling unit (not shown) configured to:
In the local dual-tree case, disabling palette table updates for the shared palette table; or
In the case of local dual-tree, disabling palette table updates; or
In the local dual-tree case, disabling palette table updates for CUs having luma samples with a total size less than a fifth predetermined threshold; or
In the local dual-tree case, palette table updates are disabled for chroma CUs.
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: a palette prediction update unit (not shown) and a palette mode decoding unit (not shown), the palette prediction update unit configured to: independently performing palette prediction update processing for different color components in the local dual tree; the palette mode decoding unit is configured to: performing palette mode decoding in parallel for the luma component and the chroma components; alternatively, the luma CU is decoded while the palette prediction is updated, and then the chroma CU is decoded under the same local dual-tree.
As an example, the palette mode decoding unit may decode a luma CU while updating the palette prediction, and then decode a chroma CU under the same local dual tree; when decoding a CU of one color component under the local dual tree, the palette prediction update unit reuses the value of the other color component from a previously available candidate in the palette when updating the palette prediction.
As an example, the video decoding apparatus according to an exemplary embodiment of the present disclosure may further include: a palette predictor update unit (not shown) configured to: when decoding a luma CU and/or a chroma CU under a local dual tree, when updating a palette predictor using entries copied from the palette predictor, using all component values of the palette entries copied from the palette predictor; alternatively, when decoding a luma CU and/or a chroma CU under a local dual-tree, when updating a palette predictor using entries copied from the palette predictor, component values missing from the palette entries copied from the palette predictor are replaced with default values.
As an example, the default value may be related to an internal bit depth.
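Replacing missing component values with a bit-depth-dependent default can be sketched as follows; using the mid-level value `1 << (bit_depth - 1)` is an assumption about what "related to an internal bit depth" means, chosen because it is the neutral sample value at any bit depth:

```python
def complete_predictor_entry(entry, bit_depth=10):
    """Fill in components missing from a copied palette predictor entry.

    `entry` is a (Y, Cb, Cr) tuple where components not available in
    the local dual tree are None; they are replaced with the mid-level
    default for the internal bit depth (an assumed choice).
    """
    default = 1 << (bit_depth - 1)  # e.g. 512 at 10-bit depth
    return tuple(c if c is not None else default for c in entry)
```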
With regard to the video encoding apparatus and the video decoding apparatus in the above-described embodiments, the specific manner in which the respective units perform operations has been described in detail in the embodiments related to the method, and will not be elaborated here.
Further, it should be understood that the respective units in the video encoding apparatus and the video decoding apparatus according to exemplary embodiments of the present disclosure may be implemented as hardware components and/or software components. Depending on the processing performed by the respective units, those skilled in the art may implement the individual units using, for example, field-programmable gate arrays (FPGAs) or application-specific integrated circuits (ASICs).
Fig. 13 shows a block diagram of the electronic device 30 according to an exemplary embodiment of the present disclosure.
Referring to fig. 13, the electronic device 30 includes at least one memory 301 and at least one processor 302, the at least one memory 301 having stored therein a set of computer-executable instructions that, when executed by the at least one processor 302, perform a video encoding method or a video decoding method according to exemplary embodiments of the present disclosure.
By way of example, the electronic device 30 may be a PC, a tablet device, a personal digital assistant, a smartphone, or another device capable of executing the above set of instructions. Here, the electronic device 30 need not be a single electronic device, but can be any collection of devices or circuits that can execute the above instructions (or instruction sets) individually or in combination. The electronic device 30 may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces with a local or remote system (e.g., via wireless transmission).
In the electronic device 30, the processor 302 may include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a programmable logic device, a special purpose processor system, a microcontroller, or a microprocessor. By way of example, and not limitation, processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.
The processor 302 may execute instructions or code stored in the memory 301, and the memory 301 may also store data. The instructions and data may also be transmitted or received over a network via a network interface device, which may employ any known transmission protocol.
The memory 301 may be integrated with the processor 302, for example, by having RAM or flash memory disposed within an integrated circuit microprocessor or the like. Further, the memory 301 may comprise a stand-alone device, such as an external disk drive, a storage array, or any other storage device usable by a database system. The memory 301 and the processor 302 may be operatively coupled or may communicate with each other, e.g., via I/O ports, network connections, etc., such that the processor 302 is able to read files stored in the memory.
In addition, the electronic device 30 may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the electronic device 30 may be connected to each other via a bus and/or a network.
According to an exemplary embodiment of the present disclosure, there may also be provided a computer-readable storage medium storing instructions which, when executed by at least one processor, cause the at least one processor to perform a video encoding method or a video decoding method according to the present disclosure. Examples of the computer-readable storage medium herein include: read-only memory (ROM), random-access programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random-access memory (RAM), dynamic random-access memory (DRAM), static random-access memory (SRAM), flash memory, non-volatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM, DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or optical disc storage, hard disk drive (HDD), solid-state drive (SSD), card-type memory (such as a multimedia card, a Secure Digital (SD) card, or an eXtreme Digital (XD) card), magnetic tape, a floppy disk, a magneto-optical data storage device, an optical data storage device, a hard disk, a solid-state disk, and any other device configured to store, in a non-transitory manner, a computer program and any associated data, data files, and data structures and to provide them to a processor or computer such that the processor or computer can execute the computer program.
The computer program in the computer-readable storage medium described above can be run in an environment deployed in a computer apparatus, such as a client, a host, a proxy device, a server, and the like, and further, in one example, the computer program and any associated data, data files, and data structures are distributed across a networked computer system such that the computer program and any associated data, data files, and data structures are stored, accessed, and executed in a distributed fashion by one or more processors or computers.
According to an exemplary embodiment of the present disclosure, there may also be provided a computer program product, in which instructions are executable by at least one processor to perform the video encoding method or the video decoding method as described in the above exemplary embodiment.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A video encoding method, comprising:
dividing a video image into a plurality of Coding Units (CU);
predicting a palette table of at least one CU divided under the same parent node in the plurality of CUs based on the shared palette table;
wherein the shared palette table is a palette table derived at one of all ancestor nodes of the at least one CU.
2. The method of claim 1, wherein the one ancestor node is: an ancestor node of all ancestor nodes of the at least one CU that meets a predetermined condition with respect to a predetermined size threshold.
3. The method of claim 1, further comprising:
the chroma palette table of a CU is predicted based on the luma palette table of the CU using a cross-component linear model.
4. The method of claim 1, further comprising:
in palette mode, signaling a scanning direction of a CU based on different contexts of a shape of the CU; or
In the palette mode, when a shape of a CU satisfies a predetermined shape condition, a scanning direction of the CU is determined based on the shape of the CU without signaling the scanning direction of the CU.
5. A video decoding method, comprising:
receiving and parsing a bitstream;
obtaining, from the parsed bitstream, a plurality of coding units CU into which the video image is divided;
predicting a palette table of at least one CU divided under the same parent node in the plurality of CUs based on the shared palette table;
wherein the shared palette table is a palette table derived at one of all ancestor nodes of the at least one CU.
6. A video encoding device, comprising:
a CU partition unit configured to: dividing a video image into a plurality of Coding Units (CU);
a palette table prediction unit configured to: predict a palette table of at least one CU divided under the same parent node in the plurality of CUs based on the shared palette table;
wherein the shared palette table is a palette table derived at one of all ancestor nodes of the at least one CU.
7. A video decoding device, comprising:
a reception parsing unit configured to: receiving and parsing a bitstream;
a CU partition unit configured to: obtaining, from the parsed bitstream, a plurality of coding units CU into which the video image is divided;
a palette table prediction unit configured to: predict a palette table of at least one CU divided under the same parent node in the plurality of CUs based on the shared palette table;
wherein the shared palette table is a palette table derived at one of all ancestor nodes of the at least one CU.
8. An electronic device, comprising:
at least one processor;
at least one memory storing computer-executable instructions,
wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform the video encoding method of any one of claims 1 to 4 or the video decoding method of claim 5.
9. A computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform the video encoding method of any one of claims 1 to 4 or the video decoding method of claim 5.
10. A computer program product comprising computer instructions, characterized in that the computer instructions, when executed by at least one processor, implement the video encoding method of any one of claims 1 to 4 or the video decoding method of claim 5.
CN202110363246.5A 2020-04-04 2021-04-02 Video coding and decoding method and device Pending CN113497935A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202063005300P 2020-04-04 2020-04-04
US63/005,300 2020-04-04

Publications (1)

Publication Number Publication Date
CN113497935A true CN113497935A (en) 2021-10-12

Family

ID=77997559

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110363246.5A Pending CN113497935A (en) 2020-04-04 2021-04-02 Video coding and decoding method and device

Country Status (1)

Country Link
CN (1) CN113497935A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11463716B2 (en) * 2021-02-25 2022-10-04 Qualcomm Incorporated Buffers for video coding in palette mode


Similar Documents

Publication Publication Date Title
TWI753356B (en) Method and apparatuses for coding transform blocks
EP3350995B1 (en) Palette predictor initialization and merge for video coding
KR102088560B1 (en) Palette predictor initializer when encoding or decoding self-contained coding structures
JP7433338B2 (en) Video coding method and device and computer program
US10212434B2 (en) Palette entries coding in video coding
EP3080988B1 (en) Parameter derivation for entropy coding of a syntax element
CA2914581A1 (en) Inter-color component residual prediction
US20210377519A1 (en) Intra prediction-based video signal processing method and device
JP2022538747A (en) Method and system for processing luma and chroma signals
KR20220131250A (en) Deblocking parameters for chroma components
CN114567786B (en) Method and apparatus for video encoding and decoding in 4:4:4 chroma format
KR102521034B1 (en) Video coding method and apparatus using palette mode
EP4205400A1 (en) Residual and coefficients coding for video coding
GB2521410A (en) Method and apparatus for encoding or decoding blocks of pixel
US20220353505A1 (en) Method for reconstructing residual blocks of chroma blocks, and video decoding apparatus
KR102507024B1 (en) Method and apparatus for encoding and decoding digital image/video material
CN113497935A (en) Video coding and decoding method and device
KR20220131249A (en) Palette mode for local double tree
CN116134820A (en) Method and device for encoding and decoding video data based on patch
US12034921B2 (en) Apparatus and method for applying artificial neural network to image encoding or decoding
CN116614625B9 (en) Video coding method, device and medium
CN116614625B (en) Video coding method, device and medium
WO2023131299A1 (en) Signaling for transform coding
US20220141462A1 (en) Apparatus and method for applying artificial neural network to image encoding or decoding
WO2023133443A2 (en) Method, apparatus, and medium for video processing

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination