WO2022192902A1 - Remaining level binarization for video coding - Google Patents

Remaining level binarization for video coding Download PDF

Info

Publication number
WO2022192902A1
WO2022192902A1 · PCT/US2022/071091
Authority
WO
WIPO (PCT)
Prior art keywords
quantization
block
level
levels
binary representation
Prior art date
Application number
PCT/US2022/071091
Other languages
French (fr)
Inventor
Yue Yu
Haoping Yu
Original Assignee
Innopeak Technology, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Innopeak Technology, Inc. filed Critical Innopeak Technology, Inc.
Priority to CN202280019616.6A priority Critical patent/CN116965028A/en
Publication of WO2022192902A1 publication Critical patent/WO2022192902A1/en

Classifications

    • H ELECTRICITY
        • H04 ELECTRIC COMMUNICATION TECHNIQUE
            • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
                • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
                    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
                        • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
                            • H04N19/103 Selection of coding mode or of prediction mode
                                • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
                            • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
                            • H04N19/124 Quantisation
                            • H04N19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
                        • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
                            • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
                                • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
                    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
                        • H04N19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
                    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
                        • H04N19/91 Entropy coding, e.g. variable length coding [VLC] or arithmetic coding

Definitions

  • This disclosure relates generally to computer-implemented methods and systems for video processing. Specifically, the present disclosure involves remaining level binarization for video coding.
  • Video coding technology allows video data to be compressed into smaller sizes thereby allowing various videos to be stored and transmitted.
  • Video coding has been used in a wide range of applications, such as digital TV broadcast, video transmission over the internet and mobile networks, real-time applications (e.g., video chat, video conferencing), DVD and Blu-ray discs, and so on. To reduce the storage space for storing a video and/or the network bandwidth consumption for transmitting a video, it is desirable to improve the efficiency of the video coding scheme.
  • a method for decoding a video includes accessing a binary representation of a block of the video, the block of the video associated with a plurality of quantization levels, processing the binary representation to recover the plurality of quantization levels of the block, and reconstructing the block by determining pixel values of the block from the plurality of quantization levels.
  • the processing includes obtaining a portion of the binary representation corresponding to a quantization level of the plurality of quantization levels and converting the portion of the binary representation into the quantization level according to a k-th order Exp-Golomb binarization.
  • k indicates the order of the Exp-Golomb binarization and is an integer larger than zero.
  • a non-transitory computer-readable medium has program code that is stored thereon and executable by one or more processing devices for performing operations.
  • the operations include accessing a binary representation of a block of the video, the block of the video associated with a plurality of quantization levels, processing the binary representation to recover the plurality of quantization levels of the block, and reconstructing the block by determining pixel values of the block from the plurality of quantization levels.
  • the processing includes obtaining a portion of the binary representation corresponding to a quantization level of the plurality of quantization levels and converting the portion of the binary representation into the quantization level according to a k-th order Exp-Golomb binarization. k indicates the order of the Exp-Golomb binarization and is an integer larger than zero.
  • In yet another example, a system includes a processing device and a non-transitory computer-readable medium communicatively coupled to the processing device.
  • the processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations.
  • the operations include accessing a binary representation of a block of the video, the block of the video associated with a plurality of quantization levels, processing the binary representation to recover the plurality of quantization levels of the block, and reconstructing the block by determining pixel values of the block from the plurality of quantization levels.
  • the processing includes obtaining a portion of the binary representation corresponding to a quantization level of the plurality of quantization levels and converting the portion of the binary representation into the quantization level according to a k-th order Exp-Golomb binarization.
  • k indicates the order of the Exp-Golomb binarization and is an integer larger than zero.
  • a method for encoding a video includes accessing a plurality of quantization levels of a block of the video, processing each quantization level of the plurality of quantization levels of the block to generate binary representations for the plurality of quantization levels, the processing comprising: determining a remaining level of the quantization level and converting the remaining level of the quantization level into a binary representation for the quantization level according to a k-th order Exp-Golomb binarization, wherein k indicates an order of the Exp-Golomb binarization and is an integer larger than zero; and encoding at least the binary representations for the plurality of quantization levels of the block into a bitstream of the video.
  • a non-transitory computer-readable medium has program code that is stored thereon.
  • the program code is executable by one or more processing devices for performing operations comprising: accessing a plurality of quantization levels of a block of a video; processing each quantization level of the plurality of quantization levels of the block to generate binary representations for the plurality of quantization levels, the processing comprising: determining a remaining level of the quantization level; and converting the remaining level of the quantization level into a binary representation for the quantization level according to a k-th order Exp-Golomb binarization, wherein k indicates an order of the Exp-Golomb binarization and is an integer larger than zero; and encoding at least the binary representations for the plurality of quantization levels of the block into a bitstream of the video.
  • a system includes a processing device; and a non-transitory computer-readable medium communicatively coupled to the processing device, wherein the processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations comprising: accessing a plurality of quantization levels of a block of a video; processing each quantization level of the plurality of quantization levels of the block to generate binary representations for the plurality of quantization levels, the processing comprising: determining a remaining level of the quantization level; and converting the remaining level of the quantization level into a binary representation for the quantization level according to a k-th order Exp-Golomb binarization, wherein k indicates an order of the Exp-Golomb binarization and is an integer larger than zero; and encoding at least the binary representations for the plurality of quantization levels of the block into a bitstream of the video.
  • FIG. 1 is a block diagram showing an example of a video encoder configured to implement embodiments presented herein.
  • FIG. 2 is a block diagram showing an example of a video decoder configured to implement embodiments presented herein.
  • FIG. 3 depicts an example of a coding tree unit division of a picture in a video, according to some embodiments of the present disclosure.
  • FIG. 4 depicts an example of a coding unit division of a coding tree unit, according to some embodiments of the present disclosure.
  • FIG. 5 depicts an example of the scan region-based coefficient coding according to some embodiments of the present disclosure.
  • FIG. 6A depicts a table listing examples of k-th order Exp-Golomb binarization.
  • FIG. 6B depicts an example of a special position template used in an adaptive binarization method according to some embodiments of the present disclosure.
  • FIG. 7 depicts an example of a process for encoding a block of a video according to some embodiments of the present disclosure.
  • FIG. 8 depicts an example of a process for decoding a block of a video according to some embodiments of the present disclosure.
  • FIG. 9 depicts an example of a computing system that can be used to implement some embodiments of the present disclosure.
  • Various embodiments provide remaining level binarization schemes for video coding. As discussed above, more and more video data are being generated, stored, and transmitted. It is beneficial to increase the efficiency of the video coding technology thereby using less data to represent a video without compromising the visual quality of the decoded video.
  • One way to improve the coding efficiency is through entropy coding to compress processed video coefficients into a binary bitstream using as few bits as possible. Before entropy coding, video coefficient levels (or remaining levels of the coefficient levels) are binarized into binary bins and coding algorithms such as Context adaptive modeling based binary arithmetic coding (CABAC) can further compress bins into bits.
  • the 1st order Exp-Golomb binarization method is used instead of the 0th order Exp-Golomb. This allows fewer bits to be used to represent coefficient levels (or remaining coefficient levels), especially those levels with large values, thereby improving the coding efficiency.
  • the k-th order Exp-Golomb binarization method is used with k > 1 to further improve the coding efficiency for high bit-depth videos.
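The k-th order Exp-Golomb (EGk) binarization and its inverse can be sketched as follows. This is the common escape-code form of EGk (a unary prefix of ones, a terminating zero, then a fixed-length suffix), shown for illustration rather than as any standard's normative process:

```python
def egk_encode(value, k):
    """Binarize a non-negative remaining level with k-th order Exp-Golomb."""
    bits = []
    while value >= (1 << k):        # escape: emit a prefix '1' and raise the order
        bits.append(1)
        value -= 1 << k
        k += 1
    bits.append(0)                  # prefix terminator
    for i in range(k - 1, -1, -1):  # k-bit fixed-length suffix
        bits.append((value >> i) & 1)
    return bits

def egk_decode(bits, k):
    """Inverse of egk_encode; returns (value, number of bits consumed)."""
    m = 0
    while bits[m] == 1:             # count the prefix ones
        m += 1
    value = ((1 << m) - 1) << k     # contribution of the prefix
    suffix = 0
    pos = m + 1                     # skip the terminating zero
    for _ in range(k + m):          # read the (k + m)-bit suffix
        suffix = (suffix << 1) | bits[pos]
        pos += 1
    return value + suffix, pos
```

With this form, `egk_encode(3, 0)` yields `[1, 1, 0, 0, 0]` while `egk_encode(3, 1)` yields `[1, 0, 0, 1]`, one bit shorter; this shortening of codewords for large values is the motivation for using k ≥ 1.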
  • an adaptive k-th order Exp-Golomb binarization method is used to binarize the video coefficient levels. For example, the level information preceding the current position is used to decide the k-th order Exp-Golomb binarization method to be used to binarize the remaining level of the current position.
  • the adaptive binarization method allows the order of the binarization method to be changed according to the content of the video, leading to a more efficient coding result (i.e., using fewer bits to represent the video). These techniques can be an effective coding tool in future video coding standards.
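As a purely illustrative sketch (the rule below is hypothetical, not the normative derivation of any standard), the order could be chosen from the absolute levels already coded at neighboring positions: larger neighboring levels suggest a larger remaining level at the current position, so a higher order is selected:

```python
def adaptive_order(neighbor_levels, max_order=3):
    """Pick an Exp-Golomb order from previously coded neighboring levels.

    Hypothetical rule for illustration: the order grows with the sum of
    the absolute neighboring levels, capped at max_order.
    """
    total = sum(abs(level) for level in neighbor_levels)
    order = 0
    while order < max_order and (1 << (order + 1)) <= total:
        order += 1
    return order
```

Because the decoder sees the same previously decoded levels, it can derive the same order without any side information in the bitstream.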
  • FIG. 1 is a block diagram showing an example of a video encoder 100 configured to implement embodiments presented herein.
  • the video encoder 100 includes a partition module 112, a transform module 114, a quantization module 115, an inverse quantization module 118, an inverse transform module 119, an in-loop filter module 120, an intra prediction module 126, an inter prediction module 124, a motion estimation module 122, a decoded picture buffer 130, and an entropy coding module 116.
  • the input to the video encoder 100 is an input video 102 containing a sequence of pictures (also referred to as frames or images).
  • the video encoder 100 employs a partition module 112 to partition the picture into blocks 104, and each block contains multiple pixels.
  • the blocks may be macroblocks, coding tree units, coding units, prediction units, and/or prediction blocks.
  • One picture may include blocks of different sizes and the block partitions of different pictures of the video may also differ.
  • Each block may be encoded using different predictions, such as intra prediction or inter prediction or intra and inter hybrid prediction.
  • the first picture of a video signal is an intra-coded picture, which is encoded using only intra prediction.
  • in the intra prediction mode, a block of a picture is predicted using only data that has been encoded from the same picture.
  • a picture that is intra-coded can be decoded without information from other pictures.
  • the video encoder 100 shown in FIG. 1 can employ the intra prediction module 126.
  • the intra prediction module 126 is configured to use reconstructed samples in reconstructed blocks 136 of neighboring blocks of the same picture to generate an intra-prediction block (the prediction block 134).
  • the intra prediction is performed according to an intra-prediction mode selected for the block.
  • the video encoder 100 then calculates the difference between block 104 and the intra-prediction block 134. This difference is referred to as residual block 106.
  • the residual block 106 is transformed by the transform module 114 into a transform domain by applying a transform on the samples in the block.
  • the transform may include, but is not limited to, a discrete cosine transform (DCT) or discrete sine transform (DST).
  • the transformed values may be referred to as transform coefficients representing the residual block in the transform domain.
  • the residual block may be quantized directly without being transformed by the transform module 114. This is referred to as a transform skip mode.
  • the video encoder 100 can further use the quantization module 115 to quantize the transform coefficients to obtain quantized coefficients.
  • Quantization includes dividing a sample by a quantization step size followed by subsequent rounding, whereas inverse quantization involves multiplying the quantized value by the quantization step size. Such a quantization process is referred to as scalar quantization. Quantization is used to reduce the dynamic range of video samples (transformed or non-transformed) so that fewer bits are used to represent the video samples.
  • the quantization of coefficients/samples within a block can be done independently, and this kind of quantization method is used in some existing video compression standards, such as H.264 and HEVC.
  • a specific scan order may be used to convert the 2D coefficients of a block into a 1-D array for coefficient quantization and coding.
  • Quantization of a coefficient within a block may make use of the scan order information. For example, the quantization of a given coefficient in the block may depend on the status of the previous quantized value along the scan order. In order to further improve the coding efficiency, more than one quantizer may be used. Which quantizer is used for quantizing a current coefficient depends on the information preceding the current coefficient in the encoding/decoding scan order. Such a quantization approach is referred to as dependent quantization.
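A simplified sketch of this idea, in the spirit of the dependent quantization used in VVC: two scalar quantizers Q0/Q1 and a four-state machine driven by the parity of the preceding levels along the scan order. The state table and reconstruction rules below follow the VVC design but omit many details, so treat this as an illustration:

```python
# Next state indexed by [current_state][level & 1]; states 0-1 select Q0, states 2-3 select Q1.
NEXT_STATE = [[0, 2], [2, 0], [1, 3], [3, 1]]

def dependent_dequant(levels, delta):
    """Reconstruct levels along the scan order with two scalar quantizers.

    Q0 reconstructs even multiples of delta; Q1 reconstructs odd multiples
    (plus zero). Which quantizer applies to a level depends on the state
    reached from the parity of the previously processed levels.
    """
    state = 0
    recon = []
    for k in levels:
        if state < 2:                        # quantizer Q0
            value = 2 * k * delta
        else:                                # quantizer Q1
            sign = (k > 0) - (k < 0)
            value = (2 * k - sign) * delta
        recon.append(value)
        state = NEXT_STATE[state][k & 1]
    return recon
```

The decoder can follow the same state machine because it depends only on levels that have already been decoded in the scan order.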
  • the degree of quantization may be adjusted using the quantization step sizes. For instance, for scalar quantization, different quantization step sizes may be applied to achieve finer or coarser quantization. Smaller quantization step sizes correspond to finer quantization, whereas larger quantization step sizes correspond to coarser quantization.
  • the quantization step size can be indicated by a quantization parameter (QP).
  • the quantization parameters are provided in the encoded bitstream of the video such that the video decoder can apply the same quantization parameters for decoding.
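As an illustrative sketch: in HEVC/VVC-style codecs the step size roughly doubles for every increase of 6 in QP (Δ ≈ 2^((QP − 4)/6)), scalar quantization is a division by Δ followed by rounding, and inverse quantization is a multiplication by Δ. The plain round-to-nearest below stands in for the rate-distortion-optimized rounding a real encoder would use:

```python
def qp_to_step(qp):
    """HEVC/VVC-style mapping: the step size doubles every 6 QP units."""
    return 2.0 ** ((qp - 4) / 6.0)

def quantize(coeff, step):
    """Scalar quantization: divide by the step size, then round."""
    return int(round(coeff / step))

def dequantize(level, step):
    """Inverse quantization: multiply the level by the step size."""
    return level * step
```

For example, at QP 10 the step size is 2.0, so a coefficient of 10.4 quantizes to level 5 and reconstructs to 10.0; the difference is the quantization error.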
  • the quantized samples are then coded by the entropy coding module 116 to further reduce the size of the video signal.
  • the entropy encoding module 116 is configured to apply an entropy encoding algorithm on the quantized samples.
  • the quantized samples are binarized into binary bins and coding algorithms further compress the binary bins into bits. Examples of the binarization methods include, but are not limited to, a combined truncated rice (TR) and limited k-th order Exp-Golomb (EGk) binarization, and k-th order Exp-Golomb binarization.
  • Examples of the entropy encoding algorithm include, but are not limited to, a variable length coding (VLC) scheme, a context adaptive VLC scheme (CAVLC), an arithmetic coding scheme, a binarization, a context adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or other entropy encoding techniques.
  • the entropy-coded data is added to the bitstream of the output encoded video 132.
  • reconstructed blocks 136 from neighboring blocks are used in the intra-prediction of blocks of a picture.
  • Generating the reconstructed block 136 of a block involves calculating the reconstructed residuals of this block.
  • the reconstructed residual can be determined by applying inverse quantization and inverse transform on the quantized residual of the block.
  • the inverse quantization module 118 is configured to apply the inverse quantization on the quantized samples to obtain de-quantized coefficients.
  • the inverse quantization module 118 applies the inverse of the quantization scheme applied by the quantization module 115 by using the same quantization step size as the quantization module 115.
  • the inverse transform module 119 is configured to apply the inverse transform of the transform applied by the transform module 114 on the de-quantized samples, such as inverse DCT or inverse DST.
  • the output of the inverse transform module 119 is the reconstructed residuals for the block in the pixel domain.
  • the reconstructed residuals can be added to the prediction block 134 of the block to obtain a reconstructed block 136 in the pixel domain.
  • for blocks coded in the transform skip mode, the inverse transform module 119 is not applied to those blocks.
  • in that case, the de-quantized samples are the reconstructed residuals for the blocks.
  • Blocks in subsequent pictures following the first intra-predicted picture can be coded using either inter prediction or intra prediction.
  • in inter prediction, the prediction of a block in a picture is from one or more previously encoded video pictures.
  • the video encoder 100 uses an inter prediction module 124.
  • the inter prediction module 124 is configured to perform motion compensation for a block based on the motion estimation provided by the motion estimation module 122.
  • the motion estimation module 122 compares a current block 104 of the current picture with decoded reference pictures 108 for motion estimation.
  • the decoded reference pictures 108 are stored in a decoded picture buffer 130.
  • the motion estimation module 122 selects a reference block from the decoded reference pictures 108 that best matches the current block.
  • the motion estimation module 122 further identifies an offset between the position (e.g., x, y coordinates) of the reference block and the position of the current block. This offset is referred to as the motion vector (MV) and is provided to the inter prediction module 124 along with the selected reference block.
  • multiple reference blocks are identified for the current block in multiple decoded reference pictures 108. Therefore, multiple motion vectors are generated and provided to the inter prediction module 124 along with the corresponding reference blocks.
  • the inter prediction module 124 uses the motion vector(s) along with other inter prediction parameters to perform motion compensation to generate a prediction of the current block, i.e., the inter prediction block 134. For example, based on the motion vector(s), the inter prediction module 124 can locate the prediction block(s) pointed to by the motion vector(s) in the corresponding reference picture(s). If there is more than one prediction block, these prediction blocks are combined with some weights to generate a prediction block 134 for the current block.
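A minimal sketch of the motion estimation and compensation described above, using an exhaustive SAD (sum of absolute differences) search over a small window; real encoders use fast search strategies and sub-pixel refinement, which are omitted here:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def full_search(cur_block, ref_frame, x, y, search_range=4):
    """Exhaustive block matching around (x, y); returns (best_sad, mvx, mvy)."""
    h, w = len(cur_block), len(cur_block[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ry, rx = y + dy, x + dx
            if ry < 0 or rx < 0 or ry + h > len(ref_frame) or rx + w > len(ref_frame[0]):
                continue  # candidate block would fall outside the reference frame
            candidate = [row[rx:rx + w] for row in ref_frame[ry:ry + h]]
            cost = sad(cur_block, candidate)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

The motion vector (mvx, mvy) is the offset of the best-matching reference block; motion compensation then copies (or, for multiple references, blends) the block the vector points to.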
  • the video encoder 100 can subtract the inter-prediction block 134 from the block 104 to generate the residual block 106.
  • the residual block 106 can be transformed, quantized, and entropy coded in the same way as the residuals of an intra- predicted block discussed above.
  • the reconstructed block 136 of an inter-predicted block can be obtained through inverse quantizing, inverse transforming the residual, and subsequently combining with the corresponding prediction block 134.
  • the reconstructed block 136 is processed by an in-loop filter module 120.
  • the in-loop filter module 120 is configured to smooth out pixel transitions thereby improving the video quality.
  • the in-loop filter module 120 may be configured to implement one or more in-loop filters, such as a de-blocking filter, or a sample-adaptive offset (SAO) filter, or an adaptive loop filter (ALF), etc.
  • FIG. 2 depicts an example of a video decoder 200 configured to implement embodiments presented herein.
  • the video decoder 200 processes an encoded video 202 in a bitstream and generates decoded pictures 208.
  • in the example shown in FIG. 2, the video decoder 200 includes an entropy decoding module 216, an inverse quantization module 218, an inverse transform module 219, an in-loop filter module 220, an intra prediction module 226, an inter prediction module 224, and a decoded picture buffer 230.
  • the entropy decoding module 216 is configured to perform entropy decoding of the encoded video 202.
  • the entropy decoding module 216 decodes the quantized coefficients, coding parameters including intra prediction parameters and inter prediction parameters, and other information.
  • the entropy decoding module 216 decodes the bitstream of the encoded video 202 to binary representations and then converts the binary representations to quantization levels of the coefficients.
  • the entropy-decoded coefficient levels are then inverse quantized by the inverse quantization module 218 and subsequently inverse transformed by the inverse transform module 219 to the pixel domain.
  • the inverse quantization module 218 and the inverse transform module 219 function similarly as the inverse quantization module 118 and the inverse transform module 119, respectively, as described above with respect to FIG. 1.
  • the inverse-transformed residual block can be added to the corresponding prediction block 234 to generate a reconstructed block 236.
  • for blocks coded in the transform skip mode, the inverse transform module 219 is not applied to those blocks.
  • in that case, the de-quantized samples generated by the inverse quantization module 218 are used to generate the reconstructed block 236.
  • the prediction block 234 of a particular block is generated based on the prediction mode of the block. If the coding parameters of the block indicate that the block is intra predicted, the reconstructed block 236 of a reference block in the same picture can be fed into the intra prediction module 226 to generate the prediction block 234 for the block. If the coding parameters of the block indicate that the block is inter-predicted, the prediction block 234 is generated by the inter prediction module 224.
  • the intra prediction module 226 and the inter prediction module 224 function similarly to the intra prediction module 126 and the inter prediction module 124 of FIG. 1, respectively.
  • the inter prediction involves one or more reference pictures.
  • the video decoder 200 generates the decoded pictures 208 for the reference pictures by applying the in-loop filter module 220 to the reconstructed blocks of the reference pictures.
  • the decoded pictures 208 are stored in the decoded picture buffer 230 for use by the inter prediction module 224 and also for output.
  • FIG. 3 depicts an example of a coding tree unit division of a picture in a video, according to some embodiments of the present disclosure.
  • the picture is divided into blocks, such as the CTUs (Coding Tree Units) 302 in AVS, as shown in FIG. 3.
  • the CTUs 302 can be blocks of 128x128 pixels.
  • the CTUs are processed according to an order, such as the order shown in FIG. 3.
  • each CTU 302 in a picture can be partitioned into one or more CUs (Coding Units) 402 as shown in FIG. 4.
  • a CTU 302 may be partitioned into CUs 402 differently.
  • the CUs 402 can be rectangular or square, and can be coded without further partitioning into prediction units or transform units.
  • Each CU 402 can be as large as its root CTU 302 or be subdivisions of a root CTU 302 as small as 4x4 blocks.
  • a division of a CTU 302 into CUs 402 in AVS can be quadtree splitting or binary tree splitting or ternary tree splitting.
  • solid lines indicate quadtree splitting and dashed lines indicate binary or ternary tree splitting.
  • quantization is used to reduce the dynamic range of elements of blocks in the video signal so that fewer bits are used to represent the video signal.
  • the transformed or non-transformed video signal at a specific position is referred to as a coefficient.
  • the quantized value of a coefficient is called a quantization level or level.
  • Quantization typically consists of division by a quantization step size and subsequent rounding while inverse quantization consists of multiplication by the quantization step size. Such a quantization process is also referred to as scalar quantization.
  • the quantization of the coefficients within a block can be performed independently and this kind of independent quantization method is used in some existing video compression standards, such as H.264, HEVC, AVS, etc. In other examples, dependent quantization is employed, such as in VVC.
  • Residual coding is used to convert the quantization levels into a bitstream in video coding.
  • there are N × M quantization levels for an N × M block.
  • these N × M levels may be zero or non-zero values.
  • the non-zero levels will further be binarized to binary bins if the levels are not binary.
  • Context modeling based binary arithmetic coding, such as CABAC, can further compress bins into bits.
  • in some embodiments, scan region-based coefficient coding (SRCC) is used.
  • FIG. 5 illustrates an example of the scan region-based coefficient coding.
  • the two-dimensional (2-D) coordinates (scan_region_x and scan_region_y) are coded in bitstream to indicate a smallest rectangular area 504 within which non-zero levels exist and outside which the levels of all positions will be zero.
  • This smallest rectangular area 504 is referred to as SRCC area or SRCC block.
  • the scan_region_x and scan_region_y are less than or equal to blockWidth and blockHeight, respectively.
  • within the SRCC area, the level of each position may be zero or non-zero, and there are always non-zero levels with coordinates equal to scan_region_x or scan_region_y or both.
  • An SRCC block may consist of several pre-defined sub-blocks (e.g., 4x4 sub-blocks). Because an SRCC block may have a size that does not fit an integer number of regular sub-blocks, the sub-blocks in the last row or column may have a size smaller than regular sub-blocks.
  • a specific coding scan order may be used to convert 2-D coefficients of the block into a one-dimensional (1-D) order for coefficient quantization and coding.
  • The coding scan starts from the top-left corner and stops at the last sub-block, located at the bottom-right corner of an SRCC block, proceeding in a right-bottom direction.
  • The last sub-block is derived from (scan_region_x, scan_region_y) according to a predefined coding scan order.
  • Regular residual coding (RRC) codes sub-block by sub-block, starting from the last sub-block, with a reverse coding scan order.
  • Within each sub-block, residual coding codes the level of each position with a reverse coding scan order.
  • FIG. 5 shows an example of a block 500 with an SRCC 504 and sub-blocks 506A-506D.
  • Each sub-block 506 has a predetermined reverse scanning order for coding the quantization levels in the sub-block 506.
  • For example, the sub-block 506A has a size of 3 × 3 and the coding starts at the lower-right corner at position L0 and ends at the upper-left corner at position L8.
  • After coeff_abs_level_greater1_flag and coeff_abs_level_greater2_flag within the sub-block are coded, for any position whose absolute level is greater than 2, another syntax element called coeff_abs_level_remaining will be coded.
  • The coeff_abs_level_remaining represents the value of the absolute level minus 3 in the current AVS.
  • A flag coeff_sign, indicating whether the level is negative or positive, will be coded for each non-zero level position.
  • the residual coding process will proceed to the next sub-block along the reverse coding scan order until all the syntax elements of all sub-blocks within a residual block are coded.
  • Video coding schemes such as AVS may adopt more flexible syntax elements, e.g., abs_level_gtxX_flag.
  • abs_level_gtxX_flag describes whether the absolute value of the quantization level is greater than X, where X is an integer, such as 0, 1, 2, ..., or N. If abs_level_gtxY_flag is 0, where Y is an integer between 0 and N−1, abs_level_gtx(Y+1)_flag will not be present.
  • If abs_level_gtxY_flag is 1, abs_level_gtx(Y+1)_flag will be present.
  • For example, if abs_level_gtx1_flag is 1, abs_level_gtx2_flag is present.
  • If abs_level_gtx2_flag is 0, abs_level_gtx3_flag will not be present.
  • If abs_level_gtxN_flag is 0, the remaining level will not be present.
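A hypothetical encoder-side sketch of this flag cascade (the function and return structure are illustrative; the normative syntax interleaves these flags with other elements):

```python
def gtx_flags(abs_level, n):
    """Emit abs_level_gtxX_flag for X = 0..n: each flag is 1 while the
    absolute level exceeds X; coding stops at the first 0 flag.
    Returns (flags, remaining_present)."""
    flags = []
    for x in range(n + 1):
        flag = 1 if abs_level > x else 0
        flags.append(flag)
        if flag == 0:
            # abs_level_gtx(X+1)_flag and the remaining level are absent.
            return flags, False
    # All flags are 1: a remaining level is coded for this position.
    return flags, True

print(gtx_flags(2, 3))  # -> ([1, 1, 0], False): gtx2_flag is 0, gtx3 absent
print(gtx_flags(6, 3))  # -> ([1, 1, 1, 1], True): a remaining level follows
```

The decoder mirrors this: it reads flags until the first 0 or until flag N, and parses a remaining level only when every flag was 1.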
  • the 0-th order Exp-Golomb binarization method is used for binarizing the remaining levels.
  • 0-th order Exp-Golomb may not be optimal for binarization of remaining levels especially when the bit depth of the video samples is high, leading to a higher bit rate of the encoded video.
  • The 1st order Exp-Golomb binarization method is proposed for use in the remaining level binarization to improve the video coding performance.
  • FIG. 6A shows the codewords of k-th order Exp-Golomb binarization, where k is an integer, such as 0, 1, 2, .... From FIG. 6A, it can be seen that lower order Exp-Golomb binarization works better for remaining levels having small values, whereas higher order Exp-Golomb binarization works better for remaining levels having large values. For example, if the remaining levels fall within a small number range, such as 0 to 2, the 0-th order Exp-Golomb binarization provides the smallest total number of bins among the binarization schemes in FIG. 6A to represent these remaining levels, and thus requires the fewest bits.
  • For larger remaining levels, the 1st order Exp-Golomb binarization may lead to a smaller number of binarization bins than the 0-th order Exp-Golomb binarization.
  • For remaining levels with large values, a higher order Exp-Golomb binarization provides a better coding efficiency than a lower order binarization scheme.
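A minimal sketch of generating k-th order Exp-Golomb codewords, consistent with the construction behind the FIG. 6A table (the function name is ours):

```python
def exp_golomb_encode(value, k):
    """k-th order Exp-Golomb codeword for a non-negative value:
    write value + 2^k in binary (b bits), preceded by b - k - 1 zeros."""
    u = value + (1 << k)
    b = u.bit_length()
    return "0" * (b - k - 1) + format(u, "b")

# Small values favor low orders; large values favor higher orders:
print(exp_golomb_encode(0, 0))  # -> "1"       (1 bin)
print(exp_golomb_encode(0, 1))  # -> "10"      (2 bins)
print(exp_golomb_encode(9, 0))  # -> "0001010" (7 bins)
print(exp_golomb_encode(9, 1))  # -> "001011"  (6 bins)
```

The example shows the trade-off directly: value 0 costs one extra bin under order 1, but value 9 saves a bin, which is why higher orders pay off as the bit depth and the typical remaining-level magnitude grow.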
  • The proposed 1st order Exp-Golomb binarization can be used for both the regular residual coding (RRC) and the transform skip residual coding (TSRC). Alternatively, 1st order Exp-Golomb binarization can be used for RRC only or TSRC only.
  • In some examples, k-th order Exp-Golomb binarization is used to binarize the remaining level, where k may be greater than 1.
  • an adaptive Exp-Golomb binarization can be used to binarize the remaining levels.
  • the level information preceding the current position is used to decide the order k of the Exp-Golomb code to binarize the remaining level of the current position.
  • A statistic value (e.g., the sum, average, or another statistic) of M absolute levels or remaining levels of previously coded positions can be used to adaptively decide the order k of the Exp-Golomb binarization for the current position.
  • A set of threshold values t1, t2, ..., tn (e.g., t1 < t2 < ... < tn) can be compared with the statistic value to determine the order k.
  • a special position template may be used to calculate the statistic value of M absolute levels of previous coded positions. This special position template can ensure that certain positions are not used when calculating the statistic value. For example, in some implementations positions that sit in the same scan line are processed in parallel. As such, to avoid breaking the parallelism, when calculating the statistic value, those coded positions on the same scan line as the current position should not be used. The special position template can achieve such a goal.
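A hypothetical sketch of the adaptive order decision: sum the absolute levels of previously coded template positions and step through ascending thresholds. The threshold values and the template feeding `prev_levels` are illustrative choices, not values fixed by the patent:

```python
def adaptive_order(prev_levels, thresholds=(3, 10, 25)):
    """Pick the Exp-Golomb order k for the current position from a
    statistic (here, the sum) of M previously coded absolute levels.
    The order increases by one for each threshold the statistic meets;
    thresholds satisfy t1 < t2 < ... < tn."""
    statistic = sum(prev_levels)
    k = 0
    for t in thresholds:
        if statistic >= t:
            k += 1
    return k

print(adaptive_order([0, 1, 0, 0, 1]))  # sum = 2  -> k = 0
print(adaptive_order([4, 7, 2, 9, 6]))  # sum = 28 -> k = 3
```

In a parallel implementation, `prev_levels` would be gathered only from template positions off the current scan line, as the special position template described above requires.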
  • FIG. 6B depicts an example of the special position template used in the adaptive binarization.
  • the current position 602 in a block 600 is shown as solid.
  • the scan line is illustrated using line 604.
  • the template 606 includes the five shaded positions.
  • the positions in the template 606 do not include positions along the scan line 604.
  • FIG. 7 depicts an example of a process 700 for encoding a partition for a video, according to some embodiments of the present disclosure.
  • One or more computing devices (e.g., the computing device implementing the video encoder 100) implement the operations depicted in FIG. 7 by executing suitable program code (e.g., the program code implementing the entropy coding module 116).
  • the process 700 is described with reference to some examples depicted in the figures. Other implementations, however, are possible.
  • the process 700 involves accessing quantization levels of the residual of a block in a video.
  • the block can be a portion of a picture of the input video, such as a coding unit 402 discussed in FIG. 4 or any type of block processed by a video encoder as a unit when performing quantization and binarization.
  • the process 700 involves processing each quantization level of the block to generate binarized levels for the block.
  • the process 700 involves determining a remaining level of the quantization level.
  • The video encoder can use syntax elements such as abs_level_gtxX_flag to indicate a quantization level. If the quantization level has a value larger than what can be represented by these syntax elements, the video encoder can determine the remaining level for binarization to be the quantization level minus the portion represented by the syntax elements.
  • For example, the remaining level is 2 after deducting the portion (i.e., 4) represented by the syntax elements (i.e., abs_level_gtx0_flag, abs_level_gtx1_flag, ..., abs_level_gtx3_flag) from the quantization level 6.
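The arithmetic of that example can be sketched as follows (our interpretation of the example in the text; the exact offset between the flag portion and the remaining level differs per codec):

```python
def remaining_level(abs_level, n):
    """The portion conveyed by abs_level_gtx0_flag..abs_level_gtxN_flag
    is N + 1 once all flags are 1; the remainder is binarized separately."""
    portion = n + 1
    assert abs_level > n, "remaining level is only coded when all flags are 1"
    return abs_level - portion

# Quantization level 6 with flags gtx0..gtx3 (portion 4) leaves remaining 2:
print(remaining_level(6, 3))  # -> 2
```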
  • the process 700 involves converting the remaining level into a binary representation using k-th order Exp-Golomb codewords.
  • k is 1; that is, the 1st order Exp-Golomb binarization is used to binarize the remaining level.
  • k is greater than 1 and a higher order Exp-Golomb binarization is used to binarize the remaining level.
  • The binarization can be performed by converting the value of the remaining level indicated in the third column of the table shown in FIG. 6A to the binarization shown in the second column. For example, if the 1st order Exp-Golomb binarization is used and the remaining level is 5, the binarization is “0111” according to FIG. 6A.
  • the adaptive Exp-Golomb binarization can be used to binarize the remaining levels.
  • the order of the Exp-Golomb binarization is determined based on the quantization levels or remaining levels preceding the current position.
  • the process 700 involves encoding the binary representations of the quantization levels in the block into a bitstream of the video.
  • the encoding can be performed, for example, using the context adaptive modeling based binary arithmetic coding (CABAC) discussed above.
  • FIG. 8 depicts an example of a process 800 for decoding a block for a video, according to some embodiments of the present disclosure.
  • One or more computing devices implement operations depicted in FIG. 8 by executing suitable program code.
  • a computing device implementing the video decoder 200 may implement the operations depicted in FIG. 8 by executing the program code for the entropy decoding module 216, the inverse quantization module 218, and the inverse transform module 219.
  • the process 800 is described with reference to some examples depicted in the figures. Other implementations, however, are possible.
  • the process 800 involves accessing a binary string or a binary representation that represents a block of a video signal.
  • the block can be a portion of a picture of the input video, such as a coding unit 402 discussed in FIG. 4 or any type of block processed by a video encoder as a unit when performing quantization and binarization.
  • the process 800 involves processing the binary representation of the block to recover the quantization levels in the block.
  • the process 800 involves obtaining a portion of the binary representation that corresponds to a quantization level in the block.
  • the process 800 involves converting a part of the portion of the binary representation into a remaining level using the k-th order Exp-Golomb codewords.
  • k is 1 and the 1st order Exp-Golomb binarization is used to recover the remaining level from the binary representation.
  • k is greater than 1 and a higher order Exp-Golomb binarization is used to recover the remaining level.
  • The de-binarization can be performed by converting the binary representation into the value of the remaining level according to the mapping between the second column and the third column of the table shown in FIG. 6A. For example, if the 1st order Exp-Golomb binarization is used and the binary string is “0111,” the remaining level is 5 according to FIG. 6A. Other values of the remaining level can be recovered in a similar way according to FIG. 6A and the order of the Exp-Golomb binarization. In some examples, the adaptive Exp-Golomb binarization may be used to binarize the remaining levels.
  • the decoder can first determine the order of the Exp-Golomb binarization based on the quantization levels or remaining levels of the block or other blocks that have been decoded before the current remaining level. The decoder then selects the proper Exp-Golomb binarization according to the determined order to convert the binary representation into the value of the remaining level.
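A decoder-side sketch inverting the k-th order Exp-Golomb binarization (the function name and string-based bin handling are ours, for illustration):

```python
def exp_golomb_decode(bits, k, pos=0):
    """Decode one k-th order Exp-Golomb codeword from a string of bins.
    Count the leading zeros z, then read the z + k + 1 bits starting at
    the leading 1 and subtract 2^k. Returns (value, next_pos)."""
    z = 0
    while bits[pos + z] == "0":
        z += 1
    end = pos + z + (z + k + 1)
    value = int(bits[pos + z:end], 2) - (1 << k)
    return value, end

# "0111" under 1st order Exp-Golomb decodes to remaining level 5:
print(exp_golomb_decode("0111", 1))  # -> (5, 4)
```

Returning the next read position lets the caller decode several consecutive codewords from one bin string.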
  • the process 800 involves reconstructing the quantization level from the remaining level and other syntax elements, such as abs_level_gtxX_flag discussed above.
  • the decoder can parse the syntax elements from the portion of the binary representation and determine the value for the quantization level that corresponds to these syntax elements.
  • the decoder can further determine that the quantization level is the sum of the remaining level and the value determined from the syntax elements.
  • the process 800 involves reconstructing the block by determining pixel values of the block from the quantization levels through, for example, inverse quantization and inverse transformation as discussed above with respect to FIG. 2.
  • the decoded block of the video can be output for display.
  • FIG. 9 depicts an example of a computing device 900 that can implement the video encoder 100 of FIG. 1 or the video decoder 200 of FIG. 2.
  • the computing device 900 can include a processor 912 that is communicatively coupled to a memory 914 and that executes computer-executable program code and/or accesses information stored in the memory 914.
  • the processor 912 may comprise a microprocessor, an application-specific integrated circuit (“ASIC”), a state machine, or other processing device.
  • the processor 912 can include any of a number of processing devices, including one.
  • Such a processor can include or may be in communication with a computer-readable medium storing instructions that, when executed by the processor 912, cause the processor to perform the operations described herein.
  • the memory 914 can include any suitable non-transitory computer-readable medium.
  • the computer-readable medium can include any electronic, optical, magnetic, or other storage device capable of providing a processor with computer-readable instructions or other program code.
  • Non-limiting examples of a computer-readable medium include a magnetic disk, memory chip, ROM, RAM, an ASIC, a configured processor, optical storage, magnetic tape or other magnetic storage, or any other medium from which a computer processor can read instructions.
  • the instructions may include processor-specific instructions generated by a compiler and/or an interpreter from code written in any suitable computer programming language, including, for example, C, C++, C#, Visual Basic, Java, Python, Perl, JavaScript, and ActionScript.
  • the computing device 900 can also include a bus 916.
  • the bus 916 can communicatively couple one or more components of the computing device 900.
  • the computing device 900 can also include a number of external or internal devices such as input or output devices.
  • the computing device 900 is shown with an input/output (“I/O”) interface 918 that can receive input from one or more input devices 920 or provide output to one or more output devices 922.
  • the one or more input devices 920 and one or more output devices 922 can be communicatively coupled to the I/O interface 918.
  • the communicative coupling can be implemented via any suitable manner (e.g., a connection via a printed circuit board, connection via a cable, communication via wireless transmissions, etc.).
  • Non-limiting examples of input devices 920 include a touch screen (e.g., one or more cameras for imaging a touch area or pressure sensors for detecting pressure changes caused by a touch), a mouse, a keyboard, or any other device that can be used to generate input events in response to physical actions by a user of a computing device.
  • Non-limiting examples of output devices 922 include an LCD screen, an external monitor, a speaker, or any other device that can be used to display or otherwise present outputs generated by a computing device.
  • the computing device 900 can execute program code that configures the processor 912 to perform one or more of the operations described above with respect to FIGS. 1-8.
  • the program code can include the video encoder 100 or the video decoder 200.
  • the program code may be resident in the memory 914 or any suitable computer-readable medium and may be executed by the processor 912 or any other suitable processor.
  • the computing device 900 can also include at least one network interface device 924.
  • the network interface device 924 can include any device or group of devices suitable for establishing a wired or wireless data connection to one or more data networks 928.
  • Non-limiting examples of the network interface device 924 include an Ethernet network adapter, a modem, and/or the like.
  • the computing device 900 can transmit messages as electronic or optical signals via the network interface device 924.
  • a computing device can include any suitable arrangement of components that provide a result conditioned on one or more inputs.
  • Suitable computing devices include multi-purpose microprocessor-based computer systems accessing stored software that programs or configures the computing system from a general purpose computing apparatus to a specialized computing apparatus implementing one or more embodiments of the present subject matter. Any suitable programming, scripting, or other type of language or combinations of languages may be used to implement the teachings contained herein in software to be used in programming or configuring a computing device.
  • Embodiments of the methods disclosed herein may be performed in the operation of such computing devices.
  • the order of the blocks presented in the examples above can be varied — for example, blocks can be re-ordered, combined, and/or broken into sub-blocks. Some blocks or processes can be performed in parallel.

Abstract

In some embodiments, a video decoder decodes a block of a video from a bitstream of the video. The video decoder accesses a binary string decoded from the bitstream of the video representing the block of the video. The block of the video is associated with a plurality of quantization levels. The video decoder processes the binary string to recover the plurality of quantization levels of the block. The processing includes obtaining a portion of the binary string corresponding to a quantization level of the plurality of quantization levels and converting the portion of the binary string into the quantization level according to k-th order Exp-Golomb binarization, wherein k is an integer larger than zero. The video decoder reconstructs the block by determining pixel values of the block from the plurality of quantization levels.

Description

REMAINING LEVEL BINARIZATION FOR VIDEO CODING
Cross-Reference to Related Applications
[0001] This application claims priority to U.S. Provisional Application No. 63/159,913, entitled “Remaining Level Binarization method for AVS Video Coding,” filed on March 11, 2021, which is hereby incorporated in its entirety by this reference.
Technical Field
[0002] This disclosure relates generally to computer-implemented methods and systems for video processing. Specifically, the present disclosure involves remaining level binarization for video coding.
Background
[0003] The ubiquitous camera-enabled devices, such as smartphones, tablets, and computers, have made it easier than ever to capture videos or images. However, the amount of data for even a short video can be substantially large. Video coding technology (including video encoding and decoding) allows video data to be compressed into smaller sizes thereby allowing various videos to be stored and transmitted. Video coding has been used in a wide range of applications, such as digital TV broadcast, video transmission over the internet and mobile networks, real-time applications (e.g., video chat, video conferencing), DVD and Blu- ray discs, and so on. To reduce the storage space for storing a video and/or the network bandwidth consumption for transmitting a video, it is desired to improve the efficiency of the video coding scheme.
Summary
[0004] Some embodiments involve remaining level binarization for video coding. In one example, a method for decoding a video includes accessing a binary representation of a block of the video, the block of the video associated with a plurality of quantization levels, processing the binary representation to recover the plurality of quantization levels of the block, and reconstructing the block by determining pixel values of the block from the plurality of quantization levels. The processing includes obtaining a portion of the binary representation corresponding to a quantization level of the plurality of quantization levels and converting the portion of the binary representation into the quantization level according to a k-th order Exp-Golomb binarization. k indicates the order of the Exp-Golomb binarization and is an integer larger than zero.
[0005] In another example, a non-transitory computer-readable medium has program code that is stored thereon and executable by one or more processing devices for performing operations. The operations include accessing a binary representation of a block of the video, the block of the video associated with a plurality of quantization levels, processing the binary representation to recover the plurality of quantization levels of the block, and reconstructing the block by determining pixel values of the block from the plurality of quantization levels. The processing includes obtaining a portion of the binary representation corresponding to a quantization level of the plurality of quantization levels and converting the portion of the binary representation into the quantization level according to a k-th order Exp-Golomb binarization. k indicates the order of the Exp-Golomb binarization and is an integer larger than zero.
[0006] In yet another example, a system includes a processing device and a non-transitory computer-readable medium communicatively coupled to the processing device. The processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations. The operations include accessing a binary representation of a block of the video, the block of the video associated with a plurality of quantization levels, processing the binary representation to recover the plurality of quantization levels of the block, and reconstructing the block by determining pixel values of the block from the plurality of quantization levels. The processing includes obtaining a portion of the binary representation corresponding to a quantization level of the plurality of quantization levels and converting the portion of the binary representation into the quantization level according to a k-th order Exp-Golomb binarization. k indicates the order of the Exp-Golomb binarization and is an integer larger than zero.
[0007] In a further example, a method for encoding a video includes accessing a plurality of quantization levels of a block of the video, processing each quantization level of the plurality of quantization levels of the block to generate binary representations for the plurality of quantization levels, the processing comprising: determining a remaining level of the quantization level and converting the remaining level of the quantization level into a binary representation for the quantization level according to a k-th order Exp-Golomb binarization, wherein k indicates an order of the Exp-Golomb binarization and is an integer larger than zero; and encoding at least the binary representations for the plurality of quantization levels of the block into a bitstream of the video.
[0008] In another example, a non-transitory computer-readable medium has program code that is stored thereon. The program code is executable by one or more processing devices for performing operations comprising: accessing a plurality of quantization levels of a block of a video; processing each quantization level of the plurality of quantization levels of the block to generate binary representations for the plurality of quantization levels, the processing comprising: determining a remaining level of the quantization level; and converting the remaining level of the quantization level into a binary representation for the quantization level according to a k-th order Exp-Golomb binarization, wherein k indicates an order of the Exp-Golomb binarization and is an integer larger than zero; and encoding at least the binary representations for the plurality of quantization levels of the block into a bitstream of the video.
[0009] In yet another example, a system includes a processing device; and a non-transitory computer-readable medium communicatively coupled to the processing device, wherein the processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations comprising: accessing a plurality of quantization levels of a block of a video; processing each quantization level of the plurality of quantization levels of the block to generate binary representations for the plurality of quantization levels, the processing comprising: determining a remaining level of the quantization level; and converting the remaining level of the quantization level into a binary representation for the quantization level according to a k-th order Exp-Golomb binarization, wherein k indicates an order of the Exp-Golomb binarization and is an integer larger than zero; and encoding at least the binary representations for the plurality of quantization levels of the block into a bitstream of the video.
[0010] These illustrative embodiments are mentioned not to limit or define the disclosure, but to provide examples to aid understanding thereof. Additional embodiments are discussed in the Detailed Description, and further description is provided there.
Brief Description of the Drawings
[0011] Features, embodiments, and advantages of the present disclosure are better understood when the following Detailed Description is read with reference to the accompanying drawings.
[0012] FIG. 1 is a block diagram showing an example of a video encoder configured to implement embodiments presented herein.
[0013] FIG. 2 is a block diagram showing an example of a video decoder configured to implement embodiments presented herein.
[0014] FIG. 3 depicts an example of a coding tree unit division of a picture in a video, according to some embodiments of the present disclosure.
[0015] FIG. 4 depicts an example of a coding unit division of a coding tree unit, according to some embodiments of the present disclosure.
[0016] FIG. 5 depicts an example of the scan region-based coefficient coding according to some embodiments of the present disclosure.
[0017] FIG. 6A depicts a table listing examples of k-th order Exp-Golomb binarization.
[0018] FIG. 6B depicts an example of a special position template used in an adaptive binarization method according to some embodiments of the present disclosure.
[0019] FIG. 7 depicts an example of a process for encoding a block of a video according to some embodiments of the present disclosure.
[0020] FIG. 8 depicts an example of a process for decoding a block of a video according to some embodiments of the present disclosure.
[0021] FIG. 9 depicts an example of a computing system that can be used to implement some embodiments of the present disclosure.
Detailed Description
[0022] Various embodiments provide remaining level binarization schemes for video coding. As discussed above, more and more video data are being generated, stored, and transmitted. It is beneficial to increase the efficiency of video coding technology, thereby using less data to represent a video without compromising the visual quality of the decoded video. One way to improve the coding efficiency is through entropy coding to compress processed video coefficients into a binary bitstream using as few bits as possible. Before entropy coding, video coefficient levels (or remaining levels of the coefficient levels) are binarized into binary bins, and coding algorithms such as context adaptive modeling based binary arithmetic coding (CABAC) can further compress bins into bits. However, the current binarization method used in the Audio Video Coding Standard (AVS) uses the 0-th order Exp-Golomb codewords. This binarization method may not be optimal, especially when the bit depth of the video samples increases and values to be binarized become larger. Various embodiments described herein address these problems by introducing a higher order binarization method to the remaining level binarization, thereby improving the coding efficiency.
[0023] In one embodiment, the 1st order Exp-Golomb binarization method is used instead of the 0th order Exp-Golomb. This allows fewer bits to be used to represent coefficient levels (or remaining coefficient levels), especially those levels with large values, thereby improving the coding efficiency. In another embodiment, the k-th order Exp-Golomb binarization method is used with k > 1 to further improve the coding efficiency for high bit-depth videos. In further embodiments, an adaptive k-th order Exp-Golomb binarization method is used to binarize the video coefficient levels. For example, the level information preceding the current position is used to decide the k-th order Exp-Golomb binarization method to be used to binarize the remaining level of the current position. The adaptive binarization method allows the order of the binarization method to be changed according to the content of the video, leading to a more efficient coding result (i.e., using fewer bits to represent the video). These techniques can be an effective coding tool in future video coding standards.
[0024] Referring now to the drawings, FIG. 1 is a block diagram showing an example of a video encoder 100 configured to implement embodiments presented herein. In the example shown in FIG. 1, the video encoder 100 includes a partition module 112, a transform module 114, a quantization module 115, an inverse quantization module 118, an inverse transform module 119, an in-loop filter module 120, an intra prediction module 126, an inter prediction module 124, a motion estimation module 122, a decoded picture buffer 130, and an entropy coding module 116.
[0025] The input to the video encoder 100 is an input video 102 containing a sequence of pictures (also referred to as frames or images). In a block-based video encoder, for each of the pictures, the video encoder 100 employs a partition module 112 to partition the picture into blocks 104, and each block contains multiple pixels. The blocks may be macroblocks, coding tree units, coding units, prediction units, and/or prediction blocks. One picture may include blocks of different sizes and the block partitions of different pictures of the video may also differ. Each block may be encoded using different predictions, such as intra prediction or inter prediction or intra and inter hybrid prediction.
[0026] Usually, the first picture of a video signal is an intra-coded picture, which is encoded using only intra prediction. In the intra prediction mode, a block of a picture is predicted using only data that has been encoded from the same picture. A picture that is intra-coded can be decoded without information from other pictures. To perform the intra-prediction, the video encoder 100 shown in FIG. 1 can employ the intra prediction module 126. The intra prediction module 126 is configured to use reconstructed samples in reconstructed blocks 136 of neighboring blocks of the same picture to generate an intra-prediction block (the prediction block 134). The intra prediction is performed according to an intra-prediction mode selected for the block. The video encoder 100 then calculates the difference between block 104 and the intra-prediction block 134. This difference is referred to as residual block 106.
[0027] To further remove the redundancy from the block, the residual block 106 is transformed by the transform module 114 into a transform domain by applying a transform on the samples in the block. Examples of the transform may include, but are not limited to, a discrete cosine transform (DCT) or discrete sine transform (DST). The transformed values may be referred to as transform coefficients representing the residual block in the transform domain. In some examples, the residual block may be quantized directly without being transformed by the transform module 114. This is referred to as a transform skip mode.
[0028] The video encoder 100 can further use the quantization module 115 to quantize the transform coefficients to obtain quantized coefficients. Quantization includes dividing a sample by a quantization step size followed by subsequent rounding, whereas inverse quantization involves multiplying the quantized value by the quantization step size. Such a quantization process is referred to as scalar quantization. Quantization is used to reduce the dynamic range of video samples (transformed or non-transformed) so that fewer bits are used to represent the video samples.
[0029] The quantization of coefficients/samples within a block can be done independently, and this kind of quantization method is used in some existing video compression standards, such as H.264 and HEVC. For an N-by-M block, a specific scan order may be used to convert the 2-D coefficients of a block into a 1-D array for coefficient quantization and coding. Quantization of a coefficient within a block may make use of the scan order information. For example, the quantization of a given coefficient in the block may depend on the status of the previous quantized value along the scan order. In order to further improve the coding efficiency, more than one quantizer may be used. Which quantizer is used for quantizing a current coefficient depends on the information preceding the current coefficient in the encoding/decoding scan order. Such a quantization approach is referred to as dependent quantization.
[0030] The degree of quantization may be adjusted using the quantization step sizes. For instance, for scalar quantization, different quantization step sizes may be applied to achieve finer or coarser quantization. Smaller quantization step sizes correspond to finer quantization, whereas larger quantization step sizes correspond to coarser quantization. The quantization step size can be indicated by a quantization parameter (QP). The quantization parameters are provided in the encoded bitstream of the video such that the video decoder can apply the same quantization parameters for decoding.
[0031] The quantized samples are then coded by the entropy coding module 116 to further reduce the size of the video signal. The entropy encoding module 116 is configured to apply an entropy encoding algorithm on the quantized samples. In some examples, the quantized samples are binarized into binary bins and coding algorithms further compress the binary bins into bits. Examples of the binarization methods include, but are not limited to, a combined truncated Rice (TR) and limited k-th order Exp-Golomb (EGk) binarization, and k-th order Exp-Golomb binarization. Examples of the entropy encoding algorithm include, but are not limited to, a variable length coding (VLC) scheme, a context adaptive VLC scheme (CAVLC), an arithmetic coding scheme, a binarization, a context adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or other entropy encoding techniques. The entropy-coded data is added to the bitstream of the output encoded video 132. [0032] As discussed above, reconstructed blocks 136 from neighboring blocks are used in the intra-prediction of blocks of a picture. Generating the reconstructed block 136 of a block involves calculating the reconstructed residuals of this block. The reconstructed residual can be determined by applying inverse quantization and inverse transform on the quantized residual of the block. The inverse quantization module 118 is configured to apply the inverse quantization on the quantized samples to obtain de-quantized coefficients. The inverse quantization module 118 applies the inverse of the quantization scheme applied by the quantization module 115 by using the same quantization step size as the quantization module 115. The inverse transform module 119 is configured to apply the inverse transform of the transform applied by the transform module 114 on the de-quantized samples, such as inverse DCT or inverse DST.
The output of the inverse transform module 119 is the reconstructed residuals for the block in the pixel domain. The reconstructed residuals can be added to the prediction block 134 of the block to obtain a reconstructed block 136 in the pixel domain. For blocks where the transform is skipped, the inverse transform module 119 is not applied to those blocks. The de-quantized samples are the reconstructed residuals for the blocks.
[0033] Blocks in subsequent pictures following the first intra-predicted picture can be coded using either inter prediction or intra prediction. In inter-prediction, the prediction of a block in a picture is from one or more previously encoded video pictures. To perform inter prediction, the video encoder 100 uses an inter prediction module 124. The inter prediction module 124 is configured to perform motion compensation for a block based on the motion estimation provided by the motion estimation module 122.
[0034] The motion estimation module 122 compares a current block 104 of the current picture with decoded reference pictures 108 for motion estimation. The decoded reference pictures 108 are stored in a decoded picture buffer 130. The motion estimation module 122 selects a reference block from the decoded reference pictures 108 that best matches the current block. The motion estimation module 122 further identifies an offset between the position (e.g., x, y coordinates) of the reference block and the position of the current block. This offset is referred to as the motion vector (MV) and is provided to the inter prediction module 124 along with the selected reference block. In some cases, multiple reference blocks are identified for the current block in multiple decoded reference pictures 108. Therefore, multiple motion vectors are generated and provided to the inter prediction module 124 along with the corresponding reference blocks.
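The block matching performed by the motion estimation module 122 can be illustrated with a brute-force sketch. The sum of absolute differences (SAD) cost and the exhaustive integer-pixel search are illustrative assumptions; practical encoders use fast search patterns, sub-pixel refinement, and additional cost terms.

```python
def best_motion_vector(cur_block, ref_picture, block_pos, search_range):
    """Exhaustive block matching: find the offset (dx, dy) into the
    reference picture that minimizes the sum of absolute differences."""
    bx, by = block_pos
    n = len(cur_block)            # assume a square n x n block
    best, best_sad = (0, 0), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ry, rx = by + dy, bx + dx
            if ry < 0 or rx < 0 or ry + n > len(ref_picture) or rx + n > len(ref_picture[0]):
                continue          # candidate falls outside the reference picture
            sad = sum(abs(cur_block[i][j] - ref_picture[ry + i][rx + j])
                      for i in range(n) for j in range(n))
            if sad < best_sad:
                best_sad, best = sad, (dx, dy)
    return best, best_sad
```

The returned offset is the motion vector provided to the inter prediction module 124 along with the selected reference block.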
[0035] The inter prediction module 124 uses the motion vector(s) along with other inter prediction parameters to perform motion compensation to generate a prediction of the current block, i.e., the inter prediction block 134. For example, based on the motion vector(s), the inter prediction module 124 can locate the prediction block(s) pointed to by the motion vector(s) in the corresponding reference picture(s). If there is more than one prediction block, these prediction blocks are combined with some weights to generate a prediction block 134 for the current block.
[0036] For inter-predicted blocks, the video encoder 100 can subtract the inter-prediction block 134 from the block 104 to generate the residual block 106. The residual block 106 can be transformed, quantized, and entropy coded in the same way as the residuals of an intra-predicted block discussed above. Likewise, the reconstructed block 136 of an inter-predicted block can be obtained through inverse quantizing, inverse transforming the residual, and subsequently combining with the corresponding prediction block 134.
[0037] To obtain the decoded picture 108 used for motion estimation, the reconstructed block 136 is processed by an in-loop filter module 120. The in-loop filter module 120 is configured to smooth out pixel transitions, thereby improving the video quality. The in-loop filter module 120 may be configured to implement one or more in-loop filters, such as a de-blocking filter, a sample-adaptive offset (SAO) filter, an adaptive loop filter (ALF), etc. [0038] FIG. 2 depicts an example of a video decoder 200 configured to implement embodiments presented herein. The video decoder 200 processes an encoded video 202 in a bitstream and generates decoded pictures 208. In the example shown in FIG. 2, the video decoder 200 includes an entropy decoding module 216, an inverse quantization module 218, an inverse transform module 219, an in-loop filter module 220, an intra prediction module 226, an inter prediction module 224, and a decoded picture buffer 230.
[0039] The entropy decoding module 216 is configured to perform entropy decoding of the encoded video 202. The entropy decoding module 216 decodes the quantized coefficients, coding parameters including intra prediction parameters and inter prediction parameters, and other information. In some examples, the entropy decoding module 216 decodes the bitstream of the encoded video 202 to binary representations and then converts the binary representations to quantization levels of the coefficients. The entropy-decoded coefficient levels are then inverse quantized by the inverse quantization module 218 and subsequently inverse transformed by the inverse transform module 219 to the pixel domain. The inverse quantization module 218 and the inverse transform module 219 function similarly to the inverse quantization module 118 and the inverse transform module 119, respectively, as described above with respect to FIG. 1. The inverse-transformed residual block can be added to the corresponding prediction block 234 to generate a reconstructed block 236. For blocks where the transform is skipped, the inverse transform module 219 is not applied to those blocks. The de-quantized samples generated by the inverse quantization module 218 are used to generate the reconstructed block 236.
[0040] The prediction block 234 of a particular block is generated based on the prediction mode of the block. If the coding parameters of the block indicate that the block is intra predicted, the reconstructed block 236 of a reference block in the same picture can be fed into the intra prediction module 226 to generate the prediction block 234 for the block. If the coding parameters of the block indicate that the block is inter-predicted, the prediction block 234 is generated by the inter prediction module 224. The intra prediction module 226 and the inter prediction module 224 function similarly to the intra prediction module 126 and the inter prediction module 124 of FIG. 1, respectively.
[0041] As discussed above with respect to FIG. 1, the inter prediction involves one or more reference pictures. The video decoder 200 generates the decoded pictures 208 for the reference pictures by applying the in-loop filter module 220 to the reconstructed blocks of the reference pictures. The decoded pictures 208 are stored in the decoded picture buffer 230 for use by the inter prediction module 224 and also for output.
[0042] Referring now to FIG. 3, FIG. 3 depicts an example of a coding tree unit division of a picture in a video, according to some embodiments of the present disclosure. As discussed above with respect to FIGS. 1 and 2, to encode a picture of a video, the picture is divided into blocks, such as the CTUs (Coding Tree Units) 302 in AVS, as shown in FIG. 3. For example, the CTUs 302 can be blocks of 128x128 pixels. The CTUs are processed according to an order, such as the order shown in FIG. 3. In some examples, each CTU 302 in a picture can be partitioned into one or more CUs (Coding Units) 402 as shown in FIG. 4, which can be further partitioned into prediction units or transform units (TUs) for prediction and transformation. Depending on the coding schemes, a CTU 302 may be partitioned into CUs 402 differently. For example, in AVS, the CUs 402 can be rectangular or square, and can be coded without further partitioning into prediction units or transform units. Each CU 402 can be as large as its root CTU 302 or be subdivisions of a root CTU 302 as small as 4x4 blocks. As shown in FIG. 4, a division of a CTU 302 into CUs 402 in AVS can be quadtree splitting or binary tree splitting or ternary tree splitting. In FIG. 4, solid lines indicate quadtree splitting and dashed lines indicate binary or ternary tree splitting.
[0043] As discussed above with respect to FIGS. 1 and 2, quantization is used to reduce the dynamic range of elements of blocks in the video signal so that fewer bits are used to represent the video signal. In some examples, before quantization, the transformed or non- transformed video signal at a specific position is referred to as a coefficient. After quantization, the quantized value of coefficient is called as quantization level or level. Quantization typically consists of division by a quantization step size and subsequent rounding while inverse quantization consists of multiplication by the quantization step size. Such a quantization process is also referred to as scalar quantization. The quantization of the coefficients within a block can be performed independently and this kind of independent quantization method is used in some existing video compression standards, such as H.264, HEVC, AVS, etc. In other examples, dependent quantization is employed, such as in VVC.
[0044] Residual Coding
[0045] Residual coding is used to convert the quantization levels into the bitstream in video coding. After quantization, there are N × M quantization levels for an N × M block. These N × M levels may be zero or non-zero values. The non-zero levels will further be binarized into binary bins if the levels are not binary. Context modeling based binary arithmetic coding, such as CABAC, can further compress the bins into bits. For transformed regular residual coding (RRC) blocks and transform skip residual coding (TSRC) blocks in AVS, a scan region-based coefficient coding (SRCC) may be used.
[0046] FIG. 5 illustrates an example of the scan region-based coefficient coding. For a blockWidth × blockHeight block 500, the two-dimensional (2-D) coordinates (scan_region_x and scan_region_y) are coded in the bitstream to indicate a smallest rectangular area 504 within which non-zero levels exist and outside which the levels of all positions will be zero. This smallest rectangular area 504 is referred to as the SRCC area or SRCC block. The scan_region_x and scan_region_y are less than or equal to blockWidth and blockHeight, respectively. Within the SRCC block, the level of each position may be zero or non-zero, and there are always non-zero levels with coordinates equal to scan_region_x or scan_region_y or both.
[0047] An SRCC block may consist of several pre-defined sub-blocks (e.g., 4×4 sub-blocks). Because an SRCC block may have a size that does not fit an integer number of regular sub-blocks, the sub-blocks in the last row or column may have a size smaller than the regular sub-blocks. For an SRCC block with a size of scan_region_x by scan_region_y, a specific coding scan order may be used to convert the 2-D coefficients of the block into a one-dimensional (1-D) order for coefficient quantization and coding. Typically, the coding scan starts from the left-top corner and stops at the last sub-block located at the right-bottom corner of an SRCC block in a right-bottom direction. The last sub-block is derived from (scan_region_x, scan_region_y) according to a predefined coding scan order. RRC will code sub-block by sub-block starting from the last sub-block with a reverse coding scan order. Within a sub-block, residual coding will code the level of each position with a reverse coding scan order. FIG. 5 shows an example of a block 500 with an SRCC area 504 and sub-blocks 506A-506D. Each sub-block 506 has a pre-determined reverse scanning order for coding the quantization levels in the sub-block 506. In this example, the sub-block 506A has a size of 3×3 and the coding starts at the lower right corner at position L0 and ends at the upper left corner at position L8.
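The derivation of the SRCC area from the quantization levels of a block can be sketched as follows. Representing scan_region_x and scan_region_y as 0-based maximum coordinates of non-zero levels is an assumption for illustration; the exact bitstream representation is a syntax detail.

```python
def srcc_region(levels):
    """Find the smallest top-left rectangle containing all non-zero
    levels of a 2-D block. Returns (scan_region_x, scan_region_y): the
    largest column and row coordinates at which a non-zero level occurs."""
    scan_region_x = scan_region_y = 0
    for y, row in enumerate(levels):
        for x, v in enumerate(row):
            if v != 0:
                scan_region_x = max(scan_region_x, x)
                scan_region_y = max(scan_region_y, y)
    return scan_region_x, scan_region_y
```

By construction, at least one non-zero level sits on the right edge (x equal to scan_region_x) and at least one on the bottom edge (y equal to scan_region_y), matching the property stated above.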
[0048] For each level, a flag, named sig_flag, is first coded into the bitstream to indicate if the level is zero or non-zero. These sig_flags for all the positions within a sub-block will be coded into the bitstream sequentially, with exceptions at two special positions when certain conditions are met. At the position of (0, scan_region_y), if the quantized level of all positions (x, scan_region_y) with x = 1, ..., scan_region_x is zero, the sig_flag is not coded, as the level for this position must be non-zero. Similarly, at the position of (scan_region_x, 0), if the quantized level of all positions (scan_region_x, y) for y = 1, ..., scan_region_y is zero, the sig_flag is not coded, as the level for this position must be non-zero. [0049] After all sig_flags within a sub-block are coded, for any non-zero level within the sub-block, a coeff_abs_level_greater1_flag will be coded to indicate if the absolute level is 1 or greater than 1. In AVS, if the absolute level is greater than 1, the coeff_abs_level_greater2_flag will be coded to indicate if the absolute level is 2 or greater than 2.
[0050] After the coeff_abs_level_greater1_flag and coeff_abs_level_greater2_flag within the sub-block are coded, for any absolute level greater than 2 of any position within the sub-block, another syntax element called coeff_abs_level_remaining will be coded for these positions. The coeff_abs_level_remaining represents the value of the absolute level minus 3 in the current AVS. After the syntax elements coeff_abs_level_remaining within the sub-block are coded, a flag coeff_sign indicating the level being negative or positive for each non-zero level position will be coded. Once sig_flag, coeff_abs_level_greater1_flag, coeff_abs_level_greater2_flag, coeff_abs_level_remaining, and coeff_sign within the sub-block are coded, the residual coding process will proceed to the next sub-block along the reverse coding scan order until all the syntax elements of all sub-blocks within a residual block are coded.
[0051] In some examples, video coding schemes, such as AVS, may adopt more flexible syntax elements (e.g., abs_level_gtxX_flag) to allow conditionally parsing the syntax elements for level coding of a residual block. Table 1 shows one example with the binarization of the absolute value of quantization levels. Here, abs_level_gtxX_flag describes whether the absolute value of the quantization level is greater than X, where X is an integer number, such as 0, 1, 2, ..., or N. If abs_level_gtxY_flag is 0 where Y is an integer between 0 and N-1, abs_level_gtx(Y+1)_flag will not be present. If abs_level_gtxY_flag is 1, abs_level_gtx(Y+1)_flag will be present. For example, for abs(level)=2, abs_level_gtx0_flag is 1, and thus abs_level_gtx1_flag is present. Since abs_level_gtx1_flag is 1, abs_level_gtx2_flag is present. In this example, abs_level_gtx2_flag is 0 and thus abs_level_gtx3_flag will not be present. [0052] Moreover, if abs_level_gtxN_flag is 0, the remaining level will not be present. When abs_level_gtxN_flag is 1, the remaining level will be present, and it represents the value of the level minus (N+1). In the example shown in Table 1, N=3. So for abs(level)=3, abs_level_gtx3_flag is 0 and thus the remaining level (denoted as remainder in Table 1) is not present. For abs(level)=5, abs_level_gtx3_flag is 1, and thus the remaining level is present and is 5 - (3+1) = 1. How these syntax elements are coded in the bitstream is not constrained.
Table 1. The residual coding based upon abs_level_gtxX_flag and remaining level
abs(level)  gtx0_flag  gtx1_flag  gtx2_flag  gtx3_flag  remainder
0           0          -          -          -          -
1           1          0          -          -          -
2           1          1          0          -          -
3           1          1          1          0          -
4           1          1          1          1          0
5           1          1          1          1          1
6           1          1          1          1          2
...
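The decomposition of an absolute level into abs_level_gtxX_flag values and a remaining level described above can be sketched as follows, with N=3 as in the Table 1 example:

```python
def binarize_level_flags(abs_level, n_max=3):
    """Decompose abs(level) into abs_level_gtxX_flag values (X = 0..n_max)
    and a remaining level, following the conditional-presence rule:
    abs_level_gtx(Y+1)_flag is present only if abs_level_gtxY_flag is 1."""
    flags = []
    for x in range(n_max + 1):
        flag = 1 if abs_level > x else 0
        flags.append(flag)
        if flag == 0:
            return flags, None             # remaining level not present
    return flags, abs_level - (n_max + 1)  # present: level minus (N+1)
```

For abs(level)=5 this yields four flags equal to 1 and a remaining level of 1, matching the worked example above.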
[0053] In the AVS, the 0-th order Exp-Golomb binarization method is used for binarizing the remaining levels. However, the 0-th order Exp-Golomb binarization may not be optimal for the binarization of remaining levels, especially when the bit depth of the video samples is high, leading to a higher bit rate of the encoded video. In one embodiment, the 1st order Exp-Golomb binarization method is proposed to be used in the remaining level binarization to improve the video coding performance.
[0054] FIG. 6 shows the codewords of k-th order Exp-Golomb binarization where k is an integer, such as 0, 1, 2, .... From FIG. 6, it can be seen that a lower order Exp-Golomb binarization works better for remaining levels having small values, whereas a higher order Exp-Golomb binarization works better for remaining levels having large values. For example, if the remaining levels are distributed within a small number range, such as 0 to 2, the 0-th order Exp-Golomb binarization provides the smallest total number of bins among the binarization schemes in FIG. 6 to represent these remaining levels, and thus requires the fewest bits. However, if the remaining levels at many positions are in the range of 3 to 5, the 1st order Exp-Golomb binarization may lead to fewer binarization bins than the 0-th order Exp-Golomb binarization. Likewise, as the values of the remaining levels get larger, a higher order Exp-Golomb binarization provides better coding efficiency than a lower order binarization scheme.
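One common construction of the k-th order Exp-Golomb codeword, consistent with the codewords discussed above (e.g., a remaining level of 5 maps to "0111" for k=1), is to add 2^k to the value, write the sum in binary, and prefix it with enough zeros to signal its length:

```python
def exp_golomb_encode(value, k):
    """k-th order Exp-Golomb codeword for a non-negative integer:
    a run of leading zeros, then the binary form of (value + 2**k)."""
    code_num = value + (1 << k)
    bits = bin(code_num)[2:]            # binary string without the '0b' prefix
    prefix = "0" * (len(bits) - 1 - k)  # leading zeros encode the suffix length
    return prefix + bits
```

For k=0 this reproduces the familiar Exp-Golomb codes "1", "010", "011", ...; for k=1 the codewords start at two bits, which is why the 1st order code spends fewer bins on mid-range values at the cost of one extra bin on the smallest ones.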
[0055] In one embodiment, the 1st order Exp-Golomb binarization (i.e., k=1 in FIG. 6) is used to binarize a remaining level after removing (N+1) from the absolute level, where N is the largest value for which abs_level_gtxN_flag is present for the absolute levels. N is 2 in the latest AVS. The proposed 1st order Exp-Golomb binarization can be used for both the regular residual coding (RRC) and the transform skip residual coding (TSRC). Alternatively, the 1st order Exp-Golomb binarization can be used for RRC only or TSRC only. In another example, k-th order Exp-Golomb binarization is used to binarize the remaining level where k may be bigger than 1. [0056] Alternatively, or additionally, an adaptive Exp-Golomb binarization can be used to binarize the remaining levels. In the adaptive binarization approach, the level information preceding the current position is used to decide the order k of the Exp-Golomb code to binarize the remaining level of the current position. For example, a statistic value (e.g., the sum, average, or another statistic) of M absolute levels or remaining levels of previously coded positions can be used to adaptively decide the order k of the Exp-Golomb binarization for the current position. Several threshold values t1, t2, ..., tn (e.g., t1 < t2 < ··· < tn) can be used to classify this statistic value into several classes which respectively map to different values of the order k. In addition, in order to make the implementation hardware friendly, a special position template may be used to calculate the statistic value of the M absolute levels of previously coded positions. This special position template can ensure that certain positions are not used when calculating the statistic value. For example, in some implementations positions that sit on the same scan line are processed in parallel.
As such, to avoid breaking the parallelism, when calculating the statistic value, those coded positions on the same scan line as the current position should not be used. The special position template can achieve such a goal. FIG. 6B depicts an example of the special position template used in the adaptive binarization. In this example, the current position 602 in a block 600 is shown as solid. The scan line is illustrated using line 604. The template 606 includes the five shaded positions. As can be seen, the positions in the template 606 do not include positions along the scan line 604. Thus, the template 606 can be used, without breaking the parallelism, to determine the positions for calculating the statistic value for binarization order selection. For example, if the statistic value falls between thresholds t1 and t2, k=0 can be selected; if it falls between thresholds t2 and t3, k=1 can be selected; and so on.
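The threshold-based selection of the order k from previously coded levels can be sketched as follows. Using the sum as the statistic and the specific threshold values are illustrative assumptions; the levels passed in would come from the template positions, which exclude the current scan line as described above.

```python
def select_eg_order(prev_abs_levels, thresholds):
    """Choose the Exp-Golomb order k for the current position from a
    statistic (here: the sum) of absolute levels at previously coded
    template positions. thresholds = (t1, t2, ..., tn) with t1 < t2 < ...;
    a statistic below t1 selects k=0, between t1 and t2 selects k=1, etc."""
    stat = sum(prev_abs_levels)
    k = 0
    for t in thresholds:
        if stat >= t:
            k += 1
        else:
            break
    return k
```

Because both encoder and decoder compute the statistic from already-coded levels, no extra signaling of k is needed.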
[0057] FIG. 7 depicts an example of a process 700 for encoding a partition for a video, according to some embodiments of the present disclosure. One or more computing devices (e.g., the computing device implementing the video encoder 100) implement operations depicted in FIG. 7 by executing suitable program code (e.g., the program code implementing the entropy coding module 116). For illustrative purposes, the process 700 is described with reference to some examples depicted in the figures. Other implementations, however, are possible.
[0058] At block 702, the process 700 involves accessing quantization levels of the residual of a block in a video. The block can be a portion of a picture of the input video, such as a coding unit 402 discussed in FIG. 4 or any type of block processed by a video encoder as a unit when performing quantization and binarization.
[0059] At block 704, which includes 706-708, the process 700 involves processing each quantization level of the block to generate binarized levels for the block. At block 706, the process 700 involves determining a remaining level of the quantization level. As discussed above, the video encoder can use syntax elements such as abs_level_gtxX_flag to indicate a quantization level. If the quantization level has a value lager than that can be represented by these syntax elements, the video encoder can determine the remaining level for binarization to be the quantization level minus the portion represented by the syntax elements. For example, for the quantization level 6 in Table 1, the remaining level is 2 after deducting the portion (i.e., 4) represented by the syntax elements (i.e., abs_level_gtx0_flag, abs_level_gtxl_flag, ... abs_level_gtx3_flag) from the quantization level 6.
[0060] At block 708, the process 700 involves converting the remaining level into a binary representation using k-th order Exp-Golomb codewords. In some examples, k is 1, that is, the 1st order Exp-Golomb binarization is used to binarize the remaining level. In other examples, k is greater than 1 and a higher order Exp-Golomb binarization is used to binarize the remaining level. The binarization can be performed by converting the value of the remaining level indicated in the third column of the table shown in FIG. 6A to the binarization shown in the second column. For example, if the 1st order Exp-Golomb binarization is used and the remaining level is 5, the binarization is “0111” according to FIG. 6A. Other values of the remaining level can be converted in a similar way according to FIG. 6A and the order of the Exp-Golomb binarization. As discussed above in detail, the adaptive Exp-Golomb binarization can be used to binarize the remaining levels. In this adaptive binarization method, the order of the Exp-Golomb binarization is determined based on the quantization levels or remaining levels preceding the current position. [0061] At block 710, the process 700 involves encoding the binary representations of the quantization levels in the block into a bitstream of the video. The encoding can be performed, for example, using the context adaptive modeling based binary arithmetic coding (CABAC) discussed above.
[0062] FIG. 8 depicts an example of a process 800 for decoding a block for a video, according to some embodiments of the present disclosure. One or more computing devices implement operations depicted in FIG. 8 by executing suitable program code. For example, a computing device implementing the video decoder 200 may implement the operations depicted in FIG. 8 by executing the program code for the entropy decoding module 216, the inverse quantization module 218, and the inverse transform module 219. For illustrative purposes, the process 800 is described with reference to some examples depicted in the figures. Other implementations, however, are possible.
[0063] At block 802, the process 800 involves accessing a binary string or a binary representation that represents a block of a video signal. The block can be a portion of a picture of the input video, such as a coding unit 402 discussed in FIG. 4 or any type of block processed by a video encoder as a unit when performing quantization and binarization.
[0064] At block 804, which includes 806-810, the process 800 involves processing the binary representation of the block to recover the quantization levels in the block. At block 806, the process 800 involves obtaining a portion of the binary representation that corresponds to a quantization level in the block. At block 808, the process 800 involves converting a part of the portion of the binary representation into a remaining level using the k-th order Exp-Golomb codewords. In some examples, k is 1 and the 1st order Exp-Golomb binarization is used to recover the remaining level from the binary representation. In other examples, k is greater than 1 and a higher order Exp-Golomb binarization is used to recover the remaining level. The de-binarization can be performed by converting the binary representation into the value of the remaining level according to the mapping between the second column and the third column of the table shown in FIG. 6A. For example, if the 1st order Exp-Golomb binarization is used and the binary string is “0111,” the remaining level is 5 according to FIG. 6A. Other values of the remaining level can be recovered in a similar way according to FIG. 6A and the order of the Exp-Golomb binarization. In some examples, the adaptive Exp-Golomb binarization may be used to binarize the remaining levels. In those examples, the decoder can first determine the order of the Exp-Golomb binarization based on the quantization levels or remaining levels of the block or other blocks that have been decoded before the current remaining level. The decoder then selects the proper Exp-Golomb binarization according to the determined order to convert the binary representation into the value of the remaining level.
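The de-binarization at block 808 can be sketched as follows, assuming the standard k-th order Exp-Golomb codeword structure (a run of p zeros, a terminating '1', then p+k suffix bits):

```python
def exp_golomb_decode(bits, k):
    """Decode one k-th order Exp-Golomb codeword from the front of a bit
    string; returns (value, bits_consumed). Assumes a well-formed codeword."""
    p = 0
    while bits[p] == "0":               # count the zero prefix
        p += 1
    consumed = p + 1 + p + k            # zeros + '1' + (p + k) suffix bits
    code_num = int(bits[p:consumed], 2) # '1' followed by the suffix bits
    return code_num - (1 << k), consumed
```

For k=1 the bit string “0111” decodes to the remaining level 5, matching the example above; the consumed-bit count lets the decoder advance to the next syntax element in the portion of the binary representation.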
[0065] At block 810, the process 800 involves reconstructing the quantization level from the remaining level and other syntax elements, such as abs_level_gtxX_flag discussed above. The decoder can parse the syntax elements from the portion of the binary representation and determine the value for the quantization level that corresponds to these syntax elements. The decoder can further determine that the quantization level is the sum of the remaining level and the value determined from the syntax elements.
[0066] At block 812, the process 800 involves reconstructing the block by determining pixel values of the block from the quantization levels through, for example, inverse quantization and inverse transformation as discussed above with respect to FIG. 2. The decoded block of the video can be output for display.
[0067] Computing System Example for Implementing Remaining Level Binarization for Video Coding
[0068] Any suitable computing system can be used for performing the operations described herein. For example, FIG. 9 depicts an example of a computing device 900 that can implement the video encoder 100 of FIG. 1 or the video decoder 200 of FIG. 2. In some embodiments, the computing device 900 can include a processor 912 that is communicatively coupled to a memory 914 and that executes computer-executable program code and/or accesses information stored in the memory 914. The processor 912 may comprise a microprocessor, an application-specific integrated circuit (“ASIC”), a state machine, or other processing device. The processor 912 can include any of a number of processing devices, including one. Such a processor can include or may be in communication with a computer-readable medium storing instructions that, when executed by the processor 912, cause the processor to perform the operations described herein. [0069] The memory 914 can include any suitable non-transitory computer-readable medium. The computer-readable medium can include any electronic, optical, magnetic, or other storage device capable of providing a processor with computer-readable instructions or other program code. Non-limiting examples of a computer-readable medium include a magnetic disk, memory chip, ROM, RAM, an ASIC, a configured processor, optical storage, magnetic tape or other magnetic storage, or any other medium from which a computer processor can read instructions. The instructions may include processor-specific instructions generated by a compiler and/or an interpreter from code written in any suitable computer programming language, including, for example, C, C++, C#, Visual Basic, Java, Python, Perl, JavaScript, and ActionScript.
[0070] The computing device 900 can also include a bus 916. The bus 916 can communicatively couple one or more components of the computing device 900. The computing device 900 can also include a number of external or internal devices such as input or output devices. For example, the computing device 900 is shown with an input/output (“I/O”) interface 918 that can receive input from one or more input devices 920 or provide output to one or more output devices 922. The one or more input devices 920 and one or more output devices 922 can be communicatively coupled to the I/O interface 918. The communicative coupling can be implemented via any suitable manner (e.g., a connection via a printed circuit board, connection via a cable, communication via wireless transmissions, etc.). Non-limiting examples of input devices 920 include a touch screen (e.g., one or more cameras for imaging a touch area or pressure sensors for detecting pressure changes caused by a touch), a mouse, a keyboard, or any other device that can be used to generate input events in response to physical actions by a user of a computing device. Non-limiting examples of output devices 922 include an LCD screen, an external monitor, a speaker, or any other device that can be used to display or otherwise present outputs generated by a computing device.
[0071] The computing device 900 can execute program code that configures the processor 912 to perform one or more of the operations described above with respect to FIGS. 1-8. The program code can include the video encoder 100 or the video decoder 200. The program code may be resident in the memory 914 or any suitable computer-readable medium and may be executed by the processor 912 or any other suitable processor. [0072] The computing device 900 can also include at least one network interface device 924. The network interface device 924 can include any device or group of devices suitable for establishing a wired or wireless data connection to one or more data networks 928. Non-limiting examples of the network interface device 924 include an Ethernet network adapter, a modem, and/or the like. The computing device 900 can transmit messages as electronic or optical signals via the network interface device 924.
[0073] General Considerations
[0074] Numerous specific details are set forth herein to provide a thorough understanding of the claimed subject matter. However, those skilled in the art will understand that the claimed subject matter may be practiced without these specific details. In other instances, methods, apparatuses, or systems that would be known by one of ordinary skill have not been described in detail so as not to obscure claimed subject matter.
[0075] Unless specifically stated otherwise, it is appreciated that throughout this specification discussions utilizing terms such as “processing,” “computing,” “calculating,” “determining,” and “identifying” or the like refer to actions or processes of a computing device, such as one or more computers or a similar electronic computing device or devices, that manipulate or transform data represented as physical electronic or magnetic quantities within memories, registers, or other information storage devices, transmission devices, or display devices of the computing platform.
[0076] The system or systems discussed herein are not limited to any particular hardware architecture or configuration. A computing device can include any suitable arrangement of components that provide a result conditioned on one or more inputs. Suitable computing devices include multi-purpose microprocessor-based computer systems accessing stored software that programs or configures the computing system from a general purpose computing apparatus to a specialized computing apparatus implementing one or more embodiments of the present subject matter. Any suitable programming, scripting, or other type of language or combinations of languages may be used to implement the teachings contained herein in software to be used in programming or configuring a computing device.
[0077] Embodiments of the methods disclosed herein may be performed in the operation of such computing devices. The order of the blocks presented in the examples above can be varied — for example, blocks can be re-ordered, combined, and/or broken into sub-blocks. Some blocks or processes can be performed in parallel.
[0078] The use of “adapted to” or “configured to” herein is meant as open and inclusive language that does not foreclose devices adapted to or configured to perform additional tasks or steps. Additionally, the use of “based on” is meant to be open and inclusive, in that a process, step, calculation, or other action “based on” one or more recited conditions or values may, in practice, be based on additional conditions or values beyond those recited. Headings, lists, and numbering included herein are for ease of explanation only and are not meant to be limiting.
[0079] While the present subject matter has been described in detail with respect to specific embodiments thereof, it will be appreciated that those skilled in the art, upon attaining an understanding of the foregoing, may readily produce alterations to, variations of, and equivalents to such embodiments. Accordingly, it should be understood that the present disclosure has been presented for purposes of example rather than limitation, and does not preclude the inclusion of such modifications, variations, and/or additions to the present subject matter as would be readily apparent to one of ordinary skill in the art.

Claims
1. A method for decoding a video, the method comprising: accessing a binary representation of a block of the video, the block of the video associated with a plurality of quantization levels; processing the binary representation to recover the plurality of quantization levels of the block, the processing comprising: obtaining a portion of the binary representation corresponding to a quantization level of the plurality of quantization levels; and converting the portion of the binary representation into the quantization level according to a k-th order Exp-Golomb binarization, wherein k indicates an order of the Exp-Golomb binarization and is an integer larger than zero; and reconstructing the block by determining pixel values of the block from the plurality of quantization levels.
2. The method of claim 1, wherein converting the portion of the binary representation into the quantization level according to the k-th order Exp-Golomb binarization comprises: converting a first part of the portion of the binary representation to generate a first value for the quantization level; converting a second part of the portion of the binary representation according to the k-th order Exp-Golomb binarization to generate a remaining level of the quantization level; and obtaining the quantization level by adding the first value and the remaining level of the quantization level.
3. The method of claim 1, wherein k is 1.
4. The method of claim 1, wherein k is larger than 1.
5. The method of claim 1, wherein converting the portion of the binary representation into the quantization level according to k-th order Exp-Golomb binarization comprises: determining a value of the order k of the Exp-Golomb binarization based, at least in part, upon one or more quantization levels of the plurality of quantization levels associated with the block, the one or more quantization levels preceding the quantization level; and converting the portion of the binary representation into the quantization level according to the determined k-th order Exp-Golomb binarization.
6. The method of claim 1, wherein the block comprises a coding unit.
7. The method of claim 1, wherein the plurality of quantization levels associated with the block comprise a quantized transformed signal of the block or a quantized non-transformed signal of the block.
8. A non-transitory computer-readable medium having program code that is stored thereon, the program code executable by one or more processing devices for performing operations comprising: accessing a binary representation of a block of a video, the block of the video associated with a plurality of quantization levels; processing the binary representation to recover the plurality of quantization levels of the block, the processing comprising: obtaining a portion of the binary representation corresponding to a quantization level of the plurality of quantization levels; and converting the portion of the binary representation into the quantization level according to k-th order Exp-Golomb binarization, wherein k is an integer larger than zero; and reconstructing the block by determining pixel values of the block from the plurality of quantization levels.
9. The non-transitory computer-readable medium of claim 8, wherein converting the portion of the binary representation into the quantization level according to k-th order Exp-Golomb binarization comprises: converting a first part of the portion of the binary representation to generate a first value for the quantization level; converting a second part of the portion of the binary representation according to the k-th order Exp-Golomb binarization to generate a remaining level of the quantization level; and obtaining the quantization level by adding the first value and the remaining level of the quantization level.
10. The non-transitory computer-readable medium of claim 8, wherein k is 1.
11. The non-transitory computer-readable medium of claim 8, wherein k is larger than 1.
12. The non-transitory computer-readable medium of claim 8, wherein converting the portion of the binary representation into the quantization level according to k-th order Exp-Golomb binarization comprises: determining a value of the order k of the Exp-Golomb binarization based, at least in part, upon one or more quantization levels of the plurality of quantization levels associated with the block, the one or more quantization levels preceding the quantization level; and converting the portion of the binary representation into the quantization level according to the determined k-th order Exp-Golomb binarization.
13. The non-transitory computer-readable medium of claim 8, wherein the block comprises a coding unit.
14. The non-transitory computer-readable medium of claim 8, wherein the plurality of quantization levels associated with the block comprise a quantized transformed signal of the block or a quantized non-transformed signal of the block.
15. A system comprising: a processing device; and a non-transitory computer-readable medium communicatively coupled to the processing device, wherein the processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations comprising: accessing a binary representation of a block of a video, the block of the video associated with a plurality of quantization levels; processing the binary representation to recover the plurality of quantization levels of the block, the processing comprising: obtaining a portion of the binary representation corresponding to a quantization level of the plurality of quantization levels; and converting the portion of the binary representation into the quantization level according to k-th order Exp-Golomb binarization, wherein k is an integer larger than zero; and reconstructing the block by determining pixel values of the block from the plurality of quantization levels.
16. The system of claim 15, wherein converting the portion of the binary representation into the quantization level according to k-th order Exp-Golomb binarization comprises: converting a first part of the portion of the binary representation to generate a first value for the quantization level; converting a second part of the portion of the binary representation according to the k-th order Exp-Golomb binarization to generate a remaining level of the quantization level; and obtaining the quantization level by adding the first value and the remaining level of the quantization level.
17. The system of claim 15, wherein k is 1.
18. The system of claim 15, wherein k is larger than 1.
19. The system of claim 15, wherein converting the portion of the binary representation into the quantization level according to k-th order Exp-Golomb binarization comprises: determining a value of the order k of the Exp-Golomb binarization based, at least in part, upon one or more quantization levels of the plurality of quantization levels associated with the block, the one or more quantization levels preceding the quantization level; and converting the portion of the binary representation into the quantization level according to the determined k-th order Exp-Golomb binarization.
20. The system of claim 15, wherein the block comprises a coding unit.
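The parsing recited in claims 1-5 can be illustrated with a short, non-normative sketch. The Python function below decodes a single k-th order Exp-Golomb codeword (k > 0) from a bit string; the function name `decode_egk` and the bit-string interface are illustrative assumptions for this sketch, not terminology from the specification.

```python
def decode_egk(bits: str, k: int) -> int:
    """Decode one k-th order Exp-Golomb codeword (k > 0) from a bit string.

    The codeword is a unary prefix of '1' bits terminated by a '0',
    followed by a fixed-length suffix. Each prefix '1' adds 2**k to the
    value and widens the suffix by one bit, so larger levels receive
    progressively longer suffixes.
    """
    value = 0
    pos = 0
    # Unary prefix: each '1' contributes 2**k and grows the suffix length.
    while bits[pos] == "1":
        value += 1 << k
        k += 1
        pos += 1
    pos += 1  # skip the terminating '0'
    # Fixed-length suffix of k bits, most significant bit first.
    value += int(bits[pos:pos + k], 2)
    return value
```

For example, with k = 1 the codewords "00", "01", "1000", and "1001" decode to the levels 0, 1, 2, and 3, respectively; short codewords are reserved for the small remaining levels that dominate typical residual statistics.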
21. A method for encoding a video, the method comprising: accessing a plurality of quantization levels of a block of the video; processing each quantization level of the plurality of quantization levels of the block to generate binary representations for the plurality of quantization levels, the processing comprising: determining a remaining level of the quantization level; and converting the remaining level of the quantization level into a binary representation for the quantization level according to a k-th order Exp-Golomb binarization, wherein k indicates an order of the Exp-Golomb binarization and is an integer larger than zero; and encoding at least the binary representations for the plurality of quantization levels of the block into a bitstream of the video.
22. The method of claim 21, wherein k is 1.
23. The method of claim 21, wherein k is larger than 1.
24. The method of claim 21, wherein converting the remaining level of the quantization level into the binary representation according to k-th order Exp-Golomb binarization comprises: determining a value of the order k of the Exp-Golomb binarization based, at least in part, upon one or more quantization levels or remaining levels of the plurality of quantization levels of the block, the one or more quantization levels or remaining levels preceding the quantization level; and converting the remaining level of the quantization level into the binary representation according to the determined k-th order Exp-Golomb binarization.
25. The method of claim 21, wherein the block comprises a coding unit.
26. The method of claim 21, wherein the plurality of quantization levels of the block comprise a quantized transformed signal of the block or a quantized non-transformed signal of the block.
27. A non-transitory computer-readable medium having program code that is stored thereon, the program code executable by one or more processing devices for performing operations comprising: accessing a plurality of quantization levels of a block of a video; processing each quantization level of the plurality of quantization levels of the block to generate binary representations for the plurality of quantization levels, the processing comprising: determining a remaining level of the quantization level; and converting the remaining level of the quantization level into a binary representation for the quantization level according to a k-th order Exp-Golomb binarization, wherein k indicates an order of the Exp-Golomb binarization and is an integer larger than zero; and encoding at least the binary representations for the plurality of quantization levels of the block into a bitstream of the video.
28. The non-transitory computer-readable medium of claim 27, wherein k is 1.
29. The non-transitory computer-readable medium of claim 27, wherein k is larger than 1.
30. The non-transitory computer-readable medium of claim 27, wherein converting the remaining level of the quantization level into the binary representation according to k-th order Exp-Golomb binarization comprises: determining a value of the order k of the Exp-Golomb binarization based, at least in part, upon one or more quantization levels or remaining levels of the plurality of quantization levels of the block, the one or more quantization levels or remaining levels preceding the quantization level; and converting the remaining level of the quantization level into the binary representation according to the determined k-th order Exp-Golomb binarization.
31. The non-transitory computer-readable medium of claim 27, wherein the block comprises a coding unit.
32. The non-transitory computer-readable medium of claim 27, wherein the plurality of quantization levels of the block comprise a quantized transformed signal of the block or a quantized non-transformed signal of the block.
33. A system comprising: a processing device; and a non-transitory computer-readable medium communicatively coupled to the processing device, wherein the processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations comprising: accessing a plurality of quantization levels of a block of a video; processing each quantization level of the plurality of quantization levels of the block to generate binary representations for the plurality of quantization levels, the processing comprising: determining a remaining level of the quantization level; and converting the remaining level of the quantization level into a binary representation for the quantization level according to a k-th order Exp-Golomb binarization, wherein k indicates an order of the Exp-Golomb binarization and is an integer larger than zero; and encoding at least the binary representations for the plurality of quantization levels of the block into a bitstream of the video.
34. The system of claim 33, wherein k is 1.
35. The system of claim 33, wherein k is larger than 1.
36. The system of claim 33, wherein converting the remaining level of the quantization level into the binary representation according to k-th order Exp-Golomb binarization comprises: determining a value of the order k of the Exp-Golomb binarization based, at least in part, upon one or more quantization levels or remaining levels of the plurality of quantization levels of the block, the one or more quantization levels or remaining levels preceding the quantization level; and converting the remaining level of the quantization level into the binary representation according to the determined k-th order Exp-Golomb binarization.
37. The system of claim 33, wherein the block comprises a coding unit.
38. The system of claim 33, wherein the plurality of quantization levels of the block comprise a quantized transformed signal of the block or a quantized non-transformed signal of the block.
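The encoder-side binarization recited in claims 21-24 can likewise be illustrated with a short, non-normative sketch. The Python function below converts a non-negative remaining level into a k-th order Exp-Golomb codeword (k > 0); the function name `encode_egk` and the bit-string return type are illustrative assumptions for this sketch, not part of the claimed subject matter.

```python
def encode_egk(level: int, k: int) -> str:
    """Binarize a non-negative remaining level with a k-th order
    Exp-Golomb code (k > 0), returning the codeword as a bit string.
    """
    assert level >= 0 and k > 0
    bits = []
    # Escape stage: while the level does not fit in a k-bit suffix,
    # emit a '1', subtract the covered range 2**k, and widen the suffix.
    while level >= (1 << k):
        bits.append("1")
        level -= 1 << k
        k += 1
    bits.append("0")  # terminates the unary prefix
    bits.append(format(level, f"0{k}b"))  # k-bit suffix, MSB first
    return "".join(bits)
```

With k = 1, the levels 0 through 3 map to "00", "01", "1000", and "1001". A larger order k lengthens the minimum codeword but slows codeword growth, which is why deriving k from previously coded levels (as in claims 24, 30, and 36) can be preferable when large remaining levels are frequent.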
PCT/US2022/071091 2021-03-11 2022-03-11 Remaining level binarization for video coding WO2022192902A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202280019616.6A CN116965028A (en) 2021-03-11 2022-03-11 Residual level binarization for video coding

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202163159913P 2021-03-11 2021-03-11
US63/159,913 2021-03-11

Publications (1)

Publication Number Publication Date
WO2022192902A1

Family

ID=83227134

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/071091 WO2022192902A1 (en) 2021-03-11 2022-03-11 Remaining level binarization for video coding

Country Status (2)

Country Link
CN (1) CN116965028A (en)
WO (1) WO2022192902A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180316938A1 (en) * 2017-04-26 2018-11-01 Canon Kabushiki Kaisha Method and apparatus for k-th order exp-golomb binarization
WO2019185769A1 (en) * 2018-03-29 2019-10-03 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Dependent quantization
US20200186164A1 (en) * 2011-01-14 2020-06-11 Ge Video Compression, Llc Entropy encoding and decoding scheme


Non-Patent Citations (2)

Title
D. BARDONE, E.S.G. CAROTTI, J.C. DE MARTIN, "Adaptive Golomb Codes for Level Binarization in the H.264/AVC FRExt Lossless Mode," IEEE International Symposium on Signal Processing and Information Technology (ISSPIT 2008), Piscataway, NJ, USA, 16 December 2008, pp. 287-291, XP031419586, ISBN: 978-1-4244-3554-8 *
JOEL SOLE, RAJAN JOSHI, NGUYEN NGUYEN, TIANYING JI, MARTA KARCZEWICZ, GORDON CLARE, FÉLIX HENRY, ALBERTO DUENAS, "Transform Coefficient Coding in HEVC," IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, 1 December 2012, pp. 1765-1777, XP011486338, ISSN: 1051-8215, DOI: 10.1109/TCSVT.2012.2223055 *

Also Published As

Publication number Publication date
CN116965028A (en) 2023-10-27


Legal Events

Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22768228; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 202280019616.6; Country of ref document: CN)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 22768228; Country of ref document: EP; Kind code of ref document: A1)