WO2023173025A2 - Adaptive context initialization for entropy coding in video coding - Google Patents
- Publication number
- WO2023173025A2 (application PCT/US2023/064052)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- context value
- slice
- previous
- frame
- ctu
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/44—Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/13—Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/174—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/1883—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit relating to sub-band structure, e.g. hierarchical level, directional tree, e.g. low-high [LH], high-low [HL], high-high [HH]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
- H04N19/31—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the temporal domain
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
- H04N19/36—Scalability techniques involving formatting the layers as a function of picture distortion after decoding, e.g. signal-to-noise [SNR] scalability
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/91—Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
Definitions
- This disclosure relates generally to video processing. Specifically, the present disclosure involves context initialization for entropy coding in video coding.
- Video coding technology allows video data to be compressed into smaller sizes, thereby allowing various videos to be stored and transmitted.
- Video coding has been used in a wide range of applications, such as digital TV broadcast, video transmission over the Internet and mobile networks, real-time applications (e.g., video chat, video conferencing), DVD and Blu-ray discs, and so on. To reduce the storage space for storing a video and/or the network bandwidth consumption for transmitting a video, it is desired to improve the efficiency of the video coding scheme.
- a method for decoding a video from a video bitstream representing the video includes accessing a binary string from the video bitstream, the binary string representing a slice of a frame of the video; determining an initial context value of an entropy coding model for the slice to be one of a first context value stored for a first CTU in a previous slice of the slice, a second context value stored for a second CTU in the previous slice, and a default initial context value independent of the previous slice; decoding the slice by decoding at least a portion of the binary string according to the entropy coding model with the initial context value; reconstructing the frame of the video based, at least in part, upon the decoded slice; and causing the reconstructed frame to be displayed along with other frames of the video.
- a non-transitory computer-readable medium has program code that is stored thereon, and the program code is executable by one or more processing devices for performing operations.
- the operations include accessing a binary string from a video bitstream representing a video, the binary string representing a slice of a frame of the video; determining an initial context value of an entropy coding model for the slice to be one of a first context value stored for a first CTU in a previous slice of the slice, a second context value stored for a second CTU in the previous slice, and a default initial context value independent of the previous slice; decoding the slice by decoding at least a portion of the binary string according to the entropy coding model with the initial context value; reconstructing the frame of the video based, at least in part, upon the decoded slice; and causing the reconstructed frame to be displayed along with other frames of the video.
- In yet another example, a system includes a processing device and a non-transitory computer-readable medium communicatively coupled to the processing device.
- the processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations.
- the operations include accessing a binary string from a video bitstream representing a video, the binary string representing a slice of a frame of the video; determining an initial context value of an entropy coding model for the slice to be one of a first context value stored for a first CTU in a previous slice of the slice, a second context value stored for a second CTU in the previous slice, and a default initial context value independent of the previous slice; decoding the slice by decoding at least a portion of the binary string according to the entropy coding model with the initial context value; reconstructing the frame of the video based, at least in part, upon the decoded slice; and causing the reconstructed frame to be displayed along with other frames of the video.
- a method for decoding a video from a video bitstream representing the video includes accessing a binary string from the video bitstream, the binary string representing a partition of the video; determining an initial context value of an entropy coding model for the partition by converting a context value stored for a CTU in a previous partition of the partition based on an initial context value associated with the previous partition, a slice quantization parameter of the previous partition, and a slice quantization parameter of the partition; decoding the partition by decoding at least a portion of the binary string according to the entropy coding model with the initial context value; reconstructing frames of the video based, at least in part, upon the decoded partition; and causing the reconstructed frames to be displayed.
- a non-transitory computer-readable medium has program code that is stored thereon, and the program code is executable by one or more processing devices for performing operations.
- the operations include accessing a binary string from a video bitstream of a video, the binary string representing a partition of the video; determining an initial context value of an entropy coding model for the partition by converting a context value stored for a CTU in a previous partition of the partition based on an initial context value associated with the previous partition, a slice quantization parameter of the previous partition, and a slice quantization parameter of the partition; decoding the partition by decoding at least a portion of the binary string according to the entropy coding model with the initial context value; reconstructing frames of the video based, at least in part, upon the decoded partition; and causing the reconstructed frames to be displayed.
- In yet another example, a system includes a processing device and a non-transitory computer-readable medium communicatively coupled to the processing device.
- the processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations.
- the operations include accessing a binary string from a video bitstream of a video, the binary string representing a partition of the video; determining an initial context value of an entropy coding model for the partition by converting a context value stored for a CTU in a previous partition of the partition based on an initial context value associated with the previous partition, a slice quantization parameter of the previous partition, and a slice quantization parameter of the partition; decoding the partition by decoding at least a portion of the binary string according to the entropy coding model with the initial context value; reconstructing frames of the video based, at least in part, upon the decoded partition; and causing the reconstructed frames to be displayed.
- a method for decoding a video from a video bitstream representing the video includes accessing a binary string from the video bitstream, the binary string representing a partition of a frame of the video; determining an initial context value for an entropy coding model for the partition by converting a context value stored in a buffer for a CTU in a previous frame of the frame based on an initial context value associated with the previous frame, a quantization parameter of the previous frame, and a slice quantization parameter of the frame; decoding the partition by decoding at least a portion of the binary string according to the entropy coding model with the initial context value; replacing the context value stored in the buffer with a context value for a CTU in the frame determined in decoding the partition; reconstructing the frame of the video based, at least in part, upon the decoded partition; and causing the reconstructed frame to be displayed.
- a non-transitory computer-readable medium has program code that is stored thereon, and the program code is executable by one or more processing devices for performing operations.
- the operations include accessing a binary string from a video bitstream of a video, the binary string representing a partition of a frame of the video; determining an initial context value for an entropy coding model for the partition by converting a context value stored in a buffer for a CTU in a previous frame of the frame based on an initial context value associated with the previous frame, a slice quantization parameter of the previous frame, and a slice quantization parameter of the frame; decoding the partition by decoding at least a portion of the binary string according to the entropy coding model with the initial context value; replacing the context value stored in the buffer with a context value for a CTU in the frame determined in decoding the partition; reconstructing the frame of the video based, at least in part, upon the decoded partition; and causing the reconstructed frame to be displayed.
- In yet another example, a system includes a processing device and a non-transitory computer-readable medium communicatively coupled to the processing device.
- the processing device is configured to execute program code stored in the non-transitory computer-readable medium and thereby perform operations.
- the operations include accessing a binary string from a video bitstream of a video, the binary string representing a partition of a frame of the video; determining an initial context value for an entropy coding model for the partition by converting a context value stored in a buffer for a CTU in a previous frame of the frame based on an initial context value associated with the previous frame, a slice quantization parameter of the previous frame, and a quantization parameter of the frame; decoding the partition by decoding at least a portion of the binary string according to the entropy coding model with the initial context value; replacing the context value stored in the buffer with a context value for a CTU in the frame determined in decoding the partition; reconstructing the frame of the video based, at least in part, upon the decoded partition; and causing the reconstructed frame to be displayed.
- FIG. 1 is a block diagram showing an example of a video encoder configured to implement embodiments presented herein.
- FIG. 2 is a block diagram showing an example of a video decoder configured to implement embodiments presented herein.
- FIG. 3 depicts an example of a coding tree unit division of a picture in a video, according to some embodiments of the present disclosure.
- FIG. 4 depicts an example of context initialization from the previous frame (CIPF), according to some embodiments of the present disclosure.
- FIG. 5 depicts another example of context initialization from a previous frame (CIPF), according to some embodiments of the present disclosure.
- FIG. 6 depicts an example of a group of pictures structure for random access with the associated temporal layer indices, according to some embodiments of the present disclosure.
- FIG. 7 depicts an example of a process for decoding a video encoded via entropy coding with adaptive context initialization, according to some embodiments of the present disclosure.
- FIG. 8 shows an example of the motion compensation and entropy coding context initialization dependencies of a picture coding structure for random access common test condition applied with the context initialization using the previous frame.
- FIG. 9 shows an example of the context initialization inheritance from the previous frame in the coding order regardless of temporal layer and quantization parameter for the example picture coding structure shown in FIG. 8, according to some embodiments of the present disclosure.
- FIG. 10 shows an example of the context initialization inheritance from the previous frame in a lower temporal layer for the example picture coding structure shown in FIG. 8, according to some embodiments of the present disclosure.
- FIG. 11 depicts an example of values involved in the context initialization table conversion, according to some embodiments of the present disclosure.
- FIG. 12 depicts an example of a process for decoding a video encoded with the picture coding structure of random access via entropy coding with adaptive context initialization, according to some embodiments of the present disclosure.
- FIG. 13 depicts an example of applying the context initialization using the previous frame (CIPF) to the low delay common test condition, according to some embodiments of the present disclosure.
- FIG. 14 shows the behaviour of the CIPF buffers for the example shown in FIG. 8.
- FIG. 15 shows another example of the random access (RA) test condition.
- FIG. 16 shows the behaviour of the CIPF buffers for the example shown in FIG. 15.
- FIG. 17 shows an example of the behaviour of the proposed CIPF buffer configuration for the RA test condition shown in FIG. 15, according to some embodiments of the present disclosure.
- FIG. 18 depicts an example of a process for decoding a video encoded with the CIPF with adaptive context initialization and presented buffer management, according to some embodiments of the present disclosure.
- FIG. 19 depicts an example of a computing system that can be used to implement some embodiments of the present disclosure.
- Various embodiments provide context initialization for entropy coding in video coding. As discussed above, more and more video data are being generated, stored, and transmitted. It is beneficial to increase the efficiency of the video coding technology. One way to do so is through entropy coding, where an entropy encoding algorithm is applied to quantized samples of the video to reduce the size of data representing the video samples. In context-based binary arithmetic entropy coding, the coding engine estimates a context probability indicating the likelihood of the next binary symbol having the value one. Such estimation requires an initial context probability estimate. One way to determine the initial context probability estimate is to use the context value for a CTU located in the center of the previous slice. However, such an initialization may not be accurate because it is likely that the previous slice does not have enough bits encoded in the context-based coding mode, and the context value of the CTU located in the center of the previous slice does not accurately reflect the context of the slice.
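To illustrate why the initial estimate matters, the following is a minimal sketch, not the arithmetic coding engine of any standard. It assumes a simple exponential update of the probability that the next bin is one (the `rate` parameter and the function name are hypothetical) and measures the ideal code length of a short run of bins for two different starting estimates.

```python
import math


def coding_cost_bits(bins, p_init, rate=1 / 16):
    """Ideal code length, in bits, of `bins` under an adaptive estimate.

    p_init is the initial estimate of P(bin == 1). The update rule is an
    assumed exponential moving average, used here only for illustration.
    """
    p_one = p_init
    bits = 0.0
    for b in bins:
        # Cost of coding this bin with the current estimate.
        bits += -math.log2(p_one if b == 1 else 1.0 - p_one)
        # Move the estimate a small step toward the observed bin.
        p_one += rate * (b - p_one)
    return bits


# A slice whose bins are mostly one: a well-matched initial estimate
# costs fewer bits than a neutral one while the model is still adapting.
bins = [1] * 20
cost_neutral = coding_cost_bits(bins, p_init=0.5)
cost_matched = coding_cost_bits(bins, p_init=0.9)
```

When a slice contains few context-coded bins, the model has little time to adapt, so the gap between the two costs is a larger share of the total.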
- the adaptive context initialization allows the initial context value of an entropy coding model for a current slice to be chosen from multiple options based on the setting or configuration of the frame or the slice.
- the initial context value can be set to the context value of a last CTU in the previous slice or frame, the context value of a CTU located in the center of the previous slice or frame, or a default initial context value independent of the previous slice or frame.
- a syntax element can be used to indicate the CTU location for obtaining the initial context value from the previous slice or frame. If the syntax element has a first value (e.g., 1), the initial context value can be set to the context value stored for the center CTU of the previous slice or frame; if the syntax element has a second value (e.g., 0), the initial context value can be set to the context value stored for the last CTU of the previous slice or frame.
- Another syntax element can be used to indicate whether to use the context value from the previous slice or frame for initialization or to use the default initial context value. In some examples, both syntax elements can be transmitted in the picture header of the frame containing the slice or the slice header of the slice.
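The selection driven by the two syntax elements can be sketched as follows. This is an illustrative sketch only: the names `use_previous_flag`, `center_ctu_flag`, and `StoredContexts` are hypothetical and do not come from the disclosure, and falling back to the default when no previous context exists is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class StoredContexts:
    """Context values saved while coding the previous slice or frame."""
    center_ctu: Sequence[int]  # saved after the center CTU
    last_ctu: Sequence[int]    # saved after the last CTU


def select_initial_context(
    use_previous_flag: int,
    center_ctu_flag: int,
    previous: Optional[StoredContexts],
    default_context: Sequence[int],
) -> List[int]:
    """Pick the initial context values for the current slice."""
    # Assumption: with no stored context, fall back to the default values.
    if use_previous_flag == 0 or previous is None:
        return list(default_context)
    # First value (1) selects the center CTU, second value (0) the last CTU.
    if center_ctu_flag == 1:
        return list(previous.center_ctu)
    return list(previous.last_ctu)


prev = StoredContexts(center_ctu=[10, 20], last_ctu=[30, 40])
default = [64, 64]
from_center = select_initial_context(1, 1, prev, default)
from_last = select_initial_context(1, 0, prev, default)
from_default = select_initial_context(0, 1, prev, default)
```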
- a syntax element indicating the threshold value for determining a CTU location for obtaining the initial context value from the previous slice or frame can be used.
- the quantization parameter (QP) value of the previous slice or frame can be compared with the threshold value. If the QP value is no higher than the threshold value, the initial context value can be set to be the context value of the center CTU of the previous slice or frame; and otherwise, the initial context value can be set to be the context value of the last CTU of the previous slice or frame.
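A minimal sketch of the QP comparison described above, with a hypothetical function name and string labels chosen for illustration:

```python
def ctu_location_from_qp(previous_qp: int, qp_threshold: int) -> str:
    """Choose which stored CTU context to inherit, based on the previous QP.

    A QP no higher than the threshold selects the center CTU; a higher QP
    selects the last CTU.
    """
    return "center" if previous_qp <= qp_threshold else "last"


below = ctu_location_from_qp(previous_qp=30, qp_threshold=32)
equal = ctu_location_from_qp(previous_qp=32, qp_threshold=32)
above = ctu_location_from_qp(previous_qp=37, qp_threshold=32)
```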
- the initialization can be made based on the temporal layer indices associated with the frames in a group of pictures (GOP) structure for random access (RA).
- two syntax elements can be used: a first syntax element indicating a first threshold value for determining whether to use the initial context value from the previous slice or frame and a second syntax element indicating a second threshold value for determining a CTU location for obtaining the initial context value from the previous slice or frame.
- the second threshold value is set to be no higher than the first threshold value. If the temporal layer index of the current slice is higher than the first threshold value, the initial context value for the slice is set to be the default initial context value.
- the temporal layer index of the slice is compared with the second threshold value. If the temporal layer index is no higher than the second threshold value, the initial context value is determined to be the context value of the center CTU of the previous slice or frame; otherwise, the initial context value is set to be the context value of the last CTU of the previous slice or frame.
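The two-threshold decision on the temporal layer index can be sketched as below. The function and parameter names are hypothetical; rejecting a second threshold that exceeds the first is an assumption about how the stated constraint might be enforced.

```python
def context_source_from_temporal_layer(
    temporal_layer: int, first_threshold: int, second_threshold: int
) -> str:
    """Return 'default', 'center', or 'last' for the current slice."""
    # The second threshold is set to be no higher than the first.
    if second_threshold > first_threshold:
        raise ValueError("second threshold must not exceed first threshold")
    # Above the first threshold: do not inherit, use the default values.
    if temporal_layer > first_threshold:
        return "default"
    # At or below the second threshold: inherit from the center CTU.
    if temporal_layer <= second_threshold:
        return "center"
    # Between the two thresholds: inherit from the last CTU.
    return "last"


sources = [context_source_from_temporal_layer(t, 3, 1) for t in range(6)]
```

With a first threshold of 3 and a second threshold of 1, layers 0 and 1 inherit from the center CTU, layers 2 and 3 from the last CTU, and layers 4 and 5 use the default initialization.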
- the initialization inheritance from a previous slice or frame having the same slice quantization parameter may introduce additional dependencies between frames, which would limit parallel processing capability for both encoding and decoding. To solve this problem, the context initialization inheritance can be modified to eliminate these additional dependencies.
- the context initialization for a current frame can be determined to be the context value of the previous frame in the coding order regardless of the temporal layer and the slice quantization parameter.
- the initial context value can be determined to be the context value of the previous frame in a lower temporal layer.
- the initial context value can be determined to be the context value of the reference frame(s) of the current frame according to the motion compensation and prediction structure.
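Two of the inheritance choices above can be sketched as a lookup over the frames already coded. The `CodedFrame` type and both function names are hypothetical, and reading "a lower temporal layer" as a strictly smaller temporal layer index is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CodedFrame:
    poc: int             # picture order count, used only as a label here
    temporal_layer: int


def previous_in_coding_order(coded: List[CodedFrame]) -> Optional[CodedFrame]:
    """Most recently coded frame, regardless of temporal layer or QP."""
    return coded[-1] if coded else None


def previous_in_lower_layer(
    coded: List[CodedFrame], current_layer: int
) -> Optional[CodedFrame]:
    """Most recently coded frame with a smaller temporal layer index."""
    for frame in reversed(coded):
        if frame.temporal_layer < current_layer:
            return frame
    return None


# Frames listed in coding order for a small hierarchical structure.
coded = [CodedFrame(0, 0), CodedFrame(32, 0), CodedFrame(16, 1), CodedFrame(8, 2)]
any_layer = previous_in_coding_order(coded)
lower_for_layer_2 = previous_in_lower_layer(coded, current_layer=2)
lower_for_layer_0 = previous_in_lower_layer(coded, current_layer=0)
```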
- the inherited initial context value can be converted based on the previous slice quantization parameter and the current slice quantization parameter.
- the conversion is performed based on the default initial context value determined using the quantization parameter of the previous slice or frame and the default initial context value determined using the quantization parameter of the current slice or frame.
- the conversion is performed based on the initial context value of the previous slice or frame which may be determined using the same method described herein based on its previous slice or frame.
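The text above does not give a conversion formula, so the following is only one possible sketch under stated assumptions. It assumes that a default initial context value is a clipped linear function of the slice QP (the `slope`, `offset`, shift of 4, and range of 1 to 127 are all hypothetical), and that the conversion shifts the inherited value by the difference between the defaults at the two QPs.

```python
def clip(value: int, low: int, high: int) -> int:
    return max(low, min(high, value))


def default_initial_context(slope: int, offset: int, qp: int) -> int:
    """Assumed QP-dependent default: a clipped linear model, for illustration."""
    return clip(((slope * clip(qp, 0, 51)) >> 4) + offset, 1, 127)


def convert_context(
    stored: int, slope: int, offset: int, previous_qp: int, current_qp: int
) -> int:
    """Shift an inherited context value by the change in the default value.

    Assumption: the difference between the defaults at the two QPs is a
    reasonable proxy for how the context should move between the two QPs.
    """
    shift = default_initial_context(slope, offset, current_qp) - (
        default_initial_context(slope, offset, previous_qp)
    )
    return clip(stored + shift, 1, 127)


same_qp = convert_context(70, slope=8, offset=40, previous_qp=32, current_qp=32)
higher_qp = convert_context(70, slope=8, offset=40, previous_qp=27, current_qp=37)
saturated = convert_context(126, slope=8, offset=40, previous_qp=27, current_qp=37)
```

With equal QPs the inherited value passes through unchanged, which matches the intent that conversion only compensates for a QP difference.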
- a buffer is used to store the context value.
- the current CIPF uses 5 buffers to store context value and each buffer is used to store the context data for frames with a corresponding temporal layer index.
- frames with the same temporal layer index may have different slice quantization parameters. Thus, each time a new combination of temporal layer index and slice quantization parameter is observed, the context value for the frame with the new combination is pushed into the buffer and old data in the buffer is discarded.
- the context value for previous frames, especially frames with low temporal layer indices, may be discarded, preventing the CIPF from being applied to the frames for which applying the CIPF yields the most coding gain. This leads to a reduction in the coding efficiency.
- the CIPF buffers can be managed to keep a context value from each temporal layer in the buffers.
- the CIPF process can be applied to each eligible frame by using the context value stored in the buffer that has the same temporal layer index.
- the new context value will replace the existing context value in the buffer that has been used as the initial context value and has the same temporal layer index as the current frame.
- the stored context value can be converted based on the two slice quantization parameters before being used for the entropy coding model.
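The buffer management described above can be sketched as one slot per temporal layer, where a new context value only ever replaces the entry of its own layer. The class and field names are hypothetical, and the QP-based conversion step is deliberately left to the caller.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass
class BufferEntry:
    context: List[int]
    slice_qp: int  # kept so the caller can convert between QPs before use


class CipfBuffers:
    """One context slot per temporal layer (a sketch, not the reference design)."""

    def __init__(self, num_layers: int = 5) -> None:
        self.num_layers = num_layers
        self._entries: Dict[int, BufferEntry] = {}

    def fetch(self, temporal_layer: int) -> Optional[BufferEntry]:
        """Entry stored for this temporal layer, or None if nothing is stored."""
        return self._entries.get(temporal_layer)

    def store(self, temporal_layer: int, context: Sequence[int], slice_qp: int) -> None:
        """Replace only the entry of the same temporal layer."""
        if not 0 <= temporal_layer < self.num_layers:
            raise ValueError("temporal layer index out of range")
        self._entries[temporal_layer] = BufferEntry(list(context), slice_qp)


buffers = CipfBuffers()
buffers.store(0, [50, 60], slice_qp=27)
buffers.store(3, [70, 80], slice_qp=35)
buffers.store(3, [71, 81], slice_qp=37)  # new QP at layer 3 replaces layer 3 only
layer0 = buffers.fetch(0)
layer3 = buffers.fetch(3)
```

Because a new slice QP at layer 3 overwrites only the layer 3 slot, the entry for layer 0 survives and remains available to later frames in that layer.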
- the number of buffers can be increased to allow the context values for different slice quantization parameters at different temporal layers to be stored and used for frames with the corresponding temporal layer index and slice quantization parameter.
- some embodiments provide improvements in video coding efficiency by allowing adaptively selecting the context value initialization for the entropy coding model. Because the initial context value can be selected from the center CTU or the last CTU of the previous slice based on the configuration of the current slice or frame and/or the previous slice or frame, such as the slice QP and the temporal layer index, the initial context value can be selected more accurately. As a result, the entropy coding model is more accurate, leading to a higher coding efficiency.
- Because the initial context can be inherited from a slice or a frame having a different slice quantization parameter than the current slice quantization parameter, the additional dependencies among pictures introduced by the context initialization inheritance in the picture coding structure of random access can be eliminated, thereby improving the parallel processing capability of the encoder and decoder.
- the inherited initial context value can be converted based on the quantization parameter of the previous slice or frame and the quantization parameter of the current slice. The conversion reduces or eliminates the inaccuracy in the initial context value estimation that is introduced by the difference between the slice quantization parameters of the current slice or frame and the previous slice or frame. As a result, the overall coding efficiency is improved.
- the coding efficiency of the video is further improved by improving the buffer management to keep a context value for each temporal layer in the buffer. Further, by converting the context value based on the slice quantization parameters of the previous frame and the current frame, the same buffer can be used to store the context value for frames in a temporal layer with different slice quantization parameters. As a result, the total number of buffers remains unchanged and the CIPF can be performed for each qualifying frame. Compared with the existing buffer management, where the data in the buffer may be lost, rendering the CIPF unavailable for some frames, the proposed buffer management allows the CIPF to be applied to more frames to achieve a higher coding efficiency.
- Referring now to the drawings, FIG. 1 is a block diagram showing an example of a video encoder 100 configured to implement embodiments presented herein.
- the video encoder 100 includes a partition module 112, a transform module 114, a quantization module 115, an inverse quantization module 118, an inverse transform module 119, an in-loop filter module 120, an intra prediction module 126, an inter prediction module 124, a motion estimation module 122, a decoded picture buffer 130, and an entropy coding module 116.
- the input to the video encoder 100 is an input video 102 containing a sequence of pictures (also referred to as frames or images).
- the video encoder 100 employs a partition module 112 to partition the picture into blocks 104, and each block contains multiple pixels.
- the blocks may be macroblocks, coding tree units, coding units, prediction units, and/or prediction blocks.
- One picture may include blocks of different sizes and the block partitions of different pictures of the video may also differ.
- Each block may be encoded using different predictions, such as intra prediction or inter prediction or intra and inter hybrid prediction.
- the first picture of a video signal is an intra-coded picture, which is encoded using only intra prediction.
- in the intra prediction mode, a block of a picture is predicted using only data that has been encoded from the same picture.
- a picture that is intra-coded can be decoded without information from other pictures.
- the video encoder 100 shown in FIG. 1 can employ the intra prediction module 126.
- the intra prediction module 126 is configured to use reconstructed samples in reconstructed blocks 136 of neighboring blocks of the same picture to generate an intra-prediction block (the prediction block 134).
- the intra prediction is performed according to an intra-prediction mode selected for the block.
- the video encoder 100 then calculates the difference between block 104 and the intra-prediction block 134. This difference is referred to as residual block 106.
- the residual block 106 is transformed by the transform module 114 into a transform domain by applying a transform on the samples in the block.
- the transform may include, but is not limited to, a discrete cosine transform (DCT) or discrete sine transform (DST).
- the transformed values may be referred to as transform coefficients representing the residual block in the transform domain.
- the residual block may be quantized directly without being transformed by the transform module 114. This is referred to as a transform skip mode.
- the video encoder 100 can further use the quantization module 115 to quantize the transform coefficients to obtain quantized coefficients.
- Quantization includes dividing a sample by a quantization step size followed by subsequent rounding, whereas inverse quantization involves multiplying the quantized value by the quantization step size. Such a quantization process is referred to as scalar quantization. Quantization is used to reduce the dynamic range of video samples (transformed or non-transformed) so that fewer bits are used to represent the video samples.
- the quantization of coefficients/samples within a block can be done independently and this kind of quantization method is used in some existing video compression standards, such as H.264, HEVC, and VVC.
- some scan order may be used to convert the 2D coefficients of a block into a 1-D array for coefficient quantization and coding.
- Quantization of a coefficient within a block may make use of the scan order information.
- the quantization of a given coefficient in the block may depend on the status of the previous quantized value along the scan order.
- more than one quantizer may be used. Which quantizer is used for quantizing a current coefficient depends on the information preceding the current coefficient in the encoding/decoding scan order. Such a quantization approach is referred to as dependent quantization.
- the degree of quantization may be adjusted using the quantization step sizes. For instance, for scalar quantization, different quantization step sizes may be applied to achieve finer or coarser quantization. Smaller quantization step sizes correspond to finer quantization, whereas larger quantization step sizes correspond to coarser quantization.
- the quantization step size can be indicated by a quantization parameter (QP). Quantization parameters are provided in an encoded bitstream of the video such that the video decoder can access and apply the quantization parameters for decoding.
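The scalar quantization described above can be sketched as follows. This is a minimal illustration rather than the encoder's actual implementation. The function names `quantize`, `dequantize`, and `qstep_from_qp` are hypothetical, and the QP-to-step-size mapping is an assumption based on the commonly cited approximation that the step size doubles every 6 QP units.

```python
import math


def qstep_from_qp(qp: int) -> float:
    """Assumed mapping: the step size doubles every 6 QP units."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(sample: float, step: float) -> int:
    """Divide by the step size, then round half away from zero."""
    sign = 1 if sample >= 0 else -1
    return sign * int(math.floor(abs(sample) / step + 0.5))


def dequantize(level: int, step: float) -> float:
    """Inverse quantization: multiply the level by the step size."""
    return level * step


if __name__ == "__main__":
    for step in (2.0, 8.0):
        level = quantize(37.0, step)
        # A larger step gives a smaller level and a coarser reconstruction.
        print(step, level, dequantize(level, step))
```

A larger step size yields smaller levels (fewer bits) at the cost of a larger reconstruction error, which is the trade-off controlled by the QP.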
- the quantized samples are then coded by the entropy coding module 116 to further reduce the size of the video signal.
- the entropy encoding module 116 is configured to apply an entropy encoding algorithm to the quantized samples.
- the quantized samples are binarized into binary bins and coding algorithms further compress the binary bins into bits. Examples of the binarization methods include, but are not limited to, a combined truncated Rice (TR) and limited k-th order Exp-Golomb (EGk) binarization, and k-th order Exp-Golomb binarization.
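As an illustration of the binarization step, the following sketch produces the bin string of a k-th order Exp-Golomb (EGk) code. It is a hedged example, not the codec's implementation: the function name `exp_golomb_k` is hypothetical, and the prefix convention (a run of ones terminated by a zero) is an assumption modeled on the form commonly used in CABAC-based codecs.

```python
def exp_golomb_k(value: int, k: int) -> list:
    """Return the EGk bin string of a non-negative integer as a list of 0/1."""
    if value < 0 or k < 0:
        raise ValueError("value and k must be non-negative")
    bins = []
    # Prefix: one '1' per escaped interval; each interval doubles in size.
    while value >= (1 << k):
        bins.append(1)
        value -= 1 << k
        k += 1
    bins.append(0)  # prefix terminator
    # Suffix: the remaining value written with k bits, most significant first.
    while k > 0:
        k -= 1
        bins.append((value >> k) & 1)
    return bins


if __name__ == "__main__":
    for v in range(4):
        print(v, exp_golomb_k(v, 0), exp_golomb_k(v, 1))
```

Small values receive short bin strings, so the binarization suits residual levels whose distribution is concentrated near zero.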
- Examples of the entropy encoding algorithm include, but are not limited to, a variable length coding (VLC) scheme, a context adaptive VLC (CAVLC) scheme, an arithmetic coding scheme, a binarization, a context adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or other entropy encoding techniques.
- the entropy-coded data is added to the bitstream of the output encoded video 132.
- reconstructed blocks 136 from neighboring blocks are used in the intra-prediction of blocks of a picture.
- Generating the reconstructed block 136 of a block involves calculating the reconstructed residuals of this block.
- the reconstructed residual can be determined by applying inverse quantization and inverse transform to the quantized residual of the block.
- the inverse quantization module 118 is configured to apply the inverse quantization to the quantized samples to obtain de-quantized coefficients.
- the inverse quantization module 118 applies the inverse of the quantization scheme applied by the quantization module 115 by using the same quantization step size as the quantization module 115.
- the inverse transform module 119 is configured to apply the inverse transform of the transform applied by the transform module 114 to the de-quantized samples, such as inverse DCT or inverse DST.
- the output of the inverse transform module 119 is the reconstructed residuals for the block in the pixel domain.
- the reconstructed residuals can be added to the prediction block 134 of the block to obtain a reconstructed block 136 in the pixel domain.
- for blocks coded in the transform skip mode, the inverse transform module 119 is not applied.
- the de-quantized samples are the reconstructed residuals for the blocks.
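The encoder-side reconstruction path described above can be sketched for the transform skip case, where the de-quantized samples are used directly as the reconstructed residuals. This is a simplified illustration under stated assumptions: blocks are flat lists of integer samples, the step size is an integer, and the function name `encode_and_reconstruct` is hypothetical.

```python
import math


def encode_and_reconstruct(block, prediction, step):
    """Transform-skip sketch: residual -> quantize -> de-quantize -> add prediction."""
    # Residual: difference between the original block and its prediction.
    residual = [b - p for b, p in zip(block, prediction)]
    # Quantization: divide by the step size and round to the nearest level.
    levels = [int(math.floor(r / step + 0.5)) for r in residual]
    # Inverse quantization: multiply the levels by the same step size.
    recon_residual = [lv * step for lv in levels]
    # Reconstruction: add the reconstructed residual back to the prediction.
    reconstructed = [p + r for p, r in zip(prediction, recon_residual)]
    return levels, reconstructed


if __name__ == "__main__":
    levels, recon = encode_and_reconstruct([100, 104, 99, 97], [100] * 4, 4)
    print(levels, recon)
```

Because the decoder sees only the levels and the prediction, the encoder reconstructs from the same quantized data so that both sides use identical reference samples.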
- Blocks in subsequent pictures following the first intra-predicted picture can be coded using either inter prediction or intra prediction.
- in inter prediction, the prediction of a block in a picture is from one or more previously encoded video pictures.
- the video encoder 100 uses an inter prediction module 124.
- the inter prediction module 124 is configured to perform motion compensation for a block based on the motion estimation provided by the motion estimation module 122.
- the motion estimation module 122 compares a current block 104 of the current picture with decoded reference pictures 108 for motion estimation.
- the decoded reference pictures 108 are stored in a decoded picture buffer 130.
- the motion estimation module 122 selects a reference block from the decoded reference pictures 108 that best matches the current block.
- the motion estimation module 122 further identifies an offset between the position (e.g., x, y coordinates) of the reference block and the position of the current block. This offset is referred to as the motion vector (MV) and is provided to the inter prediction module 124 along with the selected reference block.
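The block-matching step can be illustrated with an exhaustive search that minimizes the sum of absolute differences (SAD). This is only a sketch under assumptions: the text does not specify the matching criterion or the search strategy, pictures are modeled as 2-D lists of luma samples, and the name `full_search` is hypothetical.

```python
def full_search(cur, ref, bx, by, size, search_range):
    """Return ((dx, dy), cost) of the best-matching reference block by SAD."""
    height, width = len(ref), len(ref[0])
    best_cost, best_mv = None, (0, 0)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = bx + dx, by + dy
            # Skip candidates that fall outside the reference picture.
            if x < 0 or y < 0 or x + size > width or y + size > height:
                continue
            cost = sum(
                abs(cur[by + j][bx + i] - ref[y + j][x + i])
                for j in range(size)
                for i in range(size)
            )
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv, best_cost


if __name__ == "__main__":
    ref = [[(3 * x * x + 5 * y * y + x * y) % 251 for x in range(8)] for y in range(8)]
    cur = [[0] * 8 for _ in range(8)]
    for j in range(2):
        for i in range(2):
            cur[2 + j][2 + i] = ref[3 + j][4 + i]  # block displaced by (2, 1)
    print(full_search(cur, ref, 2, 2, 2, 2))
```

The returned offset plays the role of the motion vector: it is the displacement between the position of the current block and the position of the selected reference block.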
- multiple reference blocks are identified for the current block in multiple decoded reference pictures 108.
- the inter prediction module 124 uses the motion vector(s) along with other inter-prediction parameters to perform motion compensation to generate a prediction of the current block, i.e., the inter prediction block 134. For example, based on the motion vector(s), the inter prediction module 124 can locate the prediction block(s) pointed to by the motion vector(s) in the corresponding reference picture(s). If there is more than one prediction block, these prediction blocks are combined with some weights to generate a prediction block 134 for the current block.
- the video encoder 100 can subtract the inter-prediction block 134 from block 104 to generate the residual block 106.
- the residual block 106 can be transformed, quantized, and entropy coded in the same way as the residuals of an intra-predicted block discussed above.
- the reconstructed block 136 of an inter-predicted block can be obtained through inverse quantizing, inverse transforming the residual, and subsequently combining with the corresponding prediction block 134.
- the reconstructed block 136 is processed by an in-loop filter module 120.
- the in-loop filter module 120 is configured to smooth out pixel transitions thereby improving the video quality.
- the in-loop filter module 120 may be configured to implement one or more in-loop filters, such as a deblocking filter, a sample-adaptive offset (SAO) filter, an adaptive loop filter (ALF), etc.
- FIG. 2 depicts an example of a video decoder 200 configured to implement the embodiments presented herein.
- the video decoder 200 processes an encoded video 202 in a bitstream and generates decoded pictures 208.
- the video decoder 200 includes an entropy decoding module 216, an inverse quantization module 218, an inverse transform module 219, an in-loop filter module 220, an intra prediction module 226, an inter prediction module 224, and a decoded picture buffer 230.
- the entropy decoding module 216 is configured to perform entropy decoding of the encoded video 202.
- the entropy decoding module 216 decodes the quantized coefficients, coding parameters including intra prediction parameters and inter prediction parameters, and other information.
- the entropy decoding module 216 decodes the bitstream of the encoded video 202 to binary representations and then converts the binary representations to quantization levels of the coefficients.
- the entropy-decoded coefficient levels are then inverse quantized by the inverse quantization module 218 and subsequently inverse transformed by the inverse transform module 219 to the pixel domain.
- the inverse quantization module 218 and the inverse transform module 219 function similarly to the inverse quantization module 118 and the inverse transform module 119, respectively, as described above with respect to FIG. 1.
- The inverse-transformed residual block can be added to the corresponding prediction block 234 to generate a reconstructed block 236.
- for blocks coded in the transform skip mode, the inverse transform module 219 is not applied.
- the de-quantized samples generated by the inverse quantization module 218 are used to generate the reconstructed block 236.
- the prediction block 234 of a particular block is generated based on the prediction mode of the block. If the coding parameters of the block indicate that the block is intra predicted, the reconstructed block 236 of a reference block in the same picture can be fed into the intra prediction module 226 to generate the prediction block 234 for the block. If the coding parameters of the block indicate that the block is inter-predicted, the prediction block 234 is generated by the inter prediction module 224.
- the intra prediction module 226 and the inter prediction module 224 function similarly to the intra prediction module 126 and the inter prediction module 124 of FIG. 1, respectively.
- the inter prediction involves one or more reference pictures.
- the video decoder 200 generates the decoded pictures 208 for the reference pictures by applying the in-loop filter module 220 to the reconstructed blocks of the reference pictures.
- the decoded pictures 208 are stored in the decoded picture buffer 230 for use by the inter prediction module 224 and also for output.
- FIG. 3 depicts an example of a coding tree unit division of a picture in a video, according to some embodiments of the present disclosure.
- the picture is divided into blocks, such as the CTUs (Coding Tree Units) 302 in VVC, as shown in FIG. 3.
- the CTUs 302 can be blocks of 128x128 pixels.
- the CTUs are processed according to an order, such as the order shown in FIG. 3.
- CABAC context-based binary arithmetic coding
- the coding engine consists of two elements: probability estimation and codeword mapping.
- probability estimation is to determine the likelihood of the next binary symbol having the value 1.
- This estimation is based on the history of symbol values coded using the same context and typically uses an exponential decay window. Given a sequence of binary symbols x(t), with t ∈ {1, ..., N}, the estimated probability p(t + 1) of x(t + 1) being equal to 1 is given by
  $$p(t+1) = p(1) + \sum_{k=1}^{t} \alpha\,(1-\alpha)^{t-k}\,\bigl(x(k) - p(1)\bigr) \tag{1}$$
  where p(1) is an initial probability estimate and α is a base determining the rate of adaptation. Alternatively, this can be expressed in a recursive manner as
  $$p(t+1) = p(t) + \alpha\,\bigl(x(t) - p(t)\bigr) \tag{2}$$
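The equivalence between the exponential-decay window and its recursive form can be checked numerically. The sketch below assumes the standard forms p(t+1) = p(1) + Σ_{k=1..t} α(1−α)^(t−k)(x(k) − p(1)) and p(t+1) = p(t) + α(x(t) − p(t)); the function names are hypothetical and floating-point arithmetic is used for clarity, whereas a real CABAC engine works with integer probability states.

```python
def estimate_recursive(symbols, p1, alpha):
    """Update the estimate one symbol at a time: p <- p + alpha * (x - p)."""
    p = p1
    for x in symbols:
        p = p + alpha * (x - p)
    return p


def estimate_closed_form(symbols, p1, alpha):
    """Evaluate the exponential-decay window directly over the whole history."""
    t = len(symbols)
    return p1 + sum(
        alpha * (1 - alpha) ** (t - k) * (x - p1)
        for k, x in enumerate(symbols, start=1)
    )


if __name__ == "__main__":
    history = [1, 0, 1, 1, 0, 1, 1, 1]
    print(estimate_recursive(history, 0.5, 1 / 16))
    print(estimate_closed_form(history, 0.5, 1 / 16))
```

The recursive form needs only the previous estimate, which is why a single stored probability state per context is enough to carry the adaptation history forward, including across slices or frames.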
- the initial estimate p(l) is derived for each context using a linear function of the quantization parameter (QP).
- some blocks in a slice may be coded in a skip-mode without using CABAC, for example, to reduce the number of bits used for the slice.
- the blocks coded using the skip-mode do not contribute to the building of the context.
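- Variables m and n, used in the initialization of context variables, are derived from slopeIdx and offsetIdx as:
  $$\text{slopeIdx} = \text{initValue} \gg 3, \qquad \text{offsetIdx} = \text{initValue} \mathbin{\&} 7$$
  $$m = \text{slopeIdx} - 4, \qquad n = (\text{offsetIdx} \cdot 18) + 1$$
- the two values assigned to pStateIdx0 and pStateIdx1 for the initialization are derived from SliceQpY as specified in the VVC standard. Given the variables m and n, the initialization is specified as follows:
  $$\text{preCtxState} = \mathrm{Clip3}\bigl(1,\ 127,\ \bigl((m \cdot (\mathrm{Clip3}(0,\ 63,\ \text{SliceQpY}) - 16)) \gg 1\bigr) + n\bigr)$$
  $$\text{pStateIdx0} = \text{preCtxState} \ll 3, \qquad \text{pStateIdx1} = \text{preCtxState} \ll 7$$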
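The QP-dependent initialization can be sketched as follows. This is a hedged illustration: the arithmetic follows the context-variable initialization of the VVC standard as it is generally described (slopeIdx = initValue >> 3, offsetIdx = initValue & 7, m = slopeIdx − 4, n = offsetIdx · 18 + 1), and should be checked against the specification before being relied upon. The function names `clip3` and `init_context` are hypothetical.

```python
def clip3(lo, hi, v):
    """Clamp v to the inclusive range [lo, hi]."""
    return max(lo, min(hi, v))


def init_context(init_value, slice_qp_y):
    """Return (pStateIdx0, pStateIdx1) for one context, as assumed above."""
    slope_idx = init_value >> 3
    offset_idx = init_value & 7
    m = slope_idx - 4            # slope of the linear model in QP
    n = offset_idx * 18 + 1      # offset of the linear model
    # Linear function of the (clipped) slice QP, clipped to a valid state.
    pre_ctx_state = clip3(1, 127, ((m * (clip3(0, 63, slice_qp_y) - 16)) >> 1) + n)
    return pre_ctx_state << 3, pre_ctx_state << 7


if __name__ == "__main__":
    for qp in (22, 27, 32, 37):
        print(qp, init_context(43, qp))
```

Note that m and n depend only on initValue, while the resulting state depends on the slice QP; this is the property that a QP-based conversion of an inherited state relies on.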
- initValue can be obtained from pre-defined tables.
- initType, which is determined by the slice type and the syntax element sh_cabac_init_flag as follows, is the entry of the tables.
- the syntax element ph_inter_slice_allowed_flag is transmitted as shown in Table 2.
- ph_inter_slice_allowed_flag equal to 0 specifies that all coded slices of the picture have sh_slice_type equal to 2.
- ph_inter_slice_allowed_flag equal to 1 specifies that there might or might not be one or more coded slices in the picture that have sh_slice_type equal to 0 or 1.
- sh_slice_type specifies the coding type of the slice according to Table 3.
- sh_slice_type in slice_header() (VVC Specification)
- sh_cabac_init_flag specifies the method for determining the initialization table used in the initialization process for context variables. When sh_cabac_init_flag is not present, it is inferred to be equal to 0.
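The selection of the initialization table entry can be sketched as below. This is an assumption-labeled example: the mapping of sh_slice_type values (0 = B, 1 = P, 2 = I) and the swap of the P and B tables when sh_cabac_init_flag is set follow the VVC standard as it is generally described, and the function name `derive_init_type` is hypothetical.

```python
SLICE_B, SLICE_P, SLICE_I = 0, 1, 2  # assumed sh_slice_type values


def derive_init_type(sh_slice_type, sh_cabac_init_flag=0):
    """Pick the initialization table index from slice type and init flag."""
    if sh_slice_type == SLICE_I:
        return 0
    if sh_slice_type == SLICE_P:
        return 2 if sh_cabac_init_flag else 1
    # B slice: the flag swaps the tables relative to the P-slice case.
    return 1 if sh_cabac_init_flag else 2


if __name__ == "__main__":
    for slice_type in (SLICE_I, SLICE_P, SLICE_B):
        print(slice_type, derive_init_type(slice_type, 0), derive_init_type(slice_type, 1))
```

The flag defaults to 0, matching the inference rule that applies when sh_cabac_init_flag is not present.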
- FIG. 4 depicts an example of the CABAC context initialization from the previous frame (CIPF).
- the probability state i.e., the context value
- the stored probability state will be used as the initial probability state for the corresponding context model in the next B- or P-slice coded with the same quantization parameter (QP) and the same temporal ID (Tid).
- the CTU location for storing probability states is computed using the following formula:
  $$\text{CTU location} = \min\bigl((W + C)/2 + 1,\ C\bigr) \tag{8}$$
  where W denotes the number of CTUs in a CTU row, and C is the total number of CTUs in a slice.
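A small worked example of the center-CTU rule, CTU location = min((W + C)/2 + 1, C), is given below. It is a sketch under assumptions: integer division is assumed for (W + C)/2, the location is treated as a 1-based count along the coding order, and the function name `center_ctu_location` is hypothetical.

```python
def center_ctu_location(w, c):
    """CTU location for storing probability states, per min((W + C)/2 + 1, C)."""
    if w <= 0 or c <= 0:
        raise ValueError("W and C must be positive")
    return min((w + c) // 2 + 1, c)


if __name__ == "__main__":
    # A slice of 4 CTU rows with 10 CTUs per row.
    print(center_ctu_location(10, 40))
    # A slice consisting of a single CTU row: the result is capped at C.
    print(center_ctu_location(10, 10))
```

The min() term keeps the location inside the slice when the slice is short, for example when it contains only one CTU row.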
- a syntax element sps_cipf_enabled_flag in the sequence parameter set (SPS) can be used as shown in Table 5 to indicate whether the context initialization from the previous frame is enabled or not. If the value of sps_cipf_enabled_flag is equal to 1, the context initialization from the previous frame described above is used for each slice associated with the SPS. If the value of sps_cipf_enabled_flag is equal to 0, the same CABAC context initialization process as that specified in VVC is applied for each slice associated with the SPS.
- the quantization parameter QP for a slice is derived as follows.
- the syntax elements pps_no_pic_partition_flag, pps_init_qp_minus26 and pps_qp_delta_info_in_ph_flag are transmitted in the picture parameter set (PPS) as shown in Table 6.
- the syntax element ph_qp_delta is transmitted in picture_header_structure(), as shown in
- ph_qp_delta specifies the initial value of QpY to be used for the coding blocks in the picture until modified by the value of CuQpDeltaVal in the coding unit layer.
- When pps_qp_delta_info_in_ph_flag is equal to 1, the initial value of the QpY quantization parameter for all slices of the picture, SliceQpY, is derived as follows:
  $$\text{SliceQpY} = 26 + \text{pps\_init\_qp\_minus26} + \text{ph\_qp\_delta} \tag{9}$$
- sh_qp_delta is transmitted in slice_header_structure(), as shown in Table 8.
- sh_qp_delta specifies the initial value of QpY to be used for the coding blocks in the slice until modified by the value of CuQpDeltaVal in the coding unit layer.
- When pps_qp_delta_info_in_ph_flag is equal to 0, the initial value of the QpY quantization parameter for the slice, SliceQpY, is derived as follows:
  $$\text{SliceQpY} = 26 + \text{pps\_init\_qp\_minus26} + \text{sh\_qp\_delta} \tag{10}$$
- Table 8 sh_qp_delta syntax in slice_header_structure() (VVC Specification)
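The slice QP derivation can be sketched as follows, assuming the VVC relation SliceQpY = 26 + pps_init_qp_minus26 + delta, where delta is ph_qp_delta when pps_qp_delta_info_in_ph_flag is 1 and sh_qp_delta otherwise. The function name `derive_slice_qp_y` is hypothetical, and the parsing of the syntax elements is not shown.

```python
def derive_slice_qp_y(pps_init_qp_minus26, pps_qp_delta_info_in_ph_flag,
                      ph_qp_delta=0, sh_qp_delta=0):
    """Initial QpY for a slice from the PPS base QP plus a PH- or SH-level delta."""
    # The PPS flag selects whether the delta comes from the picture header
    # (shared by all slices of the picture) or from the slice header.
    delta = ph_qp_delta if pps_qp_delta_info_in_ph_flag else sh_qp_delta
    return 26 + pps_init_qp_minus26 + delta


if __name__ == "__main__":
    print(derive_slice_qp_y(0, 1, ph_qp_delta=6))
    print(derive_slice_qp_y(0, 0, ph_qp_delta=6, sh_qp_delta=-3))
```

SliceQpY is the value that the CIPF scheme compares between the previous and the current slice when deciding whether a stored context state can be reused.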
- the number of temporal layers (sublayers) is defined in the video parameter set (VPS) and in the sequence parameter set (SPS), as shown in Table 9 and in Table 10.
- vps_max_sublayers_minus1 plus 1 specifies the maximum number of temporal sublayers that may be present in a layer specified by the VPS.
- vps_max_sublayers_minus1 shall be in the range of 0 to 6, inclusive.
- Table 9 Definition of The Number of Temporal Layers in VPS (VVC Specification)
- sps_max_sublayers_minus1 plus 1 specifies the maximum number of temporal sublayers that could be present in each CLVS (coded layer video sequence) referring to the SPS.
- If sps_video_parameter_set_id is greater than 0, the value of sps_max_sublayers_minus1 shall be in the range of 0 to vps_max_sublayers_minus1, inclusive.
- transform coefficients consume most of the bits within video bitstreams. If a large number of bits are spent for the slice, the context table or context value becomes more tailored as encoding proceeds from one CTU to another. On the other hand, the texture may differ from the first CTU to the last CTU. In this case, a good trade-off can be achieved by using the context value of a CTU in the center of the slice to initialize the context for the next slice as shown in FIG. 4. However, if fewer bits are spent for the slice, more blocks are coded in the skip mode. In this case, the context table cannot be tailored for the texture because there are not enough context-coded blocks in the slice.
- the context value of a CTU near the end of the slice along the encoding order can be used to initialize the context for the next slice.
- the last CTU of the slice can be used to initialize the context for the next slice as shown in FIG. 5. That is, instead of Eqn. (8), the CTU location for storing probability states is computed using the following formula:
  $$\text{CTU location} = C \tag{11}$$
  where C is the total number of CTUs in a slice.
- the CTU location used for initializing the context for the next slice can be adaptively switched between Eqns. (8) and (11). In one embodiment, if the value of
- Table 11 Proposed Syntax of sps_cipf_center_flag
- sps_cipf_enabled_flag equal to 1 specifies that for each slice the CTU location for storing probability states is specified by sps_cipf_center_flag. sps_cipf_enabled_flag equal to 0 specifies that the probability states for each slice are reset to the default initial values.
- sps_cipf_center_flag specifies the CTU location for storing probability states.
- sps_cipf_center_flag equal to 1 specifies that for each slice the CTU location for storing probability states is computed using the following formula:
  $$\text{CTU location} = \min\bigl((W + C)/2 + 1,\ C\bigr)$$
- sps_cipf_center_flag equal to 0 specifies that for each slice the CTU location for storing probability states is computed using the following formula:
  $$\text{CTU location} = C$$
  where W denotes the number of CTUs in a CTU row, and C is the total number of CTUs in a slice. If sps_cipf_center_flag is not present, the value of sps_cipf_center_flag is inferred to be equal to 0.
- a pre-determined threshold can be transmitted in the SPS and compared with the slice QP value to determine whether to use the center CTU or the last CTU of the slice for context initialization for the next slice.
- a pre-determined threshold cipf_QP_threshold can be transmitted in the SPS as shown in Table 12, and the QP of the previous slice, sliceQP, can be compared with the value of cipf_QP_threshold to determine the location of the CTU that is used to initialize the context of the slice as follows:
- Table 12 Proposed Syntax of cipf_QP_threshold
- sps_cipf_enabled_flag equal to 1 specifies that for each slice the CTU location for storing probability states is specified by sps_cipf_QP_threshold. sps_cipf_enabled_flag equal to 0 specifies that the probability states for each slice are reset to the default initial values.
- sps_cipf_QP_threshold specifies the QP threshold used to control how to decide the CTU location for entropy initialization if sps_cipf_enabled_flag is equal to 1. If the slice QP specified in the slice header is not bigger than this threshold,
  $$\text{CTU location} = \min\bigl((W + C)/2 + 1,\ C\bigr)$$
  Otherwise,
  $$\text{CTU location} = C$$
  where W denotes the number of CTUs in a CTU row, and C is the total number of CTUs in a slice.
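The QP-threshold rule can be sketched as follows: a slice whose QP does not exceed the threshold stores the state of the center CTU, and any other slice stores the state of the last CTU. This is an illustrative sketch only. The function name `storage_ctu_location` is hypothetical, integer division is assumed for (W + C)/2, and CTU locations are assumed to be 1-based counts along the coding order.

```python
def storage_ctu_location(slice_qp, qp_threshold, w, c):
    """Choose the CTU whose probability states are stored for the next slice."""
    if slice_qp <= qp_threshold:
        # Low QP: many context-coded bins, so a mid-slice state is a good
        # compromise between adaptation and texture drift.
        return min((w + c) // 2 + 1, c)
    # High QP: many skipped blocks, so use the most adapted state (last CTU).
    return c


if __name__ == "__main__":
    w, c = 10, 40
    for qp in (27, 32, 37):
        print(qp, storage_ctu_location(qp, 32, w, c))
```

The same comparison can be evaluated at the decoder, since both the slice QP and the threshold are available from the bitstream.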
- the context initialization for the Random Access is considered.
- the Group of Pictures (GOP) structure for RA as shown in FIG. 6 is defined.
- pictures are divided into different temporal layers, such as the layer 0 to layer 5 in FIG. 6.
- an I-frame and a B-frame are in the temporal layer 0; temporal layer 1 has one B-frame; temporal layer 2 has two B-frames; and so on.
- a lower QP is applied to the pictures of lower temporal layers, and a higher QP is applied to the pictures of higher temporal layers. Coding efficiency improvement can be realized if more bits are spent for pictures of lower temporal layers.
- more blocks are coded in the skip-mode and in this case, image quality of the reference frames is more important for coding efficiency.
- a pre-determined sps_cipf_temporal_layer_threshold can be used to realize coding efficiency improvement.
- the syntax elements sps_cipf_enabled_temporal_layer_threshold and sps_cipf_center_temporal_layer_threshold can be transmitted in the SPS as shown in Table 13, with sps_cipf_center_temporal_layer_threshold being no larger than sps_cipf_enabled_temporal_layer_threshold.
- Table 13 Proposed Syntax of cipf_temporal_layer_threshold
- sps_cipf_enabled_flag equal to 1 specifies that the CABAC context initialization process for each slice associated with the SPS is specified with the following syntax elements. sps_cipf_enabled_flag equal to 0 specifies that the CABAC context initialization process for all the slices is the same and reset to the default initial values.
- sps_cipf_enabled_temporal_layer_threshold specifies the maximum Tid value where CABAC context initialization from the previous frame is applied. If the value of Tid for the current slice is larger than sps_cipf_enabled_temporal_layer_threshold, the CABAC context initialization process specified by VVC is applied.
- sps_cipf_enabled_temporal_layer_threshold shall be in the range of 0 to sps_max_sublayers_minus1 + 1, inclusive.
- sps_cipf_center_temporal_layer_threshold specifies the maximum Tid value where CABAC context initialization specified by FIG. 4 is applied. If the value of Tid for the current slice is larger than sps_cipf_center_temporal_layer_threshold, CABAC context initialization specified by FIG. 5 is applied, that is,
  $$\text{CTU location} = \begin{cases} \min\bigl((W + C)/2 + 1,\ C\bigr), & \text{Tid} \le \text{sps\_cipf\_center\_temporal\_layer\_threshold} \\ C, & \text{otherwise} \end{cases}$$
  where Tid denotes the temporal layer index, W denotes the number of CTUs in a CTU row, and C is the total number of CTUs in a slice.
- sps_cipf_center_temporal_layer_threshold shall be in the range of 0 to sps_cipf_enabled_temporal_layer_threshold, inclusive.
- One more benefit of using the syntax element sps_cipf_enabled_temporal_layer_threshold is that the number of context values that need to be stored can be reduced. For example, in FIG. 6, if the value of sps_cipf_enabled_temporal_layer_threshold is 5, CABAC context initialization values need to be stored for Tid 2, 3, 4 and 5. However, if the value of sps_cipf_enabled_temporal_layer_threshold is 3, CABAC context initialization tables need to be stored only for Tid 2 and 3. This is useful if the storage of the encoder or the decoder is limited.
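The two temporal-layer thresholds can be read as a three-way decision per slice, sketched below. This is an interpretation, not the normative rule: it assumes that a slice with Tid above the enabled threshold falls back to the default VVC initialization, that a slice with Tid at or below the center threshold uses the center CTU (FIG. 4), and that the remaining slices use the last CTU (FIG. 5). The function name `cipf_mode_for_tid` and the returned labels are hypothetical.

```python
def cipf_mode_for_tid(tid, enabled_threshold, center_threshold):
    """Return 'default', 'center', or 'last' for a slice in temporal layer tid."""
    if center_threshold > enabled_threshold:
        raise ValueError("center threshold must not exceed enabled threshold")
    if tid > enabled_threshold:
        return "default"   # CIPF not applied; VVC default initialization
    if tid <= center_threshold:
        return "center"    # inherit the state stored at the center CTU
    return "last"          # inherit the state stored at the last CTU


if __name__ == "__main__":
    for tid in range(6):
        print(tid, cipf_mode_for_tid(tid, enabled_threshold=4, center_threshold=2))
```

Layers mapped to "default" need no stored context state at all, which is the storage saving mentioned above.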
- cipf_enabled_flag is transmitted in the picture header or in the slice header. If cipf_enabled_flag is transmitted in the picture header or in the slice header, cipf_center_flag is also transmitted in the picture header or in the slice header.
- SPS, PPS, picture header and slice header are shown in Tables 14, 15, 16, and 17 respectively.
- Table 14 Proposed Syntax of sps_cipf_enabled_flag in SPS
- Table 15 Proposed Syntax of pps_cipf_info_in_ph_flag
- pps_cipf_info_in_ph_flag equal to 1 specifies that ph_cipf_enabled_flag and ph_cipf_center_flag are transmitted in the picture_header_structure() syntax.
- pps_cipf_info_in_ph_flag equal to 0 specifies that ph_cipf_enabled_flag and ph_cipf_center_flag are not transmitted in the picture_header_structure() syntax.
- sh_cipf_enabled_flag and sh_cipf_center_flag are transmitted in the slice_header() syntax.
- ph_cipf_enabled_flag equal to 1 specifies that CABAC context initialization from the previous frame is applied to all the slices in the associated picture.
- ph_cipf_enabled_flag equal to 0 specifies that CABAC context initialization from the previous frame is applied to none of the slices in the associated picture.
- CABAC context initialization specified by VVC is applied for all the slices in the associated picture.
- ph_cipf_center_flag equal to 1 specifies that, for all the slices in the associated picture, the CTU location for CABAC context initialization from the previous frame is obtained as
  $$\text{CTU location} = \min\bigl((W + C)/2 + 1,\ C\bigr)$$
- ph_cipf_center_flag equal to 0 specifies that, for all the slices in the associated picture, the CTU location for CABAC context initialization from the previous frame is obtained as
  $$\text{CTU location} = C$$
  where W denotes the number of CTUs in a CTU row, and C is the total number of CTUs in a slice.
- sh_cipf_enabled_flag equal to 1 specifies that CABAC context initialization from the previous frame is applied to the associated slice.
- sh_cipf_enabled_flag equal to 0 specifies that CABAC context initialization from the previous frame is not applied to the associated slice and CABAC context initialization is reset to the default initial values.
- sh_cipf_center_flag equal to 1 specifies that, for the associated slice, the CTU location for CABAC context initialization from the previous frame is obtained as
  $$\text{CTU location} = \min\bigl((W + C)/2 + 1,\ C\bigr)$$
- the context initialization is more accurate for a slice and the entropy coded bits can be reduced, thereby improving the coding efficiency.
- FIG. 7 depicts an example of a process 700 for decoding a video encoded via entropy coding with adaptive context initialization, according to some embodiments of the present disclosure.
- One or more computing devices (e.g., the computing device implementing the video decoder 200) implement operations depicted in FIG. 7 by executing suitable program code (e.g., the program code implementing the entropy decoding module 216).
- the process 700 is described with reference to some examples depicted in the figures. Other implementations, however, are possible.
- the process 700 involves accessing a video bitstream representing a video signal.
- The video bitstream is encoded by a video encoder using an entropy coding with the adaptive context initialization presented herein.
- the process 700 involves reconstructing each frame of the video from the video bitstream.
- the process 700 involves accessing a binary bit string from the video bitstream that represents a slice of the frame.
- the process 700 involves determining the initial context value (e.g., p(1) in Eqn. (1)) of an entropy coding model for the slice.
- the order of the CTUs of the previous slice is determined by the scanning order as explained above with respect to FIG. 3.
- a syntax element can be used to indicate the CTU location for obtaining the initial context value from the previous slice, such as the syntax element sps_cipf_center_flag described above. If the syntax element sps_cipf_center_flag has a value of 1, the initial context value can be set to the context value stored for the center CTU of the previous slice; if the syntax element sps_cipf_center_flag has a value of 0, the initial context value can be set to the context value stored for the last CTU of the previous slice.
- Another syntax element, such as sps_cipf_enabled_flag, can be used to indicate whether to use the context value from the previous slice for initialization or use the default initial context value.
- both syntax elements sps_cipf_center_flag and sps_cipf_enabled_flag can be transmitted in the picture header (PH) of the frame containing the slice or the slice header (SH) of the slice.
- determining the initial context value can be performed by extracting the syntax elements sps_cipf_center_flag and sps_cipf_enabled_flag from the bitstream and selecting the proper initial context value based on the value of the syntax elements.
- a syntax element (e.g., sps_cipf_QP_threshold described above)
- The quantization parameter (QP) value of the previous slice can be compared with the threshold value sps_cipf_QP_threshold. If the QP value is smaller than or equal to the threshold value, the initial context value can be set to be the context value of the center CTU of the previous slice; otherwise, the initial context value can be set to be the context value of the last CTU of the previous slice.
- the initialization can be made based on the temporal layer indices associated with the frames in a group of pictures (GOP) structure for random access (RA).
- a syntax element, such as sps_cipf_enabled_temporal_layer_threshold discussed above, indicating a threshold value for determining whether to use the initial context value from the previous slice
- a syntax element, such as sps_cipf_center_temporal_layer_threshold discussed above, indicating a second threshold value for determining a CTU location for obtaining the initial context value from the previous slice.
- the sps_cipf_center_temporal_layer_threshold is set to be no higher than sps_cipf_enabled_temporal_layer_threshold.
- if the temporal layer index Tid of the slice is higher than the sps_cipf_enabled_temporal_layer_threshold, the initial context value for the slice is set to be the default initial context value. If the temporal layer index Tid is no higher than the sps_cipf_enabled_temporal_layer_threshold, the temporal layer index Tid of the slice is compared with the sps_cipf_center_temporal_layer_threshold.
- the process 700 involves decoding the slice by decoding the entropy coded portion of the binary string using the entropy coding model with the determined initial context value.
- the entropy decoded values may represent quantized and transformed residuals of the slice.
- the process 700 involves reconstructing the frame based on the decoded slice. The reconstruction can include dequantization and inverse transformation of the entropy decoded values as described above with respect to FIG. 2 to reconstruct the pixel samples of the slice.
- the operations in blocks 706-712 can be performed for other slices of the frame to reconstruct the frame.
- the reconstructed frames may be output for displaying.
- any CTU located in the center CTU rows (e.g., the center 1-5 CTU rows) of the slice can be used as the first of the three options.
- any CTU in the last several CTU rows e.g., last 1-3 CTU rows
- the same method can be applied to the frame using the stored context value for a CTU (e.g., the center CTU or an end CTU) in the previous frame or the last slice in the previous frame.
- FIG. 8 shows an example of the motion compensation and entropy coding context initialization dependencies of a picture coding structure for random access (RA) common test condition (CTC) applied with the CIPF.
- each box represents a frame.
- the letter inside the box indicates the picture type of the frame and the number indicates the picture order count (POC) of the frame in the display order.
- the number below the box indicates the position of the frame in the coding order.
- the right side of the drawing shows the temporal layer index Tid of each temporal layer similar to those shown in FIG. 6.
- the left side of the drawing shows the delta QP for each temporal layer which is the difference between the QP of the layer and a base QP.
- the context initialization value inherits from the previous picture in the coding order regardless of temporal layer and QP as shown in the example of FIG. 9.
- the context initialization value inherits from the previous picture of lower temporal layer as shown in the example of FIG. 10.
- the context initialization table inheritance follows the motion compensation and prediction structure and uses the reference frame for motion compensation as the “previous” frame from which to inherit the state of the context variables for initializing the context variables of the current frame.
- the context initialization value inheritance of this example can be demonstrated by the motion compensation and prediction paths shown as “prediction dependency” in dotted lines in FIG. 9 and FIG. 10.
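The three inheritance options described above differ only in which already-coded frame is treated as the "previous" frame. The sketch below shows one way to express that choice. The `Frame` fields, the mode names, and the reading of "previous picture of lower temporal layer" as the most recently coded frame with a smaller Tid are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Frame:
    poc: int                 # picture order count (display order)
    coding_order: int        # position in the coding order
    tid: int                 # temporal layer index
    ref_pocs: List[int] = field(default_factory=list)  # hypothetical: list 0 reference POCs


def select_source_frame(current: Frame, coded: List[Frame], mode: str) -> Optional[Frame]:
    """Pick the frame whose stored context values the current frame inherits."""
    earlier = [f for f in coded if f.coding_order < current.coding_order]
    if mode == "previous_in_coding_order":
        candidates = earlier
    elif mode == "lower_temporal_layer":
        candidates = [f for f in earlier if f.tid < current.tid]
    elif mode == "reference_frame":
        if not current.ref_pocs:
            return None
        # Use the first entry of the (assumed) list 0 reference list.
        candidates = [f for f in earlier if f.poc == current.ref_pocs[0]]
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Most recently coded candidate, or None when nothing qualifies.
    return max(candidates, key=lambda f: f.coding_order, default=None)


coded_frames = [
    Frame(poc=0, coding_order=0, tid=0),
    Frame(poc=8, coding_order=1, tid=1, ref_pocs=[0]),
    Frame(poc=4, coding_order=2, tid=2, ref_pocs=[0, 8]),
    Frame(poc=2, coding_order=3, tid=3, ref_pocs=[0, 4]),
]
current_frame = Frame(poc=6, coding_order=4, tid=3, ref_pocs=[8, 4])
```

With these hypothetical frames, the three modes select the frames of POC 2, POC 4, and POC 8, respectively.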
- the context value may inherit from multiple frames when multiple reference frames are involved in motion prediction and compensation.
- the context value initialization may be a combination of these inherited values such as an average or a weighted average.
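A combination of several inherited context values, such as an average or a weighted average, might be computed as in the sketch below. The function name, the rounding rule, and the clipping to [1, 127] are assumptions for illustration; the text above does not fix the weights or the rounding.

```python
from typing import Optional, Sequence


def combine_context_values(values: Sequence[int],
                           weights: Optional[Sequence[float]] = None) -> int:
    """Weighted average of inherited context values, rounded and clipped to [1, 127]."""
    if not values:
        raise ValueError("at least one inherited context value is required")
    if weights is None:
        weights = [1.0] * len(values)   # plain average
    if len(weights) != len(values):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = sum(v * w for v, w in zip(values, weights)) / total
    # Round to nearest integer, then clip (assumed range).
    return max(1, min(127, int(combined + 0.5)))
```

For example, two inherited values 40 and 60 average to 50, and weighting them 3:1 gives 45.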
- coding standards like VVC, ECM, AVC and HEVC support multiple reference frames, and the reference index can differ from block to block even within a single slice.
- coding standards like VVC, ECM, AVC and HEVC support bi-prediction: list 0 prediction and list 1 prediction, typically forward prediction and backward prediction.
- the reference frame of index equal to 0 in the list 0 prediction can be used for CABAC inheritance.
- “slice” may be used to refer to a slice or a frame where the slice is the entire frame.
- the “previous” slice from which the context initialization table is inherited for the current slice may also be referred to as a “reference slice.”
- the context initialization value is inherited from the frame with a different QP value. Directly inheriting the context initialization table of the different QP value can cause a loss in coding efficiency. To avoid the loss, context initialization table conversion based on the previous QP and the current QP can be implemented.
- m and n are not dependent on the slice QP value. But because the values of sh_cabac_init_flag may be different for the previous and the current slices, m and n may be different for the previous slice and the current slice.
- prevCtxState(Qp_prev) is set to be the CABAC table CtxState(Qp_prev) for the previous slice to be inherited to the current slice, and is a known parameter.
- preCtxState(Qp_curr) is the CABAC table for the current slice, and can be obtained by converting CtxState(Qp_prev) with the quantization parameters QpY_prev and QpY_curr. From Eqns. (12) and (13),
- Eqn. (14) can be executed only if the sliceTypes of the previous slice and the current slice are the same. If they are different, the CABAC initialization value calculated by Eqn. (5) is applied.
- the initial context value for the current slice is determined based on the QP values for the previous slice and the current slice as well as the initial context value and the inherited context value of the previous slice.
- FIG. 11 depicts an example of the various values involved in the context initialization table conversion of this embodiment.
- $P_i^{QP_N}$ is the initial context value (e.g., p(l) in Eqn. (1)) for frame N with slice QP value $QP_N$.
- $P_i^{QP_N}$ is the context value of the top-left CTU for frame N.
- $P_f^{QP_N}$ is the context value at a fixed location that is going to be inherited by the first CTU of slice M. The fixed location can be either the center CTU or the last CTU as discussed above.
- $P_i^{QP_M}$ is the initial context value for frame M with slice QP value $QP_M$, for the top-left CTU of frame M.
- $P_f^{QP_M}$ is the context value at the fixed location of either the center CTU or the last (bottom-right) CTU of frame M with $QP_M$.
- $P_f^{QP_M}$ is the context value that is going to be inherited by the first CTU of frame X.
- $P_i^{QP_X}$ is the initial context value for frame X with slice QP value $QP_X$, for the top-left CTU of frame X.
- $P_f^{QP_X}$ is the context value at the fixed location of either the center CTU or the last CTU of frame X with slice QP value $\mathrm{SliceQP}_X$. In other words, this is the context value that is going to be inherited by the first CTU of the next frame after frame X.
- the P f QP M is derived as follows:
- $P_i^{QP_X} = \mathrm{PreCtxState}(QP_X) + \left(\dfrac{P_f^{QP_M} - P_i^{QP_M}}{QP_M}\right) \cdot QP_X$. (17)
- the obtained result can be clipped to be within a certain range, such as [1, 127]. PreCtxState($QP_M$) and PreCtxState($QP_X$) can be calculated based on Eqn. (5).
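The conversion of Eqn. (17) followed by clipping can be sketched as below. Two assumptions are made and should be checked against the full text: Eqn. (5) is taken to have the form of the VVC preCtxState derivation, with m and n already derived from the initialization table, and the division by the previous slice QP is done in integer arithmetic after the multiplication. The function names and the fallback for a previous QP of zero are likewise hypothetical.

```python
def clip3(lower: int, upper: int, value: int) -> int:
    """Clamp value to the inclusive range [lower, upper]."""
    return max(lower, min(upper, value))


def pre_ctx_state(slice_qp: int, m: int, n: int) -> int:
    """Default initial context value (assumed VVC-style form of Eqn. (5))."""
    return clip3(1, 127, ((m * (clip3(0, 63, slice_qp) - 16)) >> 1) + n)


def converted_initial_context(pf_m: int, pi_m: int, qp_m: int, qp_x: int,
                              m: int, n: int) -> int:
    """Initial context value for frame X per Eqn. (17), clipped to [1, 127].

    pf_m: context value stored at the fixed CTU location of frame M
    pi_m: initial context value of frame M
    qp_m, qp_x: slice QP values of frame M and frame X
    """
    base = pre_ctx_state(qp_x, m, n)
    if qp_m == 0:
        # Assumption: fall back to the default value to avoid dividing by zero.
        return base
    # Adaptation gained while coding frame M, rescaled by the QP ratio.
    delta = ((pf_m - pi_m) * qp_x) // qp_m
    return clip3(1, 127, base + delta)
```

With m = 0 and n = 55, a frame M that moved from 30 to 40 at QP 30 yields an initial value of 55 + 11 = 66 for a frame X at QP 33.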
- slice N is the first slice of the whole sequence
- PiQPN is the initial value PreCtxState(QPo) defined by Eqn. (5) in the VVC standard
- PfQPN is the context value after encoding from the first CTU up to the selected fixed CTU location for inheritance by slice M. If QP M and QP N are the same, then PiQP N is set to PfQPx.
- FIG. 12 depicts an example of a process 1200 for decoding a video encoded with the picture coding structure of random access via entropy coding with adaptive context initialization, according to some embodiments of the present disclosure.
- One or more computing devices (e.g., the computing device implementing the video decoder 200) implement operations depicted in FIG. 12 by executing suitable program code (e.g., the program code implementing the entropy decoding module 216).
- the process 1200 is described with reference to some examples depicted in the figures. Other implementations, however, are possible.
- the process 1200 involves accessing a video bitstream representing a video signal.
- the video bitstream is encoded by a video encoder using an entropy coding with the adaptive context initialization presented herein.
- the process 1200 involves reconstructing each frame of the video from the video bitstream.
- the process 1200 involves accessing a binary bit string from the video bitstream that represents a partition of the frame, such as a slice. In some examples, the slice may be the entire frame.
- the process 1200 involves determining the initial context value (e.g., p(l) in Eqn. (1)) of an entropy coding model for the partition. The determination can be based on a context value stored for a CTU in a previous partition, an initial context value associated with the previous partition, the slice quantization parameter of the previous partition, and the slice quantization parameter of the partition.
- the initial context value can be determined to be the context value of the previous frame in the coding order regardless of temporal layer and QP value as shown in the example of FIG. 9.
- the initial context value can be determined to be the context value of the previous frame in a lower temporal layer as shown in the example of FIG. 10.
- the initial context value can be determined to be the context value of the reference frame(s) of the current frame according to the motion compensation and prediction structure.
- the context value of the previous frame can be the context value stored for a center CTU or the last CTU in the previous partition as discussed above.
- the initial context value is inherited from a partition with a different slice QP value.
- Context initialization table conversion based on the previous slice QP value and the current slice QP value is utilized to convert the inherited initial context value to suit the current partition with the current slice QP value.
- the conversion is performed according to Eqn. (15) based on the default initial context value determined using the quantization parameter of the previous partition and the default initial context value determined using the slice quantization parameter of the current partition.
- the conversion is performed based on the initial context value of the previous partition according to Eqn. (16).
- the initial context value of the previous partition can be determined using the same method described herein based on its previous partition.
- the process 1200 involves decoding the partition by decoding the entropy coded portion of the binary string using the entropy coding model with the determined initial context value.
- The entropy decoded values may represent quantized and transformed residuals of the partition.
- the process 1200 involves reconstructing the frame based on the decoded partition. The reconstruction can include dequantization and inverse transformation of the entropy decoded values as described above with respect to FIG. 2 to reconstruct the pixel samples of the partition. If the frame has more than one partition, the operations in blocks 1206-1212 can be performed for the other partitions of the frame to reconstruct the frame.
- the reconstructed frames may also be output for displaying.
- FIG. 14 shows the behaviour of the CIPF buffer for the example shown in FIG. 8.
- QP 1 to QP 5 are defined as:
- QP i is the QP value for the temporal layer with Tid = i.
- the five buffers are used to store the CABAC context values for corresponding QP values at the corresponding temporal layers.
- buffer i is used to store the CABAC context value for temporal layer i with quantization parameter QP i.
- the CABAC context table stored in the buffer is denoted as (Tid, QP) in FIG. 14.
- FIG. 14 shows the content of the buffer after the picture of POC shown at the bottom is coded. The shaded part indicates the new data stored in the buffer after the corresponding picture is coded.
- the CIPF buffers, including buffer 1 to buffer 5, are empty and there is no need to store CABAC context values for inheritance.
- the CABAC context value with QP 1 for Tid 1 is stored in the CIPF buffer 1.
- the CABAC context value with QP 2 for Tid 2 is stored in the CIPF buffer 2.
- the CABAC context value with QP 3 for Tid 3 is stored in the CIPF buffer 3.
- the CABAC context value with QP 4 for Tid 4 is stored in the CIPF buffer 4.
- the CABAC context value with QP 5 for Tid 5 is stored in the CIPF buffer 5.
- the CABAC context value with QP 5 for Tid 5 is used because POC 3 has Tid 5 and QP 5.
- This CABAC context value is stored in the CIPF buffer 5 after encoding the picture of POC 1.
- the CABAC context value with QP 5 for Tid 5 in the CIPF buffer 5 is thus updated after the picture of POC 3 is processed.
- the CABAC context value with QP 4 for Tid 4 which is stored in the CIPF buffer 4 after encoding the picture of POC 2, is used.
- the CABAC context value with QP 4 for Tid 4 in the CIPF buffer 4 is updated after the picture of POC 6 is processed.
- FIG. 15 shows another example of the RA test condition.
- the GOP structure is the same as the example shown in FIG. 8. But for the temporal layers 4 and 5, QP values are not constant.
- the behaviour of the CIPF buffer for the example of FIG. 15 is shown in FIG. 16.
- QP 1 to QP 5b are defined as: (19)
- both QP 4a and QP 4b can be used for temporal layer 4 and both QP 5a and QP 5b can be used for temporal layer 5.
- the entire CIPF buffer is empty and thus there is no need to store CABAC context tables for inheritance.
- the CABAC context value with QP 1 for Tid 1 is stored in the CIPF buffer 1.
- the CABAC context value with QP 2 for Tid 2 is stored in the CIPF buffer 2.
- the CABAC context value with QP 3 for Tid 3 is stored in the CIPF buffer 3.
- the CABAC context value with QP 4a for Tid 4 is stored in the CIPF buffer 4.
- the CABAC context value with QP 5a for Tid 5 is stored in the CIPF buffer 5.
- the CABAC initialization value with QP 5b calculated using Eqn. (6) is used. Since QP 5b for Tid 5 is new to the CIPF buffer, the context values with QP 5a for Tid 5, QP 4a for Tid 4, QP 3 for Tid 3, and QP 2 for Tid 2 are moved to the CIPF buffers 4, 3, 2, and 1, respectively. Then the CABAC context value with QP 5b for Tid 5 is loaded to the CIPF buffer 5 after the picture of POC 3 is processed. As a result, the context value with QP 1 for Tid 1 is removed from the buffer.
- the CABAC initialization value with QP 4b calculated using Eqn. (5) is used. Since QP 4b for Tid 4 is new to the CIPF buffer, the context values with QP 5b for Tid 5, QP 5a for Tid 5, QP 4a for Tid 4, and QP 3 for Tid 3 are moved to the CIPF buffers 4, 3, 2, and 1, respectively. Then the CABAC context value with QP 4b for Tid 4 is added to the CIPF buffer 5 after the picture of POC 6 is processed. As a result, the context value with QP 2 for Tid 2 is removed from the buffer. As seen in the above description, the CABAC initialization value calculated using Eqn. (5), rather than the CIPF, has been used in the coding of the pictures from POC 0 through POC 6.
- in the process of encoding or decoding the picture of POC 24, CIPF cannot be applied either, because the CABAC context table with QP 2 for Tid 2 does not exist in the CIPF buffer.
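The eviction behaviour described for FIG. 16 can be reproduced with a small first-in, first-out buffer keyed by the (Tid, QP) combination. This is a sketch only: the class name, the string QP labels, and the string stand-ins for context tables are assumptions, and only the store order given in the text above is replayed.

```python
from typing import List, Optional, Tuple

Key = Tuple[int, str]  # (Tid, QP label)


class FifoCipfBuffer:
    """Five-entry CIPF buffer in which a new (Tid, QP) pair evicts the oldest entry."""

    def __init__(self, capacity: int = 5) -> None:
        self.capacity = capacity
        self.entries: List[Tuple[Key, str]] = []  # oldest first (buffer 1 ... buffer 5)

    def lookup(self, tid: int, qp: str) -> Optional[str]:
        """Return the stored context table for (Tid, QP), or None if absent."""
        for key, ctx in self.entries:
            if key == (tid, qp):
                return ctx
        return None

    def store(self, tid: int, qp: str, ctx: str) -> Optional[Key]:
        """Store a context table; return the evicted key, if any."""
        key = (tid, qp)
        for i, (existing, _) in enumerate(self.entries):
            if existing == key:
                self.entries[i] = (key, ctx)  # same (Tid, QP): update in place
                return None
        evicted = None
        if len(self.entries) == self.capacity:
            evicted = self.entries.pop(0)[0]  # drop the oldest entry
        self.entries.append((key, ctx))
        return evicted


buf = FifoCipfBuffer()
for tid, qp in [(1, "QP1"), (2, "QP2"), (3, "QP3"), (4, "QP4a"), (5, "QP5a")]:
    buf.store(tid, qp, f"ctx(Tid{tid},{qp})")

evicted_after_poc3 = buf.store(5, "QP5b", "ctx(Tid5,QP5b)")  # new pair for Tid 5
evicted_after_poc6 = buf.store(4, "QP4b", "ctx(Tid4,QP4b)")  # new pair for Tid 4
poc24_hit = buf.lookup(2, "QP2")                             # needed by POC 24
```

Replaying the stores evicts the Tid 1 entry and then the Tid 2 entry, so the lookup for the picture of POC 24 finds nothing, which is the loss the text attributes to this buffer arrangement.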
- a smaller QP value is applied to the pictures in the lower temporal layers, and a bigger QP value is applied to the pictures in the higher temporal layers, because the picture quality of the lower temporal layers affects the picture quality of pictures at the higher temporal layers.
- more bits are spent for the pictures in the lower temporal layers and fewer bits are spent for the pictures in the higher temporal layers.
- Bit saving of the pictures achieved in the encodings at the lower temporal layer is more important to improve overall coding efficiency. Therefore, in this example, the fact that CABAC context table initialization cannot be applied to Tid 2 significantly reduces the coding efficiency improvement that would have been achieved by CIPF.
- the number of buffers can be increased in some cases to accommodate the different combinations of the temporal layer and quantization parameter.
- the number of buffers can be set to be max(5, maximum number of sublayers - 1), instead of 5.
- the buffers can handle the cases in FIGS. 8 and 13.
- the proposed buffer configuration allows multiple CIPF buffers to be allocated to a single Tid if the value of max sublayers (temporal layers) - 1 is 0, that is, only one temporal layer is contained in the bitstream. The allocated multiple CIPF buffers can support conditions like the LD CTC shown in FIG. 13.
- the number of CIPF buffers can be set equal to the number of hierarchical layers in the motion compensation and prediction structure. Inheriting the CABAC context value for a current slice can use the context value in the buffer which has the same Tid. The inheritance is also allowed even if the QP values of the previous and the current slices are different. The discrepancy between the different QP values can be addressed by converting the CABAC context values associated with the QP values of the previous frame and the current frame.
- the conversion can be performed using Eqns. (16) and (17), or more generally,
- the conversion can be performed as: When QP(N+1) QP(N),
- QP(N) is in the range of 0 to 63. If the QP value SliceQP for a particular frame is outside this range, SliceQP should first be clipped accordingly before it is applied to Eqn. (20) or (21).
- the clipping function can be defined as:
- $QP(N) = \mathrm{Clip3}(0, 63, \mathrm{SliceQP}(N))$ (22), where $QP(N) = \mathrm{SliceQP}(N)$ when $0 \le \mathrm{SliceQP}(N) \le 63$; $QP(N) = 0$ when $\mathrm{SliceQP}(N) < 0$; and $QP(N) = 63$ when $\mathrm{SliceQP}(N) > 63$.
- FIG. 17 shows an example of the behaviour of the proposed CIPF buffer configuration for the RA test condition shown in FIG. 15, according to some embodiments of the present disclosure.
- QP 1 to QP 5b are defined by Eqn. (19).
- the entire CIPF buffer is empty, because there is no need to store CABAC context values for inheritance.
- the CABAC context value with QP 1 for Tid 1 is stored in the CIPF buffer 1.
- the CABAC context value with QP 2 for Tid 2 is stored in the CIPF buffer 2.
- the CABAC context value with QP 3 for Tid 3 is stored in the CIPF buffer 3.
- the CABAC context value with QP 4a for Tid 4 is stored in the CIPF buffer 4.
- the CABAC context value with QP 5a for Tid 5 is stored in the CIPF buffer 5.
- a converted CABAC context value is first calculated.
- the conversion can be performed using the CABAC context value in the CIPF buffer 5, the previous slice QP value QP 5a and the current slice QP value QP 5b according to Eqn. (10) or (11).
- the converted CABAC context value is applied in the encoding or decoding process.
- after the encoding or decoding of the picture of POC 3, the CABAC context value at a selected location in the picture of POC 3 (e.g., the CTU location selected based on Eqn. (8) or (11)) is stored in the CIPF buffer 5.
- a converted CABAC context value is first calculated. The calculation can be performed using the CABAC context value in the CIPF buffer 4, the previous slice QP value QP 4a and the current slice QP value QP 4b according to Eqn. (10) or (11). The converted CABAC context value is applied in the encoding or decoding process. After the encoding or the decoding of the picture of POC 6, the CABAC context value at a selected location in the picture of POC 6 (e.g., the CTU location selected based on Eqn. (8) or (11)) is stored in the CIPF buffer 4. In contrast to FIG.
- the CABAC initialization values calculated by Eqn. (20) or (21) can be used in the coding of at least these pictures. Further, the CIPF can also be applied to the picture of POC 24, as the CABAC context table with QP 2 for Tid 2 is maintained in the CIPF buffer and is available for the coding of the picture of POC 24.
- the CIPF buffers always keep a set of CABAC context values from each temporal layer.
- the CIPF process can be applied to each eligible picture by using the CABAC context value stored in the buffer that has the same temporal layer index.
- the new CABAC context value will replace the existing CABAC context value in the buffer that has been used as the initial CABAC context value and has the same temporal layer index as the current picture.
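The proposed configuration, with one buffer per temporal layer and conversion when the slice QP differs, can be sketched as follows. The class and function names, the numeric QP and context values, and in particular `placeholder_convert` are hypothetical: the placeholder only stands in for the conversion equations of this disclosure, which are not reproduced here.

```python
from typing import Callable, Dict, Optional, Tuple

# (stored context value, previous slice QP, current slice QP) -> converted value
Converter = Callable[[int, int, int], int]


class PerLayerCipfBuffer:
    """One CIPF slot per temporal layer (Tid), holding (slice QP, context value)."""

    def __init__(self, convert: Converter) -> None:
        self._slots: Dict[int, Tuple[int, int]] = {}
        self._convert = convert

    def initial_context(self, tid: int, slice_qp: int) -> Optional[int]:
        """Initial context value for a picture, or None if the layer has no entry."""
        slot = self._slots.get(tid)
        if slot is None:
            return None                 # caller falls back to default initialization
        stored_qp, ctx = slot
        if stored_qp == slice_qp:
            return ctx                  # same QP: inherit directly
        return self._convert(ctx, stored_qp, slice_qp)

    def store(self, tid: int, slice_qp: int, ctx: int) -> None:
        """Replace the entry of the picture's own temporal layer."""
        self._slots[tid] = (slice_qp, ctx)


def placeholder_convert(ctx: int, qp_prev: int, qp_curr: int) -> int:
    """Stand-in only; NOT the conversion defined in this disclosure."""
    return max(1, min(127, ctx + (qp_curr - qp_prev)))


buf = PerLayerCipfBuffer(placeholder_convert)
# Hypothetical (Tid, slice QP, stored context value) triples for Tid 1 to Tid 5.
for tid, qp, ctx in [(1, 23, 60), (2, 25, 62), (3, 27, 64), (4, 28, 66), (5, 29, 68)]:
    buf.store(tid, qp, ctx)

init_poc3 = buf.initial_context(tid=5, slice_qp=30)   # QP changed: converted value
buf.store(5, 30, 70)
init_poc6 = buf.initial_context(tid=4, slice_qp=29)   # QP changed: converted value
buf.store(4, 29, 67)
init_poc24 = buf.initial_context(tid=2, slice_qp=25)  # Tid 2 entry is still present
```

Because a store only ever overwrites the slot of the same temporal layer, the Tid 2 entry survives the QP changes in layers 4 and 5 and remains available to the later Tid 2 picture.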
- the CIPF proposed in Eqn. (10) or (11) can be applied.
- the existing CABAC context value inheritance approach (i.e., inheritance from a previous frame with the same slice QP value in the same temporal layer)
- the default initialization value calculated using Eqn. (5) is applied instead.
- the buffer management shown in FIG. 16 can be improved by applying the default context initialization in Eqn.
- FIG. 18 depicts an example of a process 1800 for decoding a video encoded with the picture coding structure of random access via entropy coding with adaptive context initialization, according to some embodiments of the present disclosure.
- One or more computing devices (e.g., the computing device implementing the video decoder 200) implement operations depicted in FIG. 18 by executing suitable program code (e.g., the program code implementing the entropy decoding module 216).
- the process 1800 is described with reference to some examples depicted in the figures. Other implementations, however, are possible.
- the process 1800 involves accessing a video bitstream representing a video signal.
- the video bitstream is encoded by a video encoder using an entropy coding with the adaptive context initialization presented herein.
- the process 1800 involves reconstructing each frame of the video from the video bitstream.
- the process 1800 involves accessing a binary bit string from the video bitstream that represents a partition of a frame, such as a slice. In some examples, the slice may be the entire frame.
- the process 1800 involves determining the initial context value (e.g., $P_i^{QP(N+1)}$ in Eqns. (20) and (21)) of an entropy coding model for the partition.
- the decoder can access a buffer, from a set of buffers, that corresponds to the temporal layer (i.e., sublayer) of the frame to obtain the context value stored for a CTU in the previous frame (e.g., $P_f^{QP(N)}$ in Eqns. (20) and (21)).
- the stored context value may be for a center CTU or the last CTU in the previous frame.
- the number of buffers is set to the number of temporal layers, each temporal layer having one buffer storing the context value. It is likely that the slice quantization parameters for the frames in the same temporal layer have different values. As such, the same buffer will need to store the context values for frames with different parameter values.
- the context value retrieved from the buffer may need to be converted before being used to derive the initial context value for the current frame. For example, the conversion can be performed according to Eqn. (20) or (21).
- the number of buffers can be set to the larger of 5 and the maximum number of sub-layers in the video. In this way, one buffer is used to store data for one combination of the temporal layer index and the slice quantization parameter value. No conversion is needed in this embodiment so long as the combination of the temporal layer index and the slice quantization parameter is in the CIPF buffer.
- the process 1800 involves decoding the partition by decoding the entropy coded portion of the binary string using the entropy coding model with the determined initial context value.
- the entropy decoded values may represent quantized and transformed residuals of the partition.
- the process 1800 involves replacing the context value stored in the buffer with the context value determined for a CTU in the frame during the decoding.
- the CTU may be the center CTU or the last CTU of a slice in the frame depending on the value of the syntax elements indicating the location of the CTU for CIPF, such as sps_cipf_center_flag.
- the process 1800 involves reconstructing the frame based on the decoded partition.
- the reconstruction can include dequantization and inverse transformation of the entropy decoded values as described above with respect to FIG. 2 to reconstruct the pixel samples of the partition. If the frame has more than one partition, the operations in blocks 1806-1814 can be performed for the other partitions of the frame to reconstruct the frame.
- the reconstructed frames may be output for displaying.
- FIG. 19 depicts an example of a computing device 1900 that can implement the video encoder 100 of FIG. 1 or the video decoder 200 of FIG. 2.
- the computing device 1900 can include a processor 1912 that is communicatively coupled to a memory 1914 and that executes computer-executable program code and/or accesses information stored in the memory 1914.
- the processor 1912 may comprise a microprocessor, an application-specific integrated circuit (“ASIC”), a state machine, or other processing device.
- the processor 1912 can include any of a number of processing devices, including one.
- Such a processor can include or may be in communication with a computer-readable medium storing instructions that, when executed by the processor 1912, cause the processor to perform the operations described herein.
- the memory 1914 can include any suitable non-transitory computer-readable medium.
- the computer-readable medium can include any electronic, optical, magnetic, or other storage device capable of providing a processor with computer-readable instructions or other program code.
- Non-limiting examples of a computer-readable medium include a magnetic disk, memory chip, ROM, RAM, an ASIC, a configured processor, optical storage, magnetic tape or other magnetic storage, or any other medium from which a computer processor can read instructions.
- the instructions may include processor-specific instructions generated by a compiler and/or an interpreter from code written in any suitable computer-programming language, including, for example, C, C++, C#, Visual Basic, Java, Python, Perl, JavaScript, and ActionScript.
- the computing device 1900 can also include a bus 1916.
- the bus 1916 can communicatively couple one or more components of the computing device 1900.
- the computing device 1900 can also include a number of external or internal devices such as input or output devices.
- the computing device 1900 is shown with an input/output (“I/O”) interface 1918 that can receive input from one or more input devices 1920 or provide output to one or more output devices 1922.
- The one or more input devices 1920 and one or more output devices 1922 can be communicatively coupled to the I/O interface 1918.
- the communicative coupling can be implemented via any suitable manner (e.g., a connection via a printed circuit board, connection via a cable, communication via wireless transmissions, etc.).
- Non-limiting examples of input devices 1920 include a touch screen (e.g., one or more cameras for imaging a touch area or pressure sensors for detecting pressure changes caused by a touch), a mouse, a keyboard, or any other device that can be used to generate input events in response to physical actions by a user of a computing device.
- Non-limiting examples of output devices 1922 include an LCD screen, an external monitor, a speaker, or any other device that can be used to display or otherwise present outputs generated by a computing device.
- the computing device 1900 can execute program code that configures the processor 1912 to perform one or more of the operations described above with respect to FIGS. 1-18.
- the program code can include the video encoder 100 or the video decoder 200.
- the program code may be resident in the memory 1914 or any suitable computer-readable medium and may be executed by the processor 1912 or any other suitable processor.
- the computing device 1900 can also include at least one network interface device 1924.
- the network interface device 1924 can include any device or group of devices suitable for establishing a wired or wireless data connection to one or more data networks 1928.
- Non-limiting examples of the network interface device 1924 include an Ethernet network adapter, a modem, and/or the like.
- the computing device 1900 can transmit messages as electronic or optical signals via the network interface device 1924.
- a computing device can include any suitable arrangement of components that provide a result conditioned on one or more inputs.
- Suitable computing devices include multi-purpose microprocessor-based computer systems accessing stored software that programs or configures the computing system from a general purpose computing apparatus to a specialized computing apparatus implementing one or more embodiments of the present subject matter. Any suitable programming, scripting, or other type of language or combinations of languages may be used to implement the teachings contained herein in software to be used in programming or configuring a computing device.
- Embodiments of the methods disclosed herein may be performed in the operation of such computing devices.
- the order of the blocks presented in the examples above can be varied — for example, blocks can be re-ordered, combined, and/or broken into sub-blocks. Some blocks or processes can be performed in parallel.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202380024808.0A CN118830246A (zh) | 2022-03-09 | 2023-03-09 | Adaptive context initialization for entropy coding in video coding |
| JP2024552506A JP2025508539A (ja) | 2022-03-09 | 2023-03-09 | Adaptive context initialization for entropy coding in video coding |
| US18/843,984 US20250184510A1 (en) | 2022-03-09 | 2023-03-09 | Method for decoding video from video bitstream representing video |
| MX2024010882A MX2024010882A (es) | 2022-03-09 | 2023-03-09 | Adaptive context initialization for entropy coding in video coding |
Applications Claiming Priority (10)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202263269090P | 2022-03-09 | 2022-03-09 | |
| US63/269,090 | 2022-03-09 | ||
| US202263363703P | 2022-04-27 | 2022-04-27 | |
| US63/363,703 | 2022-04-27 | ||
| US202263366218P | 2022-06-10 | 2022-06-10 | |
| US63/366,218 | 2022-06-10 | ||
| US202263367710P | 2022-07-05 | 2022-07-05 | |
| US63/367,710 | 2022-07-05 | ||
| US202263368240P | 2022-07-12 | 2022-07-12 | |
| US63/368,240 | 2022-07-12 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2023173025A2 (en) | 2023-09-14 |
| WO2023173025A3 (en) | 2023-10-19 |
Family
ID=87935994
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2023/064052 Ceased WO2023173025A2 (en) | 2022-03-09 | 2023-03-09 | Adaptive context initialization for entropy coding in video coding |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US20250184510A1 |
| JP (1) | JP2025508539A |
| MX (1) | MX2024010882A |
| WO (1) | WO2023173025A2 |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2026010995A1 (en) * | 2024-07-01 | 2026-01-08 | Tencent America LLC | Flexible context-based adaptive binary arithmetic coding parameters |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR100664936B1 (ko) * | 2005-04-13 | 2007-01-04 | 삼성전자주식회사 | 코딩 효율이 향상된 컨텍스트 기반 적응적 산술 코딩 및디코딩 방법과 이를 위한 장치, 이를 포함하는 비디오 코딩및 디코딩 방법과 이를 위한 장치 |
| US7932843B2 (en) * | 2008-10-17 | 2011-04-26 | Texas Instruments Incorporated | Parallel CABAC decoding for video decompression |
| US10869062B2 (en) * | 2017-12-21 | 2020-12-15 | Qualcomm Incorporated | Probability initialization and signaling for adaptive arithmetic coding in video coding |
| JP7079377B2 (ja) * | 2019-03-25 | 2022-06-01 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | 符号化装置、復号装置、符号化方法、および復号方法 |
-
2023
- 2023-03-09 JP JP2024552506A patent/JP2025508539A/ja active Pending
- 2023-03-09 US US18/843,984 patent/US20250184510A1/en active Pending
- 2023-03-09 MX MX2024010882A patent/MX2024010882A/es unknown
- 2023-03-09 WO PCT/US2023/064052 patent/WO2023173025A2/en not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| US20250184510A1 (en) | 2025-06-05 |
| WO2023173025A3 (en) | 2023-10-19 |
| JP2025508539A (ja) | 2025-03-26 |
| MX2024010882A (es) | 2024-09-17 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23767705; Country of ref document: EP; Kind code of ref document: A2 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2024552506; Country of ref document: JP. Ref document number: 202380024808.0; Country of ref document: CN |
| | WWE | Wipo information: entry into national phase | Ref document number: 18843984; Country of ref document: US |
| | WWE | Wipo information: entry into national phase | Ref document number: MX/A/2024/010882; Country of ref document: MX |
| | WWE | Wipo information: entry into national phase | Ref document number: 202417074635; Country of ref document: IN |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 23767705; Country of ref document: EP; Kind code of ref document: A2 |
| | WWP | Wipo information: published in national office | Ref document number: 18843984; Country of ref document: US |