CN110710219B

CN110710219B - Method and apparatus for context derivation for coefficient coding

Info

Publication number: CN110710219B
Application number: CN201880036772.7A
Authority: CN
Inventors: Aki Kuusela; Dake He
Original assignee: Google LLC
Current assignee: Google LLC
Priority date: 2017-12-08
Filing date: 2018-09-14
Publication date: 2022-02-11
Anticipated expiration: 2038-09-14
Also published as: CN114449277A; CN110710219A; EP3721630A1; WO2019112669A1

Abstract

Coding a transform block having transform coefficients is described. A plurality of register arrays are defined, each holding one or more stored values for a coding context based on at least one spatial template for the coding context. The register arrays are initialized by setting the stored values to default values, and values of the transform coefficients from the transform block are coded in a reverse scan order. The value of a transform coefficient indicates a magnitude of the transform coefficient. For each of one or more transform coefficients, the coding comprises: determining the coding context using at least some of the stored values from the register arrays; entropy coding a value of the transform coefficient using the coding context; and updating the register arrays after entropy coding the value of the transform coefficient.

Description

Method and apparatus for context derivation for coefficient coding

Background

A digital video stream may represent video using a sequence of frames or still images. Digital video may be used in a variety of applications, including, for example, video conferencing, high definition video entertainment, video advertising, or sharing of user generated video. Digital video streams may contain large amounts of data and consume a large amount of the computing or communication resources of a computing device used to process, transmit, or store the video data. Various methods have been proposed to reduce the amount of data in a video stream, including compression and other encoding techniques.

Disclosure of Invention

One aspect of the disclosed embodiments is a method of coding a transform block having transform coefficients. The method comprises: defining register arrays, each holding one or more stored values for a coding context (i.e., for determining the coding context), based on at least one spatial template for the coding context, wherein the register arrays include at least a first register array having a first size and a second register array having a second size different from the first size; initializing the register arrays by setting the stored values to default values; and coding, in a reverse scan order, values of the transform coefficients from the transform block, the values indicating magnitudes of the transform coefficients. The coding comprises, for each of one or more transform coefficients: determining a coding context using at least some of the stored values from the register arrays; entropy coding a value of the transform coefficient at a scan location using the coding context; and updating the register arrays after entropy coding the value of the transform coefficient. The first register array has a different size than the second register array; for example, the size of the first register array may be set according to a cardinality of stored values that differs from the cardinality of values according to which the size of the second register array is set.

Another aspect of the disclosed embodiments is an apparatus for coding a transform block having transform coefficients. The apparatus comprises: a memory; and a processor configured to execute instructions stored in the memory. The instructions, when executed, cause the processor to: define register arrays based on at least one spatial template for a coding context, each register array holding one or more stored values for the coding context, wherein the register arrays include at least a first register array having a first size and a second register array having a second size different from the first size; initialize the register arrays by setting the stored values to default values; and code, in a reverse scan order, values of the transform coefficients from the transform block, the values indicating magnitudes of the transform coefficients. The instructions for coding include instructions to, for each of one or more transform coefficients: determine a coding context using at least some of the stored values from the register arrays; entropy code a value of the transform coefficient at a scan location using the coding context; and update the register arrays after entropy coding the value of the transform coefficient.

Another apparatus for coding a transform block having transform coefficients is described, the apparatus comprising: a memory; and a processor configured to execute instructions stored in the memory. The instructions, when executed, cause the processor to define register arrays, each holding one or more stored values for a coding context, based on at least one spatial template for the coding context; initialize the register arrays by setting the stored values to default values; and code, in a reverse scan order, values of the transform coefficients of the transform block, the values indicating magnitudes of the transform coefficients. The instructions for coding include instructions for: determining a first coding context using at least some of the stored values from the register arrays; entropy coding a first value of a transform coefficient using the first coding context, the first value indicating a magnitude of the transform coefficient and belonging to a set of positive integers {0, …, first maximum}; determining a second coding context using at least some of the stored values from the register arrays; entropy coding a second value of the transform coefficient using the second coding context, the second value indicating the magnitude of the transform coefficient and belonging to a set of positive integers {0, …, second maximum}, the second maximum being greater than the first maximum; and updating the register arrays after entropy coding the first value and the second value.

Another aspect of the disclosed embodiments is a method of coding a transform block having transform coefficients. The method comprises: defining register arrays, each holding one or more stored values for a coding context, based on at least one spatial template for the coding context; initializing the register arrays by setting the stored values to default values; and coding, in a reverse scan order, values of the transform coefficients of the transform block, the values indicating magnitudes of the transform coefficients. The coding comprises: determining a first coding context using at least some of the stored values from the register arrays; entropy coding a first value of a transform coefficient using the first coding context, the first value indicating a magnitude of the transform coefficient and belonging to a set of positive integers {0, …, first maximum}; determining a second coding context using at least some of the stored values from the register arrays; entropy coding a second value of the transform coefficient using the second coding context, the second value indicating the magnitude of the transform coefficient and belonging to a set of positive integers {0, …, second maximum}, the second maximum being greater than the first maximum; and updating the register arrays after entropy coding the first value and the second value.

These and other aspects of the disclosure are disclosed in the following detailed description of the embodiments, the appended claims and the accompanying drawings.

Drawings

The description herein makes reference to the accompanying drawings wherein like reference numerals refer to like parts throughout the several views.

Fig. 1 is a schematic diagram of a video encoding and decoding system.

Fig. 2 is a block diagram of an example of a computing device that can implement a sending station or a receiving station.

Fig. 3 is a diagram of a video stream to be encoded and subsequently decoded.

Fig. 4 is a block diagram of an encoder according to an embodiment of the present disclosure.

Fig. 5 is a block diagram of a decoder according to an embodiment of the present disclosure.

Fig. 6 is a diagram illustrating a scanning order that may be utilized when coding a block of transform coefficients according to an embodiment of the present disclosure.

Fig. 7 is a diagram illustrating the levels of transform coefficient coding using a level map according to an embodiment of the present disclosure.

Fig. 8 is a flow diagram of a process for encoding a transform block in an encoded video bitstream using a level map according to an embodiment of the present disclosure.

Fig. 9A is a diagram illustrating a first set of spatially neighboring templates that may be utilized in a context-based arithmetic coding method according to an embodiment of the present disclosure.

Fig. 9B is a diagram illustrating a second set of spatially neighboring templates that may be utilized in a context-based arithmetic coding method according to an embodiment of the present disclosure.

Fig. 10 is a diagram showing a first example of a register set corresponding to a horizontal template.

Fig. 11 is a diagram showing a first example of a register set corresponding to a vertical template.

Fig. 12 is a diagram showing a first example of a register set corresponding to a two-dimensional template.

Fig. 13 is a diagram showing a second example of a register set corresponding to a horizontal template.

Fig. 14 is a diagram showing a second example of a register set corresponding to a vertical template.

Fig. 15 is a diagram showing a second example of a register set corresponding to a two-dimensional template.

Fig. 16 is a diagram showing a third example of a register set corresponding to a horizontal template.

Fig. 17 is a diagram showing a third example of a register set corresponding to a vertical template.

Fig. 18 is a diagram showing a third example of a register set corresponding to a two-dimensional template.

Fig. 19 is a flow diagram of a process for coding a transform block according to an embodiment of the present disclosure.

Fig. 20 is a flow diagram of another process for coding a transform block according to an embodiment of the present disclosure.

Detailed Description

As mentioned above, compression schemes related to coding video streams may include breaking images into blocks and generating a digital video output bitstream (e.g., an encoded bitstream) using one or more techniques to limit the information included in the output. A received bitstream may be decoded to re-create the blocks and the source images from the limited information. Encoding a video stream, or a portion thereof such as a frame or a block, may include using temporal or spatial similarities in the video stream to improve coding efficiency. For example, a current block of a video stream may be encoded based on identifying a difference (residual) between previously coded pixel values, or between a combination of previously coded pixel values, and those in the current block.

Encoding using spatial similarity may be referred to as intra prediction. Intra-prediction attempts to predict pixel values of blocks of a frame of video using pixels at the periphery of the blocks; that is, pixels in the same frame as the block but outside the block are used. The prediction block resulting from intra prediction is referred to herein as an intra predictor. Intra prediction may be performed along prediction directions, where each direction may correspond to an intra prediction mode. The intra prediction mode may be signaled by the encoder to the decoder.

Encoding using temporal similarity may be referred to as inter prediction. Inter-prediction attempts to predict pixel values of a block using one or more possibly displaced blocks from one or more temporally adjacent frames (i.e., reference frames). The displacement is identified by a motion vector. Temporally adjacent frames are frames that occur in the video stream either earlier or later in time than the frame of the block being encoded. Some coders use up to eight reference frames, which may be stored in a frame buffer. The motion vector may reference (i.e., use) one of the reference frames of the frame buffer. Also, one or more reference frames may be used to code the current frame. The prediction block resulting from inter prediction is referred to herein as an inter predictor.

As mentioned above, a current block of a video stream may be encoded based on identifying differences (residuals) between previously coded pixel values and pixel values in the current block. In this way, only the residual and the parameters used to generate the residual need to be added to the encoded bitstream. The residual may be encoded using a lossy quantization step.

The residual block may be in the pixel domain. The residual block may be transformed into a transform domain, thereby producing a transform block of transform coefficients. Herein, the frequency domain is used as an example of a transform domain, but the term should be interpreted to refer generally to the domain in which values are represented after being transformed by, for example, a Discrete Cosine Transform (DCT) and its variants, a Discrete Sine Transform (DST) and its variants, or an identity transform and its scaled variants.

The transform coefficients may be quantized, thereby producing a quantized transform block of quantized transform coefficients. The quantized coefficients may be entropy encoded and added to the encoded bitstream. A decoder may receive the encoded bitstream and entropy decode the quantized transform coefficients to reconstruct the original video frame.

Entropy coding is a technique for "lossless" coding that relies on a probabilistic model that models the distribution of values that occur in the coded video bitstream. Entropy coding can reduce the number of bits required to represent video data to near a theoretical minimum by using a probabilistic model based on a distribution of measured or estimated values. In practice, the actual reduction in the number of bits required to represent the video data may be a function of the precision of the probability model, the number of bits used to perform the encoding, and the computational precision of the fixed point arithmetic used to perform the encoding.
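The dependence of the bit count on the accuracy of the probability model can be illustrated numerically. The following is a minimal sketch, not part of the disclosed embodiments: it compares the ideal code length of a short symbol sequence under a model that matches the data with the length under a uniform model. The symbol values, both models, and the function names are hypothetical, and fixed-point precision effects are ignored.

```python
import math
from collections import Counter


def empirical_entropy_bits(symbols):
    """Shannon entropy, in bits per symbol, of the empirical distribution."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def ideal_code_length_bits(symbols, model):
    """Total bits an ideal entropy coder would spend under `model`.

    `model` maps each symbol to its assumed probability.
    """
    return sum(-math.log2(model[s]) for s in symbols)


# Hypothetical sequence of quantized values: mostly zeros.
symbols = [0, 0, 0, 0, 0, 0, 1, 2]

# A model that matches the empirical distribution exactly.
accurate_model = {0: 0.75, 1: 0.125, 2: 0.125}
# A model that ignores the skew toward zero.
uniform_model = {0: 1 / 3, 1: 1 / 3, 2: 1 / 3}

lower_bound = len(symbols) * empirical_entropy_bits(symbols)
bits_accurate = ideal_code_length_bits(symbols, accurate_model)
bits_uniform = ideal_code_length_bits(symbols, uniform_model)

print(f"entropy bound: {lower_bound:.2f} bits")
print(f"accurate model: {bits_accurate:.2f} bits")
print(f"uniform model: {bits_uniform:.2f} bits")
```

Under these assumptions the accurate model reaches the entropy bound, while the uniform model spends more bits on the same sequence.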

In encoding a video bitstream, many bits are used in one of two cases: content prediction (e.g., inter mode/motion vector coding, intra prediction mode coding, etc.) or residual coding (e.g., transform coefficient coding).

For content prediction, bits in the bitstream may include, for a block, the intra prediction mode used to encode the block. The intra prediction mode may be coded (encoded by the encoder and decoded by the decoder) using entropy coding. To that end, a context is determined for the intra prediction mode, and a probability model corresponding to the context is used to code the intra prediction mode.

Entropy coding a sequence of symbols is typically achieved by using a probability model to determine a probability for the sequence, then using binary arithmetic coding to map the sequence to a binary codeword at the encoder and to decode the sequence from the binary codeword at the decoder.

A context model, as used herein, may be a parameter in entropy coding. The context model may be any parameter or method that affects the probability estimation for entropy coding. The purpose of context modeling is to obtain probability distributions for a subsequent entropy coding engine, such as arithmetic coding, Huffman coding, or another variable-length-to-variable-length coding engine. To achieve good compression performance, a large number of contexts may be required. For example, some video coding systems may include hundreds or even thousands of contexts for transform coefficient coding alone. Each context may correspond to a probability distribution.
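The relationship between contexts and probability distributions can be sketched as a table of adaptive symbol counts, one row per context. This is an illustrative assumption rather than the mechanism of any particular codec: the class name `ContextModels`, the uniform initial counts, and the count-based update rule are all hypothetical.

```python
from collections import defaultdict


class ContextModels:
    """One adaptive probability distribution per context (illustrative only)."""

    def __init__(self, alphabet_size):
        self.alphabet_size = alphabet_size
        # Each context starts from uniform counts (an assumption of this sketch).
        self.counts = defaultdict(lambda: [1] * alphabet_size)

    def probability(self, context, symbol):
        """Estimated probability of `symbol` given `context`."""
        counts = self.counts[context]
        return counts[symbol] / sum(counts)

    def update(self, context, symbol):
        """Adapt the distribution of `context` after coding `symbol`."""
        self.counts[context][symbol] += 1


models = ContextModels(alphabet_size=2)
for _ in range(3):
    models.update(context=0, symbol=0)

# Context 0 has adapted toward symbol 0; context 1 is still uniform.
p_seen = models.probability(context=0, symbol=0)
p_unseen = models.probability(context=1, symbol=0)
print(p_seen, p_unseen)
```

Because each context adapts independently, symbols that occur in different neighborhoods can be coded with different probability estimates.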

Residual coding involves transforming the residual of a video block into a transform block of transform coefficients. The transform block is in the frequency domain and one or more transform blocks may be generated for the video block. The transform coefficients are quantized and entropy coded into a coded video bitstream. The decoder reconstructs the block using the encoded transform coefficients and the reference frame. Entropy coding of transform coefficients involves the selection of a context model (also referred to as a probabilistic context model or probability model) that provides an estimate of the conditional probability for coding the binary symbols of the binarized transform coefficients.

In embodiments of template-based entropy coding of quantized transform coefficients described herein, spatial templates are used during entropy coding to select context neighbors of the coded values, and the context neighbors are used to determine the coding context. However, accessing the values needed to determine the coding context may result in a storage bottleneck. In the embodiments described herein, the values required to determine the coding context are saved in a register array within memory, and at least some of the values in the register array are updated with the most recently encoded values, which reduces the amount of information that must be obtained from basic information such as transform blocks or level maps. In some embodiments, these register arrays may be organized into one or more register sets, where the length of the arrays (i.e., the number of elements or stored context neighbor values) may be different. The register array may comprise a shift register or any other memory, wherein values may be shifted within the array and/or between arrays.
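The idea of holding context neighbors in a register array can be sketched for a single row of coefficient magnitudes coded from right to left. The sketch below is a simplified illustration under stated assumptions, not the disclosed register layout: the template length of two, the default value of zero, the sum-and-cap context rule, and all names are hypothetical. It checks that the context derived from the shift register equals the context obtained by re-reading the neighbors from the row itself.

```python
from collections import deque

TEMPLATE_LENGTH = 2  # hypothetical: two already-coded neighbors to the right
DEFAULT_VALUE = 0    # value assumed for neighbors outside the row
MAX_CONTEXT = 3      # hypothetical cap on the context index


def contexts_from_registers(row):
    """Derive a context per position using only a small shift register."""
    register = deque([DEFAULT_VALUE] * TEMPLATE_LENGTH, maxlen=TEMPLATE_LENGTH)
    result = []
    for pos in range(len(row) - 1, -1, -1):  # reverse scan order
        context = min(sum(register), MAX_CONTEXT)
        result.append((pos, row[pos], context))  # entropy coding would occur here
        # Update after coding: shift in the new value, dropping the oldest.
        register.appendleft(row[pos])
    return result


def contexts_from_row(row):
    """Reference version that re-reads the neighbors from the row each time."""
    result = []
    for pos in range(len(row) - 1, -1, -1):
        neighbors = row[pos + 1 : pos + 1 + TEMPLATE_LENGTH]
        result.append((pos, row[pos], min(sum(neighbors), MAX_CONTEXT)))
    return result


row = [3, 1, 0, 2]  # hypothetical coefficient magnitudes
coded = contexts_from_registers(row)
print(coded)
```

In this sketch the only read from the row per coefficient is the value being coded; every neighbor value comes from the register, which is the access pattern the register arrays are intended to provide.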

Template-based entropy coding of quantized transform coefficients is described herein first with reference to a system in which the teachings may be incorporated.

Fig. 1 is a schematic diagram of a video encoding and decoding system 100. Transmitting station 102 may be, for example, a computer having an internal configuration of hardware such as that depicted in Fig. 2. However, other suitable implementations of transmitting station 102 are possible. For example, the processing of transmitting station 102 may be distributed among multiple devices.

Network 104 may connect transmitting station 102 and receiving station 106 for encoding and decoding of a video stream. In particular, a video stream may be encoded in transmitting station 102 and the encoded video stream may be decoded in receiving station 106. The network 104 may be, for example, the Internet. The network 104 may also be a Local Area Network (LAN), a Wide Area Network (WAN), a Virtual Private Network (VPN), a cellular telephone network, or any other means of transferring a video stream from transmitting station 102 to receiving station 106.

In one example, the receiving station 106 may be a computer having an internal configuration of hardware such as that depicted in Fig. 2. However, other suitable implementations of the receiving station 106 are possible. For example, the processing of the receiving station 106 may be distributed among multiple devices.

Other implementations of the video encoding and decoding system 100 are possible. For example, one embodiment may omit network 104. In another embodiment, the video stream may be encoded and then stored for transmission at a later time to the receiving station 106 or to any other device having memory. In one embodiment, the receiving station 106 receives an encoded video stream (e.g., via the network 104, a computer bus, and/or some communication path) and stores the video stream for later decoding. In an example embodiment, the Real-time Transport Protocol (RTP) is used to transport encoded video over the network 104. In another embodiment, a transport protocol other than RTP may be used, for example, an HTTP-based video streaming protocol.

When used in a video conferencing system, for example, transmitting station 102 and/or receiving station 106 may include the ability to both encode and decode video streams, as described below. For example, receiving station 106 may be a video conference participant that receives an encoded video bitstream from a video conference server (e.g., transmitting station 102) to decode and view, and that further encodes and sends its own video bitstream to the video conference server for decoding and viewing by other participants.

Fig. 2 is a block diagram of an example of a computing device 200 that may implement a transmitting station or a receiving station. For example, computing device 200 may implement one or both of transmitting station 102 and receiving station 106 of fig. 1. The computing device 200 may be in the form of a computing system including multiple computing devices, or in the form of a single computing device, such as a mobile phone, a tablet computer, a laptop computer, a notebook computer, a desktop computer, and so forth.

The CPU 202 in the computing device 200 may be a central processing unit. Alternatively, CPU 202 may be any other type of device or devices capable of manipulating or processing information now existing or later developed. Although the disclosed embodiments may be practiced with a single processor, such as CPU 202, as shown, more than one processor may be used to achieve speed and efficiency advantages.

In an embodiment, the memory 204 in the computing device 200 may be a Read Only Memory (ROM) device or a Random Access Memory (RAM) device. Any other suitable type of storage device may be used for memory 204. The memory 204 may include code and data 206 that are accessed by the CPU 202 using the bus 212. The memory 204 may also include an operating system 208 and application programs 210, the application programs 210 including at least one program that allows the CPU 202 to perform the methods described herein. For example, the application programs 210 may include applications 1 through N, which further include a video coding application that performs the methods described herein. Computing device 200 may also include secondary storage 214, which may be, for example, a memory card used with a computing device 200 that is mobile. Since video communication sessions may contain a large amount of information, they may be stored in whole or in part in secondary storage 214 and loaded into memory 204 as needed.

Computing device 200 may also include one or more output devices, such as a display 218. In one example, display 218 may be a touch sensitive display that combines a display with a touch sensitive element operable to sense touch input. The display 218 may be coupled to the CPU 202 via the bus 212. Other output devices that allow a user to program or otherwise use computing device 200 may be provided in addition to or in place of display 218. When the output device is or includes a display, the display may be implemented in various ways, including as a Liquid Crystal Display (LCD), a Cathode Ray Tube (CRT) display, or a Light Emitting Diode (LED) display such as an organic LED (OLED) display.

Computing device 200 may also include or communicate with an image sensing device 220, such as a camera, or any other image sensing device 220 now existing or later developed that is capable of sensing images, such as images of a user operating computing device 200. The image sensing device 220 may be positioned so that it faces the user operating the computing device 200. In an example, the position and optical axis of the image sensing device 220 may be configured such that the field of view includes an area that is directly adjacent to the display 218 and from which the display 218 is visible.

Computing device 200 may also include or be in communication with a sound sensing device 222, such as a microphone, or any other sound sensing device now existing or later developed capable of sensing sound in the vicinity of computing device 200. The sound sensing device 222 can be positioned so that it is directed toward a user operating the computing device 200, and can be configured to receive sound, such as speech or other utterances, emitted by the user while the user is operating the computing device 200.

Although Fig. 2 depicts the CPU 202 and memory 204 of the computing device 200 as integrated into a single unit, other configurations may be utilized. The operations of CPU 202 may be distributed across multiple machines (each machine having one or more processors) that may be coupled directly or across a local area network or other network. Memory 204 may be distributed among multiple machines, such as a network-based memory or memory in multiple machines that perform the operations of computing device 200. Although depicted here as a single bus, the bus 212 of the computing device 200 may be composed of multiple buses. Further, secondary storage 214 may be directly coupled to other components of computing device 200 or may be accessible via a network, and may comprise a single integrated unit such as a memory card or multiple units such as multiple memory cards. Thus, the computing device 200 may be implemented in a variety of configurations.

Fig. 3 is a diagram of an example of a video stream 300 to be encoded and subsequently decoded. The video stream 300 includes a video sequence 302. At the next level, the video sequence 302 includes a number of adjacent frames 304. Although three frames are depicted as the adjacent frames 304, the video sequence 302 may include any number of adjacent frames 304. The adjacent frames 304 may then be further subdivided into individual frames, such as a frame 306. At the next level, the frame 306 may be divided into a series of segments 308 or planes. For example, the segments 308 may be subsets of frames that permit parallel processing. The segments 308 may also be subsets of frames that separate the video data into separate colors. For example, a frame 306 of color video data may include a luminance plane and two chrominance planes. The segments 308 may be sampled at different resolutions.

Regardless of whether frame 306 is divided into segments 308, frame 306 may be further subdivided into blocks 310, which may contain data corresponding to, for example, 16x16 pixels in frame 306. The block 310 may also be arranged to include data from one or more segments 308 of pixel data. The blocks 310 may also have any other suitable size, such as 4x4 pixels, 8x8 pixels, 16x8 pixels, 8x16 pixels, 16x16 pixels, or larger.
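The subdivision of a frame into blocks can be sketched as an enumeration of block rectangles in raster order. This is a minimal sketch under assumptions not stated in the text: blocks are square, and blocks at the right and bottom edges are clipped to the frame boundary. The function name `block_origins` is hypothetical.

```python
def block_origins(frame_width, frame_height, block_size=16):
    """Yield (x, y, width, height) for each block, in raster order.

    Blocks at the right and bottom edges are clipped to the frame,
    which is an assumption of this sketch.
    """
    for y in range(0, frame_height, block_size):
        for x in range(0, frame_width, block_size):
            width = min(block_size, frame_width - x)
            height = min(block_size, frame_height - y)
            yield (x, y, width, height)


# A hypothetical 40x16 frame yields two full blocks and one clipped block.
blocks = list(block_origins(40, 16))
print(blocks)
```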

Fig. 4 is a block diagram of an encoder 400 according to an embodiment of the present disclosure. As described above, encoder 400 may be implemented in transmitting station 102, such as by providing a computer software program stored in a memory, such as memory 204. The computer software program may include machine instructions that, when executed by a processor, such as CPU 202, cause transmitting station 102 to encode video data in the manner described herein. Encoder 400 may also be implemented as dedicated hardware included in, for example, transmitting station 102. The encoder 400 has the following stages to perform various functions in the forward path (illustrated by the solid connecting lines) to generate an encoded or compressed bitstream 420 using the video stream 300 as input: an intra/inter prediction stage 402, a transform stage 404, a quantization stage 406, and an entropy coding stage 408. The encoder 400 may also include a reconstruction path (illustrated by dashed connecting lines) to reconstruct the frame to encode future blocks. In fig. 4, the encoder 400 has the following stages to perform various functions in the reconstruction path: a dequantization stage 410, an inverse transform stage 412, a reconstruction stage 414 and a loop filtering stage 416. Other structural variations of the encoder 400 may be used to encode the video stream 300.

When the video stream 300 is presented for encoding, the frames 306 may be processed in units of blocks. In the intra/inter prediction stage 402, a block may be encoded using intra-frame prediction (also referred to as intra prediction), inter-frame prediction (also referred to as inter prediction), or a combination of both. In any case, a prediction block may be formed. In the case of intra prediction, all or part of a prediction block may be formed from samples in the current frame that have been previously encoded and reconstructed. In the case of inter prediction, all or part of the prediction block may be formed from samples in one or more previously constructed reference frames determined using motion vectors.

Next, still referring to Fig. 4, the prediction block may be subtracted from the current block at the intra/inter prediction stage 402 to generate a residual block (also referred to as a residual). The transform stage 404 transforms the residual into transform coefficients in, for example, the frequency domain using block-based transforms. Such block-based transforms include, for example, the DCT and the asymmetric DST. Other block-based transforms are also possible. Furthermore, a combination of different transforms may be applied to a single residual. In one example of the application of a transform, the DCT transforms the residual block into the frequency domain, where the transform coefficient values are based on spatial frequency. The lowest frequency (DC) coefficient is located at the top left corner of the matrix and the highest frequency coefficient is located at the bottom right corner of the matrix. It is noted that the size of the prediction block, and thus the size of the residual block, may be different from the size of the transform block. For example, a prediction block may be divided into smaller blocks to which separate transforms are applied.
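The placement of the DC coefficient at the top left corner can be checked with a small numerical example. The sketch below applies a textbook orthonormal two-dimensional DCT-II to a square residual block. It is an illustration only: practical codecs typically use fast integer approximations, and the function name `dct_2d` and the sample block are hypothetical.

```python
import math


def dct_2d(block):
    """Orthonormal 2D DCT-II of a square block (direct, unoptimized form)."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):        # vertical frequency index
        for v in range(n):    # horizontal frequency index
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (
                        block[y][x]
                        * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                        * math.cos((2 * x + 1) * v * math.pi / (2 * n))
                    )
            out[u][v] = scale(u) * scale(v) * total
    return out


# A hypothetical flat 4x4 residual: every sample equals 5.
flat_residual = [[5] * 4 for _ in range(4)]
coeffs = dct_2d(flat_residual)

# All of the energy lands in the DC coefficient at the top left corner.
print(round(coeffs[0][0], 6))
```

For a flat block, every coefficient other than the DC coefficient is zero up to floating-point rounding, which is why smooth residuals compress well after the transform.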

The quantization stage 406 converts the transform coefficients into discrete quantum values, referred to as quantized transform coefficients, using a quantizer value or quantization level. For example, the transform coefficients may be divided by the quantizer value and truncated. The quantized transform coefficients are then entropy encoded by the entropy encoding stage 408. Entropy coding may be performed using any number of techniques, including token trees and binary trees. The entropy encoded coefficients are then output to the compressed bitstream 420 together with other information used to decode the block, which may include, for example, the type of prediction used, the transform type, the motion vectors, and the quantizer value. The information used to decode the block may be entropy encoded into block, frame, slice, and/or section headers within the compressed bitstream 420. The compressed bitstream 420 may also be referred to as an encoded video stream or an encoded video bitstream, and these terms will be used interchangeably herein.
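The divide-and-truncate example can be written out directly. The sketch below assumes truncation toward zero and a single quantizer value for all coefficients, and it models dequantization as multiplication by the same quantizer value. The coefficient values and function names are hypothetical.

```python
def quantize(coefficients, quantizer):
    """Divide each coefficient by the quantizer value and truncate toward zero."""
    return [int(c / quantizer) for c in coefficients]


def dequantize(levels, quantizer):
    """Approximate the original coefficients by multiplying back."""
    return [level * quantizer for level in levels]


coefficients = [100, -37, 12, -5, 3, 0]  # hypothetical transform coefficients
quantizer = 8

levels = quantize(coefficients, quantizer)
restored = dequantize(levels, quantizer)

print(levels)    # small integers, many of them zero
print(restored)  # close to, but not equal to, the original coefficients
```

The round trip does not recover the original coefficients, which is the sense in which quantization is the lossy step of the pipeline; small coefficients collapse to zero and become cheap to entropy code.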

The reconstruction path in fig. 4 (shown by the dashed connecting lines) may be used to ensure that both the encoder 400 and the decoder 500 (described below) use the same reference frames and blocks to decode the compressed bitstream 420. The reconstruction path performs functions similar to those occurring during decoding, which will be discussed in more detail below, including dequantizing the quantized transform coefficients at a dequantization stage 410 and inverse transforming the dequantized transform coefficients at an inverse transform stage 412 to produce a block of derivative residues (also referred to as derivative residuals). In the reconstruction stage 414, the prediction block predicted in the intra/inter prediction stage 402 may be added to the derivative residuals to create a reconstructed block. A loop filtering stage 416 may be applied to the reconstructed block to reduce distortion such as blocking artifacts.
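The purpose of the reconstruction path can be sketched on a one-dimensional block of samples. In the sketch below the transform stages are omitted for brevity and a single hypothetical step size stands in for quantization and dequantization. The sample values and function names are assumptions. The encoder forms its reference from the derivative residual rather than from the source samples, so it holds the same values a decoder can compute from the bitstream.

```python
def residual_of(current, prediction):
    """Residual block: current samples minus predicted samples."""
    return [c - p for c, p in zip(current, prediction)]


def derivative_residual(residual, step):
    """Lossy round trip: quantize by truncation, then dequantize."""
    return [int(r / step) * step for r in residual]


def reconstruct(prediction, residual):
    """Reconstructed block: prediction plus derivative residual."""
    return [p + r for p, r in zip(prediction, residual)]


current = [52, 55, 61, 66]     # hypothetical source samples
prediction = [50, 50, 60, 60]  # hypothetical prediction block
step = 4                       # hypothetical quantizer value

residual = residual_of(current, prediction)
derived = derivative_residual(residual, step)
reconstructed = reconstruct(prediction, derived)

print(residual, derived, reconstructed)
```

The reconstructed block differs from the source block, so predicting later blocks from the source would make the encoder and decoder drift apart; predicting from the reconstructed block avoids that.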

Other variations of the encoder 400 may be used to encode the compressed bitstream 420. For example, a non-transform based encoder 400 may quantize the residual signal directly, without the transform stage 404, for certain blocks or frames. In another embodiment, the encoder 400 may combine the quantization stage 406 and the dequantization stage 410 into a single stage.

Fig. 5 is a block diagram of a decoder 500 according to an embodiment of the present disclosure. The decoder 500 may be implemented in the receiving station 106, for example, by providing a computer software program stored in the memory 204. The computer software program may include machine instructions that, when executed by a processor, such as CPU 202, cause receiving station 106 to decode video data in the manner described below. Decoder 500 may also be implemented in hardware included in, for example, transmitting station 102 or receiving station 106.

Similar to the reconstruction path of the encoder 400 discussed above, the decoder 500 in one example includes the following stages to perform various functions to generate the output video stream 516 from the compressed bitstream 420: an entropy decoding stage 502, a dequantization stage 504, an inverse transform stage 506, an intra/inter prediction stage 508, a reconstruction stage 510, a loop filtering stage 512, and a post filtering stage 514. Other structural variations of the decoder 500 may be used to decode the compressed bitstream 420.

When the compressed bitstream 420 is presented for decoding, data elements within the compressed bitstream 420 may be decoded by the entropy decoding stage 502 to produce a set of quantized transform coefficients. The dequantization stage 504 dequantizes the quantized transform coefficients (e.g., by multiplying the quantized transform coefficients by quantizer values), and the inverse transform stage 506 inverse transforms the dequantized transform coefficients using the selected transform type to produce derivative residuals that may be the same as the derivative residuals created by the inverse transform stage 412 in the encoder 400. Using the header information decoded from the compressed bitstream 420, the decoder 500 may use the intra/inter prediction stage 508 to create the same prediction block as was created in the encoder 400, e.g., in the intra/inter prediction stage 402. In the reconstruction stage 510, the prediction block may be added to the derivative residual to create a reconstructed block. A loop filtering stage 512 may be applied to the reconstructed block to reduce blocking artifacts. Other filtering may be applied to the reconstructed block. In an example, the post-filtering stage 514 is applied to the reconstructed block to reduce block distortion, and the result is output as the output video stream 516. The output video stream 516 may also be referred to as a decoded video stream, and these terms will be used interchangeably herein.
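The divide-and-truncate quantization of the quantization stage 406 and the multiply-back dequantization of the dequantization stage 504 can be sketched as follows. This is a minimal illustration only: the function names `quantize` and `dequantize`, the single scalar quantizer value, and the sample coefficients are assumptions made for the example and are not taken from the disclosure.

```python
def quantize(coeffs, q):
    """Divide each transform coefficient by the quantizer value q and truncate toward zero."""
    return [int(c / q) for c in coeffs]


def dequantize(levels, q):
    """Multiply each quantized transform coefficient by the quantizer value q."""
    return [v * q for v in levels]


coeffs = [37, -21, 8, -3, 0]      # hypothetical transform coefficients
levels = quantize(coeffs, 8)      # quantized transform coefficients
recon = dequantize(levels, 8)     # dequantized values (lossy reconstruction)
print(levels, recon)
```

The reconstructed values generally differ from the inputs, which is why the encoder's reconstruction path repeats the same dequantization as the decoder.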

Other variations of the decoder 500 may be used to decode the compressed bitstream 420. For example, the decoder 500 may produce the output video stream 516 without the post-filtering stage 514. In some embodiments of the decoder 500, a post-filtering stage 514 is applied before the loop filtering stage 512. Additionally, or alternatively, encoder 400 includes a deblocking filtering stage in addition to loop filtering stage 416.

In the encoder 400 and the decoder 500, a block of transform coefficients may be determined by transforming residual values according to a transform type. The transform type may be a one-dimensional transform type, including a one-dimensional horizontal transform type, referred to herein as TX_CLASS_HORIZ, in which an identity transform is applied to columns, or a one-dimensional vertical transform type, referred to herein as TX_CLASS_VERT, in which an identity transform is applied to rows. The transform type may also be a two-dimensional transform type, referred to herein as TX_CLASS_2D. Whichever transform type is selected, it is used to transform the residual values into the frequency domain during encoding and to inverse transform from the frequency domain during decoding. As previously described, the transform coefficients may be quantized.

The quantized transform coefficients may be represented using level maps and encoded or decoded using a context-based arithmetic coding method. These coding operations are performed, for example, in the entropy encoding stage 408 of the encoder 400 and the entropy decoding stage 502 of the decoder 500. Context-based arithmetic coding methods encode values using probability models that are selected based on a coding context, also referred to simply as a context. The coding context includes values in a spatial region around the value being coded. Selecting a probability model based on the coding context allows for better modeling of probabilities, given that there is typically a high degree of correlation between coding patterns in a given spatial region. When coding transform values from a block, a context is determined based on a template. The template may be selected based on the transform type.

In level map coding, a transform block is decomposed into multiple level maps, such that the level maps decompose (i.e., reduce) the coding of each transform coefficient value into a series of binary decisions that each correspond to a magnitude level (i.e., a map level). The decomposition may be accomplished using a multi-run process. That is, the transform coefficients of the transform block are decomposed into a series of level maps, which may be level bins, and residuals according to the equation:

$$\mathrm{coefficient}[r][c] = \left\{ \left( \sum_{k=0}^{T} \mathrm{level}_k[r][c] \right) + \mathrm{residual}[r][c] \right\} \cdot \mathrm{sign}[r][c]$$

where

$$\mathrm{residual}[r][c] = \mathrm{absolute}(\mathrm{coefficient}[r][c]) - T - 1$$

and

$$\mathrm{sign}[r][c] = \begin{cases} 1 & \text{if } \mathrm{coefficient}[r][c] > 0 \\ -1 & \text{if } \mathrm{coefficient}[r][c] < 0 \end{cases}$$

In the above equation, coefficient[r][c] is the transform coefficient of the transform block at position (row r, column c), T is the maximum map level, level_k is the level map corresponding to map level k, residual is the coefficient residual map, and sign is the sign map of the transform coefficients. The same equation may be used, such as by a decoder, to recombine the transform coefficients of the transform block from the encoded level_k maps, the residual map residual, and the sign map sign. Level map coding will be further explained with reference to figs. 6 and 7.

Fig. 6 is a diagram illustrating scan orders that may be utilized when coding a block of transform coefficients according to an embodiment of the present disclosure. The scan orders include a zig-zag scan order 601, a horizontal scan order 602, and a vertical scan order 603. In the illustrated example, the blocks are 4x4 blocks, each including 16 values. Each block has four rows, labeled R0-R3 in top-to-bottom order, and four columns, labeled C0-C3 in left-to-right order. Individual positions in each block correspond to individual transform coefficients and can be addressed in the [r, c] format, where r denotes the row number and c denotes the column number. Each of the zig-zag scan order 601, the horizontal scan order 602, and the vertical scan order 603 starts from position [0, 0], and the numbers shown indicate the order in which the positions in the block are visited/processed after position [0, 0] according to the scan order. The zig-zag scan order 601 visits positions in the block along diagonals, proceeding from left to right and from top to bottom. The horizontal scan order 602 proceeds from left to right along each row before moving to the next row in top-to-bottom order. The vertical scan order 603 proceeds from top to bottom along each column before moving to the next column in left-to-right order.
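The three kinds of scan order can be sketched as lists of [r, c] positions. The horizontal and vertical orders follow directly from the description above. The zig-zag order shown here is the conventional alternating-diagonal pattern and is only an assumption, since the exact numbering of fig. 6 is not reproduced in the text. All function names are hypothetical.

```python
def horizontal_scan(rows, cols):
    """Left to right along each row, rows taken top to bottom."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def vertical_scan(rows, cols):
    """Top to bottom along each column, columns taken left to right."""
    return [(r, c) for c in range(cols) for r in range(rows)]


def zigzag_scan(rows, cols):
    """Anti-diagonals visited in alternating directions (assumed pattern)."""
    order = []
    for d in range(rows + cols - 1):
        diag = [(r, d - r) for r in range(rows) if 0 <= d - r < cols]
        order.extend(diag if d % 2 else reversed(diag))
    return order


for name, scan in (("zig-zag", zigzag_scan(4, 4)),
                   ("horizontal", horizontal_scan(4, 4)),
                   ("vertical", vertical_scan(4, 4))):
    print(name, scan[:6])
```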

Fig. 7 is a diagram illustrating the levels of transform coefficient coding using a level map according to an embodiment of the present disclosure. Fig. 7 shows a transform block 704 and a level map representing the transform block, including an end-of-block map 706, a non-zero map 708, a sign map 710, a level-1 map 712, a level-2 map 714, and a coefficient residual or residual map 716.

Transform block 704 is an example of a block of transform coefficients that may be received from a quantization step of an encoder, such as quantization stage 406 of encoder 400 of fig. 4. Transform block 704 includes zero transform coefficients and non-zero transform coefficients. Some non-zero coefficients may be negative.

The end-of-block map 706 indicates the end-of-block location of the transform block 704. The end-of-block position is a position in transform block 704 where no other non-zero values exist, as determined when the transform coefficient positions are visited in the scan order used. Thus, at and after the end of block position, all values from transform block 704 are zero. In the illustrated example, a zig-zag scan order 601 is used, with zero values indicating non-zero coefficients other than the end-of-block position, and one (1) values indicating the end-of-block position. In the illustrated example, the end of block is located at position [2, 2] indicated by a value of one (1) at that position, where the previous non-zero value is indicated by a value of zero.

Non-zero map 708 is a level map that indicates, for each position in transform block 704, whether the corresponding transform coefficient is equal to 0 or a non-zero value. In the illustrated example, non-zero map 708 includes zeros at positions of each transform coefficient that have zero values and are located before the end-of-block position, and non-zero map 708 includes values of one (1) at all positions that have non-zero values in transform block 704. Non-zero map 708 may also be referred to as a zero level map.

The sign map 710 indicates, for each position of the transform block 704 having a non-zero value, whether the corresponding transform coefficient is a positive value or a negative value. In the illustrated example, a value of -1 indicates that the corresponding transform coefficient has a negative value, and a value of one (1) indicates that the corresponding transform coefficient has a positive value. Other symbols, such as zero and one, may be utilized.

The non-zero map 708, the level-1 map 712, the level-2 map 714, and the coefficient residual map 716 define, in combination, the absolute values of the transform coefficients from the transform block 704. Of these maps, the non-zero map 708, the level-1 map 712, and the level-2 map 714 indicate, using only binary values, whether the corresponding transform coefficient from the transform block 704 has an absolute value equal to zero, one, or two, or greater than or equal to three. For each non-zero value, as indicated by the non-zero map 708, the level-1 map 712 includes a value of zero if the absolute value of the corresponding transform coefficient is equal to one, or a value of one if the absolute value of the transform coefficient is greater than or equal to two. For each value indicated as being greater than or equal to two in the level-1 map 712, the level-2 map 714 includes a value of zero if the absolute value of the corresponding transform coefficient is equal to two, or a value of one if the absolute value of the transform coefficient is greater than or equal to three.

In an alternative example, a single level map may replace the non-zero map 708, the level-1 map 712, and the level-2 map 714 by using a two-bit value to indicate, for each transform coefficient from the transform block 704, whether the absolute value of the transform coefficient is equal to zero, one, or two, or greater than or equal to three. In another alternative example, a different number of level maps may be used, in which case the threshold at which residual values are present will change.

In the illustrated example, the coefficient residual map 716 includes the residual of each transform coefficient from the transform block 704. The residual of a transform coefficient is the amount by which the magnitude of the transform coefficient exceeds the magnitude that can be represented in the level maps. In this example, the residual of each transform coefficient from the transform block 704 is calculated as the absolute value of the transform coefficient minus three.
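The decomposition into a non-zero map, a level-1 map, a level-2 map, a residual map, and a sign map, and the recombination of the coefficients from those maps, can be sketched as below. The sample block, the dictionary representation of the maps, and the function names are assumptions for illustration. The sketch also simplifies by omitting the end-of-block map and by recording a non-zero map entry for every position.

```python
def decompose(block):
    """Split a transform block into (non-zero, level-1, level-2, residual, sign) maps."""
    nz, l1, l2, res, sign = ({} for _ in range(5))
    for r, row in enumerate(block):
        for c, v in enumerate(row):
            a = abs(v)
            nz[r, c] = int(a >= 1)           # 0 if zero, 1 if non-zero
            if a >= 1:
                sign[r, c] = 1 if v > 0 else -1
                l1[r, c] = int(a >= 2)       # 0 if |v| == 1, 1 if |v| >= 2
            if a >= 2:
                l2[r, c] = int(a >= 3)       # 0 if |v| == 2, 1 if |v| >= 3
            if a >= 3:
                res[r, c] = a - 3            # residual beyond the level maps
    return nz, l1, l2, res, sign


def recombine(maps, rows, cols):
    """Rebuild coefficients as (sum of level maps + residual) * sign."""
    nz, l1, l2, res, sign = maps

    def mag(p):
        return nz[p] + l1.get(p, 0) + l2.get(p, 0) + res.get(p, 0)

    return [[mag((r, c)) * sign.get((r, c), 0) for c in range(cols)]
            for r in range(rows)]


block = [[-7, 2, 0, 0],
         [3, -1, 0, 0],
         [1, 0, 1, 0],
         [0, 0, 0, 0]]           # hypothetical quantized transform block
maps = decompose(block)
print(recombine(maps, 4, 4) == block)
```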

Fig. 8 is a flow diagram of a process 800 for encoding a transform block in an encoded video bitstream using level mapping according to an embodiment of the present disclosure. Process 800 may be implemented in an encoder, such as encoder 400. The encoded video bitstream may be the compressed bitstream 420 of fig. 4.

Process 800 may be implemented, for example, as a software program executable by a computing device, such as transmitting station 102. The software program may include machine-readable instructions that may be stored in a memory, such as memory 204 or secondary storage 214, and executed by a processor, such as CPU 202, to cause a computing device to perform process 800. In at least some implementations, the process 800 may be performed in whole or in part by the entropy encoding stage 408 of the encoder 400.

Process 800 may be implemented using dedicated hardware or firmware. Some computing devices may have multiple memories, multiple processors, or both. The steps or operations of process 800 may be distributed using different processors, memories, or both. The use of the terms "processor" or "memory" in the singular encompasses computing devices having one processor or one memory and devices having multiple processors or multiple memories that may be used to perform some or all of the above-described steps.

Process 800 may receive a transform block such as transform block 704 of fig. 7. The transform block 704 may be received as output from a quantization step of an encoder, such as the quantization stage 406 of the encoder 400 of fig. 4. Transform block 704 includes zero transform coefficients and non-zero transform coefficients. Some non-zero coefficients may be negative.

In operation 801, an end-of-block position (EOB) is encoded by generating a value indicating the end-of-block position and including that value in the encoded video bitstream. In an embodiment of process 800, operation 801 may include generating an end-of-block map for the transform block, as explained with respect to the end-of-block map 706. At and after the EOB, all coefficients are zero.

In operation 802, a value BL[i] is coded to indicate the magnitude of the transform coefficient, where i indicates the scan position (i = 0 corresponds to the position of the upper left corner, which is commonly referred to as the DC position), and BL[i] indicates whether the magnitude of the quantized coefficient at scan position i is 0, 1, 2, or ≥ 3. The value BL[i] (e.g., 0, 1, 2, or 3) is encoded for each position in reverse scan order, from the position immediately before the end-of-block position (i = EOB-1) to the DC position (i = 0). In some embodiments, the value BL[i] is encoded in operation 802 using level maps, such as the non-zero map 708, the level-1 map 712, and the level-2 map 714.
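The derivation of the end-of-block position and of the BL[i] values in reverse scan order can be sketched as follows. The sample block, the use of a horizontal scan order, and the name `base_level_symbols` are assumptions for the example. The EOB is taken here as one past the scan position of the last non-zero coefficient, so that all coefficients at and after the EOB are zero.

```python
def base_level_symbols(block, scan):
    """Return (eob, [(i, BL[i]), ...]) with entries listed in reverse scan order."""
    mags = [abs(block[r][c]) for r, c in scan]
    # One past the last non-zero scan position; 0 if the block is all zero.
    eob = max((i + 1 for i, m in enumerate(mags) if m), default=0)
    # BL[i] is 0, 1, 2, or 3, where 3 stands for "magnitude >= 3".
    return eob, [(i, min(mags[i], 3)) for i in range(eob - 1, -1, -1)]


block = [[-7, 2, 0, 0],
         [3, -1, 0, 0],
         [1, 0, 0, 0],
         [0, 0, 0, 0]]                                   # hypothetical block
scan = [(r, c) for r in range(4) for c in range(4)]      # horizontal scan order
eob, symbols = base_level_symbols(block, scan)
print(eob, symbols)
```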

The value BL[i] is coded into the video bitstream using a context-based arithmetic coding method in operation 802. Context-based arithmetic coding methods utilize a context model that can be determined based on the binary values of any number of previously coded neighbors and can fully leverage information from all of these neighbors. The previously coded neighbors may be neighbors in the same level map or in a previous level map, such as an immediately previous level map. For example, the level-1 map 712 may provide context information for encoding the level-2 map 714.

In some implementations of process 800, operations 801 and 802 are combined by interleaving the end-of-block map 706 into the non-zero map 708.

In operation 803, residual values, referred to as BR[i], are encoded for all transform coefficients whose absolute values are larger in magnitude than can be represented by the value BL[i], which in this example represents quantized transform coefficients having absolute values of 0, 1, and 2 without using residual values. Thus, in this example, for each quantized transform coefficient whose absolute value is greater than two (e.g., BL[i] = 3), the value BR[i] represents the residual magnitude of the quantized transform coefficient at scan position i and is equal to the magnitude of the quantized transform coefficient at scan position i minus three.

As with the value BL[i], the residual value BR[i] is coded into the video bitstream using a context-based arithmetic coding method in operation 803. Context-based arithmetic coding methods utilize a context model that can be determined based on the binary values of any number of previously coded neighbors and can fully leverage information from all of these neighbors. The previously coded neighbors may be neighbors in the same level map or in a previous level map, such as an immediately previous level map. The residual values BR[i] may be encoded in the encoded video bitstream using binary coding or multi-symbol coding. A probability distribution that fits the statistics of the residual coefficients of the coefficient residual map may be used. The probability distribution may be a geometric distribution, a Laplacian distribution, a Pareto distribution, or any other distribution.

In operation 804, a value indicating whether the sign of the quantized transform coefficient is positive or negative is coded for each non-zero quantized transform coefficient. This value may be referred to as Sign[i], where i represents the scan position and Sign[i] represents the sign of the non-zero coefficient at scan position i. Operation 804 may be performed using the sign map 710. The coding of Sign[i] may be performed using a context-based or non-context-based entropy coding technique.

In some embodiments, the available values of BR[i] may include a maximum value. In an example, when BL[i] equals 3, BR[i] may take any value from 0 to 12. Together with BL[i], this covers absolute values from 0 to 15 of the magnitude L[i] of the quantized transform coefficient at scan position i. A value of 12 for BR[i] may indicate that the magnitude L[i] is greater than or equal to 15. If applicable (i.e., BR[i] has the maximum value and the magnitude L[i] of the quantized transform coefficient is greater than or equal to 15), the magnitude of the quantized transform coefficient minus 15 (L[i] - 15) is encoded in operation 805. The magnitude may be encoded in the encoded video bitstream using binary coding or multi-symbol coding. A probability distribution that fits the statistics of the residual coefficients of the coefficient residual map may be used. The probability distribution may be a geometric distribution, a Laplacian distribution, a Pareto distribution, or any other distribution. The resulting symbols may be coded without any context derivation. That is, the symbols may not be context coded.
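The split of a coefficient magnitude into a base-level value, a capped residual value, and a remainder beyond 15 can be sketched as below. The function name `split_magnitude` and the use of `None` for "not coded" are assumptions for illustration. The thresholds 3, 12, and 15 follow the example above.

```python
def split_magnitude(mag):
    """Split a magnitude L[i] into (BL, BR, remainder); None means the part is not coded."""
    bl = min(mag, 3)                                # 0, 1, 2, or 3 (meaning >= 3)
    br = min(mag - 3, 12) if bl == 3 else None      # residual, capped at 12
    rem = mag - 15 if br == 12 else None            # magnitude minus 15, if applicable
    return bl, br, rem


def join_magnitude(bl, br, rem):
    """Inverse of split_magnitude."""
    return bl + (br or 0) + (rem or 0)


for mag in (0, 2, 3, 14, 15, 40):
    print(mag, split_magnitude(mag))
```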

In some embodiments, the coding of the BL and BR symbols may have its own loop for the current block, followed by another loop for coding the applicable signs and the magnitude minus 15 of the quantized transform coefficients in the same block.

The quantized transform coefficients may be reconstructed using the coded values to verify the encoding after operation 805.

In process 800, spatially neighboring templates may be used to determine context models for use in context-based arithmetic coding methods. For example, in operation 802, a context for coding a value BL[i] is derived by using a spatial template anchored to the block position (r_i, c_i) corresponding to scan position i, where r_i represents a row index and c_i represents a column index.

Fig. 9A is a diagram illustrating a first set of spatially neighboring templates that may be utilized in a context-based arithmetic coding method according to an embodiment of the present disclosure. The horizontal template 901 includes a plurality of context neighbors in the same row as the position of the transform coefficient to be coded (i.e., the transform coefficient to be encoded or decoded, which may be referred to herein as the position to be coded for brevity) and one context neighbor in the same column as the position to be coded. In the illustrated example, the horizontal template 901 includes four context neighbors to the right of the position to be coded and one context neighbor below the position to be coded. The vertical template 902 includes one context neighbor in the same row as the position to be coded and a plurality of context neighbors in the same column as the position to be coded. In the illustrated example, the vertical template 902 includes one context neighbor to the right of the position to be coded and four context neighbors below the position to be coded. The two-dimensional template 903 includes context neighbors arranged in a triangular pattern anchored at the position to be coded. In the illustrated example, the two-dimensional template 903 includes two context neighbors to the right of the position to be coded, two context neighbors below the position to be coded, and one context neighbor diagonally below and to the right of the position to be coded.

The context of BL[i] can be derived using the first set of spatially neighboring templates illustrated in fig. 9A. The context of BR[i] may be derived using the same or a different set of spatially neighboring templates. Fig. 9B is a diagram illustrating a second set of spatially neighboring templates that may be utilized in a context-based arithmetic coding method according to an embodiment of the present disclosure. For example, the context of BR[i] may be derived using the second set of spatially neighboring templates.

In the second set of spatially neighboring templates, the horizontal template 905 includes a plurality of context neighbors in the same row as the position to be coded and one context neighbor in the same column as the position to be coded. In the illustrated example, the horizontal template 905 includes two context neighbors to the right of the position to be coded and one context neighbor below the position to be coded. The vertical template 906 includes one context neighbor in the same row as the position to be coded and a plurality of context neighbors in the same column as the position to be coded. In the illustrated example, the vertical template 906 includes one context neighbor to the right of the position to be coded and two context neighbors below the position to be coded. The two-dimensional template 907 includes context neighbors arranged in a triangular pattern anchored at the position to be coded. In the illustrated example, the two-dimensional template 907 includes one context neighbor to the right of the position to be coded, one context neighbor below the position to be coded, and one context neighbor diagonally below and to the right of the position to be coded.

The spatially neighboring template used in a particular coding operation may be selected based on the transform type used to determine the quantized transform coefficients. For example, if the transform type is a one-dimensional horizontal transform type (TX_CLASS_HORIZ), the horizontal template 901 and/or the horizontal template 905 may be used. If the transform type is a one-dimensional vertical transform type (TX_CLASS_VERT), the vertical template 902 and/or the vertical template 906 may be used. If the transform type is a two-dimensional transform type (TX_CLASS_2D), the two-dimensional template 903 and/or the two-dimensional template 907 may be used.
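The templates of figs. 9A and 9B and their selection by transform class can be sketched as tables of (row offset, column offset) pairs relative to the position to be coded. The offsets follow the textual description of the illustrated examples. The dictionary layout, the function name `template_sum`, the choice of summation as the combining rule, and the treatment of out-of-block neighbors as zero are assumptions made for this sketch.

```python
# Offsets (dr, dc) from the position to be coded; all point right and/or down.
TEMPLATES = {
    "BL": {  # first set (fig. 9A): templates 901, 902, 903
        "TX_CLASS_HORIZ": [(0, 1), (0, 2), (0, 3), (0, 4), (1, 0)],
        "TX_CLASS_VERT": [(1, 0), (2, 0), (3, 0), (4, 0), (0, 1)],
        "TX_CLASS_2D": [(0, 1), (0, 2), (1, 0), (2, 0), (1, 1)],
    },
    "BR": {  # second set (fig. 9B): templates 905, 906, 907
        "TX_CLASS_HORIZ": [(0, 1), (0, 2), (1, 0)],
        "TX_CLASS_VERT": [(1, 0), (2, 0), (0, 1)],
        "TX_CLASS_2D": [(0, 1), (1, 0), (1, 1)],
    },
}


def template_sum(values, r, c, offsets):
    """Sum the neighbor values selected by a template; out-of-block neighbors count as 0."""
    rows, cols = len(values), len(values[0])
    return sum(values[r + dr][c + dc]
               for dr, dc in offsets
               if r + dr < rows and c + dc < cols)


bl = [[3, 1, 0],
      [2, 0, 0],
      [1, 0, 0]]  # hypothetical BL values of an already coded neighborhood
for tx_class, offsets in TEMPLATES["BL"].items():
    print(tx_class, template_sum(bl, 0, 0, offsets))
```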

During context-based encoding, in common implementations such as implementations where the desired values are obtained by table lookup, it may become costly to obtain values to be used as context neighbors in the selected template. That is, for example, there are three transform type classes, each with its own template, and a straightforward implementation may require at least three arrays to store the neighborhood positions for each valid transform size. In the case where the coder specifies a certain number of scan orders, the actual implementation is more complex. In this case, it may be desirable to define the array based on block locations rather than scan locations to avoid reliance on scan order. Thus, table lookups can create performance issues, such as storage bottlenecks. For example, performance issues may be more common when the block size is larger (e.g., a 32x32 transform block, which has 1024 locations).

According to embodiments of the present disclosure, performance may be improved by storing the required information in memory register arrays, referred to herein simply as register arrays. The use of register arrays for this purpose relies on a common property shared by the scan orders used to code the transform coefficients, such as the zig-zag scan order 601, the horizontal scan order 602, and the vertical scan order 603: in scan order, coefficients in a row are visited from left to right, and coefficients in a column are visited from top to bottom. In other words, given a scan order S, let iS[r, c] denote the scan position of the valid block position [r, c], where r denotes the row index and c denotes the column index in the transform block. Then, for any r' > r, iS[r, c] < iS[r', c], and for any c' > c, iS[r, c] < iS[r, c']. Thus, when the level maps are coded in reverse scan order, the context neighbors needed to code the current value have already been visited.
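The scan-order property stated above can be checked mechanically. The sketch below tests only adjacent positions, which is sufficient because the inequality is transitive along a row or a column. The function name and the sample scan orders are assumptions for illustration.

```python
def rows_and_columns_in_order(scan):
    """True if iS[r, c] < iS[r + 1, c] and iS[r, c] < iS[r, c + 1] wherever both exist."""
    pos = {rc: i for i, rc in enumerate(scan)}
    return all(
        pos.get((r + 1, c), i + 1) > i and pos.get((r, c + 1), i + 1) > i
        for (r, c), i in pos.items()
    )


horizontal = [(r, c) for r in range(4) for c in range(4)]
vertical = [(r, c) for c in range(4) for r in range(4)]
diagonal = sorted(horizontal, key=lambda rc: (rc[0] + rc[1], rc[0]))

print(rows_and_columns_in_order(horizontal))
print(rows_and_columns_in_order(vertical))
print(rows_and_columns_in_order(diagonal))
print(rows_and_columns_in_order(horizontal[::-1]))
```

Any scan order that passes this check guarantees that, in reverse scan order, the neighbors to the right of and below a position are coded before that position.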

In the embodiments to be described herein, the information required for context derivation is stored in a register set, for example a register set comprising two or three register arrays, as initially shown in the examples of fig. 10 to 12. Fig. 10 is a diagram showing a first example of a register set corresponding to a horizontal template. Fig. 11 is a diagram showing a first example of a register set corresponding to a vertical template. Fig. 12 is a diagram showing a first example of a register set corresponding to a two-dimensional template. The examples of fig. 10-12 use the first set of spatially neighboring templates of fig. 9A to derive the context of a base-level symbol (i.e., BL [ i ]).

By using a limited memory or register array set, the context information for all locations of the transform block need not be saved in memory. The register array implements template-based coding by saving the values of the locations in the templates used, such as horizontal template 901, vertical template 902, and two-dimensional template 903 in this example. Thus, the register array may correspond to the size and shape of the template used, where each register value corresponds to a particular spatial location (e.g., context neighbor) in the template. Accordingly, the values within the register array may be referred to herein as context neighbor values. In some embodiments, the register arrays include at least a first register array having a first size (e.g., for storing a first number of values) and a second register array having a second size (e.g., for storing a second number of values) different from the first size.

The values in the register array in the register set are initially set to a default value (e.g., zero). Whenever position [ r, c ] exceeds a block boundary, a default value, such as 0, may be used at that position. Once a value is coded (e.g., a symbol is encoded or decoded), the value of the coded location and/or basic information obtained from the level map is used to update the register array for coding the next value.

For a transform of size MxN in TX_CLASS_HORIZ, the number of register arrays in the register set is equal to the number of rows N. In this embodiment, the register set includes an 8-bit register array and a 2-bit register array that hold values corresponding to the horizontal template 901 on a row-by-row basis. The register set is used to code specific values in a row of the transform block. For a transform of size MxN in TX_CLASS_VERT, the number of register arrays in the register set is equal to the number of columns M. In this embodiment, the register set includes an 8-bit register array and a 2-bit register array that hold values corresponding to the vertical template 902 on a column-by-column basis. The register set is used to code specific values in a column of the transform block. For a transform of size MxN in TX_CLASS_2D, the number of register arrays in the register set is equal to the smaller of the number of columns M and the number of rows N. In this embodiment, the register set includes two 4-bit register arrays and one 2-bit register array, which hold values corresponding to the two-dimensional template 903. Thus, on either a column-by-column basis or a row-by-row basis (depending on which dimension of the transform is smaller), the context neighbor values are stored in a register set of context neighbors defined by the shape of the two-dimensional template 903, and the register set is used to code specific values in a column or row of the transform block. The register sets described above all use 2-bit precision to store values (e.g., one 8-bit register stores four 2-bit values, and one 2-bit register stores one 2-bit value). However, it should be understood that values of different precisions may be utilized. It should also be understood that the number of values held in each register array may vary depending on the geometry of the particular spatial template.
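One way to hold four 2-bit values in a single 8-bit register is sketched below. The bit layout (newest value in the two least significant bits) and the function names are assumptions for illustration. The text above states only that one 8-bit register stores four 2-bit values.

```python
def push_2bit(reg8, value):
    """Shift a new 2-bit value into an 8-bit register, dropping the oldest value."""
    return ((reg8 << 2) | (value & 0b11)) & 0xFF


def unpack_2bit(reg8):
    """Return the four stored 2-bit values, newest first."""
    return [(reg8 >> (2 * k)) & 0b11 for k in range(4)]


reg = 0                          # default (all stored values zero)
for v in (1, 3, 2):
    reg = push_2bit(reg, v)
print(unpack_2bit(reg))          # newest first, one slot still at the default

for v in (0, 1):
    reg = push_2bit(reg, v)
print(unpack_2bit(reg))          # the oldest value has been shifted out
```

With this layout, shifting in a newly coded value and discarding the oldest one takes a single shift, OR, and mask.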

To code a transform block determined using TX_CLASS_HORIZ and a context corresponding to the horizontal template 901, there are N register arrays corresponding to rows r = 0 to N-1. Fig. 10 is a diagram showing an example of a register set corresponding to the horizontal template 901. Each register set includes two register arrays, a first register array S0 and a second register array S1, which are respectively defined as:

S0[r,0], S0[r,1], S0[r,2], S0[r,3] and

S1[r,0].

As shown in fig. 10, the register array S0 stores the values, each indicating the magnitude of a transform coefficient, for the positions in the same row as and to the right of scan position i, and the register array S1 stores a single such value for the position in the same column as, and one row below, scan position i. In these examples, scan position i is labeled BL[i] here and in figs. 11-15 discussed below, since it is the value of the transform coefficient at scan position i that is being coded (i.e., encoded or decoded). Thus, scan position i may be referred to herein as the position of the coded value BL[i].

To code a transform block determined using TX_CLASS_VERT and a context corresponding to the vertical template 902, there are M register arrays corresponding to columns c = 0 to M-1. Fig. 11 is a diagram showing an example of a register set corresponding to the vertical template 902. Each register set includes two register arrays, a first register array S0 and a second register array S1, which are respectively defined as:

S0[c,0], S0[c,1], S0[c,2], S0[c,3] and

S1[c,0].

As shown in fig. 11, the register array S0 stores values in the same column as and below the position of the coded value BL[i], and the register array S1 stores a single value in the same row as, and one column to the right of, the position of the coded value BL[i].

Referring to an example in which a transform block has fewer columns M than rows N (i.e., M < N), to code a transform block determined using TX_CLASS_2D and a context corresponding to the two-dimensional template 903, there are M register arrays corresponding to columns c = 0 to M-1. Fig. 12 is a diagram showing an example of a register set corresponding to the two-dimensional template 903. Each register set includes three register arrays, a first register array S0, a second register array S1, and a third register array S2, which are respectively defined as:

S0[c,0], S0[c,1],

S1[c,0], S1[c,1] and

S2[c,0].

As shown in fig. 12, the register array S0 stores values in the same column as the position of the coded value BL[i], the register array S1 stores values one column to the right of the position of the coded value BL[i], and the register array S2 stores a single value located two columns to the right of the position of the coded value BL[i]. In some embodiments, each register set is organized into three arrays, with a first array (for S0) having a size of 2 (i.e., storing two coefficient neighbor values), a second array (for S1) having a size of 2 (i.e., storing two coefficient neighbor values), and a third array (for S2) having a size of 1 (i.e., storing one coefficient neighbor value).

At the start of coding a transform block, a register set is defined according to the transform type used to determine the transform coefficients of the transform block, and all values in the register set are initialized to zero. The value to be coded at scan position i, BL[i], is taken from {0, 1, 2, 3}, and the block position corresponding to scan position i is represented by row r_i and column c_i. At encoding time, the value BL[i] is obtained from basic information such as a level map. At decoding time, the input is the part of the encoded bitstream from which the value BL[i] is derived. The context for entropy coding the value BL[i] is determined by combining (e.g., summing) the values from the register arrays, which represent the spatial context neighbors of the value BL[i] according to the template corresponding to the transform type.

If the transform type is one in TX_CLASS_HORIZ, the context used to code the value BL[i] is derived from:

S0[r_i,0] + S0[r_i,1] + S0[r_i,2] + S0[r_i,3] + S1[r_i,0].

After coding the value BL[i], the register array values are updated as follows:

S0[r_i,0] = BL[i],

S0[r_i,1] = S0[r_i,0],

S0[r_i,2] = S0[r_i,1],

S0[r_i,3] = S0[r_i,2], and

S1[r_i,0] = BL[iS[r_i+1, c_i-1]].

In summary, the values in the register arrays are updated to assume the values of their immediate neighbors, which in this case are the values at the positions immediately to the left of the positions represented by each value in the register array. For register array S0, the first value S0[r_i,0] is updated to the value BL[i] that was just coded. The remaining values in register array S0 assume the values previously held by the preceding entries in the register array (i.e., the values in the register array are shifted by one position). For register array S1, the basic information obtained from the level map is used to update the single value, i.e., the value for the cell one row below and one column to the left of the position of the just-coded value BL[i], which is denoted BL[iS[r_i+1, c_i-1]]. After the update, the register set is ready for coding the next value in the same row (i.e., the value at the position immediately to the left of the position of the value BL[i] that was just coded).
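The context derivation and update for TX_CLASS_HORIZ in this first example can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names, the list-based registers, and the representation of the level map as a list of rows are assumptions made here, and the update equations are read as a simultaneous shift. The text does not say how S1 is seeded for the first value coded in a row, so the sketch leaves it at the zero default.

```python
def horiz_context(S0, S1, r):
    """Sum the five stored neighbor values for row r (horizontal template)."""
    return sum(S0[r]) + S1[r][0]


def horiz_update(S0, S1, level_map, r, c, bl):
    """Update row r's registers after coding value bl at block position (r, c)."""
    # Shift S0: the just-coded value enters and the oldest value drops out.
    S0[r] = [bl] + S0[r][:3]
    # Refill S1 from the level map with the value one row below and one column
    # to the left, i.e. below the next position to be coded in this row.
    # Cells outside the block are treated as zero (an assumption of this sketch).
    n_rows = len(level_map)
    if r + 1 < n_rows and c - 1 >= 0:
        S1[r][0] = level_map[r + 1][c - 1]
    else:
        S1[r][0] = 0
```

For a 2x3 level map [[3, 1, 2], [1, 2, 3]], coding the top row from right to left gives a context sum of 4 for the middle position: the value 2 to its right plus the value 2 below it.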

If the transform type is one in TX_CLASS_VERT, the context used to code the value BL[i] is derived from:

S0[c_i,0] + S0[c_i,1] + S0[c_i,2] + S0[c_i,3] + S1[c_i,0].

After coding the value BL[i], the register array is updated as follows:

S0[c_i,0] = BL[i],

S0[c_i,1] = S0[c_i,0],

S0[c_i,2] = S0[c_i,1],

S0[c_i,3] = S0[c_i,2], and

S1[c_i,0] = BL[iS[r_i-1, c_i+1]].

In summary, the values in the register arrays are updated to assume the values of their immediate neighbors, which in this case are the values at the positions directly above the positions represented by each value in the register array. For register array S0, the first value S0[c_i,0] is updated to the value BL[i] that was just coded. The remaining values in register array S0 assume the values previously held by the preceding entries in the register array (i.e., the values in the register array are shifted by one position). For register array S1, the single value is updated using the basic information obtained from the level map, i.e., using the value for the cell one row above and one column to the right of the position of the value BL[i] just coded, which is denoted BL[iS[r_i-1, c_i+1]]. After the update, the register set is ready for coding the next value in the same column (i.e., the value at the position directly above the position of the value BL[i] that was just coded).

If the transform type is one in TX_CLASS_2D, continuing with the example where the transform block has fewer columns M than rows N (i.e., M < N), the context for coding the value BL[i] is derived from:

S0[c_i,0] + S0[c_i,1] + S1[c_i,0] + S1[c_i,1] + S2[c_i,0].

After coding the value BL[i], the register array is updated as follows:

S0[c_i,0] = BL[i],

S0[c_i,1] = S0[c_i,0],

S1[c_i,0] = BL[iS[r_i-1, c_i+1]],

S1[c_i,1] = S1[c_i,0], and

S2[c_i,0] = BL[iS[r_i-1, c_i+2]].

In summary, the values in the register arrays are updated to assume the values of their immediate neighbors, which in this case are the values at the positions directly above the positions represented by each value in the register array. For register array S0, the first value S0[c_i,0] is updated to the value BL[i] that was just coded, and the second value S0[c_i,1] assumes the value previously held by S0[c_i,0] (i.e., the value in the register array is shifted by one position). For the second register array S1, the first value is updated using the basic information obtained from the level map, i.e., using the value for the cell one row above and one column to the right of the position of the just-coded value BL[i], which is denoted BL[iS[r_i-1, c_i+1]], and the second value assumes the value previously held by the first value S1[c_i,0]. For the third register array S2, the single value is updated using the basic information obtained from the level map, i.e., using the value for the cell one row above and two columns to the right of the position of the value BL[i] just coded, which is denoted BL[iS[r_i-1, c_i+2]]. After the update, the register set is ready for coding the next value in the same column (i.e., the value at the position directly above the position of the value BL[i] that was just coded). In examples where the number of rows is less than the number of columns, the register arrays may instead be defined on a row-by-row basis, where each register set is used to provide context for coding values in a row.

When the base-range symbols (i.e., BR[i]) are context-coded, the contexts may be derived in a manner similar to that for the base-level symbols described above with respect to FIGS. 10-12. Alternatively, different templates, such as those shown in FIG. 9B, may be used to derive the context, with the register arrays adapted to the number and location of the context neighbors. As with the register sets described above with respect to FIGS. 10-12, only information obtained from the register set is used to update a first set of register values (e.g., a register array), and basic information obtained from, for example, the level map is used to update a second set of register values (e.g., a register array). This embodiment reduces the reliance on basic information, which can improve efficiency and avoid performance issues such as storage bottlenecks, while accurately modeling the context of each value being coded.

A second example of storing information required for context derivation in a register set is shown in FIGS. 13 to 15. FIG. 13 is a diagram showing a second example of a register set corresponding to a horizontal template. FIG. 14 is a diagram showing a second example of a register set corresponding to a vertical template. FIG. 15 is a diagram showing a second example of a register set corresponding to a two-dimensional template. In FIGS. 13 to 15, the first template set of FIG. 9A is used as an example. This embodiment avoids accessing previously coded basic information other than the coded value BL[i] in order to update the register arrays after the value BL[i] is coded. Instead, values that are not obtained by shifting other values through the register array are based on the value BL[i]. This embodiment may be preferable where the cost of accessing the basic information is high.

For a transform of size MxN in TX_CLASS_HORIZ, an 8-bit register array may be defined to hold four 2-bit values per row and a 2-bit register array may be defined to hold one 2-bit value per column, holding values that spatially correspond to the horizontal template 901, where the values to the right of the coded value are stored on a row-by-row basis and the value below the coded value is stored on a column-by-column basis. For a transform of size MxN in TX_CLASS_VERT, an 8-bit register array may be defined to hold four 2-bit values per column and a 2-bit register array may be defined to hold one 2-bit value per row, holding values that spatially correspond to the vertical template 902, where the values below the coded value are stored on a column-by-column basis and the value to the right of the coded value is stored on a row-by-row basis. For a transform of size MxN in TX_CLASS_2D, a 4-bit register array may be defined to hold two 2-bit values per row, a 4-bit register array may be defined to hold two 2-bit values per column, and a 2-bit register array may be defined to hold one 2-bit value per diagonal, holding values that spatially correspond to the two-dimensional template 903, where the values to the right of the coded value are stored on a row-by-row basis, the values below the coded value are stored on a column-by-column basis, and the value diagonally below and to the right of the coded value is stored on a diagonal-by-diagonal basis. The foregoing example uses registers with 2-bit precision for each value. However, it should be understood that values of different precision may be used. It should also be understood that the number of values held in each register may vary depending on the geometry of the particular spatial template.
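One way to picture an 8-bit register holding four 2-bit values is as a packed integer that is shifted by one field on each update. The Python sketch below is illustrative only; the field layout (newest value in the lowest two bits) and the function names are assumptions, since the text does not fix a bit layout.

```python
MASK8 = 0xFF  # an 8-bit register holding four 2-bit fields


def push_2bit(reg, value):
    """Shift the register by one 2-bit field and place value in the lowest field."""
    if not 0 <= value <= 3:
        raise ValueError("value must fit in 2 bits")
    # The oldest field is shifted past bit 7 and discarded by the mask.
    return ((reg << 2) | value) & MASK8


def sum_2bit_fields(reg):
    """Add up the four 2-bit fields stored in the register."""
    return sum((reg >> shift) & 0x3 for shift in (0, 2, 4, 6))
```

Pushing a fifth value discards the oldest one, which matches the shift-by-one-position behavior of a four-entry register array.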

To code a transform block of size MxN determined using TX_CLASS_HORIZ and the context corresponding to the horizontal template 901, a first register array S0 is defined for each row (i.e., for r = 0, 1, …, N-1) and a second register array S1 is defined for each column (i.e., for c = 0, 1, …, M-1). The first register array S0 and the second register array S1 are defined as:

for r = 0, 1, …, N-1: S0[r,0], S0[r,1], S0[r,2], S0[r,3], and

for c = 0, 1, …, M-1: S1[c,0].

As shown in FIG. 13, register array S0 includes values in the same row as, and to the right of, the position of the coded value BL[i], and register array S1 includes a single value in the same column as, and one row below, the position of the coded value BL[i].

To code a transform block of size MxN determined using TX_CLASS_VERT and the context corresponding to the vertical template 902, a first register array S0 is defined for each column (i.e., for c = 0, 1, …, M-1) and a second register array S1 is defined for each row (i.e., for r = 0, 1, …, N-1). The first register array S0 and the second register array S1 are defined as:

for c = 0, 1, …, M-1: S0[c,0], S0[c,1], S0[c,2], S0[c,3], and

for r = 0, 1, …, N-1: S1[r,0].

As shown in FIG. 14, register array S0 includes values in the same column as, and below, the position of the coded value BL[i], and register array S1 includes a single value in the same row as, and one column to the right of, the position of the coded value BL[i].

To code a transform block of size MxN determined using TX_CLASS_2D and the context corresponding to the two-dimensional template 903, a first register array S0 is defined for each column (i.e., for c = 0, 1, …, M-1), a second register array S1 is defined for each row (i.e., for r = 0, 1, …, N-1), and a third register array S2 is defined for each diagonal. The first register array S0, the second register array S1, and the third register array S2 are respectively defined as:

for c = 0, 1, …, M-1: S0[c,0], S0[c,1],

for r = 0, 1, …, N-1: S1[r,0], S1[r,1], and

for d = 0, 1, …, M+N-2: S2[d,0].

In the foregoing definition of the third register array S2, d is the index of a diagonal and may be determined from the row index r and the column index c as follows:

if (r = c), then d = 0, and

if (r ≠ c), then d = 2 × abs(r - c) + (r < c).

In defining the index d, the expression (r < c) evaluates to zero if r < c is false and to one if r < c is true. Note that any bijective mapping of (r - c) to {0, 1, …, M+N-2} can be used to define the index d. Where the use of negative indices is allowed, either r - c or c - r may be used directly as the definition of the index d. As shown in FIG. 15, the first register array S0 includes values in the same column as, and below, the position of the coded value BL[i], the second register array S1 includes values in the same row as, and to the right of, the position of the coded value BL[i], and the third register array S2 includes a single value located diagonally below and to the right of the position of the coded value BL[i].
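The diagonal index defined above can be written as a small Python function. This is only a sketch of the stated formula; the function name is a hypothetical choice.

```python
def diag_index(r, c):
    """Return the diagonal index d for block position (r, c)."""
    if r == c:
        return 0
    # (r < c) contributes 1 when the position lies above the main diagonal.
    return 2 * abs(r - c) + int(r < c)
```

Positions on the same diagonal share an index because d depends only on the difference r - c; for example, (0, 1) and (2, 3) both map to the same value.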

At the start of coding a transform block, a register set is defined according to the transform type used to determine the transform coefficients of the transform block, and all values in the register arrays are initialized to zero. The value BL[i] is coded for scan position i, where the value BL[i] is an integer from {0, 1, 2, 3}. As in the previous example, the block position corresponding to scan position i is represented by row r_i and column c_i. At encoding time, the value BL[i] is obtained from basic information such as a level map. At decoding time, the value BL[i] is derived from a portion of the encoded bitstream using the context. The context used for coding the value BL[i] is determined by summing the values from the register arrays, which represent the spatial context neighbors of the value BL[i] according to the template corresponding to the transform type.

Referring to FIG. 13, if the transform type is one in TX_CLASS_HORIZ, the context for coding the value BL[i] is derived from:

S0[r_i,0] + S0[r_i,1] + S0[r_i,2] + S0[r_i,3] + S1[c_i,0].

After coding the value BL[i], the register array is updated as follows:

S0[r_i,0] = BL[i],

S0[r_i,1] = S0[r_i,0],

S0[r_i,2] = S0[r_i,1],

S0[r_i,3] = S0[r_i,2], and

S1[c_i,0] = BL[i].

For register array S0, the first value S0[r_i,0] is updated to the value BL[i] that was just coded. The remaining values in register array S0 assume the values previously held by the preceding entries in the register array (i.e., the values in the register array are shifted by one position). For register array S1, the single value is updated to the value BL[i] that was just coded.
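Because every register update in this second example uses only BL[i], the whole derivation for TX_CLASS_HORIZ fits in a short loop. The Python sketch below assumes a reverse raster scan (bottom row first, each row right to left), a level map given as a list of rows, and zero for neighbors outside the block; these choices and all function names are assumptions for illustration. The helper template_sum reads the same five neighbors directly from the level map so the two results can be compared.

```python
def horiz_contexts(level_map):
    """Return {(r, c): context sum} using per-row S0 and per-column S1."""
    n_rows, n_cols = len(level_map), len(level_map[0])
    S0 = [[0, 0, 0, 0] for _ in range(n_rows)]  # four values to the right
    S1 = [0] * n_cols                           # one value below
    contexts = {}
    for r in reversed(range(n_rows)):
        for c in reversed(range(n_cols)):
            contexts[(r, c)] = sum(S0[r]) + S1[c]
            bl = level_map[r][c]                # the value BL[i] just coded
            S0[r] = [bl] + S0[r][:3]            # shift by one, newest first
            S1[c] = bl
    return contexts


def template_sum(level_map, r, c):
    """Reference: sum the template neighbors read from the level map."""
    def at(rr, cc):
        inside = rr < len(level_map) and cc < len(level_map[0])
        return level_map[rr][cc] if inside else 0
    return sum(at(r, c + k) for k in range(1, 5)) + at(r + 1, c)
```

Under these assumptions the register-derived sums agree with the direct template sums at every position, without the update step ever reading the level map for anything other than the value just coded.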

Referring to FIG. 14, if the transform type is one in TX_CLASS_VERT, the context for coding the value BL[i] is derived from:

S0[c_i,0] + S0[c_i,1] + S0[c_i,2] + S0[c_i,3] + S1[r_i,0].

After coding the value BL[i], the register array is updated as follows:

S0[c_i,0] = BL[i],

S0[c_i,1] = S0[c_i,0],

S0[c_i,2] = S0[c_i,1],

S0[c_i,3] = S0[c_i,2], and

S1[r_i,0] = BL[i].

For register array S0, the first value S0[c_i,0] is updated to the value BL[i] that was just coded. The remaining values in register array S0 assume the values previously held by the preceding entries in the register array (i.e., the values in the register array are shifted by one position). In other words, the register array S0 is updated in a first-in-first-out (FIFO) manner by shifting out the oldest value and adding the value BL[i] as the newest entry. For register array S1, the single value is updated to the value BL[i] that was just coded.

If the transform type is one in TX_CLASS_2D, the context for coding the value BL[i] is derived from:

S0[c_i,0] + S0[c_i,1] + S1[r_i,0] + S1[r_i,1] + S2[d_i,0].

After coding the value BL[i], the register array is updated as follows:

S0[c_i,0] = BL[i],

S0[c_i,1] = S0[c_i,0],

S1[r_i,0] = BL[i],

S1[r_i,1] = S1[r_i,0], and

S2[d_i,0] = BL[i], where

if (r_i = c_i), then d_i = 0, and

if (r_i ≠ c_i), then d_i = 2 × abs(r_i - c_i) + (r_i < c_i).

In defining the index d_i, the expression (r_i < c_i) evaluates to zero if r_i < c_i is false and to one if r_i < c_i is true.

For the first register array S0, the first value S0[c_i,0] is updated to the value BL[i] that was just coded, and the second value S0[c_i,1] is updated to the value previously held by the first value S0[c_i,0] (i.e., the value is shifted). For the second register array S1, the first value S1[r_i,0] is updated to the value BL[i] that was just coded, and the second value S1[r_i,1] is updated to the value previously held by the first value S1[r_i,0] (i.e., the value is shifted). For the third register array S2, the single value is updated to the value BL[i] that was just coded.
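The two-dimensional case of this second example can be sketched the same way. The Python code below is an illustration under stated assumptions: a reverse raster scan, a level map given as a list of rows, zero for neighbors outside the block, and the diagonal index d = r - c + (M - 1), which is one of the bijective mappings of (r - c) that the text permits rather than the formula given above. All function names are hypothetical.

```python
def two_d_contexts(level_map):
    """Return {(r, c): context sum} using column, row and diagonal registers."""
    n_rows, n_cols = len(level_map), len(level_map[0])
    S0 = [[0, 0] for _ in range(n_cols)]   # per column: two values below
    S1 = [[0, 0] for _ in range(n_rows)]   # per row: two values to the right
    S2 = [0] * (n_rows + n_cols - 1)       # per diagonal: one value
    contexts = {}
    for r in reversed(range(n_rows)):
        for c in reversed(range(n_cols)):
            d = r - c + (n_cols - 1)       # assumed diagonal index, 0..M+N-2
            contexts[(r, c)] = sum(S0[c]) + sum(S1[r]) + S2[d]
            bl = level_map[r][c]           # the value BL[i] just coded
            S0[c] = [bl, S0[c][0]]         # shift, newest value first
            S1[r] = [bl, S1[r][0]]
            S2[d] = bl
    return contexts


def two_d_template_sum(level_map, r, c):
    """Reference: sum the five template neighbors read from the level map."""
    def at(rr, cc):
        inside = rr < len(level_map) and cc < len(level_map[0])
        return level_map[rr][cc] if inside else 0
    return (at(r + 1, c) + at(r + 2, c) + at(r, c + 1) + at(r, c + 2)
            + at(r + 1, c + 1))
```

The reference helper assumes the template covers two positions below, two to the right, and one diagonally below and to the right, as the register definitions above suggest.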

A third example of storing information required for context derivation in a register set is shown in FIGS. 16 to 18. FIG. 16 is a diagram showing a third example of a register set corresponding to a horizontal template. FIG. 17 is a diagram showing a third example of a register set corresponding to a vertical template. FIG. 18 is a diagram showing a third example of a register set corresponding to a two-dimensional template. The examples of FIGS. 16-18 use the second set of spatially neighboring templates of FIG. 9B to derive a context for a base-range symbol (i.e., BR[i]). This embodiment avoids accessing previously coded basic information other than the coded value BR[i] in order to update the register set after the value BR[i] is coded. Instead, values that are not obtained by shifting other values through the register array are based on the value BR[i]. This embodiment may be preferable where the cost of accessing the basic information is high.

For a transform of size MxN in TX_CLASS_HORIZ, an 8-bit register array may be defined to hold two 4-bit values per row and a 4-bit register array may be defined to hold one 4-bit value per column, holding values that spatially correspond to the horizontal template 905, where the values to the right of the coded value are stored on a row-by-row basis and the value below the coded value is stored on a column-by-column basis. For a transform of size MxN in TX_CLASS_VERT, an 8-bit register array may be defined to hold two 4-bit values per column and a 4-bit register array may be defined to hold one 4-bit value per row, holding values that spatially correspond to the vertical template 906, where the values below the coded value are stored on a column-by-column basis and the value to the right of the coded value is stored on a row-by-row basis. For a transform of size MxN in TX_CLASS_2D, a 4-bit register array may be defined to hold one 4-bit value per row, a 4-bit register array may be defined to hold one 4-bit value per column, and a 4-bit register array may be defined to hold one 4-bit value per diagonal, holding values that spatially correspond to the two-dimensional template 907, where the value to the right of the coded value is stored on a row-by-row basis, the value below the coded value is stored on a column-by-column basis, and the value diagonally below and to the right of the coded value is stored on a diagonal-by-diagonal basis. The foregoing example uses register arrays with 4-bit precision for each value. However, it should be understood that values of different precision may be used. It should also be understood that the number of values held in each register array may vary depending on the geometry of the particular spatial template.

To code a transform block of size MxN determined using TX_CLASS_HORIZ and the context corresponding to the horizontal template 905, a first register array S0 is defined for each row (i.e., for r = 0, 1, …, N-1) and a second register array S1 is defined for each column (i.e., for c = 0, 1, …, M-1). The first register array S0 and the second register array S1 are defined as:

for r = 0, 1, …, N-1: S0[r,0], S0[r,1], and

for c = 0, 1, …, M-1: S1[c,0].

As shown in FIG. 16, the register array S0 includes values located in the same row as, and to the right of, scan position i, and the register array S1 includes a single value located in the same column as, and one row below, scan position i. Each stored value is one of the one or more coded values that indicate the magnitude of a transform coefficient. In these examples, scan position i is labeled BR[i] here and in FIGS. 17 and 18, because it is the value BR[i] of the transform coefficient at scan position i that is being coded (i.e., either encoded or decoded). Thus, scan position i may be referred to herein as the position of the coded value BR[i].

To code a transform block of size MxN determined using TX_CLASS_VERT and the context corresponding to the vertical template 906, a first register array S0 is defined for each column (i.e., for c = 0, 1, …, M-1) and a second register array S1 is defined for each row (i.e., for r = 0, 1, …, N-1). The first register array S0 and the second register array S1 are defined as:

for c = 0, 1, …, M-1: S0[c,0], S0[c,1], and

for r = 0, 1, …, N-1: S1[r,0].

As shown in FIG. 17, register array S0 includes values in the same column as, and below, the position of the coded value BR[i], and register array S1 includes a single value in the same row as, and one column to the right of, the position of the coded value BR[i].

To code a transform block of size MxN determined using TX_CLASS_2D and the context corresponding to the two-dimensional template 907, a first register array S0 is defined for each column (i.e., for c = 0, 1, …, M-1), a second register array S1 is defined for each row (i.e., for r = 0, 1, …, N-1), and a third register array S2 is defined for each diagonal. The first register array S0, the second register array S1, and the third register array S2 are respectively defined as:

for c = 0, 1, …, M-1: S0[c,0],

for r = 0, 1, …, N-1: S1[r,0], and

for d = 0, 1, …, M+N-2: S2[d,0], where

if (r = c), then d = 0, and

if (r ≠ c), then d = 2 × abs(r - c) + (r < c).

In defining the index d, the expression (r < c) evaluates to zero if r < c is false and to one if r < c is true. Note that any bijective mapping of (r - c) to {0, 1, …, M+N-2} can be used to define the index d. Where the use of negative indices is allowed, either r - c or c - r may be used directly as the definition of the index d.

As shown in FIG. 18, the first register array S0 includes a value in the same column as, and below, the position of the coded value BR[i], the second register array S1 includes a value in the same row as, and to the right of, the position of the coded value BR[i], and the third register array S2 includes a single value located diagonally below and to the right of the position of the coded value BR[i].

At the start of coding a transform block, a register set is defined according to the transform type used to determine the transform coefficients of the transform block. Similar to the description above with respect to FIGS. 10-15, the values in the register arrays of the register set for the current block being encoded or decoded are initially set to a default value (e.g., zero). The value BR[i] of the transform coefficient at scan position i is a value from {0, 1, 2, …, 12}. The block position corresponding to scan position i is represented by row r_i and column c_i. At encoding time, the coded value BR[i] is obtained from basic information such as a level map. At decoding time, entropy coding is used to derive the coded value BR[i] from the encoded bitstream. The context used for coding the value BR[i] is determined by summing the values from the register arrays, which represent the spatial context neighbors of the position of the value BR[i] according to the template corresponding to the transform type.

Referring to FIG. 16, if the transform type is one in TX_CLASS_HORIZ, the context for coding the value BR[i] is derived from:

S0[r_i,0] + S0[r_i,1] + S1[c_i,0].

After coding the value BR[i], the register array is updated as follows:

S0[r_i,0] = BR[i],

S0[r_i,1] = S0[r_i,0], and

S1[c_i,0] = BR[i].

For register array S0, the first value S0[r_i,0] is updated to the value BR[i] that was just coded. The remaining value in register array S0 assumes the value previously held by the preceding entry in the register array (i.e., the values in the register array are shifted by one position). For register array S1, the single value is updated to the value BR[i] that was just coded.

Referring to FIG. 17, if the transform type is one in TX_CLASS_VERT, the context for coding the value BR[i] is derived from:

S0[c_i,0] + S0[c_i,1] + S1[r_i,0].

After coding the value BR[i], the register array is updated as follows:

S0[c_i,0] = BR[i],

S0[c_i,1] = S0[c_i,0], and

S1[r_i,0] = BR[i].

For register array S0, the first value S0[c_i,0] is updated to the value BR[i] that was just coded. The remaining value in register array S0 assumes the value previously held by the preceding entry in the register array (i.e., the values in the register array are shifted by one position). In other words, the register array S0 is updated in a first-in-first-out (FIFO) manner by shifting out the oldest value and adding the value BR[i] as the newest entry. For register array S1, the single value is updated to the value BR[i] that was just coded.
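The FIFO behavior for the base-range values under TX_CLASS_VERT can be sketched in Python as follows. This is a minimal illustration, assuming a reverse raster scan, a map of BR values (0 to 12) given as a list of rows, and zero for neighbors outside the block; the function names are hypothetical and the reference helper assumes the vertical template covers two positions below and one to the right.

```python
def vert_br_contexts(range_map):
    """Return {(r, c): context sum} using per-column S0 and per-row S1."""
    n_rows, n_cols = len(range_map), len(range_map[0])
    S0 = [[0, 0] for _ in range(n_cols)]   # per column: two values below
    S1 = [0] * n_rows                      # per row: one value to the right
    contexts = {}
    for r in reversed(range(n_rows)):
        for c in reversed(range(n_cols)):
            contexts[(r, c)] = sum(S0[c]) + S1[r]
            br = range_map[r][c]           # the value BR[i] just coded
            S0[c] = [br, S0[c][0]]         # FIFO: oldest out, newest in
            S1[r] = br
    return contexts


def vert_br_template_sum(range_map, r, c):
    """Reference: sum the three template neighbors read from the map."""
    def at(rr, cc):
        inside = rr < len(range_map) and cc < len(range_map[0])
        return range_map[rr][cc] if inside else 0
    return at(r + 1, c) + at(r + 2, c) + at(r, c + 1)
```

As in the base-level case, the registers alone reproduce the template sums under these assumptions, so no previously coded value other than BR[i] has to be fetched during the update.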

If the transform type is one in TX_CLASS_2D, the context for coding the value BR[i] is derived from:

S0[c_i,0] + S1[r_i,0] + S2[d_i,0].

After coding the value BR[i], the register array is updated as follows:

S0[c_i,0] = BR[i],

S1[r_i,0] = BR[i], and

S2[d_i,0] = BR[i], where

if (r_i = c_i), then d_i = 0, and

if (r_i ≠ c_i), then d_i = 2 × abs(r_i - c_i) + (r_i < c_i).

For the first register array S0, the single value S0[c_i,0] is updated to the value BR[i] that was just coded. For the second register array S1, the single value S1[r_i,0] is updated to the value BR[i] that was just coded. For the third register array S2, the single value is updated to the value BR[i] that was just coded.

As mentioned previously, the embodiments described with respect to FIGS. 13-18 avoid accessing previously coded basic information other than the coded value BL[i] or BR[i] to update the registers after coding. This eliminates potentially costly access to the basic information compared to the embodiments of FIGS. 10-12. To further reduce processing, it is desirable to have a single register set that fits all transform sizes and likewise does not depend on the scan order of the level map.

According to this further embodiment, one or more spatial templates for determining the context neighbors of quantized transform coefficients may be selected. Selecting the one or more spatial templates may include determining the transform type(s) that may be used to generate the quantized transform coefficients in the transform block being coded. That is, the transform type indicates which spatial template may be used to select context neighbors for coding one or more values representing the magnitude of a transform coefficient, here the values BL[i] and optionally the values BR[i]. According to examples described herein, the transform type may be a one-dimensional horizontal transform type, such as one from TX_CLASS_HORIZ; a one-dimensional vertical transform type, such as one from TX_CLASS_VERT; or a two-dimensional transform type, such as one from TX_CLASS_2D. The spatial templates of FIGS. 9A and 9B may be used in this example.

One or more spatial templates may be used to define the register arrays. In an example where the spatial templates of FIGS. 9A and 9B may be used for the transform block being coded, a register set of 5 register arrays is defined to derive the contexts of the base-level symbols (i.e., BL[i]) and the base-range symbols (i.e., BR[i]). Defining the register set includes determining the number and size of the register arrays forming the register set. The number and size of the register arrays may be defined based on the size and shape of the spatial templates and the maximum expected size MxN of the transform block. In general, the cardinality (i.e., number) of the register arrays may be equal to the number of context neighbors defined in the spatial template. The cardinality of the register arrays may be one plus the maximum number of context neighbors in a row or column of the one or more spatial templates. In the illustrated example, the maximum number of context neighbors along the vertical or horizontal dimension is 4, and the templates 903, 907 include diagonal contexts. Thus, the register set includes 5 register arrays.

The maximum expected size of the transform block, MxN, may be used to define the size of the register arrays. For example, the larger dimension of the maximum expected size may be used as the number of elements (i.e., context neighbor values) for 4 of the register arrays (where 4 is the maximum vertical or horizontal dimension of the spatial templates). In an example where the maximum expected size (also referred to as the maximum transform size) is 32x32, 4 of the 5 register arrays each have 32 elements. The number of elements (i.e., context neighbor values) of the fifth and final array is the number of diagonals expected for the maximum transform size. In this example, the remaining array has 63 elements, indexed by R - C + 31 within the range {0, 1, …, 62}, where R is the row position of the value BL or BR to be coded, within the range {0, 1, …, 31}, and C is the column position of the value BL or BR to be coded, within the range {0, 1, …, 31}. In some cases (e.g., in a software implementation as opposed to a hardware implementation), the remaining array may have 64 elements (more generally, a multiple of 2 elements) instead of 63 elements. More generally, a single one of the register arrays has a size sufficient to store a number of stored values (i.e., a cardinality) corresponding to the number of diagonals of the maximum available transform size, and the remaining ones of the register arrays have an array size sufficient to store a number of stored values corresponding to the largest dimension of the maximum available transform size.
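The diagonal sizing for a 32x32 maximum transform can be checked with a few lines of Python. This is a sketch only; the names MAX_TX and diag_slot are hypothetical.

```python
MAX_TX = 32  # maximum transform dimension in the text's example


def diag_slot(r, c):
    """Map block position (r, c) to a slot of the diagonal register array."""
    return r - c + (MAX_TX - 1)


# Every position of a 32x32 block lands in one of 63 slots, numbered 0..62.
all_slots = {diag_slot(r, c) for r in range(MAX_TX) for c in range(MAX_TX)}
```

The top-right corner (0, 31) maps to slot 0, the main diagonal to slot 31, and the bottom-left corner (31, 0) to slot 62, so 63 elements suffice.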

In one example hardware implementation, five register arrays are defined as follows:

uai4 reg32_0[32],

uai4 reg32_1[32],

uai4 reg32_2[32],

uai4 reg32_3[32], and

uai4 reg64[63].

In the foregoing, uai4 indicates an unsigned 4-bit integer type. In a software implementation, uai4 may be replaced with an unsigned char or an unsigned 8-bit integer type. That is, each element in the register array is sized to support at least one value, where the value is the maximum expected value to be coded. In the example herein, the maximum expected value of the value BL is 3 and the maximum expected value of the value BR is 12. Thus, a 4-bit register array element may be used to store one value, and an 8-bit register array element may be used to store two values.

As with other embodiments described herein, at the start of coding a transform unit (or block), the register arrays are initialized to a default value, desirably 0. The defined register set may then be used to derive or determine a coding context for coding one or more values of a transform coefficient, the values indicating the magnitude of the transform coefficient at scan position i, where (r_i, c_i) represents the block position corresponding to scan position i following a given scan order. That is, at least one of the stored values from the register arrays of the register set may be used to determine the coding context.

In this example, the register arrays of the register set may be used to compute two magnitude values, mag and br_mag, which are in turn used to derive the coding contexts for the values BL[i] and BR[i], respectively. The magnitude values mag and br_mag may be determined based on a transform type of the transform block being coded.

If the transform type is one in TX_CLASS_HORIZ, the magnitude values can be determined from the following pseudo code:

mag = MIN(reg32_0[r_i], uai4(3)) + MIN(reg32_1[r_i], uai4(3)) + MIN(reg32_2[r_i], uai4(3)) + MIN(reg32_3[r_i], uai4(3)) + MIN(reg64[c_i], uai4(3));

br_mag = reg32_0[r_i] + reg32_1[r_i] + reg64[c_i];

mag = MIN((mag + 1) >> 1, 4); and

br_mag = MIN((br_mag + 1) >> 1, 6).

The expression reg32_0[r_i] returns the value in the first register array reg32_0 at the array position corresponding to the row value r_i, reg32_1[r_i] returns the value in the second register array reg32_1 at the array position corresponding to the row value r_i, reg32_2[r_i] returns the value in the third register array reg32_2 at the array position corresponding to the row value r_i, and reg32_3[r_i] returns the value in the fourth register array reg32_3 at the array position corresponding to the row value r_i. Similarly, reg64[c_i] returns the value in the fifth register array reg64 at the array position corresponding to the column value c_i. For example, if scan position i is at block position (4, 0), then reg32_0[r_i], reg32_1[r_i], reg32_2[r_i], and reg32_3[r_i] return the value at array position 4 in the first, second, third, and fourth register arrays reg32_0, reg32_1, reg32_2, and reg32_3, respectively. Similarly, reg64[c_i] returns the value at array position 0 in the fifth register array reg64.

The function uai4(3) returns the value 3 as a 4-bit binary number, 0011. The value 3 is used because it is the highest value of BL and, therefore, the highest value of a context neighbor of BL[i]. In alternative embodiments, the value may be different. The function MIN(a, b) returns the smaller of a and b. The operator ">>" right-shifts a value by a specified number of bits (here, 1 bit). The calculations mag = MIN((mag + 1) >> 1, 4) and br_mag = MIN((br_mag + 1) >> 1, 6) normalize the magnitude values across the different transform types.

Using the same example as described above and immediately after initialization (e.g., such that all values in the arrays are 0), the magnitude value mag is calculated as follows:

mag = MIN(0, 0011) + MIN(0, 0011) + MIN(0, 0011) + MIN(0, 0011) + MIN(0, 0011) = 0;

mag = MIN((0 + 1) >> 1, 4);

mag = MIN(0, 4); and

mag = 0.

The magnitude value br_mag is calculated as follows:

br_mag = 0 + 0 + 0 = 0;

br_mag = MIN((0 + 1) >> 1, 6);

br_mag = MIN(0, 6); and

br_mag = 0.

If the transform type is one in TX_CLASS_VERT, the magnitude values can be determined from the following pseudo code:

mag = MIN(reg32_0[c_i], uai4(3)) + MIN(reg32_1[c_i], uai4(3)) + MIN(reg32_2[c_i], uai4(3)) + MIN(reg32_3[c_i], uai4(3)) + MIN(reg64[r_i], uai4(3));

br_mag = reg32_0[c_i] + reg32_1[c_i] + reg64[r_i];

mag = MIN((mag + 1) >> 1, 4); and

br_mag = MIN((br_mag + 1) >> 1, 6).

The expression reg32_0[c_i] returns the value in the first register array reg32_0 at the array position corresponding to the column value c_i, reg32_1[c_i] returns the value in the second register array reg32_1 at the array position corresponding to the column value c_i, reg32_2[c_i] returns the value in the third register array reg32_2 at the array position corresponding to the column value c_i, and reg32_3[c_i] returns the value in the fourth register array reg32_3 at the array position corresponding to the column value c_i. Similarly, reg64[r_i] returns the value in the fifth register array reg64 at the array position corresponding to the row value r_i. For example, if scan position i is at block position (r_i, c_i) = (6, 2), then reg32_0[c_i], reg32_1[c_i], reg32_2[c_i], and reg32_3[c_i] return the value at array position 2 (e.g., the third value) in the first, second, third, and fourth register arrays reg32_0, reg32_1, reg32_2, and reg32_3, respectively. Similarly, reg64[r_i] returns the value at array position 6 (e.g., the seventh value) in the fifth register array reg64.

Using the example in which the block position of the values BL[i] and BR[i] being coded is (6, 2), and assuming that reg32_0[c_i] = reg32_0[2] = 4 and reg32_1[c_i] = reg32_1[2] = 4, and that the remaining array positions hold the value 0, the magnitude value mag is calculated as follows:

mag = MIN(0100, 0011) + MIN(0100, 0011) + MIN(0, 0011) + MIN(0, 0011) + MIN(0, 0011);

mag = 0011 + 0011 + 0 + 0 + 0 = 0110;

mag = MIN((0110 + 1) >> 1, 4);

mag = MIN(0111 >> 1, 4);

mag = MIN(0011, 4); and

mag = 0011 = 3.

The magnitude value br_mag is calculated as follows:

br_mag = 0100 + 0100 + 0 = 1000;

br_mag = MIN((1000 + 1) >> 1, 6);

br_mag = MIN(1001 >> 1, 6);

br_mag = MIN(0100, 6); and

br_mag = 0100 = 4.

If the transform type is one in TX_CLASS_2D, the magnitude values can be determined from the following pseudo code:

mag = MIN(reg32_0[c_i], uai4(3)) + MIN(reg32_1[c_i], uai4(3)) + MIN(reg32_2[r_i], uai4(3)) + MIN(reg32_3[r_i], uai4(3)) + MIN(reg64[diag], uai4(3));

br_mag = reg32_0[c_i] + reg32_2[r_i] + reg64[diag];

mag = MIN((mag + 1) >> 1, 4); and

br_mag = MIN((br_mag + 1) >> 1, 6).

In the foregoing, diag = r_i - c_i + 31 and is an index into the fifth register array reg64. Thus, reg64[diag] returns the value in the fifth register array at the array position corresponding to the index value diag. The calculation is performed similarly to the calculations for the cases where the transform type is one in TX_CLASS_HORIZ or TX_CLASS_VERT.
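The three magnitude derivations can be gathered into one Python sketch. The names new_registers and magnitudes, the dictionary layout, and the integer class constants are illustrative assumptions; only the indexing, clamping, and normalization follow the pseudo code above.

```python
TX_CLASS_2D, TX_CLASS_HORIZ, TX_CLASS_VERT = range(3)


def new_registers():
    """Five register arrays, all initialized to the default value 0."""
    return {"reg32": [[0] * 32 for _ in range(4)], "reg64": [0] * 63}


def magnitudes(regs, tx_class, r, c):
    """Return the normalized (mag, br_mag) for block position (r, c)."""
    r32, r64 = regs["reg32"], regs["reg64"]
    if tx_class == TX_CLASS_HORIZ:
        picked = [r32[0][r], r32[1][r], r32[2][r], r32[3][r], r64[c]]
        br_picked = [r32[0][r], r32[1][r], r64[c]]
    elif tx_class == TX_CLASS_VERT:
        picked = [r32[0][c], r32[1][c], r32[2][c], r32[3][c], r64[r]]
        br_picked = [r32[0][c], r32[1][c], r64[r]]
    else:  # TX_CLASS_2D
        diag = r - c + 31
        picked = [r32[0][c], r32[1][c], r32[2][r], r32[3][r], r64[diag]]
        br_picked = [r32[0][c], r32[2][r], r64[diag]]
    mag = sum(min(v, 3) for v in picked)   # each term is limited to 3
    br_mag = sum(br_picked)                # no per-term limit
    return min((mag + 1) >> 1, 4), min((br_mag + 1) >> 1, 6)


# Vertical-class example from the text: block position (6, 2) with
# reg32_0[2] = reg32_1[2] = 4 and all other entries equal to 0.
regs = new_registers()
regs["reg32"][0][2] = 4
regs["reg32"][1][2] = 4
vert_result = magnitudes(regs, TX_CLASS_VERT, 6, 2)

# Horizontal-class example: position (4, 0) right after initialization.
zero_result = magnitudes(new_registers(), TX_CLASS_HORIZ, 4, 0)
```

The two calls reproduce the worked examples above: the vertical case gives mag = 3 and br_mag = 4, and the freshly initialized horizontal case gives 0 for both.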

In summary, determining a coding context using at least some of the stored values comprises: based on the transform type for the transform block, determining a respective index for each register array using the column and/or row of the scan position. The stored value from each register array used to determine the coding context is then selected using the corresponding index for that register array. The selected stored values from each register array are summed to generate a first magnitude value (e.g., mag), with each selected stored value limited to a first maximum value (e.g., 3) in the summation. The first magnitude value is then normalized. Similarly, the selected stored values from fewer than all of the register arrays are summed to generate a second magnitude value (e.g., br_mag). The second magnitude value is then normalized. Subsequently, and as described below, the normalized first magnitude value may be used to determine a first coding context for entropy coding a first value of the transform coefficient (e.g., BL[i]), the first value indicating a magnitude of the transform coefficient that is not greater than the first maximum value (e.g., 3); and the normalized second magnitude value may be used to determine a second coding context for entropy coding a second value of the transform coefficient (e.g., BR[i]), the second value indicating a magnitude of the transform coefficient up to a second maximum value (e.g., 12).

Once the magnitude value mag is obtained, the context offset ctx_offset used for coding the value BL[i] can be determined, also based on the transform type. If the transform type is one in TX_CLASS_2D, then the following pseudo code may be used to determine ctx_offset:

if (r_i == 0 && c_i == 0) ctx_offset = 0;

else if (w < h && r_i < 2) ctx_offset = 11 + mag;

else if (w > h && c_i < 2) ctx_offset = 16 + mag;

else if (r_i + c_i < 2) ctx_offset = mag + 1;

else if (r_i + c_i < 4) ctx_offset = 5 + mag + 1;

else ctx_offset = 21 + mag.

Herein, w is the width of the transform block being coded, h is the height of the transform block being coded, == is a Boolean operator such that (a == b) evaluates to true when a equals b and otherwise evaluates to false, and && is a Boolean operator such that (a && b) evaluates to true when a and b are both true and evaluates to false when a or b is false. Thus, the value of ctx_offset is based on the values of r_i and c_i, and also on the width and height of the transform block being coded. If r_i and c_i are both equal to 0, ctx_offset is equal to zero. If either r_i or c_i, or both, are not equal to 0, then the remaining conditions are considered in order. Once the value of ctx_offset is determined in response to a condition, further processing of the conditions ends. For example, if w is less than h but r_i is not less than 2, then the next condition is considered (i.e., whether (w > h && c_i < 2) evaluates to true). On the other hand, if w is less than h and r_i is less than 2, ctx_offset is equal to 11 + mag. In that case, neither the next condition (i.e., whether (w > h && c_i < 2) evaluates to true) nor any subsequent condition is considered.

If the transform type is one in TX_CLASS_VERT, the following pseudo code may be used to determine ctx_offset:

if (r_i == 0) ctx_offset = 26 + mag;

else if (r_i < 2) ctx_offset = 26 + 5 + mag;

else ctx_offset = 26 + 10 + mag.

If the transform type is one in TX_CLASS_HORIZ, the following pseudo code may be used to determine ctx_offset:

if (c_i == 0) ctx_offset = 26 + mag;

else if (c_i < 2) ctx_offset = 26 + 5 + mag;

else ctx_offset = 26 + 10 + mag.
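The ordered conditions for ctx_offset can be expressed as one function, sketched here in Python. The function name bl_ctx_offset and the integer class constants are illustrative assumptions; the branch order and constants are taken from the pseudo code above.

```python
TX_CLASS_2D, TX_CLASS_HORIZ, TX_CLASS_VERT = range(3)


def bl_ctx_offset(tx_class, mag, r, c, w, h):
    """Context offset for BL[i] at block position (r, c) in a w x h block."""
    if tx_class == TX_CLASS_2D:
        # Conditions are tested in order; the first match wins.
        if r == 0 and c == 0:
            return 0
        if w < h and r < 2:
            return 11 + mag
        if w > h and c < 2:
            return 16 + mag
        if r + c < 2:
            return mag + 1
        if r + c < 4:
            return 5 + mag + 1
        return 21 + mag
    # One-dimensional classes: the vertical class tests the row,
    # the horizontal class tests the column.
    pos = r if tx_class == TX_CLASS_VERT else c
    if pos == 0:
        return 26 + mag
    if pos < 2:
        return 26 + 5 + mag
    return 26 + 10 + mag
```

For instance, in a block with w = 4 and h = 8, a two-dimensional transform at position (1, 5) with mag = 2 satisfies the second condition, so the sketch returns 11 + 2 = 13 without evaluating the later conditions.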

Once the magnitude value br_mag is obtained, the context offset br_ctx_offset used to code BR[i] may be determined. If r_i and c_i are both equal to zero, the context offset br_ctx_offset is set equal to the magnitude value br_mag. Otherwise, the context offset br_ctx_offset is based on the transform type. If the transform type is one in TX_CLASS_2D, the context offset br_ctx_offset is set to br_mag + 7 in the case where r_i and c_i are both less than 2. Otherwise, the context offset br_ctx_offset is set to br_mag + 14. If the transform type is one in TX_CLASS_HORIZ, the context offset br_ctx_offset is set to br_mag + 7 if c_i is equal to 0. Otherwise, the context offset br_ctx_offset is set to br_mag + 14. Finally, if the transform type is one in TX_CLASS_VERT, the context offset br_ctx_offset is set to br_mag + 7 if r_i is equal to 0. Otherwise, the context offset br_ctx_offset is set to br_mag + 14.
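The rules for br_ctx_offset reduce to a choice among three bands (+0, +7, +14), as in the following Python sketch. The function name br_ctx_offset, the local variable near, and the integer class constants are illustrative assumptions.

```python
TX_CLASS_2D, TX_CLASS_HORIZ, TX_CLASS_VERT = range(3)


def br_ctx_offset(tx_class, br_mag, r, c):
    """Context offset for BR[i] at block position (r, c)."""
    if r == 0 and c == 0:
        return br_mag
    if tx_class == TX_CLASS_2D:
        near = r < 2 and c < 2   # close to the top-left corner
    elif tx_class == TX_CLASS_HORIZ:
        near = c == 0            # first column
    else:  # TX_CLASS_VERT
        near = r == 0            # first row
    return br_mag + (7 if near else 14)
```

Because br_mag is normalized to at most 6, the three bands do not overlap: offsets 0 to 6 are used at position (0, 0), 7 to 13 for the near positions, and 14 to 20 elsewhere.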

The context of the value BL[i] is determined using the context offset ctx_offset and other information such as the transform size and whether the transform block is a luma block or a chroma block. As is conventional, a context specifies a probability distribution used in arithmetic coding. In the case of the value BL[i], the probability distribution is a 4-tuple. On the encoder side, arithmetic coding encodes the value BL[i] into a binary codeword by using the probability distribution given by the context. On the decoder side, arithmetic decoding decodes the value BL[i] from the binary codeword and the probability distribution. The context offset br_ctx_offset is similarly used to determine the context for the value BR[i].

After the values BL[i] and BR[i] are coded, if i > 0, the register arrays are updated in preparation for context derivation at scan position i - 1. Hereinafter, if BL[i] < 3, level is BL[i]; otherwise (i.e., if BL[i] = 3), level is 3 + BR[i]. Depending on the transform type of the transform block, the register arrays may be updated as shown in the following pseudo code:

In the case where the transform type is one in TX_CLASS_HORIZ:

reg32_3[r_i] = reg32_2[r_i];

reg32_2[r_i] = reg32_1[r_i];

reg32_1[r_i] = reg32_0[r_i];

reg32_0[r_i] = level;

reg64[c_i] = level;

In the case where the transform type is one in TX_CLASS_VERT:

reg32_3[c_i] = reg32_2[c_i];

reg32_2[c_i] = reg32_1[c_i];

reg32_1[c_i] = reg32_0[c_i];

reg32_0[c_i] = level;

reg64[r_i] = level;

In the case where the transform type is one in TX_CLASS_2D:

reg32_1[c_i] = reg32_0[c_i];

reg32_0[c_i] = level;

reg32_3[r_i] = reg32_2[r_i];

reg32_2[r_i] = level;

reg64[diag] = level.

In summary, in the case where the transform type is one in TX_CLASS_HORIZ, the values in the first register array reg32_0 at position r_i and in the fifth register array reg64 at position c_i are updated to level, whose value is based on the value BL[i] (and, where applicable, the value BR[i]) just coded as described above. The value in the second register array reg32_1 at position r_i is updated to the value previously held in the first register array reg32_0 at position r_i. The values in the remaining register arrays at position r_i take the values from the preceding register array at position r_i. That is, the values in the register arrays are shifted by one array position. In the previously described embodiments, the array position shift is a shift of position within an array. In this embodiment, the array position shift is a shift of position between the arrays.

Similarly, in the case where the transform type is one in TX_CLASS_VERT, the values in the first register array reg32_0 at position c_i and in the fifth register array reg64 at position r_i are updated to level, as described above. The value in the second register array reg32_1 at position c_i is updated to the value previously held in the first register array reg32_0 at position c_i. The values in the remaining register arrays at position c_i take the values from the preceding register array at position c_i (i.e., the values in the register arrays are shifted by one array position).

Finally, in the case where the transform type is one in TX_CLASS_2D, the values in the first register array reg32_0 at position c_i, in the third register array reg32_2 at position r_i, and in the fifth register array reg64 at position diag are updated to level, as described above. The value in the second register array reg32_1 at position c_i is updated to the value previously held in the first register array reg32_0 at position c_i. The value in the fourth register array reg32_3 at position r_i takes the value previously held in the preceding (third) register array reg32_2 at position r_i. As with the other transform types, values in the register arrays are shifted by one array position.
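The update rules for the three transform classes can be sketched in Python as follows. The names level_of and update_registers, the list-of-lists layout for the four 32-element arrays, and the integer class constants are illustrative assumptions; the assignment order matches the pseudo code above, so each older value is copied before it is overwritten.

```python
TX_CLASS_2D, TX_CLASS_HORIZ, TX_CLASS_VERT = range(3)


def level_of(bl, br=0):
    """Value stored in the registers after BL[i] (and BR[i]) is coded."""
    return bl if bl < 3 else 3 + br


def update_registers(reg32, reg64, tx_class, r, c, level):
    """Shift stored values between arrays and record the new level."""
    if tx_class == TX_CLASS_2D:
        reg32[1][c] = reg32[0][c]
        reg32[0][c] = level
        reg32[3][r] = reg32[2][r]
        reg32[2][r] = level
        reg64[r - c + 31] = level
        return
    # One-dimensional classes: HORIZ indexes the four arrays by row and
    # reg64 by column; VERT does the opposite.
    line, cross = (r, c) if tx_class == TX_CLASS_HORIZ else (c, r)
    reg32[3][line] = reg32[2][line]
    reg32[2][line] = reg32[1][line]
    reg32[1][line] = reg32[0][line]
    reg32[0][line] = level
    reg64[cross] = level


reg32 = [[0] * 32 for _ in range(4)]
reg64 = [0] * 63
# Two consecutive updates in row 4 of a horizontal-class block.
update_registers(reg32, reg64, TX_CLASS_HORIZ, 4, 7, level_of(3, 5))
update_registers(reg32, reg64, TX_CLASS_HORIZ, 4, 6, level_of(2))
```

After the two calls, the most recent level for row 4 sits in the first array and the earlier level has moved to the second array, which illustrates the shift between arrays at a common index.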

Note that this solution can be described as generic in the sense that it can be applied to any transform size and any scan order. With this solution, there is also no need to pad the transform block (to the right and below) in order to derive the contexts of the symbols at the right and bottom boundaries of the transform block, as is needed when storing the neighborhood positions; the padding is replaced by simply initializing the arrays to 0.

Fig. 19 is a flow diagram of a process for coding a transform block according to an embodiment of the present disclosure. Process 1900 may be implemented in an encoder, such as encoder 400. In one embodiment, process 1900 is utilized in process 800, for example, to implement the coding of the value BL [ i ] in operation 802, the coding of the value BR [ i ] in operation 803, or both.

Process 1900 may be implemented, for example, as a software program executable by a computing device, such as transmitting station 102. The software program may include machine-readable instructions that may be stored in a memory, such as memory 204 or secondary storage 214, and that may be executed by a processor, such as CPU 202, to cause a computing device to perform process 1900. In at least some implementations, the process 1900 may be performed in whole or in part by the entropy encoding stage 408 of the encoder 400.

Process 1900 may be implemented using dedicated hardware or firmware. Some computing devices may have multiple memories, multiple processors, or both. The steps or operations of process 1900 may be distributed using different processors, memories, or both. The use of the terms "processor" or "memory" in the singular encompasses computing devices having one processor or one memory and devices having multiple processors or multiple memories that may be used to perform some or all of the above-described steps.

Process 1900 may receive information describing the magnitude of transform coefficients. For example, process 1900 may receive a transform block such as transform block 704 or receive a level map, such as non-zero map 708, level-1 map 712, and level-2 map 714, representing values from transform block 704.

In operation 1901, one or more spatial templates for the coding context may be determined or selected. The determination may be made based on a transform type used to determine the quantized transform coefficients in the transform block being coded. The spatial template is a spatial arrangement of cells, anchored at the value being coded at the current scan position, that is used to determine the coding context. The template may be a horizontal template, a vertical template, or a two-dimensional template, which is selected based on using a one-dimensional horizontal transform type, such as a one-dimensional horizontal transform type from TX_CLASS_HORIZ, a one-dimensional vertical transform type, such as a one-dimensional vertical transform type from TX_CLASS_VERT, or a two-dimensional transform type, such as a two-dimensional transform type from TX_CLASS_2D, respectively. Thus, each of a plurality of different transform types may correspond to a selection of a different spatial template. Examples of spatial templates that may be selected in operation 1901 include horizontal templates 901, 905; vertical templates 902, 906; and two-dimensional templates 903, 907.

In some embodiments, the spatial template for the coding context corresponds to a region that includes locations from at least two rows and locations from at least two columns, and a location of an upper left corner of the spatial template corresponds to a scan location. The transform type may be a horizontal transform type, a vertical transform type, a two-dimensional transform type, or any combination thereof.

In embodiments such as those described with respect to figs. 10-18, only one template may be selected for each determination of a context for coding the value BL and, if applicable, the value BR. In embodiments such as the generic solution described above, all available templates may be selected. According to a variant of this embodiment, all templates available for only the transform type may be selected. In the case where all available templates are used, for example, the selection of one or more templates at 1901 may be omitted, as the templates may be known a priori.

When the transform type is a horizontal transform type, the spatial template determined in operation 1901 may include a plurality of values from the same row as the scanning position and a single value from the same column as the scanning position. When the transform type is a vertical transform type, the spatial template selected in operation 1901 may include a plurality of values from the same column as the scanning position and a single value from the same row as the scanning position. When the transform type is a two-dimensional transform type, the spatial template selected in operation 1901 may include a plurality of values from the same column as the scan position, a plurality of values from the same row as the scan position, and a single value from the same diagonal as the scan position.

In operation 1902, an array of registers is defined to hold values of a coding context. The values held in the register array may be referred to herein as stored values. The register array is defined based at least in part on the geometry of the spatial template selected in operation 1901. For example, the values in the register array may each correspond to a location in the spatial template selected in operation 1901. By defining the register arrays to correspond to the geometric arrangement of the spatial template, the stored values from the register arrays will each correspond to a respective location from the spatial template, and thus, a particular value in each register array location will correspond to a particular spatial location within the spatial template. The values of the spatial template are stored in two or more register arrays that may each hold values for a single row, column, or diagonal of the spatial template. The register arrays may each correspond to a column index, a row index, or a diagonal index of the transform block. For example, a register array may be defined as described with reference to the examples shown in fig. 10-18.

The register arrays may be defined at operation 1902 based on the geometry of the spatial template and based on a maximum available transform size. As previously described, the maximum available transform size may be 32x32. The cardinality, or number, of register arrays may be defined as one array having a size sufficient to store a number of values corresponding to the number of diagonals of the maximum available transform size, plus a number of arrays corresponding to the larger of the maximum number of columns or the maximum number of rows of the spatial template determined at 1901, where each of the latter register arrays has an array size sufficient to store a number of values corresponding to the greater of the maximum number of columns or the maximum number of rows of the maximum available transform size. In the example where the maximum available transform size is 32x32 and the maximum number of columns and the maximum number of rows in the spatial template are both equal to 4, there are five register arrays: 4 register arrays have an array size (number of elements) of 32, and one register array has an array size of 63 (or 64, where a size of 2ⁿ is desirable for processing).

In operation 1903, the register array is initialized. Initializing the register array may include: all values in the register array are set to a default value, such as zero.

In operation 1904, entropy coding of a value of a transform coefficient indicating a magnitude of the transform coefficient from a transform block is started by setting a scan position to a next position to be coded, referred to herein as a scan position i. Various scan orders may be used to predict the blocks used to generate the transform blocks. Entropy coding is performed using the reverse of the scan order (referred to herein as the reverse scan order). Thus, the first location to be coded corresponds to the first non-zero value occurring in the reverse scan order. In subsequent iterations, the scan positions are decremented in operation 1904 so that operations 1905 through 1908 can be performed again, which continues until all values from the transform block are coded.

In operation 1905, values of the transform coefficients being coded are obtained. This value indicates the amplitude of the transform coefficient at the current scan position i within the current transform block being coded. This value may be obtained from the transform block 704 and/or from a level map representing the transform block. For example, the coded value may be a single value corresponding to BL or BR, or may be two values corresponding to BL and BR.

In operation 1906, a coding context is determined using values from the register arrays. As an example, the coding context may be determined by summing values from a register array, as described with reference to figs. 10-18. In another example, such as the example described above, the coding context for BL may be determined by summing the values from the register arrays after comparing each of them to the highest value of BL. The smaller values from each comparison are summed and used to generate the magnitude value mag, which is used to generate the coding context, i.e., the context offset ctx_offset. The coding context for BR can be determined by summing values from fewer than all of the register arrays, and then using the sum to generate the magnitude value br_mag, which is used to generate the coding context, i.e., the context offset br_ctx_offset. The coding contexts ctx_offset and br_ctx_offset also depend on the scan position (r_i, c_i, or both). When the transform type is one in TX_CLASS_2D, the coding context ctx_offset may also depend on the size (width, height, or both) of the transform block.

In some embodiments, determining the coding context using the stored values from the register array in operation 1906 includes: one or more register array locations corresponding to a column index, a row index, or a diagonal index of the scan location are selected.

Operation 1907 includes entropy coding one or more values (BL, BR, or both) using the coding context determined in operation 1906. In particular, a statistical model is selected for entropy coding using the coding context, and then entropy coding is performed, e.g., as described with respect to the entropy coding stage 408 of the encoder 400. The output of operation 1907 may be inserted into the encoded bitstream.

After entropy coding of the values in operation 1907, at least some of the stored values in the register array are updated in operation 1908. Updating at least some of the stored values in the register array may include shifting one or more values by one register array position. Such shifting may occur between register array locations within a single register array. The shift may occur between register array positions of the two register arrays. The register array locations of the two register arrays may be the same location (e.g., a common index or the same register index) in the two register arrays. That is, for example, shifting one or more stored values may include: one or more stored values are shifted from an array position at an index within the first register array to an array position at a common index within the second register array.

Updating at least some of the stored values in the register arrays may include setting one or more values in the register arrays equal to the value coded in operation 1907. In the case where a single set of register arrays is used to determine the coding contexts of both values BL[i] and BR[i] at scan position i, updating at least some of the stored values in the register arrays may comprise: as long as the value BL[i] is less than the maximum value of BL[i] (e.g., 3), setting one or more values in the register arrays equal to the value BL[i] coded in operation 1907, and otherwise setting one or more values in the register arrays equal to the maximum value of BL[i] plus the value BR[i].

In some embodiments, updating at least some of the stored values in the shift register comprises: information is obtained from values indicative of the magnitudes of transform coefficients from the transform block. These values (e.g., values BL and BR) may be the transform coefficient values themselves, the absolute values of the transform coefficients, and/or values from the level map. The value may be a number or may be an expression, such as a boolean expression. For example, the values may indicate whether the absolute value of each transform coefficient is equal to zero, equal to one, equal to two, or greater than or equal to three.

In operation 1909, a determination is made as to whether there are more values to be coded. For example, if the last operation 1907 coded the value at scan position i = 0, then it may be determined that there are no more values to code, and the process 1900 ends for the current transform block. Otherwise, the process returns to operation 1904, in which the scan position is set to the next position in the reverse scan order, and operations 1905 to 1909 are performed again for the value at the new scan position. Process 1900 may be repeated for multiple transform blocks of a frame.
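The loop formed by operations 1903 to 1909 can be sketched in Python for the two-dimensional transform class. This is a simplified illustration under stated assumptions: the function name trace_contexts_2d is hypothetical, the entropy coder of operation 1907 is replaced by recording the derived magnitude values, the walk starts at the last scan position rather than at the first non-zero value, and the input levels are assumed to be given already in the form stored by the update step (BL[i] if BL[i] < 3, else 3 + BR[i]).

```python
def trace_contexts_2d(levels, scan):
    """Walk `scan` in reverse and return [((r, c), mag, br_mag), ...].

    levels maps (r, c) to the level at that position (0..15).
    scan lists the (r, c) positions in forward scan order.
    """
    reg32 = [[0] * 32 for _ in range(4)]     # operation 1903: default 0
    reg64 = [0] * 63
    trace = []
    for i in range(len(scan) - 1, -1, -1):   # operation 1904
        r, c = scan[i]
        level = levels[(r, c)]               # operation 1905
        diag = r - c + 31
        picked = [reg32[0][c], reg32[1][c],
                  reg32[2][r], reg32[3][r], reg64[diag]]
        # Operation 1906: clamp, sum, and normalize the stored values.
        mag = min((sum(min(v, 3) for v in picked) + 1) >> 1, 4)
        br_sum = reg32[0][c] + reg32[2][r] + reg64[diag]
        br_mag = min((br_sum + 1) >> 1, 6)
        trace.append(((r, c), mag, br_mag))  # stand-in for operation 1907
        if i > 0:                            # operation 1908
            reg32[1][c] = reg32[0][c]
            reg32[0][c] = level
            reg32[3][r] = reg32[2][r]
            reg32[2][r] = level
            reg64[diag] = level
    return trace


# A 2x2 block walked in reverse row-major order (an assumed scan order).
scan = [(0, 0), (0, 1), (1, 0), (1, 1)]
levels = {(0, 0): 5, (0, 1): 2, (1, 0): 0, (1, 1): 1}
trace = trace_contexts_2d(levels, scan)
```

Because the register state at each step depends only on levels that were already coded, a decoder that performs the same updates after decoding each value would derive the same magnitude values at every scan position.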

As is clear from the description of operations 1905 and 1907, process 1900 may be used to entropy encode the transform coefficients of a transform block.

Fig. 20 is a flow diagram of a process for coding a transform block according to another embodiment of the present disclosure. Process 2000 may be implemented in a decoder, such as decoder 500. In an embodiment, the process 2000 is utilized in the process 800, for example, to implement the coding of the value BL[i] in operation 802, the coding of the value BR[i] in operation 803, or both.

The process 2000 may be implemented, for example, as a software program executable by a computing device, such as the receiving station 106. The software program may include machine-readable instructions that may be stored in a memory, such as memory 204 or secondary storage 214, and executed by a processor, such as CPU 202, to cause a computing device to perform process 2000. In at least some implementations, the process 2000 may be performed in whole or in part by the entropy decoding stage 502 of the decoder 500.

The process 2000 may be implemented using dedicated hardware or firmware. Some computing devices may have multiple memories, multiple processors, or both. Different processors, memories, or both may be used to distribute the steps or operations of process 2000.

Process 2000 may receive information describing the magnitude of transform coefficients. For example, process 2000 may receive a portion of an encoded bitstream that includes an encoded transform block, such as transform block 704, or encoded level maps, such as non-zero map 708, level-1 map 712, and level-2 map 714, that represent values from transform block 704.

In operation 2001, one or more spatial templates for the coding context are determined or selected. Operation 2001 may be the same as operation 1901 described above.

In operation 2002, a register array is defined to hold the stored values of the coding context. Operation 2002 may be the same as operation 1902 described above.

In operation 2003, the register array is initialized. Initializing the register array may include setting all values in the register array to a default value, such as zero, as described above with respect to operation 1903.

In operation 2004, entropy coding of values of transform coefficients indicative of magnitudes of transform coefficients from the transform block begins by setting a scan position to a next position (e.g., in reverse scan order starting from a first non-zero value, as described with respect to operation 1904).

In operation 2006, a coding context is determined using values from the register array. Operation 2006 may be the same as operation 1906 described above.

Operation 2007 includes entropy decoding one or more values (BL, BR, or both) of the transform coefficient at the scan position i set at operation 2004. Entropy decoding is performed using the coding context determined in operation 2006, with an encoded bitstream, such as the compressed bitstream 420, as input. The coding context is used, along with other information such as transform block size, prediction mode, etc., to select a statistical model for use in entropy decoding. For example, entropy decoding may be performed as described with respect to the entropy decoding stage 502 of the decoder 500. The output of operation 2007 is one or more values of the transform coefficient at scan position i. For example, the process 2000 may be used to decode the values BL and BR, which are combined to produce the transform coefficient once the process 800 is completed for the transform coefficient, as described in operations 802 and 803.

After entropy coding one or more values of the transform coefficients in operation 2007, at least some of the stored values in the register array are updated in operation 2008. Updating at least some of the stored values in the register array in operation 2008 may be performed in the same manner as the updating in operation 1908 described above.

In operation 2009, a determination is made as to whether there are more values to be coded. For example, if the latest operation 2007 coded a value at scan position i = 0, then the process 2000 ends for the current transform block. Otherwise, the process 2000 returns to operation 2004, where the scan position is set to the next position in the reverse scan order. Operations 2006 to 2009 are then performed again for the values of the transform coefficient at the new scan position. Process 2000 may be repeated for multiple transform blocks of a frame.
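The control flow of operations 2002 to 2009 can be sketched as a single loop. This is an illustrative simplification, not the implementation described in this disclosure: it uses one one-dimensional register instead of several register arrays shaped by a spatial template, and the names `code_block_values`, `derive_context`, `entropy_code`, and `register_size` are all hypothetical. The entropy coder and the context rule are supplied by the caller as plain functions.

```python
from typing import Callable, List, Sequence


def code_block_values(
    num_positions: int,
    derive_context: Callable[[Sequence[int]], int],
    entropy_code: Callable[[int, int], int],
    register_size: int = 4,
) -> List[int]:
    """Visit scan positions num_positions-1 .. 0 and return the coded values.

    Simplified stand-in: a single register holds the most recent values.
    """
    # Operations 2002 and 2003: define the register and set default values.
    registers = [0] * register_size
    values = [0] * num_positions

    # Operation 2004: step through the positions in reverse scan order.
    for i in range(num_positions - 1, -1, -1):
        # Operation 2006: the context depends only on the stored values.
        ctx = derive_context(registers)
        # Operation 2007: code the value at position i with that context.
        value = entropy_code(i, ctx)
        values[i] = value
        # Operation 2008: shift by one position and store the new value.
        registers = [value] + registers[:-1]
        # Operation 2009: the loop ends after position i == 0 is handled.
    return values
```

Because the context is read from the register before the new value is stored, each position only ever depends on values that were coded earlier in the reverse scan.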

The encoding and decoding aspects described above illustrate some of the encoding and decoding techniques. It should be understood, however, that encoding and decoding, as those terms are used in the claims, may refer to compression, decompression, transformation, or any other processing or alteration of data.

The word "example" or "embodiment" is used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as "exemplary" or "embodiment" is not necessarily to be construed as preferred or advantageous over other aspects or designs. Rather, use of the word "example" or "embodiment" is intended to present concepts in a concrete fashion. As used in this application, the term "or" is intended to mean an inclusive "or" rather than an exclusive "or". That is, unless specified otherwise, or clear from context, "X includes A or B" is intended to mean any of the natural inclusive permutations. That is, if X includes A; X includes B; or X includes both A and B, then "X includes A or B" is satisfied under any of the foregoing circumstances. In addition, the articles "a" and "an" as used in this application and the appended claims should generally be construed to mean "one or more" unless specified otherwise or clear from context to be directed to a singular form. Furthermore, the terms "embodiment" or "one embodiment" are not intended to refer to the same embodiment or implementation throughout unless so described.

Embodiments of transmitting station 102 and/or receiving station 106 (as well as algorithms, methods, instructions, etc. stored thereon and/or executed thereby, including by encoder 400 and decoder 500) may be implemented in hardware, software, or any combination thereof. The hardware may include, for example, a computer, an Intellectual Property (IP) core, an Application Specific Integrated Circuit (ASIC), a programmable logic array, an optical processor, a programmable logic controller, microcode, a microcontroller, a server, a microprocessor, a digital signal processor, or any other suitable circuit. In the claims, the term "processor" should be understood to include any of the foregoing hardware, alone or in combination. The terms "signal" and "data" are used interchangeably. Furthermore, portions of transmitting station 102 and receiving station 106 need not necessarily be implemented in the same manner.

Further, in one aspect, transmitting station 102 or receiving station 106 may be implemented, for example, using a general purpose computer or a general purpose processor having a computer program that, when executed, performs any of the various methods, algorithms, and/or instructions described herein. Additionally or alternatively, for example, a special purpose computer/processor may be utilized which may contain other hardware for performing any of the methods, algorithms, or instructions described herein.

For example, transmitting station 102 and receiving station 106 may be implemented on computers in a videoconferencing system. Alternatively, transmitting station 102 may be implemented on a server and receiving station 106 may be implemented on a device separate from the server, such as a handheld communication device. In this case, transmitting station 102 may encode the content into an encoded video signal using encoder 400 and transmit the encoded video signal to the communication device. The communication device may then decode the encoded video signal using the decoder 500. Alternatively, the communication device may decode content stored locally on the communication device, e.g., content not transmitted by transmitting station 102. Other embodiments of transmitting station 102 and receiving station 106 are also available. For example, the receiving station 106 may be a substantially stationary personal computer rather than a portable communication device, and/or a device including the encoder 400 may also include the decoder 500.

Furthermore, all or portions of embodiments of the present disclosure may take the form of a computer program product accessible from, for example, a tangible computer-usable or computer-readable medium. A computer-usable or computer-readable medium may be, for example, any apparatus that can tangibly contain, store, communicate, or transport the program for use by or in connection with any processor. The medium may be, for example, an electronic, magnetic, optical, electromagnetic, or semiconductor device. Other suitable media may also be used.

Other embodiments are summarized in the following examples:

Example 1: a method of coding a transform block having transform coefficients, the method comprising: selecting a spatial template for a coding context based on a transform type for the transform block; defining shift registers, each for holding one or more stored values for the coding context; initializing the shift registers by setting the stored values to a default value; and coding values indicative of magnitudes of the transform coefficients from the transform block in a reverse scan order, wherein the coding comprises, for each of one or more values: obtaining a value to be coded at a scan location, determining the coding context using the stored values from the shift registers, entropy coding the value to be coded using the coding context, and updating at least some of the stored values in the shift registers after entropy coding the value to be coded.

Example 2: the method of example 1, wherein the shift register is defined to correspond to a geometric arrangement of the spatial template such that the stored values from the shift register can each correspond to a respective location from the spatial template.

Example 3: the method of example 1 or 2, wherein the shift registers include at least a first shift register having a first size and a second shift register having a second size different from the first size.

Example 4: the method of any of examples 1 to 3, wherein the shift registers each correspond to a column index, a row index, or a diagonal index of the transform block.

Example 5: the method of example 4, wherein determining the coding context using the stored values from the shift register comprises selecting one or more of the shift registers corresponding to a column index, a row index, or a diagonal index of the scan location.

Example 6: the method of any of examples 1 to 5, wherein updating at least some of the stored values in the shift register comprises shifting one or more values by one position.

Example 7: the method of any of examples 1 to 6, wherein updating at least some of the stored values in the shift register comprises setting one or more values in the shift register equal to a value to be coded.

Example 8: the method of any of examples 1 to 7, wherein updating at least some of the stored values in the shift register comprises obtaining information from values indicative of magnitudes of transform coefficients from the transform block.

Example 9: the method of any of examples 1 to 8, wherein the spatial template for the coding context corresponds to an area including locations from at least two rows and locations from at least two columns, and a location of an upper left corner of the spatial template corresponds to the scan location.

Example 10: the method of any of examples 1 to 9, wherein the transform type is one of a horizontal transform type, a vertical transform type, or a two-dimensional transform type; when the transform type is the horizontal transform type, the spatial template includes a plurality of values from a same row as the scan location and a single value from a same column as the scan location; when the transform type is the vertical transform type, the spatial template includes a plurality of values from a same column as the scan location and a single value from a same row as the scan location; and when the transform type is the two-dimensional transform type, the spatial template includes a plurality of values from a same column as the scan location, a plurality of values from a same row as the scan location, and a single value from a same diagonal as the scan location.

Example 11: an apparatus for coding a transform block having transform coefficients, the apparatus comprising: a memory; and a processor configured to execute instructions stored in the memory to: select a spatial template for a coding context based on a transform type for the transform block; define shift registers, each for holding one or more stored values for the coding context; initialize the shift registers by setting the stored values to a default value; and code values indicative of magnitudes of the transform coefficients from the transform block in a reverse scan order, wherein the instructions further cause the processor to, for each of one or more values: obtain a value to be coded at a scan location, determine the coding context using the stored values from the shift registers, entropy code the value to be coded using the coding context, and update at least some of the stored values in the shift registers after entropy coding the value to be coded.

Example 12: the apparatus of example 11, wherein the shift register is defined to correspond to a geometric arrangement of the spatial template such that the stored values from the shift register can each correspond to a respective location from the spatial template.

Example 13: the apparatus of example 11 or 12, wherein the shift registers include at least a first shift register having a first size and a second shift register having a second size smaller than the first size.

Example 14: the apparatus of any of examples 11 to 13, wherein the shift registers each correspond to a column index, a row index, or a diagonal index of the transform block, and the instructions that cause the processor to determine the coding context using the stored values from the shift registers comprise instructions for selecting one or more of the shift registers corresponding to one of a column index, a row index, or a diagonal index of the scan location.

Example 15: the apparatus of any of examples 11 to 14, wherein the instructions that cause the processor to update at least some of the stored values in the shift registers comprise instructions for shifting one or more values by one position and for setting one or more values in the shift registers equal to the value to be coded.

Example 16: the apparatus of any of examples 11 to 15, wherein the instructions to cause the processor to update at least some of the stored values in the shift register comprise obtaining information from a value indicative of a magnitude of transform coefficients from a transform block.

Example 17: the apparatus of any of examples 11 to 16, wherein the spatial template for the coding context corresponds to an area comprising locations from at least two rows and locations from at least two columns, and a location of an upper left corner of the spatial template corresponds to the scan location.

Example 18: a non-transitory computer-readable storage device comprising program instructions executable by one or more processors that, when executed, cause the one or more processors to perform operations for coding a transform block having transform coefficients, the operations comprising: selecting a spatial template for a coding context based on a transform type for the transform block; defining shift registers, each for holding one or more stored values relating to the coding context; initializing the shift registers by setting the stored values to a default value; and coding values indicative of magnitudes of the transform coefficients from the transform block in a reverse scan order, wherein the coding comprises, for each of one or more values: obtaining a value to be coded at a scan location, determining the coding context using the stored values from the shift registers, entropy coding the value to be coded using the coding context, and updating at least some of the stored values in the shift registers after entropy coding the value to be coded.

Example 19: the non-transitory computer-readable storage device of example 18, wherein the shift registers each correspond to a column index, a row index, or a diagonal index of the transform block, and determining the coding context using the stored values from the shift registers comprises selecting one or more of the shift registers corresponding to one of the column index, the row index, or the diagonal index of the scan location.

Example 20: the non-transitory computer-readable storage device of example 18 or 19, wherein updating at least some of the stored values in the shift register comprises shifting one or more values by one position and updating at least some of the stored values in the shift register comprises setting one or more values in the shift register equal to the value to be coded.
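The shift-register behavior summarized in Examples 4 to 7 can be illustrated with a short sketch: one register per row index, an update that shifts the stored values by one position and stores the newly coded value, and a context read from the register selected by the row of the scan location. All names below (`make_registers`, `update_register`, `context_from_row`) are hypothetical, and the clamped sum used to form a context is an assumed placeholder rule rather than the derivation defined in this disclosure.

```python
from typing import Dict, List


def make_registers(num_rows: int, size: int, default: int = 0) -> Dict[int, List[int]]:
    """Create one shift register per row index, filled with a default value."""
    return {row: [default] * size for row in range(num_rows)}


def update_register(register: List[int], coded_value: int) -> None:
    """Shift stored values by one position and store the newest at index 0.

    The oldest value falls off the end, so the register length is unchanged.
    """
    register.insert(0, coded_value)
    register.pop()


def context_from_row(registers: Dict[int, List[int]], row: int, max_context: int = 3) -> int:
    """Select the register for this row and reduce it to a small context index.

    Assumption: a clamped sum is used here only to keep the sketch concrete.
    """
    return min(sum(registers[row]), max_context)
```

For instance, after `update_register(regs[0], 2)` and `update_register(regs[0], 1)` on a register of size 3, row 0 holds `[1, 2, 0]` while the registers for other rows still hold their default values.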

The above-described embodiments, implementations, and aspects have been described in order to facilitate understanding of the present disclosure and not to limit the present disclosure. On the contrary, the disclosure is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims, which scope is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures as is permitted under the law.

Claims

1. A method of coding a transform block having transform coefficients, the method comprising:

defining a register array based on at least one spatial template for a coding context, the register array for each holding one or more stored values for determining the coding context;

initializing the register array by setting the stored values to default values; and

coding, in reverse scan order, values of the transform coefficients from the transform block indicative of magnitudes of the transform coefficients, the coding comprising, for each of one or more transform coefficients:

determining the coding context using at least some of the stored values from the register array,

entropy coding the value of the transform coefficient at the scanning position using the coding context, and

updating the register array after entropy coding the value of the transform coefficient,

wherein the register array corresponds to a geometric arrangement of the at least one spatial template such that the stored values from the register array each correspond to a respective location from the at least one spatial template.

2. The method of claim 1, wherein:

a cardinality of the register array is equal to one plus a maximum number of context neighbors in a row or column of the at least one spatial template; and

the register array includes:

a single register array of the register arrays having an array size sufficient to store a number of stored values corresponding to a number of values in a diagonal of a maximum available transform size; and

remaining register arrays of the register arrays, each having an array size sufficient to store a number of stored values corresponding to a maximum size of the maximum available transform size.

3. The method of claim 2, wherein:

the cardinality of the register array is 5;

the single register array has an array size that stores at least 63 values; and

the remaining ones of the register arrays each have an array size that stores 32 values.

4. The method of claim 1, further comprising:

selecting the at least one spatial template of the coding context based on a transform type for the transform block, wherein the at least one spatial template includes a first spatial template for a first value at the scan location indicating a magnitude of a current transform coefficient of the transform block and a second spatial template for a second value at the scan location indicating the magnitude of the current transform coefficient, the first spatial template being different from the second spatial template, and wherein:

entropy coding the value of the transform coefficient comprises entropy coding the first value and entropy coding the second value, wherein entropy coding the first value uses a different coding context than entropy coding the second value.

5. The method of claim 4, wherein the first value is less than or equal to a first maximum value of the amplitude and the second value is less than or equal to a second maximum value, the first maximum value being less than the second maximum value.

6. The method of any of claims 1-5, wherein the register arrays include at least a first register array having a first size and a second register array having a second size different from the first size.

7. The method of any of claims 1-5, wherein the register array comprises a set of register arrays, and determining the coding context comprises: determining two coding contexts using the set of register arrays, each of the two coding contexts being used to entropy code a respective value of the transform coefficient at the scan location.

8. The method of any of claims 1-5, wherein updating the register array comprises: shifting one or more stored values by one array position.

9. The method according to claim 8, wherein shifting one or more stored values by one array position comprises: shifting the one or more stored values by one array position within a single one of the register arrays.

10. The method according to claim 8, wherein shifting one or more stored values by one array position comprises: shifting the one or more stored values from an array position at an index within the first register array to an array position at a common index within the second register array.

11. An apparatus for coding a transform block having transform coefficients, the apparatus comprising:

a processor configured to:

define a register array based on at least one spatial template for a coding context, the register array for each holding one or more stored values for determining the coding context, wherein the register array comprises at least a first register array having a first size and a second register array having a second size different from the first size;

code values of the transform coefficients from the transform block indicative of magnitudes of the transform coefficients in a reverse scan order, wherein the coding comprises, for each of one or more transform coefficients:

updating the register array after entropy coding the value of the transform coefficient, wherein entropy coding the value comprises:

entropy coding a first value indicative of a magnitude of the transform coefficient, the first value belonging to a set of positive integers {0, …, a first maximum }, and

entropy coding a second value indicative of the magnitude of the transform coefficient, the second value belonging to a set of positive integers {0, …, a second maximum }, and the second maximum being greater than the first maximum, and

wherein updating at least some of the stored values in the register array comprises:

setting one or more values in the register array equal to the first value when the first value is less than the first maximum value, and otherwise setting the one or more values in the register array equal to a sum of the first value and the second value.

12. The apparatus of claim 11, wherein updating the register array comprises setting one or more stored values in the register array equal to the value of the transform coefficient.

13. The apparatus of claim 11 or 12, wherein using at least some of the stored values from the register array to determine the coding context comprises:

determining, based on a transform type for the transform block, a respective index for each of the register arrays using at least one of a column and a row of the scan locations; and

selecting a stored value from each of the register arrays using a respective index of the register arrays to determine the coding context.

14. The apparatus of claim 13, wherein using at least some of the stored values from the register array to determine the coding context comprises:

summing the selected stored values from each of the register arrays to generate a first amplitude value, wherein each selected stored value is limited to a first maximum value upon summing;

normalizing the first amplitude value;

determining a first coding context for entropy coding a first value of the transform coefficient using the normalized first amplitude value, the first value indicating an amplitude of the transform coefficient that is not greater than the first maximum value;

summing selected stored values from less than all of the register arrays to generate a second amplitude value;

normalizing the second amplitude value; and

determining a second coding context for entropy coding a second value of the transform coefficient using the normalized second amplitude value, the second value being indicative of the amplitude of the transform coefficient reaching a second maximum value.

15. An apparatus for coding a transform block having transform coefficients, the apparatus comprising:

a processor configured to:

code values of the transform coefficients of the transform block indicative of magnitudes of the transform coefficients in a reverse scan order, the coding comprising:

determining a first coding context using at least some of the stored values from the register array,

entropy coding a first value of the transform coefficient using the first coding context, the first value indicating a magnitude of the transform coefficient and the first value belonging to a set of positive integers {0, …, first maximum },

determining a second coding context using at least some of the stored values from the register array,

entropy coding a second value of the transform coefficient using the second coding context, the second value indicating the magnitude of the transform coefficient, the second value belonging to a set of positive integers {0, …, second maximum }, and the second maximum being greater than the first maximum, and

updating the register array after entropy coding the first value and the second value.

16. The apparatus of claim 15, wherein determining the first coding context comprises:

summing the respective stored values from each of the register arrays to generate a first amplitude value, wherein each stored value is limited to a first maximum value upon summing;

normalizing the first amplitude value; and

determining the first coding context using the normalized first amplitude value, and wherein determining the second coding context comprises:

summing corresponding stored values from less than all of the register arrays to generate a second amplitude value;

normalizing the second amplitude value; and

determining the second coding context using the normalized second amplitude value.

17. The apparatus of claim 15, wherein:

a cardinality of the register array is equal to one plus a number corresponding to a greater of a maximum number of columns or a maximum number of rows of the at least one spatial template;

the register array includes:

18. The apparatus of any of claims 15 to 17, wherein updating the register array comprises:

shifting one or more stored values from an array position at an index within the first register array to an array position at a common index within the second register array; and

setting one or more stored values in the register array equal to the first value when the first value is less than the first maximum value, and otherwise setting the one or more stored values in the register array equal to a sum of the first value and the second value.

19. A method for coding a transform block having transform coefficients, the method comprising:

coding values of the transform coefficients of the transform block indicative of magnitudes of the transform coefficients in a reverse scan order, the coding comprising:

determining a first coding context using at least some of the stored values from the register array;

entropy coding a first value of the transform coefficient using the first coding context, the first value indicating a magnitude of the transform coefficient, and the first value belonging to a set of positive integers {0, …, first maximum };

determining a second coding context using at least some of the stored values from the register array;