WO2004032032A1 - Context-based adaptive variable length coding for adaptive block transforms

Info

Publication number
WO2004032032A1
Authority
WO
WIPO (PCT)
Prior art keywords
transform coefficients
sub
block
coding
image
Application number
PCT/IB2003/003382
Other languages
French (fr)
Inventor
Marta Karczewicz
Justin Ridge
Original Assignee
Nokia Corporation
Nokia Inc.
Application filed by Nokia Corporation, Nokia Inc.
Priority to AU2003253133A (AU2003253133A1)
Priority to KR1020057005733A (KR100751869B1)
Priority to EP03798973A (EP1546995B1)
Priority to CA2498384A (CA2498384C)
Priority to CNB038235951A (CN100392671C)
Priority to JP2004541020A (JP4308138B2)
Publication of WO2004032032A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/12: Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/122: Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
    • H04N19/13: Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N19/134: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136: Incoming video signal characteristics or properties
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H04N19/18: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
    • H04N19/60: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • the present invention is generally related to the field of video coding and compression and, more particularly, to a method and system for context-based adaptive variable length coding.
  • a typical video encoder partitions each frame of the original video sequence into contiguous rectangular regions called "blocks". These blocks are encoded in "intra mode" (I-mode), or in "inter mode" (P-mode).
  • I-mode: intra mode
  • P-mode: inter mode
  • the encoder first searches for a block similar to the one being encoded in a previously transmitted "reference frame", denoted by Fref. Searches are generally restricted to being no more than a certain spatial displacement from the block to be encoded. When the best match, or "prediction", has been identified, it is expressed in the form of a two-dimensional (2D) motion vector (Δx, Δy) where Δx is the horizontal and Δy is the vertical displacement.
  • 2D: two-dimensional
  • the location of a pixel within the frame is denoted by (x, y).
  • the predicted block is formed using spatial prediction from previously encoded neighboring blocks within the same frame.
  • the prediction error i.e. the difference between the block being encoded and the predicted block
  • the prediction error is represented as a set of weighted basis functions of some discrete transform. Transforms are typically performed on an 8x8 or 4x4 block basis. The weights - transform coefficients - are subsequently quantized. Quantization introduces loss of information, thus quantized coefficients have lower precision than the original ones. Quantized transform coefficients and motion vectors are examples of "syntax elements". These, plus some control information, form a complete coded representation of the video sequence.
  • VLC: Variable Length Codes
  • the VLC must be constructed so that the codewords are uniquely decodable, i.e., if the decoder receives a valid sequence of bits of a finite length, there must be only one possible sequence of input symbols that, when encoded, would have produced the received sequence of bits.
  • both encoder and decoder have to use the same set of VLC codewords and the same assignment of symbols to them.
  • the most frequently occurring symbols should be assigned the shortest VLC codewords.
  • the frequency (probability) of different symbols is dependent upon the actual frame being encoded.
  • VLCs: VLC codewords
  • the table selected to encode a particular symbol then depends on the information known both to the encoder and decoder, such as the type of the coded block (I- or P- type block), the component (luma or chroma) being coded, or the quantization parameter (QP) value.
  • the performance depends on how well the parameters used to switch between the VLCs characterize the symbol statistics.
  • the block in the current frame is obtained by first constructing its prediction in the same manner as in the encoder, and by adding to the prediction the compressed prediction error.
  • the compressed prediction error is found by weighting the transform basis functions using the quantized coefficients.
  • the difference between the reconstructed frame and the original frame is called reconstruction error.
  • the compression ratio i.e. the ratio of the number of bits used to represent original sequence and the compressed one, may be controlled by adjusting the value of the quantization parameter (QP) used when quantizing transform coefficients.
  • QP: quantization parameter
  • the compression ratio also depends on the method of entropy coding employed.
  • Coefficients in a given block are ordered (scanned) using zigzag scanning, resulting in a one-dimensional ordered coefficient vector.
  • An exemplary zigzag scan for a 4x4 block is shown in Figure 1.
  • Zigzag scanning presumes that, after applying a two-dimensional (2D) transform, the transform coefficients having the most energy (i.e. higher value coefficients) correspond to low frequency transform functions and are located toward the top-left of the block, as depicted in Figure 1.
  • 2D: two-dimensional
  • the vector of coefficients can be further processed so that each nonzero coefficient is represented by 2 values: a run (the number of consecutive zero coefficients preceding a nonzero value in the vector), and a level (the coefficient's value).
  • CAVLC: Context-based Adaptive VLC
  • JVT coder Joint Final Committee Draft (JFCD) of Joint Video
  • the number of trailing ones is defined as the number of coefficients with a magnitude of one that are encountered before a coefficient with magnitude greater than one is encountered when the coefficient vector is read in reverse order (i.e. 15, 14, 13, 12, 11, ... in Figure 1).
  • the VLC used to code this information is based upon a predicted number of nonzero coefficients, where the prediction is based on the number of nonzero coefficients in previously encoded neighboring blocks (upper and left blocks).
  • the VLC used to encode a run value is selected based upon the sum of the runs from step (4), and the sum of the runs coded so far. For example, if a block has a "sum of runs" of 8, and the first run encoded is 6, then all remaining runs must be 0, 1, or 2. Because the possible run length becomes progressively shorter, more efficient VLC codes are selected to minimize the number of bits required to represent the run.
  • the video server 100 comprises a front-end unit 10, which receives video signals 110 from a video source, and a video multiplex coder 40. Each frame of uncompressed video provided from the video source to the input 110 is received and processed macroblock-by-macroblock in a raster-scan order.
  • the front-end unit 10 comprises a coding control manager 12 to switch between the I-mode and P-mode and to perform timing coordination with the multiplex coder 40 via control signals 120, a DCT (Discrete Cosine Transform) transformation module 16 and a quantizer 14 to provide quantized DCT coefficients.
  • the quantized DCT coefficients 122 are conveyed to the multiplex coder 40.
  • the front-end unit 10 also comprises an inverse quantizer 18 and an inverse transformation unit 20 to perform an inverse block-based discrete cosine transform (IDCT), and a motion compensation prediction and estimation module 22 to reduce the temporal redundancy in video sequences and to provide a prediction error frame for error prediction and compensation purposes.
  • the motion estimation module 22 also provides a motion vector 124 for each macroblock to the multiplex coder 40.
  • the multiplex coder 40 typically comprises a scanning module 42 to perform the zigzag scan for forming an ordered vector for each block of image data, and an entropy coding module to designate non-zero quantized DCT coefficients with run and level parameters.
  • the run and level values are further mapped to a sequence of bins, each of which is assigned to a so-called 'context' by a context assignment module 46.
  • the contexts, along with the motion vector, are formatted into a bitstream 140.
  • a context-based encoder is known in the art.
  • the transformation module 16 is an FFT (Fast Fourier Transform) module or DFT (Discrete Fourier Transform) module, and that DCT can be an approximation of a DCT.
  • a client 200 comprises a video multiplex decoder 60, which receives the encoded video bitstream 140 from the encoder 40.
  • the decoder 60 also decodes an I-mode frame on a macroblock-by-macroblock basis.
  • a coefficient extractor module 62 in the decoder 60 recovers the run and level values, and then reconstructs an array of quantized DCT coefficients 162 for each block of the macroblock.
  • the encoded motion vector information associated with the macroblock is extracted from the encoded video bitstream 140.
  • the extracted motion vector 166, along with the reconstructed quantized DCT coefficients 162 is provided to a back-end unit 80.
  • An inverse quantizer 84 inverse quantizes the quantized DCT coefficients 162 representing the prediction error information for each block of the macroblock and provides the results to an inverse transformer 86. With the control information provided by a coding control manager 82, an array of reconstructed prediction error values for each block of the macroblock is yielded in order to produce video signals 180.
  • H.26L or H.264-to-be
  • JVT: Joint Video Team
  • the image is first subdivided into blocks of 4x4 pixels in size and the blocks are transformed into a 4x4 matrix of transform coefficients.
  • the coefficients are then arranged by scanning them along a zigzag path, wherein the low-frequency coefficients are placed first in the scan in order to form an ordered sequence of transform coefficients - a one-dimensional vector.
  • a 4x4 transform coefficient matrix of Figure 1 will result in a one-dimension array or a sequence of 1, 2, 5, 9, 6, 3, 4, 7, 10, 13, 14, 11, 8, 12, 15, 16.
  • variable-length coding means that not all symbols have the same length (in bits).
  • Huffman coding is an example of variable-length coding.
  • Arithmetic coding is slightly different in that it involves a series of symbols. Thus, it is in general not possible to describe the length of ONE symbol as requiring X bits. Rather, a specific series of symbols will require Y bits. For this reason "entropy coding" is perhaps a more general term than "variable-length coding".
  • Context-based Adaptive VLC may involve partitioning the transform coefficients into blocks that are larger than 4x4.
  • the JVT coder contains a feature called "Adaptive Block Transforms" (ABT) which performs transforms on 4x8, 8x4, and 8x8 blocks.
  • ABT: Adaptive Block Transforms
  • a solution to the problem is to split the larger block into sub-blocks of size 4x4.
  • An existing solution has been proposed, wherein the ABT block of coefficients is divided into 4x4 blocks in the spatial domain.
  • an 8x8 block is shown in Figure 4 with one of the scan orders used for this block in the JVT coder.
  • the same block partitioned into four 4x4 blocks is shown in Figures 5a to 5d.
  • each 4x4 block is zigzag scanned using 4x4 scan, yielding a plurality of vectors of length 16.
  • These length 16 vectors are then passed to the standard 4x4 CAVLC algorithm.
  • if the 4x4 scan shown in Figure 1 is used for the 4x4 blocks in Figures 5a to 5d, the resulting vectors are as given in Figures 6a to 6d.
  • This existing CAVLC algorithm makes certain assumptions about the content of a coefficient vector. When these assumptions are violated, the coding tables (i.e.
  • each of the 4x4 blocks created after partitioning of the ABT block has coefficients corresponding to different frequencies in the ABT transform.
  • the 4x4 block of Figure 5a contains low frequency information (both horizontally and vertically) and therefore most of the high amplitude coefficients.
  • the 4x4 block of Figure 5d contains high frequency information and low amplitude coefficients.
  • the CAVLC algorithm assumes that higher magnitudes generally occur toward the start of the vector, and critically, it assumes that longer runs of zeros will generally occur toward the end of a vector.
  • the 4x4 block of Figure 5d is statistically unlikely to contain many values in the 4x4 block of Figure 5a, and the "outlying" values are likely to have long runs of zeros associated with them. Although the 4x4 block of Figure 5d may contain one or two nonzero coefficients, the locations of those coefficients are mismatched with what
  • the CAVLC method also assumes that the neighboring blocks have a similar number of nonzero coefficients. For blocks that have coefficients corresponding to different frequencies of transform functions, the number of nonzero coefficients varies drastically. That can lead to the wrong choice of the VLC table used to code the number of nonzero coefficients of a given block, since this choice is based on the number of nonzero coefficients of its neighbors.
  • the existing block partitioning scheme is not an optimal solution in terms of coding efficiency and quantization accuracy. It is advantageous and desirable to provide a more efficient method and system for video and image coding, which can be applied to ABT blocks having a general size of (4n)x(4m), where n and m are positive integers equal to or greater than 1.
  • a method of image coding characterized by forming at least a block of transform coefficients from the image data, by scanning the block of transform coefficients for providing a sequence of transform coefficients, by sub-sampling the transform coefficients in the sequence in an interleaved manner for providing a plurality of sub-sampled sequences of transform coefficients, and by coding the sub-sampled sequences of transform coefficients using an entropy encoder.
  • said sub-sampling is carried out prior to or after said coding.
  • the sequence of the transform coefficients has a length of 16nxm, where n and m are positive integers equal to or greater than 1, and each of said sub-sampled sequences of the transform coefficients has a length of 16.
  • a computer program to be used in image coding wherein the coding process comprises the steps of: forming at least a block of transform coefficients from the image data, and scanning the block of transform coefficients for providing a sequence of transform coefficients.
  • the computer program is characterized by an algorithm for sub-sampling the transform coefficients in the sequence in an interleaved manner for providing a plurality of sub-sampled sequences of transform coefficients.
  • the coding process further comprises the step of coding the sub-sampled sequences of transform coefficients using an entropy encoder.
  • the coding process further comprises the step of coding the sequence of transform coefficients using an entropy encoder prior to said sub-sampling.
  • an image encoder for receiving image data and providing a bitstream indicative of the image data.
  • the image encoder is characterized by: means for forming at least a block of transform coefficients from the image data, by means for scanning the block of transform coefficients for forming an ordered sequence of transform coefficients from the block, by a software program for sub-sampling the ordered sequence of transform coefficients in order to form a plurality of sub-sampled sequences of transform coefficients, by means for entropy coding the sub-sampled sequences of transform coefficients for providing signals indicative of the encoded transform coefficients, and by means for providing the bitstream based on the signals.
  • an image coding system comprising a server for providing a bitstream indicative of image data and a client for reconstructing the image data based on the bitstream, wherein the server is characterized by a receiver for receiving signals indicative of the image data, by means for forming at least a block of transform coefficients from the signals, by means for scanning the block of transform coefficients for forming an ordered sequence of transform coefficients from the block, by a software program for sub-sampling the ordered sequence of transform coefficients in order to form a plurality of sub-sampled sequences of transform coefficients, by means for entropy coding the sub-sampled sequences of transform coefficients for providing further signals indicative of the encoded transform coefficients, and by means for providing the bitstream based on the further signals.
  • Figure 1 is an exemplary zigzag scan for a 4x4 block.
  • Figure 2 is a block diagram showing a typical video server, which employs block-based transform coding and motion-compensated prediction.
  • Figure 3 is a block diagram showing a typical video client corresponding to the encoder of Figure 2.
  • Figure 4 is an exemplary zigzag scan for an 8x8 block.
  • Figure 5a is a 4x4 sub-block from the 8x8 block of Figure 4.
  • Figure 5b is another 4x4 sub-block from the 8x8 block of Figure 4.
  • Figure 5c is yet another 4x4 sub-block from the 8x8 block of Figure 4.
  • Figure 5d is the fourth 4x4 sub-block from the 8x8 block of Figure 4.
  • Figure 6a is a one-dimensional array representing a vector, according to the 4x4 block of Figure 5a, to be passed to the 4x4 CAVLC algorithm.
  • Figure 6b is a one-dimensional array representing a vector, according to the 4x4 block of Figure 5b, to be passed to the 4x4 CAVLC algorithm.
  • Figure 6c is a one-dimensional array of coefficients representing a vector, according to the 4x4 block of Figure 5c, to be passed to the 4x4 CAVLC algorithm.
  • Figure 6d is a one-dimensional array representing a vector, according to the 4x4 block of Figure 5d, to be passed to the 4x4 CAVLC algorithm.
  • Figure 7 is a one-dimensional vector representing an ordered sequence of coefficients of an 8x8 block.
  • Figure 8a is a one-dimensional array of coefficients representing the first segmented vector from the original vector, according to the present invention.
  • Figure 8b is a one-dimensional array of coefficients representing the second segmented vector from the original vector, according to the present invention.
  • Figure 8c is a one-dimensional array of coefficients representing the third segmented vector from the original vector, according to the present invention.
  • Figure 8d is a one-dimensional array of coefficients representing the fourth segmented vector from the original vector, according to the present invention.
  • Figure 9 is a block diagram showing an exemplary video server, according to the present invention.
  • Figure 10 is a block diagram showing a video client, according to the present invention, which corresponds to the video encoder of Figure 9.
  • Figure 11a is a 4x4 block sub-sampled from an 8x8 block of transform coefficients.
  • Figure 11b is another 4x4 block sub-sampled from an 8x8 block of transform coefficients.
  • Figure 11c is yet another 4x4 block sub-sampled from an 8x8 block of transform coefficients.
  • Figure 11d is the fourth 4x4 block sub-sampled from an 8x8 block of transform coefficients.
  • the block segmentation method partitions an ABT block (an 8x8 block, a 4x8 or 8x4 block) of transform coefficients into 4x4 blocks, which are encoded using the standard 4x4 CAVLC algorithm.
  • the division of the coefficients among 4x4 blocks is based on the coefficients' energy to ensure that the statistical distribution of coefficients in each 4x4 block is similar.
  • the energy of a coefficient depends on the frequency of the transform function to which it corresponds and can be indicated, for example, by its position in the zigzag scan of the ABT block. As a result of such division, not all the coefficients assigned to a given 4x4 block are spatially adjacent to each other in the ABT block.
  • the method presented in this invention operates on blocks of coefficients produced using a 4x8, 8x4 or 8x8 transform, which have subsequently been scanned in a zigzag pattern (or any other pattern) to produce an ordered vector of coefficients.
  • the goal of zigzag scanning is to pack nonzero coefficients toward the start of the coefficient vector. Effectively, the goal is to arrange the coefficients according to decreasing energy (variance). The actual scan used to accomplish this is of no consequence to this invention, provided the energy is generally decreasing.
  • the algorithm of the present invention segments this vector into N/16 smaller vectors, each of length 16.
  • Each such vector is formed by taking every (N/16)th coefficient from the length N coefficient vector in a sub-sampling process. For example, if the ordered vector contains coefficients labeled c0, c1, c2, ..., c63, then the first segmented vector of length 16 contains c0, c4, c8, c12, ..., c60.
  • the second segmented vector of length 16 contains c1, c5, c9, c13, ..., c61, and so on for the third and fourth vectors.
  • if the ordered vector is represented by a one-dimensional array of 64 coefficients as shown in Figure 7, then the first, second, third and fourth segmented vectors of length 16 are shown, respectively, in Figures 8a - 8d.
  • once the sub-sampled vectors of length 16 are obtained in the described manner, they are encoded using the standard 4x4 CAVLC algorithm.
  • coding of nonzero coefficients relies on the number of nonzero coefficients of the upper and left neighboring 4x4 blocks (see Figures 8a to 8d). Therefore, each of the vectors created by splitting the ABT block is assigned the spatial location of one of the 4x4 blocks created by dividing the ABT block spatially. For example, when the method of the present invention operates on an 8x4 block, the first vector is assigned the upper 4x4 block and the second vector the lower block.
  • the multiplex encoder 242 comprises an interleaving segmentation unit 48 for segmenting an ABT block (a 4nx4m block, with n, m being positive integers equal to or greater than 1) into nxm blocks in an interleaved manner, as illustrated in Figures 8a - 8d.
  • computer software in the interleaving segmentation unit 48 implements an algorithm that segments this ordered vector into nxm smaller vectors, each of which has a length of 16.
  • Each such vector is formed by taking every (nxm)th coefficient from the ordered coefficient vector of length N.
  • the bitstream 142 is indicative of the contexts of the nxm segmented vectors.
  • a vector assembling unit 66 which has a computer program with an algorithm for regrouping the coefficients in nxm segmented vectors into an ordered vector of length N.
  • the allocation pattern can be determined from other parameters used in the coding of the image. What is essential here is that both the encoder and the decoder use the same allocation pattern, since otherwise the coded image cannot be decoded properly.
  • the DC coefficient can be coded differently and separately. However, in order to ensure that the existing 4x4 CAVLC is unchanged, the DC coefficient is not treated any differently than the 3 lowest-frequency AC values. Treating the DC coefficient separately would mostly result in a benefit when there are very few coefficients in the block (for example, for an 8x8 block, three out of four 4x4 blocks are empty). In this case, it may be desirable to exclude the DC term from the prediction of number of non-zero values. However, the benefit may not be significant in general.
  • the distance/cost metric intrinsic to a coefficient's position in the scan can be used to determine which 4x4 block that coefficient is allocated to. For example, a cost pattern of (0 0 0 0 1 1 1 1 1 2 2 2 2 3 3 3 ! can be used for such determining. Alternatively, a cartesian distance such as "0111.42 " can be used.
  • the effect of the allocation algorithm is to create blocks with an equal or approximately equal total cost. As such, the variance of the total cost for each block is taken to be a measure of the similarity.
  • the block selected for the next coefficient in the scan is the block with the lowest accumulated cost of coefficients allocated to it so far.
  • an image coding method which comprises the steps of: 1. forming at least a block of transform coefficients for the image data;
  • the method of the present invention as described herein above divides coefficients corresponding to different frequencies of the ABT transform among 4x4 blocks more equally. Therefore the created 4x4 blocks have properties statistically similar to those expected by the CAVLC coder, which leads to increased coding efficiency.
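
The interleaved sub-sampling described in the definitions above (every (N/16)th coefficient of the ordered vector goes to the same length-16 vector, and the decoder regroups them) can be sketched in a few lines. This is only an illustrative sketch: the function names `interleave_split` and `interleave_merge` are hypothetical, integer indices stand in for the coefficients c0..c63, and the CAVLC coding of each segment is not shown.

```python
def interleave_split(coefficients, length=16):
    """Split an ordered coefficient vector of length N into N/length vectors.

    Vector k takes every (N/length)-th coefficient, starting at index k.
    """
    n = len(coefficients)
    if n % length != 0:
        raise ValueError("vector length must be a multiple of %d" % length)
    step = n // length
    return [coefficients[k::step] for k in range(step)]


def interleave_merge(vectors):
    """Regroup the segmented vectors into the original ordered vector."""
    step = len(vectors)
    merged = [None] * (step * len(vectors[0]))
    for k, vector in enumerate(vectors):
        merged[k::step] = vector
    return merged


# Indices 0..63 stand in for the coefficients c0..c63 of a scanned 8x8 block.
ordered = list(range(64))
segments = interleave_split(ordered)
```

With a 64-element input, `segments[0]` holds indices 0, 4, 8, ..., 60 and `segments[1]` holds 1, 5, 9, ..., 61, matching the c0, c4, c8, ... and c1, c5, c9, ... example in the text, and `interleave_merge(segments)` restores the original order.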

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Compression Of Band Width Or Redundancy In Fax (AREA)

Abstract

A method and system for coding an image using context-based adaptive VLC where transform coefficients are partitioned into blocks having a block dimension of 4nx4m (with n, m being positive integers equal to or greater than 1). Each block is scanned in a zigzag manner to produce an ordered vector of coefficients having a length of 16nxm. The ordered vector is sub-sampled in an interleaved manner to produce nxm sub-sampled sequences of transform coefficients prior to encoding the transform coefficients using an entropy encoder.

Description

CONTEXT-BASED ADAPTIVE VARIABLE LENGTH CODING FOR ADAPTIVE BLOCK TRANSFORMS
Field of the Invention
The present invention is generally related to the field of video coding and compression and, more particularly, to a method and system for context-based adaptive variable length coding.
Background of the Invention
A typical video encoder partitions each frame of the original video sequence into contiguous rectangular regions called "blocks". These blocks are encoded in "intra mode" (I- mode), or in "inter mode" (P-mode). For P-mode, the encoder first searches for a block similar to the one being encoded in a previously transmitted "reference frame", denoted by Fref. Searches are generally restricted to being no more than a certain spatial displacement from the block to be encoded. When the best match, or "prediction", has been identified, it is expressed in the form of a two-dimensional (2D) motion vector (Δx, Δy) where Δx is the horizontal and Δy is the vertical displacement. The motion vectors together with the reference frame are used to construct a predicted block Fpred'.
Fpred(x, y) = Fref(x + Δx, y + Δy)
The location of a pixel within the frame is denoted by (x, y).
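The prediction formula can be illustrated with a minimal sketch. The function name `predict_block`, the frame layout (a list of rows, so a pixel at (x, y) is `ref[y][x]`) and the sample frame are assumptions made for illustration only; the sketch also assumes the displaced block lies entirely inside the reference frame.

```python
def predict_block(ref, x0, y0, width, height, dx, dy):
    """Build Fpred(x, y) = Fref(x + dx, y + dy) for one block.

    (x0, y0) is the top-left pixel of the block being encoded and
    (dx, dy) is its motion vector. No boundary handling is done.
    """
    return [
        [ref[y + dy][x + dx] for x in range(x0, x0 + width)]
        for y in range(y0, y0 + height)
    ]


# Hypothetical 6x6 reference frame whose pixel value encodes its position.
ref_frame = [[10 * y + x for x in range(6)] for y in range(6)]
pred = predict_block(ref_frame, x0=1, y0=1, width=2, height=2, dx=2, dy=1)
```

Here the 2x2 block at (1, 1) is predicted from the reference pixels displaced two columns to the right and one row down.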
For blocks encoded in I-mode, the predicted block is formed using spatial prediction from previously encoded neighboring blocks within the same frame. For both I-mode and P-mode, the prediction error, i.e. the difference between the block being encoded and the predicted block, is represented as a set of weighted basis functions of some discrete transform. Transforms are typically performed on an 8x8 or 4x4 block basis. The weights - transform coefficients - are subsequently quantized. Quantization introduces loss of information, thus quantized coefficients have lower precision than the original ones. Quantized transform coefficients and motion vectors are examples of "syntax elements". These, plus some control information, form a complete coded representation of the video sequence. Prior to transmission from the encoder to the decoder, all syntax elements are entropy coded, thereby further reducing the number of bits needed for their representation. Entropy coding is a lossless operation aimed at minimizing the number of bits required to represent transmitted or stored symbols (in our case syntax elements) by utilizing properties of their distribution (some symbols occur more frequently than others). One method of entropy coding employed by video coders is Variable Length Codes (VLC). A VLC codeword, which is a sequence of bits (0's and 1's), is assigned to each symbol. The VLC is constructed so that the codeword lengths correspond to how frequently the symbol represented by the codeword occurs, e.g. more frequently occurring symbols are represented by shorter VLC codewords. Moreover, the VLC must be constructed so that the codewords are uniquely decodable, i.e., if the decoder receives a valid sequence of bits of a finite length, there must be only one possible sequence of input symbols that, when encoded, would have produced the received sequence of bits.
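As a small illustration of a uniquely decodable VLC, the sketch below uses a prefix-free code in which no codeword is the beginning of another. The table `VLC_TABLE` and the symbols "a" to "d" are hypothetical and are not taken from any standard.

```python
# Hypothetical prefix-free table: frequent symbols get shorter codewords.
VLC_TABLE = {"a": "0", "b": "10", "c": "110", "d": "111"}


def vlc_encode(symbols, table=VLC_TABLE):
    """Concatenate the codeword of each symbol into one bit string."""
    return "".join(table[s] for s in symbols)


def vlc_decode(bits, table=VLC_TABLE):
    """Decode a bit string; prefix-freeness makes the result unique."""
    inverse = {code: sym for sym, code in table.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # a complete codeword has been read
            symbols.append(inverse[current])
            current = ""
    if current:
        raise ValueError("bitstream ends inside a codeword")
    return symbols
```

Because the decoder can recognize the end of each codeword as soon as its last bit arrives, no separators are needed between codewords.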
To correctly decode the bitstream, both encoder and decoder have to use the same set of VLC codewords and the same assignment of symbols to them. As discussed earlier, to maximize the compression, the most frequently occurring symbols should be assigned the shortest VLC codewords. However, the frequency (probability) of different symbols is dependent upon the actual frame being encoded. In the case where a single set of VLC codewords and a constant assignment of symbols to those codewords is used, it is likely that the probability distribution of symbols within a given frame will differ from the probabilities assumed by the VLC, even though the average symbol probability across the entire sequence may not. Consequently, using a single set of VLC codewords and a single assignment of symbols to those codewords reduces coding efficiency.
To rectify this problem, different methods of adaptation are used. One approach, which offers reasonable computational complexity and a good compression versus efficiency trade-off, and which is currently used in state-of-the-art video coders, is now described. For a set of symbols, a number of tables specifying VLC codewords (VLCs) are provided for the encoder and the decoder to use. The table selected to encode a particular symbol then depends on information known to both the encoder and the decoder, such as the type of the coded block (I- or P-type block), the component (luma or chroma) being coded, or the quantization parameter (QP) value. The performance depends on how well the parameters used to switch between the VLCs characterize the symbol statistics.
In the decoder, the block in the current frame is obtained by first constructing its prediction in the same manner as in the encoder, and by adding to the prediction the compressed prediction error. The compressed prediction error is found by weighting the transform basis functions using the quantized coefficients. The difference between the reconstructed frame and the original frame is called reconstruction error.
The compression ratio, i.e. the ratio of the number of bits used to represent the original sequence to the number used to represent the compressed one, may be controlled by adjusting the value of the quantization parameter (QP) used when quantizing transform coefficients. The compression ratio also depends on the method of entropy coding employed.
Coefficients in a given block are ordered (scanned) using zigzag scanning, resulting in a one-dimensional ordered coefficient vector. An exemplary zigzag scan for a 4x4 block is shown in Figure 1. Zigzag scanning presumes that, after applying 2 dimensional (2D) transform, the transform coefficients having most energy (i.e. higher value coefficients) correspond to low frequency transform functions and are located toward the top-left of the block as it is depicted in Figure 1. Thus, in a coefficient vector produced through zigzag scanning, the higher magnitude coefficients are most likely to appear toward the start of the vector. After quantization most of the low energy coefficients become equal to 0.
The vector of coefficients can be further processed so that each nonzero coefficient is represented by 2 values: a run (the number of consecutive zero coefficients preceding a nonzero value in the vector), and a level (the coefficient's value).
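The run/level representation can be sketched as follows. The function name `run_level_pairs` and the sample vector are illustrative assumptions; trailing zeros after the last nonzero coefficient produce no pair.

```python
def run_level_pairs(coeffs):
    """Convert a scanned coefficient vector into (run, level) pairs.

    run   = number of consecutive zeros preceding a nonzero coefficient
    level = the value of that nonzero coefficient
    """
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    return pairs


# Hypothetical length-16 vector produced by a zigzag scan.
example = [7, 0, 0, -2, 1, 0, 0, 0, 1] + [0] * 7
pairs = run_level_pairs(example)
```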
CAVLC (Context-based Adaptive VLC) is the method of coding transform coefficients used in the JVT coder "Joint Final Committee Draft (JFCD) of Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC)". In summary, encoding a single 4x4 block using CAVLC involves five steps:
1. Encoding the total number of nonzero coefficients in the block, combined with the number of "trailing ones". The number of trailing ones is defined as the number of coefficients with a magnitude of one that are encountered before a coefficient with magnitude greater than one is encountered when the coefficient vector is read in reverse order (i.e. 15, 14, 13, 12, 11, ... in Figure 1). The VLC used to code this information is based upon a predicted number of nonzero coefficients, where the prediction is based on the number of nonzero coefficients in previously encoded neighboring blocks (upper and left blocks).
2. Encoding the sign of any trailing ones.
3. Encoding the levels (magnitudes) of nonzero coefficients other than the trailing ones.
4. Encoding the number of zero values in the coefficient vector before the last nonzero coefficient, i.e. the sum of all the "runs". The VLC used when coding this value depends upon the total number of nonzero coefficients in the block, since there is some relationship between these two values.
5. Encoding the run that occurs before each nonzero coefficient, starting from the last nonzero value in the coefficient vector.
The VLC used to encode a run value is selected based upon the sum of the runs from step (4), and the sum of the runs coded so far. For example, if a block has a "sum of runs" of 8, and the first run encoded is 6, then all remaining runs must be 0, 1, or 2. Because the possible run length becomes progressively shorter, more efficient VLC codes are selected to minimize the number of bits required to represent the run.
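The quantities coded in the five steps can be derived from a coefficient vector as in the sketch below. It follows only the definitions given in the text and does not reproduce the codeword tables or any further constraints of the JVT specification; the names `cavlc_symbols` and `run_upper_bounds` are hypothetical.

```python
def cavlc_symbols(coeffs):
    """Derive the values coded in steps 1 to 5 from a scanned vector."""
    positions = [i for i, c in enumerate(coeffs) if c != 0]
    total_coeffs = len(positions)

    # Step 1: trailing ones, counted while reading the vector in reverse
    # until a coefficient with magnitude greater than one is met.
    trailing_ones = 0
    for i in reversed(positions):
        if abs(coeffs[i]) != 1:
            break
        trailing_ones += 1

    # Step 4: zeros located before the last nonzero coefficient.
    total_zeros = positions[-1] + 1 - total_coeffs if positions else 0

    # Step 5: run before each nonzero coefficient, last coefficient first.
    runs, previous = [], -1
    for pos in positions:
        runs.append(pos - previous - 1)
        previous = pos
    runs.reverse()

    return {
        "total_coeffs": total_coeffs,
        "trailing_ones": trailing_ones,
        "total_zeros": total_zeros,
        "runs": runs,
    }


def run_upper_bounds(total_zeros, runs):
    """Largest value each run could take, given the runs coded before it."""
    bounds, remaining = [], total_zeros
    for run in runs:
        bounds.append(remaining)
        remaining -= run
    return bounds


symbols = cavlc_symbols([7, 0, 0, -2, 1, 0, 0, 0, 1] + [0] * 7)
```

For the example in the text, `run_upper_bounds(8, [6, 2])` yields `[8, 2]`: once a run of 6 has been coded out of a sum of 8, the next run can be at most 2, which is why a shorter VLC table can be selected.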
A typical block-based video encoder is shown in Figure 2. As shown in Figure 2, the video server 100 comprises a front-end unit 10, which receives video signals 110 from a video source, and a video multiplex coder 40. Each frame of uncompressed video provided from the video source to the input 110 is received and processed macroblock-by-macroblock in a raster-scan order. The front-end unit 10 comprises a coding control manager 12 to switch between the I-mode and P-mode and to perform timing coordination with the multiplex coder 40 via control signals 120, a DCT (Discrete Cosine Transform) transformation module 16 and a quantizer 14 to provide quantized DCT coefficients. The quantized DCT coefficients 122 are conveyed to the multiplex coder 40. The front-end unit 10 also comprises an inverse quantizer 18 and an inverse transformation unit 20 to perform an inverse block-based discrete cosine transform (IDCT), and a motion compensation prediction and estimation module 22 to reduce the temporal redundancy in video sequences and to provide a prediction error frame for error prediction and compensation purposes. The motion estimation module 22 also provides a motion vector 124 for each macroblock to the multiplex coder 40. The multiplex coder 40 typically comprises a scanning module 42 to perform the zigzag scan for forming an ordered vector for each block of image data, and an entropy coding module to designate non-zero quantized DCT coefficients with run and level parameters. The run and level values are further mapped to a sequence of bins, each of which is assigned to a so-called 'context' by a context assignment module 46. The contexts, along with the motion vector, are formatted into a bitstream 140. A context-based encoder is known in the art. Furthermore, it is possible that the transformation module 16 is an FFT (Fast Fourier Transform) module or DFT (Discrete Fourier Transform) module, and that the DCT can be an approximation of a DCT.
A typical decoder is shown in Figure 3. As shown, a client 200 comprises a video multiplex decoder 60, which receives the encoded video bitstream 140 from the encoder 40. The decoder 60 also decodes an I-mode frame on a macroblock-by-macroblock basis. Based on the VLC codewords contained in the bitstream 140, a coefficient extractor module 62 in the decoder 60 recovers the run and level values, and then reconstructs an array of quantized DCT coefficients 162 for each block of the macroblock. The encoded motion vector information associated with the macroblock is extracted from the encoded video bitstream 140. The extracted motion vector 166, along with the reconstructed quantized DCT coefficients 162, is provided to a back-end unit 80. An inverse quantizer 84 inverse quantizes the quantized DCT coefficients 162 representing the prediction error information for each block of the macroblock and provides the results to an inverse transformer 86. With the control information provided by a coding control manager 82, an array of reconstructed prediction error values for each block of the macroblock is yielded in order to produce video signals 180.
Currently, video and still images are typically coded with the help of a block-wise transformation to the frequency domain. Such a coding method is used in the H.26L (or H.264-to-be) standard by the Joint Video Team (JVT). In such a method, the image is first subdivided into blocks of 4x4 pixels in size and the blocks are transformed into a 4x4 matrix of transform coefficients. The coefficients are then arranged by scanning them along a zigzag path, wherein the low-frequency coefficients are placed first in the scan in order to form an ordered sequence of transform coefficients - a one-dimensional vector. A 4x4 transform coefficient matrix of Figure 1 will result in a one-dimensional array or a sequence of 1, 2, 5, 9, 6, 3, 4, 7, 10, 13, 14, 11, 8, 12, 15, 16. This is advantageous because the following step is to code the quantized values of the DCT coefficients by run-length coding, whereby the more probable runs are represented by short codes (Huffman coding or arithmetic coding). Arranged in such a manner, many of the coefficients at the end of the scan usually end up being zero. Thus the coefficients are coded with high efficiency. It is known that variable-length coding means that not all symbols have the same length (in bits). Huffman coding is an example of variable-length coding. Arithmetic coding is slightly different in that it involves a series of symbols. Thus, it is in general not possible to describe the length of ONE symbol as requiring X bits. Rather, a specific series of symbols will require Y bits. For this reason "entropy coding" is perhaps a more general term than "variable-length coding".
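The scan sequence quoted above can be reproduced with a short sketch. It assumes that the 4x4 matrix positions are numbered 1 to 16 row by row, which is consistent with the quoted sequence; the function name `zigzag_order` is hypothetical.

```python
def zigzag_order(rows, cols):
    """Return (row, col) positions along a zigzag path from the top-left.

    Anti-diagonals are visited in turn; odd diagonals run downwards and
    even diagonals run upwards.
    """
    order = []
    for s in range(rows + cols - 1):
        diagonal = [(r, s - r) for r in range(rows) if 0 <= s - r < cols]
        if s % 2 == 0:
            diagonal.reverse()
        order.extend(diagonal)
    return order


# Positions numbered 1..16 in raster order, listed in scan order.
labels = [4 * r + c + 1 for r, c in zigzag_order(4, 4)]
```

Applying the same positions to a matrix of quantized coefficients gives the one-dimensional ordered vector that is subsequently run-length coded.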
The above-described coding scheme is used for producing a block transform of 4x4 pixels. However, Context-based Adaptive VLC (CAVLC) may involve partitioning the transform coefficients into blocks that are larger than 4x4. For example, the JVT coder contains a feature called "Adaptive Block Transforms" (ABT) which performs transforms on 4x8, 8x4, and 8x8 blocks. Thus, the coding scheme designed for 4x4 blocks can no longer be applied. A solution to the problem is to split the larger block into sub-blocks of size 4x4. An existing solution has been proposed, wherein the ABT block of coefficients is divided into 4x4 blocks in the spatial domain. As an example, an 8x8 block is shown in Figure 4 with one of the scan orders used for this block in the JVT coder. The same block partitioned into four 4x4 blocks is shown in Figures 5a to 5d. Subsequently each 4x4 block is zigzag scanned using a 4x4 scan, yielding a plurality of vectors of length 16. These length 16 vectors are then passed to the standard 4x4 CAVLC algorithm. When the 4x4 scan shown in Figure 1 is used for the 4x4 blocks in Figures 5a to 5d, the resulting vectors are as given in Figures 6a to 6d. This existing CAVLC algorithm makes certain assumptions about the content of a coefficient vector. When these assumptions are violated, the coding tables (i.e. the tables specifying which codeword is used to describe which symbol) used by CAVLC are "mismatched". This means that the length of codewords in the table no longer accurately reflects the probability of a symbol, and consequently CAVLC is less efficient. As a result of this existing approach, each of the 4x4 blocks created after partitioning of the ABT block has coefficients corresponding to different frequencies in the ABT transform. For example, the 4x4 block of Figure 5a contains low frequency information (both horizontally and vertically) and therefore most of the high amplitude coefficients.
Likewise, the 4x4 block of Figure 5d contains high frequency information and low amplitude coefficients. The CAVLC algorithm assumes that higher magnitudes generally occur toward the start of the vector, and critically, it assumes that longer runs of zeros will generally occur toward the end of a vector. The 4x4 block of Figure 5d is statistically unlikely to contain many values in the 4x4 block of Figure 5a, and the "outlying" values are likely to have long runs of zeros associated with them. Although the 4x4 block of Figure 5d may contain one or two nonzero coefficients, the locations of those coefficients are mismatched with what CAVLC expects, and consequently coding of that block requires a disproportionately large number of bits.
The CAVLC method also assumes that the neighboring blocks have a similar number of nonzero coefficients. For blocks whose coefficients correspond to different frequencies of the transform functions, the number of nonzero coefficients varies drastically. That can lead to the wrong choice of the VLC table used to code the number of nonzero coefficients of a given block, since this choice is based on the number of nonzero coefficients of its neighbors.
Thus, the existing block partitioning scheme is not an optimal solution in terms of coding efficiency and quantization accuracy. It is advantageous and desirable to provide a more efficient method and system for video and image coding, which can be applied to ABT blocks having a general size of (4n)x(4m) where n and m are positive integers equal to or greater than 1.
Summary of the Invention
It is a primary objective of the present invention to reduce the number of bits required to represent the quantized coefficients that result after application of a block transform larger than 4x4. More precisely, it is aimed at reducing the number of bits required to represent coefficients resulting from a 4x8, 8x4, or 8x8 transform. Moreover, in order to simplify design of the JVT encoder as well as to minimize the memory required by the code implementing JVT, it is desirable that the CAVLC method developed for 4x4 block is used to code 4x8, 8x4, or 8x8 blocks unchanged or with minimal modifications.
The objective can be achieved by partitioning a block larger than 4x4 into a plurality of sub-blocks of size 4x4 using the original vector in an interleaved fashion. Thus, according to the first aspect of the present invention, a method of image coding is characterized by forming at least a block of transform coefficients from the image data, by scanning the block of transform coefficients for providing a sequence of transform coefficients, by sub-sampling the transform coefficients in the sequence in an interleaved manner for providing a plurality of sub-sampled sequences of transform coefficients, and by coding the sub-sampled sequences of transform coefficients using an entropy encoder.
Advantageously, said sub-sampling is carried out prior to or after said coding.
Preferably, the sequence of the transform coefficients has a length of 16nxm, where n and m are positive integers equal to or greater than 1, and each of said sub-sampled sequences of the transform coefficients has a length of 16.
According to the second aspect of the present invention, there is provided a computer program to be used in image coding, wherein the coding process comprises the steps of: forming at least a block of transform coefficients from the image data, and scanning the block of transform coefficients for providing a sequence of transform coefficients. The computer program is characterized by an algorithm for sub-sampling the transform coefficients in the sequence in an interleaved manner for providing a plurality of sub-sampled sequences of transform coefficients.
Advantageously, the coding process further comprises the step of coding the sub-sampled sequences of transform coefficients using an entropy encoder.
Alternatively, the coding process further comprises the step of coding the sequence of transform coefficients using an entropy encoder prior to said sub-sampling.
According to the third aspect of the present invention, there is provided an image encoder for receiving image data and providing a bitstream indicative of the image data. The image encoder is characterized by: means for forming at least a block of transform coefficients from the image data, by means for scanning the block of transform coefficients for forming an ordered sequence of transform coefficients from the block, by a software program for sub-sampling the ordered sequence of transform coefficients in order to form a plurality of sub-sampled sequences of transform coefficients, by means for entropy coding the sub-sampled sequences of transform coefficients for providing signals indicative of the encoded transform coefficients, and by means for providing the bitstream based on the signals.
According to the fourth aspect of the present invention, there is provided an image coding system comprising a server for providing a bitstream indicative of image data and a client for reconstructing the image data based on the bitstream, wherein the server is characterized by a receiver for receiving signals indicative of the image data, by means for forming at least a block of transform coefficients from the signals, by means for scanning the block of transform coefficients for forming an ordered sequence of transform coefficients from the block, by a software program for sub-sampling the ordered sequence of transform coefficients in order to form a plurality of sub-sampled sequences of transform coefficients, by means for entropy coding the sub-sampled sequences of transform coefficients for providing further signals indicative of the encoded transform coefficients, and by means for providing the bitstream based on the further signals.
Brief Description of the Drawings
Figure 1 is an exemplary zigzag scan for a 4x4 block.
Figure 2 is a block diagram showing a typical video server, which employs block-based transform coding and motion-compensated prediction.
Figure 3 is a block diagram showing a typical video client corresponding to the encoder of Figure 2.
Figure 4 is an exemplary zigzag scan for an 8x8 block.
Figure 5a is a 4x4 sub-block from the 8x8 block of Figure 4.
Figure 5b is another 4x4 sub-block from the 8x8 block of Figure 4.
Figure 5c is yet another 4x4 sub-block from the 8x8 block of Figure 4.
Figure 5d is the fourth 4x4 sub-block from the 8x8 block of Figure 4.
Figure 6a is a one-dimensional array representing a vector, according to the 4x4 block of Figure 5a, to be passed to the 4x4 CAVLC algorithm.
Figure 6b is a one-dimensional array representing a vector, according to the 4x4 block of Figure 5b, to be passed to the 4x4 CAVLC algorithm.
Figure 6c is a one-dimensional array of coefficients representing a vector, according to the 4x4 block of Figure 5c, to be passed to the 4x4 CAVLC algorithm.
Figure 6d is a one-dimensional array representing a vector, according to the 4x4 block of Figure 5d, to be passed to the 4x4 CAVLC algorithm.
Figure 7 is a one-dimensional vector representing an ordered sequence of coefficients of an 8x8 block.
Figure 8a is a one-dimensional array of coefficients representing the first segmented vector from the original vector, according to the present invention.
Figure 8b is a one-dimensional array of coefficients representing the second segmented vector from the original vector, according to the present invention.
Figure 8c is a one-dimensional array of coefficients representing the third segmented vector from the original vector, according to the present invention.
Figure 8d is a one-dimensional array of coefficients representing the fourth segmented vector from the original vector, according to the present invention.
Figure 9 is a block diagram showing an exemplary video server, according to the present invention.
Figure 10 is a block diagram showing a video client, according to the present invention, corresponding to the video encoder of Figure 9.
Figure 11a is a 4x4 block sub-sampled from an 8x8 block of transform coefficients.
Figure 11b is another 4x4 block sub-sampled from an 8x8 block of transform coefficients.
Figure 11c is yet another 4x4 block sub-sampled from an 8x8 block of transform coefficients.
Figure 11d is the fourth 4x4 block sub-sampled from an 8x8 block of transform coefficients.
Best Mode to Carry Out the Invention
The block segmentation method, according to the present invention, partitions an ABT block (an 8x8 block, a 4x8 or 8x4 block) of transform coefficients into 4x4 blocks, which are encoded using the standard 4x4 CAVLC algorithm. The division of the coefficients among 4x4 blocks is based on the coefficients' energy to ensure that the statistical distributions of coefficients in each of the 4x4 blocks are similar. The energy of a coefficient depends on the frequency of the transform function to which it corresponds and can, for example, be indicated by its position in the zigzag scan of the ABT block. As a result of such division, not all the coefficients selected for a given 4x4 block are spatially adjacent to each other in the ABT block.
The method presented in this invention operates on blocks of coefficients produced using a 4x8, 8x4 or 8x8 transform, which have subsequently been scanned in a zigzag pattern (or any other pattern) to produce an ordered vector of coefficients. As mentioned earlier, the goal of zigzag scanning is to pack nonzero coefficients toward the start of the coefficient vector. Effectively, the goal is to arrange the coefficients according to decreasing energy (variance). The actual scan used to accomplish this is of no consequence to this invention, provided the energy is generally decreasing.
After zigzag scanning to produce a length N ordered vector of coefficients (N being 64 for an 8x8 block, or 32 for a 4x8 or 8x4 block), the algorithm of the present invention segments this vector into N/16 smaller vectors, each of length 16. Each such vector is formed by taking every (N/16)th coefficient from the length N coefficient vector in a sub-sampling process. For example, if the ordered vector contains coefficients labeled c0, c1, c2, ..., c63, then the first segmented vector of length 16 contains c0, c4, c8, c12, ..., c60. The second segmented vector of length 16 contains c1, c5, c9, c13, ..., c61, and so on for the third and fourth vectors. For example, if the ordered vector is represented by a one-dimensional array of 64 coefficients as shown in Figure 7, then the first, second, third and fourth segmented vectors of length 16 are shown, respectively, in Figures 8a - 8d. After the sub-sampled vectors of length 16 are obtained in the described manner, they are encoded using the standard 4x4 CAVLC algorithm. As written in the CAVLC description, coding of nonzero coefficients relies on the number of nonzero coefficients of the upper and left neighboring 4x4 blocks (See Figures 8a to 8d). Therefore each of the vectors created by splitting the ABT block is assigned the spatial location of one of the 4x4 blocks created by dividing the ABT block spatially. For example, when the method of the present invention operates on an 8x4 block, the first vector is assigned the upper 4x4 block and the second vector the lower block.
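The interleaved sub-sampling described above can be sketched in a few lines. The function name `interleave_split` and its `sub_length` parameter are assumptions made for illustration; the coefficient labels c0 to c63 follow the text.

```python
def interleave_split(vector, sub_length=16):
    """Split a length-N ordered vector into N/sub_length interleaved vectors.

    Sub-vector j takes every (N/sub_length)th coefficient, starting at j.
    """
    count, remainder = divmod(len(vector), sub_length)
    if count == 0 or remainder:
        raise ValueError("vector length must be a positive multiple of sub_length")
    return [vector[j::count] for j in range(count)]


# Ordered vector of an 8x8 block, using the labels from the text.
ordered = ["c%d" % k for k in range(64)]
segments = interleave_split(ordered)
```

For a 4x8 or 8x4 block (N = 32) the same function yields two vectors, holding the even-numbered and odd-numbered coefficients respectively.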
In the method, according to the present invention, where every fourth coefficient is selected as shown in Figures 8a - 8d, one coefficient out of the first ("most significant") four coefficients numbered 0-3 is allocated to each 4x4 block. One coefficient out of the next group of four (numbered 4-7) is allocated to each 4x4 block. The same pattern repeats for the remaining groups of four coefficients. This has the effect of "balancing" the amount of energy in each of the resulting 4x4 blocks. According to our experiments, this algorithm requires an average of 3-5% fewer bits to represent a given video sequence, when compared to the existing solution.
To facilitate the video coding using the vector segmentation method, according to the present invention, a video server 102 as shown in Figure 9 and a video client 202 as shown in Figure 10 can be used. The major difference between the encoder 242, according to the present invention, and the typical encoder 40 (Figure 2) is that the multiplex encoder 242 comprises an interleaving segmentation unit 48 for segmenting an ABT block (a 4nx4m block, with n, m being positive integers equal to or greater than 1) into nxm blocks in an interleaved manner, as illustrated in Figures 8a - 8d. According to the present invention, after the scanning unit 42 produces an ordered vector of coefficients of length N (N = 16nxm), computer software in the interleaving segmentation unit 48 segments this ordered vector into nxm smaller vectors, each of which has a length of 16. Each such vector is formed by taking every (nxm)th coefficient from the ordered coefficient vector of length N. Thus, the bitstream 142 is indicative of the contexts of the nxm segmented vectors.
Likewise, the decoder 262 of the client 202 has a vector assembling unit 66, which has a computer program with an algorithm for regrouping the coefficients in the nxm segmented vectors into an ordered vector of length N.
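The regrouping performed at the decoder is the inverse of the interleaved segmentation. A minimal sketch, assuming the segmented vectors arrive in the same order in which the encoder produced them (the function name `interleave_merge` is hypothetical):

```python
def interleave_merge(sub_vectors):
    """Regroup interleaved sub-vectors into the ordered vector of length N.

    Coefficient i of sub-vector j returns to position i * count + j.
    """
    count = len(sub_vectors)
    length = len(sub_vectors[0])
    if any(len(sub) != length for sub in sub_vectors):
        raise ValueError("all sub-vectors must have the same length")
    merged = [None] * (count * length)
    for j, sub in enumerate(sub_vectors):
        merged[j::count] = sub
    return merged
```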
It should be noted that the algorithm as described in conjunction with Figures 8a to 10 is a specific embodiment of a more general concept. It is possible to assign a number to each position in the length N vector representing its "distance" from the DC (or first) term in the vector. This value should reflect the relative importance of the coefficients in that position. For example, in Figure 1, the selection of whether to encode position 1 or 2 first is nearly arbitrary; therefore they might be assigned the same "distance" or "cost" value.
Ensuring that all blocks possess similar characteristics (i.e. are suited to the CAVLC coder) is then a minimization problem. For each possible allocation pattern, the total "cost" of coefficients in each 4x4 block can be calculated, and the variance across the 4x4 blocks taken. The allocation pattern that minimizes the variance will lead to blocks with the most similar statistical properties.
Mathematically, if P is the set of allocation patterns, then we want to find the pattern $p^*$ such that

$$\sigma_{p^*}^2 = \min_{p \in P} \sigma_p^2$$

where $\sigma_p^2$ is the variance, taken across the segmented vectors, of the total cost $\sum_i d_{i,j}$ of the $j$th segmented vector under allocation pattern $p$, and $d_{i,j}$ is the "cost" of the $i$th coefficient in the $j$th segmented vector. As mentioned above, the allocation pattern described here is one example of an attempt to minimize the "cost variance" between segmented blocks. It should be understood that if the allocation patterns are selected adaptively, information on the allocation pattern that is used at the encoder needs to be transmitted to the decoder.
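To make the "cost variance" concrete, the sketch below evaluates it for two candidate allocation patterns of a length 64 vector. The cost assigned to each position (its index in the scan) is an assumption chosen for illustration, as are the names `cost_variance`, `interleaved` and `contiguous`; the contiguous pattern is only a comparison case and is not the spatial partition of Figures 5a to 5d.

```python
from statistics import pvariance


def cost_variance(pattern, costs):
    """Variance across blocks of the total cost allocated to each block.

    pattern[k] is the block that receives scan position k and
    costs[k] is the cost assigned to that position.
    """
    totals = [0] * (max(pattern) + 1)
    for block, cost in zip(pattern, costs):
        totals[block] += cost
    return pvariance(totals)


costs = list(range(64))                    # assumed cost: scan position
interleaved = [k % 4 for k in range(64)]   # every fourth coefficient
contiguous = [k // 16 for k in range(64)]  # consecutive runs of 16

var_interleaved = cost_variance(interleaved, costs)
var_contiguous = cost_variance(contiguous, costs)
```

Under this assumed cost, the interleaved pattern gives block totals of 480, 496, 512 and 528, whereas the contiguous pattern gives 120, 376, 632 and 888, so the interleaved pattern has by far the smaller variance.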
Alternatively, the allocation pattern can be determined from other parameters used in the coding of the image. What is essential here is that both the encoder and the decoder use the same allocation pattern, since otherwise the coded image cannot be decoded properly.
It should be noted that the DC coefficient can be coded differently and separately. However, in order to ensure that the existing 4x4 CAVLC is unchanged, the DC coefficient is not treated any differently than the 3 lowest-frequency AC values. Treating the DC coefficient separately would mostly result in a benefit when there are very few coefficients in the block (for example, for an 8x8 block, three out of four 4x4 blocks are empty). In this case, it may be desirable to exclude the DC term from the prediction of number of non-zero values. However, the benefit may not be significant in general.
The distance/cost metric intrinsic to a coefficient's position in the scan can be used to determine which 4x4 block that coefficient is allocated to. For example, a cost pattern of "0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 ..." can be used for such determining. Alternatively, a Cartesian distance such as "0 1 1 1.4 2 ..." can be used. The effect of the allocation algorithm is to create blocks with an equal or approximately equal total cost. As such, the variance of the total cost for each block is taken to be a measure of the similarity. The block selected for the next coefficient in the scan is the block with the lowest accumulated cost of coefficients allocated to it so far.
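The allocation rule in the last sentence can be sketched as a greedy loop. The text does not say how ties between blocks with equal accumulated cost are resolved, nor how block sizes are kept at 16; the sketch assumes that ties go to the block holding the fewest coefficients (then the lowest index) and that a full block is no longer eligible. The function name `allocate_by_cost` is hypothetical.

```python
def allocate_by_cost(costs, sub_length=16):
    """Greedily assign each scan position to the block with the lowest
    accumulated cost so far. Returns (pattern, totals)."""
    count = len(costs) // sub_length
    totals = [0] * count   # accumulated cost per block
    sizes = [0] * count    # coefficients held per block
    pattern = []
    for cost in costs:
        # Assumption: only blocks that still have room are candidates.
        candidates = [b for b in range(count) if sizes[b] < sub_length]
        # Assumption: ties broken by fewest coefficients, then by index.
        block = min(candidates, key=lambda b: (totals[b], sizes[b], b))
        pattern.append(block)
        totals[block] += cost
        sizes[block] += 1
    return pattern, totals


# Cost pattern "0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 ..." for a length 64 vector.
step_costs = [k // 4 for k in range(64)]
pattern, totals = allocate_by_cost(step_costs)
```

With this cost pattern and these tie-breaking assumptions, the greedy rule reproduces the every-fourth-coefficient allocation and every block ends with the same total cost.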
It is also possible that, prior to zigzag scanning, a pre-determined sub-sample procedure is used to sub-sample the 8x8 block as shown in Figure 4 into four "interleaved" sub-blocks as shown in Figures 11a - 11d. A zigzag scan is then applied to these sub-blocks in order to produce four ordered vectors of length 16. As such, the result is equivalent to that shown in Figures 8a to 8d. Accordingly, it is possible to provide an image coding method, which comprises the steps of:
1. forming at least a block of transform coefficients for the image data;
2. sub-sampling the transform coefficients in the block in a pre-determined manner for providing a plurality of sub-sampled blocks of transform coefficients;
3. scanning the sub-sampled blocks of transform coefficients for providing a plurality of sub-sampled sequences of transform coefficients, and
4. coding the sub-sampled sequences of transform coefficients using an entropy encoder.
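One way to obtain such a pre-determined sub-sampling is sketched below. The layouts of Figure 4 and Figures 11a to 11d are not reproduced here, so this is only one construction under stated assumptions: a generic zigzag path is used for both block sizes, and each coefficient is placed in its sub-block at the position that the 4x4 scan will visit at the matching step. The names `zigzag_order`, `scan` and `subsample_block` are hypothetical.

```python
def zigzag_order(rows, cols):
    """(row, col) positions along a generic zigzag path (an assumption)."""
    order = []
    for s in range(rows + cols - 1):
        diagonal = [(r, s - r) for r in range(rows) if 0 <= s - r < cols]
        if s % 2 == 0:
            diagonal.reverse()
        order.extend(diagonal)
    return order


def scan(block):
    """Zigzag scan a 2D block into a one-dimensional vector."""
    return [block[r][c] for r, c in zigzag_order(len(block), len(block[0]))]


def subsample_block(block, sub_rows=4, sub_cols=4):
    """Sub-sample a block into interleaved sub-blocks before scanning."""
    rows, cols = len(block), len(block[0])
    count = (rows * cols) // (sub_rows * sub_cols)
    small_scan = zigzag_order(sub_rows, sub_cols)
    subs = [[[None] * sub_cols for _ in range(sub_rows)] for _ in range(count)]
    for k, (r, c) in enumerate(zigzag_order(rows, cols)):
        sub_r, sub_c = small_scan[k // count]
        subs[k % count][sub_r][sub_c] = block[r][c]
    return subs


# Hypothetical 8x8 block whose entries encode their own position.
block = [[8 * r + c for c in range(8)] for r in range(8)]
sub_blocks = subsample_block(block)
vectors = [scan(sub) for sub in sub_blocks]
```

By construction, scanning the four sub-blocks gives the same four vectors as scanning the whole block first and then taking every fourth coefficient.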
The method of the present invention as described herein above divides coefficients corresponding to different frequencies of the ABT transform among 4x4 blocks more equally. Therefore the created 4x4 blocks have properties statistically similar to those expected by the CAVLC coder, which leads to increased coding efficiency.
Thus, although the invention has been described with respect to a preferred embodiment thereof, it will be understood by those skilled in the art that the foregoing and various other changes, omissions and deviations in the form and detail thereof may be made without departing from the scope of this invention.

Claims

What is claimed is:
1. A method of image coding using data indicative of an image, characterized by forming at least a block of transform coefficients from the image data, by scanning the block of transform coefficients for providing a sequence of transform coefficients, by sub-sampling the transform coefficients in the sequence in an interleaved manner for providing a plurality of sub-sampled sequences of transform coefficients, and by coding the sub-sampled sequences of transform coefficients using an entropy encoder.
2. The method according to claim 1, characterized in that said sub-sampling is carried out prior to said coding.
3. The method according to claim 1, characterized in that said coding is carried out prior to said sub-sampling.
4. The method according to any one of claims 1 to 3, characterized in that said sequence of the transform coefficients has a length of 16nxm, where n and m are positive integers equal to or greater than 1.
5. The method according to claim 4, characterized in that each of said sub-sampled sequences of the transform coefficients has a length of 16.
6. The method according to any one of claims 1 to 5, characterized in that said image data is prediction error data.
7. The method according to any one of claims 1 to 5, characterized in that said image data is pixel data.
8. The method according to any one of claims 1 to 7, further characterized by quantizing the transform coefficients into quantized transform coefficients.
9. A computer program to be used in image coding using image data indicative of an image, wherein the coding process comprises the steps of: forming at least a block of transform coefficients from the image data, and scanning the block of transform coefficients for providing a sequence of transform coefficients, said computer program characterized by an algorithm for sub-sampling the transform coefficients in the sequence in an interleaved manner for providing a plurality of sub-sampled sequences of transform coefficients.
10. The computer program according to claim 9, characterized in that the coding process further comprises the step of coding the sub-sampled sequences of transform coefficients using an entropy encoder.
11. The computer program according to claim 9, characterized in that the coding process further comprises the step of coding the sequence of transform coefficients using an entropy encoder prior to said sub-sampling.
12. An image encoder for receiving image data and providing a bitstream indicative of the image data, characterized by: means for forming at least a block of transform coefficients from the image data, by means for scanning the block of transform coefficients for forming an ordered sequence of transform coefficients from the block, by a software program for sub-sampling the ordered sequence of transform coefficients in order to form a plurality of sub-sampled sequences of transform coefficients, by means for entropy coding the sub-sampled sequences of transform coefficients for providing signals indicative of the encoded transform coefficients, and by means for providing the bitstream based on the signals.
13. The image encoder according to claim 12, characterized in that the software program forms the plurality of sub-sampled sequences of transform coefficients prior to the entropy coding means providing the signals indicative of the encoded transform coefficients.
14. The image encoder according to claim 12, characterized in that the entropy coding means provides the signals indicative of the encoded transform coefficients prior to the software program forming the plurality of sub-sampled sequences of transform coefficients.
15. The image encoder according to any one of claims 12 to 14, characterized in that said image data is prediction error data.
16. The image encoder according to any one of claims 12 to 14, characterized in that said image data is pixel data.
17. An image coding system comprising a server for providing a bitstream indicative of image data and a client for reconstructing the image data based on the bitstream, wherein the server is characterized by a receiver for receiving signals indicative of the image data, by means for forming at least a block of transform coefficients from the signals, by means for scanning the block of transform coefficients for forming an ordered sequence of transform coefficients from the block, by a software program for sub-sampling the ordered sequence of transform coefficients in order to form a plurality of sub-sampled sequences of transform coefficients, by means for entropy coding the sub-sampled sequences of transform coefficients for providing further signals indicative of the encoded transform coefficients, and by means for providing the bitstream based on the further signals.
18. The image coding system according to claim 17, characterized in that the software program forms the plurality of sub-sampled sequences of transform coefficients prior to the entropy coding means providing the signals indicative of the encoded transform coefficients.
19. The image coding system according to claim 17, characterized in that the entropy coding means provides the signals indicative of the encoded transform coefficients prior to the software program forming the plurality of sub-sampled sequences of transform coefficients.
20. The image coding system according to any one of claims 17 to 19, characterized in that said image data is prediction error data.
21. The image coding system according to any one of claims 17 to 19, characterized in that said image data is pixel data.
22. A method of image coding using image data indicative of an image, characterized by forming at least a block of transform coefficients from the image data, by sub-sampling the transformation coefficients in the block in an interleaved manner for providing a plurality of sub-sampled blocks of transform coefficients, by scanning the sub-sampled blocks of transform coefficients for providing a plurality of sub-sampled sequences of transform coefficients, and by coding the sub-sampled sequences of transform coefficients using an entropy encoder.
23. A method of image coding using image data indicative of an image, wherein at least a block of transform coefficients is formed from the image data and the block of transformation coefficients is scanned for providing a sequence of transform coefficients located at a plurality of positions in the sequence, wherein the positions include a reference position so that each of said plurality of positions relative to the reference position defines a distance, said method characterized by assigning a cost value to each of the distances, by arranging the transform coefficients in the sequence into a plurality of sub-sequences based on the cost values, and by coding the sub-sequences of transform coefficients using an entropy encoder.
24. The method according to claim 23, wherein each of the sub-sequences has a total cost indicative of a sum of the cost values associated with the transform coefficients in said each sub-sequence, said method characterized in that said arranging is adapted to achieve a minimum in the difference between the total cost of said each sub-sequence and the total cost of each of the other sub-sequences.
PCT/IB2003/003382 2002-10-03 2003-08-19 Context-based adaptive variable length coding for adaptive block transforms WO2004032032A1 (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
AU2003253133A AU2003253133A1 (en) 2002-10-03 2003-08-19 Context-based adaptive variable length coding for adaptive block transforms
KR1020057005733A KR100751869B1 (en) 2002-10-03 2003-08-19 Context-based adaptive variable length coding for adaptive block transforms
EP03798973A EP1546995B1 (en) 2002-10-03 2003-08-19 Context-based adaptive variable length coding for adaptive block transforms
CA2498384A CA2498384C (en) 2002-10-03 2003-08-19 Context-based adaptive variable length coding for adaptive block transforms
CNB038235951A CN100392671C (en) 2002-10-03 2003-08-19 Context-based adaptive variable length coding for adaptive block transforms
JP2004541020A JP4308138B2 (en) 2002-10-03 2003-08-19 Context-based adaptive variable length coding for adaptive block transform

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10/264,279 2002-10-03
US10/264,279 US6795584B2 (en) 2002-10-03 2002-10-03 Context-based adaptive variable length coding for adaptive block transforms

Publications (1)

Publication Number Publication Date
WO2004032032A1 (en) 2004-04-15

Family

ID=32042197

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2003/003382 WO2004032032A1 (en) 2002-10-03 2003-08-19 Context-based adaptive variable length coding for adaptive block transforms

Country Status (10)

Country Link
US (1) US6795584B2 (en)
EP (1) EP1546995B1 (en)
JP (1) JP4308138B2 (en)
KR (1) KR100751869B1 (en)
CN (2) CN101132534B (en)
AU (1) AU2003253133A1 (en)
CA (1) CA2498384C (en)
EG (1) EG23916A (en)
RU (1) RU2330325C2 (en)
WO (1) WO2004032032A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030012286A1 (en) * 2001-07-10 2003-01-16 Motorola, Inc. Method and device for suspecting errors and recovering macroblock data in video coding
US6577251B1 (en) * 2000-04-04 2003-06-10 Canon Kabushiki Kaisha Accessing sub-blocks of symbols from memory

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5790706A (en) * 1996-07-03 1998-08-04 Motorola, Inc. Method and apparatus for scanning of transform coefficients
CN1067204C (en) * 1998-09-18 2001-06-13 清华大学 Global decision method for video frequency coding
WO2002023475A2 (en) * 2000-09-12 2002-03-21 Koninklijke Philips Electronics N.V. Video coding method

Non-Patent Citations (2)

Title
MARPE ET AL.: "Video compression using context based adaptive arithmetic coding", IEEE, 2001, pages 558 - 561, XP001110199 *
See also references of EP1546995A4 *

Cited By (37)

Publication number Priority date Publication date Assignee Title
US8326057B2 (en) 2002-10-08 2012-12-04 Ntt Docomo, Inc. Image encoding method, image decoding method, image encoding apparatus, image decoding apparatus, image encoding program, and image decoding program
EP1696675A3 (en) * 2002-10-08 2009-09-23 NTT DoCoMo INC. Method and apparatus for image encoding and decoding
EP1696675A2 (en) * 2002-10-08 2006-08-30 NTT DoCoMo INC. Method and apparatus for image encoding and decoding
US8422809B2 (en) 2002-10-08 2013-04-16 Ntt Docomo, Inc. Image encoding method, image decoding method, image encoding apparatus, image decoding apparatus, image encoding program, and image decoding program
US8422808B2 (en) 2002-10-08 2013-04-16 Ntt Docomo, Inc. Image encoding method, image decoding method, image encoding apparatus, image decoding apparatus, image encoding program, and image decoding program
US7764842B2 (en) 2002-10-08 2010-07-27 Ntt Docomo, Inc. Image encoding method, image decoding method, image encoding apparatus, image decoding apparatus, image encoding program, and image decoding program
US7916959B2 (en) 2002-10-08 2011-03-29 Ntt Docomo, Inc. Image encoding method, image decoding method, image encoding apparatus, image decoding apparatus, image encoding program, and image decoding program
US8036472B2 (en) 2002-10-08 2011-10-11 Ntt Docomo, Inc. Image encoding method, image decoding method, image encoding apparatus, image decoding apparatus, image encoding program, and image decoding program
JP2010141926A (en) * 2004-07-12 2010-06-24 Sony Corp Decoding method, decoding device, program, and recording medium
JP2010136454A (en) * 2004-07-12 2010-06-17 Sony Corp Decoding method, decoding apparatus, program therefor and recording medium
JP2010119153A (en) * 2004-07-12 2010-05-27 Sony Corp Encoding method, encoder, program for them, and recording medium
US9578331B2 (en) 2007-06-15 2017-02-21 Qualcomm Incorporated Separable directional transforms
US8428133B2 (en) 2007-06-15 2013-04-23 Qualcomm Incorporated Adaptive coding of video block prediction mode
US8488668B2 (en) 2007-06-15 2013-07-16 Qualcomm Incorporated Adaptive coefficient scanning for video coding
US8520732B2 (en) 2007-06-15 2013-08-27 Qualcomm Incorporated Adaptive coding of video block prediction mode
US8571104B2 (en) 2007-06-15 2013-10-29 Qualcomm, Incorporated Adaptive coefficient scanning in video coding
US8619853B2 (en) 2007-06-15 2013-12-31 Qualcomm Incorporated Separable directional transforms
US8483282B2 (en) 2007-10-12 2013-07-09 Qualcomm, Incorporated Entropy coding of interleaved sub-blocks of a video block
US10110894B2 (en) 2010-01-14 2018-10-23 Samsung Electronics Co., Ltd. Method and apparatus for encoding video and method and apparatus for decoding video by considering skip and split order
US9894356B2 (en) 2010-01-14 2018-02-13 Samsung Electronics Co., Ltd. Method and apparatus for encoding video and method and apparatus for decoding video by considering skip and split order
US11128856B2 (en) 2010-01-14 2021-09-21 Samsung Electronics Co., Ltd. Method and apparatus for encoding video and method and apparatus for decoding video by considering skip and split order
US10582194B2 (en) 2010-01-14 2020-03-03 Samsung Electronics Co., Ltd. Method and apparatus for encoding video and method and apparatus for decoding video by considering skip and split order
US10638134B2 (en) 2011-01-12 2020-04-28 Sun Patent Trust Image coding method, image decoding method, image coding apparatus, image decoding apparatus, and image coding and decoding apparatus
US9258558B2 (en) 2011-01-12 2016-02-09 Panasonic Intellectual Property Corporation Of America Image coding method, image decoding method, image coding apparatus, image decoding apparatus, and image coding and decoding apparatus
US9681137B2 (en) 2011-01-12 2017-06-13 Sun Patent Trust Image coding method, image decoding method, image coding apparatus, image decoding apparatus, and image coding and decoding apparatus
US11770536B2 (en) 2011-01-12 2023-09-26 Sun Patent Trust Image coding method, image decoding method, image coding apparatus, image decoding apparatus, and image coding and decoding apparatus
US10015494B2 (en) 2011-01-12 2018-07-03 Sun Patent Trust Image coding method, image decoding method, image coding apparatus, image decoding apparatus, and image coding and decoding apparatus
US8755620B2 (en) 2011-01-12 2014-06-17 Panasonic Corporation Image coding method, image decoding method, image coding apparatus, image decoding apparatus, and image coding and decoding apparatus for performing arithmetic coding and/or arithmetic decoding
US11350096B2 (en) 2011-01-12 2022-05-31 Sun Patent Trust Image coding method, image decoding method, image coding apparatus, image decoding apparatus, and image coding and decoding apparatus
US8687904B2 (en) 2011-01-14 2014-04-01 Panasonic Corporation Image coding method, image coding apparatus, image decoding method, image decoding apparatus, and image coding and decoding apparatus which include arithmetic coding or arithmetic decoding
US11700384B2 (en) 2011-07-17 2023-07-11 Qualcomm Incorporated Signaling picture size in video coding
US9451287B2 (en) 2011-11-08 2016-09-20 Qualcomm Incorporated Context reduction for context adaptive binary arithmetic coding
US9432696B2 (en) 2014-03-17 2016-08-30 Qualcomm Incorporated Systems and methods for low complexity forward transforms using zeroed-out coefficients
US9516345B2 (en) 2014-03-17 2016-12-06 Qualcomm Incorporated Systems and methods for low complexity forward transforms using mesh-based calculations
US10306229B2 (en) 2015-01-26 2019-05-28 Qualcomm Incorporated Enhanced multiple transforms for prediction residual
US10623774B2 (en) 2016-03-22 2020-04-14 Qualcomm Incorporated Constrained block-level optimization and signaling for video coding tools
US11323748B2 (en) 2018-12-19 2022-05-03 Qualcomm Incorporated Tree-based transform unit (TU) partition for video coding

Also Published As

Publication number Publication date
EP1546995B1 (en) 2012-09-19
JP2006501740A (en) 2006-01-12
KR20050052523A (en) 2005-06-02
EP1546995A4 (en) 2006-12-20
JP4308138B2 (en) 2009-08-05
RU2330325C2 (en) 2008-07-27
KR100751869B1 (en) 2007-08-23
CN101132534B (en) 2010-06-02
CN1689026A (en) 2005-10-26
CA2498384A1 (en) 2004-04-15
AU2003253133A1 (en) 2004-04-23
RU2005113308A (en) 2006-01-20
CA2498384C (en) 2011-06-21
EP1546995A1 (en) 2005-06-29
CN101132534A (en) 2008-02-27
US6795584B2 (en) 2004-09-21
US20040066974A1 (en) 2004-04-08
CN100392671C (en) 2008-06-04
EG23916A (en) 2007-12-30

Similar Documents

Publication Publication Date Title
CA2498384C (en) Context-based adaptive variable length coding for adaptive block transforms
JP3679083B2 (en) Image encoding method, image decoding method, image encoding device, image decoding device, image encoding program, image decoding program
KR101622450B1 (en) Video encoding and decoding using transforms
CN107396110B (en) Video data decoding apparatus
KR101947657B1 (en) Method and apparatus for encoding intra prediction information
US8687692B2 (en) Method of processing a video signal
CN107396103B (en) Image decoding method and apparatus, data item encoding device, and storage device
US20090067503A1 (en) Method and apparatus for video data encoding and decoding
US20060232452A1 (en) Method for entropy coding and decoding having improved coding efficiency and apparatus for providing the same
AU2021200431B2 (en) Techniques for high efficiency entropy coding of video data
JP2020005294A (en) Processing method
KR100801967B1 (en) Encoder and decoder for Context-based Adaptive Variable Length Coding, methods for encoding and decoding the same, and a moving picture transmission system using the same
KR101739580B1 (en) Adaptive Scan Apparatus and Method therefor
CN114025166A (en) Video compression method, electronic device and computer-readable storage medium
CN116647673A (en) Video encoding and decoding method and device
KR100460947B1 (en) Device for processing image signal and method thereof
KR20040028318A (en) Image encoding and decoding method and apparatus using spatial predictive coding

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase

Ref document number: 1200500581

Country of ref document: VN

AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
WWE WIPO information: entry into national phase

Ref document number: 2498384

Country of ref document: CA

WWE WIPO information: entry into national phase

Ref document number: 2003798973

Country of ref document: EP

WWE WIPO information: entry into national phase

Ref document number: 2004541020

Country of ref document: JP

Ref document number: 20038235951

Country of ref document: CN

Ref document number: 1020057005733

Country of ref document: KR

ENP Entry into the national phase

Ref document number: 2005113308

Country of ref document: RU

Kind code of ref document: A

WWP WIPO information: published in national office

Ref document number: 1020057005733

Country of ref document: KR

WWP WIPO information: published in national office

Ref document number: 2003798973

Country of ref document: EP