WO2007079782A1 - Quality scalable picture coding with particular transform coefficient scan path - Google Patents

Publication number
WO2007079782A1
Authority
WO
WIPO (PCT)
Prior art keywords
stream
transform coefficient
picture
transform
refinement information
Application number
PCT/EP2006/001293
Other languages
French (fr)
Inventor
Heiko Schwarz
Thomas Wiegand
Tobias Hinz
Original Assignee
Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
Application filed by Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/34 Scalability techniques involving progressive bit-plane based encoding of the enhancement layer, e.g. fine granular scalability [FGS]
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/129 Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/16 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter for a given display mode, e.g. for interlaced or progressive display mode
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • The present invention relates to a video codec supporting quality- or SNR-scalability.
  • JVT: Joint Video Team
  • MPEG: Moving Picture Experts Group
  • VCEG: ITU-T Video Coding Experts Group
  • H.264/MPEG4-AVC, as described in ITU-T Rec. & ISO/IEC 14496-10 AVC, "Advanced Video Coding for Generic Audiovisual Services," version 3, 2005, specifies a hybrid video codec in which macroblock prediction signals are generated either by motion-compensated prediction or by intra-prediction, and both predictions are followed by residual coding.
  • H.264/MPEG4-AVC coding without the scalability extension is referred to as single-layer H.264/MPEG4-AVC coding.
  • Rate-distortion performance comparable to single-layer H.264/MPEG4-AVC means that the same visual reproduction quality is typically achieved at 10% bit-rate.
  • Scalability is considered as a functionality for removal of parts of the bit-stream while achieving an R-D performance at any supported spatial, temporal or SNR resolution that is comparable to single-layer H.264/MPEG4-AVC coding at that particular resolution.
  • The basic design of the scalable video coding can be classified as a layered video codec.
  • the basic concepts of motion-compensated prediction and intra prediction are employed as in H.264/MPEG4-AVC.
  • additional inter-layer prediction mechanisms have been integrated in order to exploit the redundancy between several spatial or SNR layers.
  • SNR scalability is basically achieved by residual quantization, while for spatial scalability, a combination of motion-compensated prediction and oversampled pyramid decomposition is employed.
  • The temporal scalability approach of H.264/MPEG4-AVC is maintained.
  • the coder structure depends on the scalability space that is required by an application.
  • Fig. 10 shows a typical coder structure 900 with two spatial layers 902a, 902b.
  • an independent hierarchical motion-compensated prediction structure 904a,b with layer-specific motion parameters 906a, b is employed.
  • the redundancy between consecutive layers 902a, b is exploited by inter-layer prediction concepts 908 that include prediction mechanisms for motion parameters 906a,b as well as texture data 910a, b.
  • A base representation 912a, b of the input pictures 914a, b of each layer 902a, b is obtained by transform coding 916a, b similar to that of H.264/MPEG4-AVC. The corresponding NAL (Network Abstraction Layer) units contain motion information and texture data; the NAL units of the base representation of the lowest layer, i.e. 912a, are compatible with single-layer H.264/MPEG4-AVC.
  • the reconstruction quality of the base representations can be improved by an additional coding 918a, b of so-called progressive refinement slices; the corresponding NAL units can be arbitrarily truncated in order to support fine granular quality scalability (FGS) or flexible bit-rate adaptation.
  • FGS: fine granular quality scalability
  • the resulting bit-streams output by the base layer coding 916a, b and the progressive SNR refinement texture coding 918a, b of the respective layers 902a,b, respectively, are multiplexed by a multiplexer 920 in order to result in the scalable bit-stream 922.
  • This bit-stream 922 is scalable in time, space and SNR quality.
  • the temporal scalability is provided by using a hierarchical prediction structure.
  • For this purpose, the hierarchical prediction structure of the single-layer H.264/MPEG4-AVC standard may be used without any changes.
  • For spatial and SNR scalability, additional tools have to be added to single-layer H.264/MPEG4-AVC. All three scalability types can be combined in order to generate a bit-stream that supports a large degree of combined scalability.
  • CGS: coarse-grain scalability
  • FGS: fine-granular scalability
  • Each NAL unit for a PR slice represents a refinement signal that corresponds to a bisection of a quantization step size (QP increase of 6).
  • The refinement signal represented by a PR NAL unit refines the transformation coefficients of the transform blocks into which a current picture of the video has been separated.
  • this refinement signal may be used to refine the transformation coefficients within the base layer bit-stream before performing the inverse transform in order to reconstruct the texture of prediction residual used for reconstructing the actual picture by use of a spatial and/or temporal prediction, such as by means of motion compensation.
  • the progressive refinement NAL units can be truncated at any arbitrary point, so that the quality of the SNR base layer can be improved in a fine granular way. Therefore, the coding order of transform coefficient levels has been modified. Instead of scanning the transform coefficients macroblock-by-macroblock, as it is done in (normal) slices, the transform coefficient blocks are scanned in separate paths and in each path, only a few coding symbols for a transform coefficient block are coded. With the exception of the modified coding order, the CABAC entropy coding as specified in H.264/MPEG4-AVC is re-used.
  • The transmission of transform coefficient levels of progressive refinement slices, i.e. the transmission of a quantization refinement level for the transform coefficients within a progressive refinement slice, proceeds in so-called scan cycles.
  • the progressive SNR refinement texture coding means 918 and multiplexer 920 co-operate in order to cause the transmission of the transform coefficient levels of the progressive refinement slices in the order described in the following.
  • Coding means 918 scans all macroblocks of the progressive refinement slice in a specific raster-scan order. Inside each macroblock, the transform blocks are scanned in a further specific scan order. The transform coefficient levels inside each transform block are, in turn, scanned in a further specific zigzag scanning order.
  • the zigzag scanning orders used for scanning the transform coefficient levels inside the transform blocks have a back-and-forth direction parallel to a direction perpendicular to a bisecting line between the horizontal and vertical axis of the transform blocks.
  • Refinement information is selected for transmission and coding for those transform coefficients whose transformation coefficient levels are non-significant when considering the base layer and all intermediate refinement layers up to the current refinement layer, if any.
  • Refinement information for no more than one significant transform coefficient level of a 4x4 transform block and refinement information for no more than four significant transform coefficient levels of an 8x8 transform block are selected for transmission and coding.
  • In the refinement path, refinement information for the other transform coefficients is selected for transmission and coding by use of the same macroblock, block and transform coefficient scanning orders.
  • The refinement information thus provided in the progressive refinement slices can be associated with the transform coefficients by use of the same scanning orders and by the knowledge derived from the base layer bit-stream and any existing intermediate refinement layers.
  • One disadvantage of the above-described scalable extension of the Video Coding Standard H.264/MPEG4-AVC is that a distortion/rate performance of the refinement layers defined by the base layer bit-stream 912a or 912b plus the respective refinement layer bit-streams output by blocks 918a, b up to a specific point within a specific respective refinement layer is very likely to be non-optimal when considering interlaced video source material due to the special characteristic of interlaced source material, i.e.
  • Each frame is composed of two interleaved fields, with the fields either being handled individually like frames (field-coded) or with a decision being made for each macroblock pair as to whether the respective macroblock pair is divided up into two macroblocks in accordance with the membership to the top or bottom field or the membership to the top or bottom half of the macroblock pair area within the frame.
  • an improved coding efficiency in refinement information in fine granular scalability sense may be achieved by scanning the transform coefficients of a predetermined transform coefficient block in a scan order along a zig-zag scanning path having a back-and-forth direction being inclined relative to a direction perpendicular to the bisecting line between the horizontal and vertical axis of the predetermined transformation coefficient block.
  • the scanning path is better adapted for transformation blocks of field macroblocks, i.e.
  • the scanning path better separates between significant transform coefficients, i.e. those whose transformation coefficient levels are significant in accordance with the base layer or any of the intermediate refinement layers, and non-significant transform coefficients.
  • The present invention is based on the finding that the coding efficiency may be enhanced by appending refinement information to a base layer data-stream serially such that the transform coefficients of different transform coefficient blocks of a picture are refined in a scan order along different scanning paths, one of which, for example, has a back-and-forth direction parallel to a direction perpendicular to a bisecting line between the horizontal and vertical axis and the other one of which has a back-and-forth direction inclined relative to a direction perpendicular to the bisecting line.
  • the present invention is advantageous in that it is possible to better adapt the refinement values by which the individual transform coefficients are refined by the refinement information to each other within each scanning path.
  • adaptive entropy encoding of the refinement information is enabled to better adapt its probability estimation to the actual probability distribution of the actual refinement values.
  • Fig. 1: a block diagram of a video encoder according to an embodiment of the present invention;
  • Figs. 2(a) and (b): a flow chart illustrating the steps performed in the encoder of Fig. 1 for refining the transform coefficients when operating according to the structure of Fig. 10;
  • Fig. 3: a schematic illustrating the macroblock scan of a progressive refinement slice inside a scan cycle;
  • Fig. 4: a schematic illustrating the scanning of transform coefficient blocks inside a macroblock in case of (a) scanning of 4x4 luma blocks and (b) scanning of 8x8 luma blocks;
  • Fig. 5: a schematic illustrating the scanning of transform coefficient levels inside (a) a 4x4 block and (b) an 8x8 block;
  • Fig. 6: a flow chart illustrating the difference of the behavior of the encoder of Fig. 1 when operating in accordance with the embodiments of the present invention;
  • Fig. 7: a schematic illustrating an alternative scanning of transform coefficient levels inside (a) a 4x4 block and (b) an 8x8 block in accordance with an embodiment of the present invention;
  • Fig. 8: a schematic illustrating an alternative macroblock scan of a progressive refinement slice inside a scan cycle for coding frames with activated macroblock adaptive frame field coding option in accordance with an embodiment of the present invention;
  • Fig. 9: a flow chart showing the steps performed at decoder side in accordance with the present invention;
  • Fig. 10: a conventional coder structure for scalable video coding.
  • the present invention is described in the following by means of an embodiment with a similar structure to the conventional coder structure of Fig. 10.
  • the video encoder of Fig. 1 representing an embodiment of the present invention is firstly described as operating in accordance with the scalable extension of the H.264/MPEG4-AVC standard having been presented in the introductory portion of this specification with respect to Fig. 10.
  • The actual operation of the encoder of Fig. 1 is illustrated by emphasizing the differences to the mode of operation in accordance with the video structure of Fig. 10.
  • the differences reside in the refinement coding means and the multiplexer.
  • the video coder of Fig. 1 operating as defined in the above-mentioned Joint Drafts supports two spatial layers.
  • the encoder of Fig. 1, which is generally indicated by 100 comprises two layer portions or layers 102a and 102b, among which layer 102b is dedicated for generating that part of the desired scalable bit-stream concerning a coarser spatial resolution, while the other layer 102a is dedicated for supplementing the bit-stream output by layer 102b with information concerning a higher resolution representation of an input video signal 104.
  • Encoder 100 comprises a spatial decimator 106 for spatially decimating the video signal 104 before inputting the resulting spatially decimated video signal 108 into layer 102b.
  • The decimation performed in spatial decimator 106 comprises, for example, decimating the number of pixels for each picture 104a of the original video signal 104 by a factor of 4 by means of discarding every second pixel in column and row directions.
  • The low-resolution layer 102b comprises a motion-compensated prediction block 110b, a base layer coding block 112b and a refinement coding block 114b.
  • the prediction block 110b performs a motion-compensated prediction on pictures 108a of the decimated video signal 108 in order to predict pictures 108a of the decimated video signal 108 from other reference pictures 108a of the decimated video signal 108. For example, for a specific picture 108a, the prediction block 110b generates motion information that indicates as to how this picture may be predicted from other pictures of the video signal 108, i.e. from reference pictures.
  • The motion information may comprise pairs of motion vectors and associated reference picture indices, each pair indicating, for example, how a specific part or macroblock of the current picture is predicted from an indexed reference picture by displacing the respective reference picture by the respective motion vector.
  • Each macroblock may be assigned one or more pairs of motion vectors and reference picture indices.
  • some of the macroblocks of a picture may be intra-predicted, i.e. predicted by use of the information of the current picture.
  • The prediction block 110b may perform a hierarchical motion-compensated prediction on the decimated video signal 108.
  • The prediction block 110b outputs the motion information 116b as well as the prediction residuals of the video texture information 118b representing the differences between the predictors and the actual decimated pictures.
  • The determination of the motion information 116b and the texture information 118b in prediction block 110b is performed such that the resulting encoding of this information by means of the subsequent base layer coding 112b results in a base-representation bit-stream with, preferably, optimum rate-distortion performance.
  • The base layer coding block 112b receives the first motion information 116b and the texture information 118b from block 110b and encodes the information to a base-representation bit-stream 120b.
  • The encoding performed by block 112b comprises a transformation and a quantization of the texture information 118b.
  • The quantization used by block 112b is relatively coarse.
  • The refinement coding block 114b supplements the bit-stream 120b with additional bit-streams for various refinement layers containing information for refining the coarsely quantized transform coefficients representing the texture information in bit-stream 120b.
  • The refinement coding block 114b, in co-operation with the prediction block 110b, could also be able to decide that a specific refinement layer bit-stream 122b should be accompanied by refined motion information 116b.
  • this functionality is not discussed further in the following in order to ease the description of the present invention.
  • The refinement of the residual texture information relative to the base representation 120b or the formerly-output lower refinement layer bit-stream 122b comprises, for example, the encoding of the current quantization error of the transform coefficients, thereby representing the texture information 118b with a finer quantization precision.
  • Both bit-streams 120b and 122b are multiplexed by a multiplexer 124 comprised by encoder 100 in order to insert both bit-streams into the final scalable bit-stream 126 representing the output of encoder 100.
  • Layer 102a substantially operates the same as layer 102b. Accordingly, layer 102a comprises a motion-compensated prediction block 110a, a base layer coding block 112a and a refinement coding block 114a.
  • The prediction block 110a receives the video signal 104 and performs a motion-compensated prediction thereon in order to obtain motion information 116a and texture information 118a.
  • the output motion and texture information 116a and 118a are received by coding block 112a, which encodes this information to obtain the base representation bit-stream 120a.
  • the refinement coding block 114a codes refinements of the quantization error manifesting itself on the base representation 120a by comparing a transformation coefficient of bit-stream 120a and the actual transformation coefficient resulting from the original texture information 118a and, accordingly, outputs refinement-layer bit-streams 122a for various refinement layers.
  • Layer 102a is inter-layer predicted. That is, the prediction block 110a uses information derivable from layer 102b, such as residual texture information, motion information or a reconstructed video signal, as derived from one or more of the bit-streams 120b and 122b in order to pre-predict the high resolution pictures 104a of the video signal 104, thereafter performing the motion-compensated prediction on the pre-prediction residuals, as mentioned above with respect to prediction block 110b relative to the decimated video signal 108. Alternatively, the prediction block 110a uses the information derivable from layer 102b for predicting the motion compensated residual 118a.
  • picture content 104a may be predicted by means of the reconstructed base layer picture.
  • The motion vector(s) 116a output from 110a may be predicted from the corresponding reconstructed base layer motion vector.
  • The motion compensated residual 118a of layer 102a may be predicted from the reconstructed base layer residual for the corresponding picture, which residual is then further processed in blocks 112a, 114a.
  • The mode of operation of the encoder 100 is described in more detail below, with, however, as indicated above, firstly restricting the mode of operation of encoder 100 to that being in accordance with the structure of Fig. 10.
  • the following description of the mode of operation of encoder 100 focuses on the refinement coding performed in means 114a, b, i.e. the refinement of the transformation coefficient levels in consecutive quality levels.
  • Figs. 2(a) and 2(b) show the steps performed by the refinement coding means 114a, b in order to provide the refinement information.
  • With respect to Figs. 6 to 8, the different behavior of encoder 100 of Fig. 1 in accordance with the embodiment of the present invention is described.
  • the refinement coding means 114a, b provide refinement information for refining in consecutive quality stages or levels, the transform coefficients of the base layer bit-stream representing the texture information output by prediction means 110a, b, respectively.
  • The refinement coding means 114a, b consecutively refine a quantization step size from one quality level to the next and determine refinement values that enable refining the respective transform coefficients by adding the respective refinement value to the transform coefficient level in accordance with the immediately-preceding quality level, i.e. the preceding refinement level or the base layer level.
  • the refinement values of the refinement information determined by refinement coding means 114a, b are simply called transform coefficient levels, although the actual refinement values define the offset of the transform coefficient levels between the current quality level and the preceding quality level.
  • the transmission of transform coefficient levels of progressive refinement slices performed by refinement coding means 114a, b proceeds in so-called scan cycles.
  • no more than one significant transform coefficient level of a 4x4 transform block and no more than 4 significant transform coefficient levels of an 8x8 transform block are coded.
  • a transform coefficient level is called significant when its value is not equal to zero.
  • all macroblocks of the progressive refinement slice are scanned in a specific raster-scan order.
  • the transform blocks are scanned in a specific transform block scanning order.
  • the transform coefficients are scanned in specific zigzag scans, as described hereinafter.
  • As shown in Fig. 2(a), at the beginning of a progressive refinement slice, all transform coefficient levels of non-significant transform coefficients are transmitted and coded. In the following description, only luma transform coefficients are considered. However, the progressive refinement coding of chroma transform coefficients is similarly performed.
  • The transform coefficient levels of the non-significant transform coefficients are coded first.
  • The coding of transform coefficient levels for all non-significant transform coefficients of the current progressive refinement slice is complete.
  • The coding of transform coefficient levels for the non-significant transform coefficients is referred to as the significance path.
  • The significance path is shown in Fig. 2(a).
  • The coding of transform coefficient levels for significant transform coefficients is referred to as the refinement path.
  • The refinement path follows the significance path and is shown in Fig. 2(b).
  • a slice refers to a group of macroblocks into which a picture is partitioned.
  • The macroblocks are fixed-sized and cover, for example, a rectangular picture area of 16x16 samples of the luma component and 8x8 samples of each of the two chroma components.
  • a picture may be split into one or several slices. A picture is therefore a collection of one or more slices.
  • slices are self-contained in the sense that given the active sequence and picture parameter sets, the syntax elements can be parsed from the bit-stream and the values of the samples in the area of the picture that the slice represents can be correctly decoded without use of data from other slices provided that utilized reference pictures are identical at encoder and decoder.
  • the progressive refinement slices contain refinement information for the transform coefficients of the transform blocks of macroblocks within the picture area represented by this progressive refinement slice.
  • the progressive refinement slices coincide, for example, with the slices used in the base layer bit-stream 120a, b.
  • The significance path involves several scan cycles. In these scan cycles, all macroblocks of the progressive refinement slice are scanned in a raster-scan order, as depicted in Fig. 3. Inside each macroblock, the transform blocks are scanned as depicted in Figs. 4(a) and (b), respectively, and inside each transform block, the transform coefficient levels are scanned as depicted in Figs. 5(a) and (b), respectively.
  • the refinement coding means 114a, b firstly steps to the first macroblock of the PR slice in macroblock scan order in step 200.
  • Fig. 3 shows an exemplary progressive refinement slice 300 exemplarily comprising a conglomeration of macroblocks 302.
  • the macroblock raster-scan order among the macroblocks 302 inside slice 300 is indicated by consecutively arranged arrows 304.
  • The macroblocks 302 are arranged in rows and columns in a rectangular array, wherein the raster-scan order 304 is defined among the macroblocks 302 such that it begins with the macroblock 302a in the topmost row and leftmost column, then scans all macroblocks within this row up to the macroblock 302b of slice 300 in the rightmost column, then steps to the leftmost macroblock 302c in the next lower row of macroblocks occupied by one of the macroblocks of the PR slice 300, and so on.
  • macroblock 302a is visited.
  • the refinement coding means 114a, b steps to the first transformation block or transform block within the current macroblock 302a in transformation block scanning order.
  • the transformation block scanning order is different for 4x4 and 8x8 transform blocks.
  • the luma samples of a macroblock may either be partitioned in a rectangular 2x2 array of 8x8 blocks or a rectangular 4x4 array of 4x4 blocks.
  • Each 8x8 or 4x4 block is transformed from spatial to spectral domain by means of a two-dimensional transform individually.
  • each macroblock may either be represented by four 8x8 transform blocks or sixteen 4x4 transform blocks representing the texture information of the macroblock within the respective rectangularly arranged portions of this macroblock.
  • Fig. 4(a) shows a macroblock 400a being partitioned into sixteen 4x4 transform blocks 402a arranged in a 4x4 rectangular array, each consisting of 4x4 transform coefficients.
  • the transform block scan order among blocks 402a is indicated by a sequence of arrows 404.
  • In scan order 404, the order among blocks 402a is defined such that they are scanned in quadrants. Firstly, the four blocks 402a in the upper left quadrant of the macroblock 400a are scanned, followed by the ones in the upper right, lower left and then lower right quadrant. Inside each quadrant, the blocks 402a are scanned in the same order, i.e. in the order of the upper left, upper right, lower left and lower right block inside the respective quadrant.
  • In case of a macroblock 400b partitioned into 8x8 blocks, the macroblock is represented by four 8x8 transform blocks 402b arranged in a 2x2 rectangular array, and the scan order among them, indicated by arrows 406, is defined such that the upper left block 402b is followed by the upper right, lower left and lower right block 402b in this order.
  • the upper left transform block 402a,b is visited in step 202.
  • In step 204, the refinement coding means 114a, b steps to the first non-significant transform coefficient in scanning order within the current macroblock.
  • Fig. 5(a) shows the scanning order among the transform coefficients used in step 204 in case of a 4x4 transform block 402a.
  • Fig. 5 (a) shows a rectangular 4x4 array of transform coefficients 450 representing the transform block 402a.
  • the scanning order defined among the transform coefficients 450 is indicated by a sequence of consecutive arrows 452.
  • the scanning order 452 defines a zigzag scan having a back and forth direction 454, which is substantially parallel to a direction perpendicular to a bisecting line 456 between the horizontal axis 458 and vertical axis 460 spanning the spectral domain in which the transform coefficients 450 are arranged.
  • scanning order 452 starts at the upper left transform coefficient 450a representing the DC component and leads to the transform coefficient at the lower right corner 450b representing the highest spectral component in both directions 458 and 460 in a zigzag scan stepping through the other transform coefficients 450 with a back and forth direction 454 and the general step forward direction pointing from transform coefficient 450a to 450b.
  • Fig. 5 (b) shows the scanning path among the transform coefficients of an 8x8 transform block 402b, in which elements corresponding to those of Fig. 5 (a) are indicated with the same reference signs as in Fig. 5 (a), wherein the foregoing description with respect to these elements of Fig. 5 (a) equally applies to Fig. 5 (b).
  • for both a 4x4 transform block and an 8x8 transform block, means 114a, b starts visiting the transform coefficients 450 in the order shown in Fig. 5 (a) or 5 (b), starting with the DC transform coefficient 450a.
  • a visited transform coefficient represents a significant transform coefficient
  • the next transform coefficient 450 inside the respective block 402a or 402b is visited.
  • no next transform coefficient 450 inside the respective block is available, i.e. the last transform coefficient 450b has been reached and is itself a significant transform coefficient.
  • the coding is to proceed with the next transform coefficient block, although not shown in Fig. 2 (a) .
  • a visited transform coefficient in step 204 represents a non-significant transform coefficient, this is the transform coefficient which is stepped to in step 204.
  • The value of the corresponding transform coefficient level is then coded in step 206. Thereafter, in step 208, it is determined whether the value of the transform coefficient level of the current transform coefficient is zero, i.e. whether the current transform coefficient is still non-significant. If the transform coefficient level is equal to zero, the next transform coefficient levels inside the block in scanning order according to Fig. 5 (a) or 5 (b), respectively, are visited until the next non-significant transform coefficient in scanning order is reached. If such a next non-significant transform coefficient in scanning order is available (212), the method steps to step 206, where its level is coded.
  • the coding proceeds with the next transform coefficient block by stepping to the next transform block in transformation block scanning order according to Fig. 4 (a) or (b) , respectively, in step 214. Otherwise, if the transform coefficient level is determined not to be equal to zero in step 208, it is determined in step 216 as to whether the current block represents a 4x4 block. If this is the case, the coding is continued with the next transform coefficient block by jumping to step 214.
  • step 216 If it is determined in step 216 that the current block is not a 4x4 block, but represents an 8x8 block, it is determined in step 218 as to whether the current transform coefficient is the fourth transform coefficient inside the current block and the current scan cycle, for which a transform coefficient level not equal to zero has been coded, wherein a complete scan cycle shall indicate a complete scan of all transformation coefficients inside the current progressive refinement slice. If the result of the determination in step 218 is positive, the coding is continued with the next transform coefficient block by jumping to step 214. Otherwise, the next transform coefficient levels inside the current block are visited until reaching the next non-significant transform coefficient in scanning order, according to Fig. 5 (a) and (b) , respectively by jumping to step 210 wherein, as noted above, when there is no further transform coefficient inside the current block, the coding proceeds with the next transform coefficient block in step 204.
  • the next transformation block or transform coefficient block in step 214 may either be the next transform coefficient block inside the current macroblock, when available, or the first transform coefficient block of the first macroblock inside the next scan cycle, i.e. the first transformation coefficient block of the first macroblock within the progressive refinement slices. In other words, when there is no next transformation block in the current macroblock in transformation block scanning order available
  • the procedure proceeds with step 222, which comprises stepping to the next macroblock in macroblock scan order, as shown in Fig. 3. Otherwise, i.e. if a next transformation block in transformation block scanning order according to Fig. 4 (a) or (b), respectively, is available, the procedure proceeds to step 204. If a next macroblock in macroblock scanning order is available in step 222 (224), the procedure proceeds with step 202. If there is no next macroblock in macroblock scanning order available, the procedure proceeds with step 226, where it is determined whether there are still non-significant transformation coefficients left in the progressive refinement slice, the levels of which have not yet been coded. If so, the procedure proceeds with step 200. Otherwise, the significance path is finished and the procedure proceeds to the refinement path shown in Fig. 2 (b).
  • the coding of transform coefficient levels for any transform coefficient block starts with the first transform coefficient in scanning order that has not yet been visited in any of the previous scan cycles.
  • the first non-significant transform coefficient in scanning order is visited or stepped to, the level of which has not been coded in any of the previous scan cycles.
  • the coding proceeds with the next block in scanning order at step 204.
  • the refinement path begins at step 228 with determining whether variable length coding or arithmetic coding is used for coding the transform coefficient levels of the progressive refinement slice.
  • CABAC context-adaptive binary arithmetic coding scheme
  • CAVLC context-adaptive variable length coding
  • the procedure proceeds with step 230.
  • the first macroblock of the PR slice is visited.
  • step 232 the first transformation block within the current macroblock is visited.
  • step 234 it is determined as to whether the current transformation block is a 4x4 block or not. If this is the case, in step 236, the transform coefficient at the current position within the scanning order 452 is checked to be significant or not, and, if same is significant, its (refined) level is transmitted. Otherwise, nothing is done with respect to this block at this scan cycle.
  • the transmission of levels leads to their binary arithmetic coding into the bit-stream 122a, b.
  • the levels of the significant transform coefficients (if any) within the next four positions within the scanning order of the current block are scanned and transmitted in step 238.
  • after steps 236 and 238, the procedure proceeds to step 240 or 242, respectively, which involve stepping to the next transform block within the current macroblock, if available.
  • steps 234 to 242 define the following procedure when arriving at a transformation block.
  • Starting from the first transform coefficient it is determined as to whether same represents a significant transform coefficient. If this is the case, the transform coefficient level is transmitted.
  • N be the scan index using the scanning pattern of Fig. 5 (a) or (b) , respectively, of the current transform coefficient inside the transform block starting with 1.
  • the procedure continues depending on the transform block size of the current macroblock. If the current transform block is a 4x4 block, the coding proceeds with the next transform coefficient block in scanning order, shown in Fig. 4 (a). Otherwise, if the current transform block is an 8x8 block and N is an integer multiple of 4, i.e.
  • Steps 248 and 250 comprise stepping to the next macroblock in macroblock scan order, according to Fig. 3. If such next macroblock is available (252 and 254, respectively), the procedure loops back to step 232. However, if not, the procedure proceeds with step 256, where it is determined as to whether there are, in case of the 8x8 blocks, quadruples, i.e.
  • the following pseudo code summarizes the procedure, wherein scanIdx is the scan index, 8x8block indexes the 8x8 transform blocks within the current macroblock, 4x4block indexes the 4x4 transform blocks within the current macroblock, scanIdx8x8 indexes the transform coefficients within an 8x8 transform block in scanning order according to Fig. 5 (b), and scanIdx4x4 indexes the transform coefficients within a 4x4 transform block in scanning order according to Fig. 5 (a).
  • if it is determined in step 228 that variable length coding or CAVLC is used as entropy coding mode, i.e. entropy_coding_mode_flag is equal to 0, the procedure proceeds with steps 258 and 260, corresponding to steps 230 and 232. Then, in step 262, the significant transform coefficients within the current transform block are scanned and transmitted in the respective scanning order according to Fig. 5 (a) and (b), respectively, whereupon the procedure proceeds with step 264, involving stepping to the next transformation block. If such next transformation block is available in the current macroblock, the procedure loops back to step 262 (step 266).
  • step 268 the procedure steps to the next macroblock in macroblock scan order according to Fig. 3 wherein, if same is available (270), the procedure loops back to step 260. Otherwise, the procedure ends.
  • the refinement path in case of variable length coding involves, for each macroblock and each transform block inside a macroblock, visiting the transform coefficients inside the transform block in the scanning order of Fig. 5 (a) and (b) , respectively and, for each transform coefficient, transmitting the transform coefficient level when the transform coefficient represents a significant transform coefficient.
  • Fig. 1 operating in accordance with an embodiment of the present invention differs from the functionality, according to Fig. 2 (a) and (b) in the portions indicated below, thereby forming a kind
  • a frame can either be coded as a coded frame or as two coded fields. This is referred to as picture-adaptive frame field coding.
  • a frame of video may be considered to contain two interleaved fields, a top and a bottom field.
  • the top field contains the even-numbered rows 0, 2, ..., H-2, with H being the number of rows of the frame, whereas the bottom field contains the odd-numbered rows starting with the second line of the frame.
  • the frame may be referred to as an interlaced frame or it may otherwise be referred to as a progressive frame.
  • the coding representation in H.264/MPEG4-AVC is primarily agnostic with respect to this video characteristic, i.e. the underlying interlaced or progressive timing of the original captured pictures. Instead, its coding specifies a representation primarily based on geometric concepts, rather than being based on timing.
  • the above-mentioned concept of picture-adaptive frame field coding is also extended to macroblock adaptive frame field coding.
  • When mb_field_decoding_flag is equal to 0, the macroblock pair is coded as a frame macroblock pair with the top macroblock representing the top half of the macroblock pair and the bottom macroblock representing the bottom half of the macroblock pair in the geometrical sense.
  • the motion-compensation prediction and transform coding for both the top and the bottom macroblock is applied as for macroblocks or frames with mb_adaptive_frame_field_coding equal to 0 indicating that macroblock adaptive frame field coding is deactivated and merely frame macroblocks exist.
  • When mb_field_decoding_flag is equal to 1, the macroblock pair represents a field macroblock pair with a top macroblock representing the top field lines of the macroblock pair and the bottom macroblock representing the bottom field lines of the macroblock pair.
  • the top and the bottom macroblock substantially cover the same area of the picture, namely the macroblock pair area.
  • the vertical resolution is twice the horizontal resolution.
  • the motion compensation prediction and the transform coding is performed on a field basis.
  • Macroblocks of coded fields or macroblocks with mb_field_decoding_flag equal to 1 of coded frames are referred to as field macroblocks. Since each transform block of a field macroblock represents an image area with a vertical resolution that is equal to twice the horizontal resolution, it is likely that the distribution of non-zero transform coefficient levels is shifted towards horizontal low frequencies. For a rate-distortion optimized coding, the scanning of transform coefficients inside a transform block is therefore modified for field macroblocks, as illustrated in Fig. 7 (a) and (b), respectively.
  • Fig. 6 summarizes the different behavior of the encoder of Fig. 1 when operating in accordance with an embodiment of the present invention.
  • the refinement coding means 114a, b performs a step 500 of selecting as to which scanning order or path is to be used for the transformation coefficients within a specific transform block in the PR slice.
  • the selection 500 may be performed in advance of the steps shown in Figs. 2 (a) and (b) with respect to all transform blocks.
  • step 500 may be performed at macroblock pair level, macroblock level, transform block level or any combination thereof.
  • step 500 may, in fact, represent several sub-steps being performed in advance of and/or in between the steps of Fig. 2 (a) and (b) .
  • Possible selection criteria are described in the following.
  • the selection 500 as to which transformation coefficient scan order or scan path is to be used in a specific 4x4 transform block is made among the scanning paths shown in Figs. 5 (a) and 7 (a), respectively.
  • the selection 500 as to which transform coefficient scanning order is to be used for a certain 8x8 transform block is made among those shown in Fig. 5(b) and 7 (b) , respectively.
  • the alternative scanning orders for scanning the transform coefficients are also zigzag scans leading from the upper left transform coefficient 450a and 450a' , respectively, representing the DC component to the lower right transform coefficient 550b and 550b' , respectively, representing the spatial spectral component corresponding to the highest spectral components in both directions, horizontal and vertical direction 458, 458', 460, 460'.
  • the back-and-forth direction of a zigzag scan could be determined as the average of the inclinations of all individual arrows with or without weighting the inclinations by the length of the individual arrows.
  • the selection made in step 500 may be indicated in the PR slice NAL unit in step 502 in the form of respective side information. Embodiments using step 502 are described in the following.
  • the refinement coding means 114a, b uses for a specific transform block, that one among the scanning orders of Figs. 5 and 7(a,b), respectively, which is defined by the selection in step 500 and performs, by use of this scanning order or scanning path, the respective operations defined in Fig. 2 (a) and (b) , as indicated in step 504 and Fig. 6.
  • the actual operations of Fig. 2 (a) and (b) modified as indicated in step 504 in Fig. 6 are the steps of 204, 210, 262, 236 and 238.
  • These steps are performed by refinement coding means 114a, b of encoder of Fig. 1 by use of the scanning order or scanning path defined by the selection in step 500, as indicated in step 504.
  • a macroblock is referred to as a field macroblock in this regard when it is a macroblock of a coded field or a macroblock of a coded frame with mb_field_decoding_flag equal to 1; otherwise, the macroblock is referred to as a frame macroblock.
  • the base layer coding means 112a, b decides, while performing the base layer coding, for each macroblock whether same is to be a frame or a field macroblock. Possibly, the base layer coding means 112a, b decides that all macroblocks are to be frame macroblocks, or vice-versa. The base layer coding means 112a, b performs this decision, for example, such that the R/D performance is optimized with respect to the base layer quality.
  • the refinement coding means 114a, b then performs the selection 500 dependent on the decisions of the base layer coding means 112a, b, respectively, with respect to the association of the macroblocks to field or frame macroblocks. Since the side information contained in the base layer bit-stream already indicates which macroblock is a frame macroblock and which macroblock is a field macroblock, step 502 may be omitted.
  • the scanning order of transform coefficients inside a transform block may be specified by syntax elements in step 502 that are present inside the progressive refinement slice syntax.
  • the corresponding syntax elements can be present at a slice level, a macroblock pair level, a macroblock level, block level or any combination thereof.
  • the selection in step 500 in this case would preferably be performed such that the scanning orders of Figs. 5 (a) and (b) would be selected for frame macroblocks, while the scanning orders of Figs. 7 (a) and (b) would be selected for field macroblocks.
  • the selection performed in step 500, in case of selecting the scanning orders independently of the decisions performed by the base layer coding means 112a, b with respect to the division into field and frame macroblocks, could also be performed by solving an R/D optimization problem.
  • the syntax element specifying the scanning order of transform coefficients inside a block could be coded by conditioned entropy codes, whereby the condition depends on whether the co-located block/macroblock/macroblock pair in the base quality layer, i.e. quality_level equal to 0, is located in or represents a field macroblock pair.
  • the encoder of Fig. 1 in accordance with the embodiment of the present invention could use the macroblock scan of Fig. 8 for progressive refinement slices for coding frames and mb_adaptive_frame_field_coding_flag equal to 1.
  • the encoder of Fig. 1 would decide to use the macroblock scan order of Fig. 8 in steps 222, 248, 250 and 268 of Fig.
  • the alternative macroblocks scan of Fig. 8 scans the macroblocks 302 macroblock pair-wise, one macroblock pair 702 being representatively highlighted.
  • scan order 700 scans the macroblock pairs 702 consisting of two vertically-adjacent macroblocks 302 in the same way as the scan order 304 scans the macroblocks 302 macroblock-wise, i.e. row-wise from the top to the bottom.
  • scan order 700 scans the top macroblock 702a first and then the bottom macroblock 702b.
  • step 800 the decoder knows which transform coefficient levels in the base layer are significant and which are not. Thus, the decoder is able to reconstruct the significance path and the refinement path used at the encoder side for the first refinement layer.
  • step 802 the decoder parses the PR slice NAL units.
  • step 802 contains the refinement information for the transform coefficients of the base layer.
  • the result of step 802 is a sequence of refinement levels for the transformation coefficients.
  • the decoder assigns these levels to the transformation coefficients inside the PR slice of the preceding quality layer.
  • the preceding layer is the base layer.
  • the assignment 804 is performed by using the same significance path and refinement path used at the encoder side and having been described above.
  • the decoder refines the transformation coefficients of the preceding quality level by use of the assigned levels, thereby achieving the refined transform coefficients and the actualized quality level, respectively.
  • Steps 802 to 806 may be performed several times in order to enhance the quality step by step, level by level.
  • the decoder retransforms the transform blocks to reconstruct the texture information and then, in step 810, reconstructs the frame or frames based on the reconstructed texture information.
  • the above-described embodiment of the present invention represents an adaptive scanning of transform coefficient levels in progressive refinement slices for enabling efficient fine-granular SNR scalability for interlaced frames.
  • it represents a concept for fine-granular SNR scalable coding of interlaced frames in which the scanning of transform coefficients inside a transform block is adaptively selected.
  • all scalability tools are only specified for the coding of progressive source material; the special characteristic of interlaced source material is not anticipated.
  • the above-presented embodiments extend the coding of progressive refinement slices in a way that the coding efficiency for interlaced sources is improved. This is achieved by adaptively controlling the scanning order of transform coefficients inside a transform block.
  • the above-described embodiments of the present invention could be described as a coding scheme supporting fine granular SNR scalability in which the scanning order of transform coefficients inside a transform block is adaptively selected.
  • the scanning order of transform coefficients could be selected based on whether the co-located macroblock pair in the base quality layer, with identical representation time and dependency_ID and with quality_level equal to 0, represents a field or a frame macroblock pair.
  • the scanning order of Figs. 5 (a) and (b) is chosen when the co-located macroblock pair in the base quality slice represents a frame macroblock pair and the scanning order depicted in Fig.
  • the selection of the scanning order could be transmitted as part of the slice syntax. This includes the transmission of corresponding syntax elements on a slice level, macroblock pair level, macroblock level or block level.
  • the scanning order depicted in Figs. 5 (a) and (b) could be chosen when the co-located macroblock pair in the base quality slice represents a frame macroblock pair, and the scanning order depicted in Figs. 7 (a) and (b) could be chosen when the co-located macroblock pair in the base quality layer represents a field macroblock pair.
  • one or more syntax elements specifying the scanning order of transform coefficients can be coded by conditioned entropy codes, whereby the condition is dependent on as to whether the co-located macroblock or macroblock pair on the base quality layer is coded as frame or field macroblock or macroblock pair, respectively.
  • the selection of the macroblock scan among those shown in Figs. 3 and 8 can be performed based on a sequence level syntax element, such as mb_adaptive_frame_field_flag.
  • the present invention is not necessarily restricted to the usage of the H.264/MPEG4-AVC Standard. Rather, any other transformation-based coding scheme could also be used.
  • the scanning path and the refinement path could be chosen in another way.
  • a significance path could be scanned in a single scan cycle. In this case, during this one scan cycle, all significant and non-significant transform coefficients, respectively, could be selected for transmission and coding, respectively together.
  • it is possible not to differentiate between significant and non-significant transform coefficients i.e. not to firstly code the non-significant coefficients in the significance path followed by then code the significant transform coefficients in the refinement path. Rather, it would be possible to merge the significance path and the refinement path into one path.
  • the present invention is also advantageous in still image processing.
  • the inventive coding scheme can be implemented in hardware or in software. Therefore, the present invention also relates to a computer program, which can be stored on a computer-readable medium such as a CD, a disc or any other data carrier.
  • the present invention is, therefore, also a computer program having a program code which, when executed on a computer, performs the inventive method described in connection with the above figures.
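
As a minimal sketch of the transform block scan orders of Figs. 4 (a) and (b) described above, the following Python code enumerates the blocks of a macroblock quadrant by quadrant. The function names and the (row, column) position convention are illustrative assumptions and do not appear in the specification.

```python
# Order used both among quadrants and inside each quadrant:
# upper left, upper right, lower left, lower right, as (row, column).
QUAD_ORDER = [(0, 0), (0, 1), (1, 0), (1, 1)]


def block_scan_8x8():
    """Positions of the four 8x8 transform blocks in a 2x2 array (Fig. 4 (b))."""
    return list(QUAD_ORDER)


def block_scan_4x4():
    """Positions of the sixteen 4x4 transform blocks in a 4x4 array (Fig. 4 (a)).

    The quadrants are visited in QUAD_ORDER, and the four blocks inside
    each quadrant are visited in the same order.
    """
    order = []
    for quad_row, quad_col in QUAD_ORDER:
        for row, col in QUAD_ORDER:
            order.append((2 * quad_row + row, 2 * quad_col + col))
    return order


if __name__ == "__main__":
    for index, position in enumerate(block_scan_4x4()):
        print(index, position)
```

The nested loop mirrors the wording of the description: the outer loop selects the quadrant, the inner loop the block inside that quadrant.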

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Coding efficiency for refinement information in the fine-granular scalability sense is improved by scanning the transform coefficients of a predetermined transform coefficient block in a scan order along a zig-zag scanning path having a back-and-forth direction that is inclined relative to a bisecting line between the horizontal and vertical axis of the predetermined transformation coefficient block. In particular, by inclining the back-and-forth direction, for example, such that a propagation rate of the zig-zag scan path along the vertical axis is approximately twice a propagation rate of the zig-zag scan path along the horizontal axis in the scan order, the scanning path is better adapted to transformation blocks of field macroblocks, i.e. macroblocks representing an image area with a vertical resolution that is equal to twice the horizontal resolution. With this measure, the scanning path better separates significant transform coefficients, i.e. those whose transformation coefficient levels are significant in accordance with the base layer or any of the intermediate refinement layers, from non-significant transform coefficients.
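
The effect of inclining the back-and-forth direction can be illustrated with a small Python sketch. It groups the coefficients of an n x n block along lines wx*x + wy*y = const and walks these lines alternately in opposite directions. With wx = wy = 1 this reproduces the classical diagonal zig-zag; with wx = 2, wy = 1 the path advances about twice as fast along the vertical axis. The function name, the weights and the resulting inclined path are assumptions made for illustration only; the sketch is not the scanning path of the figures of this document.

```python
def weighted_zigzag(n, wx, wy):
    """Return (x, y) positions of an n x n block in a zig-zag-like order.

    x is the horizontal and y the vertical frequency index. Coefficients
    are grouped along lines wx * x + wy * y = const, and consecutive
    lines are traversed in alternating directions.
    """
    lines = {}
    for y in range(n):
        for x in range(n):
            lines.setdefault(wx * x + wy * y, []).append((x, y))
    order = []
    for k, key in enumerate(sorted(lines)):
        line = sorted(lines[key])  # ascending horizontal index
        order.extend(line if k % 2 == 0 else line[::-1])
    return order


if __name__ == "__main__":
    print("diagonal:", weighted_zigzag(4, 1, 1))
    print("inclined:", weighted_zigzag(4, 2, 1))
```

In the inclined variant, coefficients with a high vertical but low horizontal frequency index are reached earlier than in the diagonal scan, which is the property the abstract attributes to the inclined path.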

Description

QUALITY SCALABLE PICTURE CODING WITH PARTICULAR TRANSFORM COEFFICIENT SCAN PATH
The present invention relates to a video codec supporting quality- or SNR-scalability.
A current project of the Joint Video Team (JVT) of the ISO/IEC Moving Pictures Experts Group (MPEG) and the ITU-T Video Coding Experts Group (VCEG) is the development of a scalable extension of the state-of-the-art video coding standard H.264/MPEG4-AVC defined in ITU-T Rec. & ISO/IEC 14496-10 AVC, "Advanced Video Coding for Generic Audiovisual Services," version 3, 2005. The current working draft as described in J. Reichel, H. Schwarz and M. Wien, eds., "Scalable Video Coding - Joint Draft 4," Joint Video Team, Doc. JVT-Q201, Nice, France, October 2005 and J. Reichel, H. Schwarz and M. Wien, eds., "Joint Scalable Video Model JSVM-4," Joint Video Team, Doc. JVT-Q202, Nice, France, October 2005, supports temporal, spatial and SNR scalable coding of video sequences or any combination thereof.
H.264/MPEG4-AVC as described in ITU-T Rec. & ISO/IEC 14496-10 AVC, "Advanced Video Coding for Generic Audiovisual Services," version 3, 2005, specifies a hybrid video codec in which macroblock prediction signals are generated either by motion-compensated prediction or by intra-prediction, and both predictions are followed by residual coding. H.264/MPEG4-AVC coding without the scalability extension is referred to as single-layer H.264/MPEG4-AVC coding. Rate-distortion performance comparable to single-layer H.264/MPEG4-AVC means that the same visual reproduction quality is typically achieved at a bit-rate within about 10% of the single-layer bit-rate. Given the above, scalability is considered as a functionality for removal of parts of the bit-stream while achieving an R-D performance at any supported spatial, temporal or SNR resolution that is comparable to single-layer H.264/MPEG4-AVC coding at that particular resolution.
The basic design of the scalable video coding (SVC) can be classified as a layered video codec. In each layer, the basic concepts of motion-compensated prediction and intra prediction are employed as in H.264/MPEG4-AVC. However, additional inter-layer prediction mechanisms have been integrated in order to exploit the redundancy between several spatial or SNR layers. SNR scalability is basically achieved by residual quantization, while for spatial scalability, a combination of motion-compensated prediction and oversampled pyramid decomposition is employed. The temporal scalability approach of H.264/MPEG4-AVC is maintained.
In general, the coder structure depends on the scalability space that is required by an application. For illustration, Fig. 10 shows a typical coder structure 900 with two spatial layers 902a, 902b. In each layer, an independent hierarchical motion-compensated prediction structure 904a, b with layer-specific motion parameters 906a, b is employed. The redundancy between consecutive layers 902a, b is exploited by inter-layer prediction concepts 908 that include prediction mechanisms for motion parameters 906a, b as well as texture data 910a, b. A base representation 912a, b of the input pictures 914a, b of each layer 902a, b is obtained by transform coding 916a, b similar to that of H.264/MPEG4-AVC. The corresponding NAL units (NAL - Network Abstraction Layer) contain motion information and texture data; the NAL units of the base representation of the lowest layer, i.e. 912a, are compatible with single-layer H.264/MPEG4-AVC. The reconstruction quality of the base representations can be improved by an additional coding 918a, b of so-called progressive refinement slices; the corresponding NAL units can be arbitrarily truncated in order to support fine granular quality scalability (FGS) or flexible bit-rate adaptation. The resulting bit-streams output by the base layer coding 916a, b and the progressive SNR refinement texture coding 918a, b of the respective layers 902a, b, respectively, are multiplexed by a multiplexer 920 in order to result in the scalable bit-stream 922. This bit-stream 922 is scalable in time, space and SNR quality.
Summarizing, in accordance with the above scalable extension of the Video Coding Standard H.264/MPEG4-AVC, the temporal scalability is provided by using a hierarchical prediction structure. For this hierarchical prediction structure, that of the single-layer H.264/MPEG4-AVC standard may be used without any changes. For spatial and SNR scalability, additional tools have to be added to the single-layer H.264/MPEG4-AVC. All three scalability types can be combined in order to generate a bit-stream that supports a large degree of combined scalability.
For SNR scalability, coarse-grain scalability (CGS) and fine-granular scalability (FGS) are distinguished. With CGS, only selected SNR scalability layers are supported and the coding efficiency is optimized for coarse rate graduations, such as a factor of 1.5-2 from one layer to the next. FGS enables the truncation of NAL units at any arbitrary and possibly byte-aligned point. NAL units represent bit packets, which are serially aligned in order to represent the scalable bit-stream 922 output by multiplexer 920.
In order to support fine-granular SNR scalability, so-called progressive refinement (PR) slices have been introduced. Progressive refinement slices contain refinement information for refining the reconstruction quality available for that slice from the base layer bit-stream 912a, b, respectively. More precisely, each NAL unit for a PR slice represents a refinement signal that corresponds to a bisection of a quantization step size (QP increase of 6). These signals are represented in a way that only a single inverse transform has to be performed for each transform block at the decoder side. In other words, the refinement signal represented by a PR NAL unit refines the transformation coefficients of transform blocks into which a current picture of the video has been separated. At the decoder side, this refinement signal may be used to refine the transformation coefficients within the base layer bit-stream before performing the inverse transform in order to reconstruct the texture of the prediction residual used for reconstructing the actual picture by use of a spatial and/or temporal prediction, such as by means of motion compensation.
The progressive refinement NAL units can be truncated at any arbitrary point, so that the quality of the SNR base layer can be improved in a fine granular way. Therefore, the coding order of transform coefficient levels has been modified. Instead of scanning the transform coefficients macroblock-by-macroblock, as it is done in (normal) slices, the transform coefficient blocks are scanned in separate paths and in each path, only a few coding symbols for a transform coefficient block are coded. With the exception of the modified coding order, the CABAC entropy coding as specified in H.264/MPEG4-AVC is re-used.
The transmission of transform coefficient levels of progressive refinement slices, i.e. the transmission of a quantization refinement level for the transform coefficients within a progressive refinement slice, proceeds in so-called scan cycles. In particular, the progressive SNR refinement texture coding means 918 and multiplexer 920 co-operate in order to cause the transmission of the transform coefficient levels of the progressive refinement slices in the order described in the following. Coding means 918 scans all macroblocks of the progressive refinement slice in a specific raster-scan order. Inside each macroblock, the transform blocks are scanned in a further specific scan order. The transform coefficient levels inside each transform block are, in turn, scanned in a further specific zigzag scanning order. The zigzag scanning orders used for scanning the transform coefficient levels inside the transform blocks have a back-and-forth direction parallel to a direction perpendicular to a bisecting line between the horizontal and vertical axis of the transform blocks. In the first scan cycles, refinement information is selected for transmission and coding only for those transform coefficients whose transformation coefficient levels are non-significant when considering the base layer and all intermediate refinement layers up to the current refinement layer, if existing. In particular, during one scan cycle, refinement information for no more than one significant transform coefficient level of a 4x4 transform block and refinement information for no more than four significant transform coefficient levels of an 8x8 transform block are selected for transmission and coding.
After these first scan cycles, called significance path, in one or more further scan cycles, called refinement path, refinement information for the other transform coefficients is selected for transmission and coding by use of the same macroblock, block and transform coefficient scanning orders. At the decoder side, the refinement information thus provided in the progressive refinement slices is associatable with the transform coefficients by use of the same scanning orders and by the knowledge derived from the base layer bit-stream and the possibly existing intermediate refinement layers.
One disadvantage of the above-described scalable extension of the Video Coding Standard H.264/MPEG4-AVC concerns the distortion/rate performance of the refinement layers, each defined by the base layer bit-stream 912a or 912b plus the respective refinement layer bit-streams output by blocks 918a, b up to a specific point within a specific respective refinement layer. This performance is very likely to be non-optimal for interlaced video source material due to the special characteristic of such material. In interlaced video, each frame is composed of two interleaved fields, and either the fields are individually handled like frames (field-coded) or it is decided macroblock pair-wise whether the respective macroblock pair area is divided up into two macroblocks in accordance with the membership to the top or bottom field or in accordance with the membership to the top or bottom half of the macroblock pair area within the frame.
Thus, it is an object of the present application to provide a coding scheme providing quality scalability allowing for an improved coding efficiency in both cases, interlaced and progressive video material.
This object is achieved by an encoder according to claim 1 or 23, a decoder according to claim 12 or 23, and a method according to any of claims 25 to 28.
The basic idea underlying the present invention is that an improved coding efficiency in refinement information in fine granular scalability sense may be achieved by scanning the transform coefficients of a predetermined transform coefficient block in a scan order along a zig-zag scanning path having a back-and-forth direction being inclined relative to a direction perpendicular to the bisecting line between the horizontal and vertical axis of the predetermined transformation coefficient block. In particular, by inclining the back and forth direction, for example, such that a propagation rate of the zig-zag scan path along the vertical axis is approximately twice a propagation rate of the zig-zag scan path along the horizontal axis in the scan order, the scanning path is better adapted for transformation blocks of field macroblocks, i.e. macroblocks representing an image area with a vertical resolution that is equal to twice the horizontal resolution. In particular, with this measure, the scanning path better separates between significant transform coefficients, i.e. those whose transformation coefficient levels are significant in accordance with the base layer or any of the intermediate refinement layers, and non-significant transform coefficients.
In other words, the present invention is based on the finding that the coding efficiency may be enhanced by appending refinement information to a base layer data-stream serially such that the transform coefficients of different transform coefficient blocks of a picture are refined in a scan order along different scanning paths, one of which, for example, has a back-and-forth direction parallel to a direction perpendicular to a bisecting line between the horizontal and vertical axis and the other one of which has a back-and-forth direction inclined relative to a direction perpendicular to the bisecting line.
In case of the usage of macroblock adaptive frame field coding, the present invention is advantageous in that it is possible to better adapt the refinement values by which the individual transform coefficients are refined by the refinement information to each other within each scanning path. By this measure, adaptive entropy encoding of the refinement information is enabled to better adapt its probability estimation to the actual probability distribution of the actual refinement values.
In the following, preferred embodiments of the present application are described with reference to the Figs. In particular, it is shown in
Fig. 1 a block diagram of a video encoder according to an embodiment of the present invention;
Figs. 2 (a) and (b) a flow chart illustrating the steps performed in the encoder of Fig. 1 for refining the transform coefficients when operating according to the structure of Fig. 10;
Fig. 3 a schematic illustrating the macroblock scan of a progressive refinement slice inside a scan cycle;
Fig. 4 a schematic illustrating the scanning of transform coefficient blocks inside a macroblock in case of (a) scanning of 4x4 luma blocks and in case of (b) scanning of 8x8 luma blocks;
Fig. 5 a schematic illustrating the scanning of transform coefficient levels inside (a) a 4x4 block and in case of (b) an 8x8 block;
Fig. 6 a flow chart illustrating the difference of the behavior of the encoder of Fig. 1 when operating in accordance with the embodiments of the present invention;
Fig. 7 a schematic illustrating an alternative scanning of transform coefficient levels inside (a) a 4x4 block and (b) an 8x8 block in accordance with an embodiment of the present invention;
Fig. 8 a schematic illustrating an alternative macroblock scan of a progressive refinement slice inside a scan cycle for coding frames with activated macroblock adaptive frame field coding option in accordance with an embodiment of the present invention;
Fig. 9 a flow chart showing the steps performed at decoder side in accordance with the present invention; and
Fig. 10 a conventional coder structure for scalable video coding.
The present invention is described in the following by means of an embodiment with a similar structure to the conventional coder structure of Fig. 10. However, in order to more clearly indicate the improvements in accordance with the present invention, the video encoder of Fig. 1 representing an embodiment of the present invention is firstly described as operating in accordance with the scalable extension of the H.264/MPEG4-AVC standard having been presented in the introductory portion of this specification with respect to Fig. 10. Thereafter, the actual operation of the encoder of Fig. 1 is illustrated by emphasizing the differences to the mode of operation in accordance with the coder structure of Fig. 10. As will turn out from this discussion, the differences reside in the refinement coding means and the multiplexer.
The video coder of Fig. 1 operating as defined in the above-mentioned Joint Drafts supports two spatial layers. To this end, the encoder of Fig. 1, which is generally indicated by 100, comprises two layer portions or layers 102a and 102b, among which layer 102b is dedicated to generating that part of the desired scalable bit-stream concerning a coarser spatial resolution, while the other layer 102a is dedicated to supplementing the bit-stream output by layer 102b with information concerning a higher resolution representation of an input video signal 104. Therefore, the video signal 104 to be encoded by encoder 100 is directly input into layer 102a, whereas encoder 100 comprises a spatial decimator 106 for spatially decimating the video signal 104 before inputting the resulting spatially decimated video signal 108 into layer 102b.
The decimation performed in spatial decimator 106 comprises, for example, decimating the number of pixels for each picture 104a of the original video signal 104 by a factor of 4 by means of discarding every second pixel in column and row directions.
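The decimation by a factor of 4 can be illustrated with a minimal sketch. It assumes that a picture is a plain list of pixel rows and that the pixels at even positions are the ones kept; both the function name `decimate_by_two` and this choice of sampling phase are illustrative assumptions and are not taken from the specification.

```python
def decimate_by_two(picture):
    """Keep every second pixel in row and column direction.

    Assumption: pixels at even row/column indices are kept; the
    specification only states that every second pixel is discarded.
    """
    return [row[::2] for row in picture[::2]]


# 4x4 toy picture whose pixel value encodes its position (10 * row + column).
picture = [[10 * r + c for c in range(4)] for r in range(4)]
decimated = decimate_by_two(picture)
print(decimated)
```

The 16 input pixels are reduced to 4 output pixels, i.e. the pixel count drops by the factor of 4 mentioned above.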
The low-resolution layer 102b comprises a motion-compensated prediction block 110b, a base layer coding block 112b and a refinement coding block 114b. The prediction block 110b performs a motion-compensated prediction on pictures 108a of the decimated video signal 108 in order to predict pictures 108a of the decimated video signal 108 from other reference pictures 108a of the decimated video signal 108. For example, for a specific picture 108a, the prediction block 110b generates motion information that indicates how this picture may be predicted from other pictures of the video signal 108, i.e. from reference pictures. In particular, to this end, the motion information may comprise pairs of motion vectors and associated reference picture indices, each pair indicating, for example, how a specific part or macroblock of the current picture is predicted from an indexed reference picture by displacing the respective reference picture by the respective motion vector. Each macroblock may be assigned one or more pairs of motion vectors and reference picture indices. Moreover, some of the macroblocks of a picture may be intra-predicted, i.e. predicted by use of the information of the current picture. In particular, the prediction block 110b may perform a hierarchical motion-compensated prediction on the decimated video signal 108.
The prediction block 110b outputs the motion information 116b as well as the prediction residuals of the video texture information 118b representing the differences between the predictors and the actual decimated pictures 108a. In particular, the determination of the motion information 116b and the texture information 118b in prediction block 110b is performed such that the resulting encoding of this information by means of the subsequent base layer coding 112b results in a base-representation bit-stream with, preferably, optimum rate-distortion performance.
As already described above, the base layer coding block 112b receives the motion information 116b and the texture information 118b from block 110b and encodes the information to a base-representation bit-stream 120b. The encoding performed by block 112b comprises a transformation and a quantization of the texture information 118b. In particular, the quantization used by block 112b is relatively coarse. Thus, in order to enable quality-up scaling of the bit-stream 120b, the refinement coding block 114b supports the bit-stream 120b with additional bit-streams for various refinement layers containing information for refining the coarsely quantized transform coefficients representing the texture information in bit-stream 120b. In this regard, refinement coding block 114b - in co-operation with the prediction block 110b - could also be able to decide that a specific refinement layer bit-stream 122b should be accompanied by refined motion information 116b. However, this functionality is not discussed further in the following in order to ease the description of the present invention. The refinement of the residual texture information relative to the base representation 120b or the formerly-output lower refinement layer bit-stream 122b comprises, for example, the encoding of the current quantization error of the transform coefficients, thereby representing the texture information 118b with a finer quantization precision.
Both bit-streams 120b and 122b are multiplexed by a multiplexer 124 comprised by encoder 100 in order to insert both bit-streams into the final scalable bit-stream 126 representing the output of encoder 100.
Layer 102a substantially operates the same as layer 102b. Accordingly, layer 102a comprises a motion-compensated prediction block 110a, a base layer coding block 112a and a refinement coding block 114a. In conformity with layer 102b, the prediction block 110a receives the video signal 104 and performs a motion-compensated prediction thereon in order to obtain motion information 116a and texture information 118a. The output motion and texture information 116a and 118a are received by coding block 112a, which encodes this information to obtain the base representation bit-stream 120a. The refinement coding block 114a codes refinements of the quantization error manifesting itself on the base representation 120a by comparing a transformation coefficient of bit-stream 120a and the actual transformation coefficient resulting from the original texture information 118a and, accordingly, outputs refinement-layer bit-streams 122a for various refinement layers.
The only difference between layers 102a and 102b is that layer 102a is inter-layer predicted. That is, the prediction block 110a uses information derivable from layer 102b, such as residual texture information, motion information or a reconstructed video signal, as derived from one or more of the bit-streams 120b and 122b in order to pre-predict the high resolution pictures 104a of the video signal 104, thereafter performing the motion-compensated prediction on the pre-prediction residuals, as mentioned above with respect to prediction block 110b relative to the decimated video signal 108. Alternatively, the prediction block 110a uses the information derivable from layer 102b for predicting the motion compensated residual 118a. In this case, for intra blocks, picture content 104a may be predicted by means of the reconstructed base layer picture. For inter blocks 104a, the motion vector(s) 116a output from 110a may be predicted from the corresponding reconstructed base layer motion vector. Moreover, after the motion compensated residual 118a of layer 102a has been determined, same may be predicted from the reconstructed base layer residual for the corresponding picture, the resulting residual then being further processed in blocks 112a, 114a.
In order to illustrate the advantage of the present invention, the mode of operation of the encoder 100 is described in more detail below, with, however, as indicated above, firstly restricting the mode of operation of encoder 100 to that being in accordance with the structure of Fig. 10. In particular, the following description of the mode of operation of encoder 100 focuses on the refinement coding performed in means 114a, b, i.e. the refinement of the transformation coefficient levels in consecutive quality levels. To this end, reference is additionally made to Figs. 2a and 2b showing the steps performed by the refinement coding means 114a, b in order to provide the refinement information. Afterwards, and especially with respect to Figs. 6 to 8, the different behavior of encoder 100 of Fig. 1 in accordance with the embodiment of the present invention is described.
As described above, the refinement coding means 114a, b provide refinement information for refining, in consecutive quality stages or levels, the transform coefficients of the base layer bit-stream representing the texture information output by prediction means 110a, b, respectively. To this end, the refinement coding means 114a, b consecutively refines a quantization step size from one quality level to the next and determines refinement values enabling refining the respective transform coefficients by adding the respective refinement value to the transform coefficient level in accordance with the immediately-preceding quality level, i.e. the preceding refinement level or the base layer level. As will turn out from the following description, the difference between the behavior of the encoder of Fig. 1 in accordance with the embodiments of the present invention and the behavior of the encoder of Fig. 1 when operating in accordance with Fig. 10 resides in the order in which the refinement information within one refinement level is inserted into the scalable bit-stream 126. The "insertion" into the scalable bit-stream 126 may, of course, comprise a coding of the refinement information or the respective sequence of refinement values. For this reason, in the following, the insertion of a respective refinement value or refined level into the scalable bit-stream 126 or the bit-stream 122a, b is sometimes described as a "transmission of the respective level" or a "coding of the respective level". Moreover, in the following description, the refinement values of the refinement information determined by refinement coding means 114a, b are simply called transform coefficient levels, although the actual refinement values define the offset of the transform coefficient levels between the current quality level and the preceding quality level.
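The successive refinement of a single transform coefficient can be sketched as follows. This is a simplified model under stated assumptions, not the quantizer of H.264/MPEG4-AVC: it assumes a uniform quantizer with plain rounding, a step size that is halved from one quality level to the next, and refinement values expressed in units of the current step size. The function name `refine` and the sample numbers are hypothetical.

```python
def refine(coefficient, base_step, num_refinement_layers):
    """Return (step, coded_value, reconstruction) per quality level.

    Assumption: uniform quantization by rounding; the step size is
    halved for every refinement layer.
    """
    level = round(coefficient / base_step)          # base layer level
    reconstruction = level * base_step
    step = base_step
    history = [(step, level, reconstruction)]
    for _ in range(num_refinement_layers):
        step /= 2                                   # finer quantization
        # Quantize the remaining quantization error with the finer step.
        refinement = round((coefficient - reconstruction) / step)
        reconstruction += refinement * step
        history.append((step, refinement, reconstruction))
    return history


history = refine(13.0, 8.0, 2)
for step, value, reconstruction in history:
    print(step, value, reconstruction)
```

In this model, the remaining error after each quality level is at most half of the step size used at that level, which is the sense in which each refinement layer improves the reconstruction.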
Simply speaking, the transmission of transform coefficient levels of progressive refinement slices performed by refinement coding means 114a, b proceeds in so-called scan cycles. In each cycle, no more than one significant transform coefficient level of a 4x4 transform block and no more than 4 significant transform coefficient levels of an 8x8 transform block are coded. A transform coefficient level is called significant when its value is not equal to zero. Inside each scan cycle, all macroblocks of the progressive refinement slice are scanned in a specific raster-scan order. Inside each macroblock, in turn, the transform blocks are scanned in a specific transform block scanning order. Further, inside each 4x4 transform block and inside each 8x8 transform block, the transform coefficients are scanned in specific zigzag scans, as described hereinafter.
Referring now to Fig. 2 (a), at the beginning of a progressive refinement slice, all transform coefficient levels of non-significant transform coefficients are transmitted and coded. In the following description, only luma transform coefficients are considered. However, the progressive refinement coding of chroma transform coefficients is similarly performed. A transform coefficient is called significant when a transform coefficient level unequal to zero has been transmitted for the corresponding transform coefficient either in the base quality layer having, for example, quality level quality_level = 0, or in any subordinate progressive refinement slice having a quality level quality_level less than the quality_level of the current progressive refinement slice.
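The significance definition above can be written down as a small predicate. The sketch assumes that the levels already transmitted for one coefficient are available as a dictionary mapping quality_level to the transmitted level; this data layout and the function name `is_significant` are illustrative assumptions.

```python
def is_significant(levels_by_quality, current_quality_level):
    """True if a non-zero level was transmitted for this coefficient in the
    base layer (quality_level 0) or in any lower refinement slice."""
    return any(
        level != 0
        for quality_level, level in levels_by_quality.items()
        if quality_level < current_quality_level
    )


# Coefficient with a zero base layer level and a non-zero level at quality_level 1.
transmitted = {0: 0, 1: 2}
print(is_significant(transmitted, 1), is_significant(transmitted, 2))
```

For the slice with quality_level 1 the coefficient is still non-significant, whereas for the slice with quality_level 2 it is significant and is therefore handled in the refinement path.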
As described, the transform coefficient levels of the non-significant transform coefficients are firstly coded. When the coding of the transform coefficient levels for all non-significant transform coefficients of the current progressive refinement slice is complete, the coding of transform coefficient levels for the significant transform coefficients is started. The coding of transform coefficient levels for non-significant transform coefficients is referred to as significance path. The significance path is shown in Fig. 2 (a). The coding of transform coefficient levels for significant transform coefficients is referred to as refinement path. The refinement path follows the significance path and is shown in Fig. 2 (b).
For both the transform coefficient level coding of non-significant transform coefficients (significance path) and the transform coefficient level coding of significant transform coefficients (refinement path), in accordance with the structure of Fig. 10, the scanning patterns of Figs. 5 (a) and 5 (b) are employed. As will turn out from the following description, this is different in accordance with the behavior of the encoder of Fig. 1 when operating in accordance with an embodiment of the present invention.
As described above, the generation of the refinement information of a progressive refinement slice begins with the significance path, which is shown in Fig. 2 (a). During the significance path, only non-significant transform coefficients of the progressive refinement slice are considered. A slice refers to a group of macroblocks into which a picture is partitioned. The macroblocks are fixed-sized and cover, for example, a rectangular picture area of 16x16 samples of the luma component and 8x8 samples of each of the two chroma components. A picture may be split into one or several slices. A picture is therefore a collection of one or more slices. Further, slices are self-contained in the sense that given the active sequence and picture parameter sets, the syntax elements can be parsed from the bit-stream and the values of the samples in the area of the picture that the slice represents can be correctly decoded without use of data from other slices provided that utilized reference pictures are identical at encoder and decoder. Turning to progressive refinement slices, this means that the progressive refinement slices contain refinement information for the transform coefficients of the transform blocks of macroblocks within the picture area represented by this progressive refinement slice. The progressive refinement slices coincide, for example, with the slices used in the base layer bit-stream 120a, b.
The significance path involves several scan cycles. In these scan cycles, all macroblocks of the progressive refinement slice are scanned in a raster-scan order, as depicted in Fig. 3. Inside each macroblock, the transform blocks are scanned as depicted in Figs. 4 (a) and (b), respectively, and inside each transform block, the transform coefficient levels are scanned as depicted in Figs. 5 (a) and (b), respectively. To be more precise, the refinement coding means 114a, b firstly steps to the first macroblock of the PR slice in macroblock scan order in step 200. Fig. 3 shows an exemplary progressive refinement slice 300 comprising a conglomeration of macroblocks 302. The macroblock raster-scan order among the macroblocks 302 inside slice 300 is indicated by consecutively arranged arrows 304. As can be seen from Fig. 3, the macroblocks 302 are arranged in lines and columns in a rectangular array, wherein the raster-scan order 304 is defined among the macroblocks 302 such that the raster-scan order 304 begins with a macroblock 302a in the topmost row and leftmost column, then scans all macroblocks within this row up to the macroblock 302b of slice 300 in the rightmost column and then steps to the leftmost macroblock 302c in the next lower row of macroblocks occupied by one of the macroblocks of the PR slice 300, and so on. Thus, in step 200, macroblock 302a is visited.
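The macroblock raster-scan order described above can be sketched as follows, assuming that each macroblock of the slice is identified by a hypothetical (row, column) position within the macroblock array of the picture. Sorting such tuples orders them by row first and by column second, which is exactly a row-by-row, left-to-right scan.

```python
def raster_scan_order(macroblock_positions):
    """Order the (row, column) positions of a slice row by row, left to right."""
    return sorted(macroblock_positions)


# Toy slice that starts in the middle of row 0 and ends in the middle of row 2.
slice_positions = {(1, 0), (0, 2), (2, 0), (0, 3), (1, 1), (1, 2), (1, 3), (2, 1)}
order = raster_scan_order(slice_positions)
print(order)
```

The first entry of `order` plays the role of macroblock 302a visited in step 200; rows that contain no macroblock of the slice are skipped automatically.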
Next, in step 202, the refinement coding means 114a, b steps to the first transformation block or transform block within the current macroblock 302a in transformation block scanning order. As is shown in Figs. 4 (a) and (b), the transformation block scanning order is different for 4x4 and 8x8 transform blocks. In this regard, it is noted that the luma samples of a macroblock may either be partitioned in a rectangular 2x2 array of 8x8 blocks or a rectangular 4x4 array of 4x4 blocks. Each 8x8 or 4x4 block is individually transformed from spatial to spectral domain by means of a two-dimensional transform. Thus, each macroblock may either be represented by four 8x8 transform blocks or sixteen 4x4 transform blocks representing the texture information of the macroblock within the respective rectangularly arranged portions of this macroblock. Fig. 4 (a) shows a macroblock 400a being partitioned into sixteen 4x4 transform blocks 402a being arranged in a 4x4 rectangular array and each consisting of 4x4 transform coefficients. The transform block scan order among blocks 402a is indicated by a sequence of arrows 404. As shown by scan order 404, the order among blocks 402a is defined such that same are scanned in quadrants. Firstly, the four blocks 402a in the upper left quadrant of the macroblock 400a are scanned, followed by the ones in the upper right, lower left and then lower right quadrant. Inside each quadrant, the blocks 402a are scanned in the same order, i.e. in the order of the upper left, upper right, lower left and lower right block inside the respective quadrant.
As shown in Fig. 4 (b), in case of a macroblock 400b partitioned into 8x8 blocks, same is represented by four 8x8 transform blocks 402b arranged in a 2x2 rectangular array, and the scan order among them, indicated by arrows 406, is defined such that the upper left block 402b is followed by the upper right, lower left and lower right block 402b in this order. Thus, in both cases, i.e. in macroblock 400a or 400b, the upper left transform block 402a, b is visited in step 202.
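The two transform block scanning orders can be sketched as follows. Blocks are addressed by hypothetical (row, column) indices within the macroblock; the function names are illustrative and not taken from the specification.

```python
# Upper left, upper right, lower left, lower right.
QUADRANT_ORDER = [(0, 0), (0, 1), (1, 0), (1, 1)]


def block_scan_4x4():
    """Scan order of the sixteen 4x4 blocks: quadrant by quadrant, and inside
    each quadrant again upper left, upper right, lower left, lower right."""
    return [
        (2 * quad_row + block_row, 2 * quad_col + block_col)
        for quad_row, quad_col in QUADRANT_ORDER
        for block_row, block_col in QUADRANT_ORDER
    ]


def block_scan_8x8():
    """Scan order of the four 8x8 blocks of a macroblock."""
    return list(QUADRANT_ORDER)


print(block_scan_4x4())
print(block_scan_8x8())
```

In both orders the first entry is the upper left block, which corresponds to the block visited in step 202.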
In step 204, the refinement coding means 114a, b steps to the first non-significant transform coefficient in scanning order within the current transform block. Fig. 5 (a) shows the scanning order among the transform coefficients used in step 204 in case of a 4x4 transform block 402a. In particular, Fig. 5 (a) shows a rectangular 4x4 array of transform coefficients 450 representing the transform block 402a. The scanning order defined among the transform coefficients 450 is indicated by a sequence of consecutive arrows 452. As shown, the scanning order 452 defines a zigzag scan having a back and forth direction 454, which is substantially parallel to a direction perpendicular to a bisecting line 456 between the horizontal axis 458 and vertical axis 460 spanning the spectral domain in which the transform coefficients 450 are arranged. To be more precise, scanning order 452 starts at the upper left transform coefficient 450a representing the DC component and leads to the transform coefficient 450b at the lower right corner representing the highest spectral component in both directions 458 and 460 in a zigzag scan stepping through the other transform coefficients 450 with a back and forth direction 454 and the general step forward direction pointing from transform coefficient 450a to 450b.
Similarly, Fig. 5 (b) shows the scanning path among the transform coefficients of an 8x8 transform block 402b, in which elements corresponding to those of Fig. 5 (a) are indicated with the same reference signs as in Fig. 5 (a), wherein the foregoing description with respect to these elements of Fig. 5 (a) equally applies to Fig. 5 (b). Thus, in both cases, i.e. for a 4x4 transform block and for an 8x8 transform block, means 114a, b starts visiting the transform coefficients 450 in the order shown in Fig. 5 (a) or 5 (b), starting with the DC transform coefficient 450a. When a visited transform coefficient represents a significant transform coefficient, the next transform coefficient 450 inside the respective block 402a or 402b is visited. It is noted that it is possible that no next transform coefficient 450 inside the respective block is available, i.e. that the last transform coefficient 450b has been reached with same being a significant transform coefficient. In this case, i.e. when there is no further transform coefficient inside the current block, the coding is to proceed with the next transform coefficient block, although not shown in Fig. 2 (a). When a visited transform coefficient in step 204 represents a non-significant transform coefficient, this is the transform coefficient which is stepped to in step 204.
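A zigzag scan of the kind described for Figs. 5 (a) and (b) can be generated as sketched below. Since the figures are not reproduced here, the sketch assumes the classic anti-diagonal zigzag that starts at the DC coefficient, first steps along the horizontal axis and ends at the highest-frequency coefficient; coefficients are addressed by hypothetical (row, column) indices and the function name `zigzag_scan` is illustrative.

```python
def zigzag_scan(n):
    """Return the (row, column) positions of an n x n block in zigzag order.

    Coefficients on one anti-diagonal share the same row + column sum; the
    traversal direction alternates from one anti-diagonal to the next, which
    yields the back and forth movement perpendicular to the bisecting line.
    """
    order = []
    for diagonal in range(2 * n - 1):
        rows = range(max(0, diagonal - n + 1), min(diagonal, n - 1) + 1)
        if diagonal % 2 == 0:
            rows = reversed(rows)   # even diagonals run from bottom-left to top-right
        order.extend((row, diagonal - row) for row in rows)
    return order


scan_4x4 = zigzag_scan(4)
scan_8x8 = zigzag_scan(8)
print(scan_4x4)
```

The first position of either scan is the DC coefficient at the upper left corner and the last one is the coefficient at the lower right corner, in line with the description of coefficients 450a and 450b.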
The value of the corresponding transform coefficient level is then coded in step 206. Thereafter, in step 208, it is determined whether the value of the transform coefficient level of the current transform coefficient is zero, i.e. whether the current transform coefficient level is non-significant. If the transform coefficient level is equal to zero, the next transform coefficient levels inside the block in scanning order according to Fig. 5 (a) or 5 (b), respectively, are visited in step 210 until the next non-significant transform coefficient in scanning order is reached. If such a next non-significant transform coefficient in scanning order is available (212), the method steps to step 206, where its level is coded. However, when a next non-significant transform coefficient in scanning order is not available in the current block, the coding proceeds with the next transform coefficient block by stepping to the next transform block in transformation block scanning order according to Fig. 4 (a) or (b), respectively, in step 214. Otherwise, if the transform coefficient level is determined not to be equal to zero in step 208, it is determined in step 216 whether the current block represents a 4x4 block. If this is the case, the coding is continued with the next transform coefficient block by jumping to step 214.
If it is determined in step 216 that the current block is not a 4x4 block, but represents an 8x8 block, it is determined in step 218 whether the current transform coefficient is the fourth transform coefficient inside the current block and the current scan cycle for which a transform coefficient level not equal to zero has been coded, wherein a complete scan cycle shall indicate a complete scan of all transformation coefficients inside the current progressive refinement slice. If the result of the determination in step 218 is positive, the coding is continued with the next transform coefficient block by jumping to step 214. Otherwise, the next transform coefficient levels inside the current block are visited until reaching the next non-significant transform coefficient in scanning order according to Fig. 5 (a) and (b), respectively, by jumping to step 210, wherein, as noted above, when there is no further transform coefficient inside the current block, the coding proceeds with the next transform coefficient block in step 204.
The next transformation block or transform coefficient block in step 214 may either be the next transform coefficient block inside the current macroblock, when available, or the first transform coefficient block of the first macroblock inside the next scan cycle, i.e. the first transformation coefficient block of the first macroblock within the progressive refinement slices. In other words, when there is no next transformation block in the current macroblock in transformation block scanning order available (220), the procedure proceeds to step 222 comprising stepping to the next macroblock in macroblock scan order, as shown in Fig. 3. Otherwise, i.e. if a next transformation block in transformation block scanning order according to Fig. 4 (a) or (b), respectively, is available, the procedure proceeds to step 204. If a next macroblock in macroblock scanning order is available in step 222 (224), the procedure proceeds with step 202. If there is no next macroblock in macroblock scanning order available, the procedure proceeds with step 226, where it is determined whether there are still non-significant transformation coefficients left in the progressive refinement slice, the levels of which have not yet been coded. If this is the case, the procedure proceeds with step 200. Otherwise, the significance path is finished and the procedure proceeds to the refinement path shown in Fig. 2(b).
When the procedure loops back to step 200, the current scan cycle has been finished. In any following scan cycle, the coding of transform coefficient levels for any transform coefficient block starts with the first transform coefficient in scanning order that has not yet been visited in any of the previous scan cycles. To be more precise, in step 204, the first non-significant transform coefficient in scanning order is visited or stepped to, the level of which has not been coded in any of the previous scan cycles. However, as already noted above, when all transform coefficients of a block have already been visited, the coding proceeds with the next block in scanning order at step 204.
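The scan cycles of the significance path can be illustrated by the following minimal Python sketch. It is a simplified reading of steps 200 to 226, not the normative procedure: the macroblock and transform block nesting is flattened into one list of blocks in coding order, and the dictionary keys `base_sig` (significance in the base layer, per scan position) and `levels` (the levels to be coded, per scan position) are hypothetical names introduced only for this illustration.

```python
def next_non_significant(block, start):
    """Return the first scan index >= start that is non-significant
    in the base layer, or None if the block has none left."""
    for idx in range(start, len(block["base_sig"])):
        if not block["base_sig"][idx]:
            return idx
    return None


def significance_path(blocks):
    """Return the coding order as (scan_cycle, block_index, scan_index) tuples.

    Assumption: in each scan cycle a 4x4 block (16 coefficients) is left after
    one non-zero level has been coded, an 8x8 block (64 coefficients) after four.
    """
    next_pos = [0] * len(blocks)  # first not yet visited scan position per block
    order = []
    cycle = 0
    # Corresponds to step 226: repeat while uncoded non-significant coefficients remain.
    while any(next_non_significant(b, p) is not None
              for b, p in zip(blocks, next_pos)):
        for bi, block in enumerate(blocks):
            limit = 1 if len(block["levels"]) == 16 else 4
            nonzero = 0
            while nonzero < limit:
                idx = next_non_significant(block, next_pos[bi])
                if idx is None:  # nothing left in this block: go to next block
                    next_pos[bi] = len(block["levels"])
                    break
                order.append((cycle, bi, idx))  # step 206: code the level
                next_pos[bi] = idx + 1
                if block["levels"][idx] != 0:   # step 208
                    nonzero += 1
        cycle += 1
    return order
```

For a single 4x4 block whose scan positions 1 to 3 are non-significant in the base layer and whose level at position 2 is non-zero, the sketch codes positions 1 and 2 in the first scan cycle and position 3 in the second.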
Thus, finally, when all transform coefficients have been visited during the significance path, the refinement path is started, which is shown in Fig. 2 (b). Similar to the significance path, all transform coefficients of the current progressive refinement slice are visited a second time, starting with the DC coefficient of the first transform coefficient block inside the first macroblock. In particular, the refinement path begins at step 228 with determining whether variable length coding or arithmetic coding is used for coding the transform coefficient levels of the progressive refinement slice. In accordance with the H.264/MPEG4-AVC standard, either a context-adaptive binary arithmetic coding scheme (CABAC) or a context-adaptive variable length coding scheme (CAVLC) is used for coding. The determination in step 228 may be performed by checking the entropy_coding_mode_flag defined in the H.264/MPEG4-AVC standard and contained, for example, in the base layer bit-stream 120a, b.
If CABAC is used as an entropy coding mode, i.e. entropy_coding_mode_flag is equal to 1, the procedure proceeds with step 230. In this step, the first macroblock of the PR slice is visited. In the next step 232, the first transformation block within the current macroblock is visited. Next, in step 234, it is determined as to whether the current transformation block is a 4x4 block or not. If this is the case, in step 236, the transform coefficient at the current position within the scanning order 452 is checked to be significant or not, and, if same is significant, its (refined) level is transmitted. Otherwise, nothing is done with respect to this block at this scan cycle. As already noted above, the transmission of levels leads to their binary arithmetic coding into the bit-stream 122a, b. However, in case of the determination in step 234 revealing that the current transformation block is an 8x8 transformation block, the levels of the significant transform coefficients (if any) within the next four positions within the scanning order of the current block are scanned and transmitted in step 238. After steps 236 and 238, the procedure steps to steps 240 or 242, respectively, involving the stepping to the next transform block within the current macroblock, if available.
In other words, steps 234 to 242 define the following procedure when arriving at a transformation block. Starting from the first transform coefficient, it is determined whether same represents a significant transform coefficient. If this is the case, the transform coefficient level is transmitted. Let N be the scan index, starting with 1, of the current transform coefficient inside the transform block using the scanning pattern of Fig. 5 (a) or (b), respectively. The procedure continues depending on the transform block size of the current macroblock. If the current transform block is a 4x4 block, the coding proceeds with the next transform coefficient block in scanning order, shown in Fig. 4 (a). Otherwise, if the current transform block is an 8x8 block and N is an integer multiple of 4, i.e. N%4 == 0, the coding proceeds with the next transform coefficient block. Otherwise, i.e. if the current transform block is an 8x8 block and N is not an integer multiple of 4, i.e. N%4 != 0, the coding proceeds to the next transform coefficient or transform coefficients inside the current transform block, in order to code its/their levels if same is/are significant, until N%4 == 0. Then the next block is visited.
Returning to Fig. 2 (b), if a next transform block is available inside the current macroblock (244 and 246, respectively), the procedure loops back to step 234. If not, the procedure proceeds from step 244 to step 248 or from step 246 to step 250, respectively. Steps 248 and 250 comprise stepping to the next macroblock in macroblock scan order, according to Fig. 3. If such a next macroblock is available (252 and 254, respectively), the procedure loops back to step 232. However, if not, the procedure proceeds with step 256, where it is determined whether there are, in case of 8x8 blocks, quadruples, i.e. another four positions, or, in case of 4x4 blocks, a position, within the transform coefficient scanning orders left, which have not yet been scanned or visited. If this is the case, the procedure loops back to step 230 in order to perform another scan of the PR slice. However, if not, the procedure ends. In case the procedure ends at step 256, 16 cycles have been passed, corresponding to 16 coefficients in 4x4 blocks and 16*4 coefficients in 8x8 blocks.
In order to additionally clarify the procedure of the refinement path in case of arithmetic coding, the following pseudo code summarizes the procedure, wherein scanIdx is the scan index, 8x8block indexes the 8x8 transform blocks within the current macroblock, 4x4block indexes the 4x4 transform blocks within the current macroblock, scanIdx8x8 indexes the transform coefficients within an 8x8 transform block in scanning order according to Fig. 5 (b), and scanIdx4x4 indexes the transform coefficients within a 4x4 transform block in scanning order according to Fig. 5 (a).
for( scanIdx = 0; scanIdx < 16; scanIdx++ ) {
  for( macroblocks in scan order ) {
    if( 8x8 block ) {
      for( 8x8block = 0; 8x8block < 4; 8x8block++ ) {
        for( scanIdx8x8 = 4*scanIdx; scanIdx8x8 < 4*scanIdx+4; scanIdx8x8++ ) {
          if( significant ) encode_refinement();
        }
      }
    } else ( 4x4 block ) {
      for( 4x4block in scan order ) {
        if( coeff at scanIdx4x4 is significant ) encode_refinement();
      }
    }
  }
}
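The pseudo code above may be turned into a small runnable Python sketch as follows. The representation is an assumption made only for illustration: each macroblock is a list of transform blocks, and each transform block is a list of booleans of length 16 (4x4) or 64 (8x8) stating, per scan position, whether the coefficient is significant. Instead of calling an entropy coder, the sketch records the order in which refinement levels would be encoded.

```python
def cabac_refinement_order(macroblocks):
    """Return the order of refinement symbols as
    (scan_cycle, macroblock_index, block_index, scan_index) tuples."""
    order = []
    for scan_idx in range(16):  # 16 scan cycles over the whole PR slice
        for mb, blocks in enumerate(macroblocks):
            for b, significant in enumerate(blocks):
                # One position per cycle in a 4x4 block, four in an 8x8 block.
                per_cycle = len(significant) // 16
                first = per_cycle * scan_idx
                for i in range(first, first + per_cycle):
                    if significant[i]:
                        order.append((scan_idx, mb, b, i))
    return order
```

After 16 cycles, every one of the 16 positions of a 4x4 block and every one of the 64 positions of an 8x8 block has been visited exactly once.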
However, if it is determined in step 228 that variable length coding or CAVLC is used as entropy coding mode, i.e. entropy_coding_mode_flag is equal to 0, the procedure proceeds with steps 258 and 260, corresponding to steps 230 and 232. Then, in step 262, the significant transform coefficients within the current transform block are scanned and transmitted in the respective scanning order according to Fig. 5 (a) and (b) , respectively, whereinafter the procedure proceeds with step 264, involving stepping to the next transformation block. If such next transformation block is available in the current macroblock, the procedure loops back to step 262 (step 266) . However, if no such next transformation block is available, the procedure proceeds to step 268, where the procedure steps to the next macroblock in macroblock scan order according to Fig. 3 wherein, if same is available (270), the procedure loops back to step 260. Otherwise, the procedure ends.
In other words, the refinement path in case of variable length coding involves, for each macroblock and each transform block inside a macroblock, visiting the transform coefficients inside the transform block in the scanning order of Fig. 5 (a) and (b) , respectively and, for each transform coefficient, transmitting the transform coefficient level when the transform coefficient represents a significant transform coefficient.
The following pseudo-code summarizes the refinement path in case of variable length coding as entropy coding mode:
for( macroblocks in scan order ) {
  if( 8x8 block ) {
    for( 8x8block = 0; 8x8block < 4; 8x8block++ ) {
      encode significant coefficients using combined VLC codes
    }
  } else ( 4x4 block ) {
    for( 4x4block in scan order ) {
      encode significant coefficients using combined VLC codes
    }
  }
}
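For comparison with the arithmetic coding case, the following Python sketch lists the visiting order of the refinement path in case of variable length coding. As before, the data layout (macroblocks as lists of transform blocks, each block a list of per-position significance flags in scanning order) is an assumption for illustration, and the combined VLC codes themselves are not modelled.

```python
def cavlc_refinement_order(macroblocks):
    """Return (macroblock_index, block_index, scan_index) tuples.

    Each transform block is handled completely, in a single pass,
    before the next block is visited."""
    order = []
    for mb, blocks in enumerate(macroblocks):
        for b, significant in enumerate(blocks):
            for i, is_significant in enumerate(significant):
                if is_significant:
                    order.append((mb, b, i))
    return order
```

In contrast to the sixteen interleaved scan cycles used with CABAC, all significant coefficients of one block appear consecutively in this order.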
The functionality of the encoder of Fig. 1 according to the above-mentioned Joint Drafts fits well to cases of progressive video source material or to cases where the base layer coding means 112a, b uses frame_mbs_only_flag being equal to one, which means that the picture sequence representing the video consists of coded frames only, so that a decomposition of the frames into fields is neglected. However, the SNR and spatial scalability provided by the encoder of Fig. 1 in accordance with the functionality described so far is not ideal for interlaced source material. Thus, the encoder of Fig. 1 operating in accordance with an embodiment of the present invention differs from the functionality according to Fig. 2 (a) and (b) in the portions indicated below, thereby forming a kind of extension to interlaced sources by considering the properties of interlaced source material.
However, before describing the different behavior, reference is made to the H.264/MPEG4-AVC Standard, in which several interlaced tools have been incorporated. In the first tool, a frame can either be coded as a coded frame or as two coded fields. This is referred to as picture-adaptive frame field coding. In other words, a frame of video may be considered to contain two interleaved fields, a top and a bottom field. The top field contains the even-numbered rows 0, 2, ..., H-2, with H being the number of rows of the frame, whereas the bottom field contains the odd-numbered rows, starting with the second line of the frame. If the two fields of a frame are captured at different time instances, the frame may be referred to as an interlaced frame; otherwise, it may be referred to as a progressive frame. The coding representation in H.264/MPEG4-AVC is primarily agnostic with respect to this video characteristic, i.e. the underlying interlaced or progressive timing of the original captured pictures. Instead, its coding specifies a representation primarily based on geometric concepts, rather than being based on timing. The above-mentioned concept of picture-adaptive frame field coding is also extended to macroblock-adaptive frame field coding. When a frame is coded as a single frame and the flag mb_adaptive_frame_field_flag, which is transmitted in the sequence parameter set, is equal to 1, the scanning of macroblocks inside a slice is modified, as depicted in Fig. 8, wherein it is noted that like references in the Figs. denote like elements and a redundant explanation of these elements is avoided. Two vertically adjacent macroblocks are referred to as a macroblock pair. For each macroblock pair, a syntax element mb_field_decoding_flag is transmitted or inferred.
When mb_field_decoding_flag is equal to 0, the macroblock pair is coded as a frame macroblock pair with the top macroblock representing the top half of the macroblock pair and the bottom macroblock representing the bottom half of the macroblock pair in the geometrical sense. The motion-compensated prediction and transform coding for both the top and the bottom macroblock are applied as for macroblocks of frames with mb_adaptive_frame_field_flag equal to 0, indicating that macroblock-adaptive frame field coding is deactivated and merely frame macroblocks exist. When mb_field_decoding_flag is equal to 1, the macroblock pair represents a field macroblock pair with the top macroblock representing the top field lines of the macroblock pair and the bottom macroblock representing the bottom field lines of the macroblock pair. Thus, in this case, the top and the bottom macroblock substantially cover the same area of the picture, namely the macroblock pair area. However, in these macroblocks, the vertical resolution is twice the horizontal resolution. In the case of the latter field macroblock pairs, the motion-compensated prediction and the transform coding are performed on a field basis.
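The difference between a frame macroblock pair and a field macroblock pair can be illustrated by the following Python sketch. It assumes a macroblock pair of 32 luma rows (two macroblocks of 16 rows each); the function name `split_macroblock_pair` is hypothetical and the rows may be any objects, e.g. row indices.

```python
def split_macroblock_pair(rows, mb_field_decoding_flag):
    """Split the 32 rows of a macroblock pair into (top_mb, bottom_mb)."""
    if len(rows) != 32:
        raise ValueError("a macroblock pair is assumed to span 32 luma rows")
    if mb_field_decoding_flag == 1:
        # Field macroblock pair: top MB = top field lines (even rows),
        # bottom MB = bottom field lines (odd rows).
        return rows[0::2], rows[1::2]
    # Frame macroblock pair: top MB = upper half, bottom MB = lower half.
    return rows[:16], rows[16:]
```

With row indices 0 to 31 as input, a field macroblock pair yields the rows 0, 2, ..., 30 for the top macroblock and 1, 3, ..., 31 for the bottom macroblock, so that both macroblocks span the whole macroblock pair area.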
Macroblocks of coded fields or macroblocks with mb_field_decoding_flag equal to 1 of coded frames are referred to as field macroblocks. Since each transform block of a field macroblock represents an image area with a vertical resolution that is equal to twice the horizontal resolution, it is likely that the distribution of non-zero transform coefficient levels is shifted towards horizontal low frequencies and, for a rate-distortion optimized coding, the scanning of transform coefficients inside a transform block is modified for field macroblocks, as illustrated in Fig. 7 (a) and (b), respectively.
Turning now to the embodiment of the present invention, it will turn out from the following description that it is an advantage of the present invention to optimize the coding of progressive refinement slices for interlaced source material. The scanning of transform coefficient levels inside a transform block of progressive refinement slices is especially optimized for interlaced sources and as a consequence, the coding efficiency of progressive refinement slices for interlaced sources is significantly increased when compared to the approach in accordance with the above-mentioned Joint Drafts. Generally speaking, a distinct feature of the encoder of Fig. 1 in accordance with the embodiment of the present invention when compared to the mode of operation of this encoder as described above is that the scanning order of transform coefficients inside a transform coefficient block of progressive refinement slice is adaptively selected.
Fig. 6 summarizes the different behavior of the encoder of Fig. 1 when operating in accordance with an embodiment of the present invention. In addition to the steps mentioned in Fig. 2 (a) and (b), the refinement coding means 114a, b performs a step 500 of selecting which scanning order or path is to be used for the transformation coefficients within a specific transform block in the PR slice. The selection 500 may be performed in advance of the steps shown in Figs. 2 (a) and (b) with respect to all transform blocks. Alternatively to the performance of step 500 at slice level, step 500 may be performed at macroblock pair level, macroblock level, transform block level or any combination thereof. In this regard, step 500 may, in fact, represent several sub-steps being performed in advance of and/or in between the steps of Fig. 2 (a) and (b). Possible selection criteria are described in the following. In accordance with the embodiment of Fig. 1, the selection 500 as to which transformation coefficient scan order or scan path is to be used in a specific 4x4 transform block is made among the scanning paths shown in Figs. 5 (a) and 7 (a), respectively. Similarly, the selection 500 as to which transform coefficient scanning order is to be used for a certain 8x8 transform block is made among those shown in Figs. 5 (b) and 7 (b), respectively. As is derivable from Figs. 7 (a) and (b), the alternative scanning orders for scanning the transform coefficients are also zigzag scans leading from the upper left transform coefficient 450a and 450a', respectively, representing the DC component, to the lower right transform coefficient 550b and 550b', respectively, representing the spatial spectral component corresponding to the highest spectral components in both directions, i.e. the horizontal and vertical directions 458, 458', 460, 460'. However, in contrast to the zigzag scans according to Fig. 5 (a) and (b), the general back-and-forth direction of the zigzag scans indicated in Figs. 7 (a) and (b) by the sequence of arrows 600 and 602, respectively, is inclined relative to a direction perpendicular to the bisecting line 456, 456' between the horizontal and vertical axes 458, 458', 460, 460', the inclined back-and-forth direction being indicated by 604 and 604', respectively. To be more precise, the general back-and-forth direction 604 and 604', respectively, is inclined relative to the horizontal axes 458 and 458', respectively, by about 60° rather than 45°, as is the case in Fig. 5 (a) and (b). The alternative scanning order of Fig. 7 (a) and (b) is appropriate for accounting for the different spectral ranges sampled by the transform coefficients along the horizontal and vertical axes in case of field macroblocks when compared to frame macroblocks, due to the different spectral resolution along the horizontal and vertical axes in case of field macroblocks. In particular, by giving the encoder the opportunity of accommodating these differences between field and frame macroblocks, it is possible to equalize the value level distribution of the transform coefficients within the individual scan cycles performed during the significance or refinement path, thereby facilitating a fast and precise adaptation of the probability distribution estimation used for entropy encoding the refinement information or enhancing the effectiveness when using run-length coding for coding the refinement information.
With regard to the general back-and-forth direction 604', it is noted that, same represents a kind of average of the inclinations of the individual arrows forming the zigzag scan path 600 and 602, respectively, and pointing from one transform coefficient to the next. Thus, the back-and-forth direction of a zigzag scan could be determined as the average of the inclinations of all individual arrows with or without weighting the inclinations by the length of the individual arrows. Optionally, the selection made in step 500 may be indicated in the PR slice NAL unit in step 502 in the form of respective side information. Embodiments using step 502 are described in the following. Finally, based on the selection performed in step 500, the refinement coding means 114a, b uses for a specific transform block, that one among the scanning orders of Figs. 5 and 7(a,b), respectively, which is defined by the selection in step 500 and performs, by use of this scanning order or scanning path, the respective operations defined in Fig. 2 (a) and (b) , as indicated in step 504 and Fig. 6. In particular, the actual operations of Fig. 2 (a) and (b) modified as indicated in step 504 in Fig. 6 are the steps of 204, 210, 262, 236 and 238. These steps are performed by refinement coding means 114a, b of encoder of Fig. 1 by use of the scanning order or scanning path defined by the selection in step 500, as indicated in step 504.
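The averaging of arrow inclinations mentioned above can be sketched in Python as follows. Two assumptions are made and should be checked against the figures: the tables are the 4x4 zigzag (frame) scan and the 4x4 field scan of H.264/MPEG4-AVC, which are taken here to correspond to Fig. 5 (a) and Fig. 7 (a), and the inclination of an arrow is measured as its unsigned angle against the horizontal axis. This is only one possible reading of the averaging and is not intended to reproduce the figure of about 60° given above; it merely shows that the alternative scan is, on average, steeper than the conventional zigzag scan.

```python
import math

# Scan positions as (x, y): x = horizontal, y = vertical frequency index.
ZIGZAG_4X4 = [(0, 0), (1, 0), (0, 1), (0, 2), (1, 1), (2, 0), (3, 0), (2, 1),
              (1, 2), (0, 3), (1, 3), (2, 2), (3, 1), (3, 2), (2, 3), (3, 3)]
FIELD_4X4 = [(0, 0), (0, 1), (1, 0), (0, 2), (0, 3), (1, 1), (1, 2), (1, 3),
             (2, 0), (2, 1), (2, 2), (2, 3), (3, 0), (3, 1), (3, 2), (3, 3)]


def mean_inclination(scan, weighted=False):
    """Average unsigned angle (degrees) of the arrows of a scan path against
    the horizontal axis, optionally weighted by the arrow length."""
    total = 0.0
    weight_sum = 0.0
    for (x0, y0), (x1, y1) in zip(scan, scan[1:]):
        dx, dy = x1 - x0, y1 - y0
        angle = math.degrees(math.atan2(abs(dy), abs(dx)))
        weight = math.hypot(dx, dy) if weighted else 1.0
        total += weight * angle
        weight_sum += weight
    return total / weight_sum
```

Calling `mean_inclination(FIELD_4X4)` returns a larger value than `mean_inclination(ZIGZAG_4X4)`, with or without length weighting.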
In one embodiment of the present invention, the selection 500 of the scanning order of transform coefficients inside a transform block could be performed based on whether the co-located macroblock in the base quality layer, i.e. quality_level == 0, represents a frame macroblock or a field macroblock. A macroblock is referred to as a field macroblock in this regard when it is a macroblock of a coded field or a macroblock of a coded frame with mb_field_decoding_flag equal to 1; otherwise, the macroblock is referred to as a frame macroblock. In this case, it is preferred to select the scanning orders of Fig. 5 (a) and (b) for frame macroblocks and the scanning orders depicted in Figs. 7 (a) and (b) for field macroblocks. In other words, the base layer coding means 112a, b decides, during performing the base layer coding, for each macroblock whether same is to be a frame or a field macroblock. Possibly, the base layer coding means 112a, b decides that all macroblocks are to be frame macroblocks, or vice-versa. The base layer coding means 112a, b performs this decision, for example, such that the R/D performance is optimized with respect to the base layer quality. The refinement coding means 114a, b then performs the selection 500 dependent on the decisions of the base layer coding means 112a, b, respectively, with respect to the association of the macroblocks to field or frame macroblocks. Since the side information contained in the base layer bit-stream already indicates which macroblock is a frame macroblock and which macroblock is a field macroblock, step 502 may be omitted.
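A minimal Python sketch of this selection rule is given below. The function names and the use of figure labels as return values are hypothetical and serve only to make the decision explicit; an actual implementation would return the corresponding scan table.

```python
def is_field_macroblock(in_coded_field, mb_field_decoding_flag):
    """A macroblock counts as a field macroblock if it belongs to a coded field
    or to a coded frame with mb_field_decoding_flag equal to 1."""
    return in_coded_field or mb_field_decoding_flag == 1


def select_scan(base_mb_is_field, block_size):
    """Selection 500: choose the scan by the type of the co-located base layer
    macroblock and the transform block size (4 for 4x4, 8 for 8x8)."""
    scans = {
        (False, 4): "Fig. 5(a)",  # frame macroblock, 4x4 zigzag scan
        (False, 8): "Fig. 5(b)",  # frame macroblock, 8x8 zigzag scan
        (True, 4): "Fig. 7(a)",   # field macroblock, 4x4 alternative scan
        (True, 8): "Fig. 7(b)",   # field macroblock, 8x8 alternative scan
    }
    return scans[(base_mb_is_field, block_size)]
```

Because both inputs are already known to the decoder from the base layer, no additional side information is needed for this variant.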
Alternatively, the scanning order of transform coefficients inside a transform block may be specified by syntax elements in step 502 that are present inside the progressive refinement slice syntax. The corresponding syntax elements can be present at a slice level, a macroblock pair level, a macroblock level, a block level or any combination thereof. Again, the selection in step 500 in this case would preferably be performed such that the scanning orders of Figs. 5 (a) and (b) would be selected for frame macroblocks, while the scanning orders of Figs. 7 (a) and (b) would be selected for field macroblocks. However, in case of selecting the scanning orders independently of the decisions performed by the base layer coding means 112a, b with respect to the division into field and frame macroblocks, the selection in step 500 could also be performed by solving an R/D optimization problem. When indicating the selection made in step 500 in the PR slice NAL unit, the syntax element specifying the scanning order of transform coefficients inside a block could be coded by conditioned entropy codes, whereby the condition is dependent on whether the co-located block/macroblock/macroblock pair in the base quality layer, i.e. quality_level equal to 0, is located in or represents a field macroblock pair.
In addition to the above-mentioned differences in the mode of operation of the encoder of Fig. 1 between the case where same operates in accordance with the embodiment of the present invention and the case where same operates in accordance with the Joint Draft mentioned in the introductory portion of the specification, the encoder of Fig. 1 in accordance with the embodiment of the present invention could use the macroblock scan of Fig. 8 for progressive refinement slices for coded frames and mb_adaptive_frame_field_flag equal to 1. In other words, in case of progressive refinement slices for coded frames and mb_adaptive_frame_field_flag equal to 1, the encoder of Fig. 1 would decide to use the macroblock scan order of Fig. 8 in steps 222, 248, 250 and 268 of Fig. 2 (a) and (b) rather than that of Fig. 3. As indicated by the sequence of arrows 700 in Fig. 8, the alternative macroblock scan of Fig. 8 scans the macroblocks 302 macroblock pair-wise, one macroblock pair 702 being representatively highlighted. In particular, scan order 700 scans the macroblock pairs 702 consisting of two vertically adjacent macroblocks 302 in the same way as the scan order 304 scans the macroblocks 302 macroblock-wise, i.e. row-wise from the top to the bottom. Within each macroblock pair 702, scan order 700 scans the top macroblock 702a first and then the bottom macroblock 702b.
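The two macroblock scans can be compared with the following Python sketch. It assumes a picture of `width_mbs` by `height_mbs` macroblocks with an even number of macroblock rows and identifies each macroblock by its (row, column) position; both function names are hypothetical.

```python
def raster_scan(width_mbs, height_mbs):
    """Macroblock-wise scan as in Fig. 3: row-wise from top to bottom."""
    return [(row, col) for row in range(height_mbs) for col in range(width_mbs)]


def macroblock_pair_scan(width_mbs, height_mbs):
    """Pair-wise scan as in Fig. 8: the pairs are visited row-wise, and within
    each pair the top macroblock precedes the bottom macroblock."""
    if height_mbs % 2 != 0:
        raise ValueError("an even number of macroblock rows is assumed")
    order = []
    for pair_row in range(0, height_mbs, 2):
        for col in range(width_mbs):
            order.append((pair_row, col))      # top macroblock of the pair
            order.append((pair_row + 1, col))  # bottom macroblock of the pair
    return order
```

For a picture of 2 by 2 macroblocks, the pair-wise scan visits the left column top-down before the right column, whereas the raster scan visits the upper row first.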
With respect to Fig. 9, the steps to be performed by a decoder for decoding the scalable bit-stream 126 are described for the case that the decoder decides to refine the base layer quality provided by the base layer bit-stream multiplexed into the scalable bit-stream 126. The decoder would start with parsing the base layer bit-stream 120a and 120b contained in the scalable bit-stream 126 in step 800. The result of step 800 is that the decoder knows which transform coefficient levels in the base layer are significant and which are not. Thus, the decoder is able to reconstruct the significance path and the refinement path used at the encoder side for the first refinement layer. In the next step, step 802, the decoder parses the PR slice NAL units. These contain the refinement information for the transform coefficients of the base layer. In particular, the result of step 802 is a sequence of refinement levels for the transformation coefficients. In step 804, the decoder assigns these levels to the transformation coefficients inside the PR slice of the preceding quality layer. When first performing step 804, the preceding layer is the base layer. The assignment 804 is performed by using the same significance path and refinement path used at the encoder side and described above. Thereafter, in step 806, the decoder refines the transformation coefficients of the preceding quality level by use of the assigned levels, thereby achieving the refined transform coefficients and the updated quality level, respectively. Steps 802 to 806 may be performed several times in order to enhance the quality step by step, level by level. Thereafter, in step 808, the decoder retransforms the transform blocks to reconstruct the texture information and then, in step 810, reconstructs the frame or frames based on the reconstructed texture information.
In case of the encoder having used step 502, i.e. indicating the scanning or the selection for the transform coefficients along with the refinement information, the decoder uses this information in step 804.
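Steps 804 and 806 can be sketched in Python as follows. The sketch assumes that the visiting order, i.e. the sequence of (block index, scan index) pairs of the significance and refinement paths, has already been reconstructed at the decoder, and it assumes, purely for illustration, that a refinement level is simply added to the level of the preceding quality layer; the actual reconstruction rule depends on the quantization used and is not modelled here.

```python
def refine_coefficients(coefficients, visit_order, refinement_levels):
    """Assign the parsed refinement levels to the coefficients in the order in
    which the encoder visited them (step 804) and refine them (step 806).

    coefficients: list of blocks, each a list of levels in scanning order.
    visit_order: (block_index, scan_index) pairs, as used by the encoder.
    refinement_levels: the levels parsed from the PR slice, in stream order.
    """
    refined = [list(block) for block in coefficients]  # keep the input intact
    for (block_idx, scan_idx), level in zip(visit_order, refinement_levels):
        refined[block_idx][scan_idx] += level  # additive refinement (assumption)
    return refined
```

Because encoder and decoder derive the same visiting order, the refinement levels need no position information of their own; only the choice of scanning order has to be known, either from the base layer or from the side information of step 502.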
To summarize, the above-described embodiment of the present invention represents an adaptive scanning of transform coefficient levels in progressive refinement slices for enabling efficient fine-granular SNR scalability for interlaced frames. In particular, it represents a concept for fine-granular SNR scalable coding of interlaced frames in which the scanning of transform coefficients inside a transform block is adaptively selected. In the design of the scalable extension of H.264/MPEG4-AVC, as described in the above-referenced Joint Drafts, all scalability tools are only specified for the coding of progressive source material; the special characteristic of interlaced source material is not anticipated. The above-presented embodiments extend the coding of progressive refinement slices in a way that the coding efficiency for interlaced sources is improved. This is achieved by adaptively controlling the scanning order of transform coefficients inside a transform block.
In other words, the above-described embodiments of the present invention could be described as a coding scheme supporting fine granular SNR scalability in which the scanning order of transform coefficients inside a transform block is adaptively selected. In this scheme, the scanning order of transform coefficients could be selected based on whether the co-located macroblock pair in the base quality layer, with identical representation time and dependency_ID and with quality_level equal to 0, represents a field or a frame macroblock pair. When doing so, the scanning order of Figs. 5 (a) and (b) is chosen when the co-located macroblock pair in the base quality slice represents a frame macroblock pair, and the scanning order depicted in Fig. 7 (a) and (b) is chosen when the co-located macroblock pair in the base quality layer represents a field macroblock pair. Moreover, the selection of the scanning order could be transmitted as part of the slice syntax. This includes the transmission of corresponding syntax elements on a slice level, macroblock pair level, macroblock level or block level. Moreover, the scanning order depicted in Figs. 5 (a) and (b) could be chosen when the co-located macroblock pair in the base quality slice represents a frame macroblock pair, and the scanning order depicted in Figs. 7 (a) and (b) could be chosen when the co-located macroblock pair in the base quality layer represents a field macroblock pair. In case of transmitting the selection of the scanning order as a part of the slice syntax, one or more syntax elements specifying the scanning order of transform coefficients can be coded by conditioned entropy codes, whereby the condition is dependent on whether the co-located macroblock or macroblock pair in the base quality layer is coded as a frame or field macroblock or macroblock pair, respectively. Additionally, it is possible to select the macroblock scan depicted in Fig. 8 for coded frames.
The selection of the macroblock scan among those shown in Figs. 3 and 8 can be performed based on a sequence level syntax element, such as mb_adaptive_frame_field_flag.
With regard to the above embodiments, it is noted that the present invention is not necessarily restricted to the usage of the H.264/MPEG4-AVC Standard. Rather, any other transformation-based coding scheme could also be used. Moreover, it is noted that the scanning path and the refinement path could be chosen in another way. In particular, a significance path could be scanned in a single scan cycle. In this case, during this one scan cycle, all significant and non-significant transform coefficients, respectively, could be selected for transmission and coding, respectively together. In addition, it is possible not to differentiate between significant and non-significant transform coefficients i.e. not to firstly code the non-significant coefficients in the significance path followed by then code the significant transform coefficients in the refinement path. Rather, it would be possible to merge the significance path and the refinement path into one path. Moreover, the present invention is also advantageous in still image processing.
Depending on an actual implementation, the inventive coding scheme can be implemented in hardware or in software. Therefore, the present invention also relates to a computer program, which can be stored on a computer-readable medium such as a CD, a disc or any other data carrier. The present invention is, therefore, also a computer program having a program code which, when executed on a computer, performs the inventive method described in connection with the above figures.
Furthermore, it is noted that all steps indicated in the flow diagrams could be implemented by respective means and the implementations may comprise sub-routines running on a CPU, circuit parts of an ASIC or the like.

Claims

1. Encoder for encoding a picture (104a) into a quality scalable data-stream (126) , comprising:
base quality encoding means (112a, b) for encoding the picture (104a, b) into a base layer data-stream (120a, b) from which the picture (104a, b) may be derived with a base quality, by partitioning the picture (104a, b) into a plurality of blocks, individually transforming the blocks into transform coefficient blocks (402) of transform coefficients (450) and coding the transform coefficients (450) into the base layer data-stream (120a, b); and
appending means (114a,b) for appending refinement information (122a,b) to the base layer data-stream (120a, b) to yield the quality scalable data-stream (126) such that the refinement information is serially arranged in the quality scalable data-stream (126) such that the refinement information refines a first portion (206) of the transform coefficients (450) of a predetermined one of the transform coefficient blocks (402) in a scan order along a zigzag scan path (600, 602) having a back-and-forth direction (604, 604') being inclined relative to a direction perpendicular to a bisecting line (456) between a horizontal axis (458) and a vertical axis (460) of the predetermined transform coefficient block, and a second remaining portion (262, 236, 238) of the transform coefficients (450) of the predetermined transform coefficient blocks (402) in the scan order along the zigzag scan path (604, 604').
2. Encoder according to claim 1, wherein the appending means (114a, b) is designed such that the zigzag scan path (600, 602) has a back-and-forth direction (604, 604') inclined relative to the direction perpendicular to the bisecting line by 10° or more.
3. Encoder according to claim 1 or 2, wherein the appending means (114a,b) is designed such that a propagation rate of the zigzag scan path (600, 602) along the vertical axis (460) is approximately twice a propagation rate of the zigzag scan path along the horizontal axis (458) .
4. Encoder according to any of claims 1 to 3, wherein the appending means (114a) is designed such that the scan order along the zigzag scan path (600, 602) starts at that transform coefficient (450a) among the plurality of transform coefficients (450) of the predetermined one of the transform coefficient blocks (402), which corresponds to the DC component.
5. Encoder according to any of the preceding claims, wherein the appending means (114a, b) is designed such that the refinement information (122a, b) is serially arranged in the quality scalable data-stream (126) such that the refinement information (122a,b) refines a first portion of another one of the transform coefficient blocks (402) being different from the predetermined one in a scan order along a zigzag scan path (452) having a back-and-forth direction (454) being substantially parallel to the direction perpendicular to the bisecting line (456) between a horizontal axis (458) and a vertical axis (460) of the other transform coefficient block, as well as a second remaining portion of the other transform coefficient block.
6. Encoder according to any of the preceding claims, wherein the picture (114a, b) is composed of at least a first and second interleaved fields, the first field containing a first sub-set of rows of the picture (114a, b) and the second field containing a second sub-set of rows of the picture (114a, b), the rows of the first sub-set and the rows of the second sub-set being interleaved and the picture (114a, b) being portioned into macroblock areas (702), each of which is structured into at least two macroblocks (302, 702a, 702b), each macroblock (302) comprising several blocks (450) of the plurality of blocks (450), wherein the base quality encoding means (122a, b) is designed such that first ones of the macroblock areas (702) is portioned into at least two of the macroblocks (302) such that a first one of the at least two macroblocks incorporates rows of the first sub-set of the picture within the respective macroblock area, while a second one of the at least two macroblocks incorporate rows of the second sub-set of rows within the respective macroblock area, and second ones of the macroblock areas are partitioned into at least two of the macroblocks such that a first one of the macroblocks incorporates a continuous top portion of the macroblock area, while a second one of the macroblocks incorporates a continuous bottom portion of the macroblock area, and that the base layer data-stream (126) comprises first syntax elements enabling identifying as to whether any of the macroblocks belongs to the first or second macroblock areas, wherein the appending means (114a, b) is designed to append the refinement information (122a, b) to the base layer data-stream (120a, b) such that the refinement information (122a, b) is serially arranged in the quality scalable data-stream such that the refinement information refines the transform coefficients of a transform coefficient block of a macroblock of any of the first macroblock areas in a scan order along a zigzag scan path (600, 602) having a back-and-forth direction (604) being inclined relative to a direction perpendicular to a bisecting line (456) between a horizontal axis (458) and a vertical axis (460) of the respective transform coefficient blocks and such that the refinement information refines the transformation coefficients of a transform coefficient block of a macroblock of any of the second macroblock areas in a scan order along a zigzag scan path (452) having a back-and-forth direction (454) being substantially parallel to a direction perpendicular to a bisecting line (456) between a horizontal axis (458) and a vertical axis (460) of the respective transform coefficient blocks.
7. Encoder according to any of claims 1 to 5, wherein the picture (114a, b) is composed of at least a first and second interleaved fields, the first field containing a first sub-set of rows of the picture (114a, b) and the second field containing a second sub-set of rows of the picture (114a, b), the rows of the first sub-set and the rows of the second sub-set being interleaved and the picture (114a, b) being portioned into macroblock areas (702), each of which is structured into at least two macroblocks (302, 702a, 702b), each macroblock (302) comprising several blocks (450) of the plurality of blocks (450), wherein the base quality encoding means (122a, b) is designed such that the macroblock areas (702) are portioned into at least two of the macroblocks (302) either such that a first one of the at least two macroblocks incorporates rows of the first sub-set of the picture within the respective macroblock area, while a second one of the at least two macroblocks incorporate rows of the second sub-set of rows within the respective macroblock area, or either such that a first one of the macroblocks incorporates a continuous top portion of the macroblock area, while a second one of the macroblocks incorporates a continuous bottom portion of the macroblock area, and wherein the base quality encoding means (122a, b) is designed to decide to either encode the picture picture-wise or field-wise, and the appending means (114a, b) is designed to, depending on the decision, step through the macroblocks row-wise in units of macroblock areas (700) or in units of individual macroblocks (304).
8. Encoder according to claim 5, wherein the appending means (114a, b) is designed such that the refinement information additionally comprises a syntax element based on which it is derivable by which zig-zag scan path the refinement information refines all of the first portion and/or all of the second portion of the transform coefficients of the predetermined transform coefficient block.
9. Encoder according to any of the preceding claims, wherein the appending means is designed such that the refinement information is serially arranged in the quality scalable data-stream such that the refinement information refines (206) the first portion of the transform coefficients of the predetermined transform coefficient blocks completely before refining (262, 236, 238) the second portion of the transform coefficients of the predetermined transform coefficient block.
10. Encoder according to any of the preceding claims, wherein the appending means is arranged for assigning the transform coefficients of the predetermined transform coefficient block into the first or second portions depending on a significance of a corresponding transform coefficient in the base layer data-stream or a significance of a corresponding refined transform coefficient in accordance with an intermediate refinement information arranged inside the quality scalable data-stream between the base layer data-stream and the refinement information.
11. Encoder according to claim 8, wherein the appending means is designed to code the syntax element into the refinement information by a conditioned entropy code with a condition which is dependent on a decision of the base quality encoding means with regard to a part of the picture involving the block transformed into the predetermined transform coefficient block.
12. Decoder for decoding a quality scalable data-stream into which a picture is encoded, the quality scalable data-stream comprising a base layer data-stream from which the picture is derivable with a base quality followed by refinement information, the base layer data-stream comprising transform coefficient blocks of transform coefficients derived by individually transforming a plurality of blocks into which the picture is partitioned, and the refinement information being arranged in the quality data-stream such that the refinement information refines a first portion of the transform coefficients of a predetermined of the transform coefficient blocks in a scan order along a zigzag scan path having a back-and-forth direction being inclined relative to a direction perpendicular to a bisecting line between the horizontal and the vertical axis of the predetermined transformation coefficient block and a second remaining portion of the transform coefficients of the predetermined of the transform coefficient blocks in the scan order along the zigzag scan path, the decoder comprising:
means (800, 802) for parsing the quality scalable data-stream in order to yield the base layer data-stream and the refinement information;
means (804) for assigning the refinement information to the first portion of the transform coefficients of the predetermined transform coefficient block in the scan order along the zigzag scan path and to the second remaining portion of the transform coefficients of the transform coefficient block in the scan order along the zigzag scan path;
means (806) for refining the transform coefficients of the transform coefficient block with the refinement information as assigned by the means for assigning in order to yield refined transformation coefficients; and
means (808, 810) for, based on the refined transformation coefficients, deriving the picture with a quality enhanced relative to the base quality.
13. Decoder according to claim 12, wherein the means for assigning is designed such that the zigzag scan path (600, 602) has a back-and-forth direction (604, 604') inclined relative to the direction perpendicular to the bisecting line by 10° or more.
14. Decoder according to claim 12 or 13, wherein the means for assigning is designed such that a propagation rate of the zigzag scan path (600, 602) along the vertical axis (460) is approximately twice a propagation rate of the zigzag scan path along the horizontal axis (458) .
15. Decoder according to any of claims 12 to 14, wherein the means for assigning is designed such that the scan order along the zigzag scan path (600, 602) starts at that transform coefficient (450a) among the plurality of transform coefficients (450) of the predetermined one of the transform coefficient blocks (402), which corresponds to the DC component.
16. Decoder according to any of claims 12 to 15, wherein refinement information (122a, b) is serially arranged in the quality scalable data-stream (126) such that the refinement information (122a, b) is serially arranged in the quality scalable data-stream (126) such that the refinement information (122a, b) refines a first portion of another one of the transform coefficient blocks (402) being different from the predetermined one in a scan order along a zigzag scan path (452) having a back-and-forth direction (454) being substantially parallel to the direction perpendicular to the bisecting line (456) between a horizontal axis (458) and a vertical axis (460) of the other transform coefficient block, as well as a second remaining portion of the other transform coefficient block, and wherein the means for assigning is designed to assign the refinement information to the first portion of the transform coefficients of the other transform coefficient block and to the second remaining portion of the transform coefficient blocks in the zigzag scan path (452) having a back-and-forth direction (454) being substantially parallel to the direction perpendicular to the bisecting line (456) between a horizontal axis (458) and a vertical axis (460) of the other transform coefficient block.
17. Decoder according to any of claims 12 to 16, wherein the picture (114a, b) is composed of at least a first and second interleaved fields, the first field containing a first sub-set of rows of the picture (114a, b) and the second field containing a second sub-set of rows of the picture (114a, b), the rows of the first sub-set and the rows of the second sub-set being interleaved and the picture (114a, b) being portioned into macroblock areas (702), each of which is structured into at least two macroblocks (302, 702a, 702b), each macroblock (302) comprising several blocks (450) of the plurality of blocks (450), wherein in the base quality data-stream the picture is coded such that first ones of the macroblock areas (702) are portioned into at least two of the macroblocks (302) such that a first one of the at least two macroblocks incorporates rows of the first sub-set of the picture within the respective macroblock area, while a second one of the at least two macroblocks incorporate rows of the second sub-set of rows within the respective macroblock area, and second ones of the macroblock areas are partitioned into at least two of the macroblocks such that a first one of the macroblocks incorporates a continuous top portion of the macroblock area, while a second one of the macroblocks incorporates a continuous bottom portion of the macroblock area, and wherein the base layer data-stream (126) comprises first syntax elements enabling identifying as to whether any of the macroblocks belongs to the first or second macroblock areas, wherein the means for assigning is designed to assign the refinement information to the transform coefficients of a transform coefficient block of a macroblock of any of the first macroblock areas in a scan order along a zigzag scan path (600, 602) having a back-and-forth direction (604) being inclined relative to a direction perpendicular to a bisecting line (456) between a horizontal axis (458) and a vertical axis (460) of the respective transform coefficient blocks, and to assign the refinement information to the transformation coefficients of a transform coefficient block of a macroblock of any of the second macroblock areas in a scan order along a zigzag scan path (452) having a back-and-forth direction (454) being substantially parallel to a direction perpendicular to a bisecting line (456) between a horizontal axis (458) and a vertical axis (460) of the respective transform coefficient blocks.
18. Decoder according to any of claims 12 to 17, wherein the picture (114a, b) is composed of at least a first and second interleaved fields, the first field containing a first sub-set of rows of the picture (114a, b) and the second field containing a second subset of rows of the picture (114a, b), the rows of the first sub-set and the rows of the second sub-set being interleaved and the picture (114a, b) being portioned into macroblock areas (702) , each of which is structured into at least two macroblocks (302, 702a, 702b), each macroblock (302) comprising several blocks (450) of the plurality of blocks (450) , wherein in base quality data-stream the picture is coded such that the macroblock areas (702) are portioned into at least two of the macroblocks (302) either such that a first one of the at least two macroblocks incorporates rows of the first sub-set of the picture within the respective macroblock area, while a second one of the at least two macroblocks incorporate rows of the second sub-set of rows within the respective macroblock area, or either such that a first one of the macroblocks incorporates a continuous top portion of the macroblock area, while a second one of the macroblocks incorporates a continuous bottom portion of the macroblock area, and wherein the base quality comprises an indication of a decision of as to whether the picture is encoded picture-wise or field-wise in the base layer data-stream, and the means for assigning is designed to, depending on the decision, step through the macroblocks row-wise in units of macroblock areas (700) or in units of individual macroblocks (304) .
19. Decoder according to claim 16, wherein the refinement information additionally comprises a syntax element based on which it is derivable by which zig-zag scan path the refinement information refines all of the first portion and/or all of the second portion of the transform coefficients of the predetermined transform coefficient block, wherein the means for assigning is designed to use said syntax element for deciding as to whether the scan order along the zig-zag scan path having a back-and-forth direction being substantially inclined relative to the direction perpendicular to the bisecting line between a horizontal axis and a vertical axis of the other transform coefficient block is to be used for assigning the refinement information to the transform coefficients of the predetermined transform coefficient block.
20. Decoder according to any of claims 12 to 19, wherein the means for assigning is designed to assign the refinement information to the first portion of the transform coefficients of the predetermined transform coefficient blocks completely before the second portion of the transform coefficients of the predetermined transform coefficient block.
21. Decoder according to any of claims 12 to 20, wherein the means for assigning is designed to determine a significance of corresponding transform coefficients for the predetermined transform coefficient block in the base layer data-stream or a significance of corresponding refined transform coefficients in accordance with an intermediate refinement information arranged inside the quality scalable data-stream between the base layer data-stream and the refinement information and to, depending on the determination, divide the transform coefficients of the predetermined transform coefficient block into the first and second portions .
22. Decoder according to claim 19, wherein the parsing means is designed to decode the syntax element from the refinement information by a conditioned entropy code with a condition which is dependent on a decision derivable from the base quality data-stream concerning a part of the picture involving the block transformed into the predetermined transform coefficient block.
23. Encoder for encoding a picture into a quality scalable data-stream, comprising:
base quality encoding means (122a, b) for encoding the picture (114a, b) into a base layer data-stream (120a, b) from which the picture (114a, b) may be derived with a base quality, by partitioning the picture (114a, b) into a plurality of blocks, individually transforming the blocks into transform coefficient blocks (402) of transform coefficients (450) and coding the transform coefficients (450) into the base layer data-stream (120a, b); and
appending means (114a, b) for appending refinement information (122a, b) into the base layer data-stream (120a, b) to yield the quality scalable data-stream (126) such that the refinement information (122a, b) is serially arranged in the quality scalable data-stream (126) such that the refinement information refines a first portion (206) of the transform coefficients (450) of a first predetermined one of the transform coefficient blocks (402) in a scan order along a first zigzag scan path (600, 602) having a first back-and-forth direction (604, 604'), and a first portion (206) of the transform coefficients (450) of a second predetermined one of the transform coefficient blocks (402) in a scan order along a second zigzag scan path (600, 602) having a second back-and-forth direction (452) being different from first back-and-forth direction (604, 604').
24. Decoder for decoding a quality scalable data-stream into which a picture is encoded, the quality scalable data-stream comprising a base layer data-stream from which the picture is derivable with a base quality followed by refinement information, the base layer data-stream comprising transform coefficient blocks of transform coefficients derived by individually transforming a plurality of blocks into which the picture is partitioned, and the refinement information being arranged in the quality data-stream such that the refinement information refines a first portion (206) of the transform coefficients (450) of a first predetermined one of the transform coefficient blocks (402) in a scan order along a first zigzag scan path (600, 602) having a first back-and-forth direction (604, 604'), and a first portion (206) of the transform coefficients (450) of a second predetermined one of the transform coefficient blocks (402) in a scan order along a second zigzag scan path (600, 602) having a second back-and-forth direction (452) being different from first back-and-forth direction (604, 604'), the decoder comprising:
means (800, 802) for parsing the quality scalable data-stream in order to yield the base layer data-stream and the refinement information;
means (804) for assigning the refinement information to the first portion of the transform coefficients of the first predetermined transform coefficient block in the scan order along the first zigzag scan path and to the first portion of the transform coefficients of the second predetermined transform coefficient block in the scan order along the second zigzag scan path;
means (806) for refining the transform coefficients of the transform coefficient block with the refinement information as assigned by the means for assigning in order to yield refined transformation coefficients; and
means (808, 810) for, based on the refined transformation coefficients, deriving the picture with a quality enhanced relative to the base quality.
25. Method for encoding a picture (104a, b) into a quality scalable data-stream (126), comprising the following steps:
encoding the picture (104a, b) into a base layer data-stream (120a, b) from which the picture (104a, b) may be derived with a base quality by partitioning the picture (104a, b) into a plurality of blocks, individually transforming the blocks into transform coefficient blocks (402) of transform coefficients (450) and coding the transform coefficients (450) into the base layer data-stream (120a, b); and
appending refinement information (122a, b) to the base layer data-stream (120a, b) to yield the quality scalable data-stream (126) such that the refinement information is serially arranged in the quality scalable data-stream (126) such that the refinement information refines a first portion (206) of the transform coefficients (450) of a predetermined of the transform coefficient blocks (402) in a scan order along a zigzag scan path (600, 602) having a back-and-forth direction (604, 604') being inclined relative to a direction perpendicular to a bisecting line (456) between a horizontal axis (458) and a vertical axis (460) of the predetermined transform coefficient block, and a second remaining portion (262, 236, 238) of the transform coefficients (450) of the predetermined of the transform coefficient blocks (402) in the scan order along the zigzag scan path (604, 604').
26. Method for decoding a quality scalable data-stream into which a picture is encoded, the quality scalable data-stream comprising a base layer data-stream from which the picture is derivable with a base quality followed by refinement information, the base layer data-stream comprising transform coefficient blocks of transform coefficients derived by individually transforming a plurality of blocks into which the picture is partitioned, and the refinement information being arranged in the quality data-stream such that the refinement information refines a first portion of the transform coefficients of a predetermined of the transform coefficient blocks in a scan order along a zigzag scan path having a back-and-forth direction being inclined relative to a direction perpendicular to a bisecting line between the horizontal and the vertical axis of the predetermined transformation coefficient block and a second remaining portion of the transform coefficients of the predetermined of the transform coefficient blocks in the scan order along the zigzag scan path, the method comprising the following steps:
parsing the quality scalable data-stream in order to yield the base layer data-stream and the refinement information;
assigning the refinement information to the first portion of the transform coefficients of the predetermined transform coefficient block in the scan order along the zigzag scan path and to the second remaining portion of the transform coefficients of the transform coefficient block in the scan order along the zigzag scan path;
refining the transform coefficients of the transform coefficient block with the refinement information as assigned by the step of assigning in order to yield refined transformation coefficients; and
based on the refined transformation coefficients, deriving the picture with a quality enhanced relative to the base quality.
27. Method for encoding a picture into a quality scalable data-stream, comprising the following steps:
encoding the picture (114a, b) into a base layer data-stream (120a, b) from which the picture (114a) may be derived with a base quality, by partitioning the picture (114a, b) into a plurality of blocks, individually transforming the blocks into transform coefficient blocks (402) of transform coefficients (450) and coding the transform coefficients (450) into the base layer data-stream (120a, b); and
appending refinement information (122a, b) into the base layer data-stream (120a, b) to yield the quality scalable data-stream (126) such that the refinement information (122a, b) is serially arranged in the quality scalable data-stream (126) such that the refinement information refines a first portion (206) of the transform coefficients (450) of a first predetermined one of the transform coefficient blocks (402) in a scan order along a first zigzag scan path (600, 602) having a first back-and-forth direction (604, 604'), and a first portion (206) of the transform coefficients (450) of a second predetermined one of the transform coefficient blocks (402) in a scan order along a second zigzag scan path (600, 602) having a second back-and-forth direction (452) being different from first back-and-forth direction (604, 604').
28. Method for decoding a quality scalable data-stream into which a picture is encoded, the quality scalable data-stream comprising a base layer data-stream from which the picture is derivable with a base quality followed by refinement information, the base layer data-stream comprising transform coefficient blocks of transform coefficients derived by individually transforming a plurality of blocks into which the picture is partitioned, and the refinement information being arranged in the quality data-stream such that the refinement information refines a first portion (206) of the transform coefficients (450) of a first predetermined one of the transform coefficient blocks (402) in a scan order along a first zigzag scan path (600, 602) having a first back-and-forth direction (604, 604'), and a first portion (206) of the transform coefficients (450) of a second predetermined one of the transform coefficient blocks (402) in a scan order along a second zigzag scan path (600, 602) having a second back-and-forth direction (452) being different from first back-and-forth direction (604, 604'), comprising the following steps:
parsing the quality scalable data-stream in order to yield the base layer data-stream and the refinement information;
assigning the refinement information to the first portion of the transform coefficients of the first predetermined transform coefficient block in the scan order along the first zigzag scan path and to the first portion of the transform coefficients of the second predetermined transform coefficient block in the scan order along the second zigzag scan path;
refining the transform coefficients of the transform coefficient block with the refinement information as assigned by the step of assigning in order to yield refined transformation coefficients; and
based on the refined transformation coefficients, deriving the picture with a quality enhanced relative to the base quality.
29. Computer program having a program code for performing, when running on a computer, a method according to any of claims 25 to 28.
PCT/EP2006/001293 2006-01-13 2006-02-13 Quality scalable picture coding with particular transform coefficient scan path WO2007079782A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP2006000268 2006-01-13
EPPCT/EP2006/000268 2006-01-13

Publications (1)

Publication Number Publication Date
WO2007079782A1 (en) 2007-07-19

Family

ID=37137376

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2006/001293 WO2007079782A1 (en) 2006-01-13 2006-02-13 Quality scalable picture coding with particular transform coefficient scan path

Country Status (1)

Country Link
WO (1) WO2007079782A1 (en)

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2213098A2 (en) * 2007-10-16 2010-08-04 Thomson Licensing Methods and apparatus for video encoding and decoding geometrically partitioned super blocks
WO2011113346A1 (en) * 2010-03-15 2011-09-22 Mediatek Singapore Pte. Ltd. Methods of utilizing tables adaptively updated for coding/decoding and related processing circuits thereof
CN102474613A (en) * 2009-08-14 2012-05-23 三星电子株式会社 Method and apparatus for encoding video in consideration of scanning order of coding units having hierarchical structure, and method and apparatus for decoding video in consideration of scanning order of coding units having hierarchical structure
CN102595113A (en) * 2011-01-13 2012-07-18 华为技术有限公司 Method, device and system for scanning conversion coefficient block
WO2013104198A1 (en) * 2012-01-13 2013-07-18 Mediatek Inc. Method and apparatus for unification of coefficient scan of 8x8 transform units in hevc
CN104093018A (en) * 2011-03-10 2014-10-08 华为技术有限公司 Coding method and device of transformation coefficients and decoding method and device of transformation coefficients
US9049444B2 (en) 2010-12-22 2015-06-02 Qualcomm Incorporated Mode dependent scanning of coefficients of a block of video data
KR101576199B1 (en) 2015-04-13 2015-12-10 삼성전자주식회사 Method and apparatus for video encoding based on scanning order of hierarchical data units, and method and apparatus for video decoding based on the same
KR101576198B1 (en) 2014-10-29 2015-12-10 삼성전자주식회사 Method and apparatus for video encoding based on scanning order of hierarchical data units, and method and apparatus for video decoding based on the same
KR101576200B1 (en) 2015-07-03 2015-12-10 삼성전자주식회사 Method and apparatus for video encoding based on scanning order of hierarchical data units, and method and apparatus for video decoding based on the same
KR101577243B1 (en) 2015-07-03 2015-12-14 삼성전자주식회사 Method and apparatus for video encoding based on scanning order of hierarchical data units, and method and apparatus for video decoding based on the same
US9225997B2 (en) 2010-02-02 2015-12-29 Samsung Electronics Co., Ltd. Method and apparatus for encoding video based on scanning order of hierarchical data units, and method and apparatus for decoding video based on scanning order of hierarchical data units
KR101607310B1 (en) * 2014-10-29 2016-03-29 삼성전자주식회사 Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure
KR101607312B1 (en) * 2015-07-02 2016-03-29 삼성전자주식회사 Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure
KR101607311B1 (en) * 2015-04-20 2016-03-29 삼성전자주식회사 Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure
US9386306B2 (en) 2012-08-15 2016-07-05 Qualcomm Incorporated Enhancement layer scan order derivation for scalable video coding
US9571836B2 (en) 2011-03-10 2017-02-14 Huawei Technologies Co., Ltd. Method and apparatus for encoding and decoding with multiple transform coefficients sub-blocks
KR101731430B1 (en) * 2015-07-02 2017-04-28 삼성전자주식회사 Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure
KR101783966B1 (en) * 2017-04-24 2017-10-10 삼성전자주식회사 Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure
KR101842262B1 (en) 2017-09-26 2018-03-26 삼성전자주식회사 Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure
KR20180032542A (en) * 2018-03-20 2018-03-30 삼성전자주식회사 Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure
WO2018084476A1 (en) 2016-11-01 2018-05-11 Samsung Electronics Co., Ltd. Processing apparatuses and controlling methods thereof
CN108632620A (en) * 2011-03-08 2018-10-09 维洛媒体国际有限公司 Coding of transform coefficients for video coding
CN109479138A (en) * 2016-07-13 2019-03-15 韩国电子通信研究院 Image coding/decoding method and device
CN113556561A (en) * 2010-04-13 2021-10-26 Ge视频压缩有限责任公司 Coding of significance maps and transform coefficient blocks
US11330272B2 (en) 2010-12-22 2022-05-10 Qualcomm Incorporated Using a most probable scanning order to efficiently code scanning order information for a video block in video coding

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
G. SULLIVAN, A. LUTHRA, T. WIEGAND: "Editors' text for ISO/IEC 14496-10:2005", ISO/IEC JTC1/SC29/WG11, no. N7081, 11 August 2005 (2005-08-11), Busan, pages 1 - 318, XP002434242 *
REICHEL J ET AL: "Draft of Joint Scalable Video Model JSVM-4 Annex G", JOINT VIDEO TEAM (JVT) OF ISO/IEC MPEG & ITU-T VCEG (ISO/IEC JTC1/SC29/WG11 AND ITU-T SG16 Q6), XX, XX, no. JVT-Q201, 21 October 2005 (2005-10-21), pages 1 - 165, XP002422832 *
T. HINZ, H. SCHWARZ, T. WIEGAND: "FGS field pictures and MBAFF frames", ISO/IEC JTC1/SC29/WG11, ITU-T SG16 Q.6, no. JVT-R062, 11 January 2006 (2006-01-11), XP002434241, Retrieved from the Internet <URL:http://ftp3.itu.ch/av-arch/jvt-site/2006_01_Bangkok/JVT-R062.zip> [retrieved on 20070521] *

Cited By (61)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2213098A2 (en) * 2007-10-16 2010-08-04 Thomson Licensing Methods and apparatus for video encoding and decoding geometrically partitioned super blocks
EP2950540A1 (en) * 2009-08-14 2015-12-02 Samsung Electronics Co., Ltd Method and apparatus for encoding video in consideration of scanning order of coding units having hierarchical structure, and method and apparatus for decoding video in consideration of scanning order of coding units having hierarchical structure
CN104539968A (en) * 2009-08-14 2015-04-22 三星电子株式会社 A method and apparatus for decoding a video
EP3474553A1 (en) * 2009-08-14 2019-04-24 Samsung Electronics Co., Ltd. Video coding in consideration of scanning order of coding units having hierarchical structure
CN104780381B (en) * 2009-08-14 2018-04-24 三星电子株式会社 Method and apparatus for decoding video
CN104539968B (en) * 2009-08-14 2017-12-22 三星电子株式会社 Method and apparatus for decoding video
CN102474613B (en) * 2009-08-14 2016-06-01 三星电子株式会社 Method and apparatus for encoding and decoding video in consideration of scanning order of coding units having hierarchical structure
CN102474613A (en) * 2009-08-14 2012-05-23 三星电子株式会社 Method and apparatus for encoding video in consideration of scanning order of coding units having hierarchical structure, and method and apparatus for decoding video in consideration of scanning order of coding units having hierarchical structure
EP2950542A1 (en) * 2009-08-14 2015-12-02 Samsung Electronics Co., Ltd Method and apparatus for encoding video in consideration of scanning order of coding units having hierarchical structure, and method and apparatus for decoding video in consideration of scanning order of coding units having hierarchical structure
USRE48224E1 (en) 2009-08-14 2020-09-22 Samsung Electronics Co., Ltd. Method and apparatus for encoding video in consideration of scanning order of coding units having hierarchical structure, and method and apparatus for decoding video in consideration of scanning order of coding units having hierarchical structure
EP2950539A1 (en) * 2009-08-14 2015-12-02 Samsung Electronics Co., Ltd Method and apparatus for encoding video in consideration of scanning order of coding units having hierarchical structure, and method and apparatus for decoding video in consideration of scanning order of coding units having hierarchical structure
CN104780381A (en) * 2009-08-14 2015-07-15 三星电子株式会社 Method and apparatus for decoding video
CN104780382A (en) * 2009-08-14 2015-07-15 三星电子株式会社 Method and apparatus for encoding video
EP2443833A4 (en) * 2009-08-14 2015-08-19 Samsung Electronics Co Ltd Method and apparatus for encoding video in consideration of scanning order of coding units having hierarchical structure, and method and apparatus for decoding video in consideration of scanning order of coding units having hierarchical structure
US9137536B2 (en) 2009-08-14 2015-09-15 Samsung Electronics Co., Ltd. Method and apparatus for encoding video in consideration of scanning order of coding units having hierarchical structure, and method and apparatus for decoding video in consideration of scanning order of coding units having hierarchical structure
EP2950541A1 (en) * 2009-08-14 2015-12-02 Samsung Electronics Co., Ltd Method and apparatus for encoding video in consideration of scanning order of coding units having hierarchical structure, and method and apparatus for decoding video in consideration of scanning order of coding units having hierarchical structure
US9319713B2 (en) 2010-02-02 2016-04-19 Samsung Electronics Co., Ltd. Method and apparatus for encoding video based on scanning order of hierarchical data units, and method and apparatus for decoding video based on scanning order of hierarchical data units
US9351015B2 (en) 2010-02-02 2016-05-24 Samsung Electronics Co., Ltd. Method and apparatus for encoding video based on scanning order of hierarchical data units, and method and apparatus for decoding video based on scanning order of hierarchical data units
US10123043B2 (en) 2010-02-02 2018-11-06 Samsung Electronics Co., Ltd. Method and apparatus for encoding video based on scanning order of hierarchical data units, and method and apparatus for decoding video based on scanning order of hierarchical data units
US10567798B2 (en) 2010-02-02 2020-02-18 Samsung Electronics Co., Ltd. Method and apparatus for encoding video based on scanning order of hierarchical data units, and method and apparatus for decoding video based on scanning order of hierarchical data units
US9225997B2 (en) 2010-02-02 2015-12-29 Samsung Electronics Co., Ltd. Method and apparatus for encoding video based on scanning order of hierarchical data units, and method and apparatus for decoding video based on scanning order of hierarchical data units
US9277239B2 (en) 2010-02-02 2016-03-01 Samsung Electronics Co., Ltd. Method and apparatus for encoding video based on scanning order of hierarchical data units, and method and apparatus for decoding video based on scanning order of hierarchical data units
US9743109B2 (en) 2010-02-02 2017-08-22 Samsung Electronics Co., Ltd. Method and apparatus for encoding video based on scanning order of hierarchical data units, and method and apparatus for decoding video based on scanning order of hierarchical data units
WO2011113346A1 (en) * 2010-03-15 2011-09-22 Mediatek Singapore Pte. Ltd. Methods of utilizing tables adaptively updated for coding/decoding and related processing circuits thereof
CN113556561A (en) * 2010-04-13 2021-10-26 Ge视频压缩有限责任公司 Coding of significance maps and transform coefficient blocks
US9049444B2 (en) 2010-12-22 2015-06-02 Qualcomm Incorporated Mode dependent scanning of coefficients of a block of video data
US11330272B2 (en) 2010-12-22 2022-05-10 Qualcomm Incorporated Using a most probable scanning order to efficiently code scanning order information for a video block in video coding
WO2012094909A1 (en) * 2011-01-13 2012-07-19 华为技术有限公司 Scanning method, device and system for transformation coefficient block
CN102595113A (en) * 2011-01-13 2012-07-18 华为技术有限公司 Method, device and system for scanning transformation coefficient block
CN108632620A (en) * 2011-03-08 2018-10-09 维洛媒体国际有限公司 Coding of transform coefficients for video coding
US11405616B2 (en) 2011-03-08 2022-08-02 Qualcomm Incorporated Coding of transform coefficients for video coding
CN108632620B (en) * 2011-03-08 2022-04-01 高通股份有限公司 Coding of transform coefficients for video coding
US10165305B2 (en) 2011-03-10 2018-12-25 Huawei Technologies Co., Ltd. Encoding and decoding transform coefficient sub-blocks in same predetermine order
CN104093018A (en) * 2011-03-10 2014-10-08 华为技术有限公司 Coding method and device of transformation coefficients and decoding method and device of transformation coefficients
US9571836B2 (en) 2011-03-10 2017-02-14 Huawei Technologies Co., Ltd. Method and apparatus for encoding and decoding with multiple transform coefficients sub-blocks
CN104093018B (en) * 2011-03-10 2017-08-04 华为技术有限公司 Encoding method for transform coefficients, decoding method for transform coefficients, and device
US10104399B2 (en) 2012-01-13 2018-10-16 Hfi Innovation Inc. Method and apparatus for unification of coefficient scan of 8X8 transform units in HEVC
AU2012365727B2 (en) * 2012-01-13 2015-11-05 Hfi Innovation Inc. Method and apparatus for unification of coefficient scan of 8x8 transform units in HEVC
WO2013104198A1 (en) * 2012-01-13 2013-07-18 Mediatek Inc. Method and apparatus for unification of coefficient scan of 8x8 transform units in hevc
CN104041049A (en) * 2012-01-13 2014-09-10 联发科技股份有限公司 Method and apparatus for unification of coefficient scan of 8x8 transform units in HEVC
US20150010072A1 (en) * 2012-01-13 2015-01-08 Mediatek Inc. Method and apparatus for unification of coefficient scan of 8x8 transform units in hevc
US9386306B2 (en) 2012-08-15 2016-07-05 Qualcomm Incorporated Enhancement layer scan order derivation for scalable video coding
KR101607310B1 (en) * 2014-10-29 2016-03-29 삼성전자주식회사 Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure
KR101576198B1 (en) 2014-10-29 2015-12-10 삼성전자주식회사 Method and apparatus for video encoding based on scanning order of hierarchical data units, and method and apparatus for video decoding based on the same
KR101576199B1 (en) 2015-04-13 2015-12-10 삼성전자주식회사 Method and apparatus for video encoding based on scanning order of hierarchical data units, and method and apparatus for video decoding based on the same
KR101607311B1 (en) * 2015-04-20 2016-03-29 삼성전자주식회사 Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure
KR101731430B1 (en) * 2015-07-02 2017-04-28 삼성전자주식회사 Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure
KR101607312B1 (en) * 2015-07-02 2016-03-29 삼성전자주식회사 Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure
KR101576200B1 (en) 2015-07-03 2015-12-10 삼성전자주식회사 Method and apparatus for video encoding based on scanning order of hierarchical data units, and method and apparatus for video decoding based on the same
KR101577243B1 (en) 2015-07-03 2015-12-14 삼성전자주식회사 Method and apparatus for video encoding based on scanning order of hierarchical data units, and method and apparatus for video decoding based on the same
CN109479138A (en) * 2016-07-13 2019-03-15 韩国电子通信研究院 Image coding/decoding method and device
CN109479138B (en) * 2016-07-13 2023-11-03 韩国电子通信研究院 Image encoding/decoding method and apparatus
CN109891889A (en) * 2016-11-01 2019-06-14 三星电子株式会社 Processing unit and its control method
US10694190B2 (en) 2016-11-01 2020-06-23 Samsung Electronics Co., Ltd. Processing apparatuses and controlling methods thereof
CN109891889B (en) * 2016-11-01 2021-09-28 三星电子株式会社 Processing apparatus and control method thereof
EP3494699A4 (en) * 2016-11-01 2019-06-12 Samsung Electronics Co., Ltd. Processing apparatuses and controlling methods thereof
WO2018084476A1 (en) 2016-11-01 2018-05-11 Samsung Electronics Co., Ltd. Processing apparatuses and controlling methods thereof
KR101783966B1 (en) * 2017-04-24 2017-10-10 삼성전자주식회사 Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure
KR101842262B1 (en) 2017-09-26 2018-03-26 삼성전자주식회사 Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure
KR102040316B1 (en) 2018-03-20 2019-11-04 삼성전자주식회사 Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure
KR20180032542A (en) * 2018-03-20 2018-03-30 삼성전자주식회사 Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure

Similar Documents

Publication Publication Date Title
WO2007079782A1 (en) Quality scalable picture coding with particular transform coefficient scan path
US8428143B2 (en) Coding scheme enabling precision-scalability
US10659776B2 (en) Quality scalable coding with mapping different ranges of bit depths
US9113167B2 (en) Coding a video signal based on a transform coefficient for each scan position determined by summing contribution values across quality layers
EP2904786B1 (en) Scalable video coding using inter-layer prediction of spatial intra prediction parameters
AU2006269728B2 (en) Method and apparatus for macroblock adaptive inter-layer intra texture prediction
EP2143279B1 (en) Scalable video coding supporting pixel value refinement scalability
US20070237240A1 (en) Video coding method and apparatus supporting independent parsing
US20070177664A1 (en) Entropy encoding/decoding method and apparatus
WO2007042063A1 (en) Video codec supporting quality-scalability

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 06706904; Country of ref document: EP; Kind code of ref document: A1)