WO2006006793A1 - Video encoding and decoding methods and video encoder and decoder - Google Patents

Video encoding and decoding methods and video encoder and decoder

Info

Publication number
WO2006006793A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion vector
layer motion
base layer
block
motion vectors
Prior art date
Application number
PCT/KR2005/002187
Other languages
English (en)
Inventor
Ho-Jin Ha
Woo-Jin Han
Original Assignee
Samsung Electronics Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from KR1020040118021A (KR100678949B1)
Application filed by Samsung Electronics Co., Ltd.
Publication of WO2006006793A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H04N19/615 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding using motion compensated temporal filtering [MCTF]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/115 Selection of the code volume for a coding unit prior to coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/187 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a scalable video layer
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/19 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding using optimisation based on Lagrange multipliers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/33 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H04N19/517 Processing of motion vectors by encoding
    • H04N19/52 Processing of motion vectors by encoding by predictive encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/53 Multi-resolution motion estimation; Hierarchical motion estimation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/63 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using sub-band based transform, e.g. wavelets
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]

Definitions

  • Apparatuses and methods consistent with the present invention relate to video coding, and more particularly, to video coding providing motion scalability.
  • A compression coding method is required for transmitting multimedia data including text, video, and audio.
  • A 24-bit true color image having a resolution of 640*480 needs a capacity of 640*480*24 bits, i.e., about 7.37 Mbits of data, per frame.
  • At 30 frames per second, a bandwidth of 221 Mbits/sec is required.
  • When a 90-minute movie based on such an image is stored, a storage space of about 1200 Gbits is required.
  • Accordingly, a compression coding method is a requisite for transmitting multimedia data including text, video, and audio.
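  • The figures above follow from simple arithmetic. The short Python sketch below reproduces them; the frame rate of 30 frames per second is the value implied by the ratio 221 / 7.37, and the variable names are illustrative only.

```python
# Raw (uncompressed) video cost for the 640*480, 24-bit example.
WIDTH, HEIGHT = 640, 480
BITS_PER_PIXEL = 24
FRAMES_PER_SECOND = 30  # assumed: the value implied by 221 Mbits/sec / 7.37 Mbits
MOVIE_MINUTES = 90

bits_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL
bits_per_second = bits_per_frame * FRAMES_PER_SECOND
bits_per_movie = bits_per_second * MOVIE_MINUTES * 60

print(f"{bits_per_frame / 1e6:.2f} Mbits per frame")   # 7.37 Mbits per frame
print(f"{bits_per_second / 1e6:.0f} Mbits/sec")        # 221 Mbits/sec
print(f"{bits_per_movie / 1e9:.0f} Gbits per movie")   # 1194 Gbits per movie
```

  • The exact product is about 1194 Gbits, which the text rounds to "about 1200 Gbits".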
  • Data redundancy is typically defined as: (i) spatial redundancy, in which the same color or object is repeated in an image; (ii) temporal redundancy, in which there is little change between adjacent frames in a moving image or the same sound is repeated in audio; or (iii) psychovisual redundancy, which takes into account the insensitivity of human vision and perception to high frequencies. Data can be compressed by removing such redundancy.
  • FIG. 1 is a block diagram of a conventional video encoder 100.
  • The conventional video encoder 100 includes a motion estimator 110 estimating motion between video frames, a motion compensator 120 removing temporal redundancies within video frames, a spatial transformer 130 performing spatial transform to remove spatial redundancies, a quantizer 140 quantizing the frames in which spatial redundancies have been removed, a motion information encoder 160, and a bitstream generator 150 generating a bitstream.
  • the motion estimator 110 finds motion vectors to be used in removing temporal redundancies by compensating the motion of a current frame.
  • the motion vector is defined as a displacement from the best-matching block in a reference frame with respect to a block in a current frame, which will be described with reference to FIG. 2.
  • Although the original video frame may be used as the reference frame, many known video coding techniques use a reconstructed frame, obtained by encoding and then decoding the original video frame, as the reference frame.
  • the motion compensator 120 uses the motion vectors calculated by the motion estimator 110 to remove the temporal redundancies present in the current frame. To this end, the motion compensator 120 uses a reference frame and motion vectors to generate a predicted frame and compares the current frame with the predicted frame to thereby generate a residual frame.
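  • The prediction and residual steps described for the motion compensator can be sketched as follows. This is a minimal illustration, not the encoder's actual implementation: frames are plain 2D lists of luma values, each block has one integer-pixel motion vector `(dy, dx)` pointing from the block in the current frame to its match in the reference frame, and border pixels are clamped. The function names and the data layout are assumptions.

```python
def predict_frame(reference, motion_vectors, block_size):
    """Build a predicted frame by copying, for every block, the reference
    pixels displaced by that block's motion vector (dy, dx)."""
    height, width = len(reference), len(reference[0])
    predicted = [[0] * width for _ in range(height)]
    for (by, bx), (dy, dx) in motion_vectors.items():
        for y in range(by, by + block_size):
            for x in range(bx, bx + block_size):
                # Clamp so displaced coordinates stay inside the reference frame.
                ry = min(max(y + dy, 0), height - 1)
                rx = min(max(x + dx, 0), width - 1)
                predicted[y][x] = reference[ry][rx]
    return predicted


def residual_frame(current, predicted):
    """Residual frame = current frame minus predicted frame, pixel by pixel."""
    return [[c - p for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(current, predicted)]


# The current frame is the reference shifted right by one pixel (edge replicated).
reference = [[10, 20, 30, 40] for _ in range(4)]
current = [[10, 10, 20, 30] for _ in range(4)]
motion_vectors = {(0, 0): (0, -1), (0, 2): (0, -1),
                  (2, 0): (0, -1), (2, 2): (0, -1)}

predicted = predict_frame(reference, motion_vectors, block_size=2)
residual = residual_frame(current, predicted)
print(residual)  # all zeros: the motion vectors fully explain the change
```

  • When the motion vectors describe the motion well, the residual frame carries little energy, which is what makes the later transform and quantization stages efficient.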
  • the spatial transformer 130 spatially transforms residual frames to obtain transform coefficients.
  • the most commonly used spatial transform algorithm is the Discrete Cosine Transform (DCT). Recently, a wavelet transform has been widely adopted.
  • the quantizer 140 quantizes the transform coefficients obtained through the spatial transformer 130.
  • a quantization strength is determined according to a bit rate.
  • The motion information encoder 160 encodes the motion vectors calculated by the motion estimator 110 in order to reduce the amount of data and generates motion information that is contained in a bitstream.
  • the bitstream generator 150 generates a bitstream containing the quantized transform coefficients and the encoded motion vectors. While not shown in FIG. 1, in conventional video coding schemes such as MPEG-2, MPEG-4, and H.264, the quantized transform coefficients are not directly inserted into the bitstream. Instead, texture information created after scanning, scaling, and entropy coding is contained in the bitstream.
  • FIG. 2 illustrates a conventional motion estimation process and a temporal mode used during motion estimation.
  • the motion estimation process is basically performed using a block-matching algorithm.
  • a block in a reference frame is moved within a search area to be compared with a block in a current frame and a difference between the two blocks and a cost for coding a motion vector are calculated.
  • a block in a reference frame minimizing the cost is selected as the best-matching reference block. While a full search guarantees the best performance in motion estimation, the process requires excessive computational load.
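  • A full-search block-matching step of the kind described above might look like the sketch below. The block difference is measured as a sum of absolute differences (SAD), and the cost of coding a motion vector is approximated by `|dy| + |dx|` weighted by a factor `lam`; both choices are assumptions made for illustration, since the text does not specify the cost function.

```python
def sad(current, reference, by, bx, dy, dx, size):
    """Sum of absolute differences between the current block at (by, bx)
    and the reference block displaced by (dy, dx)."""
    return sum(abs(current[by + y][bx + x] - reference[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))


def full_search(current, reference, by, bx, size, search_range, lam=1.0):
    """Try every displacement in the search area and keep the cheapest one."""
    height, width = len(reference), len(reference[0])
    best_mv, best_cost = None, float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose reference block would fall outside the frame.
            if not (0 <= by + dy <= height - size and 0 <= bx + dx <= width - size):
                continue
            # Cost = block difference + (hypothetical) motion vector coding cost.
            cost = sad(current, reference, by, bx, dy, dx, size) + lam * (abs(dy) + abs(dx))
            if cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost


# 6x6 test frames: the current frame equals the reference displaced by (1, -1).
reference = [[10 * y + x for x in range(6)] for y in range(6)]
current = [[reference[y + 1][x - 1] if y + 1 < 6 and x - 1 >= 0 else 0
            for x in range(6)] for y in range(6)]

best_mv, best_cost = full_search(current, reference, by=2, bx=2, size=2, search_range=2)
print(best_mv, best_cost)  # (1, -1) 2.0
```

  • The number of candidates grows with the square of the search range, which is the computational load that faster search strategies try to avoid.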
  • Three step search or hierarchical variable size block matching (HVSBM) is commonly used for motion estimation in the video coding schemes in wide use today.
  • A conventional video coding scheme uses the inter-frame prediction modes as well as an intra-frame prediction mode using information from the current frame.
  • Disclosure of Invention
  • a scalable video coding scheme using motion compensation to remove temporal redundancies provides high video compression efficiency at a sufficient bit rate.
  • However, the conventional scheme gives poor compression efficiency at a low bit rate, since it reduces the number of bits allocated to texture information in the bitstream while maintaining the same number of bits allocated to motion information.
  • When the conventional video coding scheme is performed at a very low bit rate, the resulting bitstream may contain little texture information or, in the extreme case, only motion information. For this reason, conventional video coding, in which motion information is difficult to reduce, suffers significant degradation in video quality at a low bit rate. Therefore, there is a need for an algorithm designed to adjust the number of bits allocated to motion information in a bitstream.
  • The present invention provides video encoding and decoding methods capable of adjusting the number of bits allocated to motion information, as well as a video encoder and decoder.
  • a video coding method including estimating a base layer motion vector and an enhancement layer motion vector for each block in a video frame; removing temporal redundancies in the video frame using the enhancement layer motion vectors; spatially transforming the video frame in which the temporal redundancies have been removed and quantizing the spatially transformed video frame to obtain texture information; selecting one of the estimated base layer motion vector and the estimated enhancement layer motion vector for each block; and generating a bitstream containing the motion vector selected for each block and the texture information.
  • A video coding method including estimating a base layer motion vector and an enhancement layer motion vector for each block in a video frame, removing temporal redundancies in the video frame using the enhancement layer motion vector, spatially transforming the video frame in which the temporal redundancies have been removed and quantizing the spatially transformed video frame to obtain texture information, and generating a bitstream containing the estimated base layer motion vector, a residual motion vector being the difference between the estimated base layer motion vector and the estimated enhancement layer motion vector, and the texture information for each block.
  • A video encoder including a motion estimator estimating a base layer motion vector and an enhancement layer motion vector for each block in a video frame, a motion compensator removing temporal redundancies in the video frame using the enhancement layer motion vectors, a spatial transformer spatially transforming the video frame in which the temporal redundancies have been removed, a quantizer quantizing the spatially transformed video frame to obtain texture information, a motion vector selector selecting one of the estimated base layer motion vector and the estimated enhancement layer motion vector for each block, and a bitstream generator generating a bitstream containing the motion vector selected for each block and the texture information.
  • A video encoder including a motion estimator estimating a base layer motion vector and an enhancement layer motion vector for each block in a video frame, a motion compensator removing temporal redundancies in the video frame using the enhancement layer motion vectors, a spatial transformer spatially transforming the video frame in which the temporal redundancies have been removed, a quantizer quantizing the spatially transformed video frame to obtain texture information, and a bitstream generator generating a bitstream containing the estimated base layer motion vector, a residual motion vector being the difference between the estimated base layer motion vector and the estimated enhancement layer motion vector, and the texture information for each block.
  • a predecoding method including receiving a bitstream containing a base layer motion vector and a residual motion vector being the difference between the base layer motion vector and an enhancement layer motion vector for each block, and texture information obtained by encoding the video frame, and truncating at least a part of the residual motion vectors.
  • A video decoding method including interpreting an input bitstream and obtaining texture information and motion information containing base layer motion vectors and residual motion vectors, merging a base layer motion vector with a residual motion vector for each of blocks having both the base layer motion vector and the residual motion vector and obtaining merged motion vectors, performing inverse quantization and inverse spatial transform on the texture information and obtaining frames in which temporal redundancies are removed, and performing inverse motion compensation on the frames in which the temporal redundancies have been removed using the merged motion vectors and the unmerged base layer motion vectors.
  • A video decoder including a bitstream interpreter interpreting an input bitstream and obtaining texture information and motion information containing base layer motion vectors and enhancement layer motion vectors, a motion vector readjuster readjusting the base layer motion vectors, an inverse quantizer performing inverse quantization on the texture information, an inverse spatial transformer performing inverse spatial transform on the inversely quantized texture information to obtain frames in which temporal redundancies are removed, and an inverse motion compensator performing inverse motion compensation on the frames in which the temporal redundancies have been removed using the readjusted base layer motion vectors and the enhancement layer motion vectors and reconstructing a video frame.
  • A video decoder including a bitstream interpreter interpreting an input bitstream and obtaining texture information and motion information containing base layer motion vectors and residual motion vectors, a motion vector merger merging a base layer motion vector with a residual motion vector for each of blocks having both the base layer motion vector and the residual motion vector and obtaining merged motion vectors, an inverse quantizer performing inverse quantization on the texture information, an inverse spatial transformer performing inverse spatial transform on the inversely quantized texture information and obtaining frames in which temporal redundancies are removed, and an inverse motion compensator performing inverse motion compensation on the frames in which the temporal redundancies have been removed using the merged motion vectors and the unmerged base layer motion vectors.
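  • The residual motion vector scheme in the aspects above can be illustrated with a small sketch covering all three stages: the encoder stores a base layer vector plus a residual per block, a predecoder may truncate some residuals, and the decoder merges whatever residuals remain. The dictionary layout keyed by block index and the function names are hypothetical.

```python
def split_motion_vectors(base_mvs, enhancement_mvs):
    """Encoder side: residual = enhancement layer vector - base layer vector."""
    return {block: (enhancement_mvs[block][0] - base_mvs[block][0],
                    enhancement_mvs[block][1] - base_mvs[block][1])
            for block in base_mvs}


def truncate_residuals(residual_mvs, keep_blocks):
    """Predecoder side: drop the residuals of every block not in keep_blocks."""
    return {block: mv for block, mv in residual_mvs.items() if block in keep_blocks}


def merge_motion_vectors(base_mvs, residual_mvs):
    """Decoder side: merge base + residual where a residual survived;
    blocks without a residual keep the unmerged base layer vector."""
    merged = {}
    for block, (base_y, base_x) in base_mvs.items():
        res_y, res_x = residual_mvs.get(block, (0, 0))
        merged[block] = (base_y + res_y, base_x + res_x)
    return merged


base_mvs = {0: (2, -4), 1: (0, 2), 2: (-2, 0)}
enhancement_mvs = {0: (3, -4), 1: (0, 1), 2: (-2, 0)}

residual_mvs = split_motion_vectors(base_mvs, enhancement_mvs)
kept = truncate_residuals(residual_mvs, keep_blocks={0})
decoded = merge_motion_vectors(base_mvs, kept)
print(residual_mvs)  # {0: (1, 0), 1: (0, -1), 2: (0, 0)}
print(decoded)       # {0: (3, -4), 1: (0, 2), 2: (-2, 0)}
```

  • Block 0 recovers its enhancement layer vector, while block 1 falls back to the base layer vector because its residual was truncated.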
  • FIG. 1 is a block diagram of a conventional video encoder.
  • FIG. 2 illustrates a conventional motion estimation process and temporal modes.
  • FIG. 3 is a block diagram of a video encoder according to a first exemplary embodiment of the present invention.
  • FIGS. 4 and 5 are block diagrams of video encoders according to second and third exemplary embodiments of the present invention, respectively.
  • FIG. 6 illustrates a motion estimation process according to an exemplary embodiment of the present invention.
  • FIG. 7 illustrates block modes according to an exemplary embodiment of the present invention.
  • FIG. 8 illustrates examples of a frame with different percentages of enhancement layers according to an exemplary embodiment of the present invention.
  • FIG. 9 is a block diagram of a video decoder according to a first exemplary embodiment of the present invention.
  • FIGS. 10 and 11 are block diagrams of video decoders according to second and third exemplary embodiments of the present invention, respectively.
  • FIG. 12 illustrates a video service environment according to an exemplary embodiment of the present invention.
  • FIG. 13 illustrates the structure of a bitstream according to an exemplary embodiment of the present invention.
  • FIG. 14 is a graph illustrating changes in video qualities when an enhancement layer motion vector and a base layer motion vector are used.
  • The present invention presents a video coding scheme designed to adjust the amount of bits being allocated to motion vectors (motion information) and can be applied to both open-loop video coding using an original video frame as a reference frame and closed-loop video coding using a reconstructed frame as a reference frame. Since closed-loop video coding uses a reconstructed frame obtained by performing inverse quantization, inverse transform, and motion compensation on quantized transform coefficients as a reference frame, a closed-loop video encoder includes several components for video decoding such as an inverse quantizer and an inverse spatial transformer, unlike an open-loop video encoder. While the present invention will be described with reference to exemplary embodiments using open-loop scalable video coding, closed-loop video coding may be used as well.
  • FIG. 3 is a block diagram of a video encoder 300 according to a first exemplary embodiment of the present invention.
  • the video encoder 300 includes a motion estimator 310, a motion compensator 320, a spatial transformer 330, a quantizer 340, a bitstream generator 350, a motion vector selector 360, and a motion information encoder 370.
  • the motion estimator 310 estimates motion between each block in a current frame and a block in one reference frame or blocks in two reference frames corresponding to the block in the current frame.
  • the displacement between positions of each block in the current frame and a corresponding block in the reference frame is defined as a motion vector.
  • Three step search and two-dimensional (2D) logarithmic search reduce the amount of computation by reducing the number of search points for each motion vector estimation.
  • An adaptive/predictive search is a method by which a motion vector for a block in a current frame is predicted from a motion vector for a block in the previous frame in order to reduce the amount of computation required for motion estimation.
  • HVSBM is an algorithm in which a frame having an original resolution is downsampled to obtain low resolution frames, and a motion vector found at the lowest resolution is used to find motion vectors at increasingly higher resolutions.
  • Another approach to reducing the amount of computation needed for motion estimation is to replace the function that calculates the block-matching cost with a simpler one.
  • the motion estimator 310 in the present exemplary embodiment performs a process of finding a base layer motion vector and a process of finding an enhancement layer motion vector. That is, the motion estimator 310 finds the base layer motion vector and then readjusts the base layer motion vector to find the enhancement layer motion vector.
  • The process of finding motion vectors of a base layer and an enhancement layer may be performed by various motion estimation algorithms.
  • The process of finding the motion vector of a base layer, or of finding the motion vectors of both the base layer and the enhancement layer, is performed using HVSBM, since a motion vector obtained using HVSBM tends to be consistent with the motion vector for an adjacent block.
  • The enhancement layer motion vector is found within a search area smaller than the search area in which the base layer motion vector is obtained. In other words, the enhancement layer motion vector is obtained by readjusting the base layer motion vector already estimated.
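  • The two-stage estimation described above, a base layer search followed by a readjustment within a smaller search area, can be sketched as below. How the base layer search differs from the refinement is not specified in this passage; the sketch assumes, purely for illustration, that the base layer vector comes from a coarse search (every second displacement) and that the enhancement layer vector is refined at full-pixel steps within one pixel of it. A plain SAD is used as the matching cost, and all names are hypothetical.

```python
def block_sad(current, reference, by, bx, dy, dx, size):
    """Sum of absolute differences for one candidate displacement."""
    return sum(abs(current[by + y][bx + x] - reference[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))


def search(current, reference, by, bx, size, center, radius, step=1):
    """Return the lowest-SAD displacement within `radius` of `center`."""
    height, width = len(reference), len(reference[0])
    best_mv, best_cost = center, float("inf")
    for dy in range(center[0] - radius, center[0] + radius + 1, step):
        for dx in range(center[1] - radius, center[1] + radius + 1, step):
            # Skip candidates whose reference block leaves the frame.
            if not (0 <= by + dy <= height - size and 0 <= bx + dx <= width - size):
                continue
            cost = block_sad(current, reference, by, bx, dy, dx, size)
            if cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv


def make_frame(top, left):
    """8x8 black frame containing one 2x2 textured object (test data only)."""
    frame = [[0] * 8 for _ in range(8)]
    for y, row in enumerate([[50, 60], [70, 80]]):
        for x, value in enumerate(row):
            frame[top + y][left + x] = value
    return frame


reference = make_frame(top=3, left=1)
current = make_frame(top=2, left=2)

# Base layer: coarse search (assumed step of 2) over a wide area.
base_mv = search(current, reference, 2, 2, 2, center=(0, 0), radius=2, step=2)
# Enhancement layer: readjust the base vector within a smaller area.
enhancement_mv = search(current, reference, 2, 2, 2, center=base_mv, radius=1)
print(base_mv, enhancement_mv)  # (0, 0) (1, -1)
```

  • Because the refinement only examines a small neighborhood, the enhancement layer vector stays close to the base layer vector, which keeps the residual between the two small.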
  • the motion compensator 320 obtains order information by performing motion compensation using the base layer motion vector (hereinafter called 'base layer motion compensation') separately from motion compensation using the enhancement layer motion vector (hereinafter called 'enhancement layer motion compensation'). The motion compensator 320 then provides frames in which temporal redundancies have been removed by enhancement layer motion compensation to the spatial transformer 330.
  • MCTF: Motion Compensated Temporal Filtering.
  • While a Haar filter was used in conventional MCTF, a 5/3 filter has recently come into wide use.
  • MCTF is performed on a group of pictures (GOP) basis and includes generating a predicted frame using the result of motion estimation, obtaining a residual frame (high-pass subband) that is the difference between a current frame and the predicted frame, and updating the remaining original frame or a low-pass subband using the residual frame.
  • temporal redundancies are removed in frames making up a GOP to obtain one low-pass subband and a plurality of high-pass subbands.
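  • The predict and update steps of MCTF can be illustrated with a Haar filter in lifting form. The sketch below deliberately omits motion compensation (it assumes zero motion, so the predicted frame is simply the neighboring frame) and represents each frame as a flat list of pixel values; it only shows how a GOP collapses into one low-pass subband and several high-pass subbands. The function names are hypothetical.

```python
def haar_lift(frame_a, frame_b):
    """One Haar lifting step on a pair of frames (zero motion assumed)."""
    # Predict: the high-pass subband is the residual of frame_b predicted from frame_a.
    high = [b - a for a, b in zip(frame_a, frame_b)]
    # Update: the low-pass subband is frame_a updated with half of the residual.
    low = [a + h / 2 for a, h in zip(frame_a, high)]
    return low, high


def mctf_gop(frames):
    """Decompose a GOP (length a power of two) into one low-pass subband
    and a list of high-pass subbands, finest temporal level first."""
    highs = []
    level = list(frames)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            low, high = haar_lift(level[i], level[i + 1])
            next_level.append(low)
            highs.append(high)
        level = next_level
    return level[0], highs


gop = [[10, 20], [12, 20], [14, 22], [16, 22]]  # four tiny two-pixel frames
low, highs = mctf_gop(gop)
print(low)    # [13.0, 21.0]
print(highs)  # [[2, 0], [2, 0], [4.0, 2.0]]
```

  • For a GOP of four frames the result is one low-pass subband, here the temporal average of the GOP, and three high-pass subbands.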
  • the spatial transformer 330 removes spatial redundancies in the frames in which temporal redundancies have been removed using spatial transform and creates transform coefficients.
  • the spatial transform is performed using DCT or wavelet transform.
  • the video encoder 300 may use wavelet transform to generate a bitstream having spatial scalability.
  • the video encoder 300 with a plurality of layers of different resolutions may use DCT to remove spatial redundancies in the frames in which the temporal redundancies have been removed in order to generate a bitstream having spatial scalability.
  • the quantizer 340 quantizes the transform coefficients in such a way as to minimize distortion at a given bit rate.
  • Quantization for scalable video coding is performed using well-known embedded quantization algorithms such as Embedded ZeroTrees Wavelet (EZW), Set Partitioning in Hierarchical Trees (SPIHT), Embedded ZeroBlock Coding (EZBC), and Embedded Block Coding with Optimized Truncation (EBCOT).
  • the quantized transform coefficients (texture information) are inserted into a bitstream after being subjected to scanning, scaling, and variable length coding. Meanwhile, the bitstream contains texture information as well as motion information.
  • the video encoder 300 includes the motion vector selector 360 and the motion information encoder 370.
  • The motion vector selector 360 selects either one of a base layer motion vector and an enhancement layer motion vector for each block. More specifically, an enhancement layer motion vector is selected in the order from a block with a largest difference to a block with a smallest difference between visual qualities obtained when temporal redundancies are removed using base layer motion compensation and enhancement layer motion compensation, respectively. For example, when the extent to which visual quality is improved decreases in the order of blocks 1, 2, 3, 4, 5, 6, 7, and 8 and enhancement layer motion compensation can be used only for three blocks, the motion vector selector 360 selects enhancement layer motion vectors for blocks 1 through 3 and base layer motion vectors for blocks 4 through 8.
  • The selected motion information (base layer motion vectors and enhancement layer motion vectors) is provided to the motion information encoder 370. Consequently, the texture information contained in the bitstream is the quantized transform coefficients obtained from enhancement layer motion compensation, spatial transform, and quantization while the motion information contained therein is the enhancement layer motion vectors for blocks 1 through 3 and the base layer motion vectors for blocks 4 through 8.
  • the motion vector selector 360 receives information about the order of blocks in which the extent to which video quality is improved decreases (hereinafter called 'order information') from the motion compensator 320.
  • the percentage of enhancement layer motion vectors selected by the motion vector selector 360 may be input manually by a user or be determined automatically according to a bit rate.
  • the motion vector selector 360 selects a high percentage of enhancement layer motion vectors for a high bit rate while selecting a low percentage of enhancement layer motion vectors for a low bit rate.
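  • The selection rule described for the motion vector selector 360 can be sketched as follows, using the eight-block example from the text. The quality gain values, the dictionary layout, and the function name are hypothetical; only the ranking rule comes from the description.

```python
def select_motion_vectors(base_mvs, enhancement_mvs, quality_gain, max_enhancement_blocks):
    """Give enhancement layer vectors to the blocks that benefit most.

    quality_gain[block] is the visual quality improvement obtained by using
    enhancement layer instead of base layer motion compensation for that block.
    """
    # "Order information": blocks sorted by decreasing quality improvement.
    order = sorted(quality_gain, key=quality_gain.get, reverse=True)
    enhanced = set(order[:max_enhancement_blocks])
    return {block: ("enhancement", enhancement_mvs[block]) if block in enhanced
            else ("base", base_mvs[block])
            for block in base_mvs}


blocks = range(1, 9)
base_mvs = {b: (0, 0) for b in blocks}         # placeholder vectors
enhancement_mvs = {b: (1, 1) for b in blocks}  # placeholder vectors
quality_gain = {b: 9 - b for b in blocks}      # gain decreases from block 1 to block 8

selected = select_motion_vectors(base_mvs, enhancement_mvs, quality_gain,
                                 max_enhancement_blocks=3)
enhanced_blocks = [b for b in blocks if selected[b][0] == "enhancement"]
print(enhanced_blocks)  # [1, 2, 3]
```

  • Raising `max_enhancement_blocks` corresponds to selecting a higher percentage of enhancement layer motion vectors, as would be done at a higher bit rate.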
  • the motion information encoder 370 encodes the motion information using arithmetic coding or variable length coding.
  • the encoded motion information is inserted into the bitstream.
  • Coding efficiency for the motion information is high when the motion vectors of adjacent blocks are consistent with one another.
  • For this reason, the motion estimator 310 estimates motion vectors (base layer motion vectors and enhancement layer motion vectors) using an HVSBM algorithm.
  • The bitstream generator 350 generates a bitstream containing the texture information and the encoded motion information. While it is described above that the motion vector for each block contained in the encoded motion information is either a base layer motion vector or an enhancement layer motion vector, the motion information may contain, instead of the enhancement layer motion vector, the base layer motion vector and the residual motion vector needed for obtaining the enhancement layer motion vector. The same applies to the video encoder shown in FIG. 4.
  • FIG. 4 is a block diagram of a video encoder 400 according to a second exemplary embodiment of the present invention.
  • a motion estimator 410, a motion compensator 420, a spatial transformer 430, and a quantizer 440 in the video encoder 400 have substantially the same functions as their counterparts in the video encoder 300 of FIG. 3.
  • A motion vector selector 460 operates in a slightly different way than its counterpart in the video encoder 300.
  • The motion vector selector 460 generates a plurality of types of motion data, each having a different percentage of base layer motion vectors and enhancement layer motion vectors. For example, the motion vector selector 460 may generate a total of six types of motion data.
  • a first type of motion data consists of enhancement layer motion vectors for all blocks.
  • a second type of motion data consists of enhancement layer motion vectors for 80 percent of the blocks and base layer motion vectors for 20 percent of the blocks.
  • a third type of motion data consists of enhancement layer motion vectors for 60 percent of the blocks and base layer motion vectors for 40 percent of the blocks.
  • a fourth type of motion data contains enhancement layer motion vectors for 40 percent of the blocks and base layer motion vectors for 60 percent of the blocks.
  • a fifth type of motion data contains enhancement layer motion vectors for 20 percent of the blocks and base layer motion vectors for 80 percent of the blocks.
  • a sixth type of motion data contains base layer motion vectors for all blocks. The six types of motion data are all inserted into the bitstream. Meanwhile, a video decoder receives a bitstream predecoded by a predecoder 480 in order to reconstruct video frames using one type of motion data.
  • each type of motion data may contain a different percentage of enhancement layer motion vectors than in the above example.
  • the percentages of enhancement layer motion vectors contained in the six types of motion vector data may be 100, 70, 40, 20, 10, and 0, respectively.
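The six types of motion data listed above could be produced along the following lines. This is only a sketch: `build_motion_data_types`, the dictionary representation of a motion field, and the rounding down of block counts are assumptions for illustration; the percentages default to the 100/80/60/40/20/0 example from the text.

```python
from typing import Dict, List, Sequence, Tuple

MotionVector = Tuple[int, int]


def build_motion_data_types(order: List[int],
                            base_mvs: Dict[int, MotionVector],
                            enh_mvs: Dict[int, MotionVector],
                            percentages: Sequence[int] = (100, 80, 60, 40, 20, 0)
                            ) -> List[Dict[int, MotionVector]]:
    """Return one motion field per requested enhancement layer percentage."""
    types = []
    for pct in percentages:
        # Blocks at the front of 'order' benefit most from an enhancement vector.
        n_enh = len(order) * pct // 100
        chosen = set(order[:n_enh])
        types.append({b: (enh_mvs[b] if b in chosen else base_mvs[b])
                      for b in order})
    return types
```

Passing `percentages=(100, 70, 40, 20, 10, 0)` would give the alternative split mentioned in the text.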
  • the motion information encoder 470 encodes the plurality of types of motion data using arithmetic coding or variable length coding in order to reduce the amount of data.
  • the bitstream generator 450 generates a bitstream containing the texture information and the encoded motion data.
  • the predecoder 480 truncates all of the encoded motion data except one type, which is transmitted to the decoder. For example, when the bandwidth for transmitting a bitstream to the decoder is very narrow, the predecoder 480 truncates all encoded motion data except the type containing the lowest percentage (e.g., 0%) of enhancement layer motion vectors. Conversely, when the bandwidth is very wide, the predecoder 480 truncates all encoded motion data except the type containing the highest percentage (e.g., 100%) of enhancement layer motion vectors. In a similar fashion, the predecoder 480 retains the one type of motion data suited to the bit rate and truncates the rest.
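The text does not say how the predecoder 480 maps a bit rate to a type of motion data. One plausible policy, sketched below, keeps the richest type whose encoded size fits a motion bit budget. The function `choose_motion_data_type`, the notion of a per-frame motion budget, and the ordering of types from highest to lowest enhancement layer percentage are all assumptions made for this example.

```python
from typing import Sequence


def choose_motion_data_type(sizes_bits: Sequence[int],
                            motion_budget_bits: int) -> int:
    """Return the index of the single motion data type to keep.

    sizes_bits[i] is the encoded size of type i; types are assumed to be
    ordered from the highest to the lowest enhancement layer percentage.
    """
    if not sizes_bits:
        raise ValueError("at least one motion data type is required")
    for index, size in enumerate(sizes_bits):
        if size <= motion_budget_bits:
            return index  # first (richest) type that fits the budget
    # Nothing fits: keep the last type, which holds the fewest enhancement vectors.
    return len(sizes_bits) - 1
```

All other types would then be truncated from the bitstream before transmission.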
  • FIG. 5 is a block diagram of a video encoder 500 according to a third exemplary embodiment of the present invention.
  • a motion estimator 510, a motion compensator 520, a spatial transformer 530, a quantizer 540, a bitstream generator 550, and a motion information encoder 570 in the video encoder 500 have substantially the same functions as their counterparts in the video encoder 300 of FIG. 3.
  • the video encoder 500 does not include a motion vector selector.
  • the motion information encoder 570 encodes information containing both a base layer motion vector and an enhancement layer motion vector for each block.
  • the encoded motion information (base layer motion vectors and enhancement layer motion vectors) is inserted into a bitstream.
  • the bitstream generator 550 generates a bitstream containing the texture information, the encoded motion information, and the order information.
  • the predecoder 580 truncates the encoded motion information starting with the enhancement layer motion vector of the block showing the smallest quality improvement. For example, the predecoder 580 truncates all the encoded enhancement layer motion vectors when the bit rate is very low, while retaining all of them when the bit rate is sufficient.
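The truncation performed by the predecoder 580 can be pictured as dropping enhancement layer motion vectors from the tail of the order information until the remainder fits. The sketch below assumes a per-block bit cost table and a bit budget, neither of which is specified in the text; `truncate_enhancement` is a hypothetical name.

```python
from typing import Dict, List


def truncate_enhancement(order: List[int],
                         enh_bits: Dict[int, int],
                         budget_bits: int) -> List[int]:
    """Return the blocks whose enhancement layer vectors are kept.

    order    -- block ids, largest quality improvement first
    enh_bits -- encoded size of each block's enhancement layer vector
    """
    kept = list(order)
    total = sum(enh_bits[b] for b in kept)
    # Drop the least significant block first until the budget is met.
    while kept and total > budget_bits:
        total -= enh_bits[kept.pop()]
    return kept
```

With a budget of zero every enhancement layer vector is dropped, which corresponds to the very low bit rate case; a generous budget leaves the list untouched.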
  • FIG. 6 illustrates a motion estimation process according to an exemplary embodiment of the present invention.
  • a base layer motion vector, an enhancement layer motion vector, and a residual motion vector are shown in FIG. 6.
  • the base layer motion vector and the enhancement layer motion vector are obtained from a base layer motion search and an enhancement layer motion search, respectively, and a residual motion vector is the difference between the enhancement layer motion vector and the base layer motion vector.
  • a block 610 is a block in a current frame
  • a block 620 is a block corresponding to the block 610
  • a block 630 is a block obtained from a base layer motion search.
  • in a single-stage motion search, the block 620 corresponding to the block 610 is found directly.
  • in the layered motion search, the block 620 is found using an enhancement layer motion search after the block 630 is found using a base layer motion search.
  • a block at a position that minimizes the cost for encoding a block in a current frame is determined as a block corresponding to the block in the current frame.
  • E(k, l) and B(k, l) respectively denote the bits allocated to texture and to motion vectors when encoding a k-th block in a current frame using an l-th block in a search area of a reference frame
  • the cost C(k, l) is defined by Equation (1): $C(k, l) = E(k, l) + \lambda B(k, l)$
  • $\lambda$ is a Lagrangian coefficient used to control the balance between the bits allocated to motion vectors and to texture.
  • as $\lambda$ increases, the number of bits allocated to texture increases.
  • as $\lambda$ decreases, the number of bits allocated to motion vectors increases.
  • when $\lambda$ is made sufficiently large, bits are mainly allocated to texture.
  • a value l that minimizes the cost C(k, l) is found and a displacement between the block 630 in the reference frame corresponding to the value l and the block 610 in the current frame is calculated.
  • the block 620 is found within an enhancement layer search area using Equation (1).
  • the enhancement layer search area may be significantly narrower than the base layer search area in order to minimize the difference between the base layer motion vector and the enhancement layer motion vector.
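  • the block 620 that minimizes the cost is found; the displacement between the block 620 and the block 610 in the current frame is determined as the enhancement layer motion vector, and the displacement between the block 620 and the block 630 found using the base layer motion search is the residual motion vector.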
  • the base layer motion search uses a greater $\lambda$ than the enhancement layer motion search so that a small number of bits can be allocated to the base layer motion vector.
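The two-stage search can be sketched as two minimisations of a cost of the form E + λB, with a larger λ and a wide window for the base layer and a smaller λ and a narrow window centred on the base layer result for the enhancement layer. In the sketch below the callables `texture_bits` and `motion_bits`, the square search windows, and the choice to charge `motion_bits` on the full vector in both stages are assumptions; a real encoder would derive these costs from actual coding.

```python
from typing import Callable, List, Tuple

Vector = Tuple[int, int]
CostFn = Callable[[Vector], float]


def window(center: Vector, radius: int) -> List[Vector]:
    """All candidate displacements in a square window around 'center'."""
    cx, cy = center
    return [(cx + dx, cy + dy)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)]


def best_vector(candidates: List[Vector], texture_bits: CostFn,
                motion_bits: CostFn, lam: float) -> Vector:
    """Candidate that minimises the cost E + lam * B."""
    return min(candidates,
               key=lambda v: texture_bits(v) + lam * motion_bits(v))


def two_layer_search(texture_bits: CostFn, motion_bits: CostFn,
                     lam_base: float, lam_enh: float,
                     base_radius: int, enh_radius: int
                     ) -> Tuple[Vector, Vector, Vector]:
    """Return (base vector, enhancement vector, residual vector)."""
    base = best_vector(window((0, 0), base_radius),
                       texture_bits, motion_bits, lam_base)
    # The enhancement search is confined to a narrow window around 'base'.
    enh = best_vector(window(base, enh_radius),
                      texture_bits, motion_bits, lam_enh)
    residual = (enh[0] - base[0], enh[1] - base[1])
    return base, enh, residual
```

Because the enhancement window is small, the residual vector stays small as well, which is what keeps the extra motion bits low.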
  • texture and base layer motion information are contained in the bitstream in order to minimize the number of bits being allocated to the motion vector.
  • the base layer motion search and the enhancement layer motion search may be performed using HVSBM.
  • HVSBM providing consistent motion vector fields reduces the overall bit rate of the motion vectors. Furthermore, HVSBM requires a small amount of computation and also achieves motion scalability by restricting an enhancement layer search area to a small region.
  • PSNR: peak signal-to-noise ratio
  • the bitstream generated by the video encoder 300 of FIG. 3 contains a single type of motion data consisting of either a base layer motion vector or an enhancement layer motion vector for each block.
  • the bitstream generated by the video encoder 400 of FIG. 4 contains a plurality of types of motion data, each consisting of either a base layer motion vector or an enhancement layer motion vector for each block.
  • each type of motion data has a different percentage of enhancement layer motion vectors.
  • the bitstream is predecoded so that all motion data except one particular type is truncated before transmission to a video decoder.
  • the bitstream generated by the video encoder 500 of FIG. 5 contains a single type of motion data consisting of both a base layer motion vector and a residual motion vector for each block.
  • the bitstream is predecoded according to a bit rate to transmit only base layer motion vectors for some blocks and both base layer motion vectors and residual motion vectors for the remaining blocks to the video decoder.
  • the video encoder 300 of FIG. 3 may include a motion vector merger merging a base layer motion vector with a residual motion vector instead of the motion vector selector 360.
  • base layer motion vectors and residual motion vectors are provided to the motion vector merger while the motion information containing base layer motion vectors and enhancement layer motion vectors is provided to the motion information encoder 370.
  • Each of the enhancement layer motion vectors is obtained by merging a base layer motion vector with a residual motion vector.
  • the video encoder 400 of FIG. 4 may also include a motion vector merger instead of the motion vector selector 460.
  • the bitstream generated by the video encoder 500 of FIG. 5 contains both a base layer motion vector and a residual motion vector for each block
  • an enhancement layer motion vector may be inserted into the bitstream instead of the residual motion vector.
  • the predecoder 580 selectively truncates a base layer motion vector or an enhancement layer motion vector for each block according to a bit rate and order information.
  • FIG. 7 illustrates block modes according to an exemplary embodiment of the present invention. Referring to FIG. 7, the motion scalability achieved using a small enhancement layer search area as described above is intensified when the concept of a block mode is introduced.
  • in mode 1, mode 2, mode 3, and mode 4, a motion vector search is made in 8*16, 16*8, 8*8, and 4*4 subblocks, respectively.
  • a base layer block mode is one of mode 0, mode 1, mode 2, and mode 3 while an enhancement layer block mode is one of mode 0, mode 1, mode 2, mode 3, and mode 4.
  • when the base layer block mode is mode 0, the enhancement layer block mode is selected from mode 0, mode 1, mode 2, mode 3, and mode 4.
  • when the base layer block mode is mode 1, the enhancement layer block mode is selected from mode 1, mode 3, and mode 4.
  • when the base layer block mode is mode 2 or mode 3, the enhancement layer block mode is selected from modes 2 through 4 or from modes 3 and 4, respectively.
  • when the base layer block mode is mode 1, the enhancement layer block mode cannot be mode 2 since mode 1 and mode 2 are a horizontal mode and a vertical mode, respectively.
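The mode constraints can be captured in a small lookup table. The table below reads the constraints as follows: base mode 0 allows any enhancement mode, base mode 1 allows modes 1, 3, and 4, base mode 2 allows modes 2 through 4, and base mode 3 allows modes 3 and 4. The first two rows are an interpretation of the text; the names `ALLOWED_ENH_MODES` and `is_valid_mode_pair` are hypothetical.

```python
from typing import Dict, Tuple

# Base layer block mode -> enhancement layer block modes that may follow it.
ALLOWED_ENH_MODES: Dict[int, Tuple[int, ...]] = {
    0: (0, 1, 2, 3, 4),
    1: (1, 3, 4),      # mode 2 excluded: it splits in the other direction
    2: (2, 3, 4),
    3: (3, 4),
}


def is_valid_mode_pair(base_mode: int, enh_mode: int) -> bool:
    """True if 'enh_mode' refines (or equals) the partition of 'base_mode'."""
    return enh_mode in ALLOWED_ENH_MODES.get(base_mode, ())
```

An encoder could use such a check to restrict the enhancement layer mode decision to partitions that subdivide the base layer partition.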
  • because the base layer motion search uses a greater $\lambda$ than the enhancement layer motion search as described above, a larger penalty is imposed on the base layer even if the number of bits allocated to the motion vector estimated during the base layer motion search (base layer motion vector) is equal to the number of bits allocated to the motion vectors estimated during the enhancement layer motion search (base layer motion vector and enhancement layer motion vector).
  • mode 0 was determined as a base layer block mode except for special cases.
  • because the enhancement layer uses a small $\lambda$, the penalty for the number of bits allocated to a motion vector is smaller than for the base layer. For this reason, an enhancement layer block mode usually has more finely subdivided blocks. While FIG. 7 shows five block modes, the number of block modes available may be greater than or less than five.
  • a texture image contained in the bitstream is obtained by performing spatial transform and quantization on frames in which temporal redundancies have been removed using enhancement layer motion vectors.
  • when motion vectors for some blocks are base layer motion vectors at a low bit rate, motion mismatch may occur.
  • the motion mismatch is introduced since an enhancement layer motion vector is used during encoding but a base layer motion vector is used during decoding, which results in degradation of coding performance (e.g., visual quality, compression efficiency, etc).
  • the present invention proposes an algorithm for determining an enhancement layer motion vector or a base layer motion vector for each block.
  • the degree E of a mismatch resulting from the use of a base layer motion vector at a decoder is given by Equation (2) as follows: $E = |O_m - O_b|$
  • $O_m$ and $O_b$ are frames reconstructed using enhancement layer motion vectors and base layer motion vectors, respectively.
  • $O_m$ and $O_b$ are defined by Equation (3) as follows: $O_m = P_m + H_m, \quad O_b = P_b + H_m$
  • $P_m$ and $H_m$ are a predicted frame and a residual frame obtained using enhancement layer motion vectors, respectively, and $P_b$ is a frame predicted using base layer motion vectors.
  • $O_m$ may also be defined by Equation (4) as follows: $O_m = P_b + H_b$
  • from Equations (2) through (4), the degree E of mismatch is given by Equation (5) as follows: $E = |P_m - P_b| = |H_m - H_b|$
  • the degree E of mismatch is determined by the difference between frames predicted using enhancement layer motion vectors and base layer motion vectors or the difference between residual frames obtained using enhancement layer motion vectors and base layer motion vectors.
  • predicted frames and residual frames are obtained by the motion compensator 320, 420, or 520. That is, the motion compensator 320, 420, or 520 receives base layer motion vectors and enhancement layer motion vectors from the motion estimator 310, 410, or 510 to generate predicted frames $P_m$ and $P_b$ and residual frames $H_m$ and $H_b$.
  • the order of significance of each block may be determined using Equation (5). That is, the difference between encoding each block using enhancement layer motion compensation and using base layer motion compensation is calculated, and the order of significance of blocks is determined according to the degree of difference.
  • the order of significance may be determined by the difference between residual blocks (the difference between a block in a current frame and a block in a predicted frame) obtained using enhancement layer motion compensation and base layer motion compensation. That is, when there is a large difference between residual blocks, the difference between encoding of each block using enhancement layer motion compensation and using base layer motion compensation is also considered large.
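A minimal sketch of this ranking follows. It measures the difference between the two residual blocks as a sum of absolute sample differences, which is an assumption (the text only speaks of "the difference"), and it stores each residual block as a flat list of samples keyed by block id. `significance_order` is a hypothetical name.

```python
from typing import Dict, List, Sequence


def significance_order(res_enh: Dict[int, Sequence[float]],
                       res_base: Dict[int, Sequence[float]]) -> List[int]:
    """Rank blocks by the mismatch between their two residual blocks.

    res_enh[b]  -- residual of block b under enhancement layer compensation
    res_base[b] -- residual of block b under base layer compensation
    """
    def mismatch(block: int) -> float:
        # Sum of absolute differences between the two residual blocks.
        return sum(abs(m - n) for m, n in zip(res_enh[block], res_base[block]))

    # Largest mismatch first: these blocks gain most from enhancement vectors.
    return sorted(res_enh, key=mismatch, reverse=True)
```

The resulting list plays the role of the order information: blocks at the front are the first to receive enhancement layer motion vectors.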
  • the order of significance of blocks may be calculated by a motion vector selector instead of a motion estimator.
  • the motion vector selector 360 or 460 in FIG. 3 or 4 selects an enhancement layer motion vector in the order of significance. That is, an enhancement layer motion vector is preferentially allocated to a block with a large error.
  • a bitstream generated by the video encoder of FIG. 5 not including a motion vector selector contains base layer motion vectors and residual motion vectors for all blocks and order information. Using the order information, the predecoder 580 truncates motion information starting from the residual motion vectors with the least significance, as needed according to a bit rate.
  • FIG. 8 illustrates examples of a frame in which the percentage of enhancement layer motion vectors is 0% and 50%, respectively.
  • a block mode number is indicated within a block. As illustrated in FIG. 8, a base layer block mode and an enhancement layer block mode may vary for the same block. When the base layer block mode is different from the enhancement layer block mode, an enhancement layer block mode is used for a block being subjected to inverse enhancement layer motion compensation during decoding while a base layer block mode is used for a block being subjected to inverse base layer motion compensation.
  • FIG. 9 shows a video decoder 900 for decoding the bitstream generated by the video encoder 300 of FIG. 3 or the predecoded bitstream generated by the predecoder 480 shown in FIG. 4.
  • FIGS. 10 and 11 show video decoders for decoding a predecoded bitstream generated by the predecoder 580 shown in FIG. 5.
  • FIG. 9 is a block diagram of a video decoder 900 according to a first exemplary embodiment of the present invention.
  • the video decoder 900 includes a bitstream interpreter 910, an inverse quantizer 920, an inverse spatial transformer 930, an inverse motion compensator 940, a motion information decoder 950, and a motion vector readjuster 960.
  • the bitstream interpreter 910 obtains texture information and encoded motion information from an input bitstream.
  • the texture information containing image data of encoded video frames is provided to the inverse quantizer 920 while the encoded motion information containing either a base layer motion vector or an enhancement layer motion vector for each block is provided to the motion information decoder 950.
  • the inverse quantizer 920 inversely quantizes the texture information to obtain transform coefficients.
  • the obtained transform coefficients are sent to the inverse spatial transformer 930.
  • the inverse spatial transformer 930 performs inverse spatial transform on the transform coefficients to obtain a single low-pass subband and a plurality of high-pass subbands for each GOP.
  • the inverse motion compensator 940 receives the low-pass subband and the plurality of high-pass subbands for each GOP to update the low-pass subband using one or more high-pass subbands and generate a predicted frame using the updated low-pass subband.
  • the inverse motion compensator 940 then adds the predicted frame to a high-pass subband, thereby reconstructing a low-pass subband.
  • the inverse motion compensator 940 updates the updated low-pass subband and the reconstructed low-pass subband again, generates two predicted frames using the updated low-pass subbands, and reconstructs two low-pass subbands by adding the two predicted frames to two high-pass subbands, respectively.
  • the inverse motion compensator 940 performs the above process iteratively to reconstruct video frames making up a GOP.
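The iterative update and prediction steps can be illustrated with a much simplified inverse temporal lifting scheme. The sketch below assumes a Haar-style filter pair and zero motion (no motion vectors are applied), neither of which is stated in the text, and represents each frame as a flat list of samples. It only shows how one low-pass subband and the high-pass subbands of each level expand back into a full GOP.

```python
from typing import List

Frame = List[float]


def inverse_level(lows: List[Frame], highs: List[Frame]) -> List[Frame]:
    """Undo one temporal decomposition level (Haar lifting, zero motion)."""
    frames: List[Frame] = []
    for low, high in zip(lows, highs):
        # Inverse update: remove the high-pass contribution from the low-pass.
        even = [l - h / 2 for l, h in zip(low, high)]
        # The updated frame serves as the predicted frame; adding it to the
        # high-pass subband reconstructs the second frame of the pair.
        odd = [h + e for h, e in zip(high, even)]
        frames.extend([even, odd])
    return frames


def reconstruct_gop(low: Frame, high_levels: List[List[Frame]]) -> List[Frame]:
    """Rebuild a GOP from one low-pass subband and its high-pass subbands.

    high_levels is ordered from the coarsest temporal level to the finest.
    """
    frames = [low]
    for highs in high_levels:
        frames = inverse_level(frames, highs)
    return frames
```

For a GOP of four frames, `high_levels` holds one high-pass subband for the coarsest level and two for the finest, so the frame count doubles at each pass, as in the description above.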
  • the motion vectors used during an update operation and a predicted frame generation operation are obtained from motion information (a base layer motion vector or an enhancement layer motion vector for each block) obtained by the motion information decoder 950 decoding the encoded motion information.
  • the resulting motion information contains base layer motion vectors and enhancement layer motion vectors.
  • the base layer motion vectors are provided to the motion vector readjuster 960, which then readjusts each base layer motion vector using the enhancement layer motion vectors of neighboring blocks.
  • the motion vector readjuster 960 may readjust the base layer motion vectors using a predicted frame produced during inverse motion compensation as a reference.
  • the enhancement layer motion vectors and the readjusted base layer motion vectors are provided to the inverse motion compensator 940 for use in an update operation and a predicted frame generation operation.
  • FIG. 10 is a block diagram of a video decoder 1000 according to a second exemplary embodiment of the present invention.
  • the video decoder 1000 includes a bitstream interpreter 1010, an inverse quantizer 1020, an inverse spatial transformer 1030, an inverse motion compensator 1040, a motion information decoder 1050, and a motion vector merger 1070.
  • the bitstream interpreter 1010 obtains texture information and encoded motion information from an input bitstream.
  • the texture information containing image data of encoded video frames is provided to the inverse quantizer 1020 while the encoded motion information containing motion vectors is provided to the motion information decoder 1050.
  • the inverse quantizer 1020 inversely quantizes the texture information to obtain transform coefficients that are then sent to the inverse spatial transformer 1030.
  • the inverse spatial transformer 1030 performs inverse spatial transform on the transform coefficients to obtain a single low-pass subband and a plurality of high-pass subbands for each GOP.
  • the inverse motion compensator 1040 receives the low-pass subband and the plurality of high-pass subbands for each GOP to reconstruct video frames.
  • the motion information decoder 1050 decodes encoded motion information to obtain motion information.
  • the motion information contains base layer motion vectors for some blocks and base layer motion vectors and residual motion vectors for the remaining blocks.
  • the base layer motion vectors and residual motion vectors for the remaining blocks are sent to the motion vector merger 1070.
  • the motion vector merger 1070 merges the base layer motion vector with the residual motion vector to obtain an enhancement layer motion vector that is then provided to the inverse motion compensator 1040 for use in an update operation and a predicted frame generation operation.
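The merging step amounts to a per-block vector addition. In the sketch below, a block whose residual motion vector was truncated by the predecoder simply keeps its base layer motion vector; handling both cases in one function, the dictionary layout, and the name `merge_motion_vectors` are assumptions made for illustration.

```python
from typing import Dict, Tuple

Vector = Tuple[int, int]


def merge_motion_vectors(base_mvs: Dict[int, Vector],
                         residual_mvs: Dict[int, Vector]) -> Dict[int, Vector]:
    """Return the motion vector used for each block during decoding.

    Blocks present in 'residual_mvs' get base + residual (the enhancement
    layer vector); all other blocks keep their base layer vector.
    """
    merged: Dict[int, Vector] = {}
    for block, (bx, by) in base_mvs.items():
        residual = residual_mvs.get(block)
        if residual is None:
            merged[block] = (bx, by)  # residual was truncated for this block
        else:
            merged[block] = (bx + residual[0], by + residual[1])
    return merged
```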
  • FIG. 11 is a block diagram of a video decoder according to a third exemplary embodiment of the present invention.
  • the video decoder 1100 includes a bitstream interpreter 1110, an inverse quantizer
  • the video decoder 1100 further includes the motion vector readjuster 1160.
  • the motion vector readjuster 1160 readjusts the base layer motion vector using merged motion vectors for neighboring blocks.
  • the motion vector readjuster 1160 may readjust the base layer motion vectors using a predicted frame obtained during inverse motion compensation as a reference.
  • the merged motion vectors and the readjusted motion vectors are provided to the inverse motion compensator 1140 for use in an update operation and a predicted frame generation operation.
  • FIG. 12 illustrates a video service environment according to an exemplary embodiment of the present invention.
  • a video encoder 1210 encodes video frames into a bitstream using scalable video coding.
  • the structure of a bitstream generated according to the exemplary embodiments of the present invention will be described later with reference to FIG. 13.
  • a predecoder 1220 truncates a part of the bitstream (predecoding) according to the bandwidth available on a network 1230. For example, when the bandwidth of the network 1230 is sufficient and a user requests high quality video, the predecoder 1220 truncates few or no bits from the bitstream. On the other hand, when the available bandwidth is not sufficient, the predecoder 1220 truncates a large number of bits from the bitstream.
  • a video decoder 1240 receives the predecoded bitstream through the network 1230 to reconstruct video frames.
  • FIG. 13 illustrates the structure of a bitstream according to an exemplary embodiment of the present invention.
  • the bitstream is composed of a header 1310, a motion vector field 1320, and a texture information field 1330.
  • the header 1310 may contain a sequence header, a GOP header, a frame header, and a slice header specifying information necessary for a sequence, a GOP, a frame, and a slice, respectively.
  • the motion vector field 1320 includes an order information field 1321, a base layer motion vector field 1322, and an enhancement layer motion vector field 1323.
  • the order information field 1321 contains information about the order of blocks in which the degree of video quality improvement decreases. For example, when enhancement layer motion vectors are used for blocks 1 through 6 and the degree of visual quality improvement decreases in the order of blocks 1, 4, 2, 3, 5, and 6, the order information specifies the order as 1, 4, 2, 3, 5, 6. Thus, the enhancement layer motion vectors are truncated during predecoding in the order of blocks (6, 5, 3, 2, 4, 1) in which the degree of visual quality improvement increases.
  • the base layer motion vector field 1322 contains information about motion vectors obtained when a small number of bits are allocated to a motion vector.
  • the enhancement layer motion vector field 1323 contains information about motion vectors obtained when a large number of bits are allocated to a motion vector.
  • a predecoder selectively truncates a base layer motion vector or an enhancement layer motion vector for a particular block. That is, on the one hand, when the enhancement layer motion vector is determined for the block, the predecoder truncates the base layer motion vector in a bitstream. On the other hand, when the base layer motion vector is determined for the block, the predecoder truncates the enhancement layer motion vector in the bitstream.
  • the motion vector field 1320 may include the base layer motion vector field 1322 and a residual motion vector field.
  • the predecoder truncates a residual motion vector in the bitstream.
  • the predecoder does not truncate the base layer motion vector. That is, a video decoder uses the base layer motion vector and the residual motion vector for the block to reconstruct an enhancement layer motion vector for inverse motion compensation.
  • the texture information field 1330 contains a Y Component field 1331 specifying texture information of the Y component, a U Component field 1332 specifying texture information of the U component, and a V Component field 1333 specifying texture information of the V component.
  • FIG. 14 is a graph illustrating changes in video quality when an enhancement layer motion vector and a base layer motion vector are used.
  • at bit rates above a reference point, the quality of video reconstructed by a decoder when an enhancement layer motion vector is used is higher than that when a base layer motion vector is used.
  • at bit rates below the reference point, the quality of reconstructed video when the base layer motion vector is used is higher than that when the enhancement layer motion vector is used.
  • upon receiving a request for a bitstream having a bit rate higher than the reference point, the predecoder provides all enhancement layer motion vectors while truncating unnecessary bits of the texture. On the other hand, upon receiving a request for a bitstream having a bit rate lower than the reference point, the predecoder truncates bits of the texture as well as a part or all of the enhancement layer motion vectors.
  • the reference point can be experimentally obtained from various video sequences.
  • the predecoder may truncate all motion vectors (base layer motion vectors and enhancement layer motion vectors).
  • video coding providing motion scalability can be achieved by the video coding and decoding methods and video encoder and decoder according to the present invention.
  • the video coding and decoding methods according to the present invention provide improved visual quality by minimizing the number of bits contained in motion information at a very low bit rate.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to video coding and decoding methods providing motion scalability, and to a video encoder and decoder. The video coding method includes: estimating a base layer motion vector and an enhancement layer motion vector for each block of a video frame; removing temporal redundancies from the video frame using the enhancement layer motion vectors; spatially transforming the video frame from which the temporal redundancies have been removed and quantizing the spatially transformed video frame to obtain texture information; selecting, for each block, one of the estimated base layer motion vector and the estimated enhancement layer motion vector; and generating a bitstream containing the motion vector selected for each block and the texture information.
PCT/KR2005/002187 2004-07-15 2005-07-07 Procede de codage et decodage de video et codeur et decodeur de video WO2006006793A1 (fr)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US58790504P 2004-07-15 2004-07-15
US60/587,905 2004-07-15
KR10-2004-0063198 2004-08-11
KR20040063198 2004-08-11
KR10-2004-0118021 2004-12-31
KR1020040118021A KR100678949B1 (ko) 2004-07-15 2004-12-31 비디오 코딩 및 디코딩 방법, 비디오 인코더 및 디코더

Publications (1)

Publication Number Publication Date
WO2006006793A1 true WO2006006793A1 (fr) 2006-01-19

Family

ID=35784109

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2005/002187 WO2006006793A1 (fr) 2004-07-15 2005-07-07 Procede de codage et decodage de video et codeur et decodeur de video

Country Status (1)

Country Link
WO (1) WO2006006793A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6339618B1 (en) * 1997-01-08 2002-01-15 At&T Corp. Mesh node motion coding to enable object based functionalities within a motion compensated transform video coder
US6501797B1 (en) * 1999-07-06 2002-12-31 Koninklijke Phillips Electronics N.V. System and method for improved fine granular scalable video using base layer coding information
US6510177B1 (en) * 2000-03-24 2003-01-21 Microsoft Corporation System and method for layered video coding enhancement

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016535465A (ja) * 2013-04-05 2016-11-10 ヴィド スケール インコーポレイテッド 多重レイヤビデオコーディングに対するインターレイヤ基準画像エンハンスメント
US10708605B2 (en) 2013-04-05 2020-07-07 Vid Scale, Inc. Inter-layer reference picture enhancement for multiple layer video coding
CN104469369A (zh) * 2014-11-17 2015-03-25 何震宇 一种利用解码端信息提高svc性能的方法
CN104469369B (zh) * 2014-11-17 2017-10-31 何震宇 一种利用解码端信息提高svc性能的方法

Similar Documents

Publication Publication Date Title
US20060013309A1 (en) Video encoding and decoding methods and video encoder and decoder
US8031776B2 (en) Method and apparatus for predecoding and decoding bitstream including base layer
US7839929B2 (en) Method and apparatus for predecoding hybrid bitstream
US8817872B2 (en) Method and apparatus for encoding/decoding multi-layer video using weighted prediction
US20060008006A1 (en) Video encoding and decoding methods and video encoder and decoder
JP4891234B2 (ja) グリッド動き推定/補償を用いたスケーラブルビデオ符号化
US7944975B2 (en) Inter-frame prediction method in video coding, video encoder, video decoding method, and video decoder
US20060013313A1 (en) Scalable video coding method and apparatus using base-layer
US20060013310A1 (en) Temporal decomposition and inverse temporal decomposition methods for video encoding and decoding and video encoder and decoder
US20050226334A1 (en) Method and apparatus for implementing motion scalability
US20050226335A1 (en) Method and apparatus for supporting motion scalability
US20050195897A1 (en) Scalable video coding method supporting variable GOP size and scalable video encoder
WO2006004331A1 (fr) Procedes de codage et de decodage video, codeur et decodeur video
US8340181B2 (en) Video coding and decoding methods with hierarchical temporal filtering structure, and apparatus for the same
US20050163217A1 (en) Method and apparatus for coding and decoding video bitstream
WO2006004305A1 (fr) Procede et appareil permettant de mettre en oeuvre l'extensibilite de mouvement
EP1878252A1 (fr) Procede et appareil destine a coder/decoder une video a couches multiples en utilisant une prediction ponderee
WO2006006793A1 (fr) Procede de codage et decodage de video et codeur et decodeur de video
EP1766986A1 (fr) Procedes de decomposition temporelle et de decomposition temporelle inverse pour le codage et le decodage video et codeur et decodeur video
EP1813114A1 (fr) Procede et appareil de precodage de trains de bits hybride

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Country of ref document: DE

122 Ep: pct application non-entry in european phase