WO2006109989A1 - Video coding method and apparatus for reducing mismatch between an encoder and a decoder - Google Patents

Video coding method and apparatus for reducing mismatch between an encoder and a decoder

Info

Publication number
WO2006109989A1
WO2006109989A1, PCT/KR2006/001342
Authority
WO
WIPO (PCT)
Prior art keywords
frame
frequency
frequency frame
frames
low
Prior art date
Application number
PCT/KR2006/001342
Other languages
English (en)
Inventor
Woo-Jin Han
Bae-Keun Lee
Original Assignee
Samsung Electronics Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from KR1020050052425A external-priority patent/KR100703772B1/ko
Application filed by Samsung Electronics Co., Ltd.
Publication of WO2006109989A1

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H04N19/615 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding using motion compensated temporal filtering [MCTF]
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 - Motion estimation or motion compensation
    • H04N19/53 - Multi-resolution motion estimation; Hierarchical motion estimation
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/63 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using sub-band based transform, e.g. wavelets
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/13 - Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation

Definitions

  • Methods and apparatuses consistent with the present invention relate generally to video coding, and more particularly, to reducing a mismatch between an encoder and a decoder in motion compensated temporal filtering.
  • Data can be compressed by eliminating spatial redundancy such as the repetition of a color or object in an image, temporal redundancy such as the case where there is little change between neighboring frames or a sound is repeated, or psychovisual redundancy which takes into account human visual and perceptual insensitivity to high frequencies.
  • temporal redundancy is eliminated using temporal filtering based on motion compensation
  • spatial redundancy is eliminated using a spatial transform.
  • In order to transmit such multimedia data, transmission media are necessary, and performance differs according to the transmission medium.
  • Currently used transmission media have various transmission speeds ranging from the speed of an ultra high-speed communication network, which can transmit data at a transmission rate of several tens of megabits per second, to the speed of a mobile communication network, which can transmit data at a transmission rate of 384 Kbits per second.
  • a scalable video encoding method is required that can support transmission media having a variety of speeds and that can transmit multimedia at a transmission speed suitable for each transmission environment.
  • Such a scalable video coding method refers to a coding method that allows a video resolution, a frame rate, a Signal-to-Noise Ratio (SNR), and other parameters to be adjusted by truncating part of an already compressed bitstream in conformity with surrounding conditions, such as the transmission bit rate, transmission error rate, system source, and others.
  • Motion Compensated Temporal Filtering (MCTF) is widely used in scalable video coding methods that support temporal scalability, such as the H.264 Scalable Extension (SE).
  • 5/3 MCTF, which uses both neighboring right-hand and left-hand frames, has not only a high compression efficiency, but also a structure suitable for temporal scalability and SNR scalability. Therefore, 5/3 MCTF has been adopted in the working draft of H.264 SE that is being standardized by the Moving Picture Experts Group (MPEG).
  • FIG. 1 shows the structure of 5/3 MCTF, which sequentially performs a prediction step and an update step on one Group of Pictures (GOP).
  • a prediction step and an update step are repeatedly performed in temporal level order.
  • a frame generated by the prediction step is referred to as a high-frequency frame (indicated by 'H') and a frame generated by the update step is referred to as a low-frequency frame (indicated by 'L').
  • the prediction step and the update step are repeated until one low-frequency frame L(4) is produced.
  • FIG. 2 is a view generally showing a prediction step and an update step.
  • subscripts 't' and 't+1' indicate a temporal level t and a temporal level t+1, respectively, and the '-1', '0' and '1' in parentheses indicate the temporal order.
  • the numbers on each arrow indicate the weight ratio of each frame in the prediction step or the update step.
  • a high-frequency frame H(0) is acquired using a difference between a current frame L(0) and a frame that is predicted from the neighboring right-hand and left-hand reference frames L(-1) and L(1).
  • the neighboring right-hand and left-hand reference frames L(-1) and L(1) that were used in the previous prediction step are changed using the frame H(0) generated in the prediction step. This is a process of eliminating high-frequency components, that is, the frame H(0), from a reference frame, and it is similar to a type of low-frequency filtering. Since these changed frames L(-1) and L(1) are free of high-frequency components, efficiency can be improved at the time of compression.
  • the respective frames of a GOP are arranged on a temporal level basis, and one H frame (a high-frequency frame) is produced by performing the prediction step at each temporal level, and the two reference frames used in the prediction step are changed using the H frame (the update step). If this process is performed on N frames at one temporal level, N/2 H frames and N/2 L frames (low-frequency frames) can be obtained. As a result, if this process is performed until only a final L frame (a low-frequency frame) remains, M-1 H frames and one L frame remain when the number of frames of one GOP is set to M. Thereafter, the encoding process can be finished by quantizing these frames.
  • the update step functions to eliminate the high-frequency components of the right-hand and left-hand reference frames using the difference image obtained in the prediction step, that is, the H frame. As shown in FIG. 2, L(-1) and L(1) are changed, during the update step, into frames from which the high-frequency components have been eliminated.
  • MCTF has a codec configuration having an open-loop structure and adopts the update step in order to reduce the drifting error.
  • the open-loop structure refers to a structure employing right-hand and left-hand reference frames that are not quantized in order to obtain a difference image (high-frequency frame).
  • An existing video codec generally uses a process (including a quantization process) that encodes and restores a preceding reference frame and then uses the results thereof, that is, a closed-loop structure.
  • the MCTF-based codec mainly has two types of mismatches between the encoder and the decoder.
  • the first is a mismatch in the prediction step.
  • the right-hand and left-hand reference frames are used to produce the H frame.
  • from the decoder's standpoint, the H frame produced in this manner is not an optimal signal, because the right-hand and left-hand reference frames can be quantized only after they have been changed by the update step and converted into frames of the next temporal level. It is difficult in the MCTF structure to take the prior quantization of the reference frames into account, as is done in the closed-loop process.
  • the high-frequency frame H(0) is used to change the right-hand and left-hand reference frames L(-1) and L(1). Since the high-frequency frame has not yet been quantized, the mismatch between the decoder and the encoder occurs even in this case.
  • the present invention proposes, after the completion of MCTF, a process of recalculating the H frame (hereinafter referred to as 'frame re-estimation') including a coding/decoding process in order to solve the mismatch in the prediction step.
  • the present invention also proposes a method that is capable of reducing a mismatch between the encoder and the decoder by performing an update step using re-estimated images neighboring an encoded/decoded difference image, instead of the H frame (that is, an original difference image), hereinafter referred to as a 'closed-loop update', in order to solve the mismatch in the update step.
  • Illustrative, non-limiting embodiments of the present invention overcome the above disadvantages and other disadvantages not described above. Also, the present invention is not required to overcome the disadvantages described above, and an illustrative, non-limiting embodiment of the present invention may not overcome any of the problems described above.
  • the present invention provides a method and apparatus for improving overall video compression efficiency by reducing drift error between an encoder and a decoder in an MCTF-based video codec.
  • the present invention also provides a method and apparatus for efficiently re-estimating a high-frequency frame in an MCTF-based video codec.
  • the present invention also provides a method and apparatus for efficiently performing an update step at a current layer using the lower layer information in an MCTF-based multi-layered video codec.
  • a video encoding method including dividing input frames into one final low-frequency frame and one or more high-frequency frames by performing motion compensated temporal filtering on the input frames; encoding the final low-frequency frame and then decoding the encoded final low-frequency frame; re-estimating the high-frequency frames using the decoded final low-frequency frame; and encoding the re-estimated high-frequency frames.
  • a video encoding method including dividing input frames into one final low-frequency frame and one or more high-frequency frames by performing motion compensated temporal filtering on the input frames; and encoding the final low-frequency frame and the high-frequency frames; wherein the dividing input frames includes generating the high-frequency frames from a low-frequency frame of a current layer; generating a virtual high-frequency frame using a restored frame of a lower layer; and updating the low-frequency frame using the virtual high-frequency frame.
  • a video encoding method including dividing input frames into one final low-frequency frame and one or more high-frequency frames by performing motion compensated temporal filtering on the input frames; and encoding the final low-frequency frame and the high-frequency frames; wherein the dividing the input frames includes generating the high-frequency frames from a low-frequency frame of a current layer; generating a virtual high-frequency frame using a restored frame of a lower layer; and updating the low-frequency frame using a weighted mean of the high-frequency frames and the virtual high-frequency frame.
  • a video decoding method including restoring a final low-frequency frame and one or more high-frequency frames by decoding texture data included in an input bitstream; and performing inverse-motion compensated temporal filtering on the final low-frequency frame and the high-frequency frames using motion data included in the input bitstream; wherein the high-frequency frames are high-frequency frames re-estimated in an encoder.
  • a video decoding method including restoring a final low-frequency frame and one or more high-frequency frames of a current layer by decoding texture data included in an input bitstream; and performing inverse-motion compensated temporal filtering on the final low-frequency frame and the high-frequency frames using motion data included in the input bitstream; wherein the performing inverse-motion compensated temporal filtering includes generating a virtual high-frequency frame using a restored frame of a lower layer; inversely updating a first low-frequency frame using the virtual high-frequency frame; and restoring a second low-frequency frame by inversely predicting the restored high-frequency frame with reference to the updated first low-frequency frame.
  • FIG. 1 is a diagram showing the structure of 5/3 MCTF, which sequentially performs a prediction step and an update step on one GOP;
  • FIG. 2 is a view generally showing the prediction step and the update step;
  • FIG. 3 is a view illustrating the 5/3 MCTF process;
  • FIG. 4 is a view illustrating a closed-loop frame re-estimation process based on mode 0;
  • FIG. 5 is a view illustrating a decoding process based on mode 0;
  • FIG. 6 is a view illustrating an MCTF process based on mode 1;
  • FIG. 7 is a view illustrating a closed-loop frame re-estimation process based on mode 1;
  • FIG. 8 is a view illustrating a decoding process based on mode 1;
  • FIG. 9 is a view illustrating an MCTF process based on mode 2;
  • FIG. 10 is a view illustrating a decoding process based on mode 2;
  • FIG. 11 is a block diagram of a video encoder based on mode 0 according to an exemplary embodiment of the present invention;
  • FIG. 12 is a block diagram of a video encoder based on mode 2 according to an exemplary embodiment of the present invention;
  • FIG. 13 is a block diagram of a video decoder based on mode 0 according to an exemplary embodiment of the present invention;
  • FIG. 14 is a block diagram of a video decoder based on mode 2 according to an exemplary embodiment of the present invention; and
  • FIG. 15 is a diagram illustrating a system for performing the operation of a video encoder or a video decoder according to an exemplary embodiment of the present invention.
  • a 'closed-loop frame re-estimation method' proposed by the present invention is performed using the following processes.
  • First, when the size of a GOP is M, M-1 H frames and one L frame are obtained after the existing MCTF has been performed.
  • Second, the environment is made to conform to that of the decoder by encoding/decoding the right-hand and left-hand reference frames while performing MCTF in an inverse manner.
  • Third, a high-frequency frame is recalculated using the encoded/decoded reference frames.
  • a method of implementing a 'closed-loop update', which is proposed by the present invention, includes the following three modes.
  • Mode 1 involves reducing a mismatch by omitting the update step for the final L frame.
  • Mode 2 involves replacing the H frame used in the update step with the information of a base layer.
  • Mode 3 involves reducing a mismatch using the weighted mean of an existing H frame and the information obtained in mode 2.
  • the closed-loop frame re-estimation technique and the closed-loop update technique according to the present invention can be applied together or independently.
  • FIG. 3 is a view illustrating the 5/3 MCTF process. As shown in FIG. 3, if MCTF is performed according to an existing method, one L frame and a plurality of H frames are obtained. A general MCTF process is performed based on a lifting scheme.
  • the lifting scheme includes a prediction step and an update step.
  • the lifting scheme divides the input frames into frames that will undergo low-frequency filtering (hereinafter referred to as 'frames at L locations') and frames that will undergo high-frequency filtering (hereinafter referred to as 'frames at H locations').
  • Equation 1 expresses the prediction step and the update step in mathematical form:
        H_{t+1}(k) = L_t(2k-1) - P(L_t(2k-1))
        L_{t+1}(k) = L_t(2k) + U(H_{t+1}(k))          ... (Equation 1)
  • here, H_{t+1} indicates the H frames generated at a temporal level t+1, L_t indicates the L frames generated at a temporal level t, and the constant within the parentheses is an index indicating the order of frames. P and U are the prediction and update operators, whose weights are constant coefficients.
  • if a Haar filter is used in the MCTF process, P(L_t(2k-1)) and U(H_{t+1}(k)) of Equation 1 can be expressed by Equation 2, and if a 5/3 filter is used, by Equation 3.
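  • As an illustration only (this is not the patent's implementation), the following Python sketch applies one temporal level of this lifting to a list of frames. It assumes the commonly used 5/3 weights (1/2 for prediction, 1/4 for update) and omits motion compensation entirely; in a real codec the reference frames would first be warped by the estimated motion vectors.

        import numpy as np

        def mctf_level_5_3(frames):
            # One temporal level of 5/3 lifting over a list of equal-sized numpy arrays.
            # Odd-position frames become H frames (prediction step); even-position
            # frames become the next level's L frames (update step).
            n = len(frames)
            high = []
            for k in range(1, n, 2):                    # prediction step
                left = frames[k - 1]
                right = frames[k + 1] if k + 1 < n else frames[k - 1]
                high.append(frames[k] - 0.5 * (left + right))
            low = []
            for k in range(0, n, 2):                    # update step
                h_left = high[k // 2 - 1] if k >= 2 else 0.0
                h_right = high[k // 2] if k // 2 < len(high) else 0.0
                low.append(frames[k] + 0.25 * (h_left + h_right))
            return low, high

  • Repeating this sketch on the returned low-frequency frames until a single L frame remains yields the M-1 H frames and one final L frame described above.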
  • the L frame L (1) is encoded and decoded.
  • the coding process may include a transform process and a quantization process.
  • the decoding process can include an inverse quantization process and an inverse transform process.
  • the decoded L frame L (I) can be represented as L '(1).
  • the encoding process and the decoding process may be collectively referred to as a 'restoration process'.
  • the inverse update step is the inverse of the update step.
  • the inverse update step can be expressed mathematically by the following Equation 4, which is a modification of the second equation of Equation 1.
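  • Rearranging the second line of Equation 1 gives the presumable form of Equation 4:
        L_t(2k) = L_{t+1}(k) - U(H_{t+1}(k))          ... (Equation 4, inferred)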
  • in Equation 4, L_{t+1}(k) corresponds to L'(1) of FIG. 4 and L_t(2k) corresponds to L'(2).
  • in order to obtain L_t(2k), H_{t+1}(k) must be known. Since the frames other than the one L frame have not yet undergone the restoration process, the original H frame, not a restored H frame, is used as H_{t+1}(k).
  • H(1) is one of the H frames generated in the MCTF process of FIG. 3.
  • P(L_t(2k-1)) (that is, a predicted frame) can be obtained using L_t(2k), as in Equation 5.
  • H_{t+1}(k) can be re-estimated by subtracting P(L_t(2k-1)) from the original frame L_t(2k-1).
  • the re-estimated H_{t+1}(k) is represented as R_{t+1}(k).
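  • From the description above, Equation 5 presumably takes the form
        R_{t+1}(k) = L_t(2k-1) - P(L_t(2k-1))          ... (Equation 5, inferred)
    where the predicted frame P(L_t(2k-1)) is now built from the restored, inversely updated reference frames rather than from the original ones.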
  • according to Equation 5, a frame R(1), which is obtained by re-estimating H(1), can be generated from L(2) and L'(2).
  • R(1) becomes R'(1) through the restoration process, that is, the closed-loop process, so that it can be used to restore other frames.
  • P(L_t(2k-1)) corresponds here to P(L(1)), and is a predicted frame that is obtained from L'(2) (and any L frame of a previous GOP). If L'(1) is obtained, L'(2) can be restored by applying H(1) and L'(1) to the inverse update process, as shown in Equation 4. In a similar way, L'(4) can be restored by applying L'(2) and H(2) to Equation 4.
  • R(1) is a re-estimated frame for H(1), and R(2) is a re-estimated frame for H(2). The re-estimated H frames (hereinafter referred to as 'R frames'), such as R(1) and R(2), are obtained in this manner for all of the high-frequency frames.
  • the R frames and the final L frame are encoded and then transmitted to the decoder.
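  • A minimal numerical sketch of this closed-loop chain for the topmost temporal level is given below. It is an illustration only: the helper quantize_roundtrip is a hypothetical stand-in for the transform/quantization/inverse chain, a simple Haar-like pairing is assumed (the prediction is just the restored even-position frame), and motion compensation is omitted.

        import numpy as np

        def quantize_roundtrip(frame, step=8.0):
            # Hypothetical stand-in for the transform -> quantization -> inverse chain.
            return np.round(frame / step) * step

        def reestimate_top_level(l_final, h_orig, l_odd_orig):
            # l_final:    final L frame produced by MCTF
            # h_orig:     original (un-restored) H frame of the same temporal level
            # l_odd_orig: original odd-position frame that produced h_orig
            l_final_rst = quantize_roundtrip(l_final)      # encode, then decode, the final L frame
            l_even_rst = l_final_rst - 0.5 * h_orig        # inverse update with the original H frame
            predicted = l_even_rst                         # prediction from the restored reference
            r = l_odd_orig - predicted                     # re-estimated H frame (the R frame)
            r_rst = quantize_roundtrip(r)                  # restore R through the same coding chain
            l_odd_rst = r_rst + predicted                  # inverse prediction, as the decoder will do
            return r_rst, l_odd_rst                        # l_odd_rst is then reused for lower levels

  • Because the R frame is computed against a reference that the decoder can reproduce exactly, the prediction-step mismatch described above is removed for this frame.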
  • the decoder restores video frames using the transmitted frames.
  • Such a restoration process is shown in FIG. 5.
  • the original input frames of a temporal level 0 can be restored by a process of repeatedly performing the inverse update step and the inverse prediction step, that is, the inverse MCTF process.
  • the inverse MCTF process can be performed in the same manner as the conventional inverse MCTF process, but it is different from the conventional inverse MCTF process in that the R frames are used instead of the H frames in the inverse prediction step.
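  • For illustration, a sketch of this mode-0 decoder loop (using the same hypothetical Haar-like pairing and omitting motion compensation, as in the encoder sketch above) is:

        def inverse_mctf_mode0(l_final_rst, r_by_level):
            # l_final_rst: decoded final L frame
            # r_by_level:  decoded R frames, one list per temporal level,
            #              highest temporal level first (hypothetical layout)
            lows = [l_final_rst]
            for r_level in r_by_level:
                next_lows = []
                for low, r in zip(lows, r_level):
                    even = low - 0.5 * r       # inverse update, using R' in place of H
                    odd = r + even             # inverse prediction with the R frame
                    next_lows.extend([odd, even])
                lows = next_lows
            return lows                        # frames of temporal level 0, in display order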
  • although 'mode 0' may be effective in eliminating the mismatch between the encoder and the decoder that is caused by the prediction step, the mismatch between the encoder and the decoder that is caused by the update step still exists. Therefore, in the following 'closed-loop update step', a method of eliminating the mismatch between the encoder and the decoder by constructing the update step so that it uses the closed-loop process will be described.
  • the present invention proposes a method of obviating a mismatch in all the frames (mode 1 to mode 3) in such a way as to use the closed-loop frame re-estimation method, but to omit the update step for a frame that is located at an L frame position.
  • the MCTF process and the closed-loop frame re-estimation process in accordance with mode 1 are illustrated in FIGS. 6 and 7.
  • the process of generating L(2) by applying the update step to L(4) in update step 1 of FIG. 3, and the process of generating L(1) by applying the update step to L(2) in update step 2 of FIG. 3, are omitted.
  • the inverse update process of generating L'(2) based on H(1) and L'(1) and the inverse update process of generating L'(4) based on H(2) and L'(2) in FIG. 4 are omitted.
  • Mode 2 is a method of performing the update step using the H frame of a lower layer instead of the H frame obtained in a current layer if there is no significant difference between the quality of the lower layer and the quality of the current layer.
  • FIG. 9 is a view illustrating mode 2. All of the frames of the current layer (1) are marked with a superscript '1' and all of the frames of the lower layer (0) are marked with a superscript '0'. In FIG. 9, the frame rates of the current layer and the lower layer are assumed to be the same; however, when the frame rate of the lower layer is lower than that of the current layer, mode 2 can be applied between corresponding H frames in the same way.
  • the core of mode 2 resides in using virtual H frames (hereinafter referred to as 'S frames'), which are generated using corresponding lower layer information, and not un-quantized H frames, when performing the update step.
  • the S frames S^1(1), S^1(2) and S^1(1) are used only in the update step, and the H frames H^1(1), H^1(2) and H^1(1) are used without change in the prediction step.
  • the S frames will be used in the inverse update step and the H frames will be used in the inverse prediction step.
  • information that may be used to generate the S frame includes L^0' (the restored L frame of the lower layer), L^1 (the L frame of the current layer that has not undergone the restoration process of the current layer), P^0' (a predicted frame of the lower layer that is generated from a restored L frame) and P^1 (the predicted frame of the current layer).
  • mode 2 attempts to use lower layer information that can provide restored frames at the time of updating a current layer. It should be noted that, in mode 2, all the frames fetched from the lower layer are restored frames. In addition, if the resolution of a lower layer is different from the resolution of a current layer at the time of using the frame of the lower layer, the frame of the lower layer must be properly up-sampled.
  • Mode 2 may also be classified into three detailed modes.
  • in the first detailed mode (mode 2-1), S frames are obtained from L^0' - P^0'.
  • an S frame S^1(2) used to update L^1(4) is obtained by subtracting the predicted frame P^0' of the lower layer, which is calculated from L^0'(2) and L^0'(4), from L^0'(3). This is the same as H^0'(2), that is, a result obtained by restoring the H frame of the lower layer.
  • Mode 2-1 is advantageous in that a mismatch does not occur between the encoder and the decoder because already restored frames are used, and additional calculation is not required because the S frame itself is a result obtained by restoring the H frame of the lower layer.
  • in the second detailed mode (mode 2-2), S frames are obtained from L^1 - P^0'.
  • the S frame S^1(2) used to update L^1(4) is obtained by subtracting the predicted frame P^0', which is calculated from the restored frames L^0'(2) and L^0'(4) of the lower layer, from the frame L^1(3) of the current layer.
  • in this case, the mismatch is somewhat reduced compared to existing MCTF, but the predicted frame P^0' is generated using the motion vector of the lower layer, not the motion vector of the current layer. Therefore, there are cases where efficiency is somewhat lowered.
  • in the third detailed mode (mode 2-3), S frames are obtained from L^0' - P^1.
  • the S frame S^1(2) used to update L^1(4) in FIG. 9 is obtained by subtracting the predicted frame P^1, which is calculated from L^1(2) and L^1(4) of the current layer, from the restored frame L^0'(3) of the lower layer.
  • P^1 is generated using the motion vector of the current layer. The mismatch is reduced, and the amount of calculation is small, compared to existing MCTF.
  • since a restored lower layer frame L^0' is used instead of a current layer frame L^1, the improvement in performance increases further when the lower layer frame is very similar to the current layer frame.
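  • The three variants can be summarized with the following sketch (an illustration only; it assumes the lower-layer frames have already been up-sampled to the current layer's resolution where necessary, and that the predicted frames have been produced elsewhere with the appropriate motion vectors):

        def make_s_frame(mode, l0_rst, p0_rst, l1, p1):
            # l0_rst: restored L frame of the lower layer (L^0')
            # p0_rst: predicted frame from restored lower-layer frames (P^0', lower-layer MVs)
            # l1:     L frame of the current layer, not yet restored (L^1)
            # p1:     predicted frame of the current layer (P^1, current-layer MVs)
            if mode == "2-1":
                return l0_rst - p0_rst   # equals the restored lower-layer H frame
            if mode == "2-2":
                return l1 - p0_rst       # current-layer frame, lower-layer prediction
            if mode == "2-3":
                return l0_rst - p1       # restored lower-layer frame, current-layer prediction
            raise ValueError("unknown mode 2 variant: " + mode)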
  • the difference between the S frame based on mode 2-3 and the H frame used in the inverse update step at the decoder will be described below.
  • the H frame in the decoder can be expressed by the following Equation 7 and the S frame based on mode 2-3 can be expressed by the following Equation 8.
  • from the comparison between Equations 7 and 8, it can be appreciated that the former terms of the two Equations are the same, and that the latter terms of both Equations correspond to differences between restored values (P^1' and L^0' are restored values) and original values, whose magnitudes are relatively low; therefore, the mismatch between the encoder and the decoder can be effectively reduced when mode 2-3 is followed.
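  • A presumable reading of Equations 7 and 8, consistent with the comparison above (and therefore only an inference), is
        H (decoder side) = L^1 - P^1' = (L^1 - P^1) + (P^1 - P^1')          ... (Equation 7, inferred)
        S (mode 2-3)     = L^0' - P^1 = (L^1 - P^1) + (L^0' - L^1)          ... (Equation 8, inferred)
    where the first terms coincide and the second terms are small differences between restored and original values.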
  • the frames of a lower layer (0) are restored through an existing inverse MCTF process.
  • the frames of a current layer (1) are also restored by repeating an inverse update step and an inverse prediction step.
  • the S frame(s) is used at the inverse update step and the H frame(s) is used at the inverse prediction step.
  • the H frame used at the inverse prediction step is the result of performing inverse quantization and inverse transform on the H frame transferred from an encoder.
  • the S frame used in the inverse update step is not a value transferred from an encoder, but is a virtual H frame that is estimated from the restored frame of a lower layer and the restored frame of a current layer.
  • the method of generating the S frame may vary depending on mode 2-1, 2-2 or 2-3, as described above.
  • the decoder can generate the S frame based on a predetermined mode 2-1, 2-2 or 2-3, or it can generate the S frame based on selected mode information transferred from the encoder.
  • Mode 2 is the same as the existing MCTF process except that the H frame used at the update process is changed to a value that can reduce the mismatch between the encoder and the decoder.
  • since mode 2 employs lower layer information, it has a limitation in that it can be used only in a video codec having multiple layers.
  • a method in which mode 0 and mode 2 are mixed and then used may be considered. That is, for the prediction step, the re-estimated H frame, that is, the frame R, is used as in mode 0, and for the update step, the S frame is used as in mode 2. In a similar way, in the decoder stage, the S frame is used at the inverse update step and the frame R is used at the inverse prediction step.
  • mode 3 is a method using the weighted mean of the H frame based on the existing MCTF method and the S frame based on mode 2.
  • a result S_w, which is the weighted mean of the H frame based on the existing MCTF method and the S frame based on mode 2, is used to update the L frame, as in Equation 9 (a weighted mean of the form S_w = a·H + (1-a)·S), where a is a constant having a value between 0 and 1.
  • the existing MCTF method and the methods (mode 0 to mode 3) proposed in the present invention can be selectively used. The selection may be performed on a frame, slice (defined in H.264) or macroblock basis.
  • the criterion for selection may be the selection of a method in which the number of bits of data (including motion data and texture data), which are generated as a result of performing coding according to a plurality of methods serving as the targets of comparison, is small, or the selection of a method in which a rate-distortion (R-D)-based cost function is minimal.
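  • A sketch of such a selection (illustrative only; the Lagrange multiplier value and the candidate set are placeholders) is:

        def choose_coding_method(candidates, lagrange=0.85):
            # candidates: dict mapping a method name to (bits, distortion), measured by
            #             actually coding the frame/slice/macroblock with that method.
            # Returns the method with the smallest R-D cost  J = D + lambda * R.
            return min(candidates, key=lambda m: candidates[m][1] + lagrange * candidates[m][0])

        # Example: choose_coding_method({"mctf": (1200, 34.5), "mode0": (1150, 33.9), "mode2": (1180, 33.1)})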
  • a new flag called 'CLUFlag' is introduced, and a selected mode number (one of 0 to 3) according to the present invention can be recorded as the value of the flag (mode 2 can also be classified into sub-divided modes), or a number (for example, '4') according to the existing MCTF method can be recorded as the value of the flag.
  • when CLUFlag is 0, it indicates coding according to the existing MCTF method.
  • when CLUFlag is 1, it indicates coding based on one of the modes according to the present invention (a mode that has been agreed upon in advance by the encoder and the decoder).
  • CLUFlag can be recorded in a frame header.
  • CLUFlag can be recorded in a slice header.
  • CLUFlag can be recorded and included in the macroblock syntax.
  • FIG. 11 is a block diagram of the video encoder 100 based on mode 0 according to an exemplary embodiment of the present invention.
  • Frames are input to an L frame buffer 117. This is because the input frames can be regarded as L frames (low-frequency frames).
  • the L frames stored in the L frame buffer 117 are provided to a separation unit 111.
  • the separation unit 111 separates the received low-frequency frames into frames at high-frequency frame locations (H locations) and frames at low-frequency frame locations (L locations).
  • the high-frequency frames are located at odd-numbered locations (2i+1) and the low-frequency frames are located at even-numbered locations (2i).
  • 'i' is an index indicating a frame number.
  • the H location frames are transformed into H frames through the prediction step.
  • the L location frames are transformed into low-frequency frames in a next temporal level through the update step.
  • the H location frames are input to a motion estimation unit 115 and a subtracter 118.
  • the motion estimation unit 115 obtains a motion vector (MV) by performing motion estimation on the frames at the H location (hereinafter referred to as a 'current frame') with reference to neighboring frames (frames in the same temporal level located at temporally different locations).
  • MV motion vector
  • the neighboring frames to which reference is made as described above are referred to as 'reference frames'.
  • the displacement that yields the lowest error, found while a predetermined block is moved within a specific search region of the reference frame on a pixel or sub-pixel (e.g., 1/4 pixel) basis, is estimated as the MV.
  • for the motion estimation, a fixed block size can be employed, or a hierarchical method using Hierarchical Variable Size Block Matching (HVSBM) may be employed.
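  • For the fixed-block variant, a full-search sketch (integer-pel only, with the sum of absolute differences as the error measure; an illustration, not the codec's actual search) looks like this:

        import numpy as np

        def full_search_mv(block, ref, top, left, search=8):
            # Returns the (dy, dx) displacement inside a +/-search window that
            # minimizes the sum of absolute differences against the reference frame.
            h, w = block.shape
            best_sad, best_mv = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = top + dy, left + dx
                    if y < 0 or x < 0 or y + h > ref.shape[0] or x + w > ref.shape[1]:
                        continue                                  # candidate falls outside the frame
                    sad = np.abs(block - ref[y:y + h, x:x + w]).sum()
                    if best_sad is None or sad < best_sad:
                        best_sad, best_mv = sad, (dy, dx)
            return best_mv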
  • the MV obtained in the motion estimation unit 115 is provided to a motion compensation unit 112.
  • the motion compensation unit 112 generates a predicted frame for the current frame by performing motion compensation on the reference frame using the obtained MV.
  • the predicted frame can be expressed as P(L_t(2k-1)) of Equation 1.
  • the subtracter 118 generates a high-frequency frame (an H frame) by subtracting the predicted frame from the current frame.
  • the generated high-frequency frame is buffered in the H frame buffer 177.
  • the updating unit 116 generates low-frequency frames by updating the L location frames using the generated high-frequency frame.
  • a frame at a predetermined L location can be updated using two high-frequency frames that are temporally adjacent to each other.
  • the update process can also be performed unidirectionally in the same manner.
  • the update process can be expressed in the second term of Equation 1.
  • the low-frequency frames generated in the updating unit 116 are buffered in the L frame buffer 117.
  • the L frame buffer 117 provides the generated low-frequency frames to the separation unit 111 in order to perform a prediction step and an update step in a next temporal level.
  • the final low-frequency frame L is provided to a transformation unit 120 because a next temporal level does not exist.
  • the transformation unit 120 performs a spatial transform process on the received final low-frequency frame L and generates a transform coefficient.
  • This spatial transform method can include a Discrete Cosine Transform (DCT) method, a wavelet transform method or the like.
  • in the case of the DCT method, the transform coefficient becomes a DCT coefficient; in the case of the wavelet transform method, the transform coefficient becomes a wavelet coefficient.
  • a quantization unit 130 quantizes the transform coefficient.
  • the quantization process is a process of representing the transform coefficient, which is represented by a predetermined real value, by discrete values.
  • the quantization unit 130 can perform a quantization process of dividing the transform coefficient, which is represented by a predetermined real value, by a predetermined quantization step and rounding off the result to an integer value (in the case where scalar quantization is used).
  • the quantization step can be provided from a previously agreed quantization table.
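  • For illustration, scalar quantization and its inverse reduce to the following (the step value is arbitrary here):

        def quantize(coeff, step=8.0):
            # Map a real-valued transform coefficient to an integer level.
            return int(round(coeff / step))

        def dequantize(level, step=8.0):
            # Inverse quantization: recover the representative value for the level.
            return level * step

        # quantize(13.7) == 2 and dequantize(2) == 16.0; the 2.3 discrepancy is the
        # quantization error that is at the root of the encoder/decoder mismatch.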
  • the quantization result obtained by the quantization unit 130 is provided to an entropy encoding unit 140 and an inverse quantization unit 150.
  • the inverse quantization unit 150 inversely quantizes the quantization coefficient with respect to L .
  • the inverse quantization process is a process of restoring a value matching an index, which is generated in the quantization process, from the index using the same quantization table as that used in the quantization process.
  • An inverse transformation unit 160 receives an inverse quantization result and performs an inverse transform process on the received inverse quantization result.
  • the inverse transform process is performed in an inverse manner to the transform process of the transformation unit 120.
  • an inverse DCT transform method, an inverse wavelet transform method and so on can be used for the inverse transform process.
  • the inverse transform result, that is, the restored final low-frequency frame (referred to as Lf), is provided to a re-estimation module 199.
  • a re-estimation module 199 re-estimates the high-frequency frame using the restored final low-frequency frame Lf.
  • An example of the re-estimation process is shown in FlG. 4.
  • the re-estimation module 199 includes an inverse updating unit 170, a frame re-estimation unit 180, and an inverse prediction unit 190.
  • the inverse updating unit 170 performs an inverse update process on Lf using a high-frequency frame that was generated in the MCTF process, buffered in the H frame buffer 177, and belongs to the same temporal level as Lf.
  • the inverse update process may be performed according to Equation 4.
  • an MV of the high-frequency frame, which is obtained in the motion estimation unit 115, is employed.
  • the inverse updating unit 170 updates L'(1) using H(1) and, as a result, generates L'(2).
  • the frame re-estimation unit 180 generates a predicted frame using the inversely updated frame, and re-estimates the high-frequency frame by obtaining a difference between the low-frequency frame (that is, the original low-frequency frame of the high-frequency frame) at the temporal level that is one level lower than that of the high-frequency frame, and the predicted frame. As a result, a re-estimated frame (R) is generated.
  • the high-frequency frame re-estimation process may be performed according to Equation 5.
  • the frame re-estimation unit 180 generates a predicted frame using L'(2) (and the restored low-frequency frame of a previous GOP). To generate the predicted frame, an MV is needed.
  • the MV of the high-frequency frame (H(1)) may be used without change, or an MV may be obtained by performing an additional motion estimation process.
  • the frame re-estimation unit 180 finds a difference between the original low-frequency frame L(1) of the high-frequency frame H(1) and the predicted frame.
  • the re-estimated frame R is restored through the transformation unit 120, the quantization unit 130, the inverse quantization unit 150 and the inverse transformation unit 160.
  • the restored re-estimated frame R' is input to the inverse prediction unit 190.
  • the inverse prediction unit 190 inversely predicts L'(1) using the restored re-estimated frame R'(1) and the updated low-frequency frame L'(2).
  • the inverse prediction process can be performed according to Equation 6.
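  • Given the re-estimation relation of Equation 5, Equation 6 presumably takes the form
        L_t'(2k-1) = R_{t+1}'(k) + P(L_t(2k-1))          ... (Equation 6, inferred)
    with the predicted frame built from the restored reference frames.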
  • the inversely predicted L'(1) is a value that is equal to that at the decoder.
  • R(1) and R(2) are generated by re-estimating the remaining high-frequency frames H(1) and H(2) through the frame re-estimation unit 180. Therefore, since three high-frequency frames exist in the example of FIG. 4, re-estimated frames for the three high-frequency frames are all found. If other high-frequency frames exist, R(1) and R(2) can also be used to re-estimate those high-frequency frames after undergoing the coding/decoding process.
  • the re-estimated frames R include re-estimated frames for all of the high- frequency frames, and undergo the transform process of the transformation unit 120 and the quantization process of the quantization unit 130.
  • the re-estimated frames that have undergone the above processes need not undergo the same process as in R(1) of FIG. 4.
  • the entropy encoding unit 140 receives the quantization coefficient of the final low-frequency frame L , which is generated by the quantization unit 130, and the quantization coefficient of the re-estimated high-frequency frames R, and generates a bitstream by performing lossless encoding on the coefficients.
  • the lossless encoding method may include Huffman coding, arithmetic coding, variable length coding and a variety of other methods.
  • the block diagram based on mode 1 has the same construction as that of FIG. 11 except that, in the case where the low-frequency frame is located at the location of the final low-frequency frame when the updating unit 116 of FIG. 11 performs the update step, the update step on the low-frequency frame is omitted.
  • FIG. 12 is a block diagram illustrating the construction of the video encoder 300 based on mode 2 according to an exemplary embodiment of the present invention. Mode 2 is applied to a multi-layered frame.
  • a video encoder 300 includes a lower layer encoder and a current layer encoder, as shown in FIG. 12.
  • a superscript 0 or 1 is an index that identifies a layer, and superscript 0 indicates a lower layer and superscript 1 indicates a current layer.
  • Frames are input to an L frame buffer 317 and the down sampler 401 of the current layer.
  • the down sampler 401 performs down sampling spatially or temporally.
  • the spatial down sampling process is performed to reduce resolution and the temporal down sampling process is performed to reduce frame rate.
  • the down-sampled frames are input to an L frame buffer 417 of the lower layer.
  • An MCTF module 410 of the lower layer performs the prediction and update steps of a general MCTF process. Descriptions thereof will be omitted to avoid redundancy.
  • At least one high-frequency frame H^0 and a final low-frequency frame L^0 generated in the MCTF module 410 are restored through a transformation unit 420, a quantization unit 430, an inverse quantization unit 450 and an inverse transformation unit 460.
  • the restored high-frequency frame H^0' is provided to an inverse prediction unit 490 and the restored low-frequency frame L^0' is provided to an inverse updating unit 470.
  • the inverse updating unit 470 and the inverse prediction unit 490 restore the low-frequency frames L^0' at each temporal level while repeatedly performing the inverse update and inverse prediction steps.
  • the inverse update and inverse prediction steps are general steps in the inverse MCTF process.
  • a restoration frame buffer 480 buffers the restored low-frequency frames L^0' and the restored high-frequency frames H^0' and provides them to a virtual H frame generator 319.
  • the low-frequency frame L^1 in the L frame buffer 317 is separated into H location and L location frames using the separation unit 311.
  • the MCTF process performed in the MCTF module 310 of the current layer is the same as that performed in the MCTF module 410 of the lower layer, except that the update step is not performed using the high-frequency frame H^1, as in the MCTF module 410 of the lower layer, but is performed using the virtual high-frequency frame S estimated from information of the lower layer.
  • the virtual H frame generator 319 generates a virtual high-frequency frame S using the restored frames L^0' and H^0' of the lower layer and provides the generated virtual high-frequency frame S to an updating unit 316.
  • the method of generating the virtual H frame may include three modes (mode 2-1, mode 2-2 and mode 2-3) as described above.
  • in mode 2-1, the decoded high-frequency frame H^0' is used as the virtual high-frequency frame S without change.
  • that is, the high-frequency frames H^0'(1), H^0'(2) and H^0'(1) of the lower layer serve without change as the virtual high-frequency frames S^1(1), S^1(2) and S^1(1).
  • in mode 2-2, the virtual H frame generator 319 generates a predicted frame from the restored low-frequency frames L^0' of the lower layer, and subtracts the predicted frame from the low-frequency frame L^1 of the current layer, which is provided from the L frame buffer 317, thus generating the virtual high-frequency frame S.
  • an MV generated in a motion estimation unit 415 of the lower layer can be employed.
  • in mode 2-3, the virtual H frame generator 319 generates the virtual high-frequency frame S by subtracting the predicted frame, which is used to generate the H frame of the current layer, from the one of the restored low-frequency frames L^0' of the lower layer that corresponds to the current frame.
  • the current frame refers to L(1) when S(1) is to be generated, L(3) when S(2) is to be generated, and L(1) when S(1) is to be generated, in FIG. 9.
  • the updating unit 316 updates a low-frequency frame at a predetermined temporal level using the generated S frame, and generates a low-frequency frame at the temporal level that is one level higher.
  • Information which is encoded in the current layer and then transmitted to the decoder includes the final low-frequency frame L^1 and the high-frequency frames H^1, but does not include the frame S. This is because the frame S can be estimated/generated in the decoder in the same manner as in the encoder.
  • An entropy encoding unit 340 generates a bitstream by losslessly encoding the quantization coefficient Q^1 with respect to L^1 and H^1, which is generated in the quantization unit 330, the MV MV^1 of the current layer, the quantization coefficient Q^0 with respect to L^0 and H^0, which is generated in the quantization unit 430, and the MV MV^0 of the lower layer.
  • in mode 3, the virtual high-frequency frame S is not directly used in the update step; instead, the weighted mean of the high-frequency frame H and the virtual high-frequency frame S is applied to the update step. Therefore, according to mode 3, a process in which the virtual H frame generator 319 calculates the weighted mean, as in Equation 9, can be further added.
  • FIG. 13 is a block diagram of the construction of the video decoder 500 based on mode 0 according to an exemplary embodiment of the present invention.
  • An entropy decoding unit 510 performs a lossless decoding process and, thereby, extracts texture data and MV data with respect to each frame from a received bitstream.
  • the extracted texture data is provided to an inverse quantization unit 520 and the extracted MV data is provided to an inverse updating unit 540 and an inverse prediction unit 550.
  • the inverse quantization unit 520 performs inverse quantization on the texture data output from the entropy decoding unit 510.
  • the inverse quantization process is a process of restoring a value matching an index, which is generated in the quantization process, from the index using the same quantization table that is used in the quantization process.
  • the inverse transformation unit 530 performs an inverse transform on the inverse quantization result.
  • the inverse transform process is performed by a method corresponding to the transformation unit 120 of the video encoder 100.
  • an inverse DCT transform method, an inverse wavelet transform method or the like can be used.
  • a final low-frequency frame and a re-estimated high-frequency frame are restored.
  • the restored re-estimated high-frequency frame R' is provided to the inverse updating unit 540 and the inverse prediction unit 550.
  • An inverse MCTF module 545 generates a finally restored frame L' by repeatedly performing an inverse update step in the inverse updating unit 540 and an inverse prediction step in the inverse prediction unit 550.
  • the inverse update step and the inverse prediction step are repeated until a frame at temporal level 0, that is, a frame as input to the encoder 100, is restored.
  • the inverse updating unit 540 performs an inverse update on L' using the frame of R' at a temporal level identical to that of L'. At this time, an MV of the frame of the same temporal level is used. Furthermore, the inverse updating unit 540 repeatedly performs the inverse update process using the low-frequency frame received from the inverse prediction unit 550 in the same manner.
  • the inverse prediction unit 550 restores the current low-frequency frame by performing inverse prediction on the re-estimated high-frequency frame R' using the low-frequency frame (a peripheral low-frequency frame) that is inversely updated in the inverse updating unit 540. To this end, the inverse prediction unit 550 generates a predicted frame for a current low-frequency frame by performing motion compensation on the peripheral low-frequency frame using an MV received from the entropy decoding unit 510, and adds the re-estimated high-frequency frame R' and the predicted frame.
  • the inverse prediction step can be performed in a manner reverse to the prediction step, as in Equation 6.
  • the current low-frequency frame generated by the inverse prediction unit 550 can also be provided to the inverse updating unit 540.
  • the inverse prediction unit 550 outputs the restored frame L ' in the case where an input frame of a temporal level 0 is restored as a result of inverse prediction.
  • the block diagram according to mode 1 has the same construction as FIG. 13, except that the inverse update step for a low-frequency frame is omitted in the case where the low-frequency frame is positioned at the location of the final low-frequency frame when the inverse updating unit 540 performs the inverse update step.
  • FIG. 14 is a block diagram illustrating the construction of a video decoder 700 based on mode 2 according to an exemplary embodiment of the present invention. Since mode 2 is applied to multi-layer frames, the video decoder 700 includes a lower layer decoder and a current layer decoder, as shown in FIG. 14. In FIG. 14, a superscript 0 or 1 is an index that distinguishes layers: 0 indicates the lower layer and 1 indicates the current layer.
  • An entropy decoding unit 710 performs lossless decoding, and extracts from an input bitstream texture data and MV data with respect to each frame.
  • the texture data includes the texture data Q^1 of the current layer and the texture data Q^0 of the lower layer.
  • the MV data includes the MV MV^1 of the current layer and the MV MV^0 of the lower layer.
  • An inverse quantization unit 820 performs an inverse quantization on Q^0.
  • An inverse transformation unit 830 performs an inverse transform on the inversely quantized result. As a result, a final low-frequency frame L^0' and at least one or more high-frequency frames H^0' of the lower layer are restored.
  • the restored final low-frequency frame L^0' is provided to an inverse updating unit 840.
  • the restored high-frequency frame H^0' is provided to the inverse updating unit 840 and an inverse prediction unit 850.
  • An inverse MCTF module 845 generates a restored frame L^0' by repeatedly performing the inverse update step in the inverse updating unit 840 and the inverse prediction step in the inverse prediction unit 850.
  • the inverse update step and the inverse prediction step are repeated until the frame of a temporal level 0, that is, the input frame at the encoder 100, is restored.
  • the inverse updating unit 840 performs an inverse update process on L^0' using a frame among the H^0' frames that has the same temporal level as that of L^0'. At this time, an MV of the frame of the same temporal level is used. Furthermore, the inverse updating unit 840 repeatedly performs the inverse update process using a low-frequency frame received from the inverse prediction unit 850 in the same manner.
  • the inverse prediction unit 850 restores a current low-frequency frame by performing an inverse prediction process on the high-frequency frame H^0' using the low-frequency frame (a peripheral low-frequency frame) that is inversely updated in the inverse updating unit 840. To this end, the inverse prediction unit 850 generates a predicted frame for the current low-frequency frame by performing motion compensation on the peripheral low-frequency frame using the MV MV^0 received from the entropy decoding unit 710, and adds the high-frequency frame H^0' and the predicted frame. The current low-frequency frame generated by the inverse prediction unit 850 can be provided to the inverse updating unit 840.
  • the low-frequency frame generated as a result of the inverse update process and the low-frequency frame generated as a result of the inverse prediction process are stored in a frame buffer 860 and then provided to a virtual H frame generator 770.
  • the virtual H frame generator 770 receives the restored frames L^0' and H^0' of the lower layer from the frame buffer 860 and a restored low-frequency frame L^1' of the current layer from the frame buffer 760, generates a virtual high-frequency frame S, and provides the frame S to the inverse updating unit 740.
  • the method of generating a virtual H frame may include three modes (mode 2-1, mode 2-2 and mode 2-3) as described above with reference to FIG. 12. The method is also the same as that of FIG. 12. A description thereof will be omitted in order to avoid redundancy.
  • in mode 3, the virtual high-frequency frame S is not directly used in the update step; instead, a weighted mean of the high-frequency frame H and the virtual high-frequency frame S is applied to the update step. Therefore, according to mode 3, a process of allowing the virtual H frame generator 770 to calculate the weighted mean, as in Equation 9, is further added.
  • FIG. 15 shows the construction of a system for performing the operation of the video encoders 100 and 300 or the video decoders 500 and 700 according to an exemplary embodiment of the present invention.
  • the system may be a set-top box, a desktop computer, a laptop computer, a palmtop computer, a Personal Digital Assistant (PDA), a video or image storage device (for example, a Video Cassette Recorder (VCR) or a Digital Video Recorder (DVR)) or the like.
  • the system may be a combination of the above-described devices, or one of the above-described devices may be included in another device.
  • the system may include at least one video source 910, at least one Input/Output (I/O) device 920, a processor 940, memory 950 and a display device 930.
  • the video source 910 may be a TV receiver, a VCR or some other video storage device. Furthermore, the video source 910 may be at least one network connection for receiving video from a server via the Internet, a Wide Area Network (WAN), a Local Area Network (LAN), a terrestrial broadcasting system, a cable network, a satellite communication network, a wireless network, a telephone network or the like. In addition, the video source can be a combination of the above-described networks, or one of the above-described networks may be included in another network.
  • the I/O device 920, the processor 940 and the memory 950 communicate with each other via a communication medium 960.
  • the communication medium 960 may be a communication bus, a communication network, or at least one internal connection circuit.
  • Input video data received from the video source 910 may be processed by the processor 940 according to one or more software programs stored in the memory 950, so as to generate output video that is provided to the display device 930.
  • the software programs stored in the memory 950 may include a scalable video codec that performs the method according to the present invention.
  • the encoder or the codec may be stored in the memory 950, may be read from a storage medium such as a CD-ROM or a floppy disk, or may be downloaded from a predetermined server via one of various networks.
  • the codec may be software, a hardware circuit, or a combination of software and a hardware circuit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Disclosed are a method of reducing a mismatch between an encoder and a decoder in motion compensated temporal filtering, and a video coding method and apparatus using the same. The video coding method includes dividing input frames into one final low-frequency frame and at least one high-frequency frame by performing motion compensated temporal filtering on the input frames; encoding the final low-frequency frame and decoding the encoded final low-frequency frame; re-estimating the high-frequency frame using the decoded final low-frequency frame; and encoding the re-estimated high-frequency frame.
PCT/KR2006/001342 2005-04-13 2006-04-12 Video coding method and apparatus for reducing mismatch between an encoder and a decoder WO2006109989A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US67070205P 2005-04-13 2005-04-13
US60/670,702 2005-04-13
KR10-2005-0052425 2005-06-17
KR1020050052425A KR100703772B1 (ko) MCTF-based video coding method and apparatus for reducing mismatch between an encoder and a decoder

Publications (1)

Publication Number Publication Date
WO2006109989A1 true WO2006109989A1 (fr) 2006-10-19

Family

ID=37087229

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2006/001342 WO2006109989A1 (fr) 2005-04-13 2006-04-12 Video coding method and apparatus for reducing mismatch between an encoder and a decoder

Country Status (1)

Country Link
WO (1) WO2006109989A1 (fr)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030202598A1 (en) * 2002-04-29 2003-10-30 Koninklijke Philips Electronics N.V. Motion compensated temporal filtering based on multiple reference frames for wavelet based coding
US20040114689A1 (en) * 2002-12-13 2004-06-17 Huipin Zhang Wavelet based multiresolution video representation with spatially scalable motion vectors

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MUNTEANU A. ET AL.: "Control of the distortion variation in video coding systems based on motion compensated temporal filtering", PROC. OF INT. CONF. ON IMAGE PROCESSING (ICIP), vol. 2, no. II, September 2003 (2003-09-01), pages 61 - 64 *
SCHAAR V.D. M. ET AL.: "Unconstrained motion compensated temporal filtering (UMCTF) framework for wavelet video coding", PROC. OF INT. CONF. ON MULTIMEDIA AND EXPO (ICME), vol. 2, no. II, July 2003 (2003-07-01), pages 584 - 584 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

NENP Non-entry into the national phase

Ref country code: RU

122 Ep: pct application non-entry in european phase

Ref document number: 06747350

Country of ref document: EP

Kind code of ref document: A1