WO2008036112A1 - Method and apparatus for multiple pass video coding and decoding - Google Patents

Publication number
WO2008036112A1
WO2008036112A1 PCT/US2007/004110 US2007004110W WO2008036112A1 WO 2008036112 A1 WO2008036112 A1 WO 2008036112A1 US 2007004110 W US2007004110 W US 2007004110W WO 2008036112 A1 WO2008036112 A1 WO 2008036112A1
Authority
WO
WIPO (PCT)
Prior art keywords
predicted image
encoding pass
video
pass
encoding
Prior art date
Application number
PCT/US2007/004110
Other languages
English (en)
French (fr)
Inventor
Beibei Wang
Peng Yin
Original Assignee
Thomson Licensing
Priority date
Filing date
Publication date
Application filed by Thomson Licensing
Priority to EP07750912A (EP2070334A1)
Priority to JP2009529167A (JP5529537B2)
Priority to US12/310,757 (US20100040146A1)
Priority to BRPI0716540-4A (BRPI0716540A2)
Priority to KR1020157009755A (KR20150047639A)
Priority to KR1020147008589A (KR20140059270A)
Publication of WO2008036112A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/583 Motion compensation with overlapping blocks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/43 Hardware specially adapted for motion estimation or compensation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/63 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using sub-band based transform, e.g. wavelets
    • H04N19/635 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using sub-band based transform, e.g. wavelets characterised by filter definition or implementation details
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/86 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/97 Matching pursuit coding

Definitions

  • the present invention relates generally to video encoding and decoding and, more particularly, to a method and apparatus for multiple pass video encoding and decoding.
  • the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 standard (hereinafter the “MPEG4/H.264 standard” or simply the “H.264 standard”) is currently the most powerful and state-of-the-art video coding standard.
  • H.264 standard uses block-based motion-compensation and discrete cosine transform (DCT)-like transform coding. It is well-known that DCT is efficient for video coding and suitable for high-end applications, like broadcast high definition television (HDTV).
  • the DCT algorithm is not as well suited for applications which require very low bit rates, such as a dedicated video cell phone.
  • the DCT transform will introduce blocking artifacts, even with the use of deblocking filters, because very few coefficients can be coded at very low bitrates, and each coefficient tends to have a very coarse quantization step.
  • Matching pursuit is a greedy algorithm to decompose any signal into a linear expansion of waveforms that are selected from a redundant dictionary of functions. These waveforms are selected to best match the signal structures.
  • Suppose we have a 1-D signal $f(t)$, and we want to decompose this signal using basis vectors from an over-complete dictionary set $G$.
  • Individual dictionary functions can be denoted as follows: $g_\gamma(t) \in G$
  • Here, $\gamma$ is an indexing parameter associated with a particular dictionary element.
  • the decomposition begins by choosing $\gamma$ to maximize the absolute value of the inner product as follows: $p = \langle f(t), g_\gamma(t) \rangle$
  • A residual signal is then computed as follows: $R(t) = f(t) - p\, g_\gamma(t)$
  • This residual signal is then expanded in the same way as the original signal.
  • the procedure continues iteratively until either a set number of expansion coefficients are generated or some energy threshold for the residual is reached.
  • Each stage $n$ generates a dictionary function index $\gamma_n$ and a coefficient $p_n$.
  • After $N$ stages, the signal can be approximated by a linear function of the dictionary elements as follows: $\hat{f}(t) = \sum_{n=1}^{N} p_n\, g_{\gamma_n}(t)$
  • The complexity of Matching Pursuit decomposition of a signal of $n$ samples proves to be of the order $k \cdot N \cdot d \cdot n \log_2 n$.
  • Here, $d$ depends on the size of the dictionary without considering translations
  • $N$ is the number of chosen expansion coefficients
  • $k$ depends on the strategy to select the dictionary functions.
  • Matching Pursuit is more computationally demanding than the 8x8 and 4x4 DCT integer transforms used in the H.264 standard, whose complexity is $O(n \log_2 n)$.
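The greedy decomposition described above can be sketched in a few lines. This is a minimal illustration, assuming a small dictionary of unit-norm atoms stored as plain Python lists; the function names (`inner`, `matching_pursuit`, `reconstruct`) and the exhaustive search over the dictionary are illustrative choices, not the search strategy of any particular codec.

```python
def inner(a, b):
    """Inner product of two equal-length real sequences."""
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal, dictionary, max_atoms, energy_threshold=1e-12):
    """Greedily decompose `signal` over unit-norm `dictionary` atoms.

    Returns (atoms, residual), where atoms is a list of (index, coefficient).
    Stops after `max_atoms` stages or once the residual energy falls to
    `energy_threshold` (both stopping rules are mentioned in the text).
    """
    residual = list(signal)
    atoms = []
    for _ in range(max_atoms):
        if inner(residual, residual) <= energy_threshold:
            break
        # Pick the atom whose inner product with the residual is largest in magnitude.
        products = [inner(residual, g) for g in dictionary]
        gamma = max(range(len(dictionary)), key=lambda i: abs(products[i]))
        p = products[gamma]
        atoms.append((gamma, p))
        # Update the residual: R <- R - p * g_gamma.
        residual = [r - p * g for r, g in zip(residual, dictionary[gamma])]
    return atoms, residual


def reconstruct(atoms, dictionary, length):
    """Approximate the signal as the sum of the selected, weighted atoms."""
    out = [0.0] * length
    for gamma, p in atoms:
        out = [o + p * g for o, g in zip(out, dictionary[gamma])]
    return out
```

Because the dictionary is redundant, the selected atoms are generally not orthogonal, so a finite number of stages yields an approximation; the approximation plus the final residual always equals the input.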
  • the Matching Pursuit algorithm is compatible with any set of redundant basis shapes. It has been proposed to expand a signal using an over-complete basis of Gabor functions.
  • the 2-D Gabor dictionary is extremely redundant, and each shape may exist at any integer-pixel location in the coded residual image. Since Matching Pursuit has a much larger dictionary set and each coded basis function is well-matched to the structures in the residual signal, the frame-based Gabor dictionary does not include an artificial block structure.
  • the Gabor redundant dictionary set has been adopted for very low bit-rate video coding based on matching pursuits, with respect to a proposed video coding system using a matching pursuit algorithm (hereinafter referred to as the "prior art Gabor-based Matching Pursuit video coding approach").
  • the proposed system is based on the framework of a low bit rate hybrid-DCT system referred to as Simulation Model for Very Low Bit Rate Image Coding, or "SIM3" in short, where the DCT residual coder is replaced with a Matching Pursuit coder.
  • This coder uses Matching Pursuit to decompose the motion residual images over dictionary separable 2-D Gabor functions.
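To make the idea of a separable 2-D Gabor dictionary concrete, the sketch below builds one atom as the outer product of two 1-D Gabor functions (a Gaussian envelope times a cosine). The parameterization by `scale`, `freq`, and `phase`, and the unit-norm scaling, are assumptions made for illustration; the text does not specify the parameter set used by the prior art coder.

```python
import math


def gabor_1d(length, scale, freq, phase):
    """Unit-norm 1-D Gabor atom: Gaussian envelope times a cosine (assumed form)."""
    center = (length - 1) / 2.0
    raw = []
    for t in range(length):
        u = (t - center) / scale
        envelope = math.exp(-math.pi * u * u)
        carrier = math.cos(2.0 * math.pi * freq * (t - center) / length + phase)
        raw.append(envelope * carrier)
    norm = math.sqrt(sum(v * v for v in raw))
    if norm == 0.0:
        raise ValueError("degenerate atom: all samples are zero")
    return [v / norm for v in raw]


def gabor_2d(vertical, horizontal):
    """Separable 2-D atom: outer product of a vertical and a horizontal 1-D atom."""
    return [[v * h for h in horizontal] for v in vertical]
```

Separability matters in practice because an inner product with a 2-D atom can be computed as two passes of 1-D inner products, and the outer product of two unit-norm atoms is itself unit-norm.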
  • the proposed system was shown to perform well on low motion sequences at low bitrate.
  • a smooth 16x16 sine-squared window has been applied on the predicted images for 8x8 partitions in the prior art Gabor-based Matching Pursuit video coding approach.
  • the Matching Pursuit video codec in the prior art Gabor-based Matching Pursuit video coding approach is based on the ITU-T H.263 codec.
  • the H.264 standard enables variable block-size motion compensation with small block sizes which, for luma motion compensation, may be as small as 4x4.
  • the H.264 standard is based primarily on a 4x4 DCT-like transform for baseline and main profile, and not 8x8 as are most other prominent prior video coding standards.
  • the directional spatial prediction for intra coding improves the quality of the prediction signals. All those highlighted design features make the H.264 standard more efficient, but it requires dealing with more complicated situations when applying Matching Pursuit on the H.264 standard.
  • the smooth 16x16 sine-squared window is represented as follows:
  • a hybrid coding scheme (hereinafter the "prior art hybrid coding scheme") has been proposed that benefits from some of the features introduced by the H.264 standard for motion estimation and replaces the transform in the spatial domain.
  • the prediction error is coded using the Matching Pursuit algorithm, which decomposes the signal over an appositely designed bi-dimensional, anisotropic, redundant dictionary. Moreover, a fast atom search technique was introduced.
  • the proposed prior art hybrid coding scheme has not addressed whether or not it uses one-pass or two-pass scheme.
  • the proposed prior art hybrid coding scheme disclosed that the motion estimation part is compatible with the H.264 standard, but did not address whether any deblocking filters have been used in the coding scheme or whether any other methods have been used to smooth the blocking artifacts caused by the predicted images at very low bit rate.
  • a video encoder for encoding video signal data using a multiple-pass video encoding scheme.
  • the video encoder includes a motion estimator and a decomposition module.
  • the motion estimator performs motion estimation on the video signal data to obtain a motion residual corresponding to the video signal data in a first encoding pass.
  • the decomposition module in signal communication with the motion estimator, decomposes the motion residual in a subsequent encoding pass.
  • a method for encoding video signal data using a multiple-pass video encoding scheme includes performing motion estimation on the video signal data to obtain a motion residual corresponding to the video signal data in a first encoding pass, and decomposing the motion residual in a subsequent encoding pass.
  • a video decoder for decoding a video bitstream includes an entropy decoder, an atom decoder, an inverse transformer, a motion compensator, a deblocking filter, and a combiner. The entropy decoder decodes the video bitstream to obtain a decompressed video bitstream.
  • the atom decoder in signal communication with the entropy decoder, decodes decompressed atoms corresponding to the decompressed bitstream to obtain decoded atoms.
  • the inverse transformer in signal communication with the atom decoder, applies an inverse transform to the decoded atoms to form a reconstructed residual image.
  • the motion compensator in signal communication with the entropy decoder, performs motion compensation using motion vectors corresponding to the decompressed bitstream to form a reconstructed predicted image.
  • the deblocking filter in signal communication with the motion compensator, performs deblocking filtering on the reconstructed predicted image to smooth the reconstructed predicted image.
  • the combiner in signal communication with the inverse transformer and the overlapped block motion compensator, combines the reconstructed predicted image and the residue image to obtain a reconstructed image.
  • a method for decoding a video bitstream includes decoding the video bitstream to obtain a decompressed video bitstream, decoding decompressed atoms corresponding to the decompressed bitstream to obtain decoded atoms, applying an inverse transform to the decoded atoms to form a reconstructed residual image, performing motion compensation using motion vectors corresponding to the decompressed bitstream to form a reconstructed predicted image, performing deblocking filtering on the reconstructed predicted image to smooth the reconstructed predicted image, and combining the reconstructed predicted image and the residue image to obtain a reconstructed image.
  • FIGs. 1A and 1B are diagrams for exemplary first and second pass portions of an encoder in a two-pass H.264 standard-based Matching Pursuit encoder/decoder (CODEC) to which the present principles may be applied according to an embodiment of the present principles;
  • FIG. 2 is a diagram for an exemplary decoder in a two-pass H.264 standard-based Matching Pursuit encoder/decoder (CODEC) to which the present principles may be applied according to an embodiment of the present principles;
  • FIG. 3 is a diagram for an exemplary method for encoding an input video sequence in accordance with an embodiment of the present principles.
  • FIG. 4 is a diagram for an exemplary method for decoding an input video sequence in accordance with an embodiment of the present principles.
  • the present invention is directed to a method and apparatus for multiple pass video encoding and decoding.
  • the present invention corrects the blocking artifacts introduced by the DCT transform used in, e.g., the H.264 standard in very low bit rate applications.
  • the present invention is not limited to solely low bit rate applications, but may be used for other (higher) bit rates as well, while maintaining the scope of the present invention.
  • the terms “processor” or “controller” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (“DSP”) hardware, read-only memory (“ROM”) for storing software, random access memory (“RAM”), and non-volatile storage.
  • any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
  • any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function.
  • the invention as defined by such claims resides in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
  • a multiple pass video encoding and decoding scheme is provided.
  • the multiple pass video encoding and decoding scheme may be used with Matching Pursuit.
  • a two-pass H.264-based coding scheme is disclosed for Matching Pursuit video coding.
  • the H.264 standard applies block-based motion compensation and DCT-like transform similar to other video compression standards.
  • the DCT transform will introduce blocking artifacts, even with the use of de-blocking filters, because very few coefficients can be coded at very low bitrates, and each coefficient tends to have a very coarse quantization step.
  • matching pursuit using an over-complete basis is applied to code the residual images.
  • the motion compensation and mode decision parts are compatible with the H.264 standard.
  • the overlapped block motion compensation (OBMC) is applied to smooth the predicted images.
  • a new approach is provided for selecting a basis other than Matching Pursuit.
  • a video encoder and/or decoder applies OBMC on predicted images to reduce the blocking artifacts caused by the prediction models.
  • the Matching Pursuit algorithm is used to code the residual images.
  • the advantage of Matching Pursuit is that it is not block-based, but frame-based, so there are no blocking artifacts caused by the coding residual difference.
  • exemplary first and second pass portions of an encoder in a two-pass H.264 standard-based Matching Pursuit encoder/decoder are indicated generally by the reference numerals 110 and 160.
  • the encoder is indicated generally by the reference numeral 190 and a decoder portion is indicated generally by the reference numeral 191.
  • an input of the first pass portion 110 is connected in signal communication with a non-inverting input of a combiner 112, an input of an encoder control module 114, and a first input of a motion estimator 116.
  • a first output of the combiner 112 is connected in signal communication with a first input of a buffer 118.
  • a second output of the combiner 112 is connected in signal communication with an input of an integer transform/scaling/quantization module 120.
  • An output of the integer transform/scaling/quantization module 120 is connected in signal communication with a first input of a scaling/inverse transform module 122.
  • a first output of the encoder control module 114 is connected in signal communication with a first input of an intra-frame predictor 126.
  • a second output of the encoder control module 114 is connected in signal communication with a first input of a motion compensator 124.
  • a third output of the encoder control module 114 is connected in signal communication with a second input of the motion estimator 116.
  • a fourth output of the encoder control module 114 is connected in signal communication with a second input of the scaling/inverse transform module 122.
  • a fifth output of the encoder control module 114 is connected in signal communication with the first input of the buffer 118.
  • An output of the motion estimator 116 is connected in signal communication with a second input of the motion compensator 124 and a second input of the buffer 118.
  • An inverting input of the combiner 112 is selectively connected in signal communication with an output of the motion compensator 124 or an output of an intra-frame predictor 126.
  • the selected output of either the motion compensator 124 or the intra-frame predictor 126 is connected in signal communication with a first input of a combiner 128.
  • An output of the scaling/inverse transform module 122 is connected in signal communication with a second input of the combiner 128.
  • An output of the combiner 128 is connected in signal communication with a second input of the intra-frame predictor 126, a third input of the motion estimator 116, and an input/output of the motion compensator 124.
  • An output of the buffer 118 is available as an output of the first pass portion 110.
  • the encoder control module 114, the integer transform/scaling/quantization module 120, the buffer 118, and the motion estimator 116 are included in the encoder 190.
  • the scaling/inverse transform module 122, the intra-frame predictor 126, and the motion compensator 124 are included in the decoder portion 191.
  • the input of the first pass portion 110 receives an input video 111, and stores in the buffer 118 control data (e.g., motion vectors, mode selections, predicted images, and so forth) for use in the second pass portion 160.
  • a first input of the second pass portion 160 is connected in signal communication with an input of an entropy coder 166.
  • the first input receives control data 162 (e.g., mode selections, and so forth) and motion vectors 164 from the first pass portion 110.
  • a second input of the second pass portion 160 is connected in signal communication with a non-inverting input of a combiner 168.
  • a third input of the second pass portion 160 is connected in signal communication with an input of an overlapped block motion compensation (OBMC)/deblocking module 170.
  • the second input of the second pass portion 160 receives the input video 111, and the third input of the second pass portion receives predicted images 187 from the first pass portion 110.
  • An output of the combiner 168, which provides a residual 172, is connected in signal communication with an input of an atom finder 174.
  • An output of the atom finder 174, which provides a coded residual 178, is connected in signal communication with an input of an atom coder 176 and a first non-inverting input of a combiner 180.
  • An output of the OBMC/deblocking module 170 is connected in signal communication with an inverting input of the combiner 168 and with a second non-inverting input of the combiner 180.
  • An output of the combiner 180, which provides an output video, is connected in signal communication with an input of a reference buffer 182.
  • An output of the atom coder 176 is connected in signal communication with the input of the entropy coder 166.
  • An output of the entropy coder 166 is available as an output of the second pass portion 160, and provides an output bitstream.
  • the entropy coder 166 is included in the encoder 190, and the combiner 168, the OBMC/deblocking module 170, the atom finder 174, the atom coder 176, and the reference buffer 182 are included in the decoder portion 191.
  • an exemplary decoder in a two-pass H.264 standard-based Matching Pursuit encoder/decoder is indicated generally by the reference numeral 200.
  • An input of the decoder 200 is connected in signal communication with an input of an entropy decoder 210.
  • An output of the entropy decoder is connected in signal communication with an input of an atom decoder 220 and an input of a motion compensator 250.
  • An output of the inverse transform module 230, which provides residuals, is connected in signal communication with a first non-inverting input of a combiner 270.
  • An output of the motion compensator 250 is connected in signal communication with an input of an OBMC/deblocking module 260.
  • An output of the OBMC/deblocking module 260 is connected in signal communication with a second non-inverting input of the combiner 270.
  • An output of the combiner 270 is available as an output of the decoder 200.
  • the present principles are applicable to the ITU-T H.264/AVC coding system. Due to the frame-based residual coding, we apply OBMC on predicted images, which is not implemented in the H.264/AVC codec.
  • a first pass in a video encoding scheme is compatible with the H.264 standard. There is no actual coding in the first pass. All the control data, such as, for example, mode selections, predicted images and motion vectors, are saved into a buffer for the second pass. The DCT transform is still applied in the first pass for motion compensation and mode selections using Rate Distortion Optimization (RDO). Instead of coding the residue image using DCT coefficients, all residual images are saved for the second pass.
  • it is proposed to apply 16 x 16 constrained intra coding or H.264 standard compatible constrained intra coding and treat the boundary parts between intra coded and inter coded macroblocks specially.
  • the motion vectors and control data may be coded by entropy coding.
  • the residual images may be coded by Matching Pursuit.
  • the atoms search and parameter coding may be performed, e.g., according to the prior art Gabor-based Matching Pursuit video coding approach.
  • the reconstructed images are saved for reference frames.
  • One of the benefits of Matching Pursuit video coding is that Matching Pursuit is not block-based, so there are no blocking artifacts. However, when the motion prediction is performed on a block basis and is inaccurate, it still produces some blocking artifacts at very low bit rates. Simulations have shown that the atoms appear at the moving contours and in the areas where the motion vectors (MVs) are not very accurate. Improving the motion estimation leads the atoms to represent the residuals better.
  • one method involves using a H.264-like or improved deblocking filter to smooth the blocky boundary in a predictive image.
  • another method uses a smoother motion model based on overlapping blocks (OBMC).
  • a 16 x 16 sine-squared window has been adopted.
  • the N x N sine-squared window may be defined, e.g., in accordance with the prior art hybrid coding scheme.
  • the 16x16 sine-squared window is designed for 8x8 blocks, and 16x16 blocks are treated as four 8x8 blocks.
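The reason a 16-sample sine-squared window pairs naturally with 8-sample blocks can be shown in one dimension. The sketch below assumes the window form w[i] = sin²(π(i + 0.5)/n); this exact expression is an assumption, since the window formula itself is not reproduced in this text. With that form, two windows offset by half their length sum to one, so overlapping predictions blend smoothly without changing flat regions. The function names and the 1-D simplification are illustrative; a 2-D window would be the outer product of two such 1-D windows.

```python
import math


def sine_squared_window(n):
    """1-D sine-squared window, w[i] = sin^2(pi * (i + 0.5) / n) (assumed form)."""
    return [math.sin(math.pi * (i + 0.5) / n) ** 2 for i in range(n)]


def obmc_row(extended_preds, block=8):
    """Blend per-block predictions along one row with overlapping windows.

    extended_preds[b] holds 2 * block predicted samples for block b, covering
    positions b * block - block // 2 through b * block + 3 * block // 2 - 1.
    Samples falling outside the row are ignored, and the result is normalized
    by the accumulated weight so that row borders are handled consistently.
    """
    width = len(extended_preds) * block
    window = sine_squared_window(2 * block)
    acc = [0.0] * width
    weight = [0.0] * width
    for b, pred in enumerate(extended_preds):
        start = b * block - block // 2
        for k, w in enumerate(window):
            x = start + k
            if 0 <= x < width:
                acc[x] += w * pred[k]
                weight[x] += w
    return [a / w for a, w in zip(acc, weight)]
```

For two neighbouring blocks predicted as constant 10 and constant 20, the blended row ramps smoothly between the two values across the shared boundary instead of jumping, which is the smoothing effect OBMC is meant to provide on the predicted image.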
  • partitions with luma block size 16x16, 16x8, 8x16, and 8x8 samples are supported.
  • the 8x8 partition is further partitioned into partitions of 8x4, 4x8 or 4x4 luma samples and corresponding chroma samples.
  • four approaches are proposed to deal with more partition types. The first approach is to use an 8x8 sine-squared window for 4x4 partitions. All partitions larger than 4x4 are divided into several 4x4 partitions.
  • the second approach is to use a 16x16 sine-squared window for partitions of 8x8 and above, leaving partitions smaller than 8x8 untouched.
  • the third approach is to use adaptive OBMC for all partitions. These three approaches implement only OBMC, not deblocking filters; the fourth approach is to combine OBMC with a deblocking filter(s).
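The first two approaches amount to a rule that maps a luma partition to a window size and a set of tiles. The sketch below encodes only that rule; the function names `tile` and `obmc_plan`, the numeric approach identifiers, and the return convention (`None` when OBMC is skipped) are hypothetical. The adaptive third approach and the combined fourth approach are not specified in enough detail here to sketch.

```python
def tile(width, height, unit):
    """Split a width x height partition into unit x unit tiles (top-left offsets)."""
    return [(x, y) for y in range(0, height, unit) for x in range(0, width, unit)]


def obmc_plan(approach, width, height):
    """Return (window_size, tiles) for a luma partition, or None if OBMC is skipped."""
    if approach == 1:
        # 8x8 window on 4x4 tiles; larger partitions are divided into 4x4 tiles.
        return 8, tile(width, height, 4)
    if approach == 2:
        # 16x16 window on 8x8 tiles; partitions smaller than 8x8 are left untouched.
        if width < 8 or height < 8:
            return None
        return 16, tile(width, height, 8)
    raise ValueError("only approaches 1 and 2 are sketched here")
```

Under this rule a 16x8 partition yields eight 4x4 tiles with approach 1, while an 8x4 partition is skipped entirely with approach 2.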
  • Each subband represents one direction, and it is edge detective.
  • the 2-D DDWT achieves higher PSNR with the same retained coefficients compared to the standard 2-D DWT. Thus, it is more suitable to code the edge information.
  • After applying OBMC on the predicted images, the error images will have smoother edges. Parametric over-complete 2-D dictionaries may be used to provide smoother edges.
  • an exemplary method for encoding an input video sequence is indicated generally by the reference numeral 300.
  • the method 300 includes a start block 305 that passes control to a decision block 310.
  • the decision block 310 determines whether or not the current frame is an I-frame. If so, then control is passed to a function block 355. Otherwise, control is passed to a function block 315.
  • the function block 355 performs H.264 standard compatible frame coding to provide an output bitstream, and passes control to an end block 370.
  • the function block 315 performs H.264 standard compatible motion compensation, and passes control to a function block 320.
  • the function block 320 saves the motion vectors (MVs), control data, and predicted blocks, and passes control to a decision block 325.
  • the decision block 325 determines whether or not the end of the frame has been reached. If so, then control is passed to a function block 330. Otherwise, control is returned to the function block 315.
  • the function block 330 performs OBMC and/or deblocking filtering on the predicted images, and passes control to a function block 335.
  • the function block 335 obtains a residue image from the original and predicted images, and passes control to a function block 340.
  • the function block 340 codes a residual using Matching Pursuit, and passes control to a function block 345.
  • the function block 345 performs entropy coding to provide an output bitstream, and passes control to the end block 370.
  • the method 400 includes a start block 405 that passes control to a decision block 410.
  • the decision block 410 determines whether or not the current frame is an I-frame. If so, then control is passed to a function block 435. Otherwise, control is passed to a function block 415.
  • the function block 435 performs H.264 standard compatible decoding to provide a reconstructed image, and passes control to an end block 470.
  • the function block 415 decodes the motion vectors, control data, and the Matching Pursuit atoms, and passes control to a function block 420 and a function block 425.
  • the function block 420 reconstructs the residue image using decoded atoms, and passes control to a function block 430.
  • the function block 425 reconstructs the predicted images by decoding motion vectors and other control data and applying OBMC and/or deblocking filtering, and passes control to the function block 430.
  • the function block 430 combines the reconstructed residue image and the reconstructed predicted images to provide a reconstructed image, and passes control to the end block 470.
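The non-I-frame paths of the encoding and decoding methods mirror each other, which a toy 1-D sketch can make explicit. Everything below is a stand-in chosen for brevity: `smooth` is a hypothetical 3-tap average in place of OBMC/deblocking, `code_residual` performs Matching Pursuit over a trivial dictionary of unit spikes (so it simply keeps the largest residual samples), the predicted signal is passed in directly rather than rebuilt from motion vectors, and entropy coding is omitted.

```python
def smooth(pred):
    """Hypothetical stand-in for OBMC/deblocking: 3-tap average with edge clamping."""
    n = len(pred)
    return [(pred[max(i - 1, 0)] + pred[i] + pred[min(i + 1, n - 1)]) / 3.0
            for i in range(n)]


def code_residual(residual, max_atoms):
    """Matching Pursuit over unit-spike atoms: keep the largest-magnitude samples."""
    order = sorted(range(len(residual)), key=lambda i: abs(residual[i]), reverse=True)
    return [(i, residual[i]) for i in order[:max_atoms] if residual[i] != 0.0]


def encode_p_frame(original, predicted, max_atoms):
    """Second-pass encoding of a non-I frame (blocks 330 to 340 of the method)."""
    smoothed = smooth(predicted)                             # smooth the prediction
    residual = [o - s for o, s in zip(original, smoothed)]   # original minus prediction
    return code_residual(residual, max_atoms)                # atoms to be entropy coded


def decode_p_frame(predicted, atoms):
    """Decoding of a non-I frame (blocks 420 to 430 of the method)."""
    smoothed = smooth(predicted)                             # same smoothing as the encoder
    residual = [0.0] * len(predicted)
    for index, coeff in atoms:                               # rebuild residual from atoms
        residual[index] += coeff
    return [s + r for s, r in zip(smoothed, residual)]       # prediction plus residual
```

The key design point the sketch illustrates is that the decoder must apply exactly the same smoothing to the predicted image as the encoder did before forming the residual; otherwise the decoded atoms would be added to a different prediction and the reconstruction would drift.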
  • one advantage/feature is a video encoder for encoding video signal data using a multiple-pass video encoding scheme, wherein the video encoder includes a motion estimator and a decomposition module.
  • the motion estimator performs motion estimation on the video signal data to obtain a motion residual corresponding to the video signal data in a first encoding pass.
  • the decomposition module in signal communication with the motion estimator, decomposes the motion residual in a subsequent encoding pass.
  • the video encoder as described above, wherein the multiple-pass video coding scheme is a two-pass video encoding scheme.
  • the video encoder further includes a buffer, in signal communication with the motion estimator and the decomposition module, for storing the motion residual obtained in the first encoding pass for subsequent use in a second encoding pass.
  • the decomposition module decomposes the motion residual using a redundant Gabor dictionary set in the second encoding pass.
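The second-pass decomposition can be illustrated with a greedy Matching Pursuit loop over a small Gabor-like dictionary. The sketch below is a simplified 1-D illustration under stated assumptions, not the 2-D dictionary of the described encoder: the atom formula (Gaussian envelope times a cosine), the parameter grid, and every function name are chosen here for illustration only.

```python
import math
from typing import List, Tuple

Vector = List[float]


def gabor_atom(length: int, scale: float, freq: float, phase: float) -> Vector:
    """Unnormalized 1-D Gabor-like atom: Gaussian envelope times a cosine."""
    center = (length - 1) / 2.0
    return [
        math.exp(-math.pi * ((n - center) / scale) ** 2)
        * math.cos(2.0 * math.pi * freq * (n - center) / length + phase)
        for n in range(length)
    ]


def build_dictionary(length: int) -> List[Vector]:
    """Redundant set of unit-norm atoms over an assumed (scale, freq, phase) grid."""
    atoms = []
    for scale in (2.0, 4.0, 8.0):
        for freq in (0.0, 1.0, 2.0):
            for phase in (0.0, math.pi / 2.0):
                raw = gabor_atom(length, scale, freq, phase)
                norm = math.sqrt(sum(v * v for v in raw))
                if norm > 1e-9:  # skip numerically empty atoms
                    atoms.append([v / norm for v in raw])
    return atoms


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(
    signal: Vector, dictionary: List[Vector], n_atoms: int
) -> Tuple[List[Tuple[int, float]], Vector]:
    """Greedily pick the atom with the largest |inner product| and subtract it."""
    residual = list(signal)
    chosen = []
    for _ in range(n_atoms):
        products = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(products[k]))
        coef = products[best]
        residual = [r - coef * a for r, a in zip(residual, dictionary[best])]
        chosen.append((best, coef))
    return chosen, residual


def reconstruct(
    chosen: List[Tuple[int, float]], dictionary: List[Vector], length: int
) -> Vector:
    """Rebuild the approximation from (atom index, coefficient) pairs."""
    out = [0.0] * length
    for index, coef in chosen:
        out = [o + coef * a for o, a in zip(out, dictionary[index])]
    return out
```

Each `(index, coefficient)` pair plays the role of a coded atom: a decoder that knows the same dictionary can rebuild the residual approximation with `reconstruct`.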
  • Yet another advantage/feature is the video encoder using the two-pass video encoding scheme as described above, wherein the motion estimator performs the motion estimation and coding-mode selection in compliance with the International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 standard in the first encoding pass.
  • Still another advantage/feature is the video encoder using the two-pass video encoding scheme as described above, wherein the video encoder further includes a prediction module and an overlapped block motion compensator.
  • The prediction module, in signal communication with the buffer, forms a predicted image corresponding to the video signal data in the first encoding pass.
  • The overlapped block motion compensator, in signal communication with the buffer, performs overlapped block motion compensation (OBMC) on the predicted image using a 16x16 sine-square window to smooth the predicted image in the second encoding pass.
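A sine-square OBMC window can be sketched as below. The text names the window but does not give its formula, so the definition used here, w[n] = sin²(π(n + 0.5)/N), the separable 2-D form, and the assumption that windows are stepped by half their size (a 16x16 window over 8x8 blocks) are all illustrative assumptions; the function names are hypothetical.

```python
import math
from typing import List


def sine_square_window_1d(size: int) -> List[float]:
    """Assumed window: w[n] = sin^2(pi * (n + 0.5) / size)."""
    return [math.sin(math.pi * (n + 0.5) / size) ** 2 for n in range(size)]


def sine_square_window_2d(size: int) -> List[List[float]]:
    """Separable 2-D window built as the outer product of the 1-D window."""
    w = sine_square_window_1d(size)
    return [[wi * wj for wj in w] for wi in w]


def overlap_weight_sum(size: int, row: int, col: int) -> float:
    """Total weight at one sample when windows are stepped by size // 2.

    (row, col) is a position inside the top-left quadrant of the window;
    the four terms are the windows that cover that sample.
    """
    half = size // 2
    w = sine_square_window_2d(size)
    return sum(w[row + dr][col + dc] for dr in (0, half) for dc in (0, half))
```

With this definition, two windows offset by half the window size satisfy sin² + cos² = 1, so the overlapping weights sum to one at every sample and blending the overlapped predictions smooths block edges without changing the overall brightness.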
  • Another advantage/feature is the video encoder using the two-pass video encoding scheme as described above, wherein the video encoder further includes a prediction module and an overlapped block motion compensator.
  • The prediction module, in signal communication with the buffer, forms a predicted image corresponding to the video signal data in the first encoding pass.
  • The overlapped block motion compensator, in signal communication with the buffer, performs overlapped block motion compensation (OBMC) on only 8x8 and greater partitions of the predicted image in the second encoding pass.
  • Another advantage/feature is the video encoder using the two-pass video encoding scheme as described above, wherein the video encoder further includes a prediction module and an overlapped block motion compensator.
  • The prediction module, in signal communication with the buffer, forms a predicted image corresponding to the video signal data in the first encoding pass.
  • The overlapped block motion compensator, in signal communication with the buffer, performs overlapped block motion compensation (OBMC) using an 8x8 sine-square window for 4x4 partitions of the predicted image in the second encoding pass. All partitions of the predicted image are divided into 4x4 partitions when OBMC is performed in the second encoding pass.
  • The buffer stores the predicted image therein in the first encoding pass for subsequent use in the second encoding pass.
  • Another advantage/feature is the video encoder using the two-pass video encoding scheme as described above, wherein the video encoder further includes a prediction module and an overlapped block motion compensator.
  • The prediction module, in signal communication with the buffer, forms a predicted image corresponding to the video signal data in the first encoding pass.
  • The overlapped block motion compensator, in signal communication with the buffer, performs adaptive overlapped block motion compensation (OBMC) for all partitions of the predicted image in the second encoding pass.
  • Another advantage/feature is the video encoder using the two-pass video encoding scheme as described above, wherein the video encoder further includes a prediction module and a deblocking filter.
  • The prediction module, in signal communication with the buffer, forms a predicted image corresponding to the video signal data in the first encoding pass.
  • The deblocking filter, in signal communication with the buffer, performs a deblocking operation on the predicted image in the second encoding pass.
  • The buffer stores the predicted image therein in the first encoding pass for subsequent use in the second encoding pass.
  • Another advantage/feature is the video encoder using the two-pass video encoding scheme as described above, wherein the decomposition module applies parametric over-complete 2-D dictionaries to decompose the motion residual in the second encoding pass.
  • Another advantage/feature is a video decoder for decoding a video bitstream.
  • The video decoder includes an entropy decoder, an atom decoder, an inverse transformer, a motion compensator, a deblocking filter, and a combiner.
  • The entropy decoder decodes the video bitstream to obtain a decompressed video bitstream.
  • The atom decoder, in signal communication with the entropy decoder, decodes decompressed atoms corresponding to the decompressed bitstream to obtain decoded atoms.
  • The inverse transformer, in signal communication with the atom decoder, applies an inverse transform to the decoded atoms to form a reconstructed residual image.
  • The motion compensator, in signal communication with the entropy decoder, performs motion compensation using motion vectors corresponding to the decompressed bitstream to form a reconstructed predicted image.
  • The deblocking filter, in signal communication with the motion compensator, performs deblocking filtering on the reconstructed predicted image to smooth the reconstructed predicted image.
  • The combiner, in signal communication with the inverse transformer and the overlapped block motion compensator, combines the reconstructed predicted image and the reconstructed residual image to obtain a reconstructed image.
  • The teachings of the present invention are implemented as a combination of hardware and software.
  • The software may be implemented as an application program tangibly embodied on a program storage unit.
  • The application program may be uploaded to, and executed by, a machine comprising any suitable architecture.
  • The machine is implemented on a computer platform having hardware such as one or more central processing units ("CPU"), a random access memory ("RAM"), and input/output ("I/O") interfaces.
  • The computer platform may also include an operating system and microinstruction code.
  • The various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU.
  • Various other peripheral units may be connected to the computer platform, such as an additional data storage unit and a printing unit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
PCT/US2007/004110 2006-09-22 2007-02-15 Method and apparatus for multiple pass video coding and decoding WO2008036112A1 (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
EP07750912A EP2070334A1 (en) 2006-09-22 2007-02-15 Method and apparatus for multiple pass video coding and decoding
JP2009529167A JP5529537B2 (ja) 2006-09-22 2007-02-15 複数経路ビデオ符号化及び復号化のための方法及び装置
US12/310,757 US20100040146A1 (en) 2006-09-22 2007-02-15 Method and apparatus for multiple pass video coding and decoding
BRPI0716540-4A BRPI0716540A2 (pt) 2006-09-22 2007-02-15 Método e aparelho para codificação e decodificação de vídeo de passos múltiplos
KR1020157009755A KR20150047639A (ko) 2006-09-22 2007-02-15 다중 경로 비디오 코딩 및 디코딩을 위한 방법 및 장치
KR1020147008589A KR20140059270A (ko) 2006-09-22 2007-02-15 다중 경로 비디오 코딩 및 디코딩을 위한 방법 및 장치

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
USPCT/US2006/037139 2006-09-22
US2006037139 2006-09-22

Publications (1)

Publication Number Publication Date
WO2008036112A1 (en) 2008-03-27

Family

ID=38521211

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2007/004110 WO2008036112A1 (en) 2006-09-22 2007-02-15 Method and apparatus for multiple pass video coding and decoding

Country Status (6)

Country Link
EP (1) EP2070334A1 (ja)
JP (2) JP5529537B2 (ja)
KR (2) KR20150047639A (ja)
CN (2) CN102833544A (ja)
BR (1) BRPI0716540A2 (ja)
WO (1) WO2008036112A1 (ja)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013520058A (ja) * 2010-02-11 2013-05-30 トムソン ライセンシング 画像シーケンスのブロックの符号化および復元の方法
US10390034B2 (en) 2014-01-03 2019-08-20 Microsoft Technology Licensing, Llc Innovations in block vector prediction and estimation of reconstructed sample values within an overlap area
US10469863B2 (en) 2014-01-03 2019-11-05 Microsoft Technology Licensing, Llc Block vector prediction in video and image coding/decoding
US10506254B2 (en) 2013-10-14 2019-12-10 Microsoft Technology Licensing, Llc Features of base color index map mode for video and image coding and decoding
US10542274B2 (en) 2014-02-21 2020-01-21 Microsoft Technology Licensing, Llc Dictionary encoding and decoding of screen content
US10582213B2 (en) 2013-10-14 2020-03-03 Microsoft Technology Licensing, Llc Features of intra block copy prediction mode for video and image coding and decoding
US10659783B2 (en) 2015-06-09 2020-05-19 Microsoft Technology Licensing, Llc Robust encoding/decoding of escape-coded pixels in palette mode
US10785486B2 (en) 2014-06-19 2020-09-22 Microsoft Technology Licensing, Llc Unified intra block copy and inter prediction modes
US10812817B2 (en) 2014-09-30 2020-10-20 Microsoft Technology Licensing, Llc Rules for intra-picture prediction modes when wavefront parallel processing is enabled
US10986349B2 (en) 2017-12-29 2021-04-20 Microsoft Technology Licensing, Llc Constraints on locations of reference blocks for intra block copy prediction
US11109036B2 (en) 2013-10-14 2021-08-31 Microsoft Technology Licensing, Llc Encoder-side options for intra block copy prediction mode for video and image coding
US11284103B2 (en) 2014-01-17 2022-03-22 Microsoft Technology Licensing, Llc Intra block copy prediction with asymmetric partitions and encoder-side search patterns, search ranges and approaches to partitioning
US20230319323A1 (en) * 2007-02-23 2023-10-05 Xylon Llc Video Coding With Embedded Motion

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013077650A1 (ko) * 2011-11-23 2013-05-30 한국전자통신연구원 다시점 비디오 복호화 방법 및 장치
JP6055555B2 (ja) 2012-11-13 2016-12-27 インテル コーポレイション 次世代ビデオのためのビデオコーデックアーキテクチャ

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5699121A (en) * 1995-09-21 1997-12-16 Regents Of The University Of California Method and apparatus for compression of low bit rate video signals
US20030179825A1 (en) * 2001-04-09 2003-09-25 Shunchi Sekiguchi Image encoding method and apparatus, image decoding method and apparatus, and image processing system
WO2005002234A1 (en) * 2003-06-30 2005-01-06 Koninklijke Philips Electronics, N.V. Video coding in an overcomplete wavelet domain
WO2005013201A1 (en) * 2003-08-05 2005-02-10 Koninklijke Philips Electronics N.V. Video encoding and decoding methods and corresponding devices
US20060159359A1 (en) * 2005-01-19 2006-07-20 Samsung Electronics Co., Ltd. Fine granularity scalable video encoding and decoding method and apparatus capable of controlling deblocking

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8204109B2 (en) * 2003-03-28 2012-06-19 Etiip Holdings Inc. Overcomplete basis transform-based motion residual frame coding method and apparatus for video compression
US7653133B2 (en) * 2003-06-10 2010-01-26 Rensselaer Polytechnic Institute (Rpi) Overlapped block motion compression for variable size blocks in the context of MCTF scalable video coders
JP2007502561A (ja) * 2003-08-12 2007-02-08 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ ビデオエンコードおよびデコードの方法および対応する装置
JP4191729B2 (ja) * 2005-01-04 2008-12-03 三星電子株式会社 イントラblモードを考慮したデブロックフィルタリング方法、及び該方法を用いる多階層ビデオエンコーダ/デコーダ

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BEIBEI WANG ET AL: "A Two Pass H.264-Based Matching Pursuit Video Coder", IMAGE PROCESSING, 2006 IEEE INTERNATIONAL CONFERENCE ON, IEEE, PI, October 2006 (2006-10-01), pages 3149 - 3152, XP031049345, ISBN: 1-4244-0480-0 *
MARK R BANHAM ET AL: "A Selective Update Approach to Matching Pursuits Video Coding", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, vol. 7, no. 1, February 1997 (1997-02-01), XP011014341, ISSN: 1051-8215 *

Also Published As

Publication number Publication date
KR20090073112A (ko) 2009-07-02
CN101518085A (zh) 2009-08-26
KR20150047639A (ko) 2015-05-04
JP5529537B2 (ja) 2014-06-25
EP2070334A1 (en) 2009-06-17
JP2012235520A (ja) 2012-11-29
CN102833544A (zh) 2012-12-19
JP5639619B2 (ja) 2014-12-10
BRPI0716540A2 (pt) 2012-12-25
JP2010504689A (ja) 2010-02-12

Similar Documents

Publication Publication Date Title
EP2070334A1 (en) Method and apparatus for multiple pass video coding and decoding
JP6120896B2 (ja) ビデオ符号化および復号のためのスパース性に基づくアーティファクト除去フィルタリングを行う方法および装置
JP5801363B2 (ja) 符号化及び復号化のための装置及び方法並びにコンピュータプログラム
CN108028931B (zh) 用于视频编解码的自适应帧间预测的方法及装置
US10743027B2 (en) Methods and apparatus for adaptive template matching prediction for video encoding and decoding
EP1401211A2 (en) Multi-resolution video coding and decoding
EP3633996A1 (en) Methods and apparatus for adaptive coding of motion information
KR101482896B1 (ko) 최적화된 디블록킹 필터
US20200244965A1 (en) Interpolation filter for an inter prediction apparatus and method for video coding
US9736500B2 (en) Methods and apparatus for spatially varying residue coding
US9277245B2 (en) Methods and apparatus for constrained transforms for video coding and decoding having transform selection
US11265582B2 (en) In-loop filter apparatus and method for video coding
JP5746193B2 (ja) 映像符号化及び復号化のための効率的な適応フィルタリングの方法及び装置
US20120263225A1 (en) Apparatus and method for encoding moving picture
US20100040146A1 (en) Method and apparatus for multiple pass video coding and decoding
KR20140059270A (ko) 다중 경로 비디오 코딩 및 디코딩을 위한 방법 및 장치

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase (Ref document number: 200780034952.3; Country of ref document: CN)
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 07750912; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 1307/DELNP/2009; Country of ref document: IN)
WWE Wipo information: entry into national phase (Ref document number: 12310757; Country of ref document: US)
WWE Wipo information: entry into national phase (Ref document number: 1020097005789; Country of ref document: KR)
ENP Entry into the national phase (Ref document number: 2009529167; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
REEP Request for entry into the european phase (Ref document number: 2007750912; Country of ref document: EP)
WWE Wipo information: entry into national phase (Ref document number: 2007750912; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: PI0716540; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20090309)
WWE Wipo information: entry into national phase (Ref document number: 1020157009755; Country of ref document: KR)