WO2008036112A1 - Method and apparatus for multiple pass video encoding and decoding
- Publication number
- WO2008036112A1 (application PCT/US2007/004110)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- predicted image
- encoding pass
- video
- pass
- encoding
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/583—Motion compensation with overlapping blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
- H04N19/43—Hardware specially adapted for motion estimation or compensation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/63—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using sub-band based transform, e.g. wavelets
- H04N19/635—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using sub-band based transform, e.g. wavelets characterised by filter definition or implementation details
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
- H04N19/86—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/97—Matching pursuit coding
Definitions
- the present invention relates generally to video encoding and decoding and, more particularly, to a method and apparatus for multiple pass video encoding and decoding.
- the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 standard (hereinafter the "MPEG4/H.264 standard" or simply the "H.264 standard") is currently the most powerful and state-of-the-art video coding standard.
- the H.264 standard uses block-based motion compensation and discrete cosine transform (DCT)-like transform coding. It is well known that DCT is efficient for video coding and suitable for high-end applications, like broadcast high definition television (HDTV).
- the DCT algorithm is not as well suited for applications which require very low bit rates, such as a dedicated video cell phone.
- the DCT transform will introduce blocking artifacts, even with the use of deblocking filters, because very few coefficients can be coded at very low bitrates, and each coefficient tends to have a very coarse quantization step.
- Matching pursuit is a greedy algorithm to decompose any signal into a linear expansion of waveforms that are selected from a redundant dictionary of functions. These waveforms are selected to best match the signal structures.
- Suppose we have a 1-D signal f(t), and we want to decompose this signal using basis vectors from an over-complete dictionary set G.
- Individual dictionary functions can be denoted as follows:
- $\gamma$ is an indexing parameter associated with a particular dictionary element.
- the decomposition begins by choosing $\gamma$ to maximize the absolute value of the inner product as follows:
- This residual signal is then expanded in the same way as the original signal.
- the procedure continues iteratively until either a set number of expansion coefficients are generated or some energy threshold for the residual is reached.
- Each stage n generates a dictionary function $\gamma_n$.
- the signal can be approximated by a linear function of the dictionary elements as follows:
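The equations referred to in the preceding items did not survive in this text. As a hedged reconstruction, the standard Matching Pursuit formulation that matches the surrounding definitions can be written as below. The symbols $p_n$ (expansion coefficient), $R_1 f$ (first residual), and the unit-norm condition are conventional textbook notation and are assumptions here, not necessarily the notation of the original filing.

```latex
\begin{align}
g_\gamma(t) &\in G, \qquad \lVert g_\gamma \rVert = 1 \\
\gamma_1 &= \arg\max_{\gamma} \left| \langle f(t), g_\gamma(t) \rangle \right|,
\qquad p_1 = \langle f(t), g_{\gamma_1}(t) \rangle \\
R_1 f(t) &= f(t) - p_1 \, g_{\gamma_1}(t) \\
\hat{f}(t) &= \sum_{n=1}^{N} p_n \, g_{\gamma_n}(t)
\end{align}
```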
- the complexity of the Matching Pursuit decomposition of a signal of n samples proves to be of the order $k \cdot N \cdot d \cdot n \log_2 n$.
- d depends on the size of the dictionary without considering translations
- N is the number of chosen expansion coefficients
- k depends on the strategy to select the dictionary functions.
- Matching Pursuit is more computationally demanding than the 8x8 and 4x4 DCT integer transforms used in the H.264 standard, whose complexity is defined as $O(n \log_2 n)$.
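The complexity comparison above can be made concrete with a small calculation. This is only a sketch: it assumes the order-of-growth expressions k·N·d·n·log2(n) for Matching Pursuit and n·log2(n) for the DCT, ignores all hidden constants, and uses hypothetical function names.

```python
import math


def matching_pursuit_cost(n, k, num_atoms, d):
    """Order-of-growth estimate k * N * d * n * log2(n); constants are ignored."""
    return k * num_atoms * d * n * math.log2(n)


def dct_cost(n):
    """Order-of-growth estimate n * log2(n) for a fast DCT-like transform."""
    return n * math.log2(n)


# Hypothetical values: 64 samples, k = 1, 10 atoms, dictionary factor d = 100.
ratio = matching_pursuit_cost(64, 1, 10, 100) / dct_cost(64)
```

Under these assumed values the ratio is simply k·N·d, which shows why the dictionary size and the number of coded atoms dominate the cost.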
- the Matching Pursuit algorithm is compatible with any set of redundant basis shapes. It has been proposed to expand a signal using an over-complete basis of Gabor functions.
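The greedy procedure described above can be sketched in a few lines. This is a minimal 1-D illustration, not the coder described in this document: the Gabor-like atom definition, the scale and frequency values, and the function names `gabor_dictionary` and `matching_pursuit` are all assumptions chosen for the example.

```python
import numpy as np


def gabor_dictionary(n, scales=(2.0, 4.0, 8.0), freqs=(0.0, 0.1, 0.25)):
    """Build a redundant set of unit-norm 1-D Gabor-like atoms at every shift."""
    t = np.arange(n)
    atoms = []
    for s in scales:
        for f in freqs:
            for shift in range(n):
                u = t - shift
                g = np.exp(-np.pi * (u / s) ** 2) * np.cos(2 * np.pi * f * u)
                atoms.append(g / np.linalg.norm(g))
    return np.array(atoms)  # shape: (num_atoms, n)


def matching_pursuit(signal, dictionary, max_atoms=10, energy_threshold=1e-6):
    """Greedy decomposition: stop after max_atoms or when the residual energy is small."""
    residual = np.asarray(signal, dtype=float).copy()
    chosen = []
    for _ in range(max_atoms):
        if np.dot(residual, residual) <= energy_threshold:
            break
        products = dictionary @ residual          # inner product with every atom
        idx = int(np.argmax(np.abs(products)))    # best-matching atom
        coeff = float(products[idx])
        residual -= coeff * dictionary[idx]       # residual is expanded next
        chosen.append((idx, coeff))
    return chosen, residual


D = gabor_dictionary(32)
signal = 3.0 * D[40] - 1.5 * D[200]
atoms, residual = matching_pursuit(signal, D, max_atoms=20)
approx = sum(c * D[i] for i, c in atoms)
```

Because every atom has unit norm, each iteration removes energy from the residual, and `approx + residual` always equals the input signal.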
- the 2-D Gabor dictionary is extremely redundant, and each shape may exist at any integer-pixel location in the coded residual image. Since Matching Pursuit has a much larger dictionary set and each coded basis function is well-matched to the structures in the residual signal, the frame-based Gabor dictionary does not include an artificial block structure.
- the Gabor redundant dictionary set has been adopted for very low bit-rate video coding based on matching pursuits, with respect to a proposed video coding system using a matching pursuit algorithm (hereinafter referred to as the "prior art Gabor-based Matching Pursuit video coding approach").
- the proposed system is based on the framework of a low bit rate hybrid-DCT system referred to as Simulation Model for Very Low Bit Rate Image Coding, or "SIM3" in short, where the DCT residual coder is replaced with a Matching Pursuit coder.
- This coder uses Matching Pursuit to decompose the motion residual images over a dictionary of separable 2-D Gabor functions.
- the proposed system was shown to perform well on low motion sequences at low bitrate.
- a smooth 16x16 sine-square window has been applied on the predicted images for 8x8 partitions in the prior art Gabor-based Matching Pursuit video coding approach.
- the Matching Pursuit video codec in the prior art Gabor-based Matching Pursuit video coding approach is based on the ITU-T H.263 codec.
- the H.264 standard enables variable block-size motion compensation with small block sizes which, for luma motion compensation, may be as small as 4x4.
- the H.264 standard is based primarily on a 4x4 DCT-like transform for baseline and main profile, and not 8x8 as are most other prominent prior video coding standards.
- the directional spatial prediction for intra coding improves the quality of the prediction signals. All those highlighted design features make the H.264 standard more efficient, but it requires dealing with more complicated situations when applying Matching Pursuit on the H.264 standard.
- the smooth 16x16 sine-squared window is represented as follows:
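The window formula itself is missing from this text. The sketch below builds a separable sine-squared window using a common definition, w(n) = sin²(π(n + 1/2)/N); that formula and the function name `sine_squared_window` are assumptions, not taken from this document.

```python
import numpy as np


def sine_squared_window(size=16):
    """Separable 2-D sine-squared window; the 1-D profile is an assumed common form."""
    n = np.arange(size)
    w1d = np.sin(np.pi * (n + 0.5) / size) ** 2
    return np.outer(w1d, w1d)


W = sine_squared_window(16)
# Windows of neighbouring 8x8 blocks are offset by 8 samples; their weights add up.
overlap_sum = W[:8, :8] + W[8:, :8] + W[:8, 8:] + W[8:, 8:]
```

With this profile the four overlapping windows sum to one at every sample, so the overlapped prediction needs no extra normalization away from the frame border.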
- a hybrid coding scheme (hereinafter the "prior art hybrid coding scheme") has been proposed that benefits from some of the features introduced by the H.264 standard for motion estimation and replaces the transform in the spatial domain.
- the prediction error is coded using the Matching Pursuit algorithm, which decomposes the signal over an appositely designed bi-dimensional, anisotropic, redundant dictionary. Moreover, a fast atom search technique was introduced.
- the proposed prior art hybrid coding scheme does not state whether it uses a one-pass or a two-pass scheme.
- the proposed prior art hybrid coding scheme disclosed that the motion estimation part is compatible with the H.264 standard, but did not address whether any deblocking filters have been used in the coding scheme or whether any other methods have been used to smooth the blocking artifacts caused by the predicted images at very low bit rate.
- a video encoder for encoding video signal data using a multiple-pass video encoding scheme.
- the video encoder includes a motion estimator and a decomposition module.
- the motion estimator performs motion estimation on the video signal data to obtain a motion residual corresponding to the video signal data in a first encoding pass.
- the decomposition module, in signal communication with the motion estimator, decomposes the motion residual in a subsequent encoding pass.
- a method for encoding video signal data using a multiple-pass video encoding scheme includes performing motion estimation on the video signal data to obtain a motion residual corresponding to the video signal data in a first encoding pass, and decomposing the motion residual in a subsequent encoding pass.
- a video decoder for decoding a video bitstream includes an entropy decoder, an atom decoder, an inverse transformer, a motion compensator, a deblocking filter, and a combiner. The entropy decoder decodes the video bitstream to obtain a decompressed video bitstream.
- the atom decoder, in signal communication with the entropy decoder, decodes decompressed atoms corresponding to the decompressed bitstream to obtain decoded atoms.
- the inverse transformer, in signal communication with the atom decoder, applies an inverse transform to the decoded atoms to form a reconstructed residual image.
- the motion compensator, in signal communication with the entropy decoder, performs motion compensation using motion vectors corresponding to the decompressed bitstream to form a reconstructed predicted image.
- the deblocking filter, in signal communication with the motion compensator, performs deblocking filtering on the reconstructed predicted image to smooth the reconstructed predicted image.
- the combiner, in signal communication with the inverse transformer and the overlapped block motion compensator, combines the reconstructed predicted image and the residue image to obtain a reconstructed image.
- a method for decoding a video bitstream includes decoding the video bitstream to obtain a decompressed video bitstream, decoding decompressed atoms corresponding to the decompressed bitstream to obtain decoded atoms, applying an inverse transform to the decoded atoms to form a reconstructed residual image, performing motion compensation using motion vectors corresponding to the decompressed bitstream to form a reconstructed predicted image, performing deblocking filtering on the reconstructed predicted image to smooth the reconstructed predicted image, and combining the reconstructed predicted image and the residue image to obtain a reconstructed image.
- FIGs. 1A and 1B are diagrams for exemplary first and second pass portions of an encoder in a two-pass H.264 standard-based Matching Pursuit encoder/decoder (CODEC) to which the present principles may be applied according to an embodiment of the present principles;
- FIG. 2 is a diagram for an exemplary decoder in a two-pass H.264 standard-based Matching Pursuit encoder/decoder (CODEC) to which the present principles may be applied according to an embodiment of the present principles;
- FIG. 3 is a diagram for an exemplary method for encoding an input video sequence in accordance with an embodiment of the present principles.
- FIG. 4 is a diagram for an exemplary method for decoding an input video sequence in accordance with an embodiment of the present principles.
- the present invention is directed to a method and apparatus for multiple pass video encoding and decoding.
- the present invention corrects the blocking artifacts introduced by the DCT transform used in, e.g., the H.264 standard in very low bit rate applications.
- the present invention is not limited to solely low bit rate applications, but may be used for other (higher) bit rates as well, while maintaining the scope of the present invention.
- explicit use of the term "processor" or "controller" should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor ("DSP") hardware, read-only memory ("ROM") for storing software, random access memory ("RAM"), and non-volatile storage.
- any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
- any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function.
- the invention as defined by such claims resides in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
- a multiple pass video encoding and decoding scheme is provided.
- the multiple pass video encoding and decoding scheme may be used with Matching Pursuit.
- a two-pass H.264-based coding scheme is disclosed for Matching Pursuit video coding.
- the H.264 standard applies block-based motion compensation and DCT-like transform similar to other video compression standards.
- the DCT transform will introduce blocking artifacts, even with the use of de-blocking filters, because very few coefficients can be coded at very low bitrates, and each coefficient tends to have a very coarse quantization step.
- matching pursuit using an over-complete basis is applied to code the residual images.
- the motion compensation and mode decision parts are compatible with the H.264 standard.
- the overlapped block motion compensation (OBMC) is applied to smooth the predicted images.
- a new approach is provided for selecting a basis other than Matching Pursuit.
- a video encoder and/or decoder applies OBMC on predicted images to reduce the blocking artifacts caused by the prediction models.
- the Matching Pursuit algorithm is used to code the residual images.
- the advantage of Matching Pursuit is that it is not block-based, but frame-based, so there are no blocking artifacts caused by the coding residual difference.
- exemplary first and second pass portions of an encoder in a two-pass H.264 standard-based Matching Pursuit encoder/decoder are indicated generally by the reference numerals 110 and 160.
- the encoder is indicated generally by the reference numeral 190 and a decoder portion is indicated generally by the reference numeral 191.
- an input of the first pass portion 110 is connected in signal communication with a non-inverting input of a combiner 112, an input of an encoder control module 114, and a first input of a motion estimator 116.
- a first output of the combiner 112 is connected in signal communication with a first input of a buffer 118.
- a second output of the combiner 112 is connected in signal communication with an input of an integer transform/scaling/quantization module 120.
- An output of the integer transform/scaling/quantization module 120 is connected in signal communication with a first input of a scaling/inverse transform module 122.
- a first output of the encoder control module 114 is connected in signal communication with a first input of an intra-frame predictor 126.
- a second output of the encoder control module 114 is connected in signal communication with a first input of a motion compensator 124.
- a third output of the encoder control module 114 is connected in signal communication with a second input of the motion estimator 116.
- a fourth output of the encoder control module 114 is connected in signal communication with a second input of the scaling/inverse transform module 122.
- a fifth output of the encoder control module 114 is connected in signal communication with the first input of the buffer 118.
- An output of the motion estimator 116 is connected in signal communication with a second input of the motion compensator 124 and a second input of the buffer 118.
- An inverting input of the combiner 112 is selectively connected in signal communication with an output of the motion compensator 124 or an output of an intra-frame predictor 126.
- the selected output of either the motion compensator 124 or the intra-frame predictor 126 is connected in signal communication with a first input of a combiner 128.
- An output of the scaling/inverse transform module 122 is connected in signal communication with a second input of the combiner 128.
- An output of the combiner 128 is connected in signal communication with a second input of the intra-frame predictor 126, a third input of the motion estimator 116, and an input/output of the motion compensator 124.
- An output of the buffer 118 is available as an output of the first pass portion 110.
- the encoder control module 114, the integer transform/scaling/quantization module 120, the buffer 118, and the motion estimator 116 are included in the encoder 190.
- the scaling/inverse transform module 122, the intra-frame predictor 126, and the motion compensator 124 are included in the decoder portion 191.
- the input of the first pass portion 110 receives an input video 111, and stores in the buffer 118 control data (e.g., motion vectors, mode selections, predicted images, and so forth) for use in the second pass portion 160.
- a first input of the second pass portion 160 is connected in signal communication with an input of an entropy coder 166.
- the first input receives control data 162 (e.g., mode selections, and so forth) and motion vectors 164 from the first pass portion 110.
- a second input of the second pass portion 160 is connected in signal communication with a non-inverting input of a combiner 168.
- a third input of the second pass portion 160 is connected in signal communication with an input of an overlapped block motion compensation (OBMC)/deblocking module 170.
- the second input of the second pass portion 160 receives the input video 111, and the third input of the second pass portion receives predicted images 187 from the first pass portion 110.
- An output of the combiner 168, which provides a residual 172, is connected in signal communication with an input of an atom finder 174.
- An output of the atom finder 174, which provides a coded residual 178, is connected in signal communication with an input of an atom coder 176 and a first non-inverting input of a combiner 180.
- An output of the OBMC/deblocking module 170 is connected in signal communication with an inverting input of the combiner 168 and with a second non-inverting input of the combiner 180.
- An output of the combiner 180, which provides an output video, is connected in signal communication with an input of a reference buffer 182.
- An output of the atom coder 176 is connected in signal communication with the input of the entropy coder 166.
- An output of the entropy coder 166 is available as an output of the second pass portion 160, and provides an output bitstream.
- the entropy coder 166 is included in the encoder 190, and the combiner 168, the OBMC/deblocking module 170, the atom finder 174, the atom coder 176, and the reference buffer 182 are included in the decoder portion 191.
- an exemplary decoder in a two-pass H.264 standard-based Matching Pursuit encoder/decoder is indicated generally by the reference numeral 200.
- An input of the decoder 200 is connected in signal communication with an input of an entropy decoder 210.
- An output of the entropy decoder 210 is connected in signal communication with an input of an atom decoder 220 and an input of a motion compensator 250.
- An output of the inverse transform module 230, which provides residuals, is connected in signal communication with a first non-inverting input of a combiner 270.
- An output of the motion compensator 250 is connected in signal communication with an input of an OBMC/deblocking module 260.
- An output of the OBMC/deblocking module 260 is connected in signal communication with a second non-inverting input of the combiner 270.
- An output of the combiner 270 is available as an output of the decoder 200.
- the present principles are applicable to the ITU-T H.264/AVC coding system. Due to the frame-based residual coding, we apply OBMC on predicted images, which is not implemented in the H.264/AVC codec.
- a first pass in a video encoding scheme is compatible with the H.264 standard. There is no actual coding in the first pass. All the control data, such as, for example, mode selections, predicted images and motion vectors, are saved into a buffer for the second pass. The DCT transform is still applied in the first pass for motion compensation and mode selections using Rate Distortion Optimization (RDO). Instead of coding the residue image using DCT coefficients, all residual images are saved for the second pass.
- it is proposed to apply 16 x 16 constrained intra coding or H.264 standard compatible constrained intra coding and treat the boundary parts between intra coded and inter coded macroblocks specially.
- the motion vectors and control data may be coded by entropy coding.
- the residual images may be coded by Matching Pursuit.
- the atoms search and parameter coding may be performed, e.g., according to the prior art Gabor-based Matching Pursuit video coding approach.
- the reconstructed images are saved for reference frames.
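The division of work between the two passes can be illustrated with a simplified sketch. It is not an H.264-compatible encoder: the full-search block matching, the dictionary-style buffer, and the `smooth` and `decompose` callables are hypothetical stand-ins for the motion estimation, the stored control data, the OBMC/deblocking step, and the Matching Pursuit coder described above.

```python
import numpy as np


def first_pass(frame, reference, block=8, search=2):
    """Pass 1: block motion estimation only; results are buffered, nothing is coded."""
    frame = np.asarray(frame, dtype=float)
    reference = np.asarray(reference, dtype=float)
    h, w = frame.shape
    mvs = np.zeros((h // block, w // block, 2), dtype=int)
    predicted = np.zeros_like(frame)
    for by in range(h // block):
        for bx in range(w // block):
            y0, x0 = by * block, bx * block
            target = frame[y0:y0 + block, x0:x0 + block]
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    ry, rx = y0 + dy, x0 + dx
                    if ry < 0 or rx < 0 or ry + block > h or rx + block > w:
                        continue  # candidate block falls outside the reference
                    cand = reference[ry:ry + block, rx:rx + block]
                    sad = np.abs(target - cand).sum()
                    if best is None or sad < best[0]:
                        best = (sad, dy, dx, cand)
            mvs[by, bx] = best[1], best[2]
            predicted[y0:y0 + block, x0:x0 + block] = best[3]
    return {"motion_vectors": mvs, "predicted": predicted}


def second_pass(frame, buffer, smooth, decompose):
    """Pass 2: smooth the stored prediction, form the residual, and decompose it."""
    prediction = smooth(buffer["predicted"])
    residual = np.asarray(frame, dtype=float) - prediction
    atoms, coded_residual = decompose(residual)
    reconstructed = prediction + coded_residual  # saved as a reference frame
    return atoms, reconstructed


rng = np.random.default_rng(0)
reference = rng.random((16, 16))
frame = np.roll(reference, 1, axis=1)  # content shifted right by one sample
buf = first_pass(frame, reference)
atoms, recon = second_pass(frame, buf, smooth=lambda p: p,
                           decompose=lambda r: ([], r))  # lossless stub
```

The stub `decompose` returns the residual unchanged, so the reconstruction equals the input; a real second pass would return the coded atoms and the approximated residual instead.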
- One of the benefits of Matching Pursuit video coding is that Matching Pursuit is not block-based, so there are no blocking artifacts. However, when the motion prediction is performed on a block basis and is inaccurate, it still gives rise to some blocking artifacts at very low bit rates. Simulations have shown that the atoms appear at the moving contours and in the areas where the motion vectors (MVs) are not very accurate. Improving the motion estimation lets the atoms represent the residuals better.
- one method involves using an H.264-like or improved deblocking filter to smooth the blocky boundary in a predictive image.
- another method involves a smoother motion model using overlapping blocks (OBMC).
- a 16 x 16 sine-squared window has been adopted.
- the N x N sine-squared window may be defined, e.g., in accordance with the prior art hybrid coding scheme.
- the 16x16 sine-squared window is designed for 8x8 blocks, and 16x16 blocks are treated as four 8x8 blocks.
- partitions with luma block size 16x16, 16x8, 8x16, and 8x8 samples are supported.
- the 8x8 partition is further partitioned into partitions of 8x4, 4x8 or 4x4 luma samples and corresponding chroma samples.
- four approaches are proposed to deal with more partition types. The first approach is to use an 8x8 sine-squared window for 4x4 partitions. For all other partitions above 4x4, divide those partitions into several 4x4 partitions.
- the second approach is to use a 16x16 sine-squared window for 8x8 and above partitions, but it does not touch smaller partitions than 8x8.
- the third approach is to use adaptive OBMC for all partitions. These three approaches implement only OBMC, without deblocking filters; the fourth approach is to combine OBMC with one or more deblocking filters.
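A fixed-window variant of these approaches can be sketched as follows. This is an illustration under stated assumptions, not the method of this document: it uses integer motion vectors, one vector per square block, the assumed window profile sin²(π(n + 1/2)/N), and a weight normalization at the frame border; the function name `obmc_predict` is hypothetical.

```python
import numpy as np


def obmc_predict(reference, motion_vectors, block=8):
    """Overlapped block motion compensation with a (2*block) x (2*block) window.

    reference: 2-D array (H, W) with H and W multiples of `block`.
    motion_vectors: integer array (H // block, W // block, 2) of (dy, dx) per block.
    """
    reference = np.asarray(reference, dtype=float)
    h, w = reference.shape
    win = 2 * block
    n = np.arange(win)
    w1d = np.sin(np.pi * (n + 0.5) / win) ** 2  # assumed sine-squared profile
    window = np.outer(w1d, w1d)

    acc = np.zeros((h, w))
    weight = np.zeros((h, w))
    half = block // 2
    for by in range(h // block):
        for bx in range(w // block):
            dy, dx = motion_vectors[by, bx]
            # The window covers the block plus a half-block margin on each side.
            ys = np.arange(by * block - half, by * block - half + win)
            xs = np.arange(bx * block - half, bx * block - half + win)
            valid_y = (ys >= 0) & (ys < h)
            valid_x = (xs >= 0) & (xs < w)
            ys_in, xs_in = ys[valid_y], xs[valid_x]
            # Fetch displaced reference samples, clamping at the frame border.
            ry = np.clip(ys_in + dy, 0, h - 1)
            rx = np.clip(xs_in + dx, 0, w - 1)
            patch = reference[np.ix_(ry, rx)]
            wpatch = window[np.ix_(valid_y, valid_x)]
            acc[np.ix_(ys_in, xs_in)] += wpatch * patch
            weight[np.ix_(ys_in, xs_in)] += wpatch
    return acc / weight  # normalization only matters near the border


ref = np.arange(16 * 24, dtype=float).reshape(16, 24)
zero_mv = np.zeros((2, 3, 2), dtype=int)
shift_mv = np.zeros((2, 3, 2), dtype=int)
shift_mv[..., 1] = 1  # every block points one sample to the right
pred_zero = obmc_predict(ref, zero_mv)
pred_shift = obmc_predict(ref, shift_mv)
```

Calling the function with `block=8` corresponds to a 16x16 window on 8x8 blocks, and `block=4` to an 8x8 window on 4x4 partitions; when all vectors agree, the overlapped prediction reduces to ordinary motion compensation.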
- Each subband represents one direction and acts as an edge detector.
- the 2-D DDWT achieves higher PSNR with the same retained coefficients compared to the standard 2-D DWT. Thus, it is more suitable to code the edge information.
- After applying OBMC on the predicted images, the error images will have smoother edges. Parametric over-complete 2-D dictionaries may be used to provide smoother edges.
- an exemplary method for encoding an input video sequence is indicated generally by the reference numeral 300.
- the method 300 includes a start block 305 that passes control to a decision block 310.
- the decision block 310 determines whether or not the current frame is an I-frame. If so, then control is passed to a function block 355. Otherwise, control is passed to a function block 315.
- the function block 355 performs H.264 standard compatible frame coding to provide an output bitstream, and passes control to an end block 370.
- the function block 315 performs H.264 standard compatible motion compensation, and passes control to a function block 320.
- the function block 320 saves the motion vectors (MVs), control data, and predicted blocks, and passes control to a decision block 325.
- the decision block 325 determines whether or not the end of the frame has been reached. If so, then control is passed to a function block 330. Otherwise, control is returned to the function block 315.
- the function block 330 performs OBMC and/or deblocking filtering on the predicted images, and passes control to a function block 335.
- the function block 335 obtains a residue image from the original and predicted images, and passes control to a function block 340.
- the function block 340 codes a residual using Matching Pursuit, and passes control to a function block 345.
- the function block 345 performs entropy coding to provide an output bitstream, and passes control to the end block 370.
- the method 400 includes a start block 405 that passes control to a decision block 410.
- the decision block 410 determines whether or not the current frame is an I-frame. If so, then control is passed to a function block 435. Otherwise, control is passed to a function block 415.
- the function block 435 performs H.264 standard compatible decoding to provide a reconstructed image, and passes control to an end block 470.
- the function block 415 decodes the motion vectors, control data, and the Matching Pursuit atoms, and passes control to a function block 420 and a function block 425.
- the function block 420 reconstructs the residue image using decoded atoms, and passes control to a function block 430.
- the function block 425 reconstructs the predicted images by decoding motion vectors and other control data and applying OBMC and/or deblocking filtering, and passes control to the function block 430.
- the function block 430 combines the reconstructed residue image and the reconstructed predicted images to provide a reconstructed image, and passes control to the end block 470.
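The reconstruction performed by function blocks 420, 425, and 430 can be sketched as below. The atom format (dictionary index plus coefficient), the flattened dictionary layout, and the function names are assumptions made for the example; entropy decoding and motion compensation are taken as already done.

```python
import numpy as np


def reconstruct_residual(atoms, dictionary, shape):
    """Block 420: sum decoded atoms, given as (index, coefficient), into a residue image."""
    residual = np.zeros(shape, dtype=float)
    for index, coeff in atoms:
        residual += coeff * dictionary[index].reshape(shape)
    return residual


def decode_frame(atoms, dictionary, predicted, smooth=lambda img: img):
    """Blocks 425 and 430: smooth the predicted image, then add the residue image."""
    residual = reconstruct_residual(atoms, dictionary, predicted.shape)
    return smooth(np.asarray(predicted, dtype=float)) + residual


# Toy dictionary: each atom is a single sample of a 2x2 image.
toy_dictionary = np.eye(4)
decoded = decode_frame([(0, 2.0), (3, -1.0)], toy_dictionary, np.ones((2, 2)))
```

The `smooth` argument is where OBMC and/or deblocking filtering of the predicted image would be plugged in; the identity default keeps the example short.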
- one advantage/feature is a video encoder for encoding video signal data using a multiple-pass video encoding scheme, wherein the video encoder includes a motion estimator and a decomposition module.
- the motion estimator performs motion estimation on the video signal data to obtain a motion residual corresponding to the video signal data in a first encoding pass.
- the decomposition module, in signal communication with the motion estimator, decomposes the motion residual in a subsequent encoding pass.
- the video encoder as described above, wherein the multiple-pass video coding scheme is a two-pass video encoding scheme.
- the video encoder further includes a buffer, in signal communication with the motion estimator and the decomposition module, for storing the motion residual obtained in the first encoding pass for subsequent use in a second encoding pass.
- the decomposition module decomposes the motion residual using a redundant Gabor dictionary set in the second encoding pass.
- Yet another advantage/feature is the video encoder using the two-pass video encoding scheme as described above, wherein the motion estimator performs the motion estimation and coding-mode selection in compliance with the International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 standard in the first encoding pass.
- Still another advantage/feature is the video encoder using the two-pass video encoding scheme as described above, wherein the video encoder further includes a prediction module and an overlapped block motion compensator.
- the prediction module, in signal communication with the buffer, forms a predicted image corresponding to the video signal data in the first encoding pass.
- the overlapped block motion compensator, in signal communication with the buffer, performs overlapped block motion compensation (OBMC) on the predicted image using a 16x16 sine-squared window to smooth the predicted image in the second encoding pass.
- Another advantage/feature is the video encoder using the two-pass video encoding scheme as described above, wherein the video encoder further includes a prediction module and an overlapped block motion compensator.
- The prediction module, in signal communication with the buffer, forms a predicted image corresponding to the video signal data in the first encoding pass.
- The overlapped block motion compensator, in signal communication with the buffer, performs overlapped block motion compensation (OBMC) on only 8x8 and greater partitions of the predicted image in the second encoding pass.
- Another advantage/feature is the video encoder using the two-pass video encoding scheme as described above, wherein the video encoder further includes a prediction module and an overlapped block motion compensator.
- The prediction module, in signal communication with the buffer, forms a predicted image corresponding to the video signal data in the first encoding pass.
- The overlapped block motion compensator, in signal communication with the buffer, performs overlapped block motion compensation (OBMC) using an 8x8 sine-square window for 4x4 partitions of the predicted image in the second encoding pass. All partitions of the predicted image are divided into 4x4 partitions when OBMC is performed in the second encoding pass.
- The buffer stores the predicted image therein in the first encoding pass for subsequent use in the second encoding pass.
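Dividing every partition into 4x4 partitions amounts to resampling the per-partition motion vectors onto a uniform 4x4 grid before the windowed blending is applied. The helper below is a hypothetical sketch of that step; the tuple layout `(y, x, height, width, (dy, dx))` and the name `split_to_4x4` are assumptions made for illustration, and partition sizes are assumed to be multiples of 4 as in H.264.

```python
import numpy as np

def split_to_4x4(partitions, height, width):
    """Expand variable-size partitions into one motion vector per 4x4 block.

    partitions: iterable of (y, x, h, w, (dy, dx)), all in pixels.
    Returns an int array of shape (height // 4, width // 4, 2).
    """
    grid = np.zeros((height // 4, width // 4, 2), dtype=int)
    for y, x, h, w, mv in partitions:
        # Every 4x4 block inside the partition inherits the partition's vector.
        grid[y // 4:(y + h) // 4, x // 4:(x + w) // 4] = mv
    return grid
```

With the motion field on a uniform grid, a single window size (8x8 for 4x4 blocks, assuming half overlap) can be used for the whole predicted image.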
- Another advantage/feature is the video encoder using the two-pass video encoding scheme as described above, wherein the video encoder further includes a prediction module and an overlapped block motion compensator.
- The prediction module, in signal communication with the buffer, forms a predicted image corresponding to the video signal data in the first encoding pass.
- The overlapped block motion compensator, in signal communication with the buffer, performs adaptive overlapped block motion compensation (OBMC) for all partitions of the predicted image in the second encoding pass.
- Another advantage/feature is the video encoder using the two-pass video encoding scheme as described above, wherein the video encoder further includes a prediction module and a deblocking filter.
- The prediction module, in signal communication with the buffer, forms a predicted image corresponding to the video signal data in the first encoding pass.
- The deblocking filter, in signal communication with the buffer, performs a deblocking operation on the predicted image in the second encoding pass.
- The buffer stores the predicted image therein in the first encoding pass for subsequent use in the second encoding pass.
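As a rough illustration of what a deblocking operation on the predicted image does, the sketch below low-pass filters the two samples on either side of each block boundary. This is a deliberately simplified stand-in and not the adaptive H.264 in-loop filter: the fixed 8-sample grid, the 3:1 filter taps, and the name `smooth_block_edges` are all assumptions made for this example.

```python
import numpy as np

def smooth_block_edges(image, block=8):
    """Soften discontinuities across block boundaries of a predicted image."""
    src = np.asarray(image, dtype=float)
    out = src.copy()
    h, w = src.shape
    # Vertical boundaries: pull the two neighbouring columns toward each other.
    for x in range(block, w, block):
        p, q = src[:, x - 1], src[:, x]
        out[:, x - 1] = (3 * p + q) / 4
        out[:, x] = (p + 3 * q) / 4
    # Horizontal boundaries, applied to the result of the vertical pass.
    for y in range(block, h, block):
        p, q = out[y - 1, :].copy(), out[y, :].copy()
        out[y - 1, :] = (3 * p + q) / 4
        out[y, :] = (p + 3 * q) / 4
    return out
```

Flat regions pass through unchanged, while a step across a block edge is halved, which reduces the blocky edges that would otherwise end up in the motion residual.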
- Another advantage/feature is the video encoder using the two-pass video encoding scheme as described above, wherein the decomposition module applies parametric over-complete 2-D dictionaries to decompose the motion residual in the second encoding pass.
- Another advantage/feature is a video decoder for decoding a video bitstream.
- The video decoder includes an entropy decoder, an atom decoder, an inverse transformer, a motion compensator, a deblocking filter, and a combiner.
- The entropy decoder decodes the video bitstream to obtain a decompressed video bitstream.
- The atom decoder, in signal communication with the entropy decoder, decodes decompressed atoms corresponding to the decompressed bitstream to obtain decoded atoms.
- The inverse transformer, in signal communication with the atom decoder, applies an inverse transform to the decoded atoms to form a reconstructed residual image.
- The motion compensator, in signal communication with the entropy decoder, performs motion compensation using motion vectors corresponding to the decompressed bitstream to form a reconstructed predicted image.
- The deblocking filter, in signal communication with the motion compensator, performs deblocking filtering on the reconstructed predicted image to smooth the reconstructed predicted image.
- The combiner, in signal communication with the inverse transformer and the deblocking filter, combines the reconstructed predicted image and the reconstructed residual image to obtain a reconstructed image.
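The last two decoder stages can be sketched as follows: the inverse transformer rebuilds the residual by summing scaled dictionary atoms at their decoded positions, and the combiner adds that residual to the smoothed predicted image. This is a minimal sketch under assumptions: the atom tuple layout `(coefficient, y, x, atom_index)`, the 8-bit output range, and the function names are hypothetical and not specified by the source.

```python
import numpy as np

def reconstruct_residual(atoms, dictionary, shape):
    """Sum decoded atoms into a residual image.

    atoms: iterable of (coefficient, y, x, atom_index);
    dictionary: sequence of 2-D atom patches.
    """
    residual = np.zeros(shape)
    for coeff, y, x, k in atoms:
        patch = dictionary[k]
        ph, pw = patch.shape
        # Clip the atom support to the frame.
        y1, x1 = min(y + ph, shape[0]), min(x + pw, shape[1])
        residual[y:y1, x:x1] += coeff * patch[:y1 - y, :x1 - x]
    return residual

def combine(predicted, residual):
    """Add the residual to the predicted image and clamp to 8-bit samples."""
    return np.clip(np.rint(predicted + residual), 0, 255).astype(np.uint8)
```

Entropy decoding, atom decoding, and motion compensation are omitted here; the sketch assumes their outputs (the atom list and the predicted image) are already available.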
- The teachings of the present invention are implemented as a combination of hardware and software.
- The software may be implemented as an application program tangibly embodied on a program storage unit.
- The application program may be uploaded to, and executed by, a machine comprising any suitable architecture.
- The machine is implemented on a computer platform having hardware such as one or more central processing units ("CPU"), a random access memory ("RAM"), and input/output ("I/O") interfaces.
- The computer platform may also include an operating system and microinstruction code.
- The various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU.
- Various other peripheral units may be connected to the computer platform, such as an additional data storage unit and a printing unit.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/310,757 US20100040146A1 (en) | 2006-09-22 | 2007-02-15 | Method and apparatus for multiple pass video coding and decoding |
KR1020157009755A KR20150047639A (ko) | 2006-09-22 | 2007-02-15 | Method and apparatus for multiple pass video coding and decoding |
KR1020147008589A KR20140059270A (ko) | 2006-09-22 | 2007-02-15 | Method and apparatus for multiple pass video coding and decoding |
JP2009529167A JP5529537B2 (ja) | 2006-09-22 | 2007-02-15 | Method and apparatus for multiple pass video coding and decoding |
BRPI0716540-4A BRPI0716540A2 (pt) | 2006-09-22 | 2007-02-15 | Method and apparatus for multiple pass video coding and decoding |
EP07750912A EP2070334A1 (fr) | 2006-09-22 | 2007-02-15 | Method and apparatus for multiple pass video coding and decoding |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
USPCT/US2006/037139 | 2006-09-22 | | |
US2006037139 | 2006-09-22 | | |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2008036112A1 (fr) | 2008-03-27 |
Family
ID=38521211
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2007/004110 WO2008036112A1 (fr) | Method and apparatus for multiple pass video coding and decoding | 2006-09-22 | 2007-02-15 |
Country Status (6)
Country | Link |
---|---|
EP (1) | EP2070334A1 (fr) |
JP (2) | JP5529537B2 (fr) |
KR (2) | KR20150047639A (fr) |
CN (2) | CN101518085A (fr) |
BR (1) | BRPI0716540A2 (fr) |
WO (1) | WO2008036112A1 (fr) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2013077650A1 (fr) * | 2011-11-23 | 2013-05-30 | 한국전자통신연구원 | Method and apparatus for decoding multi-view video |
KR101677406B1 (ko) * | 2012-11-13 | 2016-11-29 | 인텔 코포레이션 | Video codec architecture for next-generation video |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5699121A (en) * | 1995-09-21 | 1997-12-16 | Regents Of The University Of California | Method and apparatus for compression of low bit rate video signals |
US20030179825A1 (en) * | 2001-04-09 | 2003-09-25 | Shunchi Sekiguchi | Image encoding method and apparatus, image decoding method and apparatus, and image processing system |
WO2005002234A1 (fr) * | 2003-06-30 | 2005-01-06 | Koninklijke Philips Electronics, N.V. | Video coding in an overcomplete wavelet domain |
WO2005013201A1 (fr) * | 2003-08-05 | 2005-02-10 | Koninklijke Philips Electronics N.V. | Video encoding and decoding methods and corresponding devices |
US20060159359A1 (en) * | 2005-01-19 | 2006-07-20 | Samsung Electronics Co., Ltd. | Fine granularity scalable video encoding and decoding method and apparatus capable of controlling deblocking |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
ATE336763T1 (de) * | 2003-03-28 | 2006-09-15 | Digital Accelerator Corp | Overcomplete basis transform-based motion residual frame coding method and apparatus for video compression |
US7653133B2 (en) * | 2003-06-10 | 2010-01-26 | Rensselaer Polytechnic Institute (Rpi) | Overlapped block motion compression for variable size blocks in the context of MCTF scalable video coders |
US8059715B2 (en) * | 2003-08-12 | 2011-11-15 | Trident Microsystems (Far East) Ltd. | Video encoding and decoding methods and corresponding devices |
JP4191729B2 (ja) * | 2005-01-04 | 2008-12-03 | 三星電子株式会社 | Deblocking filtering method considering intra-BL mode, and multi-layer video encoder/decoder using the method |
2007
- 2007-02-15 CN CNA2007800349523A patent/CN101518085A/zh active Pending
- 2007-02-15 CN CN2012102958552A patent/CN102833544A/zh active Pending
- 2007-02-15 EP EP07750912A patent/EP2070334A1/fr not_active Withdrawn
- 2007-02-15 WO PCT/US2007/004110 patent/WO2008036112A1/fr active Application Filing
- 2007-02-15 JP JP2009529167A patent/JP5529537B2/ja not_active Expired - Fee Related
- 2007-02-15 BR BRPI0716540-4A patent/BRPI0716540A2/pt not_active IP Right Cessation
- 2007-02-15 KR KR1020157009755A patent/KR20150047639A/ko not_active Application Discontinuation
- 2007-02-15 KR KR1020097005789A patent/KR20090073112A/ko not_active Application Discontinuation
2012
- 2012-07-31 JP JP2012169948A patent/JP5639619B2/ja not_active Expired - Fee Related
Non-Patent Citations (2)
Title |
---|
BEIBEI WANG ET AL: "A Two Pass H.264-Based Matching Pursuit Video Coder", IMAGE PROCESSING, 2006 IEEE INTERNATIONAL CONFERENCE ON, IEEE, PI, October 2006 (2006-10-01), pages 3149 - 3152, XP031049345, ISBN: 1-4244-0480-0 * |
MARK R BANHAM ET AL: "A Selective Update Approach to Matching Pursuits Video Coding", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, vol. 7, no. 1, February 1997 (1997-02-01), XP011014341, ISSN: 1051-8215 * |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20230319323A1 (en) * | 2007-02-23 | 2023-10-05 | Xylon Llc | Video Coding With Embedded Motion |
US12034980B2 (en) * | 2007-02-23 | 2024-07-09 | Xylon Llc | Video coding with embedded motion |
JP2013520058A (ja) * | 2010-02-11 | 2013-05-30 | トムソン ライセンシング | Method for coding and for reconstruction of a block of an image sequence |
US10506254B2 (en) | 2013-10-14 | 2019-12-10 | Microsoft Technology Licensing, Llc | Features of base color index map mode for video and image coding and decoding |
US10582213B2 (en) | 2013-10-14 | 2020-03-03 | Microsoft Technology Licensing, Llc | Features of intra block copy prediction mode for video and image coding and decoding |
US11109036B2 (en) | 2013-10-14 | 2021-08-31 | Microsoft Technology Licensing, Llc | Encoder-side options for intra block copy prediction mode for video and image coding |
US10390034B2 (en) | 2014-01-03 | 2019-08-20 | Microsoft Technology Licensing, Llc | Innovations in block vector prediction and estimation of reconstructed sample values within an overlap area |
US10469863B2 (en) | 2014-01-03 | 2019-11-05 | Microsoft Technology Licensing, Llc | Block vector prediction in video and image coding/decoding |
US11284103B2 (en) | 2014-01-17 | 2022-03-22 | Microsoft Technology Licensing, Llc | Intra block copy prediction with asymmetric partitions and encoder-side search patterns, search ranges and approaches to partitioning |
US10542274B2 (en) | 2014-02-21 | 2020-01-21 | Microsoft Technology Licensing, Llc | Dictionary encoding and decoding of screen content |
US10785486B2 (en) | 2014-06-19 | 2020-09-22 | Microsoft Technology Licensing, Llc | Unified intra block copy and inter prediction modes |
US10812817B2 (en) | 2014-09-30 | 2020-10-20 | Microsoft Technology Licensing, Llc | Rules for intra-picture prediction modes when wavefront parallel processing is enabled |
US10659783B2 (en) | 2015-06-09 | 2020-05-19 | Microsoft Technology Licensing, Llc | Robust encoding/decoding of escape-coded pixels in palette mode |
US10986349B2 (en) | 2017-12-29 | 2021-04-20 | Microsoft Technology Licensing, Llc | Constraints on locations of reference blocks for intra block copy prediction |
Also Published As
Publication number | Publication date |
---|---|
KR20150047639A (ko) | 2015-05-04 |
JP5529537B2 (ja) | 2014-06-25 |
EP2070334A1 (fr) | 2009-06-17 |
JP5639619B2 (ja) | 2014-12-10 |
JP2010504689A (ja) | 2010-02-12 |
CN101518085A (zh) | 2009-08-26 |
JP2012235520A (ja) | 2012-11-29 |
KR20090073112A (ko) | 2009-07-02 |
CN102833544A (zh) | 2012-12-19 |
BRPI0716540A2 (pt) | 2012-12-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2008036112A1 (fr) | | Method and apparatus for multiple pass video coding and decoding |
JP6120896B2 (ja) | | Method and apparatus for sparsity-based de-artifact filtering for video encoding and decoding |
JP5801363B2 (ja) | | Apparatus and method for encoding and decoding, and computer program |
CN108028931B (zh) | | Method and apparatus for adaptive inter prediction for video coding |
US10743027B2 (en) | | Methods and apparatus for adaptive template matching prediction for video encoding and decoding |
EP1401211A2 (fr) | | Multi-resolution video coding and decoding |
EP3633996A1 (fr) | | Methods and apparatus for adaptive coding of motion information |
KR101482896B1 (ko) | | Optimized deblocking filter |
US20200244965A1 (en) | | Interpolation filter for an inter prediction apparatus and method for video coding |
US9736500B2 (en) | | Methods and apparatus for spatially varying residue coding |
US9277245B2 (en) | | Methods and apparatus for constrained transforms for video coding and decoding having transform selection |
US11265582B2 (en) | | In-loop filter apparatus and method for video coding |
JP5746193B2 (ja) | | Method and apparatus for efficient adaptive filtering for video encoding and decoding |
US20120263225A1 (en) | | Apparatus and method for encoding moving picture |
US20100040146A1 (en) | | Method and apparatus for multiple pass video coding and decoding |
KR20140059270A (ko) | | Method and apparatus for multiple pass video coding and decoding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | WWE | WIPO information: entry into national phase | Ref document number: 200780034952.3; Country of ref document: CN |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 07750912; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | WIPO information: entry into national phase | Ref document number: 1307/DELNP/2009; Country of ref document: IN |
| | WWE | WIPO information: entry into national phase | Ref document number: 12310757; Country of ref document: US |
| | WWE | WIPO information: entry into national phase | Ref document number: 1020097005789; Country of ref document: KR |
| | ENP | Entry into the national phase | Ref document number: 2009529167; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | REEP | Request for entry into the European phase | Ref document number: 2007750912; Country of ref document: EP |
| | WWE | WIPO information: entry into national phase | Ref document number: 2007750912; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: PI0716540; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20090309 |
| | WWE | WIPO information: entry into national phase | Ref document number: 1020157009755; Country of ref document: KR |