WO2008029346A2 - Décodage vidéo (Video decoding)

Décodage vidéo (Video decoding)

Info

Publication number
WO2008029346A2
WO2008029346A2 PCT/IB2007/053556
Authority
WO
WIPO (PCT)
Prior art keywords
quantification
module
frame
residual data
motion
Prior art date
Application number
PCT/IB2007/053556
Other languages
English (en)
Other versions
WO2008029346A3 (fr)
Inventor
Stephane Mutz
Philippe Durieux
Original Assignee
Nxp B.V.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nxp B.V. filed Critical Nxp B.V.
Publication of WO2008029346A2 publication Critical patent/WO2008029346A2/fr
Publication of WO2008029346A3 publication Critical patent/WO2008029346A3/fr

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/59 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H04N19/126 Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/423 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
    • H04N19/426 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements using memory downsizing methods
    • H04N19/428 Recompression, e.g. by spatial or temporal decimation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • This invention relates to a video decoder and to a method of operating such a video decoder.
  • The original moving pictures (the video) and the accompanying audio are encoded according to an agreed standard.
  • MPEG2 is a video and audio coding and compression algorithm that typically enables data rates from 2 Mb/s up to 10 Mb/s for standard resolution.
  • The MPEG2 standard defines three different types of frame data: an I-frame, a P-frame and a B-frame (respectively, an intra frame, a forward-predicted frame and a bi-directionally predicted frame).
  • Each frame to be coded is subdivided into a set of 16x16 pixel blocks called macroblocks (MB).
  • A search for a block of pixels with similar content is made in the reference frames (the I-frame and/or P-frame). Only the location of the motion predictors in the reference frames and the difference between the motion predictors and the content of the MB to be coded are transmitted (this difference is called the residue). In the case of MPEG2, this search is made with ½ pixel accuracy and up to 2 reference frames are used.
  • An MPEG2 decoder is therefore forced to keep the content of the reference frames in memory.
  • The decoder reads the blocks of pixels from the reference frames based on the motion vector information stored in the MPEG stream, filters them to restore the ½ pixel accuracy of the motion predictors, and combines them with the residue information to get the pixels of the original frame.
  • The reference frames are stored in a memory buffer. Therefore, the cost of implementing the decoder function depends on the size of the reference frames.
  • N is normally 1.
  • One method, disclosed in US 2002/0154696, involves resampling the video frame after the frame has been decompressed using an MPEG decoder and prior to storing the decompressed frame in memory. This method can reduce the memory footprint by a factor of four if the video frame is subsampled by a factor of two in the horizontal and vertical directions. This involves subsampling motion vectors by a factor of two, then upsampling fetched motion reconstruction data by a factor of two in the horizontal and vertical directions.
  • Frequency coefficients are dequantized and passed through an IDCT (inverse discrete cosine transform) module, which converts the coefficients back into spatial domain data.
  • The spatial domain data and the upsampled fetched motion reconstruction data are then summed by a summer.
  • The output of the summer is then subsampled by a factor of two in each direction.
  • A video decoder comprising a motion composition module, a reference frame buffer, an IQ/IDCT module, a quantification matrix store, and a summer, the motion composition module and the IQ/IDCT module arranged to receive and perform operations on an encoded video stream comprising residual data and motion vectors, the summer arranged to receive and perform operations on an output from the motion composition module and the IQ/IDCT module and to output a decoded video stream, wherein the IQ/IDCT module is arranged to subsample the residual data using scaled quantification matrices stored by the quantification matrix store, the reference frame buffer is arranged to store a corresponding subsampled reference frame, and the motion composition module includes a ¼ pixel interpolation filter.
  • A method of operating a video decoder comprising a motion composition module, a reference frame buffer, an IQ/IDCT module, a quantification matrix store, and a summer, the motion composition module including a ¼ pixel interpolation filter, the method comprising the steps of receiving an encoded video stream comprising residual data and motion vectors, subsampling the residual data at the IQ/IDCT module using scaled quantification matrices stored by the quantification matrix store, storing a corresponding subsampled reference frame in the reference frame buffer, and executing motion composition using the ¼ pixel interpolation filter.
  • This invention aims at enabling MPEG2 decoding with half-resolution reference frames by reusing the MPEG4 motion compensation mechanism.
  • The key advantages are a lower system cost due to downscaled reference frames (lower memory footprint and bandwidth) and very limited extra logic, as the most costly part is a direct reuse of the MPEG4 motion compensation unit.
  • The memory footprint and bandwidth required to decode high-definition video sequences have a significant impact on the cost of consumer equipment.
  • The present invention has large potential for application because high-definition sources are increasingly deployed, many consumer devices already support MPEG4 decoding (e.g. for DivX playback), many displays do not offer full-resolution HD rendering, and consumers will be interested in watching content distributed only in high definition without bearing the high cost associated with high-definition rendering.
  • An efficient solution is achieved by subsampling the reference pictures and the residues, for example by a factor of 2 in the horizontal and/or vertical direction, and rebuilding the motion predictors with the ¼ pixel interpolation filter of the MPEG4 motion compensation process, thereby restoring the ½ pixel accuracy of the MPEG2 motion predictors from the subsampled reference frames.
  • The only addition made to an MPEG2/MPEG4-capable decoder is the residue coefficient scaling step.
  • The motion vectors are divided by a factor of two. This division means that each vector has quarter-pixel resolution, and the motion composition module re-uses the MPEG4 quarter-pixel interpolation filter.
  • In the prior art, there is a simple division by two of the motion vectors, which results in a loss of information.
  • The video decoder of the prior art system is configured to use a simple MPEG2 bi-linear interpolation filter that operates only with ½ pixel accuracy.
  • In the present invention there is no loss of motion accuracy, as the video decoder works with quarter-pixel-accuracy motion vectors. In this way, there is a very large reduction in the loss induced by the reference frame sub-sampling.
  • The video decoder of the present invention works with quarter-pixel-accuracy motion vectors and interpolation filters (as defined in MPEG4).
  • The IQ/IDCT module, when subsampling the residual data using scaled quantification matrices stored by the quantification matrix store, subsamples the residual data by half-horizontal downsampling. Additionally or alternatively, the IQ/IDCT module, when subsampling the residual data using scaled quantification matrices stored by the quantification matrix store, subsamples the residual data by half-vertical downsampling.
  • The video decoder further comprises a scaling function, wherein the quantification matrix store is arranged to generate the scaled quantification matrices by scaling the quantification matrices with a vector.
  • A scaling function can be used for subsampling in the frequency domain in the IQ/IDCT module, which results in the quantization tables that are being used being scaled. High-frequency cut-off is performed by downscaling the quantization tables at the higher frequencies. With this process, no subsampling filter is required, and a simple decimation is applied to get a 4x8 block at the output of the IQ/IDCT module.
  • Figure 1 is a schematic diagram illustrating the MPEG2 video standard.
  • Figure 2 is a schematic diagram of a prior art MPEG2 video decoder.
  • Figure 3 is a schematic diagram of an MPEG2 video decoder, according to an embodiment of the invention.
  • Figure 4 is a further schematic diagram of the MPEG2 video decoder of Figure 3.
  • Figure 5 is a flowchart of a method of operating a video decoder.
  • Figure 1 shows an MPEG2 video stream 10, which consists of a series of successive frames 12.
  • The data that is transmitted for each frame 12 comprises residual data and motion vectors, although for each I-frame only residual data is transmitted; there are no motion vectors.
  • MPEG2 decreases the amount of data that is to be transmitted by using the motion vectors to indicate portions of other frames that contain the image data to be used.
  • In the example of Figure 1, which is a "talking head" against a static background, once the first frame (an I-frame) has been transmitted, much of that frame can be recycled for use in the later frames.
  • Figure 2 illustrates an example of a conventional MPEG2 video decoder 8, which receives the video stream 10.
  • The video decoder 8 comprises a motion composition module 16, a reference frame buffer 18, an IQ/IDCT module 20, a quantification matrix store 22, and a summer 24.
  • The video stream 10 is received by an MPEG2 VLD 26 (variable length decoder), which separates the residual data 28 and the motion vectors 30 for each frame 12.
  • The motion composition module 16 and the IQ/IDCT module 20 are arranged to receive and perform operations on the encoded video stream 10, which comprises the residual data 28 and the motion vectors 30.
  • The summer 24 is arranged to receive and perform operations on an output from the motion composition module 16 and the IQ/IDCT module 20 and to output a decoded video stream 32.
  • The summer 24 simply adds together the outputs of the modules 16 and 20 to create an entire frame. That frame is used in the rendering of the video stream 10, but is also transmitted via a feedback loop 34 to the reference frame buffer 18.
  • The reference frame buffer 18 stores complete frames that are then used when P-frames and B-frames are received. These frames 12 refer to data within one or more other frames 12 (via their respective motion vectors 30).
  • In the example of Figure 1, the fourth frame is a P-frame that refers back to the original (first received) I-frame, which will be stored in the reference frame buffer 18.
  • The motion vectors for the later P-frame will define one or more portions of the I-frame which are to be re-used when the decoder is reconstructing that individual frame.
  • The motion composition module 16 will be instructed via the vectors 30 which parts of the I-frame to retrieve from the reference frame buffer 18, and these are then passed to the summer 24 to be added to the residual data for that frame to recreate the original frame.
  • That frame is used in the rendering of the video stream 10, but is also transmitted via a feedback loop 34 to the reference frame buffer 18, as other frames 12 may refer to that P-frame. Indeed, in the sequence of eight frames 12 shown forming the video stream 10 in Figure 1, six other frames refer to data within the fourth P-frame 12.
  • Figure 3 shows a first embodiment of an improved MPEG2 video decoder 14, according to an example of the invention.
  • This decoder 14 has the same functional components as the prior art decoder 8 of Figure 2, but the operation of the decoder 14 of Figure 3 is substantially amended, and some of the functioning of the individual components is changed, to provide an improved but still effective video decoder that has a greatly reduced storage requirement for the reference frame buffer 18.
  • The IQ/IDCT module 20 is arranged to subsample the residual data 28 using scaled quantification matrices stored by the quantification matrix store 22. These scaled matrices can be generated by reducing the weight of the high-frequency coefficients inside the standard MPEG quantification matrices, to generate a set of matrices that will scale the residual data as it is reconstituted by the IQ/IDCT module 20.
  • Scaling is applied in the horizontal and/or vertical direction.
  • The scaled quantification matrices are designed to produce a half-horizontal downsampling of the data, producing a 4x8 block of pixel data instead of the usual 8x8 block of residual data that would be produced by the MPEG2 decoder 8 of Figure 2.
  • Downsampling could occur in the vertical direction at the same time, reducing the data, for example, by a factor of two in both directions (x and y).
  • The extent of the downsampling is a design choice.
  • This subsampling process can be performed by different means, but a straightforward method used here is to apply a vector to the residue coefficients before the IQ/IDCT step, as a scaling function. This processing can be achieved with minimal resources by multiplying the quantification matrices by a constant matrix to scale down the high-frequency components (an illustrative numerical sketch of this scaling-and-decimation step is given after this section).
  • Figure 3 illustrates the operation of the new decoder 14 when an I-frame is received. As discussed above, an I-frame has no motion vectors in the frame data received in the video stream 10 for that specific frame.
  • An I-frame consists purely of residual data, which defines all of the pixels for that frame.
  • The motion composition module 16 does not receive any motion vectors 30, and does not execute any processing, when an I-frame is received.
  • The summer 24 produces an 8x16 block of the original picture, which is rendered and also stored. There are 4 luminance DCT blocks per macroblock, so an 8x16 macroblock is obtained from four 4x8 blocks.
  • When the frame has passed through the summer 24 to be rendered, it is transmitted, via the feedback loop 34, to the reference frame buffer 18, which is arranged to store the corresponding subsampled reference frame. All of the reference frames that are to be stored in the reference frame buffer 18 are reduced in size.
  • The frame data has been subsampled by the IQ/IDCT module 20 during the IQ/IDCT stage of the MPEG2 decode.
  • The video decoder 14 of Figure 3 is configured to operate in such a way that the decoding of MPEG2 video streams is achieved with a reduced amount of memory and reduced memory bandwidth compared with a typical decoder.
  • Figure 4 shows the embodiment of the improved MPEG2 video decoder 14, according to an example of the invention, when the decoder 14 has received a P-frame.
  • This P-frame will have been received following an I-frame, which has been stored in the reference frame buffer 18 in its scaled format.
  • Received frames that are not I-frames cannot be decoded until an I-frame has been received by the decoder. For example, if a user turns on a decoder (for example by turning on a television), then no picture will be rendered until an I-frame has been received. The same is true when a user changes channel on a digital television that uses MPEG2.
  • The residue data 28 is handled in exactly the same way as the residue data of an I-frame, as described above with reference to Figure 3.
  • The data of the P-frame is scaled according to the filtered quantification matrices stored in the store 22 and passed to the summer 24.
  • The motion vectors 30 are passed to the motion composition module 16, where they are scaled by a factor of two to match the spatial resolution of the reference frames stored by the buffer 18.
  • The motion composition module 16 includes a ¼ pixel interpolation filter.
  • The motion predictor reconstruction can be performed by reusing an MPEG4 module of the motion compensation unit or code. Motion vectors with ¼ pixel accuracy are derived from the original MPEG2 motion vectors by a simple division by a factor of 2, to match the reduced resolution of the subsampled reference frames.
  • The motion predictors are computed using the sophisticated 8-tap MPEG4 ¼ pixel filter and then used in the classical MPEG2 decoding process (an illustrative quarter-pel fetch is sketched after this section).
  • The result of this process is a decoded picture downscaled by a factor of two in the horizontal and/or vertical direction.
  • This invention provides an improved implementation of a decoder that uses downscaled reference frames. This is achieved by reusing existing components used for MPEG4 decoding, thereby lowering the implementation cost.
  • Figure 5 shows a flowchart of the method of operating the video decoder 14.
  • The method comprises the steps of firstly receiving 510 the encoded video stream 10 (which comprises the residual data 28 and motion vectors 30).
  • The residual data 28 for each frame 12 passes to step 512, where subsampling of the residual data 28 by the IQ/IDCT module 20, using scaled quantification matrices stored by the quantification matrix store 22, takes place.
  • The motion vectors 30 for each frame 12 are passed to step 514, where the motion composition module 16 executes motion composition using the ¼ pixel interpolation filter. This portion of the method occurs only for P-frames and B-frames of the MPEG2 video decode, as discussed above.
  • The next step 516 is the summing of the two portions to create the finished frame for rendering.
  • Stage 518 comprises the step of storing a corresponding subsampled reference frame in the reference frame buffer 18, for use as required in the motion reconstruction of other frames.
  • The purpose of the invention is to reduce the cost of implementation by creating and using smaller reference frames in the decoding process. This lowers both the required memory footprint and the amount of memory traffic required to decode the original picture.
  • A cheaper and more efficient way of implementing the decoder is achieved by reusing part of the MPEG4 decoding process, making this possible at minimal cost.
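
The following is a minimal numerical sketch, in Python with NumPy, of the scaling-and-decimation step referred to above. It is not taken from the patent: the quantisation matrix, quantiser scale and inverse-quantisation formula are illustrative placeholders rather than the normative MPEG2 tables, and the hard cut-off of the four highest horizontal frequencies is just one possible constant weighting matrix. It shows how dequantising with a scaled matrix followed by a simple column decimation yields a 4-wide by 8-high block at the IQ/IDCT output.

    import numpy as np

    N = 8

    def dct_matrix(n=N):
        # Orthonormal DCT-II basis matrix C, so that f = C.T @ F @ C inverts a 2-D DCT.
        k = np.arange(n).reshape(-1, 1)
        m = np.arange(n).reshape(1, -1)
        C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
        C[0, :] = np.sqrt(1.0 / n)
        return C

    C = dct_matrix()

    # Illustrative 8x8 quantisation matrix and quantiser scale
    # (placeholder values, not the normative MPEG2 tables).
    Q = np.full((N, N), 16.0)
    quant_scale = 2.0

    # Constant weighting matrix: suppress the four highest horizontal frequencies
    # so the reconstructed block can be decimated horizontally without aliasing.
    W = np.ones((N, N))
    W[:, N // 2:] = 0.0          # hard cut-off; a gentler roll-off also works
    Q_scaled = Q * W             # the "scaled quantification matrix" kept in the store

    def iq_idct_half_width(levels):
        """Dequantise with the scaled matrix, inverse transform, decimate to 8 rows x 4 columns."""
        coeffs = levels * Q_scaled * quant_scale / 16.0   # simplified inverse quantisation
        spatial = C.T @ coeffs @ C                        # full 8x8 IDCT
        return spatial[:, ::2]                            # keep every second column

    levels = np.zeros((N, N))
    levels[0, 0] = 64                                     # toy DC-only residual block
    print(iq_idct_half_width(levels).shape)               # (8, 4)

As a rough indication of why this matters (again, illustrative figures only): a 1920x1088 4:2:0 reference frame occupies about 1920 x 1088 x 1.5 bytes, roughly 3.1 MB, so the two reference frames an MPEG2 decoder keeps cost about 6.3 MB. Halving both dimensions cuts this to about 1.6 MB, with a corresponding reduction in the bandwidth spent on motion-compensation fetches, while halving only one dimension gives half of that saving.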

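The quarter-pel motion composition step can be pictured with the sketch below, which is also not from the patent: bilinear interpolation is used purely as a stand-in for the MPEG4 8-tap quarter-pel filter named in the text, and the function and variable names are illustrative. The example shows how an MPEG2 half-pel vector keeps the same integer value when re-interpreted in quarter-pel units of the half-size reference, which corresponds to the division of the pixel displacement by two described above.

    import numpy as np

    def quarter_pel_predict(ref, mvx_qpel, mvy_qpel, x, y, w=8, h=8):
        """Fetch a w x h motion predictor from a subsampled reference frame at
        quarter-pel accuracy.  Bilinear interpolation here stands in for the
        MPEG4 8-tap filter used by the decoder described above."""
        ix, fx = mvx_qpel >> 2, (mvx_qpel & 3) / 4.0   # integer and fractional parts
        iy, fy = mvy_qpel >> 2, (mvy_qpel & 3) / 4.0
        x0, y0 = x + ix, y + iy
        patch = ref[y0:y0 + h + 1, x0:x0 + w + 1].astype(np.float64)
        top = (1 - fx) * patch[:h, :w] + fx * patch[:h, 1:w + 1]
        bottom = (1 - fx) * patch[1:h + 1, :w] + fx * patch[1:h + 1, 1:w + 1]
        return (1 - fy) * top + fy * bottom

    # An MPEG2 half-pel vector of +5 (a displacement of +2.5 full-resolution pixels)
    # becomes +1.25 pixels of the half-size reference, i.e. +5 in quarter-pel units.
    ref = np.arange(64 * 64, dtype=np.float64).reshape(64, 64)   # toy half-size reference
    pred = quarter_pel_predict(ref, mvx_qpel=5, mvy_qpel=5, x=8, y=8)
    print(pred.shape)   # (8, 8)
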
Abstract

A video decoder comprising a motion composition module, a reference frame buffer, an IQ/IDCT module, a quantification matrix store and a summer. The motion composition module and the IQ/IDCT module are arranged to receive and perform operations on an encoded video stream comprising residual data and motion vectors. The summer is arranged to receive and perform operations on an output from the motion composition module and the IQ/IDCT module and to output a decoded video stream. The IQ/IDCT module is arranged to subsample the residual data using scaled quantification matrices stored by the quantification matrix store. The reference frame buffer is arranged to store a corresponding subsampled reference frame, and the motion composition module includes a ¼ pixel interpolation filter.
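
Read as an architecture, the abstract describes the data flow sketched below. This is a toy skeleton with placeholder arithmetic and invented names, not an implementation of the patent: subsampled residuals from an IQ/IDCT stage and quarter-pel motion predictors fetched from a half-size reference frame are combined by the summer, and the decoded frame is fed back into the reference frame buffer.

    import numpy as np

    class HalfResMpeg2Decoder:
        """Toy skeleton of the claimed structure; every numerical detail is a placeholder."""

        def __init__(self, width, height):
            # Subsampled reference frame buffer (half size in both directions here).
            self.ref = np.zeros((height // 2, width // 2))

        def iq_idct(self, levels):
            # Stand-in for dequantisation with scaled matrices, IDCT and decimation.
            return levels[:, ::2].astype(float)

        def motion_composition(self, mv_qpel, y, x, h, w):
            # Stand-in for the quarter-pel interpolation filter: integer-pel fetch only.
            dy, dx = mv_qpel[0] // 4, mv_qpel[1] // 4
            return self.ref[y + dy:y + dy + h, x + dx:x + dx + w]

        def decode_p_block(self, levels, mv_qpel, y, x):
            residual = self.iq_idct(levels)              # e.g. 8x8 levels -> 8x4 residual
            h, w = residual.shape
            predictor = self.motion_composition(mv_qpel, y, x, h, w)
            return residual + predictor                  # the summer

        def store_reference(self, decoded_frame):
            self.ref = decoded_frame                     # feedback loop to the reference buffer

    dec = HalfResMpeg2Decoder(64, 64)
    block = dec.decode_p_block(np.zeros((8, 8)), mv_qpel=(4, 4), y=0, x=0)
    print(block.shape)                                   # (8, 4)
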
PCT/IB2007/053556 2006-09-06 2007-09-04 Décodage vidéo WO2008029346A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP06120209.9 2006-09-06
EP06120209 2006-09-06

Publications (2)

Publication Number Publication Date
WO2008029346A2 true WO2008029346A2 (fr) 2008-03-13
WO2008029346A3 WO2008029346A3 (fr) 2008-08-14

Family

ID=39157653

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2007/053556 WO2008029346A2 (fr) 2006-09-06 2007-09-04 Décodage vidéo

Country Status (1)

Country Link
WO (1) WO2008029346A2 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5262854A (en) * 1992-02-21 1993-11-16 Rca Thomson Licensing Corporation Lower resolution HDTV receivers
US20020154696A1 (en) * 2001-04-23 2002-10-24 Tardif John A. Systems and methods for MPEG subsample decoding

Also Published As

Publication number Publication date
WO2008029346A3 (fr) 2008-08-14


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07826252

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07826252

Country of ref document: EP

Kind code of ref document: A2