WO2001005159A1 - Downconverting decoder for interlaced pictures

Info

Publication number: WO2001005159A1
Authority: WIPO (PCT)
Application number: PCT/US1999/015254
Other languages: French (fr)
Inventor: Mark Fimoff
Original Assignee: Zenith Electronics Corporation
Application filed by Zenith Electronics Corporation
Priority to PCT/US1999/015254 and AU47320/99A

Classifications

    • H04N19/59 — Predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • H04N19/117 — Adaptive coding: filters, e.g. for pre-processing or post-processing
    • H04N19/122 — Selection of transform size, e.g. 8x8 or 2x4x8 DCT; selection of sub-band transforms of varying structure or type
    • H04N19/149 — Data rate or code amount at the encoder output, estimated by means of a model, e.g. mathematical or statistical model
    • H04N19/152 — Data rate or code amount at the encoder output, by measuring the fullness of the transmission buffer
    • H04N19/16 — Assigned coding mode for a given display mode, e.g. interlaced or progressive display mode
    • H04N19/176 — Adaptive coding in which the coding unit is an image region that is a block, e.g. a macroblock
    • H04N19/18 — Adaptive coding in which the coding unit is a set of transform coefficients
    • H04N19/44 — Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N19/48 — Compressed domain processing techniques other than decoding, e.g. modification of transform coefficients, variable length coding [VLC] data or run-length data
    • H04N19/61 — Transform coding in combination with predictive coding


Abstract

Received frame and field coded DCT coefficient blocks are downconverted to lower resolution reconstructed pixel field blocks so that, for example, HDTV programs can be played on NTSC receivers. The frame and field coded DCT coefficient blocks have motion vectors associated therewith. The received frame coded DCT coefficient blocks are converted to field coded DCT coefficient blocks, and an IDCT is performed on the converted field coded DCT coefficient blocks to produce downconverted field residual or pixel blocks. Also, an IDCT is performed directly on the received field coded DCT coefficient blocks to produce downconverted field residual or pixel blocks. Reference pixel blocks are selected based upon the motion vectors, upsampled, and then downsampled. The upsampled and downsampled reference pixel blocks form predictions that are added to the field residual blocks in order to form reconstructed pixel field blocks.

Description

DOWNCONVERTING DECODER FOR INTERLACED PICTURES
Technical Field of the Invention
The present invention relates to a
downconverting decoder for downconverting and decoding
high resolution encoded video for display by a lower
resolution receiver.
Background of the Invention
The international standard ISO/IEC 13818-2
(Generic Coding of Motion Pictures and Associated Audio
Information: Video) and the "Guide to the use of the
ATSC Digital Television Standard" describe a system,
known as MPEG-2, for encoding and decoding digital video
data. According to this system, digital video data is
encoded as a series of code words in a complicated manner
that causes the average length of the code words to be
much smaller than would be the case if, for example, each
pixel in every frame was coded as an eight bit value.
This type of encoding is also known as data compression.
The standard allows for encoding of video over
a wide range of resolutions, including higher resolutions
commonly known as HDTV. In MPEG-2, encoded pictures are
made up of pixels. Each 8x8 array of pixels is known as
a block, and a 2x2 array of blocks is known as a macroblock. Compression is achieved by using well known
techniques including (i) prediction (motion estimation in the encoder and motion compensation in the decoder), (ii)
two dimensional discrete cosine transform (DCT) which is
performed on 8x8 blocks of pixels, (iii) quantization of the resulting DCT coefficients, and (iv) Huffman and run/level coding. In MPEG-2 encoding, pictures which are
encoded without prediction are referred to as I pictures,
pictures which are encoded with prediction from previous
pictures are referred to as P pictures, and pictures which are encoded with prediction from both previous and
subsequent pictures are referred to as B pictures.
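The two dimensional DCT of step (ii) above can be sketched in a few lines of illustrative code. This is the standard separable DCT-II applied along both dimensions of a pixel block; the function names are illustrative, not part of the patent:

```python
import math

def dct_basis(n):
    # n-point DCT-II basis matrix T, with T[k][i] = c(k) * cos((2i + 1) * k * pi / (2n))
    rows = []
    for k in range(n):
        c = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([c * math.cos((2 * i + 1) * k * math.pi / (2 * n)) for i in range(n)])
    return rows

def dct2(block):
    # Two dimensional DCT of an n x n pixel block: X = T * x * T'
    n = len(block)
    t = dct_basis(n)
    # transform the columns: tmp[k][j] = sum_i T[k][i] * x[i][j]
    tmp = [[sum(t[k][i] * block[i][j] for i in range(n)) for j in range(n)] for k in range(n)]
    # transform the rows: X[k][l] = sum_j tmp[k][j] * T[l][j]
    return [[sum(tmp[k][j] * t[l][j] for j in range(n)) for l in range(n)] for k in range(n)]
```

For a constant 8x8 block, only the DC coefficient X[0][0] is nonzero, which is why smooth residuals compress well under the subsequent quantization and run/level coding.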
An MPEG-2 encoder 10 is shown in simplified
form in Figure 1. Data representing macroblocks of pixel
values are fed to both a subtractor 12 and a motion
estimator 14. In the case of P pictures and B pictures, the motion estimator 14 compares each new macroblock
(i.e., a macroblock to be encoded) with the macroblocks
in a reference picture previously stored in a reference
picture memory 16. The motion estimator 14 finds the
macroblock in the stored reference picture that most
closely matches the new macroblock.
The motion estimator 14 reads this matching
macroblock (known as a predicted macroblock) out of the
reference picture memory 16 and sends it to the
subtractor 12 which subtracts it, on a pixel by pixel
basis, from the new macroblock entering the MPEG-2
encoder 10. The output of the subtractor 12 is an error,
or residual, that represents the difference between the
predicted macroblock and the new macroblock being
encoded. This residual is often very small. The
residual is transformed from the spatial domain by a two
dimensional DCT 18. The DCT residual coefficients
resulting from the two dimensional DCT 18 are then
quantized by a quantization block 20 in a process that
reduces the number of bits needed to represent each
coefficient. Usually, many coefficients are effectively quantized to zero. The quantized DCT coefficients are
Huffman and run/level coded by a coder 22 which further
reduces the average number of bits per coefficient.
The motion estimator 14 also calculates a
motion vector (mv) which represents the horizontal and
vertical displacement of the predicted macroblock in the
reference picture from the position of the new macroblock
in the current picture being encoded. It should be noted
that motion vectors may have ½ pixel resolution which is
achieved by linear interpolation between adjacent pixels.
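The half pixel interpolation mentioned above can be sketched one dimensionally as follows. This is a hypothetical helper (the name and the half-pixel-unit convention for the motion vector component are assumptions for illustration):

```python
def half_pel_sample(row, mv_half_pels):
    # row: full resolution pixels along one line of the reference picture.
    # mv_half_pels: one motion vector component in units of half pixels.
    full, frac = divmod(mv_half_pels, 2)
    if frac == 0:
        return float(row[full])  # integer pixel position, no interpolation
    # halfway position: linear interpolation between the two adjacent pixels
    return (row[full] + row[full + 1]) / 2.0
```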
The data encoded by the coder 22 are combined with the
motion vector data from the motion estimator 14 and with
other information (such as an indication of whether the
picture is an I, P or B picture), and the combined data
are transmitted to a receiver that includes an MPEG-2
decoder 30.
For the case of P pictures, the quantized DCT
coefficients from the quantization block 20 are also
supplied to an internal loop that represents the
operation of the MPEG-2 decoder 30. Within this internal
loop, the residual from the quantization block 20 is inverse quantized by an inverse quantization block 24 and is inverse DCT transformed by an inverse discrete cosine
transform (IDCT) block 26. The predicted macroblock,
that is read out of the reference picture memory 16 and
that is supplied to the subtractor 12, is also added back to the output of the IDCT block 26 on a pixel by pixel
basis by an adder 28, and the result is stored back into the reference picture memory 16 in order to serve as a
macroblock of a reference picture for predicting
subsequent pictures. The object of this internal loop is to have the data in the reference picture memory 16 of the MPEG-2 encoder 10 match the data in the reference
picture memory of the MPEG-2 decoder 30. B pictures are
not stored as reference pictures. In the case of I pictures, no motion estimation
occurs and the negative input to the subtractor 12 is
forced to zero. In this case, the quantized DCT
coefficients provided by the two dimensional DCT 18
represent transformed pixel values rather than residual
values, as is the case with P and B pictures. As in the case of P pictures, decoded I pictures are stored as
reference pictures.
The MPEG-2 decoder 30 illustrated in Figure 2
is a simplified showing of an MPEG-2 decoder. The
decoding process implemented by the MPEG-2 decoder 30 can be thought of as the reverse of the encoding process
implemented by the MPEG-2 encoder 10. The received
encoded data is Huffman and run/level decoded by a Huffman and run/level decoder 32. Motion vectors and
other information are parsed from the data stream flowing through the Huffman and run/level decoder 32. The motion vectors are fed to a motion compensator 34. Quantized
DCT coefficients at the output of the Huffman and
run/level decoder 32 are fed to an inverse quantization
block 36 and then to an IDCT block 38 which transforms
the inverse quantized DCT coefficients back into the
spatial domain.
For P and B pictures, each motion vector is
translated by the motion compensator 34 to a memory
address in order to read a particular macroblock
(predicted macroblock) out of a reference picture memory 42 which contains previously stored reference pictures.
An adder 44 adds this predicted macroblock to the
residual provided by the IDCT block 38 in order to form
reconstructed pixel data. For I pictures, there is no
reference picture so that the prediction provided to the
adder 44 is forced to zero. For I and P pictures, the
output of the adder 44 is fed back to the reference
picture memory 42 to be stored as a reference picture for
future predictions.
The MPEG encoder 10 can encode sequences of
progressive or interlaced pictures. For sequences of
interlaced pictures, pictures may be encoded as field
pictures or as frame pictures. For field pictures, one
picture contains the odd lines of the raster, and the
next picture contains the even lines of the raster. All
encoder and decoder processing is done on fields. Thus,
the DCT transform is performed on 8x8 blocks that contain
all odd or all even numbered lines. These blocks are
referred to as field DCT coded blocks.
On the other hand, for frame pictures, each
picture contains both odd and even numbered lines of the raster. Macroblocks of frame pictures are encoded as
frames in the sense that an encoded macroblock contains
both odd and even lines. However, the DCT performed on
the four blocks within each macroblock of a frame picture
may be done in two different ways. Each of the four DCT
transform blocks in a macroblock may contain both odd and
even lines (frame DCT coded blocks) , or alternatively two
of the four DCT blocks in a macroblock may contain only
the odd lines of the macroblock and the other two blocks
may contain only the even lines of the macroblock (field
DCT coded blocks). The coding decision as to which way
to encode a picture may be made adaptively by the MPEG-2
encoder 10 based upon which method results in better data
compression.
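The regrouping of a frame ordered macroblock's lines for field DCT coding can be sketched as follows; this illustrative helper only shows the line reordering, not the patent's circuitry:

```python
def frame_to_field(macroblock):
    # macroblock: 16 raster lines of a frame ordered 16x16 macroblock.
    # Field DCT coding regroups the lines so that two of the four 8x8
    # blocks hold only the even numbered (odd raster) lines and the other
    # two hold only the odd numbered (even raster) lines.
    top = [macroblock[i] for i in range(0, 16, 2)]
    bottom = [macroblock[i] for i in range(1, 16, 2)]
    return top, bottom
```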
Residual macroblocks in field pictures are
field DCT coded and are predicted from a reference field.
Residual macroblocks in frame pictures that are frame DCT
coded are predicted from a reference frame. Residual
macroblocks in frame pictures that are field DCT coded
have two blocks predicted from one reference field and two blocks predicted from either the same or other
reference field.
For sequences of progressive pictures, all
pictures are frame pictures with frame DCT coding and
frame prediction.
MPEG-2, as described above, includes the
encoding and decoding of video at high resolution (HDTV).
In order to permit people to use their existing NTSC
televisions in order to view HDTV transmitted programs, it is desirable to provide a decoder that decodes high resolution MPEG-2 encoded data as reduced resolution
video data for display on existing NTSC televisions.
(Reducing the resolution of television signals is often
called down conversion decoding.) Accordingly, such a
downconverting decoder would allow the viewing of HDTV
signals without requiring viewers to buy expensive HDTV
displays.
There are known techniques for making such a
downconverting decoder such that it requires less
circuitry and is, therefore, cheaper than a decoder that
outputs full HDTV resolution. One of these methods is disclosed in U.S. Patent 5,262,854. The down conversion
technique disclosed there is explained herein in
connection with a down convertor 50 shown in Figure 3.
The down convertor 50 includes a Huffman and run/level
decoder 52 and an inverse quantization block 54 which
operate as previously described in connection with the
Huffman and run/level decoder 32 and the inverse
quantization block 36 of Figure 2. However, instead of
utilizing the 8x8 IDCT block 38 as shown in Figure 2, the
down convertor 50 employs a downsampler 56 which discards
the forty-eight high order DCT coefficients of an 8x8
block and performs a 4x4 IDCT on the remaining 4x4 array
of DCT coefficients. This process is usually referred to
as DCT domain downsampling. The result of this
downsampling is effectively a filtered and downsampled
4x4 block of residual samples (for P or B pictures) or
pixels (for I pictures).
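The DCT domain downsampling performed by the downsampler 56 can be sketched as follows. This illustrative routine keeps the low order 4x4 corner of an 8x8 coefficient block and applies a 4x4 IDCT; it deliberately ignores the factor of two amplitude rescaling that a practical decoder would fold into the basis matrix:

```python
import math

def dct_downsample_4x4(coeffs):
    # coeffs: an 8x8 array of DCT coefficients. Discard the 48 high order
    # coefficients, keeping the low order 4x4 corner, then apply a 4x4 IDCT.
    n = 4
    x = [row[:n] for row in coeffs[:n]]
    # 4-point DCT-II basis: t[k][i] = c(k) * cos((2i + 1) * k * pi / (2n))
    t = []
    for k in range(n):
        c = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        t.append([c * math.cos((2 * i + 1) * k * math.pi / (2 * n)) for i in range(n)])
    # inverse transform: y[i][j] = sum_k sum_l t[k][i] * x[k][l] * t[l][j]
    # (output amplitudes are 2x those of a full 8x8 IDCT; a real decoder
    # would rescale)
    tmp = [[sum(t[k][i] * x[k][l] for k in range(n)) for l in range(n)] for i in range(n)]
    return [[sum(tmp[i][l] * t[l][j] for l in range(n)) for j in range(n)] for i in range(n)]
```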
For residual samples, a prediction is added by
an adder 58 to the residual samples from the downsampler
56 in order to produce a decoded reduced resolution 4x4
block of pixels. This block is saved in a reference picture memory 60 for subsequent predictions.
Accordingly, predictions will be made from a reduced
resolution reference, while predictions made in the
decoder loop within the encoder are made from full
resolution reference pictures. This difference means
that the prediction derived from the reduced resolution
reference will differ by some amount from the
corresponding prediction made by the encoder, resulting
in error in the residual plus prediction sum provided by
the adder 58 (this error is referred to herein as
prediction error) . This error may increase as
predictions are made upon predictions until the reference
is refreshed by the next I picture.
A motion compensator 62 attempts to reduce this
prediction error by using the full resolution motion
vectors, even though the reference picture is at lower
resolution. First, a portion of the reference picture
that includes the predicted macroblock is read from a
reference picture memory 60. This portion is selected
based on all bits of the motion vector except the least
significant bit. This predicted macroblock is interpolated back to full resolution by a 2x2 prediction
upsample filter 64. Using the full resolution motion
vector (which may include ½ pixel resolution), a
predicted full resolution macroblock is extracted from
the upsampled portion based upon all of the bits of the
motion vector. Then, a downsampler 66 performs a 2x2
downsampling on the extracted full resolution macroblock
in order to match the resolution of the 4x4 IDCT output
of the downsampler 56. In this way, the prediction from
the reference picture memory 60 is upsampled to match the
full resolution residual pixel structure allowing the use
of full resolution motion vectors. Then, the full
resolution reference picture is downsampled prior to
addition by the adder 58 in order to match the resolution
of the downsampled residual from the downsampler 56.
There are several known good prediction
upsampling/downsampling methods that tend to minimize the
prediction error caused by upsampling reference pictures
that have been downsampled with a 4x4 IDCT. These
methods typically involve use of a two dimensional filter
having five to eight taps and tap values that vary both with the motion vector value for the predicted macroblock
and the position of the current pixel being interpolated
within the predicted macroblock. Such a filter not only
upsamples the reduced resolution reference to full
resolution and subsequently downsamples in a single
operation, but it can also include additional ½ pixel
interpolation (when required due to a fractional motion
vector) . (See, for example, Minimal Error Drift in
Frequency Scalability for Motion Compensated DCT Coding,
Mokry and Anastassiou, IEEE Transactions on Circuits and
Systems for Video Technology, August 1994, and Drift
Minimization in Frequency Scaleable Coders Using Block
Based Filtering, Johnson and Princen, IEEE Workshop on
Visual Signal Processing and Communication, Melbourne,
Australia, September 1993.) The objective of such
upsampling and downsampling is for the prediction
upsampling filter to be a close spatial domain approximation
to the effective filtering operation done by a 4x4 IDCT.
The following example is representative of the
prediction upsampling/downsampling filter described in
the Mokry and Johnson papers. This example is a one dimensional example but is easily extended to two
dimensions. Let it be assumed that pixels y1 and pixels
y2 as shown in Figure 4 represent two adjacent blocks in
a downsampled reference picture, and that the desired
predicted block straddles the boundary between the two
blocks. The pixels y1 are upsampled to the pixels p1 by
using a four tap filter with a different set of tap
values for each of the eight calculated pixels p1. The
pixels y2 are likewise upsampled to the pixels p2 by
using the same four tap filter arrangement. (If the
motion vector requires ½ pixel interpolation, this
interpolation is done using linear interpolation to
calculate in between pixel values based on the pixels p1
and p2.) From these sixteen pixels p1 and p2, an
upsampled prediction consisting of eight pixels q can be
read using the full resolution motion vector. The pixels
q are then filtered and downsampled to pixels q' by an
eight tap filter with a different set of tap values for
each of the four pixels q'. The Johnson paper teaches
how to determine the optimum tap values for these filters
given that the reference picture was downsampled by a four point IDCT. The tap values are optimum in the sense
that the prediction error is minimized. The Johnson and
Mokry papers also show that the upsampling, linear
interpolation, and downsampling filters can be combined
into a single eight tap filter with tap values that
depend on the motion vector value and the particular
pixels q' being calculated. Accordingly, this single
eight tap filter allows four pixels q' to be calculated
directly from the eight pixels y1 and y2.
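The collapse of the upsampling and downsampling stages into one matrix can be illustrated with their DCT domain counterparts (a sketch only: zero padding and truncation in the DCT domain stand in for the Mokry/Johnson spatial filters that approximate them, and the √2 amplitude normalization is an assumed detail). The product of the downsampling and upsampling matrices is a single operator, and here it is exactly the identity, which is the minimal drift property these methods exploit:

```python
import math

def dct_mat(N):
    """Rows are the orthonormal N-point DCT-II basis vectors."""
    return [[math.sqrt((1 if k == 0 else 2) / N) *
             math.cos((2 * n + 1) * k * math.pi / (2 * N))
             for n in range(N)] for k in range(N)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

T8, T4 = dct_mat(8), dct_mat(4)
s = math.sqrt(2)   # assumed amplitude normalization for the 2:1 size change

# Upsample matrix U (8x4): 4-pt DCT, zero-pad to 8 coefficients, 8-pt IDCT.
U = matmul([[s * T8[k][n] for k in range(4)] for n in range(8)], T4)

# Downsample matrix D (4x8): 8-pt DCT, keep 4 low coefficients, 4-pt IDCT.
D = matmul([[T4[k][n] / s for k in range(4)] for n in range(4)],
           [T8[k] for k in range(4)])

# The two stages collapse into one matrix. Here D @ U is the 4x4 identity:
# downsampling an upsampled block reproduces the four pixels exactly.
DU = matmul(D, U)
```

The practical spatial filters in the cited papers trade this exactness for short tap lengths, which is why their taps vary with pixel position and motion vector.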
The down converter 50, while generally adequate
for progressive pictures with frame DCT coded blocks,
does not address problems that arise when attempting to
down convert sequences of interlaced pictures with mixed
frame and field DCT coded blocks. These problems arise,
for the most part, with respect to vertical prediction
upsampling, and are described below in a one dimensional
vertical context. Thus, for the purpose of this
description, a full resolution block refers to an eight
pixel vertical column with a downsampled block having a
corresponding vertical column of four pixels. Let it be assumed that an eight point vertical
column of pixels as shown in column 70 of Figure 5 is
transformed into DCT coefficients by an encoder utilizing
an eight point DCT transform operation. A downconverting
decoder discards the four high order coefficients for
each block and performs a four point IDCT on the
remaining coefficients (DCT domain downsampling) . The
spatial relationship between the original pixels x and
the decoded pixels y is shown by columns 70 and 72. The
pixels y represent the stored reference picture.
Prediction upsampling/downsampling methods,
such as those previously referenced (Mokry, Johnson) ,
which operate on DCT domain downsampled reference
pictures, result in the spatial relationships shown in
Figure 6, where the reference pixels y are first
upsampled to produce upsampled reference pixels p (these
approximate the original pixels x) and the upsampled
reference pixels p are then downsampled to produce
downsampled reference pixels q. These methods attempt to
effectively reverse the DCT domain downsampling with a
minimal or small error due to the discarding of the high order DCT coefficients when the 4x4 IDCT is performed.
The objective is for the prediction upsampling filter to
be a close spatial domain approximation to the effective
filtering operation done by a 4x4 IDCT.
The typical operation of such a filter
operating vertically is explained as follows. A portion
of the lower resolution reference picture consisting of
two pixel blocks (e.g., the y1 and y2 pixel blocks of
column 80) overlapped by the desired predicted block is
accessed. As shown in column 82, these two pixel blocks
are upsampled and filtered to approximate the full
resolution reference so that the pixels p1 and p2
approximate full resolution pixels x. Then, the pixels p1
and p2 are filtered and downsampled to produce pixels q as
shown in column 84. The pixels q form the predicted
block that is supplied to the adder 58.
This upsampling/downsampling process can either
be a two step filtering process, or the pixels q can be
directly calculated from the pixels y using, for example,
an eight tap filter whose filter coefficients vary with
the motion vector value and the particular pixels q being calculated, as described in the Johnson and Mokry papers.
As shown in Figure 7, prediction upsampling/downsampling
filters can also include additional ½ pixel interpolation
(approximation of pixel values between original pixels x)
when the motion vector is fractional.
It is noted that the pixels y due to DCT domain
downsampling are effectively spatially located half way
between the original pixels x. This spatial relationship
has important implications because, as previously
explained, the DCT blocks may be frame or field encoded.
For example, if it is assumed that a full resolution
frame consisting of fields A and B is encoded by the
encoder, and if these fields are field DCT encoded, the
DCT domain downsampling must be performed by a down
conversion decoder separately on each field block. The
resulting vertical spatial relationship of pixels in the
downsampled fields a and b with respect to pixels in the
original fields A and B is shown in Figure 8, where the
original encoded fields A and B are shown in column 90
and the downsampled fields a and b are shown in column 92. It should be noted that the pixels b are not evenly
spaced between the pixels a.
On the other hand, with frame DCT encoding,
pixels from fields A and B are combined together into DCT
blocks by the encoder. DCT domain downsampling on these
frame DCT coded blocks results in the pixel spatial
relationship shown in Figure 9, where the original frame
DCT encoded fields A and B are shown in column 94 and the
downsampled frame c is shown in column 96. It should be
noted that the pixels c are evenly spaced.
According to the MPEG-2 standard, a picture may
be encoded with all macroblocks containing frame DCT
coded blocks, with all macroblocks containing field DCT
coded blocks, or with a mix of field and frame coded
macroblocks. Therefore, performing DCT domain
downsampling as shown by the prior art results in
reference pictures that have a varying pixel structure.
An entire reference picture may have the a/b structure
shown in column 92 or the c structure shown in column 96.
On the other hand, a reference picture may be composed of macroblocks, some having the a/b structure of column 92
and others having the c structure of column 96.
When forming a predicted macroblock, as
previously explained, the reference picture must be
upsampled so that it matches its original full resolution
structure. The prediction upsampling operation is made
more complicated because the two different reference
picture pixel structures shown in columns 92 and 96
require different upsampling processes. Because the
pixels in the c structured reference picture shown in
column 96 have resulted from DCT domain downsampling of a
frame, prediction upsample filtering must be performed on
the reference macroblock as a frame to derive the A and B
fields together. However, because the pixels in the a/b
structured reference pictures shown in column 92 have
resulted from DCT domain downsampling of separate fields,
prediction upsample filtering must be performed
separately on each field (a and b) of the reference
macroblock in order to derive the A and then B fields
shown in column 90. A further complication is introduced when
reference blocks have a mixed macroblock pixel structure
because predicted macroblocks from the reference picture
may straddle stored reference macroblocks, some having
the c structure and some having the a/b structure. In
this case, two different prediction upsample processes
would have to be executed for different parts of the same
predicted macroblock.
Moreover, a particular disadvantage of using
the c structure shown in column 96 for reference pictures
becomes apparent when it is necessary to do field
prediction from a c structured reference, where the A/B
structured full resolution reference contains high
vertical frequencies. For example, if it is assumed that
at full resolution the A/B reference is entirely composed
of alternating black (A field) and white (B field) lines,
a c structured downsampled reference would be composed of
pixels that are approximately gray due to the mixing of
the A and B pixels that occurs during filtering and
downsampling. However, an a/b structured reference would
have all black pixels for the a field and all white pixels for the b field because each field is filtered and
downsampled separately. If the encoder decides to do a
field prediction from the A field, a decoder with a c
structured reference would read a prediction consisting
of gray pixels. However, a decoder with an a/b
structured reference would read a much more accurate
prediction from the a field consisting of black pixels.
Thus, the a/b structure avoids the field "mixing" in the
decoder that occurs in the c structure.
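The black/white example above can be reproduced numerically with a one dimensional sketch of DCT domain downsampling (illustrative Python, not the patent's implementation; the 1/√2 amplitude renormalization is an assumed detail):

```python
import math

def dct_mat(N):
    """Rows are the orthonormal N-point DCT-II basis vectors."""
    return [[math.sqrt((1 if k == 0 else 2) / N) *
             math.cos((2 * n + 1) * k * math.pi / (2 * N))
             for n in range(N)] for k in range(N)]

def matvec(M, v):
    return [sum(m * s for m, s in zip(row, v)) for row in M]

def dct_domain_downsample(x8):
    """8-pt DCT, keep the 4 low-order coefficients, 4-pt IDCT
    (with an assumed 1/sqrt(2) amplitude renormalization)."""
    X_low = matvec(dct_mat(8), x8)[:4]
    T4 = dct_mat(4)
    return [sum(T4[k][n] * X_low[k] for k in range(4)) / math.sqrt(2)
            for n in range(4)]

BLACK, WHITE = 0.0, 255.0
a = dct_domain_downsample([BLACK] * 8)          # field A downsampled alone
b = dct_domain_downsample([WHITE] * 8)          # field B downsampled alone
c = dct_domain_downsample([BLACK, WHITE] * 4)   # A/B lines mixed as a frame

print(a)   # stays all black (0.0)
print(b)   # stays all white (255.0, to floating point precision)
print(c)   # intermediate "gray" values: the two fields have been mixed
```

Separate field downsampling preserves the black and white fields exactly, while frame downsampling of the interleaved lines yields only approximately gray values, as the text describes.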
The downconverting decoder of the present
invention overcomes one or more of the problems inherent
in the prior art.
Summary of the Invention
In accordance with the present invention, a
method of downconverting received frame and field coded
DCT coefficient blocks to reconstructed pixel field
blocks, wherein the frame and field coded DCT coefficient
blocks have motion vectors associated therewith,
comprises the following steps: a) converting the
received frame coded DCT coefficient blocks to converted field coded DCT coefficient blocks and performing an IDCT
on the converted field coded DCT coefficient blocks to
produce residual or pixel field blocks; b) directly
performing an IDCT on the received field coded DCT
coefficient blocks to produce residual or pixel field
blocks; c) selecting reference pixel blocks based upon
the motion vectors, upsampling the reference pixel
blocks, and downsampling at least a portion of the
upsampled reference blocks to form a prediction; and,
d) adding the prediction to the residual field blocks to
form reconstructed field blocks.
In a more detailed aspect of the present
invention, an apparatus for downconverting received frame
and field coded DCT coefficient blocks to reconstructed
pixel field blocks comprises an IDCT and a motion
compensator. The IDCT is arranged to convert the
received frame coded DCT coefficient blocks to converted
field coded DCT coefficient blocks and to perform an IDCT
on the converted field coded DCT coefficient blocks and
on the received field coded DCT coefficient blocks in
order to produce downconverted pixel related field blocks. The motion compensator is arranged to apply
motion compensation, as appropriate, to the downconverted
pixel related field blocks in order to produce the
reconstructed pixel field blocks.
In a further more detailed aspect of the
present invention, an apparatus for downconverting
received frame and field coded DCT coefficient blocks to
downconverted pixel related field blocks comprises first
and second IDCT's. The first IDCT is arranged to convert
the received frame coded DCT coefficient blocks to
converted field coded DCT coefficient blocks and to
perform a downconverting IDCT on the converted field
coded DCT coefficient blocks in order to produce first
downconverted pixel related field blocks. The second
IDCT is arranged to directly perform a downconverting
IDCT on the received field coded DCT coefficient blocks
in order to produce second downconverted pixel related
field blocks.
Brief Description of the Drawings
These and other features and advantages of the
present invention will become more apparent from a
detailed consideration of the invention when taken in
conjunction with the drawings in which:
Figure 1 is a simplified block diagram of a
known MPEG-2 encoder;
Figure 2 is a simplified block diagram of a
known MPEG-2 decoder;
Figure 3 is a block diagram of a known down
conversion decoder for an HDTV application;
Figures 4 - 9 show exemplary sets of pixel data
useful in describing the background of the present
invention;
Figure 10 is a block diagram of a
downconverting decoder according to the present
invention;
Figures 11 - 14 show additional exemplary sets
of pixel data useful in describing the present invention;
Figure 15 shows a block diagram of a motion
compensator of Figure 10 in additional detail;
Figure 16 shows an additional exemplary set of
pixel data useful in describing the present invention;
Figure 17 shows an embodiment of an IDCT module
of Figure 10; and,
Figures 18 - 20 show an exemplary set of pixel
data useful in describing an alternative embodiment of
the present invention.
Detailed Description
According to one embodiment of the present
invention, reference pictures are stored with a
predetermined vertical pixel structure regardless of
whether the received DCT blocks are field or frame coded.
For example, the reference pictures are always stored
with the a/b vertical pixel structure shown in column 92
regardless of whether the received DCT blocks are field
or frame coded. This consistent vertical pixel structure
for the reference pictures allows both field and frame
prediction with prediction upsampling to be done in a
more straight forward manner because the pixel structure
of the reference picture is always known. A downconverting decoder 100 according to an
embodiment of the present invention is shown in Figure
10. A Huffman and run/level decoder, data parser, and
inverse quantizer (not shown), which are located upstream
of the downconverting decoder 100, all operate as
described above. The resulting DCT coefficient blocks
from the Huffman and run/level decoder and the inverse
quantizer are provided to an IDCT module 102 of the
downconverting decoder 100. Other information, such as
DCT type (frame or field) and picture type (frame or
field) are provided to the IDCT module 102 from the data
parser. The data parser also provides the motion vector,
prediction type (frame or field), and field select
signals to a motion compensator 104.
The IDCT module 102 converts frame DCT coded
blocks to field DCT coded blocks and performs DCT domain
filtering, downsampling, and inverse DCT. In the case of
field DCT coded blocks, the IDCT module 102 merely
performs DCT domain filtering, downsampling, and inverse
DCT. The residual pixel and intra pixel output of the
IDCT module 102 has an a or b field structure and is fed to an adder 106. For the case of residual pixels, the
other input of the adder 106 receives predicted pixels
from the motion compensator 104. The output of the adder
106 consists of reconstructed blocks of field pixels
which are provided to an interpolation module 108 and to
a reference picture memory 110. The interpolation module
108 does a position adjust on the b field pixels for
correct display. The field pixel blocks (for I and P
pictures) are stored in the reference picture memory 110
for future predictions. The motion compensator 104 reads
predicted pixel blocks from the reference picture memory
110 as required.
When frame pictures are received, the
macroblocks of the frame pictures may be field or frame
DCT coded. If a macroblock is frame DCT coded as
indicated by the DCT type signal, the IDCT module 102
will convert the first two vertically stacked frame DCT
coded blocks to two field DCT coded blocks and then
perform a 4x4 IDCT on each of the two blocks. This
process will be described below in greater detail and it
will be shown that this process can be done in a very efficient manner. The same process is performed on the
next two vertically stacked blocks in that macroblock.
The result is as if the macroblock was originally field
coded. However, if the macroblock is field DCT coded to
begin with as indicated by the DCT type signal, then the
IDCT module 102 just performs 4x4 IDCT's on each block in
that macroblock. The result is that the output of the
IDCT module 102 is always an a/b structured macroblock as
shown in column 92 because the c structure shown in
column 96 is avoided. Therefore, reference pictures will
always be stored with the a/b pixel structure, which
simplifies prediction upsampling.
When field pictures are received, the
macroblocks of the field pictures input to the IDCT
module 102 will always be field coded. The output of the
IDCT module 102 will be either an a or a b structured
macroblock.
As explained above in the discussion of prior
art, the motion compensator with prediction
upsampler/downsampler filter should use the motion vector
to select an area from the reference picture memory that includes the blocks overlapped by the desired predicted
macroblock. This area is then upsampled to full
resolution (including ½ pixel interpolation if required).
Then the motion vector is used to select the full
resolution predicted macroblock. The full resolution
predicted macroblock is then filtered and downsampled to
match the structure of the 4x4 IDCT residual output.
This process can be done in three steps, or can be
implemented as a single step. Horizontally, this process
is the same as in the prior art. Vertically, the motion
compensator 104 must support two types of prediction
upsampling/downsampling. These two types are field
prediction and frame prediction, and are indicated by the
prediction type signal.
Field Prediction - When the prediction type
signal indicates field such that field prediction from an
a or b reference field is required, the motion vector is
used to read pixels from a particular area of a
downsampled reference field a or b. Then, a prediction
upsampling/downsampling filter (Mokry or Johnson or
similar type) operates horizontally and then vertically on these pixels to form a predicted field macroblock.
This operation is shown in Figure 11 with respect to a
downsampled reference field a. Column 120 of Figure 11
represents a DCT domain downsampled reference field a as
stored in the reference picture memory 110. Column 122
represents an upsampled portion of this reference field,
and column 124 represents the prediction after
downsampling.
The upsampling/downsampling filter may include
additional pixel interpolation if the motion vector is
fractional. Figure 12 shows an example of vertical
upsampling/downsampling from a reference field a when
pixel interpolation is implemented.
It should be noted that, in the case of frame
pictures, there will be two field predictions for each
macroblock utilizing two motion vectors. One field
prediction is for the A field, and one prediction is for
the B field. Each field prediction is operated upon
separately by the prediction upsampling/downsampling
filter, including additional % pixel interpolation if the
motion vector is fractional. Frame Prediction - Frame prediction is required
from an a/b reference picture for a frame DCT coded
residual macroblock. Due to operation of the IDCT module
102, the residual macroblock output from the IDCT module
102 has the a/b structure. The motion vector for the
macroblock is used to read a particular area from the a/b
structured downsampled reference picture stored in the
reference picture memory 110. Horizontal prediction
upsampling/downsampling is done as in the case of field
prediction. Vertical prediction upsampling/downsampling
must be broken up into separate steps. The a and b
fields from the read area are separately vertically
upsampled (here, both ½ pixel interpolation and
subsequent downsampling are postponed to a later step) in
order to approximate the full resolution A and B fields
which are shown by steps 1 and 2 of Figure 13. If the
motion vector is fractional, ½ pixel interpolation is
performed as follows. The A and B upsampled field areas
are line shuffled (see step 3 of Figure 14). Then ½ pixel
vertical interpolation is performed as shown by step 4 of
Figure 14 in order to create A'/B' upsampled field areas.
This process matches how the encoder does ½ pixel
interpolation on frame coded macroblocks. It would not
be correct to perform ½ pixel interpolation separately on
the fields. Finally, the A and B (or A' and B') portions
of the upsampled prediction are separately filtered and
downsampled to create a prediction for the a/b structured
residual macroblock (see step 5 of Figure 14). It should
be noted that, unlike the situation for field prediction,
the frame prediction upsample/downsample process must be
split into separate filtering steps.
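Steps 3 through 5 can be sketched as list operations on lines (illustrative Python; each "line" is reduced to a single value, the upsampling and downsampling filters themselves are elided, and the helper names are invented for this sketch):

```python
def shuffle_fields(a_lines, b_lines):
    """Step 3: interleave the upsampled A and B field lines into frame order."""
    return [line for pair in zip(a_lines, b_lines) for line in pair]

def half_pel_vertical(frame_lines):
    """Step 4: 1/2 pixel vertical interpolation by linear averaging of
    adjacent frame lines (one fewer line results; the prediction window
    is read with the appropriate offset)."""
    return [(frame_lines[i] + frame_lines[i + 1]) / 2
            for i in range(len(frame_lines) - 1)]

def split_fields(frame_lines):
    """Separate interleaved frame lines back into A' and B' fields so that
    each can be filtered and downsampled on its own (step 5)."""
    return frame_lines[0::2], frame_lines[1::2]

# Toy data: each "line" is a single value.
A = [10.0, 12.0, 14.0, 16.0]    # vertically upsampled a field
B = [11.0, 13.0, 15.0, 17.0]    # vertically upsampled b field

frame = shuffle_fields(A, B)            # frame order: A0 B0 A1 B1 ...
interp = half_pel_vertical(frame)       # values midway between frame lines
A_half, B_half = split_fields(interp)   # A'/B' fields ready for step 5
```

Interpolating on the shuffled frame, rather than within each field, is what makes the decoder's ½ pixel values match the encoder's for frame coded macroblocks.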
The motion compensator 104 is shown in more
detail in Figure 15. The motion compensator 104 executes
the two required types of prediction
upsampling/downsampling based on the prediction type
signal (field or frame) . The prediction type signal is
distributed to a motion vector translator 200, to a
vertical upsample/downsample filter 202, to a combiner
204, to an interpolator 206, and to a vertical filter and
downsampler 208. The motion vector translator 200 also
accepts the motion vector from the data parser as an
input. The motion vector translator 200 translates the motion vector to a series of reference memory addresses
in order to read pixels stored in the reference picture
memory 110 from reference fields a and b when the
prediction type is for frame prediction, or from a
reference field a or a reference field b (as indicated by
the field select signal) when the prediction type
indicates field prediction.
The pixels that are read out from the reference
picture memory 110 are provided to a horizontal
upsample/downsample filter 210. The horizontal
upsample/downsample filter 210 executes a horizontal
prediction upsampling/downsampling filtering operation as
previously described, including ½ pixel interpolation if
the motion vector is fractional. The horizontally
processed pixels are then provided to the vertical
upsample/downsample filter 202. If the type of
prediction is field prediction, the vertical
upsample/downsample filter 202 executes vertical
upsampling/downsampling, with additional ½ pixel
interpolation if the motion vector is fractional. The
combiner 204, the interpolator 206, and the vertical filter and downsampler 208 simply pass through the output
of the vertical upsample/downsample filter 202 during
field prediction.
For frame prediction, however, the vertical
upsample/downsample filter 202 performs only vertical
upsampling on the horizontally processed pixels provided
by the horizontal upsample/downsample filter 210. In
this case, ½ pixel interpolation (if necessary) and
downsampling are executed later. The combiner 204
operates only during frame prediction and when the motion
vector is vertically fractional by line shuffling the
upsampled A and B field blocks. If the prediction type
is frame and if the motion vector is vertically
fractional, the interpolator 206 executes ½ pixel linear
vertical interpolation. If the prediction type is frame,
the vertical filter and downsampler 208 separately
filters and downsamples the A and B (or A' and B')
fields. The output of the vertical filter and
downsampler 208 is the desired prediction. Accordingly, the
combiner 204 and the interpolator 206 operate only when
there is frame prediction and the motion vector is fractional, and the vertical filter and downsampler 208
operates only when there is frame prediction. When there
is field prediction, the combiner 204, the interpolator
206, and the vertical filter and downsampler 208 simply
pass through the output of the vertical
upsample/downsample filter 202.
The b field pixels at the output of the adder
106 of Figure 10 do not fall directly in between the a
field pixels, as shown by column 220 of Figure 16.
Accordingly, the interpolation module 108 performs a
simple linear interpolation, as shown in Figure 16, so
that the b field pixels are repositioned halfway in
between the a field pixels, as shown in column 224.
Accordingly, the b field pixels will be properly
displayed. Alternatively, a longer FIR filter with an
even number of taps could be used.
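Assuming from the text's "halfway in between" description that each b field sample must be shifted by half of its own sample period (Figure 16 itself is not reproduced here), the position adjustment reduces to a two tap average; the shift direction and the edge handling in this sketch are assumptions:

```python
def reposition_b_field(b):
    """Two-tap linear interpolation that shifts each b-field sample by half
    a sample period so that it lands midway between the a-field samples.
    Edge handling (repeating the last sample) is an assumed detail."""
    out = [(b[i] + b[i + 1]) / 2 for i in range(len(b) - 1)]
    out.append(b[-1])   # no further sample to average with at the boundary
    return out

print(reposition_b_field([0.0, 4.0, 8.0, 12.0]))   # → [2.0, 6.0, 10.0, 12.0]
```

A longer even-tap FIR filter, as the text notes, would perform the same half-period shift with less low-pass softening.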
The IDCT module 102 of Figure 10 is described
in additional detail subsequently to the following
mathematical derivation. This mathematical derivation
discloses an efficient method of converting two
vertically stacked frame coded 8x8 blocks of DCT coefficients into two blocks of field DCT coefficients
which are then vertically inverse discrete cosine
transformed by a four point IDCT operation that
effectively vertically filters and downsamples the blocks
as separate fields resulting in an a/b spatial pixel
structure. This mathematical derivation demonstrates
that the vertical processing can be done efficiently in a
single matrix operation. This derivation is first shown
one dimensionally for a single column of sixteen pixels
(two vertically stacked one dimensional "blocks") and is
then shown for two dimensional blocks of multiple
columns. With respect to this derivation, an uppercase X
refers to frequency domain DCT coefficients and a
lowercase x or y refers to spatial domain pixels or
residual values.
First, the following equation provides an
initial definition:
⎡ [X1] ⎤
⎣ [X2] ⎦  =  [X]                                     (1)
where the left hand side of equation (1) comprises two
one dimensional vertical blocks each containing eight
frame DCT coefficients. An 8x8 DCT matrix [T8] is
defined by the following equation:
       ⎡ t00 t01 t02 t03 t04 t05 t06 t07 ⎤
       ⎢ t10 t11 t12 t13 t14 t15 t16 t17 ⎥
       ⎢ t20 t21 t22 t23 t24 t25 t26 t27 ⎥
[T8] = ⎢ t30 t31 t32 t33 t34 t35 t36 t37 ⎥           (2)
       ⎢ t40 t41 t42 t43 t44 t45 t46 t47 ⎥
       ⎢ t50 t51 t52 t53 t54 t55 t56 t57 ⎥
       ⎢ t60 t61 t62 t63 t64 t65 t66 t67 ⎥
       ⎣ t70 t71 t72 t73 t74 t75 t76 t77 ⎦
where the rows of the right hand side of equation (2)
contain the well known values for the eight point DCT
basis vectors. Further, an IDCT operator [IT82] (where IT
represents inverse transform) for two vertically stacked blocks is derived from equation (2) according to the
following equation:
          ⎡ [T8]ᵀ    0    ⎤
[IT82] =  ⎣   0     [T8]ᵀ ⎦                          (3)
Equations (1) and (3) can be combined as
follows
              ⎡ [x1] ⎤
[IT82] [X] =  ⎣ [x2] ⎦  =  [x]                       (4)
where the two blocks [x] on the right hand side of
equation (4) are in the spatial domain. Next, a matrix
[Tf] may be defined, based on equation (2), according to
the following equation:
       ⎡ t00  0  t01  0  t02  0  t03  0  t04  0  t05  0  t06  0  t07  0 ⎤
       ⎢  ⋮                                                             ⎥
       ⎢ t70  0  t71  0  t72  0  t73  0  t74  0  t75  0  t76  0  t77  0 ⎥
[Tf] = ⎢  0  t00  0  t01  0  t02  0  t03  0  t04  0  t05  0  t06  0 t07 ⎥   (5)
       ⎢  ⋮                                                             ⎥
       ⎣  0  t70  0  t71  0  t72  0  t73  0  t74  0  t75  0  t76  0 t77 ⎦
where the rows of equation (5) are the eight point DCT
basis vectors with zeros placed between each
coefficient.
Equations (4) and (5) can be combined according
to the following equation:
            ⎡ [Xa] ⎤
[Tf] [x] =  ⎣ [Xb] ⎦  =  [Xab]                       (6)
where [x] is frame ordered, but where [Xab] comprises two
field DCT coded blocks for fields a and b.
Accordingly, frame DCT coefficients [X] can be
converted to field DCT coded coefficients [Xab] in a
single operation according to the following equation:
[Xab] = [Tf] [x] = [Tf] [IT82] [X] = [Q1] [X]        (7)
where [Q1] is an operator given by the following
equation:
[Q1] = [Tf] [IT82]                                   (8)
The frame DCT coded coefficients [X] can be
converted to two separate field blocks [xab] in the spatial domain according to the following equation:
[xab] = [IT82] [Xab] = [IT82] [Tf] [IT82] [X] = [Q2] [X]   (9)
where [Q2] is a 16x16 matrix operator according to the
following expression: [Q2] = [IT82] [Tf] [IT82]. The
quantity [xab] is also given by the following equation:

                 ⎡ [xa] ⎤
[IT82] [Xab] =   ⎣ [xb] ⎦  =  [xab]                  (10)
where [xab] comprises two separate field blocks.
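Equations (7) through (10) can be verified numerically. The following illustrative Python sketch (not part of the disclosure) builds [IT82], [Tf], and [Q1] = [Tf][IT82] from the orthonormal DCT-II basis and confirms that applying [Q1] to two stacked blocks of frame DCT coefficients yields the DCT coefficients of the even line and odd line fields directly:

```python
import math

def dct_mat(N):
    """Rows are the orthonormal N-point DCT-II basis vectors."""
    return [[math.sqrt((1 if k == 0 else 2) / N) *
             math.cos((2 * n + 1) * k * math.pi / (2 * N))
             for n in range(N)] for k in range(N)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matvec(M, v):
    return [sum(m * s for m, s in zip(row, v)) for row in M]

T8 = dct_mat(8)

# [IT82] (eq. 3): block-diagonal inverse transform for two stacked blocks;
# within each diagonal 8x8 block the entry is T8 transposed.
IT82 = [[T8[j % 8][i % 8] if (i < 8) == (j < 8) else 0.0
         for j in range(16)] for i in range(16)]

# [Tf] (eq. 5): 8-point DCT basis vectors with zeros interleaved; the top
# eight rows transform the even (field a) lines of a frame-ordered column,
# the bottom eight rows transform the odd (field b) lines.
Tf = [[0.0] * 16 for _ in range(16)]
for k in range(8):
    for n in range(8):
        Tf[k][2 * n] = T8[k][n]
        Tf[8 + k][2 * n + 1] = T8[k][n]

Q1 = matmul(Tf, IT82)                       # eq. (8)

# Check eq. (7) on an arbitrary frame-ordered 16-pixel column.
x = [float((7 * i) % 13) for i in range(16)]
X = matvec(T8, x[:8]) + matvec(T8, x[8:])   # two frame DCT coded blocks [X]
Xab = matvec(Q1, X)                         # field coefficients via [Q1]
direct = matvec(T8, x[0::2]) + matvec(T8, x[1::2])   # field DCT done directly
err = max(abs(u - v) for u, v in zip(Xab, direct))
print("max error:", err)
```

The conversion is exact to floating point precision, since [Q1] composes an exact inverse transform with an exact field-split transform.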
The matrix [IT82] can be modified so that
downsampling and filtering can be added. Thus, an 8x8
matrix [P4] can be formed from a 4x4 DCT basis matrix
[T4] by padding the well known four point DCT basis
vectors with zeros in both directions according to the
following equations:

       ⎡ u00 u01 u02 u03 ⎤
[T4] = ⎢ u10 u11 u12 u13 ⎥                           (11)
       ⎢ u20 u21 u22 u23 ⎥
       ⎣ u30 u31 u32 u33 ⎦

       ⎡ [T4]   0 ⎤
[P4] = ⎣   0    0 ⎦                                  (12)

where the rows of [T4] contain the well known values for
the four point DCT basis vectors.
Then, the transpose of equation (12) may be
applied to a column of eight DCT coefficients according
to the following equation:

        ⎡ X0 ⎤   ⎡ y0 ⎤
        ⎢ X1 ⎥   ⎢ y1 ⎥
        ⎢ X2 ⎥   ⎢ y2 ⎥
[P4ᵀ]   ⎢ X3 ⎥ = ⎢ y3 ⎥                              (13)
        ⎢ X4 ⎥   ⎢ 0  ⎥
        ⎢ X5 ⎥   ⎢ 0  ⎥
        ⎢ X6 ⎥   ⎢ 0  ⎥
        ⎣ X7 ⎦   ⎣ 0  ⎦
where the y's on the right hand side of equation (13) are
the filtered and downsampled spatial pixels resulting
from the inverse DCT operation on the left hand side of
equation (13). Only the four pixels y need to be retained
such that the zeros on the right hand side of equation
(13) may be discarded. These pixels y represent a
filtered and downsampled version of an original eight
pixel block [x]. A matrix [IP42], which is an operator
for two vertical blocks, may be established according to
the following equation:
          ⎡ [P4ᵀ]    0    ⎤
[IP42] =  ⎣   0    [P4ᵀ]  ⎦                          (14)
where the matrix [IP42] is a filter/downsample/IDCT
operator. The operator [IT82] which is used to perform an IDCT on [X] to produce [x], the operator [Tf] which is
used to perform a field split in order to produce [Xab] ,
and the operator [IP42] which performs filtering,
downsampling, and an IDCT separately on each field to
derive [y] can be combined according to the following
equation so that two vertical blocks of frame DCT coded
coefficients can be filtered, downsampled, and inverse
discrete cosine transformed in a single operation:
[Q3] = [IP42] [Tf] [IT82]                            (15)
Thus, by applying the operator [Q3], which is a 16x16
operator, to two vertical blocks of DCT coded
coefficients given by equation (1) , the following results
are produced:
           ⎡ ya0 ⎤
           ⎢ ya1 ⎥
           ⎢ ya2 ⎥
           ⎢ ya3 ⎥
           ⎢  0  ⎥
           ⎢  0  ⎥
           ⎢  0  ⎥
[Q3] [X] = ⎢  0  ⎥                                   (16)
           ⎢ yb0 ⎥
           ⎢ yb1 ⎥
           ⎢ yb2 ⎥
           ⎢ yb3 ⎥
           ⎢  0  ⎥
           ⎢  0  ⎥
           ⎢  0  ⎥
           ⎣  0  ⎦
The zeros on the right hand side of equation (16) can be discarded so that only the ya's and yb's are retained.
The output, therefore, of applying the operator [Q3] to
two frame DCT coded blocks is separate top and bottom
field blocks for fields a and b, each separately filtered
and downsampled effectively using a 4x4 IDCT operating on
field DCT coded blocks. Because [Q3] has the following form:
       ⎡ [p] ⎤
       ⎢  0  ⎥
[Q3] = ⎢ [q] ⎥                                       (17)
       ⎣  0  ⎦
the operator [Q3] can be rewritten in the following form:
       ⎡ [p] ⎤
[Q3] = ⎣ [q] ⎦                                       (18)
where [p] and [q] are each 4x16 matrices. Accordingly,
the operator [Q3] becomes an 8x16 operator instead of a
16x16 operator so that, when it is applied to two frame
DCT coded blocks, only the ya's and yb's of equation (16)
result.
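The structure claimed in equations (15) through (18) can likewise be checked numerically with an illustrative Python sketch (again not part of the disclosure): [Q3] = [IP42][Tf][IT82] applied to two stacked blocks of frame DCT coefficients produces four a field samples, four zeros, four b field samples, and four zeros, so the operator can indeed be stored in the reduced 8x16 form:

```python
import math

def dct_mat(N):
    """Rows are the orthonormal N-point DCT-II basis vectors."""
    return [[math.sqrt((1 if k == 0 else 2) / N) *
             math.cos((2 * n + 1) * k * math.pi / (2 * N))
             for n in range(N)] for k in range(N)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matvec(M, v):
    return [sum(m * s for m, s in zip(row, v)) for row in M]

def blockdiag2(M8):
    """16x16 block-diagonal matrix with the 8x8 matrix M8 in both blocks."""
    return [[M8[i % 8][j % 8] if (i < 8) == (j < 8) else 0.0
             for j in range(16)] for i in range(16)]

T8, T4 = dct_mat(8), dct_mat(4)

IT82 = blockdiag2([[T8[j][i] for j in range(8)] for i in range(8)])  # eq. (3)

# [P4]^T from eq. (12): rows 0-3 perform a 4-point IDCT of the low-order
# coefficients; rows 4-7 produce zeros.
P4t = [[T4[j][i] if i < 4 and j < 4 else 0.0 for j in range(8)]
       for i in range(8)]
IP42 = blockdiag2(P4t)                                               # eq. (14)

Tf = [[0.0] * 16 for _ in range(16)]                                 # eq. (5)
for k in range(8):
    for n in range(8):
        Tf[k][2 * n] = T8[k][n]
        Tf[8 + k][2 * n + 1] = T8[k][n]

Q3 = matmul(IP42, matmul(Tf, IT82))                                  # eq. (15)

x = [float((5 * i) % 11) for i in range(16)]   # arbitrary frame column
X = matvec(T8, x[:8]) + matvec(T8, x[8:])      # frame DCT blocks, as in eq. (1)
out = matvec(Q3, X)

# eq. (16): four a-field samples, four zeros, four b-field samples, four
# zeros, so [Q3] can be stored in the reduced 8x16 form of eq. (18).
ya, za, yb, zb = out[0:4], out[4:8], out[8:12], out[12:16]
print(ya, yb)
```

The zero rows fall out of the zero padding in [P4], which is why dropping them (equation (18)) loses no information.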
The following description shows how the
operator [Q3] may be used on two dimensional blocks having
eight columns, and also shows the horizontal IDCT with
filtering and downsampling. For a full two dimensional
field based 4x4 IDCT of frame DCT coded blocks, the [Q3] operation is only performed vertically. A standard four
point IDCT is performed horizontally. Let X be a
macroblock consisting of four frame coded 8x8 DCT blocks
X1, X2, X3, and X4. Spatially, these 8x8 DCT coded blocks
are oriented as follows:
    X1  X2
    X3  X4
As a first step, the high order horizontal coefficients
of each of the 8x8 DCT coded blocks X1, X2, X3, and X4 are
discarded (the four high order coefficients in each row)
so that each DCT block becomes an 8x4 block (X1', X2',
X3', and X4'). A 16x4 matrix [Xm] may be used to define
the two 8x4 frame coded DCT blocks X1' and X3' according
to the following equation:
    [Xm] = | X1' |
           | X3' |                                         (19)

and similarly a 16x4 matrix [Xn] may be used to define the
frame coded 8x4 DCT blocks X2 ' and X4 ' according to the
following equation:
    [Xn] = | X2' |
           | X4' |                                         (20)
At this point, the vertical operation can be performed
first, followed by the horizontal operation, or the
horizontal operation may be performed first followed by
the vertical operation. Assuming that the vertical
operation is performed first, field based DCT domain
vertical filtering and downsampling is performed
according to the following equation:
    [G] = | [Ga] | = [Q3] [Xm]                             (21)
          | [Gb] |
where [G] is an 8x4 matrix, [Q3] is an 8x16 matrix, [Xm]
is the 16x4 matrix described above, and [Ga] and [Gb] are
4x4 matrices for corresponding fields a and b. Then, horizontal DCT domain filtering and downsampling is
performed according to the following equation:
    [ya] = [Ga] [T4]        [yb] = [Gb] [T4]               (22)
where [ya] and [yb] are each a 4x4 matrix, where [Ga] and
[Gb] are the 4x4 matrices derived from equation (21) ,
where the ya and yb are the resulting residual or pixel
values provided by the IDCT module 102 to the adder 106,
and where [T4] is the four point DCT basis vector 4x4
matrix from equation (11). These operations represented
by equations (21) and (22) are also applied thereafter to
the 16x4 array [Xn].
On the other hand, the order of the above
operations can be reversed in the case of first
performing horizontal downsampling, followed by vertical
downsampling with the same results. Thus, the four point
DCT basis vector matrix [T4] is applied to the 8x4 array
[X1'] and the 8x4 array [X3'] according to the following
equations:

    [G1] = [X1'] [T4]                                      (23)

    [G2] = [X3'] [T4]                                      (24)
The results [G1] and [G2] can be combined according to the
following equation:

    [G12] = | [G1] |
            | [G2] |                                       (25)
Thereafter, a vertical field based operation may be
performed according to the following equation:
    | [ya] | = [Q3] [G12]                                  (26)
    | [yb] |
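That the two orderings produce the same results is a consequence of the associativity of matrix multiplication. A stand-in NumPy check, using random matrices of the stated sizes rather than the actual operators:

```python
import numpy as np

rng = np.random.default_rng(0)
Q3 = rng.standard_normal((8, 16))    # stand-in for the vertical operator [Q3]
Xm = rng.standard_normal((16, 4))    # stand-in for the 16x4 block [Xm]
T4 = rng.standard_normal((4, 4))     # stand-in for the horizontal operator

vertical_first = (Q3 @ Xm) @ T4      # equations (21) then (22)
horizontal_first = Q3 @ (Xm @ T4)    # equations (23) through (26)

assert np.allclose(vertical_first, horizontal_first)
```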
Figure 17 shows an exemplary hardware
implementation of an IDCT 300 which can be used for the
IDCT module 102 of Figure 10. The IDCT 300 processes
both field DCT and frame DCT coded macroblocks. The output of the IDCT 300 is always a or b structured
pixels. For field DCT coded
blocks, the usual 4x4 IDCT is executed to achieve
horizontal and vertical filtering and downsampling. For
frame DCT coded blocks, the horizontal processing is the
same, but vertically the Q3 operator is used to
effectively convert the frame coding to field coding, and
then to filter and downsample each field separately. The
result is a or b structured field blocks in either case.
It is noted here that matrix multiplication is
not commutative (a*b does not in general equal b*a) .
Therefore, each matrix multiplier of the IDCT 300 has a
pre and a post input to indicate the order of the
associated matrix multiplication.
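A minimal numeric illustration of this point:

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[0.0, 1.0], [1.0, 0.0]])   # swaps columns (pre) or rows (post)

# a @ b swaps the columns of a; b @ a swaps the rows of a
assert not np.allclose(a @ b, b @ a)
```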
A macroblock X consisting of four 8x8 DCT
blocks X1, X2, X3, and X4 is received by a parser 302 which
sends the DCT blocks along one path of the IDCT 300 if
the DCT blocks are field coded blocks and sends the DCT
blocks along a different path of the IDCT 300 if the DCT
blocks are frame coded blocks. If the four 8x8 DCT
blocks are field DCT coded blocks as signaled by a DCT type input to the parser 302, each block is fed to a
discard module 304 which discards the high order
horizontal and vertical coefficients in order to form a
4x4 field DCT coded block for each of the 8x8 DCT field
coded blocks supplied to it. A matrix multiplier 306
executes a four point vertical IDCT on the columns of the
blocks provided to it. The vertically processed 4x4
blocks are fed to a selector 308. Because the DCT type
is field, the 4x4 blocks provided by the matrix
multiplier 306 are selected for output to a matrix
multiplier 310. The matrix multiplier 310 executes a
four point horizontal IDCT on each row of the blocks
supplied to it. The output of the matrix multiplier 310
is a 4x4 block of filtered and downsampled pixels for
field a or b.
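The field coded path (discard module, column IDCT, row IDCT) can be sketched as below. The dct_basis helper and the 1/√2 scalings are assumptions on my part, modeled on claims 9 and 10, not a literal description of the hardware:

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II basis matrix; row k holds the k-th basis vector."""
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    t = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    t[0, :] = np.sqrt(1.0 / n)
    return t

T4 = dct_basis(4)

def field_path(block_8x8):
    """Field coded path: discard high order coefficients, then 4x4 IDCT."""
    low = block_8x8[:4, :4]                  # discard module 304
    cols = (T4.T / np.sqrt(2.0)) @ low       # column IDCT (matrix multiplier 306)
    return cols @ (T4 / np.sqrt(2.0))        # row IDCT (matrix multiplier 310)

# A flat 8x8 field of pixel value 1 has a single DC coefficient of 8;
# the downsampled 4x4 output should again be flat with value 1
X = np.zeros((8, 8))
X[0, 0] = 8.0
assert np.allclose(field_path(X), 1.0)
```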
If a macroblock X consists of frame DCT coded
blocks as signaled by the DCT type, the blocks of the
macroblock are fed in turn by the parser 302 to a discard
module 312 which discards the high order horizontal
coefficients in order to form corresponding 8x4 frame DCT
coded blocks. These 8x4 blocks X1', X2', X3', and X4' are
provided to a reorder module 314 where these 8x4 blocks
are stored and reordered and then provided as two 16x4
blocks Xm (X1' and X3') and Xn (X2' and X4'). The 16x4
block Xm is first fed to a matrix multiplier 316 which,
using the Q3 operator, applies vertical conversion,
filtering, and downsampling to the field DCT
coefficients. The output of the matrix multiplier 316 is
an 8x4 block consisting of Ga and Gb. A store and delay
module 318 then separately outputs the 4x4 blocks Ga and
Gb to the selector 308. Because the current macroblock X
being processed is of type frame, the selector 308
applies the 4x4 Ga and Gb blocks to the matrix multiplier
310 which, operating first on the 4x4 block Ga, executes
a four point IDCT on each row of the block, and then
subsequently processes the 4x4 block Gb in a similar
fashion. The outputs of the matrix multiplier 310 are
4x4 blocks of filtered and downsampled pixels ya for the
a field and yb for the b field.
The IDCT module 102 can be modified in order to
permit the elimination of the interpolation module 108.
That is, the four point IDCT previously described involves the use of four point DCT basis vectors (the
[T4] operator) on the four low order DCT coefficients
originally derived from an eight point DCT operation in
the encoder. This process results in the "in between"
downsampled pixel positioning as previously described and
as replicated in Figure 18. In order to permit the
elimination of the interpolation module 108, an
alternative operator [T4 ' ] can be derived.
Thus, instead of consisting of the four point
DCT basis vectors, the rows of the alternative operator
[T4 ' ] consist of downsampled eight point DCT basis
vectors. If each row of the alternative operator [T4']
contains the second sample, the fourth sample, the sixth
sample, and the eighth sample, respectively, of the four
low order eight point DCT basis vectors, and if the
decoder discards the four high order coefficients for
each block and performs a four point IDCT on the
remaining coefficients using the alternative operator
[T4 ' ] , then the spatial relationship between the original
pixels (x's) and the decoded pixels (y's) is shown in
Figure 19. Thus, if the [T4] operator is used to downsample the A field, and the alternative operator [T4']
is used to downsample the B field, then the resulting
spatial relationship is shown in Figure 20. It is noted
that this is the desired relationship as shown by the
column 224 of Figure 16.
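Under the orthonormal DCT-II convention assumed here, the alternative operator [T4'] can be formed by simply sampling the four low-order rows of the eight point basis matrix at every second position (a sketch of the stated construction, not the patent's literal derivation; dct_basis is an illustrative helper):

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II basis matrix; row k holds the k-th basis vector."""
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    t = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    t[0, :] = np.sqrt(1.0 / n)
    return t

T8 = dct_basis(8)

# Rows of [T4']: the 2nd, 4th, 6th, and 8th samples of the four
# low order eight point DCT basis vectors
T4_alt = T8[0:4, 1::2]

assert T4_alt.shape == (4, 4)
assert np.allclose(T4_alt[0], 1.0 / np.sqrt(8.0))   # the DC row stays flat
```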
Therefore, it is desired to modify the IDCT
module 102 so that its output always has the a/b pixel
structure of Figure 20. For field coded macroblocks of
the A field, a four point IDCT (using the operator [T4] )
is performed, as before. But for field coded macroblocks
of the B field, the alternative operator [T4 ' ] derived
from the eight point basis vectors is used for IDCT.
For frame coded macroblocks, a modified
operator similar to the operator Q3 must be derived that
computes two separately downsampled fields directly from
the frame coded coefficients. The modified operator Q3 '
must incorporate the operator [T4] for the A field and
the alternative operator [T4 ' ] for the B field. The
operator [T4] is given by equation (11) and the operator
[P4] is given by equation (12), where the DCT basis
vectors r00, r01, ..., r33 are four point DCT basis
vectors. The alternative operators [T4'] and [P4'] may
be given by the following equations:
    [T4'] = | t01 t03 t05 t07 |
            | t11 t13 t15 t17 |
            | t21 t23 t25 t27 |
            | t31 t33 t35 t37 |                            (27)

    [P4'] = | t01 t03 t05 t07  0  0  0  0 |
            | t11 t13 t15 t17  0  0  0  0 |
            | t21 t23 t25 t27  0  0  0  0 |
            | t31 t33 t35 t37  0  0  0  0 |
            |  0   0   0   0   0  0  0  0 |
            |  0   0   0   0   0  0  0  0 |
            |  0   0   0   0   0  0  0  0 |
            |  0   0   0   0   0  0  0  0 |                (28)
where the DCT basis vectors t01, t03, ..., t37 are eight
point DCT basis vectors. The alternative operator [P4']
does not need the √2 scaling factor which is incorporated
into the operator [P4] because the alternative operator
[P4'] uses eight point basis vectors. Accordingly, the
alternative operator [IP4'2] is given by the following
equation:
    [IP4'2] = | [P4]    0    |
              |  0    [P4']  |                             (29)
where the operator [IP4'2] is a filter/downsample/IDCT
operator which clearly operates differently on the A
field (using [P4]) than on the B field (using [P4']).
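The block diagonal structure of [IP4'2] is what lets the two fields be treated differently within a single multiplication. A stand-in NumPy sketch (random matrices in place of the actual operators, and the assumption that one field occupies the top half of the stacked coefficient vector):

```python
import numpy as np

rng = np.random.default_rng(1)
P4 = rng.standard_normal((8, 8))     # stand-in for [P4] (one field)
P4p = rng.standard_normal((8, 8))    # stand-in for [P4'] (the other field)
Z = np.zeros((8, 8))

IP4p2 = np.block([[P4, Z], [Z, P4p]])   # block diagonal, as in equation (29)

# The top half of a stacked field coefficient vector sees only the first
# operator, and the bottom half sees only the second
v = rng.standard_normal(16)
out = IP4p2 @ v
assert np.allclose(out[:8], P4 @ v[:8])
assert np.allclose(out[8:], P4p @ v[8:])
```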
A modified operator Q3 ' , therefore, is defined
according to the following equation:
    [Q3'] = [IP4'2] [Tf] [IT82]                            (30)
This modified operator Q3 ' is used by the IDCT module 102
instead of the operator Q3 previously described. The IDCT
module 102 using these appropriate operators results in
an a/b pixel structure that eliminates the need for the
interpolation module 108.
As previously stated, the use of a prediction
upsampling filter should result in a close spatial domain
approximation to the effective filtering operation performed by a 4x4 IDCT. Because the two fields are now
filtered differently using the 4x4 IDCT, the two fields
must be filtered differently by the motion compensator
104. The basic filter structure is the same for both
fields, only the tap values differ. The tap values for
both fields can easily be derived and stored in memory.
The following are representative values for the
matrices [T8] and [T4]:
[T8] =
  0.3536  0.3536  0.3536  0.3536  0.3536  0.3536  0.3536  0.3536
  0.4904  0.4157  0.2778  0.0975 -0.0975 -0.2778 -0.4157 -0.4904
  0.4619  0.1913 -0.1913 -0.4619 -0.4619 -0.1913  0.1913  0.4619
  0.4157 -0.0975 -0.4904 -0.2778  0.2778  0.4904  0.0975 -0.4157
  0.3536 -0.3536 -0.3536  0.3536  0.3536 -0.3536 -0.3536  0.3536
  0.2778 -0.4904  0.0975  0.4157 -0.4157 -0.0975  0.4904 -0.2778
  0.1913 -0.4619  0.4619 -0.1913 -0.1913  0.4619 -0.4619  0.1913
  0.0975 -0.2778  0.4157 -0.4904  0.4904 -0.4157  0.2778 -0.0975

[T4] =
  0.5000  0.5000  0.5000  0.5000
  0.6533  0.2706 -0.2706 -0.6533
  0.5000 -0.5000 -0.5000  0.5000
  0.2706 -0.6533  0.6533 -0.2706
However, it should be understood that these values are
merely exemplary and other values could be used.
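These tabulated values are the standard orthonormal DCT-II basis matrices. The short check below regenerates them from the cosine definition (a sketch assuming that convention) and verifies orthonormality along with a few entries:

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II basis: t[k, i] = c_k * cos((2i + 1) k pi / 2n)."""
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    t = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    t[0, :] = np.sqrt(1.0 / n)
    return t

T8, T4 = dct_basis(8), dct_basis(4)

# The rows are orthonormal, and the entries match the tabulated values
assert np.allclose(T8 @ T8.T, np.eye(8), atol=1e-12)
assert np.isclose(T8[0, 0], 0.3536, atol=5e-4)
assert np.isclose(T8[1, 0], 0.4904, atol=5e-4)
assert np.isclose(T4[1, 0], 0.6533, atol=5e-4)
```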
Certain modifications of the present invention
have been discussed above. Other modifications will
occur to those practicing in the art of the present
invention. For example, according to the description
above, the interpolation module 108 can be eliminated
through derivation of an alternative operator [T4 ' ] .
Accordingly, the description of the present
invention is to be construed as illustrative only and is
for the purpose of teaching those skilled in the art the
best mode of carrying out the invention. The details may
be varied substantially without departing from the spirit
of the invention, and the exclusive use of all modifications which are within the scope of the appended
claims is reserved.

Claims

WHAT IS CLAIMED IS:
1. An apparatus for downconverting received
frame and field coded DCT coefficient blocks to
downconverted pixel blocks, wherein the apparatus
includes an IDCT module arranged to perform a
downconverting IDCT on the frame coded DCT coefficient
blocks in order to produce corresponding downconverted
pixel blocks and to perform a downconverting IDCT on the
field coded DCT coefficient blocks in order to produce
corresponding downconverted pixel field blocks, and
wherein the IDCT module is CHARACTERIZED IN THAT:
the IDCT module converts the received frame
coded DCT coefficient blocks to converted field coded DCT
coefficient blocks and then subsequently performs a
downconverting IDCT on the converted field coded DCT
coefficient blocks in order to produce corresponding
downconverted pixel field blocks.
2. The apparatus of claim 1 wherein the
received frame and field coded DCT coefficient blocks have motion vectors associated therewith, and wherein the
apparatus further comprises :
a selector arranged to select reference pixel
blocks from a reference picture memory based upon the
motion vectors; and,
an adder arranged to add the reference pixel
blocks to the first and second downconverted pixel field
blocks in order to form reconstructed pixel field blocks.
3. The apparatus of claim 2 wherein the
selector comprises:
an upsampler arranged to upsample the reference
pixel blocks; and,
a downsampler arranged to downsample the
upsampled reference pixel blocks.
4. The apparatus of claim 2 wherein the
selector comprises:
a translator arranged to translate the motion
vector to a reference address;
a horizontal upsampler and downsampler arranged
to horizontally upsample and downsample the reference
pixel blocks based on the reference address; and, a vertical upsampler and downsampler, wherein
the vertical upsampler and downsampler is arranged during
field prediction to vertically upsample and downsample
the horizontally upsampled and downsampled reference
pixel blocks when field coded DCT coefficient blocks are
received, and wherein the vertical upsampler and
downsampler is arranged during frame prediction to (i)
vertically upsample the horizontally upsampled and
downsampled reference pixel blocks separately for fields
a and b, (ii) if the motion vectors are fractional,
combine the separately upsampled a and b fields into a
reference frame block, then perform a half pixel
interpolation on the combined frame block, and (iii)
separately vertically downsample the upsampled a and b
fields.
5. The apparatus of claim 1 wherein the IDCT
module comprises:
a discarder arranged to discard high order
horizontal DCT coefficients from the frame coded DCT
coefficient blocks to produce downsized frame coded DCT
coefficient blocks; a reorderer arranged to reorder the downsized
frame coded DCT coefficient blocks;
a column IDCT submodule arranged to perform an
IDCT by columns on the reordered and downsized frame
coded DCT coefficient blocks; and,
a row IDCT submodule arranged to perform an
IDCT by rows on results from the column IDCT submodule .
6. The apparatus of claim 5 wherein the
column IDCT submodule comprises a multiplier arranged to
multiply the reordered and downsized frame coded DCT
coefficient blocks by [Q3], wherein [Q3] = [IP42] [Tf] [IT82],
wherein [IP42] is in the following form:
    [IP42] = | [P4]    0   |
             |  0    [P4]  |
wherein [P4] is in the following form :
    [P4] = √2 | r00 r01 r02 r03  0  0  0  0 |
              | r10 r11 r12 r13  0  0  0  0 |
              | r20 r21 r22 r23  0  0  0  0 |
              | r30 r31 r32 r33  0  0  0  0 |
              |  0   0   0   0   0  0  0  0 |
              |  0   0   0   0   0  0  0  0 |
              |  0   0   0   0   0  0  0  0 |
              |  0   0   0   0   0  0  0  0 |
wherein r00, r01, ..., r33 are four point DCT basis
vectors, wherein [Tf] is in the following form:
    [Tf] = | t00 0 t01 0 t02 0 t03 0 t04 0 t05 0 t06 0 t07 0 |
           | t10 0 t11 0 t12 0 t13 0 t14 0 t15 0 t16 0 t17 0 |
           | t20 0 t21 0 t22 0 t23 0 t24 0 t25 0 t26 0 t27 0 |
           | t30 0 t31 0 t32 0 t33 0 t34 0 t35 0 t36 0 t37 0 |
           | t40 0 t41 0 t42 0 t43 0 t44 0 t45 0 t46 0 t47 0 |
           | t50 0 t51 0 t52 0 t53 0 t54 0 t55 0 t56 0 t57 0 |
           | t60 0 t61 0 t62 0 t63 0 t64 0 t65 0 t66 0 t67 0 |
           | t70 0 t71 0 t72 0 t73 0 t74 0 t75 0 t76 0 t77 0 |
           | 0 t00 0 t01 0 t02 0 t03 0 t04 0 t05 0 t06 0 t07 |
           | 0 t10 0 t11 0 t12 0 t13 0 t14 0 t15 0 t16 0 t17 |
           | 0 t20 0 t21 0 t22 0 t23 0 t24 0 t25 0 t26 0 t27 |
           | 0 t30 0 t31 0 t32 0 t33 0 t34 0 t35 0 t36 0 t37 |
           | 0 t40 0 t41 0 t42 0 t43 0 t44 0 t45 0 t46 0 t47 |
           | 0 t50 0 t51 0 t52 0 t53 0 t54 0 t55 0 t56 0 t57 |
           | 0 t60 0 t61 0 t62 0 t63 0 t64 0 t65 0 t66 0 t67 |
           | 0 t70 0 t71 0 t72 0 t73 0 t74 0 t75 0 t76 0 t77 |
wherein [IT82] is in the following form:

    [IT82] = | [T8]^T    0     |
             |   0     [T8]^T  |

wherein [T8] is in the following form:

    [T8] = | t00 t01 t02 t03 t04 t05 t06 t07 |
           | t10 t11 t12 t13 t14 t15 t16 t17 |
           | t20 t21 t22 t23 t24 t25 t26 t27 |
           | t30 t31 t32 t33 t34 t35 t36 t37 |
           | t40 t41 t42 t43 t44 t45 t46 t47 |
           | t50 t51 t52 t53 t54 t55 t56 t57 |
           | t60 t61 t62 t63 t64 t65 t66 t67 |
           | t70 t71 t72 t73 t74 t75 t76 t77 |

and wherein t00, t01, ..., t77 are eight point DCT basis vectors.
7. The apparatus of claim 6 wherein the
multiplier is a first multiplier, and wherein the row
IDCT submodule comprises a second multiplier arranged to
multiply results from the first multiplier by [T4]/√2, and
wherein [T4] is given by the following equation:

    [T4] = | r00 r01 r02 r03 |
           | r10 r11 r12 r13 |
           | r20 r21 r22 r23 |
           | r30 r31 r32 r33 |
8. The apparatus of claim 1 wherein the IDCT
module comprises:
a discarder arranged to discard high order DCT
coefficients from the field coded DCT coefficient blocks
to produce downsized field coded DCT coefficient blocks; a column IDCT submodule arranged to perform an IDCT by columns on the downsized field coded DCT coefficient blocks; and,
a row IDCT submodule arranged to perform an
IDCT by rows on results from the column IDCT submodule.
9. The apparatus of claim 8 wherein the
column IDCT submodule comprises a multiplier arranged to
multiply the downsized field coded DCT coefficient blocks
by [T4]^T/√2, wherein [T4] is given by the following
equation:

    [T4] = | r00 r01 r02 r03 |
           | r10 r11 r12 r13 |
           | r20 r21 r22 r23 |
           | r30 r31 r32 r33 |

and wherein r00, r01, ..., r33 are four point DCT basis
vectors.
10. The apparatus of claim 9 wherein the
multiplier is a first multiplier, and wherein the row
IDCT submodule comprises a second multiplier arranged to
multiply results from the first multiplier by [T4]/√2.
11. The apparatus of claim 10 wherein the
discarder is a first discarder, wherein the column IDCT
submodule is a first column IDCT submodule, wherein the
row IDCT submodule is first row IDCT submodule, and
wherein the IDCT module further comprises :
a second discarder arranged to discard high
order horizontal DCT coefficients from the frame coded
DCT coefficient blocks to produce downsized frame coded
DCT coefficient blocks;
a reorderer arranged to reorder the downsized
frame coded DCT coefficient blocks;
a second column IDCT submodule arranged to
perform an IDCT by columns on the reordered and downsized
frame coded DCT coefficient blocks; and,
a second row IDCT submodule arranged to perform
an IDCT by rows on results from the second column IDCT
submodule.
12. The apparatus of claim 11 wherein the
second column IDCT submodule comprises a third multiplier
arranged to multiply the reordered and downsized frame
coded DCT coefficient blocks by [Q3], wherein [Q3] = [IP42]
[Tf] [IT82], wherein [IP42] is in the following form:
    [IP42] = | [P4]    0   |
             |  0    [P4]  |
wherein [P4] is in the following form:

    [P4] = √2 | r00 r01 r02 r03  0  0  0  0 |
              | r10 r11 r12 r13  0  0  0  0 |
              | r20 r21 r22 r23  0  0  0  0 |
              | r30 r31 r32 r33  0  0  0  0 |
              |  0   0   0   0   0  0  0  0 |
              |  0   0   0   0   0  0  0  0 |
              |  0   0   0   0   0  0  0  0 |
              |  0   0   0   0   0  0  0  0 |
wherein r00, r01, ..., r33 are four point DCT basis
vectors, wherein [Tf] is in the following form:
    [Tf] = | t00 0 t01 0 t02 0 t03 0 t04 0 t05 0 t06 0 t07 0 |
           | t10 0 t11 0 t12 0 t13 0 t14 0 t15 0 t16 0 t17 0 |
           | t20 0 t21 0 t22 0 t23 0 t24 0 t25 0 t26 0 t27 0 |
           | t30 0 t31 0 t32 0 t33 0 t34 0 t35 0 t36 0 t37 0 |
           | t40 0 t41 0 t42 0 t43 0 t44 0 t45 0 t46 0 t47 0 |
           | t50 0 t51 0 t52 0 t53 0 t54 0 t55 0 t56 0 t57 0 |
           | t60 0 t61 0 t62 0 t63 0 t64 0 t65 0 t66 0 t67 0 |
           | t70 0 t71 0 t72 0 t73 0 t74 0 t75 0 t76 0 t77 0 |
           | 0 t00 0 t01 0 t02 0 t03 0 t04 0 t05 0 t06 0 t07 |
           | 0 t10 0 t11 0 t12 0 t13 0 t14 0 t15 0 t16 0 t17 |
           | 0 t20 0 t21 0 t22 0 t23 0 t24 0 t25 0 t26 0 t27 |
           | 0 t30 0 t31 0 t32 0 t33 0 t34 0 t35 0 t36 0 t37 |
           | 0 t40 0 t41 0 t42 0 t43 0 t44 0 t45 0 t46 0 t47 |
           | 0 t50 0 t51 0 t52 0 t53 0 t54 0 t55 0 t56 0 t57 |
           | 0 t60 0 t61 0 t62 0 t63 0 t64 0 t65 0 t66 0 t67 |
           | 0 t70 0 t71 0 t72 0 t73 0 t74 0 t75 0 t76 0 t77 |
wherein [IT82] is in the following form:

    [IT82] = | [T8]^T    0     |
             |   0     [T8]^T  |

wherein [T8] is in the following form:

    [T8] = | t00 t01 t02 t03 t04 t05 t06 t07 |
           | t10 t11 t12 t13 t14 t15 t16 t17 |
           | t20 t21 t22 t23 t24 t25 t26 t27 |
           | t30 t31 t32 t33 t34 t35 t36 t37 |
           | t40 t41 t42 t43 t44 t45 t46 t47 |
           | t50 t51 t52 t53 t54 t55 t56 t57 |
           | t60 t61 t62 t63 t64 t65 t66 t67 |
           | t70 t71 t72 t73 t74 t75 t76 t77 |
and wherein t00, t01, ..., t77 are eight point DCT basis
vectors.
13. The apparatus of claim 12 wherein the
second row IDCT submodule comprises a fourth multiplier
arranged to multiply results from the third multiplier by
[T4]/√2.
14. The apparatus of claim 8 wherein the
discarder is a first discarder, wherein the column IDCT
submodule is a first column IDCT submodule, wherein the
row IDCT submodule is a first row IDCT submodule, and
wherein the IDCT module further comprises:
a second discarder arranged to discard high
order horizontal DCT coefficients from the frame coded
DCT coefficient blocks to produce downsized frame coded
DCT coefficient blocks; a reorderer arranged to reorder the downsized
frame coded DCT coefficient blocks;
a second column IDCT submodule arranged to
perform an IDCT by columns on the reordered and downsized
frame coded DCT coefficient blocks; and,
a second row IDCT submodule arranged to perform
an IDCT by rows on results from the second column IDCT
submodule.
15. The apparatus of claim 14 wherein the
IDCT module is arranged to:
translate motion vectors to reference
addresses;
horizontally upsample and downsample reference
pixel blocks based on the reference addresses;
for field prediction, vertically upsample and
downsample the horizontally upsampled and downsampled
reference pixel blocks; and,
for frame prediction, (i) vertically upsample
the horizontally upsampled and downsampled reference
pixel blocks separately for fields a and b, (ii) if the
motion vectors are fractional, combine the separately
upsampled a and b fields into a reference frame block, then perform a half pixel interpolation on the combined
frame block, and (iii) separately vertically downsample
the upsampled a and b fields.
16. The apparatus of claim 1 further
comprising:
a selector arranged to select reference pixel
blocks based upon motion vectors associated with the
received frame and field coded DCT coefficient blocks;
an adder arranged to add the reference pixel
blocks to the downconverted pixel field blocks to form
reconstructed pixel field blocks; and,
a reference picture memory arranged to store
only the reconstructed pixel field blocks as the
reference pixel blocks.
17. The apparatus of claim 1 wherein the IDCT
module is arranged to:
translate motion vectors associated with the
received frame and field coded DCT coefficient blocks to
reference addresses;
horizontally upsample and downsample reference
pixel blocks selected by the reference addresses; for field prediction, vertically upsample and downsample the horizontally upsampled and downsampled
reference pixel blocks; and, for frame prediction, (i) vertically upsample
the horizontally upsampled and downsampled reference
pixel blocks separately for fields a and b, (ii) if the motion vectors are fractional, combine the separately
upsampled a and b fields into a reference frame block, then perform a half pixel interpolation on the combined
frame block, and (iii) separately vertically downsample
the upsampled a and b fields.
18. The apparatus of claim 1 further comprising:
an adder arranged to add reference pixel blocks
to the downconverted pixel field blocks to form
reconstructed pixel field blocks; and, a reference picture memory arranged to store
only the reconstructed pixel field blocks as the reference pixel blocks.
19. The apparatus of claim 1 further
comprising a motion compensator arranged to apply motion compensation, as appropriate, to the downconverted pixel field blocks in order to produce the reconstructed pixel
field blocks.
PCT/US1999/015254 1999-07-07 1999-07-07 Downconverting decoder for interlaced pictures WO2001005159A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/US1999/015254 WO2001005159A1 (en) 1999-07-07 1999-07-07 Downconverting decoder for interlaced pictures
AU47320/99A AU4732099A (en) 1999-07-07 1999-07-07 Downconverting decoder for interlaced pictures


Publications (1)

Publication Number Publication Date
WO2001005159A1 true WO2001005159A1 (en) 2001-01-18

Family

ID=22273142

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/015254 WO2001005159A1 (en) 1999-07-07 1999-07-07 Downconverting decoder for interlaced pictures

Country Status (2)

Country Link
AU (1) AU4732099A (en)
WO (1) WO2001005159A1 (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5262854A (en) * 1992-02-21 1993-11-16 Rca Thomson Licensing Corporation Lower resolution HDTV receivers
EP0576290A2 (en) * 1992-06-25 1993-12-29 Sony Corporation Picture signal coding and decoding
EP0707426A2 (en) * 1994-10-11 1996-04-17 Hitachi, Ltd. Digital video decoder for decoding digital high definition and/or digital standard definition television signals
EP0786902A1 (en) * 1996-01-29 1997-07-30 Matsushita Electric Industrial Co., Ltd. Method and apparatus for changing resolution by direct DCT mapping
WO1998041011A1 (en) * 1997-03-12 1998-09-17 Matsushita Electric Industrial Co., Ltd. Hdtv downconversion system


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
VETRO A ET AL: "FREQUENCY DOMAIN DOWN-CONVERSION OF HDTV USING AN OPTIMAL MOTION COMPENSATION SCHEME", INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY,US,WILEY AND SONS, NEW YORK, vol. 9, no. 4, 1 January 1998 (1998-01-01), pages 274 - 282, XP000768998, ISSN: 0899-9457 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8630350B2 (en) 2001-11-27 2014-01-14 Motorola Mobility Llc Picture level adaptive frame/field coding for digital video content
WO2004008777A1 (en) * 2002-07-12 2004-01-22 General Instrument Corporation A method and managing reference frame and field buffers in adaptive frame/field encoding
EP1655966A2 (en) * 2004-10-26 2006-05-10 Samsung Electronics Co., Ltd. Apparatus and method for processing an image signal in a digital broadcast receiver
EP1655966A3 (en) * 2004-10-26 2011-04-27 Samsung Electronics Co., Ltd. Apparatus and method for processing an image signal in a digital broadcast receiver
US7986846B2 (en) 2004-10-26 2011-07-26 Samsung Electronics Co., Ltd Apparatus and method for processing an image signal in a digital broadcast receiver

Also Published As

Publication number Publication date
AU4732099A (en) 2001-01-30

Similar Documents

Publication Publication Date Title
US6665344B1 (en) Downconverting decoder for interlaced pictures
KR100370076B1 (en) video decoder with down conversion function and method of decoding a video signal
US6104753A (en) Device and method for decoding HDTV video
EP0707426B1 (en) Digital video decoder for decoding digital high definition and/or digital standard definition television signals
US6690836B2 (en) Circuit and method for decoding an encoded version of an image having a first resolution directly into a decoded version of the image having a second resolution
US6504872B1 (en) Down-conversion decoder for interlaced video
US6628714B1 (en) Down converting MPEG encoded high definition sequences to lower resolution with reduced memory in decoder loop
US5973740A (en) Multi-format reduced memory video decoder with adjustable polyphase expansion filter
EP3065403A1 (en) In-loop adaptive wiener filter for video coding and decoding
JPH10145791A (en) Digital video decoder and method for decoding digital video signal
EP0842585A2 (en) Method and device for decoding coded digital video signals
US6795498B1 (en) Decoding apparatus, decoding method, encoding apparatus, encoding method, image processing system, and image processing method
EP1044567A1 (en) Reduced cost decoder using bitstream editing for image cropping
EP1054566A1 (en) Apparatus and method for deriving an enhanced decoded reduced-resolution video signal from a coded high-definition video signal
EP1134981A1 (en) Automatic setting of optimal search window dimensions for motion estimation
US20120014448A1 (en) System for low resolution power reduction with compressed image
US20010016010A1 (en) Apparatus for receiving digital moving picture
US20010026588A1 (en) Apparatus for receiving moving pictures
KR100323676B1 (en) Apparatus for receiving digital moving picture
US9313523B2 (en) System for low resolution power reduction using deblocking
US6510178B1 (en) Compensating for drift in the down conversion of high definition sequences to lower resolution sequences
US20120014445A1 (en) System for low resolution power reduction using low resolution data
WO2001005159A1 (en) Downconverting decoder for interlaced pictures
KR100323688B1 (en) Apparatus for receiving digital moving picture
KR100255777B1 (en) Digital tv receiver decoder device

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase