EP0800684A1 - Method and device for encoding/decoding a displaced frame difference - Google Patents

Method and device for encoding/decoding a displaced frame difference

Info

Publication number
EP0800684A1
EP0800684A1 EP96929873A EP96929873A EP0800684A1 EP 0800684 A1 EP0800684 A1 EP 0800684A1 EP 96929873 A EP96929873 A EP 96929873A EP 96929873 A EP96929873 A EP 96929873A EP 0800684 A1 EP0800684 A1 EP 0800684A1
Authority
EP
European Patent Office
Prior art keywords
predetermined
gabor
frame difference
inner product
displaced frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP96929873A
Other languages
German (de)
French (fr)
Other versions
EP0800684A4 (en
Inventor
Mark R. Banham
James C. Brailean
Stephen N. Levine
Kevin J. O'Connell
Aggelos K. Katsaggelos
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Motorola Solutions Inc
Original Assignee
Motorola Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Motorola Inc
Publication of EP0800684A1
Publication of EP0800684A4
Legal status: Withdrawn

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N11/00 Colour television systems
    • H04N11/04 Colour television systems using pulse code modulation
    • H04N11/042 Codec means
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction

Definitions

  • the present invention relates generally to video codecs, and more particularly to encoding of displaced frame differences for video codecs.
  • DFD Displaced Frame Difference
  • the DFD is generally a nonstationary, high-pass image which consists of error around the edges of moving objects where a motion estimation technique has failed to adequately represent the motion in the video scene. Often, the DFD will also contain large regions of homogeneous error information. This happens when new objects enter the scene, or when objects are displaced by a large amount of motion between video frames.
  • One common approach used to encode the DFD information is the use of a block-wise Discrete Cosine Transform (DCT), followed by entropy encoding of the coefficients of the transform. This approach is suitable when the pixels within each block are well modeled by a first order Markov random process.
  • DCT Discrete Cosine Transform
  • the DCT is close to the optimal transform (the Karhunen-Loeve Transform) in terms of energy compaction capabilities when the first order Markov model is met.
  • the DCT can become less efficient than other techniques at coding the DFD image.
  • One such alternative technique is to use an iterative expansion of the DFD over a dictionary of non-orthogonal Gabor, or modulated Gaussian functions, followed by coding of the expansion information. This approach can allow localization of the prediction error coding to only those pixels of the DFD which are perceptually most important. It can also reduce the required bit expenditure for the DFD over a comparable DCT based approach.
  • Technology does not currently exist, however, for using the iterative Gabor expansion for video coding in an efficient and effective manner.
  • FIG. 1 is a flow chart of a preferred embodiment of steps of a method in accordance with the present invention.
  • FIG. 2 is a graphical representation of the residual energy in the displaced frame difference as decomposed in accordance with the present invention.
  • FIG. 3 is a diagrammatic representation of regions from a predetermined segmentation, and blocks touching those regions in accordance with the present invention.
  • FIG. 4 is a block diagram of one preferred embodiment of a device in accordance with the present invention.
  • FIG. 5 is a schematic of one preferred embodiment of a microprocessor in accordance with the present invention.
  • FIG. 6 is a flow chart of a preferred embodiment of steps of a method in accordance with the present invention.
  • FIG. 7 is a flow chart of a preferred embodiment of steps of a method in accordance with the present invention.
  • FIG. 8 is a block diagram of a preferred embodiment of a device in accordance with the present invention.
  • the present invention provides a method, device and microprocessor for performing a computationally efficient iterative expansion of a displaced frame difference, DFD, image over a special dictionary of two-dimensional non-orthogonal modulated Gaussian, or Gabor functions for the purposes of video compression.
  • the coefficients of this expansion are encoded here using techniques which combine to give an efficient representation of the data at very low bit rates.
  • the expansion is directed to spend the primary amount of available bits on a perceptually important area of the image being encoded.
  • the method describes the encoding of chrominance information associated with the DFD.
  • FIG. 1 is a flow diagram of a preferred embodiment of a method for performing an iterative expansion of a DFD image over a special dictionary of modulated Gaussian functions in accordance with the present invention.
  • This is an iterative process which encodes a projection of the luminance channel of the DFD onto a single basis function and computes a new residual DFD image, at each iteration.
  • the term DFD used here will always refer to the luminance channel of the current residual DFD at the present iteration.
  • the initial residual DFD is set equal to the original DFD obtained from a predetermined motion estimation and compensation technique.
  • the first step in the method is dividing the DFD image into a plurality of blocks (102). These blocks permit the treatment of the DFD image in a localized manner.
  • the sum of absolute values of the intensity at each pixel within each block is then computed (104). These sums are computed for use in estimating which local region of the DFD image has the highest intensity energy.
  • the next step is weighting the blocks in the DFD according to a predetermined biasing scheme, selected from two options in a predetermined manner (106), which places an emphasis on a perceptually important region of the DFD (108). Where the DFD has been segmented into a plurality of regions according to a predetermined segmentation scheme, this information is used in selecting a perceptually important region (110). Otherwise, a centrally biased weighting scheme is used to represent the perceptually important area of the DFD as the center of the image (112).
  • the block within the DFD which has the highest weighted energy estimate is then chosen as the best block for concentrating the encoding (114). Using this best block as a starting spatial point, the DFD energy is then encoded using a predetermined hierarchical Gabor function expansion.
  • a predetermined dictionary of two-dimensional, normalized, real Gabor functions is used for this method (116).
  • the analytical expression for the functions in this dictionary is given by
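  • $G_{\vec{\alpha},\vec{\beta}}(i,j) = g_{\vec{\alpha}}(i)\, g_{\vec{\beta}}(j)$, with $i, j \in \{0, 1, \ldots, N-1\}$ and $N = 16$.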
  • the hierarchical Gabor function expansion begins by picking a representative quadrant of the predetermined dictionary of two-dimensional Gabor functions (118). This representative quadrant is determined by finding the best matching Gabor basis function from four predetermined basis functions which represent each quadrant of the dictionary using a predetermined hierarchical pixel search. The best matching Gabor basis function within the representative quadrant of the dictionary is then found using a projection of the DFD signal onto each basis function with the hierarchical pixel search (120), and picking the function associated with the largest valued projection. The projections in steps (118) and (120) are computed only over bases whose center points lie within the best DFD block chosen in step (114).
  • the predetermined hierarchical pixel search is used in finding the expansion of the DFD at each iteration in (118) and (120).
  • the projection yielding the largest amplitude from these computations determines the best initial matching pixel position.
  • the pixels in a predetermined neighborhood of the best initial matching pixel position are used as centers of projections of the current Gabor basis function, and the best matching pixel position is found by choosing the position yielding the largest projection value. For example, if the best initial matching pixel position from the previous example was (18,20), and the predetermined neighborhood was +/-1 pixel, then the position search would additionally involve computing projections at the positions: (17,19), (17,20), (17,21 );
  • Step (118) uses the hierarchical pixel search for each of the four representative Gabor basis functions.
  • the basis function having maximum final projection among the four functions is chosen as the representative matching Gabor basis function.
  • In step (120), the remaining Gabor basis functions in the representative quadrant of the dictionary are searched using the same hierarchical pixel position search within the best block in the DFD.
  • the Gabor basis function with the largest valued projection after this search is chosen as the current best matching Gabor atom.
  • An atom comprises three key pieces of information which are encoded: the final best matching pixel position, the index of the chosen Gabor basis function in the dictionary, and the value of the corresponding projection coefficient.
  • the value of the current projection coefficient is quantized with a uniform, fixed length mid-tread quantizer (122).
  • the limits of this quantizer are defined by using the value of the projection coefficient associated with the atom chosen at the first iteration (124).
  • the residual energy in the DFD has a monotonically decreasing nature with the iterative expansion described here.
  • FIG. 2, numeral 200, shows this behavior graphically.
  • the first projection coefficient gives the maximum value needing to be represented in the encoded data.
  • the maximum output of the quantizer is the nearest integer value to the projection coefficient in the first iteration.
  • the minimum output value of the quantizer is a predetermined percentage of the maximum output value.
  • the maximum quantizer value is encoded using a predetermined fixed number of bits. For example, the use of 9 bits to encode the maximum value of the quantizer permits the quantization of projection coefficients having an absolute value less than 512, and requires clipping any which have a greater value than this to a value of 511.
  • a suitable choice for setting the minimum quantizer value is 10% of the maximum quantizer value.
  • a new residual DFD is computed (126). This is the current residual DFD minus the quantized projection of that DFD onto the last atom. This can be expressed as
  • $R^{n+1} = R^{n} - p\, G^{n}_{\vec{\alpha},\vec{\beta}}$
  • $R^{n}$ represents the current residual DFD
  • $G^{n}_{\vec{\alpha},\vec{\beta}}$ represents the current best matching basis function from the predetermined dictionary
  • At step (104), only those block sums which have been affected by the computation of the current residual DFD need to be recomputed. This iterative expansion continues until a predetermined number of iterations has occurred, or the value of the projection coefficient of the current atom falls below a predetermined threshold (128).
  • FIG. 6, numeral 600 shows a flow diagram of the encoding process for the atom information associated with the iterative expansion of the DFD image.
  • the spatial position of each atom must be encoded so that the decoder can reconstruct the same residual. This can be accomplished by representing these positions with a differential code which is arithmetically encoded (602), or encoded with a fixed length code (604), the latter being more robust when there is a possibility of errors during transmission of the coded data.
  • a differential vector for each atom position representing the distance from the last atom in the x and y directions, and from the origin for the first atom, is encoded.
  • This approach allows for the encoding of these positions near the actual entropy of the set of symbols representing the positions because of the non-integer value of bits required to uniquely encode each symbol.
  • One arithmetic code is used for the x component differential codewords, and one is used for the y component differential codewords.
  • An adaptive model of the probability distribution of the symbols is used. The model can be updated with each encoded frame of video.
  • the plurality of blocks used in step (102) is used to assist in encoding this data as well.
  • one bit is encoded to specify the presence or absence of any atom positions in that block.
  • the positions of those atoms are encoded using a vector for each atom position, representing the distance from the x and y coordinates of the origin of the block.
  • the components of each vector are encoded using a fixed number of bits determined by the block size. For example, for 32x32 blocks, 5 bit codes are required to fully represent the x and y components.
  • a final bit is encoded after each encoded position to signify the presence or absence of more atoms within the block.
  • the basis indices are encoded with 2 fixed-length codes (606). For example, if the dictionary is made up of 16 one-dimensional elements, 4 bits uniquely specifies one address in the table. Each 2-D function is uniquely represented by the two 4-bit addresses associated with the two one dimensional functions.
  • the projection coefficient associated with each encoded position is encoded with either an arithmetic (608) or a fixed length (610) coder as well.
  • the color channels of the DFD are generally of a much more homogeneous nature locally than the luminance channel.
  • the Gabor function expansion in accordance with the present invention is generally not as efficient as the block DCT for the chrominance channels.
  • the Gabor approach is better suited for thinner, "edge-like" characteristics that are found in the luminance channel, as opposed to the lowpass characteristics of the chrominance channels.
  • each 8x8 block is transformed into the DCT domain, and the coefficients of this transform are quantized and variable length encoded, using the appropriate syntax provided by the ITU-T Draft International Standard, H.263.
  • the result is a complete encoded DFD which permits very low bit rates for encoding a motion compensated video sequence.
  • the absence of a predetermined segmentation indicates the choice of a centrally-oriented bias (112). This involves weighting the pixels of the DFD according to the expression
  • DFD(i,j) refers to the three channel image of the displaced frame difference
  • $\lfloor \cdot \rfloor$ indicates integer truncation.
  • the constants appearing in the weighting function are given here for QCIF resolution video images, which have a support of 176x144 in the Y channel, and 88x72 in the Cr and Cb channels. These constants can be changed to accommodate any image format by appropriate scaling.
  • a 16x16 macroblock contains 16x16 pixels from the Y channel, and 8x8 pixels from each of the chrominance channels.
  • the function above describes a window which maps one-to-one onto the pixels of the Y channel.
  • the chrominance pixels in each macroblock are weighted by the same value as the corresponding pixels from the Y channel in that macroblock. The emphasis in this weighting is placed on macroblocks towards the center of the image, which is perceptually important.
  • a region oriented bias is selected (110). The choice of which region receives the bias is accomplished through a ranking operation. For the pixels of the DFD within each region provided by the predetermined segmentation, the blocks which touch that region are marked. For example, FIG. 3, numeral 300, shows an example of two predetermined regions in an image, and the blocks touching those regions are highlighted accordingly. For each region, and for those accompanying marked blocks, the average absolute value of the centrally weighted DFD, expressed above, and the average absolute value of the centrally weighted motion vectors for the marked blocks are computed.
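  • Score(region): the sum of the average absolute weighted DFD, normalized by Max(DFD), and the average absolute weighted motion vector components, normalized by Max(dx) and Max(dy).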
  • weighting constants are given in terms of QCIF resolution images, but can be generalized to any resolution.
  • the region having the greatest score receives the highest rank, and serves as the perceptually important region for biasing the DFD.
  • the pixel values in all blocks of the DFD which do not touch the highest ranking region are set to zero, thus limiting the expansion algorithm to encoding only atoms which lie within the chosen region of interest.
  • FIG. 4, numeral 400 is a block diagram of one preferred embodiment of a device for performing an iterative expansion of a DFD image over a special dictionary of modulated Gaussian functions in accordance with the present invention.
  • the device comprises an estimation unit (402), a memory unit (404), a selection unit (406), a quantizer (408), a residual computation unit (410), a comparator (412), a controller (414), and an encoding unit (416).
  • the estimation unit is used for determining which block in the current residual DFD has the highest energy (402).
  • the memory unit (404) which is coupled to the selection unit (406), is used for storing the predetermined dictionary of Gabor functions and the representative quadrant functions.
  • the selection unit (406), coupled to the estimation unit (402), applies the hierarchical expansion algorithm on the block of interest and selects the best matching atom.
  • the quantizer (408), coupled to the selection unit, is used for quantizing the projection coefficient of the current atom.
  • the residual computation unit (410), coupled to the quantizer, is then used for subtracting out the quantized projection of the current atom, and for computing a new current residual DFD.
  • the comparator (412), which is coupled to the residual computation unit, is used for determining if the iteration termination conditions have been met.
  • the controller (414), which is coupled to the comparator (412), the estimation unit (402), the selection unit (406), the quantizer (408), and the residual computation unit (410), is used for running the iteration and managing the data from the expansion for encoding.
  • the device also contains an encoding unit (416) which is coupled to the controller (414), and encodes the position, basis indices and projection coefficient information of the entire expansion of the DFD.
  • FIG. 5, numeral 500 is a block diagram of one preferred embodiment of a microprocessor for performing an iterative expansion of a DFD image over a special dictionary of modulated Gaussian functions in accordance with the present invention.
  • the microprocessor comprises an estimation unit (502), a memory unit (504), a selection unit (506), a quantizer (508), a residual computation unit (510), a comparator (512), a controller (514), and an encoding unit (516).
  • the estimation unit is used for determining which block in the current residual DFD has the highest energy (502).
  • the memory unit (504) which is coupled to the selection unit (506), is used for storing the predetermined dictionary of Gabor functions and the representative quadrant functions.
  • the comparator (512) which is coupled to the residual computation unit, is used for determining if the iteration termination conditions have been met.
  • the controller (514), which is coupled to the comparator (512), the estimation unit (502), the selection unit (506), the quantizer (508), and the residual computation unit (510), is used for running the iteration and managing the data from the expansion for encoding.
  • the microprocessor also contains an encoding unit (516) which is coupled to the controller (514), and encodes the position, basis indices and projection coefficient information of the entire expansion of the DFD.
  • FIG. 7, numeral 700 is a flow chart showing a preferred embodiment of steps of a method in accordance with the present invention.
  • the method for encoding includes the steps of: A) utilizing (702) a predetermined center/central biased weighting scheme for weighting, for each iteration, block sums to provide a selected block; B) determining (704), for each iteration, a best atom having a center which lies within the selected block using a predetermined hierarchical Gabor function search technique wherein predetermined Gabor functions are utilized from a memory; and C) utilizing (706) an energy adaptive dynamic quantization of Gabor basis coefficients from the best atom of each iteration to provide a minimized bit representation of coefficients for a displaced frame difference.
  • the method for decoding includes the steps of: A) utilizing (708) an inverse quantization of Gabor basis coefficients defined by the parameters in the energy adaptive dynamic quantization used in the encoder (706); B) projecting (710), for each decoded atom, the quantized Gabor basis coefficient of that atom onto the decoded Gabor basis function of that atom; and C) reconstructing (712) the quantized Gabor expansion of the displaced frame difference by summing all of the projections computed in step (710).
  • FIG. 8, numeral 800, is a block diagram of a preferred embodiment of a device in accordance with the present invention.
  • the device for encoding includes: A) a center/central biased estimator (802), coupled to receive a displaced frame difference, for weighting, for each iteration, block sums utilizing a predetermined center/central biased weighting scheme to provide a selected block; B) a best atom selector (804), coupled to the center/central biased estimator and a memory unit (806) having at least stored predetermined Gabor functions, for determining, for each iteration, a best atom having a center which lies within the selected block using a predetermined hierarchical Gabor function search technique; and C) an energy adaptive dynamic quantization unit (808), coupled to the best atom selector (804), for utilizing an energy adaptive dynamic quantization of Gabor basis coefficients from the best atom of each iteration to provide a minimized bit representation of coefficients for a displaced frame difference.
  • the device for decoding includes A) an inverse quantization unit (810) for decoding Gabor basis coefficients defined by the parameters in the energy adaptive dynamic quantization unit used in the encoder (808); B) a computation unit (812), coupled to a memory unit (814) having at least stored predetermined Gabor functions, for projecting, for each decoded atom, the quantized Gabor basis coefficient of that atom onto the decoded Gabor basis function of that atom; and C) a summation unit (816) coupled to the computation unit (812) for reconstructing the quantized Gabor expansion of the displaced frame difference by summing all of the projections computed by the computation unit (812).
  • the operation of the device is described with greater particularity above.
  • the method and device may be selected to be embodied in at least one of: A) an application specific integrated circuit; B) a field programmable gate array; C) a microprocessor; and D) a computer-readable memory; arranged and configured to determine the first modified received signal having minimized distortion and interference in accordance with the scheme described in greater detail above.
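The fixed-length position code (604) described in the list above can be sketched as follows. This is a minimal illustration, not the patented bitstream: the function name `encode_positions`, the raster scan order of the blocks, and the order of the x and y fields are assumptions made for the example.

```python
def encode_positions(positions, width, height, block=32):
    """Fixed-length position code: one presence bit per block, then
    (dx, dy, more-atoms bit) for every atom inside that block."""
    bits_per = (block - 1).bit_length()   # 5 bits for a 32x32 block
    out = []
    for by in range(0, height, block):        # assumed raster scan of blocks
        for bx in range(0, width, block):
            inside = [(x, y) for x, y in positions
                      if bx <= x < bx + block and by <= y < by + block]
            out.append("1" if inside else "0")            # any atoms here?
            for k, (x, y) in enumerate(inside):
                out.append(format(x - bx, f"0{bits_per}b"))  # offset from block origin
                out.append(format(y - by, f"0{bits_per}b"))
                out.append("1" if k < len(inside) - 1 else "0")  # more atoms follow?
    return "".join(out)

# Two 32x32 blocks side by side, one atom in each.
bitstring = encode_positions([(3, 5), (40, 31)], width=64, height=32)
```

Each block with a single atom costs 1 + 5 + 5 + 1 = 12 bits here, and an empty block costs a single bit.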

Abstract

The present invention provides a method (100, 700), device (400, 800) and microprocessor (500) for performing a computationally efficient iterative expansion of a displaced frame difference, DFD, image over a predetermined dictionary of modulated Gaussian functions for the purposes of video compression. The iterative expansion described in this invention decomposes the DFD image into a set of coefficients which represent the perceptually important areas of a video frame in a compact way. The resulting method, device and microprocessor serve to provide a means for very low bit rate coding/decoding of a video sequence.

Description

METHOD AND DEVICE FOR ENCODING/DECODING A DISPLACED FRAME DIFFERENCE
Field of the Invention
The present invention relates generally to video codecs, and more particularly to encoding of displaced frame differences for video codecs.
Background of the Invention
In the realm of very low bit rate video coding, it is very difficult to represent a video sequence with a small number of bits, and still preserve acceptable quality. Many coding techniques tend to represent each frame of encoded video completely, thus updating all pixels at each frame, and therefore use too many bits to meet a very low target bit rate. Generally, though, motion compensated prediction is used to reduce the amount of information needed to be coded for each frame. This approach is used in the ISO MPEG-1 and MPEG-2 standards, and in the ITU-T H.261 and H.263 standards. When motion compensated prediction is used between frames in a video sequence, the error in the prediction must be encoded to preserve the quality of the decoded video sequence. This prediction error is referred to as the Displaced Frame Difference, or DFD.
The DFD is generally a nonstationary, high-pass image which consists of error around the edges of moving objects where a motion estimation technique has failed to adequately represent the motion in the video scene. Often, the DFD will also contain large regions of homogeneous error information. This happens when new objects enter the scene, or when objects are displaced by a large amount of motion between video frames. One common approach used to encode the DFD information is the use of a block-wise Discrete Cosine Transform (DCT), followed by entropy encoding of the coefficients of the transform. This approach is suitable when the pixels within each block are well modeled by a first order Markov random process. The DCT is close to the optimal transform (the Karhunen-Loeve Transform) in terms of energy compaction capabilities when the first order Markov model is met. However, when this model breaks down, the DCT can become less efficient than other techniques at coding the DFD image. One such alternative technique is to use an iterative expansion of the DFD over a dictionary of non-orthogonal Gabor, or modulated Gaussian functions, followed by coding of the expansion information. This approach can allow localization of the prediction error coding to only those pixels of the DFD which are perceptually most important. It can also reduce the required bit expenditure for the DFD over a comparable DCT based approach. Technology does not currently exist, however, for using the iterative Gabor expansion for video coding in an efficient and effective manner.
Thus, there is a need for a method, device and microprocessor that provide a computationally efficient iterative expansion of a displaced frame difference, DFD, image for the purposes of video compression.
Brief Description of the Drawings
FIG. 1 is a flow chart of a preferred embodiment of steps of a method in accordance with the present invention.
FIG. 2 is a graphical representation of the residual energy in the displaced frame difference as decomposed in accordance with the present invention.
FIG. 3 is a diagrammatic representation of regions from a predetermined segmentation, and blocks touching those regions in accordance with the present invention.
FIG. 4 is a block diagram of one preferred embodiment of a device in accordance with the present invention.
FIG. 5 is a schematic of one preferred embodiment of a microprocessor in accordance with the present invention.
FIG. 6 is a flow chart of a preferred embodiment of steps of a method in accordance with the present invention.
FIG. 7 is a flow chart of a preferred embodiment of steps of a method in accordance with the present invention.
FIG. 8 is a block diagram of a preferred embodiment of a device in accordance with the present invention.
Detailed Description of a Preferred Embodiment
The present invention provides a method, device and microprocessor for performing a computationally efficient iterative expansion of a displaced frame difference, DFD, image over a special dictionary of two-dimensional non-orthogonal modulated Gaussian, or Gabor functions for the purposes of video compression. The coefficients of this expansion are encoded here using techniques which combine to give an efficient representation of the data at very low bit rates. In addition, the expansion is directed to spend the primary amount of available bits on a perceptually important area of the image being encoded. Finally, the method describes the encoding of chrominance information associated with the DFD.
FIG. 1, numeral 100, is a flow diagram of a preferred embodiment of a method for performing an iterative expansion of a DFD image over a special dictionary of modulated Gaussian functions in accordance with the present invention. This is an iterative process which encodes a projection of the luminance channel of the DFD onto a single basis function and computes a new residual DFD image, at each iteration. For the purposes of this description, the term DFD used here will always refer to the luminance channel of the current residual DFD at the present iteration. The initial residual DFD is set equal to the original DFD obtained from a predetermined motion estimation and compensation technique.
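The iteration described above (project onto one basis function, subtract, repeat) can be illustrated with a small greedy expansion. This is a simplified sketch under stated assumptions: it works on a flattened signal with a tiny hypothetical dictionary of unit-norm rows, it omits quantization and the block-based search, and the names `iterative_expansion`, `max_iters`, and `threshold` are illustrative.

```python
import numpy as np

def iterative_expansion(signal, dictionary, max_iters=10, threshold=1e-3):
    """Greedy expansion: project onto the best unit-norm atom, subtract, repeat."""
    residual = np.array(signal, dtype=float)
    atoms = []
    for _ in range(max_iters):
        coeffs = dictionary @ residual             # inner product with every atom
        k = int(np.argmax(np.abs(coeffs)))         # best matching atom
        if abs(coeffs[k]) < threshold:             # stop on a small coefficient
            break
        atoms.append((k, float(coeffs[k])))
        residual -= coeffs[k] * dictionary[k]      # new residual
    return atoms, residual

# Three unit-norm, non-orthogonal atoms in two dimensions (toy example).
dictionary = np.array([[1.0, 0.0], [0.0, 1.0], [2 ** -0.5, 2 ** -0.5]])
signal = np.array([3.0, 1.0])
atoms, residual = iterative_expansion(signal, dictionary)
```

Because each step removes the projection onto a unit-norm atom, the residual energy cannot increase from one iteration to the next.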
The first step in the method is dividing the DFD image into a plurality of blocks (102). These blocks permit the treatment of the DFD image in a localized manner. The sum of absolute values of the intensity at each pixel within each block is then computed (104). These sums are computed for use in estimating which local region of the DFD image has the highest intensity energy. The next step is weighting the blocks in the DFD according to a predetermined biasing scheme, selected from two options in a predetermined manner (106), which places an emphasis on a perceptually important region of the DFD (108). Where the DFD has been segmented into a plurality of regions according to a predetermined segmentation scheme, this information is used in selecting a perceptually important region (110). Otherwise, a centrally biased weighting scheme is used to represent the perceptually important area of the DFD as the center of the image (112).
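Steps (102) and (104) can be sketched with NumPy. The 16x16 block size and the function name `block_abs_sums` are assumptions for illustration; the 176x144 frame size is the QCIF luminance resolution.

```python
import numpy as np

def block_abs_sums(dfd, block=16):
    """Sum of absolute pixel values in each block x block tile of a 2-D image."""
    h, w = dfd.shape
    if h % block or w % block:
        raise ValueError("image dimensions must be multiples of the block size")
    # Axes after reshape: (block row, row in block, block column, column in block).
    tiles = np.abs(dfd).reshape(h // block, block, w // block, block)
    return tiles.sum(axis=(1, 3))

dfd = np.zeros((144, 176))          # QCIF luminance: 144 rows, 176 columns
dfd[40:44, 100:104] = -5.0          # a small patch of prediction error
sums = block_abs_sums(dfd)
```

Only the block containing the patch has a nonzero sum, which is what lets the method localize the encoding effort.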
The block within the DFD which has the highest weighted energy estimate is then chosen as the best block for concentrating the encoding (114). Using this best block as a starting spatial point, the DFD energy is then encoded using a predetermined hierarchical Gabor function expansion.
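The selection in step (114) amounts to an argmax over weighted block sums. In the sketch below, `central_weights` is a hypothetical stand-in for the centrally biased weighting, used only to show the effect of the bias.

```python
import numpy as np

def central_weights(rows, cols):
    """Hypothetical center-biased weights: 1.0 at the center, 0.25 at the corners."""
    r = 1.0 - np.abs(np.linspace(-1.0, 1.0, rows)) / 2.0
    c = 1.0 - np.abs(np.linspace(-1.0, 1.0, cols)) / 2.0
    return np.outer(r, c)

def best_block(block_sums, weights):
    """Return (row, col) of the block with the largest weighted energy estimate."""
    weighted = block_sums * weights
    return np.unravel_index(np.argmax(weighted), weighted.shape)

block_sums = np.zeros((9, 11))
block_sums[0, 0] = 100.0   # strong error in a corner block
block_sums[4, 5] = 40.0    # weaker error in the central block
row, col = best_block(block_sums, central_weights(9, 11))
```

With these assumed weights the corner block scores 25 and the central block scores 40, so the central block is chosen even though its raw sum is smaller.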
A predetermined dictionary of two-dimensional, normalized, real Gabor functions is used for this method (116). The analytical expression for the functions in this dictionary is given by
$$G_{\alpha,\beta}(i,j) = g_\alpha(i)\,g_\beta(j), \qquad i,j \in \{0,1,\ldots,N-1\}, \qquad \alpha,\beta \in B,$$
$$B = \mathrm{SET}(s,\varsigma,\varphi),$$
where
$i \in \{0,1,\ldots,N-1\}$, $N$ a predetermined positive integer, where $N = 16$, and
$$g(k) = \sqrt[4]{2}\,e^{-\pi k^{2}},$$
$K_\alpha$ is a normalizing constant chosen such that $\|g_\alpha(i)\| = 1$. The predetermined dictionary is completely specified by defining a set of parameters B = SET(s,ς,φ), which are to be used.
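A separable dictionary of this kind can be sketched as follows. The exact one-dimensional expression is not legible in this text, so the sketch assumes the modulated Gaussian form commonly used in matching-pursuit video coding, g_α(i) = K_α g((i - N/2 + 1)/s) cos(2πς(i - N/2 + 1)/N + φ) with g(k) = 2^(1/4) exp(-πk²). That form, the centering offset, and the function names `gabor_1d` and `gabor_2d` are assumptions, not definitions taken from this description.

```python
import numpy as np

N = 16  # support length stated in the description

def gabor_1d(s, xi, phi, n=N):
    """One-dimensional modulated Gaussian, normalized to unit norm (assumed form)."""
    i = np.arange(n)
    t = i - n / 2 + 1                                  # assumed centering of the window
    envelope = 2 ** 0.25 * np.exp(-np.pi * (t / s) ** 2)
    v = envelope * np.cos(2 * np.pi * xi * t / n + phi)
    return v / np.linalg.norm(v)                       # plays the role of K_alpha

def gabor_2d(alpha, beta):
    """Two-dimensional function G(i, j) = g_alpha(i) * g_beta(j)."""
    return np.outer(gabor_1d(*alpha), gabor_1d(*beta))
```

Because each one-dimensional factor has unit norm, the two-dimensional product also has unit norm, so a projection coefficient can be used directly as the atom amplitude.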
The hierarchical Gabor function expansion begins by picking a representative quadrant of the predetermined dictionary of two-dimensional Gabor functions (118). This representative quadrant is determined by finding the best matching Gabor basis function from four predetermined basis functions which represent each quadrant of the dictionary using a predetermined hierarchical pixel search. The best matching Gabor basis function within the representative quadrant of the dictionary is then found using a projection of the DFD signal onto each basis function with the hierarchical pixel search (120), and picking the function associated with the largest valued projection. The projections in steps (118) and (120) are computed only over bases whose center points lie within the best DFD block chosen in step (114).
The predetermined hierarchical pixel search is used in finding the expansion of the DFD at each iteration in (118) and (120). In this search, the projection, or inner product, is computed for the current basis function centered at every nth pixel within the best block in the DFD. For example, if the best block of interest comprised the pixels with indices between 16 and 31 in the x direction and between 16 and 31 in the y direction, and n=2, then the (x,y) centers for computing the projections would be:
(16,16), (16,18), (16,20), ..., (16,30);
(18,16), (18,18), (18,20), ..., (18,30);
...
(28,16), (28,18), (28,20), ..., (28,30);
(30,16), (30,18), (30,20), ..., (30,30).
The projection yielding the largest amplitude from these computations determines the best initial matching pixel position. Next, the pixels in a predetermined neighborhood of the best initial matching pixel position are used as centers of projections of the current Gabor basis function, and the best matching pixel position is found by choosing the position yielding the largest projection value. For example, if the best initial matching pixel position from the previous example was (18,20), and the predetermined neighborhood was +/-1 pixel, then the position search would additionally involve computing projections at the positions:
(17,19), (17,20), (17,21);
(18,19), (18,21);
(19,19), (19,20), (19,21).
The final projection for the current Gabor basis function is chosen as that which has the largest projection value after the neighborhood pixel search. Step (118) uses the hierarchical pixel search for each of the four representative Gabor basis functions. The basis function having maximum final projection among the four functions is chosen as the representative matching Gabor basis function. In step (120), the remaining Gabor basis functions in the representative quadrant of the dictionary are searched using the same hierarchical pixel position search within the best block in the DFD. The Gabor basis function with the largest valued projection after this search is chosen as the current best matching Gabor atom. An atom comprises three key pieces of information which are encoded: the final best matching pixel position, the index of the chosen Gabor basis function in the dictionary, and the value of the corresponding projection coefficient.
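The two-stage position search described above can be sketched for a single basis function as follows. The helper names `projection` and `hierarchical_search`, the convention that an atom of odd or even size is centered at index size // 2, and the use of the absolute value to measure amplitude are assumptions made for illustration.

```python
import numpy as np

def projection(residual, atom, cy, cx):
    """Inner product of the residual with the atom centered at (cy, cx)."""
    ah, aw = atom.shape
    top, left = cy - ah // 2, cx - aw // 2          # assumed centering convention
    y0, x0 = max(top, 0), max(left, 0)
    y1 = min(top + ah, residual.shape[0])
    x1 = min(left + aw, residual.shape[1])
    if y0 >= y1 or x0 >= x1:                        # atom entirely outside the image
        return 0.0
    patch = residual[y0:y1, x0:x1]
    return float(np.sum(patch * atom[y0 - top:y1 - top, x0 - left:x1 - left]))

def hierarchical_search(residual, atom, y_range, x_range, n=2, radius=1):
    """Coarse search at every nth pixel, then a +/-radius refinement."""
    coarse = [(cy, cx)
              for cy in range(y_range[0], y_range[1] + 1, n)
              for cx in range(x_range[0], x_range[1] + 1, n)]
    best = max(coarse, key=lambda c: abs(projection(residual, atom, *c)))
    fine = [(best[0] + dy, best[1] + dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)]
    best = max(fine, key=lambda c: abs(projection(residual, atom, *c)))
    return best, projection(residual, atom, *best)
```

With n=2 and a 16x16 block, the coarse pass evaluates 64 centers and the refinement adds 8 more, compared with 256 evaluations for an exhaustive search of the same block.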
The value of the current projection coefficient is quantized with a uniform, fixed length mid-tread quantizer (122). The limits of this quantizer are defined by using the value of the projection coefficient associated with the atom chosen at the first iteration (124). The residual energy in the DFD has a monotonically decreasing nature with the iterative expansion described here. FIG. 2, numeral 200, shows this behavior graphically.
Because of this characteristic, the first projection coefficient gives the maximum value needing to be represented in the encoded data. The maximum output of the quantizer is the nearest integer value to the projection coefficient in the first iteration. The minimum output value of the quantizer is a predetermined percentage of the maximum output value. The maximum quantizer value is encoded using a predetermined fixed number of bits. For example, the use of 9 bits to encode the maximum value of the quantizer permits the quantization of projection coefficients having an absolute value less than 512, and requires clipping any which have a greater value than this to a value of 511. A suitable choice for setting the minimum quantizer value is 10% of the maximum quantizer value. Thus, after the first iteration of the hierarchical expansion, the projection coefficient quantizer is completely defined, and can be used on all subsequent iterations for the current video frame being encoded.
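How the quantizer limits follow from the first coefficient can be sketched as below. The description fixes the maximum (nearest integer to the first coefficient, clipped to 511 for 9 bits) and the minimum (for example 10% of the maximum); the number of levels, the sign-plus-magnitude representation, and the name `make_quantizer` are assumptions made for illustration only.

```python
def make_quantizer(first_coeff, bits_for_max=9, min_fraction=0.10, levels=32):
    """Build a uniform quantizer whose limits come from the first coefficient."""
    limit = 2 ** bits_for_max - 1                      # 511 for 9 bits
    q_max = min(int(round(abs(first_coeff))), limit)   # clipped maximum value
    q_min = min_fraction * q_max                       # e.g. 10% of the maximum
    # Guard against a zero step when the first coefficient rounds to zero.
    step = (q_max - q_min) / (levels - 1) or 1.0

    def quantize(coeff):
        # Clip the magnitude into [q_min, q_max], then round to the nearest level.
        mag = min(max(abs(coeff), q_min), q_max)
        index = int(round((mag - q_min) / step))
        sign = -1 if coeff < 0 else 1
        return sign, index

    def dequantize(sign, index):
        return sign * (q_min + index * step)

    return q_max, q_min, quantize, dequantize
```

Since later coefficients are not expected to exceed the first one, a quantizer built once per frame in this way covers every subsequent iteration.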
After quantization of the current projection coefficient, a new residual DFD is computed (126). This is the current residual DFD minus the quantized projection of that DFD onto the last atom. This can be expressed as
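$$R^{n+1} = R^{n} - p\,G^{n}_{\alpha,\beta},$$
where $R^{n}$ represents the current residual DFD, $G^{n}_{\alpha,\beta}$ represents the current best matching basis function from the predetermined dictionary, and
$$p = Q\!\left[\langle R^{n}, G^{n}_{\alpha,\beta}\rangle\right],$$
which is the quantized projection of the current residual DFD onto the current best matching dictionary function, $Q[\cdot]$ denoting the quantizer of step (122).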
After this new residual image is computed, the hierarchical expansion process is repeated, steps (102)-(126). In step (104), only those block sums which have been affected by the computation of the current residual DFD need to be recomputed. This iterative expansion continues until a predetermined number of iterations has occurred, or the value of the projection coefficient of the current atom falls below a predetermined threshold (128).
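The residual update of step (126) can be sketched as a single subtraction of the scaled atom at its matched position. The function name `subtract_atom` and the centering convention (the atom is centered at index size // 2) are assumptions made for illustration.

```python
import numpy as np

def subtract_atom(residual, atom, cy, cx, coeff):
    """Return the new residual: current residual minus coeff times the atom at (cy, cx)."""
    out = residual.astype(float).copy()
    ah, aw = atom.shape
    top, left = cy - ah // 2, cx - aw // 2           # assumed centering convention
    y0, x0 = max(top, 0), max(left, 0)
    y1 = min(top + ah, out.shape[0])
    x1 = min(left + aw, out.shape[1])
    if y0 >= y1 or x0 >= x1:                         # atom entirely outside the image
        return out
    out[y0:y1, x0:x1] -= coeff * atom[y0 - top:y1 - top, x0 - left:x1 - left]
    return out
```

Only the pixels under the atom change, which is why only the block sums overlapping that footprint need to be recomputed in step (104).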
After the last iteration, all of the information associated with the expansion must be encoded. This information can be described in terms of the atom, or selected matching basis function, at each iteration. The components of the atoms which need to be encoded are the position, quantized projection coefficients, and dictionary basis indices.
FIG. 6, numeral 600, shows a flow diagram of the encoding process for the atom information associated with the iterative expansion of the DFD image. The spatial position of each atom must be encoded so that the decoder can reconstruct the same residual. This can be accomplished by representing these positions with a differential code which is arithmetically encoded (602), or encoded with a fixed length code (604), the latter being more robust when there is a possibility of errors during transmission of the coded data. In the case of arithmetic coding, a differential vector for each atom position, representing the distance from the last atom in the x and y directions, and from the origin for the first atom, is encoded. This approach allows for the encoding of these positions near the actual entropy of the set of symbols representing the positions because of the non-integer value of bits required to uniquely encode each symbol. One arithmetic code is used for the x component differential codewords, and one is used for the y component differential codewords. An adaptive model of the probability distribution of the symbols is used. The model can be updated with each encoded frame of video.
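The differential representation that feeds the two arithmetic coders in (602) can be sketched as below. Only the formation of the x and y differential symbols is shown; the arithmetic coder and its adaptive model are omitted, and the function names are assumptions made for illustration.

```python
def differential_positions(positions):
    """Map [(x, y), ...] to x and y differences from the previous atom.

    The first atom is measured from the origin (0, 0).
    """
    prev_x, prev_y = 0, 0
    dxs, dys = [], []
    for x, y in positions:
        dxs.append(x - prev_x)
        dys.append(y - prev_y)
        prev_x, prev_y = x, y
    return dxs, dys

def undo_differential(dxs, dys):
    """Decoder side: accumulate the differences back into absolute positions."""
    x, y, out = 0, 0, []
    for dx, dy in zip(dxs, dys):
        x, y = x + dx, y + dy
        out.append((x, y))
    return out
```

The two returned lists correspond to the two symbol streams, one per arithmetic code.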
In the case of fixed length codes for representing the position (604), the plurality of blocks used in step (102) is used to assist in encoding this data as well. For each block, one bit is encoded to specify the presence or absence of any atom positions in that block. For those blocks which have atoms, the positions of those atoms are encoded using a vector for each atom position, representing the distance from the x and y coordinates of the origin of the block. The components of each vector are encoded using a fixed number of bits determined by the block size. For example, for 32x32 blocks, 5 bit codes are required to fully represent the x and y components. A final bit is encoded after each encoded position to signify the presence or absence of more atoms within the block.
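The fixed length position syntax of (604) can be sketched as a bit string builder. The function name `encode_block_positions`, the dictionary keyed by block index, and the ordering of the x component before the y component are assumptions made for illustration; the presence bit, the per-component bit width derived from the block size, and the trailing continuation bit follow the description.

```python
def encode_block_positions(atoms_by_block, num_blocks, block_size=32):
    """Encode atom positions block by block as a string of '0'/'1' characters.

    atoms_by_block maps a block index to a list of (x, y) offsets measured
    from the origin of that block.
    """
    width = (block_size - 1).bit_length()      # 5 bits for 32x32 blocks
    out = []
    for b in range(num_blocks):
        atoms = atoms_by_block.get(b, [])
        out.append('1' if atoms else '0')      # presence or absence of atoms
        for k, (x, y) in enumerate(atoms):
            out.append(format(x, '0{}b'.format(width)))
            out.append(format(y, '0{}b'.format(width)))
            # Final bit after each position: do more atoms follow in this block?
            out.append('1' if k < len(atoms) - 1 else '0')
    return ''.join(out)
```

An empty block costs one bit, and each atom in a 32x32 block costs 11 bits, so the total is fixed by the atom count and does not depend on a probability model.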
The basis indices are encoded with two fixed-length codes (606). For example, if the dictionary is made up of 16 one-dimensional elements, 4 bits uniquely specify one address in the table. Each 2-D function is uniquely represented by the two 4-bit addresses associated with the two one-dimensional functions. The projection coefficient associated with each encoded position is encoded with either an arithmetic (608) or a fixed length (610) coder as well.
The color channels of the DFD are generally of a much more homogeneous nature locally than the luminance channel. Thus, the Gabor function expansion in accordance with the present invention is generally not as efficient as the block DCT for the chrominance channels. The Gabor approach is better suited for thinner, "edge-like" characteristics that are found in the luminance channel, as opposed to the lowpass characteristics of the chrominance channels. For the chrominance channels of the DFD, in step (612), each 8x8 block is transformed into the DCT domain, and the coefficients of this transform are quantized and variable length encoded, using the appropriate syntax provided by the ITU-T Draft International Standard, H.263. The result is a complete encoded DFD which permits very low bit rates for encoding a motion compensated video sequence.
For the biasing of the DFD selected in (106), the absence of a predetermined segmentation indicates the choice of a centrally-oriented bias (112). This involves weighting the pixels of the DFD according to the expression
where DFD(i,j) refers to the three channel image of the displaced frame difference, and the symbol ⌊ ⌋ indicates integer truncation. The constants appearing in the weighting function are given here for QCIF resolution video images, which have a support of 176x144 in the Y channel, and 88x72 in the Cr and Cb channels. These constants can be changed to accommodate any image format by appropriate scaling. In the QCIF resolution, a 16x16 macroblock contains 16x16 pixels from the Y channel, and 8x8 pixels from each of the chrominance channels. As a result, the function above describes a window which maps one-to-one onto the pixels of the Y channel. The chrominance pixels in each macroblock are weighted by the same value as the corresponding pixels from the Y channel in that macroblock. The emphasis in this weighting is placed on macroblocks towards the center of the image, which is perceptually important.
In the presence of a predetermined segmentation, a region oriented bias is selected (110). The choice of which region receives the bias is accomplished through a ranking operation. For the pixels of the DFD within each region provided by the predetermined segmentation, the blocks which touch that region are marked. For example, FIG. 3, numeral 300, shows an example of two predetermined regions in an image, and the blocks touching those regions are highlighted accordingly. For each region, and for those accompanying marked blocks, the average absolute value of the centrally weighted DFD, expressed above, and the average absolute value of the centrally weighted motion vectors for the marked blocks are computed. There is a motion vector associated with each pixel in any motion compensated coding scheme, although, in a block based motion compensation approach, all vectors within one block are identical. This information is available from a predetermined motion estimation and compensation process. These values are normalized by the maximum value for each category: motion vectors in the x direction, dx, motion vectors in the y direction, dy, and DFD pixel intensity values over all the regions provided by the predetermined segmentation. The result is a score for each region, indexed by i, which allows the regions to be ranked by perceptual importance:
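$$\mathrm{Score}(i) = \frac{\mathrm{Avg}_i(|DFD|)}{\mathrm{Max}(DFD)} + \frac{\mathrm{Avg}_i(|dx|)}{\mathrm{Max}(dx)} + \frac{\mathrm{Avg}_i(|dy|)}{\mathrm{Max}(dy)},$$
where DFD(i,j) is defined above, and, in the same fashion, and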
Again, the weighting constants are given in terms of QCIF resolution images, but can be generalized to any resolution.
The region having the greatest score receives the highest rank, and serves as the perceptually important region for biasing the DFD. The pixel values in all blocks of the DFD which do not touch the highest ranking region are set to zero, thus limiting the expansion algorithm to encoding only atoms which lie within the chosen region of interest.
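The ranking and the subsequent zeroing of blocks can be sketched as follows. The per-region averages and the three maxima are taken as inputs, since how they are gathered depends on the predetermined segmentation; the function names, the boolean `touches` array (one entry per block), and the 16x16 block size are assumptions made for illustration.

```python
import numpy as np

def rank_regions(avg_dfd, avg_dx, avg_dy, max_dfd, max_dx, max_dy):
    """Score each region as the sum of its three normalized averages."""
    scores = (np.asarray(avg_dfd, dtype=float) / max_dfd
              + np.asarray(avg_dx, dtype=float) / max_dx
              + np.asarray(avg_dy, dtype=float) / max_dy)
    return scores, int(np.argmax(scores))     # highest score ranks first

def zero_outside_region(dfd, touches, block=16):
    """Set to zero every block that does not touch the highest ranking region."""
    mask = np.repeat(np.repeat(touches, block, axis=0), block, axis=1)
    return np.where(mask, dfd, 0)
```

After the zeroing, the block energy estimates outside the chosen region are zero, so the expansion places atoms only inside the region of interest.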
FIG. 4, numeral 400, is a block diagram of one preferred embodiment of a device for performing an iterative expansion of a DFD image over a special dictionary of modulated Gaussian functions in accordance with the present invention. The device comprises an estimation unit (402), a memory unit (404), a selection unit (406), a quantizer (408), a residual computation unit (410), a comparator (412), a controller (414), and an encoding unit (416). The estimation unit (402) is used for determining which block in the current residual DFD has the highest energy. The memory unit (404), which is coupled to the selection unit (406), is used for storing the predetermined dictionary of Gabor functions and the representative quadrant functions. The selection unit (406), coupled to the estimation unit (402), applies the hierarchical expansion algorithm on the block of interest and selects the best matching atom. The quantizer (408), coupled to the selection unit, is used for quantizing the projection coefficient of the current atom. The residual computation unit (410), coupled to the quantizer, is then used for subtracting out the quantized projection of the current atom, and for computing a new current residual DFD. The comparator (412), which is coupled to the residual computation unit, is used for determining if the iteration termination conditions have been met. The controller (414), which is coupled to the comparator (412), the estimation unit (402), the selection unit (406), the quantizer (408), and the residual computation unit (410), is used for running the iteration and managing the data from the expansion for encoding. The device also contains an encoding unit (416) which is coupled to the controller (414), and encodes the position, basis indices and projection coefficient information of the entire expansion of the DFD.
FIG. 5, numeral 500, is a block diagram of one preferred embodiment of a microprocessor for performing an iterative expansion of a DFD image over a special dictionary of modulated Gaussian functions in accordance with the present invention. The microprocessor comprises an estimation unit (502), a memory unit (504), a selection unit (506), a quantizer (508), a residual computation unit (510), a comparator (512), a controller (514), and an encoding unit (516). The estimation unit (502) is used for determining which block in the current residual DFD has the highest energy. The memory unit (504), which is coupled to the selection unit (506), is used for storing the predetermined dictionary of Gabor functions and the representative quadrant functions. The selection unit (506), coupled to the estimation unit (502) and to the memory unit (504), applies the hierarchical expansion algorithm on the block of interest and selects the best matching atom. The quantizer (508), coupled to the selection unit, is used for quantizing the projection coefficient of the current atom. The residual computation unit (510), coupled to the quantizer, is then used for subtracting out the quantized projection of the current atom, and for computing a new current residual DFD. The comparator (512), which is coupled to the residual computation unit, is used for determining if the iteration termination conditions have been met. The controller (514), which is coupled to the comparator (512), the estimation unit (502), the selection unit (506), the quantizer (508), and the residual computation unit (510), is used for running the iteration and managing the data from the expansion for encoding. The microprocessor also contains an encoding unit (516) which is coupled to the controller (514), and encodes the position, basis indices and projection coefficient information of the entire expansion of the DFD.
FIG. 7, numeral 700, is a flow chart showing a preferred embodiment of steps of a method in accordance with the present invention. The method for encoding includes the steps of: A) utilizing (702) a predetermined center/central biased weighting scheme for weighting, for each iteration, block sums to provide a selected block; B) determining (704), for each iteration, a best atom having a center which lies within the selected block using a predetermined hierarchical Gabor function search technique wherein predetermined Gabor functions are utilized from a memory; and C) utilizing (706) an energy adaptive dynamic quantization of Gabor basis coefficients from the best atom of each iteration to provide a minimized bit representation of coefficients for a displaced frame difference. The method for decoding includes the steps of: A) utilizing (708) an inverse quantization of Gabor basis coefficients defined by the parameters in the energy adaptive dynamic quantization used in the encoder (706); B) projecting (710), for each decoded atom, the quantized Gabor basis coefficient of that atom onto the decoded Gabor basis function of that atom; and C) reconstructing (712) the quantized Gabor expansion of the displaced frame difference by summing all of the projections computed in step (710). The method is described with greater particularity above.
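Decoding steps (710) and (712) amount to summing scaled basis functions at their decoded positions, which can be sketched as below. The tuple layout `(cy, cx, a, b, coeff)`, the name `reconstruct_dfd`, and the centering convention are assumptions made for illustration; `a` and `b` stand for the two one-dimensional dictionary addresses of an atom.

```python
import numpy as np

def reconstruct_dfd(shape, atoms, dictionary_1d):
    """Sum coeff * G over all decoded atoms to rebuild the quantized expansion."""
    out = np.zeros(shape, dtype=float)
    for cy, cx, a, b, coeff in atoms:
        # Step (710): scale the decoded two-dimensional basis function.
        g = np.outer(dictionary_1d[a], dictionary_1d[b])
        gh, gw = g.shape
        top, left = cy - gh // 2, cx - gw // 2      # assumed centering convention
        y0, x0 = max(top, 0), max(left, 0)
        y1, x1 = min(top + gh, shape[0]), min(left + gw, shape[1])
        # Step (712): accumulate, clipping the footprint to the image.
        if y0 < y1 and x0 < x1:
            out[y0:y1, x0:x1] += coeff * g[y0 - top:y1 - top, x0 - left:x1 - left]
    return out
```

The decoder needs only the stored dictionary and the three items carried by each atom; no search is performed.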
FIG. 8, numeral 800, is a block diagram of a preferred embodiment of steps of a device in accordance with the present invention. The device for encoding includes: A) a center/central biased estimator (802), coupled to receive a displaced frame difference, for weighting, for each iteration, block sums utilizing a predetermined center/central biased weighting scheme to provide a selected block; B) a best atom selector (804), coupled to the center/central biased estimator and a memory unit (806) having at least stored predetermined Gabor functions, for determining, for each iteration, a best atom having a center which lies within the selected block using a predetermined hierarchical Gabor function search technique; and C) an energy adaptive dynamic quantization unit (808), coupled to the best atom selector (804), for utilizing an energy adaptive dynamic quantization of Gabor basis coefficients from the best atom of each iteration to provide a minimized bit representation of coefficients for a displaced frame difference. The device for decoding includes A) an inverse quantization unit (810) for decoding Gabor basis coefficients defined by the parameters in the energy adaptive dynamic quantization unit used in the encoder (808); B) a computation unit (812), coupled to a memory unit (814) having at least stored predetermined Gabor functions, for projecting, for each decoded atom, the quantized Gabor basis coefficient of that atom onto the decoded Gabor basis function of that atom; and C) a summation unit (816) coupled to the computation unit (812) for reconstructing the quantized Gabor expansion of the displaced frame difference by summing all of the projections computed by the computation unit (812). The operation of the device is described with greater particularity above.
The method and device may be selected to be embodied in at least one of: A) an application specific integrated circuit; B) a field programmable gate array; C) a microprocessor; and D) a computer-readable memory; arranged and configured to determine the first modified received signal having minimized distortion and interference in accordance with the scheme described in greater detail above.
Although exemplary embodiments are described above, it will be obvious to those skilled in the art that many alterations and modifications may be made without departing from the invention. Accordingly, it is intended that all such alterations and modifications be included within the spirit and scope of the invention as defined in the appended claims.
We claim:

Claims

1. An iterative method for encoding a displaced frame difference, DFD, in a video sequence, wherein the displaced frame difference is divided into a plurality of blocks, and wherein an absolute value of the displaced frame difference is summed within each block to provide a block sum, comprising the steps of at least one of 1A-1C:
1A) weighting, for each iteration, block sums wherein,
1A1) where the displaced frame difference has been segmented into a plurality of regions,
1A1a) ranking each region and
1A1b) weighting each block sum with a bias to a center of a highest ranking region; and
1A1c) selecting and storing in memory a block with a highest weighted sum of displaced frame difference values to be a selected block, and
1A2) where the displaced frame difference is unsegmented,
1A2a) weighting the block sums with a central bias;
1A2b) selecting and storing in memory the block with a highest weighted sum of displaced frame difference values to be the selected block; and
1B) determining, for each iteration, a best atom having a center which lies within the selected block using a predetermined hierarchical Gabor function search technique, wherein predetermined Gabor basis functions are stored in a memory and wherein:
1B1) for each of four predetermined Gabor basis functions, each representing a quadrant of a predetermined dictionary of Gabor functions,
1B1a) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at every nth selected pixel within the selected block, n a preselected positive integer, to provide an initial best matching pixel;
1B1b) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at a predetermined neighborhood of the initial best matching pixel to provide a best match inner product for the predetermined Gabor function;
1B2) selecting a representative predetermined Gabor function from the four predetermined Gabor functions based on a highest best match inner product and providing a representative quadrant;
1B3) for each predetermined Gabor function in the representative quadrant,
1B3a) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at every nth selected pixel within the selected block, n a preselected positive integer, to provide another best matching pixel;
1B3b) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at a predetermined neighborhood of the other best matching pixel to provide a best atom with a largest inner product; and
1C) utilizing an energy adaptive dynamic quantization of Gabor basis coefficients from the best atom of each iteration to provide a minimized bit representation of the coefficients, wherein the energy adaptive dynamic quantization of Gabor basis coefficients includes:
1C1) determining a maximum quantization value that is equal to the inner product of the best atom of a first iteration;
1C2) determining a minimum quantization value that is equal to a predetermined percentage of the maximum quantization value; and
1C3) forming a quantized projection of a current atom utilizing the maximum quantization value and the minimum quantization value with a predetermined number of quantization levels and applying the quantized projection of the current atom to subsequent iterations.
2. The method of claim 1, wherein at least one of 2A-2C:
2A) for step 1B the predetermined dictionary of Gabor functions which are each a two dimensional product of two one dimensional functions represented by the analytical expression
$$G_{\alpha,\beta}(i,j) = g_\alpha(i)\,g_\beta(j), \qquad i,j \in \{0,1,\ldots,N-1\}, \qquad \alpha,\beta \in B,$$
$$B = \mathrm{SET}(s,\varsigma,\varphi),$$
N a predetermined positive integer, where
$i \in \{0,1,\ldots,N-1\}$, and $g(k) = \sqrt[4]{2}\,e^{-\pi k^{2}}$, and wherein the values of s, ς, and φ in the predetermined dictionary are given by:
s ς Φ
2 0 0
3 0 0
4 0 0
5 0 0
6 0 0
8 0 0
10 0 0
11 0 0
1 1 π/2
5 1 π/2
11 1 π/2
10 3 0
8 2 0
4 2 0
4 2 π/4
6 4 π/4
and the four representative functions from the dictionary are given by the four two dimensional products of the functions with values s, ς, and φ given by:
s ς φ
11 0 0
1 1 π/2
2B) for step 1A1, ranking is determined by:
2B1) for the pixels which lie within each region provided by a predetermined segmentation of the DFD, the blocks which touch that region are identified as the blocks associated with that region;
2B2) for those blocks associated with each region, the average absolute value of the centrally weighted DFD, and the average absolute value of the centrally weighted motion vectors for those blocks are computed;
2B3) an average absolute value of the centrally weighted DFD, and the average absolute value of the centrally weighted motion vectors are normalized by the maximum value for each category: motion vectors in the x direction, dx, motion vectors in the y direction, dy, and DFD pixels, over all the regions provided by the predetermined segmentation; and
2B4) a sum of the normalized values is computed according to:
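$$\mathrm{Score}(i) = \frac{\mathrm{Avg}_i(|DFD|)}{\mathrm{Max}(DFD)} + \frac{\mathrm{Avg}_i(|dx|)}{\mathrm{Max}(dx)} + \frac{\mathrm{Avg}_i(|dy|)}{\mathrm{Max}(dy)},$$
where the subscript i refers to that measure within blocks that touch region i, and where DFD(i,j) is defined as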
and, in the same fashion,
and where the constants given in the weighting function are expressed in terms of QCIF resolution images, and may be generalized to any resolution;
2C) wherein the predetermined hierarchical Gabor function search technique includes:
2C1) for each of four predetermined Gabor basis functions, each representing a quadrant of a predetermined dictionary of Gabor functions,
2C1a) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at every nth selected pixel within the selected block, n a preselected positive integer, to provide an initial best matching pixel;
2C1b) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at a predetermined neighborhood of the initial best matching pixel to provide a best match inner product for the predetermined Gabor function;
2C2) selecting a representative predetermined Gabor function from the four predetermined Gabor functions based on a highest best match inner product and providing the representative quadrant;
2C3) for each predetermined Gabor function in the quadrant,
2C3a) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at every nth selected pixel within the selected block, n a preselected positive integer, to provide another best matching pixel;
2C3b) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at a predetermined neighborhood of the other best matching pixel to provide a best atom with a largest inner product.
3. A device for encoding a displaced frame difference, DFD, in a video sequence, wherein the displaced frame difference is divided into a plurality of blocks, wherein an absolute value of the displaced frame difference is summed within each block to provide a block sum, comprising:
3A) an estimation unit, coupled to receive the displaced frame difference, to a controller, and to a comparator, which weights, for each iteration, the block sums wherein the estimation unit is utilized for:
3A1) where the displaced frame difference has been segmented into a plurality of regions,
3A1a) ranking each region and
3A1b) weighting each block sum with a bias to a center of a highest ranking region; and
3A1c) selecting a block with a highest weighted sum of displaced frame difference values to be a selected block, and
3A2) where the displaced frame difference is unsegmented,
3A2a) weighting the block sums with a central bias;
3A2b) selecting a block with a highest weighted sum of displaced frame difference values to be the selected block; and
3B) a selection unit, coupled to the estimation unit, to the controller, and to a memory unit that stores a predetermined dictionary of Gabor functions and representative quadrant functions, for determining, for each iteration, a best atom having a center which lies within the selected block using a predetermined hierarchical Gabor function search technique, wherein the selector unit is utilized for:
3B1) for each of four predetermined Gabor basis functions, each representing a quadrant of a predetermined dictionary of Gabor functions,
3B1a) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at every nth selected pixel within the selected block, n a preselected positive integer, to provide an initial best matching pixel;
3B1b) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at a predetermined neighborhood of the initial best matching pixel to provide a best match inner product for the predetermined Gabor function;
3B2) selecting a representative predetermined Gabor function from the four predetermined Gabor functions based on a highest best match inner product and providing the representative quadrant;
3B3) for each predetermined Gabor function in the representative quadrant,
3B3a) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at every nth selected pixel within the selected block, n a preselected positive integer, to provide another best matching pixel;
3B3b) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at a predetermined neighborhood of the other best matching pixel to provide a best atom with a largest inner product;
3C) a quantizer, coupled to a selection unit and to the controller, for utilizing energy adaptive dynamic quantization of Gabor basis coefficients including:
3C1) determining a maximum quantization value that is equal to the inner product of the best atom of a first iteration;
3C2) determining a minimum quantization value that is equal to a predetermined percentage of the maximum quantization value; and
3C3) forming a quantized projection of a current atom utilizing the maximum quantization value and the minimum quantization value with a predetermined number of quantization levels and applying the quantized projection of the current atom to subsequent iterations;
3D) a residual computation unit, coupled to the quantizer and to the controller, for subtracting the quantized projection of the current atom, and for computing a new current residual displaced frame difference;
3E) the comparator, coupled to the residual computation unit and to the controller, for determining whether predetermined iteration termination conditions have been met; and
3F) the controller, coupled to the estimation unit, the selection unit, the quantizer, the residual computation unit and the comparator, for controlling iteration and controlling output from the residual computation unit to provide a minimized bit representation of coefficients to an encoding unit; and
3G) an encoding unit, coupled to the controller, for encoding a position, basis indices and projection coefficient information for an expansion of the displaced frame difference.
4. The device of claim 3 wherein at least one of 4A-4C:
4A) in 3B, wherein the memory unit holds a predetermined dictionary of Gabor functions which are each a two dimensional product of two one dimensional functions represented by the analytical expression
$$G_{\alpha,\beta}(i,j) = g_\alpha(i)\,g_\beta(j), \qquad i,j \in \{0,1,\ldots,N-1\}, \qquad \alpha,\beta \in B,$$
$$B = \mathrm{SET}(s,\varsigma,\varphi),$$
N a predetermined positive integer, where
$i \in \{0,1,\ldots,N-1\}$, and $g(k) = \sqrt[4]{2}\,e^{-\pi k^{2}}$, and wherein the values of s, ς, and φ in the predetermined dictionary are given by:
and the four representative functions from the dictionary are given by the four two dimensional products of the functions with values s, ς, and φ given by:
s ς φ
11 0 0
1 1 π/2
4B) wherein, for step 3A1, the estimation unit determines ranking by:
4B1 ) for the pixels which lie within each region provided by a predetermined segmentation of the DFD, the blocks which touch that region are identified as blocks associated with that region;
4B2) for those marked blocks associated with each region, the average absolute value of the centrally weighted DFD, and the average absolute value of the centrally weighted motion vectors for those blocks are computed;
4B3) an average absolute value of the centrally weighted DFD, and an average absolute value of the centrally weighted motion vectors are normalized by the maximum value for each category: motion vectors in the x direction, dx, motion vectors in the y direction, dy, and DFD pixels, over all the regions provided by the predetermined segmentation;
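4B4) a sum of the normalized values is computed according to:
$\mathrm{Score}(i) = \frac{\mathrm{Avg}(|DFD_i|)}{\mathrm{Max}(DFD)} + \frac{\mathrm{Avg}(|dx_i|)}{\mathrm{Max}(dx)} + \frac{\mathrm{Avg}(|dy_i|)}{\mathrm{Max}(dy)}$
where the subscript i refers to that measure within blocks that touch region i, and where DFD(i,j) is defined as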
and, in the same fashion,
and where the constants given in the weighting function are expressed in terms of QCIF resolution images, and may be generalized to any resolution;
4C) wherein the predetermined hierarchical Gabor function search technique implemented by the selection unit includes:
4C1) for each of four predetermined Gabor basis functions, each representing a quadrant of a predetermined dictionary of Gabor functions,
4C1a) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at every nth selected pixel within the selected block, n a preselected positive integer, to provide an initial best matching pixel;
4C1b) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at a predetermined neighborhood of the initial best matching pixel to provide a best match inner product for the predetermined Gabor function;
4C2) selecting a representative predetermined Gabor function from the four predetermined Gabor functions based on a highest best match inner product and providing the representative quadrant;
4C3) for each predetermined Gabor function in the quadrant,
4C3a) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at every nth selected pixel within the selected block, n a preselected positive integer, to provide another best matching pixel;
4C3b) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at a predetermined neighborhood of the other best matching pixel to provide a best atom with a largest inner product.
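The two-stage search of elements 4C1-4C3 can be sketched as below. This is an illustrative sketch under stated assumptions, not the claimed implementation: the helper names, the `(y0, x0, size)` block layout, the convention that the first atom in each quadrant list is that quadrant's representative, the use of the absolute inner product as the match measure, and the default values of `n` and `radius` are all hypothetical.

```python
def inner_product(dfd, atom, cy, cx):
    """Inner product of a square atom centred at (cy, cx) with the DFD.

    Samples of the atom that fall outside the frame contribute zero.
    """
    h, w = len(dfd), len(dfd[0])
    half = len(atom) // 2
    total = 0.0
    for a, row in enumerate(atom):
        for b, val in enumerate(row):
            y, x = cy + a - half, cx + b - half
            if 0 <= y < h and 0 <= x < w:
                total += val * dfd[y][x]
    return total


def best_position(dfd, atom, block, n=2, radius=1):
    """Coarse scan at every n-th pixel of the block, then a local refinement."""
    y0, x0, size = block
    coarse = [(y, x) for y in range(y0, y0 + size, n)
              for x in range(x0, x0 + size, n)]
    cy, cx = max(coarse, key=lambda p: abs(inner_product(dfd, atom, *p)))
    # Refinement neighbourhood, kept inside the selected block (assumption).
    fine = [(y, x) for y in range(cy - radius, cy + radius + 1)
            for x in range(cx - radius, cx + radius + 1)
            if y0 <= y < y0 + size and x0 <= x < x0 + size]
    pos = max(fine, key=lambda p: abs(inner_product(dfd, atom, *p)))
    return abs(inner_product(dfd, atom, *pos)), pos


def hierarchical_search(dfd, quadrants, block, n=2, radius=1):
    """quadrants: four lists of atoms; quadrants[q][0] is the representative."""
    # Stage 1: score only the four representative functions.
    rep_scores = [best_position(dfd, q[0], block, n, radius)[0] for q in quadrants]
    q_idx = rep_scores.index(max(rep_scores))
    # Stage 2: search every function of the winning quadrant.
    best = None
    for a_idx, atom in enumerate(quadrants[q_idx]):
        score, pos = best_position(dfd, atom, block, n, radius)
        if best is None or score > best[0]:
            best = (score, q_idx, a_idx, pos)
    return best
```

The point of the hierarchy is cost: only four functions are tried against the whole block before the search narrows to one quadrant of the dictionary, and each function is evaluated on a subsampled grid before a small neighbourhood is examined. The subsampling can miss a very narrow feature that falls between grid points, which is the usual trade-off of a coarse-to-fine search.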
5. A microprocessor for encoding a displaced frame difference, DFD, in a video sequence, wherein the displaced frame difference is divided into a plurality of blocks, wherein an absolute value of the displaced frame difference is summed within each block to provide a block sum, comprising the steps of at least one of 5A-5C:
5A) an estimation unit which weights, for each iteration, the block sums wherein,
5A1) where the displaced frame difference has also been segmented into a plurality of regions,
5A1a) ranking each region and
5A1b) weighting each block sum with a bias to a center of a highest ranking region; and
5A1c) selecting a block with a highest weighted sum of displaced frame difference values to be a selected block, and
5A2) where the displaced frame difference is unsegmented,
5A2a) weighting the block sums with a central bias;
5A2b) selecting a block with a highest weighted sum of displaced frame difference values to be the selected block; and
5B) a selection unit, coupled to a memory unit, which determines, for each iteration, a best atom having a center which lies within the selected block using a predetermined hierarchical Gabor function search technique, wherein
5B1) for each of four predetermined Gabor basis functions, each representing a quadrant of a predetermined dictionary of Gabor functions,
5B1a) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at every nth selected pixel within the selected block, n a preselected positive integer, to provide an initial best matching pixel;
5B1b) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at a predetermined neighborhood of the initial best matching pixel to provide a best match inner product for the predetermined Gabor function;
5B2) selecting a representative predetermined Gabor function from the four predetermined Gabor functions based on a highest best match inner product and providing the representative quadrant;
5B3) for each predetermined Gabor function in the representative quadrant,
5B3a) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at every nth selected pixel within the selected block, n a preselected positive integer, to provide another best matching pixel;
5B3b) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at a predetermined neighborhood of the other best matching pixel to provide a best atom with a largest inner product; and
5C) a quantizer, coupled to a selection unit and a residual computation unit, utilizing an energy adaptive dynamic quantization of Gabor basis coefficients from the best atom of each iteration to provide a minimized bit representation of the coefficients, wherein the energy adaptive dynamic quantization of Gabor basis coefficients includes:
5C1) determining a maximum quantization value that is equal to the inner product of the best atom of a first iteration;
5C2) determining a minimum quantization value that is equal to a predetermined percentage of the maximum quantization value; and
5C3) forming a quantizer utilizing the maximum quantization value and the minimum quantization value with a predetermined number of quantization levels and applying the quantizer to subsequent iterations.
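The block selection of element 5A can be sketched as follows. The claim does not give the weighting function, so the linear distance falloff used here, together with the names `block_sums`, `select_block`, `centre`, and `falloff`, is purely a hypothetical stand-in for the predetermined center-biased weighting scheme.

```python
import math


def block_sums(dfd, block_size):
    """Sum |DFD| inside each block; keys are (block_row, block_col)."""
    sums = {}
    for y, row in enumerate(dfd):
        for x, value in enumerate(row):
            key = (y // block_size, x // block_size)
            sums[key] = sums.get(key, 0.0) + abs(value)
    return sums


def select_block(dfd, block_size, centre=None, falloff=0.5):
    """Return the block whose weighted |DFD| sum is largest.

    centre defaults to the frame centre (unsegmented case, 5A2); passing the
    centre of the highest ranking region gives the segmented case (5A1).
    """
    h, w = len(dfd), len(dfd[0])
    if centre is None:
        centre = ((h - 1) / 2.0, (w - 1) / 2.0)
    max_dist = math.hypot(h, w) / 2.0
    best_key, best_val = None, -1.0
    for (by, bx), total in block_sums(dfd, block_size).items():
        cy = by * block_size + (block_size - 1) / 2.0
        cx = bx * block_size + (block_size - 1) / 2.0
        dist = math.hypot(cy - centre[0], cx - centre[1])
        weight = 1.0 - falloff * dist / max_dist   # hypothetical linear bias
        if weight * total > best_val:
            best_key, best_val = (by, bx), weight * total
    return best_key
```

With `falloff=0` every block has weight 1 and the selection reduces to the plain largest block sum, which makes the effect of the bias easy to compare.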
6. The microprocessor of claim 5 B, wherein at least one of 6A-6C:
6A) wherein the memory unit holds a predetermined dictionary of Gabor functions which are each a two dimensional product of two one dimensional functions represented by the analytical expression
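$G_{\alpha,\beta}(i,j) = g_\alpha(i)\, g_\beta(j)$, $i, j \in \{0, 1, \ldots, N-1\}$, $\alpha, \beta \in B$,
$B = \mathrm{SET}(s, \varsigma, \phi)$,
N a predetermined positive integer, where
$i \in \{0, 1, \ldots, N-1\}$, and $g(k) = \sqrt[4]{2}\, e^{-\pi k^{2}}$, and wherein the values of s, ς, and φ in the predetermined dictionary are given by: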
and the four representative functions from the dictionary are given by the four two dimensional products of the functions with values s, ς, and φ given by:
s ς φ
1 1 0 0
6B) wherein the estimator determines, for step 5A1, ranking by:
6B1) for the pixels which lie within each region provided by a predetermined segmentation of the DFD, the blocks which touch that region are identified as blocks associated with that region;
6B2) for those marked blocks associated with each region, the average absolute value of the centrally weighted DFD, and the average absolute value of the centrally weighted motion vectors for those blocks are computed;
6B3) an average absolute value of the centrally weighted DFD, and the average absolute value of the centrally weighted motion vectors are normalized by the maximum value for each category: motion vectors in the x direction, dx, motion vectors in the y direction, dy, and DFD pixels, over all the regions provided by the predetermined segmentation;
6B4) a sum of the normalized values is computed according to:
$\mathrm{Score}(i) = \frac{\mathrm{Avg}(|DFD_i|)}{\mathrm{Max}(DFD)} + \frac{\mathrm{Avg}(|dx_i|)}{\mathrm{Max}(dx)} + \frac{\mathrm{Avg}(|dy_i|)}{\mathrm{Max}(dy)}$,
where the subscript i refers to that measure within blocks that touch region i, and where DFD(i,j) is defined as
and, in the same fashion,
and where the constants given in the weighting function are expressed in terms of QCIF resolution images, and may be generalized to any resolution; and
6C) wherein the predetermined hierarchical Gabor function search technique implemented by the selection unit includes:
6C1) for each of four predetermined Gabor basis functions, each representing a quadrant of a predetermined dictionary of Gabor functions,
6C1a) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at every nth selected pixel within the selected block, n a preselected positive integer, to provide an initial best matching pixel;
6C1b) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at a predetermined neighborhood of the initial best matching pixel to provide a best match inner product for the predetermined Gabor function;
6C2) selecting a representative predetermined Gabor function from the four predetermined Gabor functions based on a highest best match inner product and providing the representative quadrant;
6C3) for each predetermined Gabor function in the quadrant,
6C3a) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at every nth selected pixel within the selected block, n a preselected positive integer, to provide another best matching pixel;
6C3b) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at a predetermined neighborhood of the other best matching pixel to provide a best atom with a largest inner product.
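The region ranking of elements 6B3-6B4 (normalize each per-region average by the maximum of its category, then sum the three normalized values) can be sketched as below. The function name `rank_regions` and its list-based inputs are assumptions, and the central weighting that produces the per-region averages is taken as already applied, because its defining expression is not legible in this text.

```python
def rank_regions(avg_dfd, avg_dx, avg_dy):
    """Score and rank regions from per-region average absolute values.

    Each argument is a list with one entry per region, assumed to be the
    centrally weighted averages described in 6B2.
    """
    def normalise(values):
        peak = max(values)
        # Guard against an all-zero category to avoid division by zero.
        return [v / peak if peak > 0 else 0.0 for v in values]

    n_dfd, n_dx, n_dy = normalise(avg_dfd), normalise(avg_dx), normalise(avg_dy)
    scores = [a + b + c for a, b, c in zip(n_dfd, n_dx, n_dy)]
    # Region indices from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return scores, order
```

Because each term is divided by its category maximum, every region scores between 0 and 3, and a region can rank first through large residual energy, large motion, or a combination of both.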
7. A method for iteratively encoding/decoding a displaced frame difference, DFD, in a video sequence, wherein the displaced frame difference is divided into a plurality of blocks, and wherein an energy estimate of the displaced frame difference is computed within each block to provide a block energy metric, comprising the steps of at least one of 7A-7C and 7D-7F:
for an encoder:
7A) utilizing a predetermined center/central biased weighting scheme for weighting, for each iteration, block energy metrics to provide a selected block;
7B) determining, for each iteration, a best atom having a center which lies within the selected block using a predetermined hierarchical Gabor function search technique wherein predetermined Gabor functions are utilized from a memory; and
7C) utilizing an energy adaptive dynamic quantization of Gabor basis coefficients from the best atom of each iteration to provide a minimized bit representation of coefficients for a displaced frame difference; and
for a decoder:
7D) utilizing, in the decoder, an inverse quantization of Gabor basis coefficients defined by the parameters in the energy adaptive dynamic quantization used in the encoder;
7E) projecting, for each decoded atom, the quantized Gabor basis coefficient of that atom onto the decoded Gabor basis function of that atom; and
7F) reconstructing the quantized Gabor expansion of the displaced frame difference by summing all of the projections computed in step 7E.
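The decoder steps 7D-7F can be sketched as follows. This is a minimal sketch under stated assumptions: the tuple layout of a decoded atom, the function name `reconstruct_dfd`, and the uniform inverse quantization rule with parameters `q_min` and `step` are hypothetical, chosen only to show how the decoded coefficients are projected onto their basis functions and summed.

```python
def reconstruct_dfd(height, width, atoms, q_min, step):
    """Rebuild the quantized Gabor expansion of the DFD.

    atoms: list of (level_index, sign, basis, cy, cx), where basis is a
    square 2-D list centred at pixel (cy, cx). The layout is an assumption.
    """
    recon = [[0.0] * width for _ in range(height)]
    for index, sign, basis, cy, cx in atoms:
        # 7D: inverse quantization (uniform levels assumed).
        coeff = sign * (q_min + index * step)
        half = len(basis) // 2
        for a, row in enumerate(basis):
            for b, val in enumerate(row):
                y, x = cy + a - half, cx + b - half
                if 0 <= y < height and 0 <= x < width:
                    # 7E: scale the basis function; 7F: accumulate the sum.
                    recon[y][x] += coeff * val
    return recon
```

The decoder needs no search: each atom arrives as a position, a basis index, and a quantized coefficient, so reconstruction is a single weighted sum over the decoded atoms.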
8. The method of claim 7 wherein at least one of 8A-8D:
8A) wherein the center/central biased weighting scheme of step 7A includes:
8A1) where the displaced frame difference has been segmented into a plurality of regions,
8A1a) ranking each region and
8A1b) weighting each block sum with a bias to a center of a highest ranking region; and
8A1c) selecting a block with a highest weighted sum of displaced frame difference values to be a selected block, and
8A2) where the displaced frame difference is unsegmented,
8A2a) weighting the block sums with a central bias;
8A2b) selecting the block with a highest weighted sum of displaced frame difference values to be the selected block;
8B) wherein the predetermined hierarchical Gabor function search technique of step 7B includes:
8B1) for each of four predetermined Gabor basis functions, each representing a quadrant of a predetermined dictionary of Gabor functions,
8B1a) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at every nth selected pixel within the selected block, n a preselected positive integer, to provide an initial best matching pixel;
8B1b) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at a predetermined neighborhood of the initial best matching pixel to provide a best match inner product for the predetermined Gabor function;
8B2) selecting a representative predetermined Gabor function from the four predetermined Gabor functions based on a highest best match inner product and providing a representative quadrant;
8B3) for each predetermined Gabor function in the representative quadrant,
8B3a) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at every nth selected pixel within the selected block, n a preselected positive integer, to provide another best matching pixel;
8B3b) computing an inner product between the predetermined Gabor basis function and the displaced frame difference at a predetermined neighborhood of the other best matching pixel to provide a best atom with a largest inner product;
8C) wherein the energy adaptive dynamic quantization of Gabor basis coefficients of step 7C includes:
8C1) determining a maximum quantization value that is equal to the inner product of the best atom of a first iteration;
8C2) determining a minimum quantization value that is equal to a predetermined percentage of the maximum quantization value; and
8C3) forming a quantized projection of a current atom utilizing the maximum quantization value and the minimum quantization value with a predetermined number of quantization levels and applying the quantizer to subsequent iterations; and
8D) wherein the method is a process whose steps are embodied in at least one of:
8D1) an application specific integrated circuit;
8D2) a field programmable gate array;
8D3) a microprocessor; and
8D4) a computer-readable memory;
arranged and configured to determine the first modified received signal having minimized distortion and interference in accordance with the scheme of claim 7.
9. A device for encoding/decoding a displaced frame difference, DFD, in a video sequence, wherein the displaced frame difference is divided into a plurality of blocks, wherein an absolute value of the displaced frame difference is summed within each block to provide a block sum, comprising at least one of an encoder and a decoder:
wherein the encoder comprises:
9A) a center/central biased estimator, for weighting, for each iteration, block sums utilizing a predetermined center/central biased weighting scheme to provide a selected block;
9B) a best atom selector, coupled to the estimator and a memory unit having at least stored predetermined Gabor functions, for determining, for each iteration, a best atom having a center which lies within the selected block using a predetermined hierarchical Gabor function search technique; and
9C) an energy adaptive dynamic quantization unit, coupled to the selector unit, for utilizing an energy adaptive dynamic quantization of Gabor basis coefficients from the best atom of each iteration to provide a minimized bit representation of coefficients for a displaced frame difference;
and the decoder comprises:
9D) an inverse quantization unit for decoding Gabor basis coefficients defined by the parameters in the energy adaptive dynamic quantization unit used in the encoder;
9E) a computation unit for projecting, for each encoded atom, the quantized Gabor basis coefficient of that atom onto the decoded Gabor basis function of that atom; and
9F) a summation unit for reconstructing the quantized Gabor expansion of the displaced frame difference by summing all of the projections computed by the computation unit in step 9E.
10. The device of claim 9 wherein at least one of 10A-10D:
10A) wherein the estimator further:
10A1) where the displaced frame difference has also been segmented into a plurality of regions,
10A1a) ranks each region and
10A1b) weights each block sum with a bias to a center of a highest ranking region; and
10A1c) selects a block with a highest weighted sum of displaced frame difference values to be a selected block, and
10A2) where the displaced frame difference is unsegmented,
10A2a) weights the block sums with a central bias; and
10A2b) selects a block with a highest weighted sum of displaced frame difference values to be the selected block;
10B) wherein the selector further:
10B1) for each of four predetermined Gabor basis functions, each representing a quadrant of a predetermined dictionary of Gabor functions,
10B1a) computes an inner product between the predetermined Gabor basis function and the displaced frame difference at every nth selected pixel within the selected block, n a preselected positive integer, to provide an initial best matching pixel;
10B1b) computes an inner product between the predetermined Gabor basis function and the displaced frame difference at a predetermined neighborhood of the initial best matching pixel to provide a best match inner product for the predetermined Gabor function;
10B2) selects a representative predetermined Gabor function from the four predetermined Gabor functions based on a highest best match inner product and providing the representative quadrant;
for each predetermined Gabor function in the representative quadrant,
10B2a) computes an inner product between the predetermined Gabor basis function and the displaced frame difference at every nth selected pixel within the selected block, n a preselected positive integer, to provide another best matching pixel;
10B2b) computes an inner product between the predetermined Gabor basis function and the displaced frame difference at a predetermined neighborhood of the other best matching pixel to provide a best atom with a largest inner product;
10C) wherein energy adaptive dynamic quantization of Gabor basis coefficients by the quantization unit includes:
10C1) determining a maximum quantization value that is equal to the inner product of the best atom of a first iteration;
10C2) determining a minimum quantization value that is equal to a predetermined percentage of the maximum quantization value; and
10C3) forming a quantized projection of a current atom utilizing the maximum quantization value and the minimum quantization value with a predetermined number of quantization levels and applying the quantizer to subsequent iterations; and
10D) wherein the device is embodied in at least one of:
10D1) an application specific integrated circuit;
10D2) a field programmable gate array;
10D3) a microprocessor; and
10D4) a computer-readable memory;
arranged and configured to determine the first modified received signal having minimized distortion and interference in accordance with the scheme of claim 9.
EP96929873A 1995-10-26 1996-08-27 Method and device for encoding/decoding a displaced frame difference Withdrawn EP0800684A4 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US54878995A 1995-10-26 1995-10-26
US548789 1995-10-26
PCT/US1996/014115 WO1997015897A1 (en) 1995-10-26 1996-08-27 Method and device for encoding/decoding a displaced frame difference

Publications (2)

Publication Number Publication Date
EP0800684A1 (en) 1997-10-15
EP0800684A4 EP0800684A4 (en) 1998-03-25

Family

ID=24190406

Family Applications (1)

Application Number Title Priority Date Filing Date
EP96929873A Withdrawn EP0800684A4 (en) 1995-10-26 1996-08-27 Method and device for encoding/decoding a displaced frame difference

Country Status (3)

Country Link
EP (1) EP0800684A4 (en)
TW (1) TW395119B (en)
WO (1) WO1997015897A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69731066T2 (en) * 1997-01-23 2005-10-06 Hewlett-Packard Development Co., L.P., Houston Memory controller with programmable pulse delay
FR2804777B1 (en) * 2000-02-03 2002-06-21 Univ Paris Curie METHOD AND DEVICE FOR PROCESSING IMAGE SEQUENCES WITH MASKING
KR20130050404A (en) * 2011-11-07 2013-05-16 오수미 Method for generating reconstructed block in inter prediction mode

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1993002526A1 (en) * 1991-07-19 1993-02-04 Laboratoire De Traitement Des Signaux Method for compressing digital image sequences

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1993002526A1 (en) * 1991-07-19 1993-02-04 Laboratoire De Traitement Des Signaux Method for compressing digital image sequences

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
FEICHTINGER H G ET AL: "Hierarchical parallel matching pursuit" IMAGE RECONSTRUCTION AND RESTORATION, SAN DIEGO, CA, USA, 25-26 JULY 1994, vol. 2302, ISSN 0277-786X, PROCEEDINGS OF THE SPIE - THE INTERNATIONAL SOCIETY FOR OPTICAL ENGINEERING, 1994, USA, pages 222-232, XP002051992 *
NEFF R ET AL: "Very low bit rate video coding using matching pursuits" VISUAL COMMUNICATIONS AND IMAGE PROCESSING '94, CHICAGO, IL, USA, 25-28 SEPT. 1994, vol. 2308, pt.1, ISSN 0277-786X, PROCEEDINGS OF THE SPIE - THE INTERNATIONAL SOCIETY FOR OPTICAL ENGINEERING, 1994, USA, pages 47-60, XP002051991 *
See also references of WO9715897A1 *
VETTERLI M ET AL: "MATCHING PURSUIT FOR COMPRESSION AND APPLICATION TO MOTION COMPENSATED VIDEO CODING" PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (IC, AUSTIN, NOV. 13 - 16, 1994, vol. 1 OF 3, 13 November 1994, INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS, pages 725-729, XP000522039 *

Also Published As

Publication number Publication date
EP0800684A4 (en) 1998-03-25
TW395119B (en) 2000-06-21
WO1997015897A1 (en) 1997-05-01

Similar Documents

Publication Publication Date Title
US5495538A (en) Segmentation-based JPEG image artifacts reduction
US6014181A (en) Adaptive step-size motion estimation based on statistical sum of absolute differences
US5714950A (en) System for variable-length-coding and variable-length-decoding digitaldata
JP4804342B2 (en) Overcomplete basis transform based motion residual frame encoding method and video compression apparatus
Kaup et al. Coding of segmented images using shape-independent basis functions
JP5414121B2 (en) Compression and coding of 3D mesh networks
JP2000511366A (en) Apparatus and method for variable block size motion estimation based on quadrant tree
JPH11168633A (en) Reconstruction execution method, reconstruction execution device, record medium, inverse conversion execution method, inverse conversion execution device, suitable reconstruction generating method, suitable reconstruction generator, coding data processing method, coding data processing unit and data processing system
WO2000002393A1 (en) Image coding/decoding method and recorded medium on which program is recorded
US20050041856A1 (en) Image compression usable with animated images
US8674859B2 (en) Methods for arithmetic coding and decoding and corresponding devices
EP0800684A1 (en) Method and device for encoding/decoding a displaced frame difference
Matsuda et al. A lossless coding scheme using adaptive predictors and arithmetic code optimized for each image
US20070064810A1 (en) Variable shape motion estimation in video sequence
US20030138046A1 (en) Method for coding and decoding video signals
JP3708218B2 (en) Image coding method
Fränti image COMPRESSION
JP4008846B2 (en) Image encoding apparatus, image encoding method, image encoding program, and recording medium recording the program
KR100296097B1 (en) Method of and apparatus for acquiring motion vectors of control points in control grid interpolation
Li et al. Efficient coding method for stereo image pairs
JP2003153275A (en) Image processing apparatus and method, recording medium, and program
JPH09139941A (en) Encoding/decoding device for picture signal
KR100316411B1 (en) Apparatus for acquiring motion vectors of control points by classified vector quantization in control grid interpolation coder
KR100296098B1 (en) Apparatus for acquiring motion vectors of control points by seperated vector quantization in control grid interpolation coder
Lai et al. Coding of image sequences using variable size block matching and vector quantization with gray-level segmentation and background memory

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): DE FR GB

17P Request for examination filed

Effective date: 19971103

A4 Supplementary search report drawn up and despatched

Effective date: 19980206

AK Designated contracting states

Kind code of ref document: A4

Designated state(s): DE FR GB

17Q First examination report despatched

Effective date: 19990611

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 19991022