US6760479B1 - Super predictive-transform coding - Google Patents

Super predictive-transform coding

Info

Publication number: US6760479B1
Authority: US (United States)
Prior art keywords: coefficient, symbols, groups, uncorrelated, error
Legal status: Expired - Lifetime
Application number: US09/696,197
Inventor: Erlan H. Feria
Original and current assignee: Research Foundation of City University of New York
Application filed by Research Foundation of City University of New York; priority to US09/696,197
Assigned to RESEARCH FOUNDATION OF THE CITY UNIVERSITY OF NEW YORK; assignor: FERIA, ERLAN H.


Classifications

    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals, including:
    • H04N19/583 Motion compensation with overlapping blocks
    • H04N19/124 Quantisation (adaptive coding)
    • H04N19/136 Adaptive coding characterised by incoming video signal characteristics or properties
    • H04N19/18 Adaptive coding in which the coding unit is a set of transform coefficients
    • H04N19/619 Transform coding in combination with predictive coding, the transform being operated outside the prediction loop
    • H04N19/86 Pre- or post-processing involving reduction of coding artifacts, e.g. of blockiness
    • H04N19/91 Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
    • H04N19/93 Run-length coding


Abstract

A method of compressing a digital image including the steps of dividing the image into a plurality of blocks, at least some of which blocks may be partially superimposed with other blocks, generating uncorrelated coefficient or coefficient error vectors, preferably using a minimum mean squared error (MMSE) predictive-transform (PT) coding methodology, quantizing each coefficient or coefficient error vector using scalar quantizers, preferably linear scalar quantizers, storing a coefficient or coefficient error matrix consisting of the quantized coefficient error vectors, decomposing the stored quantized coefficient or coefficient error matrix into a plurality of uncorrelated groups of quantized coefficient errors and zero-run symbols, or into uncorrelated groups of quantized coefficient errors as well as a zero-run amplitude dependent group and zero-run length dependent group, and losslessly encoding each of the uncorrelated groups of symbols, for example, using a fixed or adaptive Huffman or Arithmetic coder.

Description

This patent application claims the priority of U.S. provisional patent application No. 60/160,910 filed Oct. 22, 1999, which is incorporated herein by reference.
FIELD OF THE INVENTION
The present invention relates to image compression in general and, more particularly, to an improved predictive-transform method and apparatus for use in image compression.
BACKGROUND OF THE INVENTION
There are many image compression methods and devices known in the art. Some known prior art compression methods are based on coding schemes which decompose images into blocks. Such known methods may use uncorrelated coefficients or coefficient errors, obtained for example via the Discrete Cosine Transform (DCT). However, these schemes are currently not considered to be the most efficient compression schemes. Further, at high compression ratios, methods using image block decomposition tend to produce artifacts, the most prominent of which are "blocking" artifacts, where a pattern of the image blocks may be visible in the image reproduced after decompression. Known image compression standards using such schemes include JPEG and MPEG.
Said and Pearlman wavelet coding, also known as S&P wavelet coding, is presently considered to be the most efficient image compression technique available and is commercially used for image compression applications in both private industry and government institutions. Said and Pearlman wavelet coding is described in Said and Pearlman, "A new fast and efficient image coder based on set partitioning in hierarchical trees," IEEE Trans. CSVT, vol. 6, no. 3, pp. 243-250, June 1996. An algorithm used in this technique is available from the Rensselaer Polytechnic Institute, Troy, N.Y.
Notwithstanding a superior efficiency over other known image compression methods, the S&P wavelet algorithm suffers excessive smoothing when operating at low bit rates. This excessive smoothing manifests itself in blurred images.
In Feria, E. H., "Predictive-Transform Coding," Proceedings of 1986 IEEE NAECON, Dayton, Ohio, May 1986, the present inventor describes a lossy compression scheme characterized by an uncorrelated coefficient error superstructure.
SUMMARY OF THE INVENTION
A fundamental problem addressed by the present invention is the combined lossy and lossless compression of digital signals and images for efficient storage of the images and/or their transmission in bandwidth limited channels. The compressed images can be monochrome or color images and can be used for the compression of still and moving pictures. There are potential applications for the method and apparatus of the present invention in diverse fields, including but not limited to efficient storage of images for personal computers, storage of medical images, storage of fingerprints and ballistic or bullet prints, storage of planetary images, transmission of facsimile information for commercial applications, transmission and storage of digital images in the movie industry and other industries, and transmission of digital images for high definition television (HDTV) systems.
The present invention successfully overcomes the excessive smoothing problem associated with the S&P wavelet algorithm. This is achieved by integration of several novel ideas into a method which is hereinafter referred to as Super Predictive-Transform (PT) Coding.
Lossless compression in accordance with an embodiment of the present invention uses either q distinct Huffman type coders or an Arithmetic coder that is reinitialized after each new group of quantizer symbols is received. This approach has led to a significant improvement in the compression derived from the Super Predictive-Transform Coder. The present invention can be successfully applied to other coding algorithms such as those used for JPEG and MPEG, or to any other coding scheme where uncorrelated coefficients are used.
The superimposed geometry of the coder input and prediction vectors of the Super Predictive-Transform in accordance with the present invention leads to the elimination of undesirable blocking artifacts that are otherwise obtained with PT based coders when operating at very low bit rates.
An aspect of the present invention involves the “integration” of a new symbol stream generator, Huffman or Arithmetic coders with properly synchronized initializations, the superimposed geometry of the coder input and prediction signals, and simple round off scalar quantizers in a minimum mean squared error (MMSE) predictive-transform modeling and coding formulation. The synergistic operation of all of the aforementioned inventions together with a solid theoretical foundation is what has given rise to a simple, elegant, and powerful technique which has proven to outperform the S&P wavelet algorithm.
The Super PT coder of the present invention does not suffer from the smoothing problem encountered with the S&P wavelet algorithm.
The Super PT coder does not suffer from blocking artifacts when operating at very low bit rates. This is due to the superimposition property of each encoded pixel block.
The lossless encoding of each element of a truncated coefficient error in accordance with an embodiment of the present invention significantly improves the Signal-to-Noise Ratio (SNR) and visual quality of the reconstructed images.
It will be understood by persons skilled in the art, in view of the detailed description below, that the present invention can be implemented by computer software or dedicated hardware or any combination of computer software and hardware, in accordance with specific applications.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic block diagram illustrating a lossy super predictive transform encoder/decoder scheme in accordance with an embodiment of the present invention;
FIG. 2 is a schematic illustration of an exemplary geometry for block superposition and prediction in accordance with an embodiment of the present invention; and
FIG. 3 is a schematic block diagram illustrating a lossless encoder/decoder scheme in accordance with an embodiment of the present invention.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
Super Predictive-Transform Coding schemes in accordance with the present invention are depicted in FIGS. 1, 2, and 3 and may consist of either or both lossy and lossless encoders and decoders, and may also include a particular geometry for the coder input and prediction signals, as discussed below. A lossy encoder/decoder is shown in detail in FIG. 1, a geometry for the input and prediction signals is depicted in FIG. 2, and a lossless encoder/decoder is depicted in FIG. 3. The overhead information needed by the decoder and the proper initialization of the coder are also discussed below.
The lossy encoder and decoder of FIG. 1 may be characterized by the following twelve signals:
A coder input vector x(k+1) of dimension n×1.
The coefficient vector c(k+1) of dimension n×1.
The coder input and coefficient vector estimates {circumflex over (x)}(k+1) and ĉ(k+1).
The prediction vector z(k) of dimension m×1.
The predicted coefficient vector c′(k+1) of dimension n×1.
The coefficient error or innovation vector δc(k) of dimension n×1. More specifically,
δc(k) = [δc_1(k) δc_2(k) … δc_n(k)]^t  (1)
with the variance of each scalar element δc_j(k) decreasing as the value of j increases from 1 to n. Also, the coefficient error δc(k) is zero mean with uncorrelated elements. A justification of these properties, via the formulation of minimum mean squared error (MMSE) predictive-transform coding, may be found in Feria, E. H., "Predictive-Transform Coding," Proceedings of 1986 IEEE NAECON, Dayton, Ohio, May 1986.
The aforementioned justification of these properties is described in this paper as follows.
The specific characteristics of our algorithm that result in a readily implementable coder with low reproduction distortion at high compression rates are several. These are:
1. The subset of previously estimated samples that are used to predict each transform coefficient is composed of only those samples which are significantly correlated to the transform coefficients.
2. In practical applications unbiased scalar quantizers can be used to encode the coefficient errors between the transform coefficients and their predicted values. This is the case since the algorithm's transform and predictor matrices minimize the mean square error between a data block and its estimate subject to the constraint that the coefficient errors are uncorrelated and also have zero mean value. Although scalar quantizers are only strictly optimum in the case where the coefficient errors are also independent, e.g. for uncorrelated gaussian data, it is often found that scalar quantizers are quite adequate in applications dealing with uncorrelated but dependent data samples (see (5) and section IV).
3. In practical applications such as the video example of section IV it is found that the multiplications required by the transform and predictor matrices can be easily implemented with additions, subtractions, and register shifts. This is due to the robustness of the predictor and transform matrices, which yield excellent performance even when their optimum elements are roughly approximated by readily implementable values such as 0, ½, ¼, ¾, etc.
4. Relatively small data blocks can yield low reproduction distortion.
The aforementioned formulation of minimum mean squared error (MMSE) predictive-transform coding is described in this paper as follows.
The transform and predictor matrices T and P that minimize the MSE (20) subject to constraints (13)-(19) are obtained from the evaluation of the following matrix equations:
{E[x(k+1)x^t(k+1)] − A} T_i = λ_i T_i,  i = 1, …, W  (21)

A = [E[x(k+1)z^t(k)]  ½U_W] · [E[z(k)z^t(k)]  ½U_M; ½U_M^t  0]^(−1) · [E[z(k)x^t(k+1)]; ½U_W^t]  (22)

and

[P_i; μ_i] = [E[z(k)z^t(k)]  ½U_M; ½U_M^t  0]^(−1) · [E[z(k)x^t(k+1)]; ½U_W^t] T_i,  i = 1, …, W  (23)

(In the block matrices above, rows are separated by semicolons; U_W and U_M are the unit column vectors defined below.)
where: a) {T_i} and {P_i} are the columns of the transform and predictor matrices T and P; b) E[x(k+1)x^t(k+1)] is the second moment statistic matrix of the k-th data block x(k+1); c) E[z(k)z^t(k)] is the second moment statistic of the data elements z(k) surrounding the block x(k+1); d) E[z(k)x^t(k+1)] is the correlation matrix between the block x(k+1) and its surrounding data elements z(k); e) λ_i is a Lagrange multiplier associated with the constraint T_i^t T_i = 1 for all i; f) μ_i is a Lagrange multiplier associated with the zero mean constraint (16); and g) the matrix inversion shown in eqs. (22) and (23) is assumed to exist.
In addition, the minimum MSE obtained with these matrices is given by

min_{T,P} E[(x(k+1) − x̂(k+1))^t (x(k+1) − x̂(k+1))] = Σ_{i=J+1}^{W} E[Δc_i^2(k)] = Σ_{i=J+1}^{W} T_i^t {E[x(k+1)x^t(k+1)] − A} T_i = Σ_{i=J+1}^{W} λ_i  (24)

where λ_{J+1}, …, λ_W are the smallest W−J eigenvalues of the eigensystem (21).
Also, the optimum transform and predictor matrices result in uncorrelated coefficient errors.
Constraints (13)-(19) are described in this paper as follows.
There are four constraints. These are:
Constraint 1: The basis vectors {T_i} of the transform T will be constrained to be orthonormal, i.e.,

T_i^t T_j = 1 for i = j, and T_i^t T_j = 0 for i ≠ j  (13)
One reason for this constraint is to give equal weight to the energy associated with each coefficient error. Note that this constraint also implies that T is a unitary matrix, i.e.,
T^(−1) = T^t.  (14)
A second reason for this constraint is that it results in uncorrelated coefficient errors as shown in appendix A. This in turn implies that in applications (see Section IV) we can use simple scalar quantizers to encode each coefficient error, i.e.
Q(Δc(k)) = [Q_1(Δc_1(k)), …, Q_W(Δc_W(k))]^t
where Q_i(Δc_i(k)) represents the scalar quantization of the coefficient error Δc_i(k). Note that the scalar quantizers are not generally optimum since the coefficient errors often remain statistically dependent even if they are uncorrelated.
Constraint 2: The optimum transform and predictor matrices must yield coefficient error components with zero mean value, i.e.,
E[Δc_i(k)] = 0 for all i.  (15)
The objective of this constraint is to simplify the design of the scalar quantizer since it then follows that we do not need to be concerned about coefficient error elements with a nonzero mean component. It should be noted that the constraints (15) and (14) further imply the following constraint on the transform and predictor matrices:
U_W^t T_i − U_M^t P_i = 0 for all i  (16)
where UW and UM are unit column vectors with W and M elements, respectively. This constraint can be readily derived as follows:
First, using FIG. 2 and Eq. (14) we note that
Δc_i(k) = T_i^t x(k+1) − P_i^t z(k) for all i.  (17)
Second and last, taking the expected value of Eq. (17), using constraint (15), and assuming that the expected value of each data sample is constant, we obtain the desired result (16).
Constraint 3: The quantizer will be assumed to work as follows: a) J arbitrary coefficient error components are unaffected by the quantizer, i.e.,
Δĉ_i(k) = Δc_i(k)  (18)
for J arbitrary components of Δc(k), where J < W; and b) the remaining W−J coefficient error components are set to zero by the quantizer, i.e.,
Δĉ_i(k) = 0  (19)
for the remaining W−J components of Δĉ(k). The basic advantage of this constraint is that it makes the evaluation of the transform and predictor matrices a mathematically tractable problem.
The truncated coefficient error vector δe(k) of dimension q×1 where q≦n. The q elements of δe(k) are the q most energetic elements of δc(k), i.e.,
δe(k) = [δc_1(k) δc_2(k) … δc_q(k)]^t  (2)
The scaled and truncated coefficient error vector δf(k) of dimension q×1.
The scaled and truncated coefficient error integer vector δ{circumflex over (f)}(k) of dimension q×1.
The truncated coefficient error vector estimate δê(k) of dimension q×1.
The coefficient error vector estimate δĉ(k) of dimension n×1.
The bit stream b of dimension 1×B where B denotes the number of bits present in b.
The following nine subsystems may also characterize the lossy encoder and decoder:
A unitary transformation matrix R of dimension n×n that is multiplied by the coder input vector x(k+1) to yield the coefficient vector c(k+1).
A transposed transformation matrix R^t, which is the same as the inverse of R due to its unitary property, of dimension n×n that is multiplied by the coefficient vector estimate ĉ(k+1) to yield the coder input vector estimate {circumflex over (x)}(k+1).
A prediction matrix P of dimension m×n that is multiplied by the prediction vector z(k) to yield the predicted coefficient vector c′(k+1).
A dimensionality reduction subsystem that multiplies the n−q less energetic elements of the n-dimensional coefficient error vector δc(k) by zero gains. This multiplication, in turn, results in the q-dimensional truncated coefficient error vector δe(k).
A memory device that temporarily stores recently reconstructed coder input vector estimates {{circumflex over (x)}(0), . . . , {circumflex over (x)}(k)}. These stored vectors are used at each processing stage to construct the prediction vector z(k).
A scaling device with gain 1/Comp responsible for establishing the amount of compression associated with the coder. More specifically, the constant Comp is adjusted to produce the desired amount of compression for the coder.
q scalar quantizers implemented by finding the closest integer vector, δ{circumflex over (f)}(k), to the scaled and truncated coefficient error δf(k), i.e.,
δ{circumflex over (f)}(k)=Round(δf(k)).  (3)
A scaling device with gain Comp responsible for generating the truncated coefficient error vector estimate δê(k) from the scaled and truncated coefficient error integer vector δ{circumflex over (f)}(k).
A dimensionality restoration subsystem that restores the coefficient error estimate δĉ(k) from the truncated coefficient error estimate δê(k) via zero padding.
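By way of illustration only, the scale, quantize, rescale, and zero-pad path implemented by the last four subsystems above may be sketched in a few lines of Python. This is a minimal sketch assuming the numpy library; the function and variable names are illustrative, and the trailing four elements of the example error vector are invented for the illustration:

    import numpy as np

    def lossy_quantize(delta_c, q, comp):
        # Dimensionality reduction: keep the q most energetic (leading) elements,
        # scale by 1/Comp, and apply the q round-off scalar quantizers of eq. 3.
        return np.round(delta_c[:q] / comp)

    def lossy_dequantize(delta_f_hat, n, comp):
        # Scaling device with gain Comp followed by zero padding back to n dimensions.
        delta_c_hat = np.zeros(n)
        delta_c_hat[:len(delta_f_hat)] = comp * delta_f_hat
        return delta_c_hat

    # With n = 9, q = 5, Comp = 1/2 and the leading elements taken from eq. 13:
    delta_c = np.array([1.7, -1.1, 0.4, -0.14, 0.6, 0.02, -0.01, 0.0, 0.01])
    f_hat = lossy_quantize(delta_c, q=5, comp=0.5)   # -> [3. -2. 1. -0. 1.]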
FIG. 2 schematically illustrates the geometry of the coder input and prediction signals in a simplified 2D image processing example, wherein xij, for all (i,j) pairs, denotes the i-th row and j-th column pixel of the image. In general the image is of dimension V×H. In the example of FIG. 2, V=5 and H=5.
The encoding is achieved by moving on the image from left to right and top to bottom, encoding a 3×3 pixel block at each processing stage (in general the pixel block size is of dimension N×N), e.g., in FIG. 2 the 3×3 pixel block:
x33 x34 x35
x43 x44 x45  (4)
x53 x54 x55
is being currently encoded. A fundamental property of this arrangement is that the top row and left column pixels of equation 4 are superimposed on previously encoded pixel blocks, e.g., note that the pixels x53, x43, x33, x34, and x35 of equation 4 were previously encoded when the 3×3 pixel blocks:
x11 x12 x13 x13 x14 x15 x31 x32 x33
x21 x22 x23 x23 x24 x25 x41 x42 x43  (5)
x31 x32 x33 x33 x34 x35 x51 x52 x53
were processed; also note that the standard PT coder first proposed by the present inventor in the 1986 IEEE NAECON conference paper "Predictive Transform Coding" does not use superimposed pixel blocks. The present invention is referred to herein as Super Predictive-Transform Coding partly because of the pixel block superimposition described above, wherein the term "super" indicates superimposition. The term "super" in the title of the invention may also indicate superstructure, referring to the uncorrelated coefficient errors superstructure imposed by the predictive-transform methodology and leading to the highly efficient compressor of the invention. It has been found by the present inventor that this type of block superimposition effectively eliminates the blocking artifacts produced by PT based compression algorithms when operating at very low bit rates. Another property of the Super PT coder is that only four reconstructed pixels of the currently encoded 3×3 pixel block are permanently stored; the exception is at the end of each column and/or row, when all 3×3 pixel blocks are kept. These four reconstructed pixels correspond to the top and left 2×2 pixel sub-block of the reconstructed 3×3 pixel block, e.g., for the 3×3 pixel block of equation 4, the reconstructed 2×2 pixel sub-block that may be permanently stored is given by:
{circumflex over (x)}33 {circumflex over (x)}34
{circumflex over (x)}43 {circumflex over (x)}44  (6)
where the "{circumflex over ( )}" notation denotes the generally inexact reconstruction of pixel xij; it is assumed here that the last image row and column have not been reached; otherwise, a larger subset of the 3×3 reconstructed pixel block is stored as needed.
The geometry of the n dimensional vector x(k+1) is defined in FIG. 2 for the special case k+1=4 where x(4) is given by the nine dimensional column vector:
x(4) = [x33 x34 x35 x43 x44 x45 x53 x54 x55]^t  (7)
with its nine pixel elements obtained from the currently encoded 3×3 pixel block in equation 4. The vectors x(1), x(2), and x(3) of FIG. 2 are also appropriately defined using the three 3×3 pixel blocks given in equation 5. In turn, the geometry of the m dimensional prediction vector z(k) is defined for the special case k=3, where z(3) is given by the seven dimensional column vector:
z(3) = [{circumflex over (x)}52 {circumflex over (x)}42 {circumflex over (x)}32 {circumflex over (x)}22 {circumflex over (x)}23 {circumflex over (x)}24 {circumflex over (x)}25]^t  (8)
The elements of z(3) are pixel reconstructions which reflect partial knowledge of the pixels of the presently processed 3×3 pixel block of equation 4. For instance, the pixel reconstruction {circumflex over (x)}22 appearing in z(3) reflects knowledge about the currently encoded pixel x33, since {circumflex over (x)}22 was originally obtained when the leftmost 3×3 pixel block shown in equation 5 was encoded. It is important to note that the Super PT formulation generally allows for an arbitrary number of top rows and leftmost columns of the currently encoded pixel block to be superimposed with a subset of pixels from previously encoded pixel blocks. In addition, other alternative definitions for the z(k) prediction vector are possible. Also, the suggested 2D geometry can be readily generalized to the 3D or motion picture case.
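For concreteness, the geometry of x(4) and z(3) may be expressed directly in code. This is a minimal Python sketch assuming a numpy image array, with the 1-based pixel indices of FIG. 2 mapped to 0-based array indices; the array names are illustrative:

    import numpy as np

    img = np.arange(25, dtype=float).reshape(5, 5)  # stand-in for the 5x5 image of FIG. 2
    rec = img.copy()                                # stand-in for the stored reconstructions

    # x(4): the currently encoded 3x3 pixel block of equation 4, scanned row-wise (eq. 7).
    x4 = img[2:5, 2:5].reshape(-1)

    # z(3): the seven reconstructed pixels of equation 8 bordering the block from the
    # left and the top: x^52, x^42, x^32, x^22, x^23, x^24, x^25.
    z3 = np.array([rec[4, 1], rec[3, 1], rec[2, 1], rec[1, 1],
                   rec[1, 2], rec[1, 3], rec[1, 4]])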
The transformation and prediction matrices, R and P, are designed off-line by solving the following coupled eigensystem and normal equations:
{Ex^t x − Ex^t z · Inv(Ez^t z) · Ez^t x} R = R · L,  (9)
P = Inv(Ez^t z) · Ez^t x · R  (10)
wherein Ex^t x, Ez^t z, and Ez^t x are second order expectations of the coder input and predictor vectors x(k+1) and z(k), "Inv( )" denotes a matrix inversion, and L is a diagonal eigenvalue matrix. The second order expectations Ex^t x, Ez^t z, and Ez^t x required to solve the coupled design equations 9 and 10 are found using natural images. A description of how the above equations are derived may be found in Feria, E. H., "Predictive-Transform Coding," Proceedings of 1986 IEEE NAECON, Dayton, Ohio, May 1986. It should also be noted that the design equations 9 and 10 are a special case of those given in this 1986 IEEE NAECON paper, because equations 9 and 10 do not include a zero mean constraint for δc(k). Nevertheless, it has been found via extensive simulations that the design equations 9 and 10 yield coefficient errors {δc(k)} characterized by a negligibly small mean value.
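A minimal numerical sketch of this off-line design is given below, assuming numpy and reading the shorthand Ex^t x, Ez^t z, and Ez^t x as the matrices E[x x^t], E[z z^t], and E[z x^t] estimated from training images; the decreasing eigenvalue ordering is an assumption of the sketch, chosen so that the coefficient-error variances decrease with j as stated above:

    import numpy as np

    def design_R_P(Exx, Ezz, Ezx):
        """Exx: n x n, Ezz: m x m, Ezx: m x n. Returns R (n x n) and P (m x n)."""
        # Left-hand side of eq. 9; symmetric, so the symmetric eigensolver applies.
        M = Exx - Ezx.T @ np.linalg.inv(Ezz) @ Ezx
        L, R = np.linalg.eigh(M)            # eigensystem M R = R L of eq. 9
        order = np.argsort(L)[::-1]         # sort eigenvalues in decreasing order
        R = R[:, order]
        P = np.linalg.inv(Ezz) @ Ezx @ R    # normal equation 10
        return R, P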
To simplify the presentation of the signals and the operation of each subsystem of the lossless coder of FIG. 3, it will be assumed that the encoded image is the 5×5 pixel image of FIG. 2. In addition, the geometry of the coder input and prediction signals is the same as that of FIG. 2. Finally, it is assumed that the number of scalar quantizers is five, i.e.,
q=5,  (11)
the compression factor is one half, i.e.,
Comp=½,  (12)
and the four truncated coefficient error vectors corresponding to the example of FIG. 2 are given by:
δe(1)=[1.7 −1.1 0.4 −0.14 0.6]t  (13)
δe(2)=[2.6 −0.6 0.2 0.6 0.05]t  (14)
δe(3)=[−0.2 0.2 0.01 −0.02 −0.1]t  (15)
δe(4)=[0.4 −0.15 0.1 0.15 −0.1]t  (16)
The signals and subsystems of the lossless coder of FIG. 3 are as follows.
The uncompressed integer symbol matrix Q of dimension q×W where q is the number of scalar quantizers of FIG. 1 and W is the number of encoded pixel blocks, e.g., W=4 for the 5×5 2D image of FIG. 2. The matrix Q is derived from the encoder memory and consists of the sequence of scaled truncated coefficient error integer vectors {δ{circumflex over (f)}(1),δ{circumflex over (f)}(2), . . . ,δ{circumflex over (f)}(W)}, i.e.,
Q=[δ{circumflex over (f)}(1),δ{circumflex over (f)}(2), . . . ,δ{circumflex over (f)}(W)]  (17)
For instance, for the illustrative example of FIG. 2 with the assumptions (11)-(16), the integer symbol matrix Q is found to be given by the 5×4 matrix:

Q = [  3   5   0   1
      -2  -1   0   0
       1   0   0   0
       0   1   0   0
       1   0   0   0 ]  (18)
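The matrix of equation 18 may be reproduced mechanically from equations 3 and 11-17. The following minimal Python sketch assumes numpy; the array names are illustrative:

    import numpy as np

    # Rows are the truncated coefficient errors de(1)..de(4) of equations 13-16.
    de = np.array([[ 1.7,  -1.1,   0.4,  -0.14,  0.6 ],
                   [ 2.6,  -0.6,   0.2,   0.6,   0.05],
                   [-0.2,   0.2,   0.01, -0.02, -0.1 ],
                   [ 0.4,  -0.15,  0.1,   0.15, -0.1 ]])
    Comp = 0.5
    Q = np.round(de / Comp).T.astype(int)  # columns are the vectors of eq. 17; yields eq. 18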
A more general vector compression factor, denoted Comp_vector, of dimension q×1, may be used in conjunction with the embodiment of FIG. 1 instead of the scalar compression factor Comp described above. The q elements of the Comp_vector, i.e.,
Comp_vector = [g_1 g_2 … g_q]  (18a)
may be selected such that they reflect the decreasing standard deviation associated with each element of the truncated coefficient error vector δe(k). For example, the elements {g_k} of Comp_vector may be found using the following formula:
g_k = K·exp(−a|k−q|^b) for k = 1, …, q  (18b)
wherein K is the previously defined scalar compression factor, Comp, i.e., K=Comp.
For example, using in this formula K=½, a=0.03, b=1, and q=5 yields the following compression vector:
Comp_vector = [0.44 0.46 0.47 0.48 0.5]  (18c)
Dividing the {δe(k)} values in equations 13-16 above by the compression vector of equation 18c yields the following values for the scaled and truncated coefficient error vector {δf(k)}:
δf(1)=[3.86 −2.34 0.85 −0.29 1.2]t  (18d)
δf(2)=[5.9 −1.3 0.43 1.25 0.1]t  (18e)
δf(3)=[−0.45 0.43 0.02 −0.04 −0.2]t  (18f)
δf(4)=[0.91 −0.33 0.21 0.31 −0.2]t  (18g)
Finally, an alternative Q matrix, Qa, may be generated from equations 18d-18g, using equations 3 and 17, as follows:

Qa = [  4   6   0   1
       -2  -1   0   0
        1   0   0   0
        0   1   0   0
        1   0   0   0 ]  (18h)
It has been confirmed by computer simulation that using a vector compression factor such as that of equation 18b improves the quality of the reconstructed pictures and the signal to noise ratio (SNR) for any given compression ratio.
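A minimal Python sketch of equations 18a-18h follows, assuming numpy (the de array repeats the truncated errors of equations 13-16); note that the g_k values computed by this sketch are the unrounded ones, which differ in the last decimal place from the rounded values tabulated in equation 18c:

    import numpy as np

    de = np.array([[ 1.7,  -1.1,   0.4,  -0.14,  0.6 ],
                   [ 2.6,  -0.6,   0.2,   0.6,   0.05],
                   [-0.2,   0.2,   0.01, -0.02, -0.1 ],
                   [ 0.4,  -0.15,  0.1,   0.15, -0.1 ]])

    K, a, b, q = 0.5, 0.03, 1.0, 5
    k = np.arange(1, q + 1)
    comp_vector = K * np.exp(-a * np.abs(k - q) ** b)  # eq. 18b -> ~[0.443 0.457 0.471 0.485 0.5]

    # Elementwise scaling (eqs. 18d-18g) and rounding (eq. 3) give the Qa of eq. 18h.
    Qa = np.round(de / comp_vector).T.astype(int)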
A zero run integer symbol matrix Z of dimension q×W is derived by determining for each column of Q (or Qa) when a zero run is encountered as we move down the column from top to bottom. When a zero run is found it is represented with a special symbol r_j, wherein j corresponds to the last nonzero integer found just before the zero run is encountered. After the zero run symbol r_j, a blank symbol is introduced for each deleted zero; these blank symbols do not contribute to the bit rate of the system since they are never transmitted to the decoder. For example, using the Q matrix of equation 18, the following zero run integer symbol matrix is generated:

Z = [  3      5     r_0    r_1
      -2     -1    blank  blank
       1      0    blank  blank
       0     r_1   blank  blank
       1    blank  blank  blank ]  (19)
A symbol stream vector, s, of dimension 1×K, wherein K is the number of non-blank symbols in Z (e.g., K=11 for the Z of equation 19), may be generated by the symbol stream generator by scanning the Z matrix from left to right and top to bottom. For the specific example of equation 19, the following symbol stream vector, s, is obtained:
s = [3 5 r_0 r_1 −2 −1 1 0 0 r_1 1]  (20)
Note that the symbol stream vector has a very special structure. That is, it collects first the symbols associated with the first scalar quantizer operating on the most energetic coefficient error, i.e., the set of symbols (3, 5, r_0, r_1) obtained from the first row of Z in equation 19, then the set of symbols (−2, −1) associated with the second quantizer, and so on until all the quantizers are accounted for.
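A minimal Python sketch of the zero-run substitution and of the symbol stream generator follows. It treats a zero run as a run of zeros extending to the bottom of a column, which is the pattern of the worked example, and models symbols as strings; both choices are representational assumptions of this sketch rather than requirements of the invention:

    # Columns of the Q matrix of equation 18.
    Q_cols = [[3, -2, 1, 0, 1], [5, -1, 0, 1, 0], [0, 0, 0, 0, 0], [1, 0, 0, 0, 0]]

    def zero_run_column(col):
        # Replace a trailing zero run, together with the nonzero j just before it,
        # by the special symbol r_j; blanks are simply never emitted.
        end = len(col)
        while end > 0 and col[end - 1] == 0:
            end -= 1
        if end == len(col):                  # no trailing zero run: column unchanged
            return [str(v) for v in col]
        j = col[end - 1] if end > 0 else 0   # j = 0 when the whole column is zero
        return [str(v) for v in col[:max(end - 1, 0)]] + ["r_%d" % j]

    Z_cols = [zero_run_column(c) for c in Q_cols]       # the Z of equation 19
    depth = max(len(c) for c in Z_cols)
    # Scan Z from left to right and top to bottom (eq. 20):
    s = [c[i] for i in range(depth) for c in Z_cols if i < len(c)]
    # s == ['3', '5', 'r_0', 'r_1', '-2', '-1', '1', '0', '0', 'r_1', '1']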
One of two possible types of lossless compressors is used to encode the symbol stream, s, of equation 20 to yield a bit stream, b, of dimension 1×J, where J denotes the total number of bits required to encode symbol stream s plus any additional overhead data needed. The first type of compressor is q independent Huffman encoders, each encoding the appropriate set of symbols of symbol stream, s. For example, five Huffman encoders are used for the symbol stream in equation 20, the first for encoding the group of symbols (3, 5, r_0, r_1), the second for encoding the group (−2, −1), the third for encoding the group (1, 0), the fourth for encoding the group (0, r_1), and the last for encoding the symbol 1. In accordance with a more flexible scheme of the invention, the symbol stream of equation 20 may be coded with fewer than 5 Huffman encoders, for example, 3 Huffman encoders, the first for encoding the group of symbols (3, 5, r_0, r_1), the second for encoding the group (−2, −1), and the third for encoding the remaining symbols (1, 0, 0, r_1, 1). A hybrid technique of this type may be desirable in cases where the number of symbols for the lowest energy coefficient errors is so small that any overhead associated with either Huffman or Arithmetic coders could compromise the compression reduction that may be achieved by encoding each coefficient error separately.
The q Huffman encoders may be designed for each encoded image and their symbol-bit maps transmitted to the decoder as overhead. On the other hand, the arithmetic encoder may use either an adaptive or fixed model for the symbol distributions that is initialized q different times, at the beginning of each different group of quantizer symbols. For example, for the symbol stream in equation 20, the arithmetic coder is initialized when the group of symbols (3, 5, r_0, r_1) is first encountered, then again when the group (−2, −1) is encountered, and so on until the last group of symbols arrives.
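The grouping may likewise be sketched. The following minimal Python fragment, which continues the previous sketch, splits s into the q per-quantizer symbol groups that feed the q independent Huffman coders (or that trigger the q successive initializations of the arithmetic coder); the frequency tables shown are merely one way to seed the per-group code design:

    from collections import Counter

    s = ['3', '5', 'r_0', 'r_1', '-2', '-1', '1', '0', '0', 'r_1', '1']  # eq. 20
    row_sizes = [4, 2, 2, 2, 1]      # non-blank symbols in each row of Z (eq. 19)

    groups, pos = [], 0
    for size in row_sizes:           # group i feeds the i-th Huffman coder, or marks
        groups.append(s[pos:pos + size])   # the i-th re-initialization point
        pos += size
    # groups == [['3','5','r_0','r_1'], ['-2','-1'], ['1','0'], ['0','r_1'], ['1']]

    tables = [Counter(g) for g in groups]  # per-group symbol statistics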
Finally, the subsystems of the lossless decoder of FIG. 3 perform the inverse operation of the lossless encoders, as will be apparent to a person of ordinary skill in the art, in view of FIG. 3.
An alternative lossless compression algorithm for encoding the integer symbol matrix Q will now be described. The basic idea of this alternative procedure is to decompose the Q matrix into three independent symbol streams, as follows:
1. An amplitude symbol stream vector, V_amp, consisting of the amplitude associated with each zero-run. For example,
V_amp=[1 1 0 1]  (21)
for the specific Q matrix given by equation 18. It should be noted that although there is no zero-run in the first column of Q, the last coefficient value of 1 for the first column is used nonetheless. V_amp may be encoded using a single fixed or adaptive Huffman or Arithmetic coder, as are known in the art.
2. A length symbol stream vector, V_length, consisting of the length associated with each zero-run. For example,
V_length=[0 1 4]  (22)
for the Q matrix of equation 18. Note that since there is no zero-run for the first column of this Q matrix, the first element of the vector in equation 22 is assigned a value of zero. In addition, since it is implied by the third element of the vector of equation 21 that all the elements of the third column of Q are zero, it is unnecessary to include in the vector of equation 22 the number of zeros associated with the third column. V_length may be encoded using a single fixed or adaptive Huffman or Arithmetic coder.
3. A coefficient symbol stream vector, V_coef, which is similar in structure to that of the symbol stream vector, s, illustrated in equation 20, except that it does not include the zero-run symbols. This vector may be as follows:
V_coef=[3 5 −2 −1 1 0 0 blank]  (23)
for the values of the Q matrix in equation 18. Note that the blank component of the vector of equation 23 corresponds to the fifth row of Q which is already fully defined by the vectors of equations 21 and 22. Consequently, it is not necessary to send this blank symbol to the decoder. V_coef is encoded using four fixed or adaptive Huffman or Arithmetic coders in the same manner as described above with reference to the symbol stream vector, s, in equation 20.
This decomposed structure may be convenient because it gives rise to more efficient lossless compressors using Huffman and/or Arithmetic coders.
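As a non-authoritative sketch of this decomposition, the following Python fragment extracts the three streams from a Q matrix whose columns follow the trailing zero-run convention described above. The Q matrix shown is inferred from the example vectors of equations 21 through 23 and is an assumption; equation 18 remains the authoritative source.

    def decompose(Q):
        """Split Q into V_amp, V_length, and V_coef (sketch only).
        Assumed convention: each column ends with one amplitude value
        followed by a (possibly empty) run of zeros; an all-zero column
        has amplitude 0 and an implied run length that is never sent."""
        q, ncols = len(Q), len(Q[0])
        V_amp, V_length, V_coef = [], [], []
        stop = []                    # row where each column's coefficients end
        for j in range(ncols):
            col = [Q[i][j] for i in range(q)]
            run = 0
            while run < q and col[q - 1 - run] == 0:
                run += 1             # measure the trailing zero-run
            if run == q:             # all-zero column
                V_amp.append(0)
                stop.append(0)
            else:
                V_amp.append(col[q - 1 - run])   # value preceding the run
                V_length.append(run)
                stop.append(q - 1 - run)
        for i in range(q):           # row-major scan, as in equation 20
            for j in range(ncols):
                if i < stop[j]:
                    V_coef.append(Q[i][j])
        return V_amp, V_length, V_coef

    # Q inferred from equations 21-23 (an assumption, not equation 18 itself)
    Q = [[ 3,  5, 0, 1],
         [-2, -1, 0, 0],
         [ 1,  0, 0, 0],
         [ 0,  1, 0, 0],
         [ 1,  0, 0, 0]]
    decompose(Q)   # -> ([1, 1, 0, 1], [0, 1, 4], [3, 5, -2, -1, 1, 0, 0])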
An alternative representation of the zero-runs for lossless encoding is to combine the amplitude zero-run vector, V_amp, of equation 21 with the length zero-run vector, V_length, to produce a single zero-run symbol vector, V_zerorun. The resultant vector may then be encoded using a single Huffman or Arithmetic coder. The relationship between V_zerorun, V_amp and V_length may be defined as follows:
V_zerorun=V_amp*q−V_length*sign(V_amp)  (22a)
Substituting in equation 22a the exemplary values from equations 21 and 22 above, namely, q=5, V_amp=[1 1 0 1], and V_length=[0 1 blank 4] (the “blank” in this sequence representing the fact that this particular symbol is not sent to the decoder because it is implied by a corresponding “0” value in equation 21), yields the following:
V_zerorun=[5 4 0 1]  (22b)
It should be appreciated that, on the decoder side of the scheme, V_amp and V_length may be readily reconstructed from V_zerorun.
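A hedged sketch of this combination and its inverse follows, assuming q=5 and run lengths strictly less than q so that the mapping of equation 22a is invertible; the function names are illustrative only.

    import math

    def combine(V_amp, V_length, q=5):
        """Merge the two zero-run vectors per equation 22a; an all-zero
        column (amplitude 0) carries no length symbol (the 'blank')."""
        out, k = [], 0               # k walks through V_length
        for a in V_amp:
            if a == 0:
                out.append(0)
            else:
                out.append(a * q - V_length[k] * (1 if a > 0 else -1))
                k += 1
        return out

    def split(V_zerorun, q=5):
        """Recover V_amp and V_length on the decoder side."""
        V_amp = [int(math.copysign(math.ceil(abs(z) / q), z)) if z else 0
                 for z in V_zerorun]
        V_length = [abs(a) * q - abs(z)
                    for z, a in zip(V_zerorun, V_amp) if a != 0]
        return V_amp, V_length

    combine([1, 1, 0, 1], [0, 1, 4])  # -> [5, 4, 0, 1], as in equation 22b
    split([5, 4, 0, 1])               # -> ([1, 1, 0, 1], [0, 1, 4])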
In addition to the bits associated with the symbol stream s (or the alternative symbol streams V_amp, V_length, and V_coef), the bit stream b also contains overhead information required to operate the coder. The information transmitted as overhead may include the following items (an illustrative sketch of such an overhead record appears after this list):
The scalar compression factor, Comp
The number of scalar quantizers, q.
The dimensions of the encoded image, V and H.
The maximum pixel value found in the image. This pixel value is used to eliminate overshoots that may be found in the reconstructed image.
The minimum pixel value of the image. This pixel value is used to eliminate undershoots that may be found in the reconstructed image.
Symbol-bit maps, when Huffman coders are used to compress the symbol stream s.
Symbol sets, when Arithmetic coders with adaptive symbol distribution models are used to compress the symbol stream s.
Symbol sets and distribution model parameters, when fixed symbol distribution models are used for the Huffman or Arithmetic coders.
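The following Python sketch merely collects these items in one record for clarity; it is an editorial illustration, and the field names are assumptions rather than the patent's notation.

    from dataclasses import dataclass, field

    @dataclass
    class Overhead:
        # Illustrative container only; field names are editorial assumptions.
        comp: float            # scalar compression factor, Comp
        q: int                 # number of scalar quantizers
        V: int                 # vertical dimension of the encoded image
        H: int                 # horizontal dimension of the encoded image
        max_pixel: int         # used to clip overshoots on reconstruction
        min_pixel: int         # used to clip undershoots on reconstruction
        symbol_bit_maps: dict = field(default_factory=dict)  # Huffman case
        symbol_sets: dict = field(default_factory=dict)      # Arithmetic case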
To initialize the coder it is assumed that the prediction vector z(0) is a constant vector whose value is the average of the maximum and minimum pixel values found in the image. Alternatively, the mean value of x(1) may be used to initialize the coder, in which case it is subsequently encoded as overhead. When processing the top row or left column of an image, only partial knowledge of the pixel estimates in z(k) is available. For this special case, the unknown pixel estimates in z(k) are assigned the same value as the known pixel estimate in z(k) that is physically closest to them.
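As a minimal sketch of this initialization rule (the function name and the flat-vector representation are assumptions for illustration):

    def initial_prediction(min_pixel, max_pixel, n):
        """Constant prediction vector z(0) at the midpoint of the pixel range."""
        mid = (max_pixel + min_pixel) / 2.0
        return [mid] * n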
Several features characterize the super PT coder. For example, the way in which the symbol stream, s (or its alternative version, V_coef), is configured leads to the decomposition of the lossless compression into either q distinct Huffman coders or an Arithmetic coder that is reinitialized as each new group of quantizer symbols arrives. This approach has led to a significant improvement in the compression achieved by the Super Predictive-Transform Coder. Other Huffman and Arithmetic coding algorithms, such as those used for JPEG, have been simulated and found to yield significantly inferior results.
Another characterizing feature is the superimposed geometry of the coder input and prediction vectors of the Super Predictive-Transform Coder of FIG. 2. This novel geometry leads to the elimination of undesirable blocking artifacts that are otherwise obtained with PT based coders when operating at very low bit rates.
Yet another characterizing feature is the integration of the proposed symbol stream generator, the q Huffman coders or Arithmetic coder with q properly synchronized initializations, the superimposed geometry of the coder input and prediction signals, and simple round-off scalar quantizers (3) into the minimum mean squared error (MMSE) predictive-transform modeling and coding formulation described by the applicant in Feria, E. H., "Predictive-Transform Coding," Proceedings of 1986 IEEE NAECON, Dayton, Ohio, May 1986, which is incorporated herein by reference. The synergistic operation of all of the aforementioned inventions, together with a solid theoretical foundation, has given rise to a simple, elegant, and powerful technique that has proven to outperform the S&P wavelet algorithm.
It should be noted that the Super PT coder of the present invention does not suffer from the smoothing problem encountered with the S&P wavelet algorithm.
Nor does the Super PT coder suffer from blocking artifacts when operating at very low bit rates. This is due to the superimposition property of each encoded pixel block.
The lossless encoding of each element of the scaled and truncated coefficient error vector δ{circumflex over (f)}(k), as described above, results in significant improvements in the SNR and in the visual quality of the reconstructed images.
A fundamental problem addressed by the invention is the compression of digital signals and images for their efficient storage and transmission over bandwidth-limited channels. The compressed images can be monochrome or color images, and the technique can be used for the compression of both still and moving pictures. Applications are found in diverse fields, such as the efficient storage of images for personal computers, the storage of medical images, the storage of fingerprints and ballistic (bullet) prints, the storage of planetary images, the transmission of facsimile information for commercial applications, the transmission and storage of digital images in the movie industry, and the transmission of digital images for high definition television (HDTV) systems.

Claims (18)

What is claimed is:
1. A method of compressing a digital image comprising the steps of:
dividing the digital image into a plurality of blocks;
generating a plurality of uncorrelated coefficient or coefficient error vectors, each coefficient or coefficient error vector corresponding to one of said blocks;
quantizing the coefficient or coefficient error vectors using a plurality of scalar quantizers;
storing a coefficient or coefficient error matrix consisting of the quantized coefficient or coefficient error vectors;
decomposing the coefficient or coefficient error matrix into a plurality of uncorrelated groups of symbols, the symbols in each uncorrelated group of symbols representing either quantized coefficient errors or zero-runs; and
losslessly encoding each of said groups of symbols.
2. A method of compressing a digital image comprising the steps of:
dividing the digital image into a plurality of blocks;
generating a plurality of uncorrelated coefficient or coefficient error vectors, each coefficient or coefficient error vector corresponding to one of said blocks;
quantizing the coefficient or coefficient error vectors using a plurality of scalar quantizers;
storing a coefficient or coefficient error matrix consisting of the quantized coefficient or coefficient error vectors;
decomposing the coefficient or coefficient error matrix into a group of zero-run symbols and a plurality of uncorrelated groups of symbols, the symbols in each uncorrelated group of symbols representing quantized coefficient errors; and
losslessly encoding each of said groups of symbols.
3. A method of compressing a digital image comprising the steps of:
dividing the digital image into a plurality of blocks;
generating a plurality of uncorrelated coefficient or coefficient error vectors, each coefficient or coefficient error vector corresponding to one of said blocks;
quantizing the coefficient or coefficient error vectors using a plurality of scalar quantizers;
storing a coefficient or coefficient error matrix consisting of the quantized coefficient or coefficient error vectors;
decomposing the coefficient or coefficient error matrix into a group of zero-run amplitude symbols, a group of zero-run length symbols and a plurality of uncorrelated groups of symbols, the symbols in each uncorrelated group of symbols representing quantized coefficient errors; and
losslessly encoding each of said groups of symbols.
4. A method according to claim 1 wherein each of at least some of said plurality of blocks is at least partly superimposed with at least one other block.
5. A method according to claim 2 wherein each of at least some of said plurality of blocks is at least partly superimposed with at least one other block.
6. A method according to claim 3 wherein each of at least some of said plurality of blocks is at least partly superimposed with at least one other block.
7. A method according to claim 1 wherein the step of generating a plurality of uncorrelated coefficient or coefficient error vectors comprises the step of calculating said coefficient or coefficient error vectors using minimum mean squared error (MMSE) methodology.
8. A method according to claim 2 wherein the step of generating a plurality of uncorrelated coefficient or coefficient error vectors comprises the step of calculating said coefficient or coefficient error vectors using minimum mean squared error (MMSE) methodology.
9. A method according to claim 3 wherein the step of generating a plurality of uncorrelated coefficient or coefficient error vectors comprises the step of calculating said coefficient or coefficient error vectors using minimum mean squared error (MMSE) methodology.
10. A method according to claim 1 wherein said plurality of scalar quantizers comprise a plurality of linear scalar quantizers.
11. A method according to claim 2 wherein said plurality of scalar quantizers comprise a plurality of linear scalar quantizers.
12. A method according to claim 3 wherein said plurality of scalar quantizers comprise a plurality of linear scalar quantizers.
13. A method according to claim 1 wherein the step of losslessly encoding each of said groups of symbols comprises losslessly encoding at least some of said groups of symbols using fixed or adaptive Huffman coding.
14. A method according to claim 2 wherein the step of losslessly encoding each of said groups of symbols comprises losslessly encoding at least some of said groups of symbols using fixed or adaptive Huffman coding.
15. A method according to claim 3 wherein the step of losslessly encoding each of said groups of symbols comprises losslessly encoding at least some of said groups of symbols using fixed or adaptive Huffman coding.
16. A method according to claim 1 wherein the step of losslessly encoding each of said groups of symbols comprises losslessly encoding at least some of said groups of symbols using fixed or adaptive Arithmetic coding.
17. A method according to claim 2 wherein the step of losslessly encoding each of said groups of symbols comprises losslessly encoding at least some of said groups of symbols using fixed or adaptive Arithmetic coding.
18. A method according to claim 3 wherein the step of losslessly encoding each of said groups of symbols comprises losslessly encoding at least some of said groups of symbols using fixed or adaptive Arithmetic coding.
US09/696,197 1999-10-22 2000-10-23 Super predictive-transform coding Expired - Lifetime US6760479B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US09/696,197 US6760479B1 (en) 1999-10-22 2000-10-23 Super predictive-transform coding

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US16091099P 1999-10-22 1999-10-22
US09/696,197 US6760479B1 (en) 1999-10-22 2000-10-23 Super predictive-transform coding

Publications (1)

Publication Number Publication Date
US6760479B1 true US6760479B1 (en) 2004-07-06

Family

ID=32599564

Family Applications (1)

Application Number Title Priority Date Filing Date
US09/696,197 Expired - Lifetime US6760479B1 (en) 1999-10-22 2000-10-23 Super predictive-transform coding

Country Status (1)

Country Link
US (1) US6760479B1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5295203A (en) * 1992-03-26 1994-03-15 General Instrument Corporation Method and apparatus for vector coding of video transform coefficients
US5966470A (en) * 1993-06-10 1999-10-12 Matsushita Electric Industrial Co., Ltd. Coding apparatus for image compression
US5933571A (en) * 1994-04-28 1999-08-03 Matsushita Electric Industrial Co., Ltd. Video image coding and recording apparatus and video image coding, recording and reproducing apparatus
US6396955B1 (en) * 1998-06-25 2002-05-28 Asahi Kogaku Kogyo Kabushiki Kaisha Image compression and expansion device

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
"Fixed and Adaptive Predictors for Hybrid Predictive/Transform Coding," Staffan Ericsson, IEEE Transactions on Communications, vol. COM-33, No. 12, Dec. 1985, pp. 1291-1302.* *
"Predictive-Transform Residual Vector Quantization," Syed A. Rizvi et al., 1998 Internatiopnal Conference on Image Processing vol. 2, Oct. 4-7, 1998, pp. 584-588.* *
"Super Predictive-Transform Coding", Proceedings of 1999 DSP World ICSPAT, Orlando, Florida, Nov. 2-4, 1999.
Erlan H. Feria, "Analog and Leaky Predictive Transform Image Coding for Noisy Channels: Further Results," IEEE NAECON, May 1992.
Erlan H. Feria, "Linear predictive transform of monochrome images," Image and Vision Computing, vol. 5, No. 4, Nov. 1987, pp. 267-278.
Feria and Rizvi, Super Predictive Transform Residual Vector Quantization (with Syed A. Rizvi), Proceedings of 1999 IEEE International Conference on Image Processing, Kobe, Japan, Oct. 24-28, 1999.
Feria, E.H., "Predictive-Transform Coding," Proceedings of 1986 IEEE NAECON, Dayton, Ohio, May 1986.
Said and Pearlman, A new fast and efficient image codec based on set partitioning in hierarchical trees, IEEE Trans. CSVT, vol. 6, n. 3, pp 243-250, Jun. 1996.

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6839467B2 (en) * 2000-07-10 2005-01-04 Stmicroelectronics S.R.L. Method of compressing digital images
US20030063807A1 (en) * 2000-07-10 2003-04-03 Stmicroelectronics S.R.L. Method of compressing digital images
US20030113024A1 (en) * 2001-11-07 2003-06-19 Research Foundation Of The City University Of New York Accelerated predictive-transform
US7046856B2 (en) * 2001-11-07 2006-05-16 Feria Erlan H Accelerated predictive-transform
US7532765B2 (en) * 2002-12-30 2009-05-12 Intel Corporation Run length encoded digital image
US20040126031A1 (en) * 2002-12-30 2004-07-01 Dwyer Michael K. Run length encoded digital image
US8358990B2 (en) * 2004-04-23 2013-01-22 France Telecom Method and device for transmitting a signal in a multi-antenna system, signal, and method for estimating the corresponding transmission channels
US20100195761A1 (en) * 2004-04-23 2010-08-05 France Telecom Method and device for transmitting a signal in a multi-antenna system, signal, and method for estimating the corresponding transmission channels
US20060020560A1 (en) * 2004-07-02 2006-01-26 Microsoft Corporation Content distribution using network coding
US7756051B2 (en) * 2004-07-02 2010-07-13 Microsoft Corporation Content distribution using network coding
US8140849B2 (en) 2004-07-02 2012-03-20 Microsoft Corporation Security for network coding file distribution
US20060282677A1 (en) * 2004-07-02 2006-12-14 Microsoft Corporation Security for network coding file distribution
US8428376B2 (en) * 2006-09-25 2013-04-23 Research Foundation Of The City University Of New York Predictive-transform source coding with subbands
US20080205779A1 (en) * 2007-02-23 2008-08-28 International Business Machines Corporation Selective predictor and selective predictive encoding for two-dimensional geometry compression
US8917947B2 (en) 2007-02-23 2014-12-23 International Business Machines Corporation Selective predictor and selective predictive encoding for two-dimensional geometry compression
US8249371B2 (en) 2007-02-23 2012-08-21 International Business Machines Corporation Selective predictor and selective predictive encoding for two-dimensional geometry compression
US8374446B2 (en) * 2007-09-28 2013-02-12 Vsevolod Yurievich Mokrushin Encoding and decoding of digital signals based on compression of hierarchical pyramid
US20100226569A1 (en) * 2007-09-28 2010-09-09 Vsevolod Yurievich Mokrushin Method of encoding digital signals (variants), method of decoding digital signals (variants), device for implementation thereof (variants), and system for image transmission via limited throughput communication channels (variants)
US20130208790A1 (en) * 2010-07-09 2013-08-15 British Broadcasting Corporation Picture coding and decoding
CN101908143B (en) * 2010-08-09 2012-05-09 哈尔滨工程大学 Method for inspecting living body fingerprint slippage defect based on subband feature fusion
CN101908143A (en) * 2010-08-09 2010-12-08 哈尔滨工程大学 Method for inspecting living body fingerprint slippage defect based on subband feature fusion
US20120303570A1 (en) * 2011-05-27 2012-11-29 Verizon Patent And Licensing, Inc. System for and method of parsing an electronic mail
US11145030B2 (en) * 2018-09-27 2021-10-12 Displaylink (Uk) Limited Method of controlling encoding of display data
US11699212B2 (en) 2018-09-27 2023-07-11 Displaylink (Uk) Limited Method of controlling encoding of display data

Similar Documents

Publication Publication Date Title
KR100308627B1 (en) Low bit rate encoder using overlapping block motion compensation and zerotree wavelet coding
EP1290868B1 (en) Bit-plane dependent signal compression
US6628716B1 (en) Hardware efficient wavelet-based video compression scheme
EP1359546B1 (en) 2-D transforms for image and video coding
KR101348365B1 (en) An video encoding/decoding method and apparatus
US7634148B2 (en) Image signal transforming and inverse-transforming method and computer program product with pre-encoding filtering features
WO2000049570A1 (en) Method for visual optimisation of embedded block codes to exploit visual masking phenomena
US7042946B2 (en) Wavelet based coding using motion compensated filtering based on both single and multiple reference frames
Sathiyanathan Medical image compression using view compensated wavelet transform
WO2000072602A1 (en) Multi-dimensional data compression
US11611777B2 (en) Transformations for signal enhancement coding
JP2005524352A (en) Scalable wavelet-based coding using motion compensated temporal filtering based on multiple reference frames
US6760479B1 (en) Super predictive-transform coding
US20050084014A1 (en) Video encoding with motion-selective wavelet transform
JP2007267384A (en) Compression apparatus and compression method
US6798901B1 (en) Method of compressing a color image
Gharavi-Alkhansari et al. Fractal image coding using rate-distortion optimized matching pursuit
JP2005524354A (en) Wavelet-based coding with motion-compensated temporal filtering based on multiple reference frames
JP2004266794A (en) Multi-resolution video coding and decoding
Fazli et al. JPEG2000 image compression using SVM and DWT
Rahman et al. Efficient Image Compression Technique using JPEG2000 with Adaptive Threshold
Delaunay et al. Satellite image compression by concurrent representations of wavelet blocks
CN116916033B (en) Combined space-time video compression method based on random self-adaptive Fourier decomposition
Ranjeeta et al. Image compression: an overview
Kuo et al. Multiresolution video coding based on Kalman filtering motion estimation

Legal Events

Date Code Title Description
AS Assignment

Owner name: RESEARCH FOUNDATION OF THE CITY UNIVERSITY OF NEW YORK

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:FERIA, ERLAN H.;REEL/FRAME:014879/0689

Effective date: 20040105

STCF Information on status: patent grant

Free format text: PATENTED CASE

CC Certificate of correction
FEPP Fee payment procedure

Free format text: PAT HOLDER CLAIMS SMALL ENTITY STATUS, ENTITY STATUS SET TO SMALL (ORIGINAL EVENT CODE: LTOS); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

REMI Maintenance fee reminder mailed
FPAY Fee payment

Year of fee payment: 4

SULP Surcharge for late payment
FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

REMI Maintenance fee reminder mailed
FPAY Fee payment

Year of fee payment: 8

SULP Surcharge for late payment

Year of fee payment: 7

FEPP Fee payment procedure

Free format text: PAYER NUMBER DE-ASSIGNED (ORIGINAL EVENT CODE: RMPN); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

REMI Maintenance fee reminder mailed
FPAY Fee payment

Year of fee payment: 12

SULP Surcharge for late payment

Year of fee payment: 11