AU725719B2 - A method of digital image compression - Google Patents

A method of digital image compression

Info

Publication number
AU725719B2
Authority
AU
Australia
Prior art keywords
coefficients
frequency components
region
series
low frequency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU87096/98A
Other versions
AU8709698A (en
Inventor
James Philip Andrew
Andrew Peter Bradley
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AUPO9515A external-priority patent/AUPO951597A0/en
Application filed by Canon Inc filed Critical Canon Inc
Priority to AU87096/98A priority Critical patent/AU725719B2/en
Publication of AU8709698A publication Critical patent/AU8709698A/en
Assigned to CANON KABUSHIKI KAISHA reassignment CANON KABUSHIKI KAISHA Alteration of Name(s) of Applicant(s) under S113 Assignors: CANON INFORMATION SYSTEMS RESEARCH AUSTRALIA PTY LTD, CANON KABUSHIKI KAISHA
Application granted granted Critical
Publication of AU725719B2 publication Critical patent/AU725719B2/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression Of Band Width Or Redundancy In Fax (AREA)

Description

S F Ref: 433418
AUSTRALIA
PATENTS ACT 1990 COMPLETE SPECIFICATION FOR A STANDARD PATENT
ORIGINAL
Name and Address of Applicant: Canon Kabushiki Kaisha, 30-2, Shimomaruko 3-chome, Ohta-ku, Tokyo 146-8501
JAPAN
Canon Information Systems Research Australia Pty Ltd, 1 Thomas Holt Drive, North Ryde, New South Wales, Australia
Actual Inventor(s): Andrew Peter Bradley and James Phillip Andrew
Address for Service: Spruson & Ferguson, Patent Attorneys, Level 33 St Martins Tower, 31 Market Street, Sydney, New South Wales, 2000, Australia
Invention Title: A Method of Digital Image Compression
ASSOCIATED PROVISIONAL APPLICATION DETAILS
[31] Application No(s): PO9515
[33] Country: AU
[32] Application Date: 29 September 1997
The following statement is a full description of this invention, including the best method of performing it known to me/us:
A Method of Digital Image Compression
Field of Invention
The present invention relates to the field of data compression with particular application to digital image compression. More particularly, the present specification discloses a method for coding digital images utilising the discrete wavelet transform and sibling subband redundancy.
Background of Invention
The field of digital data compression and in particular digital image compression has attracted great interest for some time.
In the field of digital image compression, many different techniques have been utilised. In particular, one popular technique is the JPEG standard, which utilises the discrete cosine transform (DCT) to transform standard size blocks of an image into corresponding cosine components. In this respect, the higher frequency cosine components are heavily quantised so as to assist in obtaining substantial compression factors. The heavy quantisation is an example of a "lossy" technique of image compression. The JPEG standard also provides for the subsequent lossless compression of the transformed coefficients.
Recently, the field of wavelet transforms has gained great attention as an alternative form of data compression. The wavelet transform has been found to be highly suitable in representing data having discontinuities such as sharp edges. Such discontinuities are often present in image data or the like.
Although the preferred embodiments of the present invention will be described with reference to the compression of image data, it will be readily evident that the preferred embodiment is not limited thereto. For examples of the many different applications of wavelet analysis to signals, reference is made to a survey article entitled "Wavelet Analysis" by Bruce et al. appearing in IEEE Spectrum, October 1996, pages 26-35. For a discussion of the different applications of wavelets in computer graphics, reference is made to "Wavelets for Computer Graphics", E. J. Stollnitz et al., published 1996 by Morgan Kaufmann Publishers, Inc.
A number of image coding techniques, including those aforementioned, utilise a linear transformation of the input image to reduce both inter-pixel correlation and coefficient coding overhead. These techniques include the JPEG image compression standard and the Federal Bureau of Investigation (FBI) fingerprint image compression standard. The JPEG standard utilises a discrete cosine transform (DCT) of the image data, while the FBI standard uses a discrete wavelet transform (DWT).
It is important to efficiently represent the coefficients from the DWT of image or video data. Previous techniques for encoding DWT coefficients include the embedded zerotree wavelet (EZW) method (US Patent Nos. 5,412,741, 5,315,670 and 5,321,776) and set partitioning in hierarchical trees (SPIHT). Each of these techniques first applies a DWT to the source image data to produce a small low frequency representation of the image and a number of high frequency, or detail, subbands. The techniques then utilise the correlation between different frequency subbands, at the same orientation, to predict zero coefficients down the wavelet tree, that is, zero coefficients in the same spatial location and orientation, but at a higher frequency.
Aspects of Invention
It is an object of the present invention to ameliorate one or more disadvantages of the prior art.
According to a first aspect of the present invention, there is provided a method of compressing digital data including the steps of: transforming said data into a series of subbands having low frequency components, high frequency components and a mixture of high and low frequency components each having a corresponding series of coefficients each of which are arranged in a predetermined spatial arrangement; combining the coefficients of said high frequency components and said mixture of high and low frequency components in a correlated manner into a single combined series of coefficients; and encoding said single combined series of coefficients utilizing a region hierarchical encoding method which encodes smaller and smaller regions of said combined series of coefficients.
According to a second aspect of the present invention, there is provided a method of decompressing digital data including the steps of: decoding previously encoded data to produce a single combined series of coefficients utilizing a region hierarchical encoding method; extracting coefficients of high frequency components and low frequency components and a mixture of high and low frequency components in a predefined order from said single combined series of coefficients; arranging said high frequency and low frequency components and a mixture of high and low frequency components in a series of sub-bands; and inverse transforming said series of sub-bands.
According to a third aspect of the present invention, there is provided an apparatus for compressing digital data including: transformation means for transforming said data into a series of sub-bands having low frequency components, high frequency components and a mixture of high and low frequency components each having a corresponding series of coefficients each of which are arranged in a predetermined spatial arrangement; combination means for combining the coefficients of said high frequency components and said mixture of high and low frequency components in a correlated manner into a single combined series of coefficients; and encoder means for encoding said single combined series of coefficients utilizing a region hierarchical encoding method which encodes smaller and smaller regions of said combined series of coefficients.
According to a fourth aspect of the present invention, there is provided an apparatus for decompressing digital data including: decoder means for decoding previously encoded data to produce a single combined series of coefficients utilizing a region hierarchical encoding method; extraction means for extracting coefficients of high frequency components and low frequency components and a mixture of high and low frequency components in a predefined order from said single combined series of coefficients; arrangement means for arranging said high frequency and low frequency components and a mixture of high and low frequency components in a series of sub-bands; and transformation means for inverse transforming said series of sub-bands.
According to a fifth aspect of the present invention, there is provided a computer program product including a computer readable medium having recorded thereon a computer program for compressing digital data, the computer program product including: transformation means for transforming said data into a series of sub-bands having low frequency components, high frequency components and a mixture of high and low frequency components each having a corresponding series of coefficients each of which are arranged in a predetermined spatial arrangement; combination means for combining the coefficients of said high frequency components and said mixture of high and low frequency components in a correlated manner into a single combined series of coefficients; and encoder means for encoding said single combined series of coefficients utilizing a region hierarchical encoding method which encodes smaller and smaller regions of said combined series of coefficients.
According to a sixth aspect of the present invention, there is provided a computer program product including a computer readable medium having recorded thereon a computer program for decompressing digital data, the computer program product including: decoder means for decoding previously encoded data to produce a single combined series of coefficients utilizing a region hierarchical encoding method; extraction means for extracting coefficients of high frequency components and low frequency components and a mixture of high and low frequency components in a predefined order from said single combined series of coefficients; arrangement means for arranging said high frequency and low frequency components and a mixture of high and low frequency components in a series of sub-bands; and transformation means for inverse transforming said series of sub-bands.
Brief Description of the Drawings
Embodiments of the invention are described, by way of example only, with reference to the drawings, in which:
Fig. 1 is a high-level block diagram illustrating the image representation technique described in the herein-mentioned patent application;
Fig. 2 is a diagram illustrating partitioning described in the herein-mentioned patent application;
Fig. 3 is a flow diagram illustrating the method of representing, or encoding, an image described in the herein-mentioned patent application;
Fig. 4 is a detailed flow diagram illustrating the step of coding a region in Fig. 3;
Fig. 5 is a flow diagram illustrating the method of decoding a coded representation of an image produced in accordance with the method of Fig. 3;
Fig. 6 is a detailed flow diagram illustrating the step of decoding a region in Fig. 5;
Figs. 7A to 7D are diagrams illustrating the processing of a two-dimensional, eight-coefficient region in accordance with the encoding and decoding method of Figs. 3 to 6;
Fig. 8 is a block diagram of a general purpose computer;
Figs. 9 to 12 are flow diagrams illustrating an alternate method of representing, or encoding, an image described in the herein-mentioned patent application;
Figs. 13 to 15 illustrate the process of wavelet transforming image data;
Fig. 16 illustrates an example 4x4 discrete wavelet transform;
Fig. 17 is a flow diagram illustrating the method of representing, or encoding, an image in accordance with the preferred embodiment;
Fig. 18 is a detailed flow diagram illustrating the step of coding a region in Fig. 17; and
Fig. 19 is a flow diagram illustrating a method of decoding a coded representation of an image.
Detailed Description
Where reference is made in any one or more of the accompanying drawings to steps and/or features, which have the same reference numerals, those steps and/or features have for the purposes of this description the same function(s) and/or operation(s), unless the contrary intention appears.
Before proceeding with a description of the preferred embodiments, a description is given of the "SWEET" image compression and decompression method disclosed in Australian Provisional Patent Application No. PO 4728, entitled "A method for Digital Image Compression", filed on 22 January 1997 by Canon Information Systems Research Australia Pty. Ltd. This method of compression and decompression is described in the following sections hereinafter entitled "1.0 Overview of SWEET Image Compression Method", "1.1 Encoding Process of First SWEET Image Compression Method", "1.2 Decoding Process of First SWEET Image Compression Method", "1.3 Two-Dimensional Example" and "1.4 Encoding Process of Second SWEET Image Compression Method".
1.0 Overview of SWEET Image Compression Method(s)
A high-level block diagram is illustrated in Fig. 1 to provide an overview of the encoding method. An input image 102 is provided to the transform block 110, which is preferably a linear transform, to produce corresponding transform coefficients 112. A discrete wavelet transform (DWT) is preferably employed.
The two-dimensional DWT of an image is a transform that represents the image using a low frequency approximation to the image and three high frequency detail components. Conventionally, these components are termed subbands. Each of the four sub-images formed by the DWT is one quarter of the size of the original image. The low frequency image contains most of the information about the original image. This information, or energy compaction, is the feature of the discrete wavelet transform image subbands that is exploited for image compression.
The single-level DWT can be applied recursively to the low frequency image, or subband, an arbitrary number of times. For example, a three-level DWT of the image is obtained by applying the transform once and then applying the DWT twice more, each time to the low subband resulting from the previous transformation. Thus, this results in 9 detail subbands and one (very) low frequency subband. Even after three levels of DWTs, the resulting low frequency subband still contains a significant amount of information of the original image, yet is 64 times smaller (1/4 x 1/4 x 1/4), thereby effecting a factor of 64 in compression.
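The subband counts and sizes described above can be checked with a small sketch. It assumes an unnormalised Haar-style averaging and differencing filter purely for brevity; the specification does not fix a particular wavelet at this point, and the names `haar_level` and `dwt` and the LL/HL/LH/HH labels are illustrative only.

```python
def haar_level(img):
    """One 2-D averaging/differencing level on a square image of even size.

    Returns four subbands (LL, HL, LH, HH), each half the width and height
    of the input, i.e. one quarter of its size.
    """
    n = len(img) // 2
    ll = [[0.0] * n for _ in range(n)]
    hl = [[0.0] * n for _ in range(n)]
    lh = [[0.0] * n for _ in range(n)]
    hh = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            a, b = img[2 * r][2 * c], img[2 * r][2 * c + 1]
            d, e = img[2 * r + 1][2 * c], img[2 * r + 1][2 * c + 1]
            ll[r][c] = (a + b + d + e) / 4  # low frequency approximation
            hl[r][c] = (a - b + d - e) / 4  # detail across columns
            lh[r][c] = (a + b - d - e) / 4  # detail across rows
            hh[r][c] = (a - b - d + e) / 4  # diagonal detail
    return ll, hl, lh, hh


def dwt(img, levels):
    """Re-apply haar_level to the low frequency subband `levels` times."""
    details = []
    low = img
    for _ in range(levels):
        low, hl, lh, hh = haar_level(low)
        details.extend([hl, lh, hh])
    return low, details


# Arbitrary 8 x 8 test image (values are made up for illustration).
image = [[float((r * 8 + c) % 17) for c in range(8)] for r in range(8)]
low, details = dwt(image, 3)
num_detail_subbands = len(details)  # 3 detail subbands per level
size_ratio = (len(image) * len(image[0])) // (len(low) * len(low[0]))
```

With three levels the sketch yields nine detail subbands and a low frequency subband holding 1/64 of the original samples, matching the figures quoted in the text.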
However, other linear transformations for decorrelating image data may be practiced. For example, a discrete cosine transform (DCT) can be practiced. The transform coefficients 112, or more specifically the bit sequences representing their values, are then coded by the bit rearrangement block 120 in an efficient fashion to provide the coded representation 122.
The decoding process is simply the reverse of this encoding process. The encoded coefficients are decoded into the transform coefficients. The (transform domain) image is then inverse transformed to form the original image, or some approximation thereof.
Before proceeding with a further description of the embodiments, a brief review of terminology used hereinafter is provided. For a binary integer representation of a number, "bit n" or "bit number n" refers to the binary digit n places to the left of the least significant bit. For example, assuming an 8-bit binary representation, the decimal number 9 is represented as 00001001. In this number, bit 3 is equal to 1, while bits 2, 1, and 0 are equal to 0, 0, and 1, respectively. Furthermore, a transform may be represented as a matrix having coefficients arranged in rows and columns, with each coefficient represented by a bit sequence. Conceptually speaking, the matrix may be regarded as having three dimensions: one dimension in the row direction, a second dimension in the column direction and a third dimension in the bit sequence direction. A plane in this three-dimensional space, which passes through each bit sequence at the same bit number, is called a bitplane or bit plane.
For transform coding applications, the number of bits per coefficient required to represent the possible range of coefficients is determined by the linear transform and the resolution of each pixel (in bits per pixel) in the input image. This range of values for each pixel is typically large relative to the values of most of the transform coefficients, and thus many coefficients have a large number of leading zeros. For example, the number 9 has four leading zeros in an 8-bit representation and has 12 leading zeros in a 16-bit representation. The compression method and apparatus represent (or code) these leading zeros, for blocks of coefficients, in an efficient manner. The remaining bits and sign of the number are encoded directly without modification.
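The bit-numbering convention, the notion of a bitplane and the leading-zero counts can be verified with a few lines of Python. The helper names `bit`, `bitplane` and `leading_zeros`, and the sample 2 x 2 matrix, are illustrative assumptions and do not appear in the specification.

```python
def bit(value, n):
    """Return bit number n of a non-negative integer (bit 0 is the lsb)."""
    return (value >> n) & 1


def bitplane(coeffs, n):
    """Collect bit number n of the magnitude of every coefficient in a matrix."""
    return [[bit(abs(v), n) for v in row] for row in coeffs]


def leading_zeros(value, width):
    """Count the zeros left of the most significant 1 in a width-bit word."""
    return width - value.bit_length()


nine_as_8_bits = format(9, "08b")               # 8-bit representation of 9
bits_3_to_0 = [bit(9, n) for n in (3, 2, 1, 0)]
plane_3 = bitplane([[9, 2], [-4, 1]], 3)        # only 9 has bit 3 set
zeros_in_8_bits = leading_zeros(9, 8)
zeros_in_16_bits = leading_zeros(9, 16)
```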
To simplify the description, the transform coefficients are assumed hereinafter to be represented in an unsigned binary integer form, with a single sign bit. That is, the decimal numbers -9 and 9 are represented with the same bit sequence, namely 1001, with the former having a sign bit equal to 1 to indicate a negative value, and the latter having a sign bit equal to 0 to indicate a positive value. The number of leading zeros is determined by the range of the transform coefficients. In using an integer representation, the coefficients are implicitly already quantised to the nearest integer value, although this is not necessary. Further, for the purpose of compression, any information contained in fractional bits is normally ignored.
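A minimal sketch of this sign-and-magnitude convention follows. The function names and the choice of a 4-bit magnitude width are assumptions made for the example.

```python
def to_sign_magnitude(value, width):
    """Split an integer into (sign bit, zero-padded magnitude bit string)."""
    sign = 1 if value < 0 else 0
    return sign, format(abs(value), "0{}b".format(width))


def from_sign_magnitude(sign, magnitude_bits):
    """Rebuild the integer from a sign bit and a magnitude bit string."""
    magnitude = int(magnitude_bits, 2)
    return -magnitude if sign else magnitude


negative = to_sign_magnitude(-9, 4)  # sign bit 1, magnitude 1001
positive = to_sign_magnitude(9, 4)   # sign bit 0, same magnitude bits
```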
A region includes a set of contiguous image coefficients. The term coefficient is used hereinafter interchangeably with pixel; however, as will be well understood by a person skilled in the art, the former is typically used to refer to pixels in a transform domain (e.g., a DWT domain).
1.1 Encoding Process of First SWEET Image Compression Method
A more detailed description of the first image compression method is provided with reference to Figs. 3 and 4.
Fig. 3 is a flow diagram illustrating the first image encoding method. In step 302, processing commences using an input image. In step 304, the input image is transformed using a linear transformation, preferably a discrete wavelet transform. An initial region is defined to be the whole image. For example, in the case of a three-level DWT of the input image, the resulting coefficients including the 10 subbands can be specified as the region. Alternatively, each subband can be processed separately, setting each initial region to the whole subband in question.
In step 306, the most significant bit (msb) of the largest absolute value of the transform coefficients is determined and a parameter, maxBitNumber, is set to the bit number of this msb. For example, if the largest transform coefficient has a binary value of 00001001 (decimal 9), the parameter maxBitNumber is set to 3, since the msb is bit number 3. Alternatively, the parameter maxBitNumber may be set to any value that is larger than the msb of the largest absolute value of the transform coefficients.
Further, in step 306, a coding parameter, minBitNumber is set to specify the coded image quality. In particular, this coding parameter specifies the precision of every coefficient in the transformed image and can be varied as required. For example, a minBitNumber of 3 provides a coarser reproduction of the original image than does a value of 1.
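The two parameters set in step 306 can be illustrated as follows. This is a sketch only: `max_bit_number` and `keep_bits_from` are hypothetical helpers, and modelling the effect of minBitNumber as simply zeroing the magnitude bits below it is an assumption about how a decoder would treat the uncoded bits.

```python
def max_bit_number(coeffs):
    """Bit number of the msb of the largest absolute coefficient value."""
    largest = max(abs(v) for row in coeffs for v in row)
    # An all-zero input has no msb; bit 0 is used here as an assumed fallback.
    return max(largest.bit_length() - 1, 0)


def keep_bits_from(value, min_bit_number):
    """Zero every magnitude bit below min_bit_number, preserving the sign."""
    magnitude = (abs(value) >> min_bit_number) << min_bit_number
    return -magnitude if value < 0 else magnitude


msb_of_nine = max_bit_number([[9, -3], [2, 0]])  # largest magnitude is 9
coarse = keep_bits_from(13, 3)   # minBitNumber = 3 keeps only bit 3 and above
finer = keep_bits_from(13, 1)    # minBitNumber = 1 drops only bit 0
```

Under this assumption the coefficient 13 is reproduced as 8 with a minBitNumber of 3 but as 12 with a minBitNumber of 1, which is the sense in which the larger value gives a coarser reproduction.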
Optionally, the technique involves step 308, which provides an output header in the coded representation of the input image. Thus, in a practical implementation, header information is output as part of the coded representation. For example, the output header may contain information about the source image, including the image height and width, the number of levels of the DWT, the mean value of the DC subband, the maxBitNumber parameter, and the minBitNumber parameter.
Beginning in step 310, each subband of the transformed image is coded separately in steps 312 and 314. Each subband is coded independently, in order from low frequency to high frequency. For the DC subband, the mean value is removed prior to coding and coded into the header information in step 308. In step 312, each subband is coded by setting an initial region as the whole subband. In step 314, the region is encoded with the maxBitNumber and minBitNumber as parameters. This provides a hierarchical code, since lower resolution versions of the image are coded into the bit stream before higher resolutions. Processing terminates in step 316.
Fig. 4 is a detailed flow diagram of the procedure "Code region(currentBitNumber, minBitNumber)" called in step 314 of Fig. 3 for coding each region, where maxBitNumber is provided as the currentBitNumber. In step 402, processing commences. The inputs to the region coding process of Fig. 4 include the currentBitNumber and minBitNumber parameters. Preferably, the method is implemented as a recursive technique where the process is able to call itself with a selected region or sub-region. However, the process may be implemented in a non-recursive manner.
In decision block 404, a check is made to determine if the currentBitNumber parameter is less than the minBitNumber parameter. If decision block 404 returns true (yes), nothing is done and processing returns to the calling procedure in step 406. This condition indicates that every coefficient in the selected region has a msb number less than minBitNumber. If decision block 404 returns false (no), processing continues at decision block 408.
In decision block 408, a check is made to determine if the selected region is a 1 x 1 pixel. If decision block 408 returns true (yes), processing continues at step 410.
In step 410, the 1 x 1 pixel is coded. Preferably, this involves directly outputting the remaining bits above the minBitNumber in the coded representation. In step 412, processing returns to the calling procedure. Otherwise, if decision block 408 returns false (no), the region includes more than one coefficient and processing continues at decision block 414.
In decision block 414, the selected region is checked to determine if it is significant. That is, the significance of the region is tested. The region is said to be insignificant if the msb number of each coefficient in the region is less than the value of the currentBitNumber parameter. To make the concept of region significance precise, a mathematical definition is given in Equation (1). At a given bit number, say currentBitNumber = n, the region is said to be insignificant if:
$$|c_{ij}| < 2^{n}, \quad \forall\, i, j \in R, \qquad (1)$$
where R denotes the region, and c_{ij} denotes coefficient (i, j) in this region.
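Equation (1) translates directly into code. The sketch below uses a hypothetical helper `is_insignificant` and a made-up 2 x 2 region; it also checks that the test is equivalent to the verbal definition in terms of msb numbers.

```python
def is_insignificant(region, n):
    """Equation (1): every coefficient magnitude is below 2**n."""
    return all(abs(c) < (1 << n) for row in region for c in row)


def msb_number(c):
    """Bit number of the most significant 1 of |c| (-1 when c is zero)."""
    return abs(c).bit_length() - 1


region = [[3, -4], [1, -2]]
at_bit_3 = is_insignificant(region, 3)  # all magnitudes are below 8
at_bit_2 = is_insignificant(region, 2)  # |-4| is not below 4
```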
If decision block 414 returns false (no), processing continues at step 416. In step 416, a value of 0 (or first token) is output in the coded representation stream, and the currentBitNumber parameter is decremented by 1. That is, the next, lower bitplane of the region is selected for processing. Processing then continues at decision block 404, where the region is again processed with the parameters currentBitNumber-1 and minBitNumber. Otherwise, if decision block 414 returns true (yes), that is, the region is significant, processing continues at step 418.
In step 418, a value of 1 (or second token) is output in the coded representation stream. In step 420, the selected region is partitioned into a predetermined number (preferably, 4) of subregions using a specified partitioning algorithm. The partitioning algorithm used is known to the decoder.
In this method, square regions are used. A region is partitioned preferably into 4 equal-sized (square) subregions. As shown in Fig. 2, the selected region 200 has a size of M x M coefficients and is partitioned into four equal-sized subregions 210, 212, 214 and 216. Each of the subregions has a size of N x N, where N is equal to M/2. This is not always possible depending on the size and shape of the initial region. If this is not possible, the initial region can be partitioned into a number of square regions, each having dimensions that are a power of 2, and these partitions can be encoded separately. In any case, this initialization has minimal effect on the overall results if done in an intelligent fashion. Alternatively, a different partition may be used that is suitable for a block-based coder.
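Both partitioning steps can be sketched briefly. `quadrants` splits an M x M region into four N x N subregions with N = M/2. `power_of_two_tiles` is one possible way, assumed here and not taken from the specification, of covering an arbitrary rectangle with squares whose sides are powers of 2; it is a simple greedy tiling and makes no claim to being the "intelligent" choice the text alludes to.

```python
def quadrants(row, col, size):
    """Split a size x size region into four (row, col, size/2) subregions,
    listed top left, top right, bottom left, bottom right."""
    half = size // 2
    return [(row, col, half), (row, col + half, half),
            (row + half, col, half), (row + half, col + half, half)]


def power_of_two_tiles(height, width):
    """Greedily cover a height x width rectangle with power-of-2 squares."""
    tiles = []

    def fill(r, c, h, w):
        if h == 0 or w == 0:
            return
        s = 1 << (min(h, w).bit_length() - 1)  # largest power of 2 that fits
        tiles.append((r, c, s))
        fill(r, c + s, s, w - s)   # strip to the right of the new tile
        fill(r + s, c, h - s, w)   # full-width strip below the new tile

    fill(0, 0, height, width)
    return tiles


subregions = quadrants(0, 0, 4)
tiles_6x4 = power_of_two_tiles(6, 4)
```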
In step 422, each subregion is then coded with the same currentBitNumber and minBitNumber parameters. This is preferably done by means of a recursive call to the procedure "Code region(currentBitNumber, minBitNumber)" of Fig. 4. This coding of subregions may be implemented in parallel or sequentially. In the latter case, the processing may commence from a low frequency subband to higher frequency subbands in turn.
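The flow of Fig. 4 can be summarised in a short recursive sketch. It assumes a square region whose side is a power of 2, stored as a list of rows of integers, and it emits the sign bit of a 1 x 1 pixel only when at least one of its coded bits is non-zero. The names `code_region` and `encode` are illustrative, and the comments map each branch to the block numbers used in the text.

```python
def code_region(coeffs, row, col, size, current_bit, min_bit, out):
    """Append the code bits for one square region to the list `out`."""
    while current_bit >= min_bit:                        # decision block 404
        if size == 1:                                    # decision block 408
            value = coeffs[row][col]
            bits = [(abs(value) >> b) & 1
                    for b in range(current_bit, min_bit - 1, -1)]
            out.extend(bits)                             # step 410
            if any(bits):
                out.append(1 if value < 0 else 0)        # sign bit if needed
            return
        significant = any(
            abs(coeffs[r][c]) >= (1 << current_bit)
            for r in range(row, row + size)
            for c in range(col, col + size))             # decision block 414
        if not significant:
            out.append(0)                                # step 416
            current_bit -= 1
            continue
        out.append(1)                                    # step 418
        half = size // 2                                 # step 420
        for dr, dc in ((0, 0), (0, half), (half, 0), (half, half)):
            code_region(coeffs, row + dr, col + dc, half,
                        current_bit, min_bit, out)       # step 422
        return


def encode(coeffs, max_bit, min_bit):
    """Code a whole square region and return the bits as a string."""
    out = []
    code_region(coeffs, 0, 0, len(coeffs), max_bit, min_bit, out)
    return "".join(str(b) for b in out)


# A single coefficient -9 with currentBitNumber 3 and minBitNumber 1.
single = encode([[-9]], 3, 1)
# A made-up 2 x 2 region coded from bit 2 down to bit 0.
small = encode([[5, 0], [0, -1]], 2, 0)
```

For the single coefficient the output is the three magnitude bits 1, 0, 0 followed by the sign bit 1.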
In the coded representation, a transform coefficient is coded by simply outputting the pixel bits from the currentBitNumber to the minBitNumber. Preferably, a convention is followed whereby the sign is output only if some of the coefficient bits were non-zero. For example, if currentBitNumber = 3 and minBitNumber = 1, then -9 (00001001) is coded as "1 0 0" followed by a sign bit "1".
1.2 Decoding Process of First SWEET Image Compression Method
Fig. 5 is a flow diagram illustrating a method of decoding the coded representation of an image obtained using the process of Figs. 3 and 4. In step 502, processing commences using the coded representation. In step 504, the header information is read from the coded representation to determine the size of the original image, and hence the initial region size. Also, information such as maxBitNumber (equal to the initial currentBitNumber in the coding process) and minBitNumber are input. Further information includes the mean value of the DC subband.
In step 506, decoding of each subband is commenced by setting the region to the respective subbands in turn. In step 508, the selected region is decoded using the maxBitNumber and minBitNumber parameters. In step 510, the inverse DWT is applied to the decoded selected region. Processing terminates in step 512.
Fig. 6 is a detailed flow diagram of step 508 of Fig. 5 for decoding each region using procedure call "Decode region(currentBitNumber, minBitNumber)", where maxBitNumber is provided as the currentBitNumber. In step 602, processing commences. The inputs to the region decoding process of Fig. 6 are the currentBitNumber and minBitNumber parameters. Again, the method is preferably implemented as a recursive technique. However, the process can be implemented in a non-recursive manner.
In decision block 604, a check is made to determine if the currentBitNumber is less than the minBitNumber. If decision block 604 returns true (yes), processing continues at step 606, where processing returns to the calling procedure. Otherwise, if decision block 604 returns false (no), processing continues at decision block 608.
In decision block 608, a check is made to determine if the selected region has a size of 1 x 1 pixels. If decision block 608 returns true (yes), processing continues at step 610. In step 610, the 1 x 1 region is decoded. Processing then returns to the calling procedure in step 612. If decision block 608 returns false (no), processing continues at step 614. In step 614, a bit is input from the coded representation.
In decision block 616, a check is made to determine if the bit is equal to 1, that is, the input is checked to determine if the region is significant. If decision block 616 returns false (no), processing continues at step 618. In step 618, the currentBitNumber is decremented, and processing continues at decision block 604. Otherwise, if decision block 616 returns true (yes), processing continues at step 620. In step 620, the region is partitioned into the predetermined number (preferably, 4) of sub-regions. In step 622, each of the sub-regions is decoded using the currentBitNumber and minBitNumber. This is carried out by means of a recursive call to the process illustrated in Fig. 6. In step 624, processing returns to the calling procedure.
Thus, the bits output from the significance decisions in the encoder instruct the decoder on which path of the algorithm to take, thus mimicking the encoder. The pixels, and possible sign, are decoded by simply reading in the appropriate number of bits (currentBitNumber to minBitNumber and, if some of these are non-zero, the sign bit).
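A matching sketch of the region decoder of Fig. 6 is given below. It assumes the coded representation is available as a string of "0" and "1" characters and that uncoded bits below minBitNumber are reconstructed as zero; `BitReader`, `decode_region` and `decode` are hypothetical names. The sample stream was built by hand for the made-up 2 x 2 region [[5, 0], [0, -1]] coded from bit 2 down to bit 0.

```python
class BitReader:
    """Reads single bits from a string of '0'/'1' characters."""

    def __init__(self, bits):
        self.bits = bits
        self.pos = 0

    def read(self):
        value = int(self.bits[self.pos])
        self.pos += 1
        return value


def decode_region(reader, coeffs, row, col, size, current_bit, min_bit):
    """Fill coeffs in place for one square region (mirror of the encoder)."""
    while current_bit >= min_bit:                      # decision block 604
        if size == 1:                                  # decision block 608
            magnitude = 0
            for b in range(current_bit, min_bit - 1, -1):
                magnitude |= reader.read() << b        # step 610
            if magnitude and reader.read():            # sign only if non-zero
                magnitude = -magnitude
            coeffs[row][col] = magnitude
            return
        if reader.read() == 0:                         # steps 614 and 616
            current_bit -= 1                           # step 618
            continue
        half = size // 2                               # step 620
        for dr, dc in ((0, 0), (0, half), (half, 0), (half, half)):
            decode_region(reader, coeffs, row + dr, col + dc, half,
                          current_bit, min_bit)        # step 622
        return                                         # step 624


def decode(bits, size, max_bit, min_bit):
    """Decode a size x size region; uncoded coefficients stay at zero."""
    coeffs = [[0] * size for _ in range(size)]
    decode_region(BitReader(bits), coeffs, 0, 0, size, max_bit, min_bit)
    return coeffs


stream = "1" + "1010" + "000" + "000" + "0011"
decoded = decode(stream, 2, 2, 0)
truncated = decode("1001", 1, 3, 1)  # -9 coded with minBitNumber 1
```

With minBitNumber above zero the result is only an approximation: the code "1001" for -9 decodes to -8 because bit 0 was never transmitted.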
1.3 Two-Dimensional Example
The method effectively codes the leading zeros of most transform coefficients, while coding the bits from the most significant bit to the predetermined least significant bit, specified by the parameter minBitNumber, and the sign simply as is. Thus, the compression method advantageously represents the leading zeros. This method is very efficient in certain situations, namely for coding discrete wavelet transform image coefficients, which typically exhibit a large dynamic range. A few coefficients typically have very large values, while most have very small values.
An example of encoding a two-dimensional region including 4 x 4 coefficients is described with reference to Figs. 7A to 7D. The processing of the 4 x 4 region 700 of Fig. 7A is commenced with the maxBitNumber set to 7 since this is the largest bit number (bitplane) of all of the coefficients:
 200   13  -11   -8
 -13    3   -4   -3
   8    1   -2   -2
   2   -1   -3   -3
The minBitNumber is set to 3, for illustrative purposes. A header is preferably output in the coded representation containing the maxBitNumber and minBitNumber. The process of coding the region 700 then follows.
At currentBitNumber = 7, a one is output since the region 700 is significant with respect to bit number 7 (see decision blocks 404, 408 and 414 and step 418 of Fig. 4). The region 700 is then partitioned into four sub-regions (see step 420 of Fig. 4): the top left region 710, the top right region 712, the bottom left region 714 and the bottom right region 716 of Fig. 7A. Each of the sub-regions includes 2 x 2 coefficients.
The sub-regions 710, 712, 714 and 716 of Fig. 7A are in turn coded in the predefined processing sequence shown in Fig. 7B, where a region 750 includes four sub-regions 750A to 750D. The three arrows illustrated in the diagram indicate the order or sequence of processing, that is, top left sub-region 750A, top right sub-region 750B, bottom left sub-region 750C, and bottom right sub-region 750D, respectively.
The sub-region 710 of Fig. 7A is coded first (see step 422 of Fig. 4). For the currentBitNumber equal to 7, a one is output in the coded representation. The sub-region 710 is then partitioned into four 1 x 1 pixels having decimal values 200, 13, -13 and 3. Each of these coefficients is coded by outputting the bits of each coefficient from the currentBitNumber = 7 to the minBitNumber = 3 (see decision block 408 and step 410 of Fig. 4). A sign bit is then output if required. Thus, the decimal value 200 is coded as 11001 followed by the sign bit 0. The coefficient value 13 is coded as 00001 with a sign bit 0. The coefficient value -13 is coded as 00001 with a sign bit 1. Finally, the coefficient value 3 is coded as 00000 (without a sign bit). In the coded representation, the bits of coefficient "200" between the currentBitNumber and minBitNumber are preceded by the two "1" bits output for the region 700 and the sub-region 710. This completes the coding of the top left sub-region 710. The coded output at this stage is:
1111001 0 000010 000011 00000
where 1111001 is the two significance bits followed by the bits of 200, the single 0 is the sign bit of 200, and the remaining three groups are the codes for 13, -13 and 3, respectively.
The header information is not shown in the foregoing expression.
The top right sub-region 712 is then coded (per Fig. 7B). A zero is output for each of currentBitNumber equal to 7, 6, 5, and 4, since the region 712 is insignificant with respect to these bit numbers. A one is output at currentBitNumber 3, since this bitplane is significant with respect to bit number 3.
The sub-region 712 is partitioned into the four 1 x 1 pixels having values -11, -8, -4 and -3. These decimal values are coded as bit value 1 with sign bit 1, bit value 1 with sign bit 1, and bit values 0 and 0 without sign bits, respectively. Thus, at this stage, the coded representation is as follows:

    1111001000001000001100000  00001  11  11  0  0
                                      -11 -8  -4 -3

The bottom left sub-region 714 is then encoded. A zero is output for each of currentBitNumber equal to 7, 6, 5, and 4, since the region 714 is insignificant with respect to these bit numbers. A one is output at currentBitNumber equal to 3, since this bitplane is significant with respect to bit number 3. The sub-region 714 is then partitioned into four 1 x 1 pixels having values 8, 1, 2 and -1. These are coded respectively as binary value 1 with sign bit 0, and binary values 0, 0 and 0 without sign bits.
Finally, the bottom right sub-region 716 having values -2, -2, -3 and -3 is coded. A zero is output for each of currentBitNumber 7, 6, 5, 4 and 3 since the sub-region 716 is insignificant with respect to these bit numbers. No sign bits are output. Thus, the coded representation is as follows:

    111100100000100000110000000001111100000011000000000
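The worked example can be checked with a short sketch. The Python below is a minimal illustration under stated assumptions, not the patented implementation: the names `code_coefficient` and `code_region` are hypothetical, the region is assumed to be square with a power-of-two side, and sub-regions are visited in the order of Fig. 7B.

```python
def code_coefficient(value, current_bit, min_bit, out):
    """Emit magnitude bits from current_bit down to min_bit, then a sign bit if any bit is 1."""
    bits = [(abs(value) >> b) & 1 for b in range(current_bit, min_bit - 1, -1)]
    out.extend(bits)
    if any(bits):
        out.append(1 if value < 0 else 0)  # 1 marks a negative coefficient


def code_region(region, current_bit, min_bit, out):
    """Recursively code a square region given as a list of rows (assumed layout)."""
    if current_bit < min_bit:
        return
    size = len(region)
    if size == 1:
        code_coefficient(region[0][0], current_bit, min_bit, out)
        return
    largest = max(abs(v) for row in region for v in row)
    # The region is insignificant at bit n while every magnitude is below 2**n.
    while current_bit >= min_bit and largest < (1 << current_bit):
        out.append(0)
        current_bit -= 1
    if current_bit < min_bit:
        return
    out.append(1)  # region is significant: partition into four sub-regions
    half = size // 2
    for r0, c0 in ((0, 0), (0, half), (half, 0), (half, half)):
        sub = [row[c0:c0 + half] for row in region[r0:r0 + half]]
        code_region(sub, current_bit, min_bit, out)


region_700 = [
    [200, 13, -11, -8],
    [-13, 3, -4, -3],
    [8, 1, -2, -2],
    [2, -1, -3, -3],
]
out = []
code_region(region_700, 7, 3, out)  # maxBitNumber = 7, minBitNumber = 3
coded = "".join(str(b) for b in out)
print(coded)
```

With maxBitNumber 7 and minBitNumber 3, the printed string is the 51-bit coded representation given above (header excluded).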
The decoder simply mimics the encoding process to reconstruct the region from the coded representation as depicted in Fig. 7C.
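A decoder that mirrors the encoder might look like the sketch below. It is an illustration only: `decode_region` is a hypothetical name, the region side (4) and the two bit-number parameters are assumed to be known from the header, and the bit string is the 51-bit representation from the example.

```python
def decode_region(bits, pos, size, current_bit, min_bit):
    """Return (region, next_pos) for a size x size region read from the string `bits`."""
    if current_bit < min_bit:
        return [[0] * size for _ in range(size)], pos
    if size == 1:
        magnitude = 0
        for b in range(current_bit, min_bit - 1, -1):
            magnitude |= int(bits[pos]) << b
            pos += 1
        if magnitude == 0:
            return [[0]], pos  # no sign bit follows an all-zero coefficient
        negative = bits[pos] == "1"
        return [[-magnitude if negative else magnitude]], pos + 1
    # Each leading zero means "insignificant at this bitplane".
    while current_bit >= min_bit and bits[pos] == "0":
        pos += 1
        current_bit -= 1
    region = [[0] * size for _ in range(size)]
    if current_bit < min_bit:
        return region, pos
    pos += 1  # consume the "1" that marks the region as significant
    half = size // 2
    for r0, c0 in ((0, 0), (0, half), (half, 0), (half, half)):
        sub, pos = decode_region(bits, pos, half, current_bit, min_bit)
        for r in range(half):
            region[r0 + r][c0:c0 + half] = sub[r]
    return region, pos


coded = "111100100000100000110000000001111100000011000000000"
decoded, end = decode_region(coded, 0, 4, 7, 3)
for row in decoded:
    print(row)
```

Because bits below minBitNumber are never transmitted, the reconstructed magnitudes are the original magnitudes truncated to multiples of 8 in this example.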
The decoding process can be made "smarter" in a number of ways. One such way is depicted in Fig. 7D. In this case, the magnitude of each non-zero coefficient is increased by half of 2 to the power of minBitNumber. In this manner, the "smart" decoding processing generally reduces the mean square error between the decoded and the original coefficients. Still further, the encoder can alternatively perform this (type of) operation, thereby leaving the decoder to use the simplest decoding process, depicted in Fig. 7C.
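The effect of the "smart" reconstruction can be illustrated numerically. The sketch below is an assumption-laden illustration: `truncate` and `smart` are hypothetical helper names, and truncation stands in for the simple decoder, which only sees bits down to minBitNumber.

```python
def truncate(value, min_bit):
    """Simple decoding: bits below min_bit are lost, so the magnitude is rounded down."""
    magnitude = (abs(value) >> min_bit) << min_bit
    return -magnitude if value < 0 else magnitude


def smart(value, min_bit):
    """Smart decoding: add half of 2**min_bit to the magnitude of each non-zero result."""
    decoded = truncate(value, min_bit)
    if decoded == 0:
        return 0
    offset = (1 << min_bit) // 2
    return decoded + offset if decoded > 0 else decoded - offset


coefficients = [200, 13, -11, -8, -13, 3, -4, -3, 8, 1, -2, -2, 2, -1, -3, -3]
min_bit = 3
sse_simple = sum((c - truncate(c, min_bit)) ** 2 for c in coefficients)
sse_smart = sum((c - smart(c, min_bit)) ** 2 for c in coefficients)
print(sse_simple, sse_smart)
```

For the 4 x 4 region of the example, the sum of squared errors drops from 125 to 117, consistent with the statement that the mean square error is generally reduced.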
1.4 Encoding Process of Second SWEET Image Compression Method

A coding process according to an alternate method is hereinafter described with reference to Figs. 9 to 12.
A discrete wavelet transform of an entire digital image can be performed on a block-by-block basis. The result of the transformation upon each block is a set of coefficients, which are essentially equivalent to a set of spatially corresponding coefficients of a discrete wavelet transform of the entire image. For example, from a predetermined set of coefficients of a DWT for an entire image, a portion or block of the digital image can be reproduced to a specified detail. Selecting the predetermined set of coefficients from the frequency domain amounts substantially to representing the corresponding portion of a digital image (the block) from the spatial domain. A block based DWT of a digital image can be performed by decomposing an image into a plurality of blocks and applying the transform to each block independently, thereby substantially evaluating those DWT coefficients relevant to the current spatial location.
The advantage of adopting a block-based transform approach is that a block can be subsequently encoded with minimal interaction with (that is, substantially independently of) any other block of the image. Block-based techniques are inherently memory localized and therefore are generally efficient when implemented using computer systems.
Fig. 9 is a flow diagram illustrating the block-based encoding process according to the second encoding method. Processing commences at step 902. In step 904, a header is output. This information preferably includes the image height and width, the block size, the number of levels of the DWT, and two coding parameters maxBitNumber and minBitNumber. Optionally, more or less header information may be used depending upon the application.
The coding parameter maxBitNumber can be selected in a variety of ways. If the block DWT is performed on all image blocks prior to coding of any of them, the maxBitNumber can be chosen to be the MSB number of the largest coefficient across all DWT blocks. For example, if the largest coefficient is 10000001 (decimal value 129), the maxBitNumber is set to 7 since the MSB is bit number 7. Alternatively, a deterministic bound can be used which is determined by the transform and the resolution of the input image. For example, with an 8-bit input image (level shifted to 7-bits plus sign) and the Haar transform, the largest MSB is bounded by J+7 where J is the number of levels of the DWT. If the blocks are small, the selection of this parameter can have a significant effect on compression. In some instances, more sophisticated ways of selecting maxBitNumber may be employed. However, this depends upon the specific application.
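The two ways of choosing maxBitNumber described above can be sketched as follows. The function names are hypothetical, blocks are assumed to be lists of rows of integer coefficients, and the bound helper covers only the 8-bit Haar case stated in the text.

```python
def msb_number(value):
    """Bit number of the most significant 1 bit of |value| (-1 for zero)."""
    return abs(value).bit_length() - 1


def max_bit_number(blocks):
    """MSB number of the largest-magnitude coefficient across all DWT blocks."""
    return max(msb_number(c) for block in blocks for row in block for c in row)


def haar_bound_8bit(levels):
    """Deterministic bound J+7 quoted for an 8-bit input image and the Haar transform."""
    return levels + 7


blocks = [[[129, -5], [3, 0]], [[-17, 2], [0, 1]]]
print(max_bit_number(blocks))   # data-dependent choice
print(haar_bound_8bit(3))       # data-independent bound for a three-level DWT
```

The data-dependent choice needs every block to be transformed before coding starts, whereas the bound can be written to the header immediately.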
The parameter minBitNumber determines the compression ratio versus quality trade off and can be varied. For example, for nearly orthogonal transforms, a value of 3 provides adequate image quality for 8-bit, grey-scale or 24-bit, RGB images.
In step 906, the image is decomposed into blocks (or an image block is formed). The image is decomposed preferably into overlapping blocks. However, non-overlapping blocks may be employed. The block of coefficients can be as large as the whole original image, or as small as a block of 8 x 8 coefficients (for a three-level transform). For low memory applications, a block that is as small as possible may be employed. Generally, a block size of 16 coefficients is sufficient for higher levels of compression with a three or four level DWT. A block size of 8 x 8 coefficients with a three-level DWT can maintain good coding efficiency by employing differential pulse code modulation (DPCM) on the DC coefficient of each block.
In step 908, each block is level shifted and the transform is performed.
Preferably, a DWT is employed. The image values are level shifted (for example, by 128 for an 8-bit image) to reduce or eliminate any undue mean bias, and each spatial block of the image is transformed. For a DWT, usually some knowledge of the block surrounding the current block is needed (and similarly for the inverse DWT), although this is not strictly required.
In step 910, the block is coded using the maxBitNumber and minBitNumber parameters. Processing terminates in step 912.
Step 910 for coding a block is illustrated in detail in the flow diagram of Fig.
The inputs to the block coding process of Fig. 10 include the currentBitNumber and the minBitNumber parameters. With reference to step 910 of Fig. 9, the maxBitNumber is input as the currentBitNumber parameter. Processing commences in step 1002. In decision block 1004, a check is made to determine if the currentBitNumber is less than the minBitNumber. If decision block 1004 returns true (yes), processing continues at step 1006. In step 1006, execution returns to the calling process, thereby indicating that every coefficient in the block has an MSB number less than the minBitNumber. Otherwise, if decision block 1004 returns false (no), processing continues at decision block 1008.
In decision block 1008, a check is made to determine if a current block is significant. If decision block 1008 returns false (no), processing continues at step 1010. In step 1010, a zero is output in the coded representation and the currentBitNumber is decremented, that is, the next lower bit plane is selected. Processing then continues at decision block 1004. Otherwise, if decision block 1008 returns true (yes), processing continues at step 1012.
Decision blocks 1004 and 1008 along with step 1010 enable the process to find the MSB number of the largest coefficient in the block. A block is insignificant with respect to the currentBitNumber if the MSB number of every coefficient in the block is less than the currentBitNumber. This is repeated until the bitplane of the block is significant or the currentBitNumber is less than the minBitNumber.
In step 1012, a one is output in the coded representation to indicate the bitplane is significant. In step 1014, the DC subband is coded. In step 1016, the block detail is coded using the parameters J, currentBitNumber and minBitNumber. In step 1018, execution returns to the calling procedure. Thus, given that the block is significant, steps 1012, 1014 and 1016 are carried out to use the (generalized) quadtree segmentation to find all coefficients with an MSB number greater than the minBitNumber. If the block is significant, it is partitioned into two "sub-blocks": the DC subband coefficients and the block including the remaining coefficients, referred to as the "block detail" for level J since it represents the high frequency information about the block of level J at all lower levels.
Step 1014 of Fig. 10 for coding the DC subband is illustrated in detail by the flow diagram of Fig. 12. That is, Fig. 12 shows the process of coding a subband or sub-block using currentBitNumber and minBitNumber parameters. In step 1202, processing commences. In decision block 1204, a check is made to determine if the currentBitNumber is less than the minBitNumber. If decision block 1204 returns true (yes), processing continues at step 1206. In step 1206, execution returns to the calling procedure. Otherwise, if decision block 1204 returns false (no), processing continues at decision block 1208.
In decision block 1208, a check is made to determine if the (subband) block size is 1 x 1 pixels. If decision block 1208 returns true (yes), processing continues at step 1210. In step 1210, the 1 x 1 pixel is coded. This involves outputting the bits between the currentBitNumber and the minBitNumber, inclusive, followed by a sign bit if necessary. Processing then returns to the calling procedure in step 1212. Otherwise, if decision block 1208 returns false (no), processing continues at decision block 1214.
In decision block 1214, a check is made to determine if the (subband) block is significant. If decision block 1214 returns false (no), processing continues at step 1216. In step 1216, a zero is output in the coded representation and the currentBitNumber is decremented. Processing then continues at decision block 1204.
Otherwise, if decision block 1214 returns true (yes), processing continues at step 1218.
In step 1218, a one is output in the coded representation to indicate that the (subband) block is significant. In step 1220, the (subband) block is partitioned into four sub-blocks. In step 1222, each sub-block is coded using the parameters currentBitNumber and minBitNumber, by means of a recursive call to the process of Fig. 12. In step 1224, execution returns to the calling procedure.
Thus, in the process of Fig. 12, a subband or sub-block thereof is coded. The largest MSB number is isolated as before. If the sub-block includes only one pixel, it is coded as a single coefficient. Otherwise, the currentBitNumber is decremented and a zero is output in the coded representation until the currentBitNumber is less than the minBitNumber, or the subband (sub-block) is significant. If the subband (sub-block) is significant, it is partitioned into four (as close to equal as possible) sub-blocks, and these are coded in turn. A single coefficient, for example the DC coefficient, is encoded by outputting the coefficient bits from the currentBitNumber to the minBitNumber. Again, the sign is preferably only output if some of the coefficient bits are non-zero.
Step 1016 of Fig. 10 for coding block detail is illustrated by the flow diagram of Fig. 11. In step 1102, processing commences. In decision block 1104, a check is made to determine if the currentBitNumber is less than the minBitNumber. If decision block 1104 returns true (yes), execution returns to the calling procedure in step 1106.
Otherwise, if decision block 1104 returns false (no), processing continues at decision block 1108.
In decision block 1108, a check is made to determine if the block (detail) is significant. If decision block 1108 returns false (no), processing continues at step 1110. In step 1110, a zero is output in the coded representation and the currentBitNumber is decremented. Processing then continues at decision block 1104.
Otherwise, if decision block 1108 returns true (yes), processing continues at step 1112.
In step 1112, a one is output in the coded representation to indicate that the block (detail) is significant. In step 1114, each of the high-low (HL), low-high (LH), and high-high (HH) frequency subbands is coded. The HL, LH, and HH frequency subbands of each resolution are commonly referred to as AC subbands. Each of these subbands is coded in accordance with the process of Fig. 12. In step 1116, the block detail is coded using the parameters J-1, currentBitNumber and minBitNumber (if the block detail exists) by means of a recursive call to the process illustrated in Fig. 11.
Execution returns to the calling procedure in step 1118.
Thus, the block detail for level J is processed to first isolate the MSB number of the largest coefficient. This is done by decrementing the currentBitNumber and outputting zeros until the block is significant. The block is then partitioned into the three high frequency subbands at level J and the block detail for level J-1 (if that block detail exists). This partitioning approach is motivated by the so-called 1/f type spectral models.
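The flow of Figs. 10 to 12 can be summarised in a compact sketch. It is an illustration under several assumptions: all function names are hypothetical, the block is a square list of rows with a power-of-two side, subbands are laid out with HL to the right of and LH below the low-pass quadrant at each level, and the recursion on block detail stops once level 1 has been coded. The demo reuses the 4 x 4 coefficients of the earlier example, read here as a two-level DWT block purely for illustration.

```python
def skip_insignificant(values, current_bit, min_bit, out):
    """Output a zero per insignificant bitplane; return the resulting bit number."""
    while current_bit >= min_bit and all(abs(v) < (1 << current_bit) for v in values):
        out.append(0)
        current_bit -= 1
    return current_bit


def code_subband(sub, current_bit, min_bit, out):
    """Fig. 12: quadtree coding of one subband (or sub-block)."""
    if current_bit < min_bit:
        return
    size = len(sub)
    if size == 1:
        value = sub[0][0]
        bits = [(abs(value) >> b) & 1 for b in range(current_bit, min_bit - 1, -1)]
        out.extend(bits)
        if any(bits):
            out.append(1 if value < 0 else 0)
        return
    current_bit = skip_insignificant([v for row in sub for v in row], current_bit, min_bit, out)
    if current_bit < min_bit:
        return
    out.append(1)
    half = size // 2
    for r0, c0 in ((0, 0), (0, half), (half, 0), (half, half)):
        code_subband([row[c0:c0 + half] for row in sub[r0:r0 + half]], current_bit, min_bit, out)


def code_detail(block, level, current_bit, min_bit, out):
    """Fig. 11: code the block detail for `level` (all subbands at this and lower levels)."""
    if current_bit < min_bit:
        return
    n, s = len(block), len(block) >> level  # s is the subband side at this level
    detail = [block[r][c] for r in range(n) for c in range(n) if r >= s or c >= s]
    current_bit = skip_insignificant(detail, current_bit, min_bit, out)
    if current_bit < min_bit:
        return
    out.append(1)
    for r0, c0 in ((0, s), (s, 0), (s, s)):  # HL, LH, HH (assumed layout)
        code_subband([row[c0:c0 + s] for row in block[r0:r0 + s]], current_bit, min_bit, out)
    if level > 1:
        code_detail(block, level - 1, current_bit, min_bit, out)


def code_block(block, levels, max_bit, min_bit):
    """Fig. 10: code one transformed block and return the list of output bits."""
    out = []
    current_bit = skip_insignificant([v for row in block for v in row], max_bit, min_bit, out)
    if current_bit < min_bit:
        return out
    out.append(1)
    s = len(block) >> levels
    code_subband([row[:s] for row in block[:s]], current_bit, min_bit, out)  # DC subband
    code_detail(block, levels, current_bit, min_bit, out)
    return out


block = [[200, 13, -11, -8], [-13, 3, -4, -3], [8, 1, -2, -2], [2, -1, -3, -3]]
block_bits = "".join(str(b) for b in code_block(block, 2, 7, 3))
print(block_bits)
```

Note that the bit number reached while searching the block detail at level J is passed on unchanged to the detail at level J-1, so zeros already emitted for the higher bitplanes are not repeated.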
The decoding process for the second method can be implemented by mimicking the coding process described with reference to Figs. 9 to 12.
The encoding and decoding methods and apparatuses represent digital image data in an efficient and flexible manner, in which the representation is suitable for storing and/or transmitting images. The encoding techniques can be used generally to represent an array of transform coefficients, and to provide an efficient representation by representing an image in the discrete wavelet transform domain. In particular, the methods and apparatuses represent (or code) leading zeros of blocks of transform coefficients obtained from an input image. The techniques are efficient in terms of offering a good reproduction of the original image for a given size code and offering fast decoding. Further, the techniques are flexible in that coefficients obtained from a linear transformation are encoded independently without the use of entropy coding.
The advantageous aspects of the methods include the depth first nature of the coding.
Further, in the case of coding subbands, the advantageous aspects of the method include hierarchical coding of each subband separately.
2. Preferred Embodiment(s) of Method

The preferred embodiment proceeds initially by means of a wavelet transform of image data. A description of the wavelet transform process is given in many standard texts and in particular the aforementioned book by Stollnitz et al. An overview of the standard wavelet process will now be described with reference to the accompanying drawings.
Referring initially to Fig. 13, an original image 1 is transformed utilising a Discrete Wavelet Transform (DWT) into four subimages 3-6. The subimages or subbands are normally denoted LL1, HL1, LH1 and HH1. The one suffix on the subband names indicates level 1. The LL1 subband is a low pass decimated version of the original image.
The wavelet transform utilised can vary and can include, for example, Haar basis functions, Daubechies basis functions etc. The LL1 subband is then in turn utilised and a second Discrete Wavelet Transform is applied as shown in Fig. 14, giving subbands LL2, HL2, LH2 and HH2. This process is continued, for example as illustrated in Fig. 15, wherein the LL4 subband is illustrated, the LL4 band decomposition process being referred to as an octave band filter bank with the LL4 subband being referred to as the DC subband. Obviously, further levels of decomposition can be provided depending on the size of the input image.
Each single level DWT can in turn be inverted to obtain the original image.
Thus a J-level DWT can be inverted as a series of J single-level inverse DWTs.
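The relationship between a J-level transform and its inverse can be illustrated with the Haar basis, one of the options mentioned above. The sketch below is only an illustration: the function names are hypothetical, the orthonormal scaling (division by 2 per 2 x 2 group) and the subband placement (HL top right, LH bottom left, HH bottom right) are assumptions, and the image is assumed square with a side divisible by 2 to the power J.

```python
def haar_forward_level(a, size):
    """One-level 2D Haar transform of the top-left size x size corner, in place."""
    half = size // 2
    src = [row[:size] for row in a[:size]]
    for r in range(half):
        for c in range(half):
            p, q = src[2 * r][2 * c], src[2 * r][2 * c + 1]
            s, t = src[2 * r + 1][2 * c], src[2 * r + 1][2 * c + 1]
            a[r][c] = (p + q + s + t) / 2                # LL
            a[r][c + half] = (p - q + s - t) / 2         # HL
            a[r + half][c] = (p + q - s - t) / 2         # LH
            a[r + half][c + half] = (p - q - s + t) / 2  # HH


def haar_inverse_level(a, size):
    """Invert one level on the top-left size x size corner, in place."""
    half = size // 2
    src = [row[:size] for row in a[:size]]
    for r in range(half):
        for c in range(half):
            ll, hl = src[r][c], src[r][c + half]
            lh, hh = src[r + half][c], src[r + half][c + half]
            a[2 * r][2 * c] = (ll + hl + lh + hh) / 2
            a[2 * r][2 * c + 1] = (ll - hl + lh - hh) / 2
            a[2 * r + 1][2 * c] = (ll + hl - lh - hh) / 2
            a[2 * r + 1][2 * c + 1] = (ll - hl - lh + hh) / 2


def dwt(image, levels):
    """Apply `levels` single-level transforms, each to the previous LL subband."""
    a = [[float(v) for v in row] for row in image]
    size = len(a)
    for _ in range(levels):
        haar_forward_level(a, size)
        size //= 2
    return a


def idwt(coeffs, levels):
    """Undo the transform as a series of single-level inverses, coarsest level first."""
    a = [list(row) for row in coeffs]
    size = len(a) >> (levels - 1)
    for _ in range(levels):
        haar_inverse_level(a, size)
        size *= 2
    return a


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
restored = idwt(dwt(image, 2), 2)
print(restored == [[float(v) for v in row] for row in image])
```

The inverse simply applies the single-level inverse J times, starting from the smallest (coarsest) corner and doubling the size at each step.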
An image can be coded hierarchically by first coding the DC subband. Then, the remaining subbands are coded in order of decreasing level. That is, for a 4 level DWT, the subbands at level 4 (the HL4, LH4 and HH4 subbands) are coded after the DC subband (LL4). The subbands at level 3 (HL3, LH3, and HH3) are then coded, followed by those at level 2 (HL2, LH2 and HH2) and then level 1 (HL1, LH1 and HH1).
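The coding order just described is easy to generate programmatically. The helper below is a hypothetical illustration that lists the subband names for a J-level DWT in the order in which they would be coded.

```python
def subband_order(levels):
    """DC subband first, then the HL, LH, HH subbands in order of decreasing level."""
    order = [f"LL{levels}"]
    for level in range(levels, 0, -1):
        order += [f"HL{level}", f"LH{level}", f"HH{level}"]
    return order


print(subband_order(4))
```

For a 4 level DWT the list starts with LL4 and ends with HH1, and contains 13 subbands in total.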
With standard images, the encoded subbands normally contain the "detail" information in an image. Hence, they often consist of a sparse array of values and substantial compression can be achieved by quantisation of the subbands and efficient encoding of their sparse matrix form. Further, it is known that there is often a degree of correlation between corresponding subbands at each level and this can be taken advantage of in any encoding scheme.
In the preferred embodiment, a DWT is utilised to exploit initial image pixel correlation. Subsequently, the method of the preferred embodiment takes advantage of the correlation between sibling subbands in a DWT rather than a parent-child correlation. The preferred embodiment treats each of the three detailed coefficients at a particular frequency level or scale as a single coding entity (hereinafter called a triplet).
A significance test is done on each triplet in a region, which will initially include the complete subband. If a region is deemed to be significant, it is partitioned into a number of subregions which then have their significance tested. If a region is not significant, the significance level is reduced. When a subregion gets below a predefined size, each triplet in the subregion is coded as is. The remaining bits of each coefficient in the triplet are output.
When an image has been decomposed into its DWT coefficient subbands these coefficients are usually treated as signed binary integers. However, in an alternative implementation the coefficients may be stored as floating point numbers. Given a binary integer representation of a number it is possible to define "bit n" or bit number n to refer to the binary digit n places to the left of the least significant bit. For example, the decimal number 9 is represented as 00001001, assuming an 8-bit representation. In this number bit (number) 3 = 1, bit 2 = 0, bit 1 = 0, and bit 0 = 1. For coding applications the number of bits per coefficient, required to represent the possible range of coefficients, is determined by the transform and the input pixel resolution (in bits per pixel). Typically this range is large relative to the values of most of the transform coefficients, and thus many coefficients have a large number of leading zeros. (For example the number 9 has 4 leading zeros in the 8-bit representation above, or has 12 leading zeros assuming a 16-bit representation). The method of the preferred embodiment provides a means of representing (coding) these leading zeros, for blocks of coefficients, in an efficient manner. The remaining bits and sign can be encoded as is.
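The bit-number convention and the leading-zero counts quoted above can be verified with two small helpers. Both names are hypothetical, and the magnitude is used so that the helpers also apply to negative coefficients.

```python
def bit(value, n):
    """Binary digit n places to the left of the least significant bit of |value|."""
    return (abs(value) >> n) & 1


def leading_zeros(value, width):
    """Number of leading zeros of |value| in a representation of `width` bits."""
    return width - abs(value).bit_length()


print([bit(9, n) for n in (3, 2, 1, 0)])
print(leading_zeros(9, 8), leading_zeros(9, 16))
```

For the decimal number 9 this gives bits 1, 0, 0, 1 for bit numbers 3 down to 0, and 4 or 12 leading zeros for 8-bit or 16-bit representations respectively.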
After the image has been transformed using the DWT a parameter, maxBitNumber, is set to the most significant bit (MSB) of the largest absolute coefficient value. For example, if the largest coefficient is (decimal 9) 00001001, then maxBitNumber is set to 3, since the MSB is bit number 3. Alternatively it is possible to set maxBitNumber to be any value that is (guaranteed to be) larger than the MSB of the largest pixel absolute value.
A coding parameter, minBitNumber specifies the decoded image quality. This parameter specifies the precision of every coefficient in the transformed image and can be varied as required.
In a practical implementation of an encoder of the preferred embodiment, header information is initially output. For example, the image height and width, the number of levels of the DWT, the mean of the DC subband, maxBitNumber, and the minBitNumber parameter can all be output in the header.
Each of the DWT subbands is then processed in a predefined order known to both the encoder and decoder. Preferably the encoding/decoding should start with the lowest frequency subband and then continue with the three detail subbands at the next highest frequency. For example, a 3 level decomposition would be processed in the following order: LL3, followed by the triplet subband T3 (HL3, LH3, HH3), followed by T2 (HL2, LH2, HH2), and finally T1 (HL1, LH1, HH1).
Each triplet subband normally includes a three dimensional array containing all of the orientation coefficients (HL, LH, and HH) from one high-pass level of the DWT decomposition. For example, taking a simple case, consider the 2 level DWT of a 4x4 image shown in Fig. 16, for which there are two triplet subbands, T2 and T1. At each spatial location in a region there are three coefficients to be processed. In general, a region has size IxJx3, where I and J are the (variable) spatial size of, in this case, a rectangular region. In this way, at every location in the region there are 3 DWT coefficients. This is in contrast to most other DWT coding schemes that treat the coefficients in the HL, LH, and HH subbands, at one level, separately.
The DC (LL) subband from the DWT can be coded using the SWEET method, pulse code modulation (PCM), differential PCM (DPCM), or any other suitable coding technique. The actual method used to code the LL subband is not too important as it typically contains a relatively small number of coefficients: for a four-level two-dimensional DWT the DC subband contains 1/256th of the pixels in the original image. Therefore, the coding overhead for the DC subband does not significantly affect the overall coding efficiency.
After the DC subband has been coded, the three detail subbands at the next lowest level (next highest frequency) are coded as a triplet subband. This is done by setting an initial region to be the same size as one whole subband. However, at each location in the region there are three coefficients, from the HL, LH, and HH subbands. Coding of the region starts with the parameter currentBitNumber = maxBitNumber, and with minBitNumber set from the required quality of reconstruction.
The region coding process can be basically the same as the SWEET method except that at each location in a region there are three coefficients (rather than just one) to be encoded. The region coding process is then preferably passed currentBitNumber and minBitNumber as parameters and proceeds as follows:

If currentBitNumber is less than minBitNumber, nothing is done. Basically this means that every coefficient in this region has an MSB number less than minBitNumber.

If currentBitNumber is not less than minBitNumber, the method proceeds as follows:

If the region has a size of 1x1x3, that is, it consists of just one triplet of coefficients, then the bits from currentBitNumber to minBitNumber (inclusive) are output in a consistent order such as {HL, LH, HH}. Alternatively, the leading zeros of these three coefficients could be further encoded using the conventional SWEET method.

If the region has a size greater than 1x1x3, the method proceeds as follows: The region is said to be insignificant if the MSB number of each triplet of coefficients in the region is less than currentBitNumber. If this is the case, a zero is output and the region is coded with the parameters currentBitNumber-1 and minBitNumber.

If the region is significant, it is partitioned into 4 (or any other number of) subregions, using some specified partitioning method that is known to both the encoder and decoder. A suitable method can be a quadtree partition of the regions. Each subregion is then coded with the same currentBitNumber and minBitNumber parameters.
A triplet is significant when any one of the 3 coefficients at that location is significant. That is, the significance test can be done on the absolute maximum of the HL, LH, and HH coefficients at each location in the region. An alternative way to consider this partitioning is to construct a two dimensional region consisting of the maximum coefficient values from the HL, LH, and HH subbands at each location. Then conventional encoding such as SWEET encoding can be done on this region with the modification that when a region reaches its minimum size (normally 1x1) the remaining bits of all three coefficients are output.
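The region coding rules listed above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: `code_triplet_region` is a hypothetical name, a region is assumed to be a square, power-of-two list of rows whose entries are (HL, LH, HH) tuples, a one is emitted when a region is found significant (as in the SWEET method), and the sign of a coefficient is output only when some of its emitted bits are non-zero. A region of size 1x1x3 is coded directly, as the rules state, without a preceding significance search. The demo region is made up for illustration.

```python
def code_triplet_region(region, current_bit, min_bit, out):
    """Code a region whose entries are (HL, LH, HH) triplets."""
    if current_bit < min_bit:
        return
    size = len(region)
    if size == 1:
        for value in region[0][0]:  # consistent order: HL, LH, HH
            bits = [(abs(value) >> b) & 1 for b in range(current_bit, min_bit - 1, -1)]
            out.extend(bits)
            if any(bits):
                out.append(1 if value < 0 else 0)
        return
    # Significance is tested on the largest magnitude found in any triplet.
    largest = max(abs(v) for row in region for triplet in row for v in triplet)
    while current_bit >= min_bit and largest < (1 << current_bit):
        out.append(0)
        current_bit -= 1
    if current_bit < min_bit:
        return
    out.append(1)
    half = size // 2
    for r0, c0 in ((0, 0), (0, half), (half, 0), (half, half)):
        sub = [row[c0:c0 + half] for row in region[r0:r0 + half]]
        code_triplet_region(sub, current_bit, min_bit, out)


# Hypothetical 2x2x3 region, coded from bit number 3 down to bit number 1.
demo_region = [[(5, -6, 0), (1, 0, 0)], [(0, 0, -1), (2, 1, 0)]]
demo_out = []
code_triplet_region(demo_region, 3, 1, demo_out)
print("".join(str(b) for b in demo_out))
```

Compared with coding the HL, LH and HH subbands separately, one significance bit here covers three coefficients per location, which is the point of treating the triplet as a single coding entity.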
Each coefficient of the triplet is coded by simply outputting the pixel bits from currentBitNumber to minBitNumber. A convention can be followed whereby the sign is then output only if some of these pixel bits are non-zero. For example, if currentBitNumber = 3 and minBitNumber = 1, then -9 (00001001) is coded as "100" followed by a sign bit of 1.

A Coding Example

For illustration of the operation of the preferred embodiment, consider the 4x4 set of coefficients 20 shown in Fig. 16. MaxBitNumber is set to 7 since this is the largest bit number (bitplane) of all of the coefficients (200). MinBitNumber is set to 2 in this case for illustrative purposes. The header of the compressed file will preferably contain a coded representation of maxBitNumber and minBitNumber for synchronisation with the decoder.
The low-pass coefficient (200) is equivalent to binary 11001000 and is significant at maxBitNumber = 7, and so a 1 is output and the coefficient is coded to minBitNumber (including a sign bit):

    1110010.0
    200

For ease of illustration, the sign bit has been highlighted in this notation as the bit after the decimal point. White-space has also been added between coefficients.
The three detail coefficients at the next level in the wavelet tree (13, -13, and 3) are then used to form a triplet. The absolute maximum of this triplet is 13 which, with currentBitNumber = maxBitNumber = 7, is not significant. A zero is output and currentBitNumber decremented. This continues until currentBitNumber = 3, when the triplet becomes significant. The output is then as follows:

    0000111.011.100
    13 -13 3

With the HL2, LH2, and HH2 coefficients coded, the next step is to proceed to the next triplet of subbands HL1, LH1, and HH1. The triplet subband is not significant at currentBitNumber = 7 and so a zero is output and currentBitNumber decremented. This continues until currentBitNumber = 3, when the triplet becomes significant. The region is then split into four single triplets, a one is output, and the triplets are then coded as is (as they are of size 1x1x3):

    00001110.110.00010.10000000000000000
    -11 8 -2   -8 1 -2   -4 2 -3   -3 -1 -3

The complete output from the coefficient coding is thus:

    111001000000111011100000011101100001010000000000000000

The decoder then simply mimics the encoding process to reconstruct the original set of 4x4 DWT coefficients.
It will be evident to the person skilled in the art that the teachings of the preferred embodiment can be adapted in many different ways. For example, the preferred embodiment can be utilised adaptively with other systems such as the basic SWEET methodology depending on which is the most efficient. This can be carried out on a block by block basis if necessary. Further, particular techniques of rounding of coefficients can be employed before the coefficients are encoded in order to reduce the overall mean square error. Such rounding methods are disclosed in the associated Australian provisional patent application P04728 described herein.
Fig. 17 is a flow diagram illustrating an image encoding method in accordance with the preferred embodiment. In step 1702, processing commences using an input image. In step 1704, the input image is transformed using a linear transformation, preferably a discrete wavelet transform. An initial region is defined to be the whole image. For example, in the case of a three-level DWT of the input image, the resulting coefficients including the 10 subbands can be specified as the region. Alternatively each subband can be processed separately, setting each initial region to the whole subband in question.
In step 1706, the most significant bit (msb) of the largest absolute value of the transform coefficients is determined and a parameter, maxBitNumber, is set to this bit number. For example, if the largest transform coefficient has a binary value of 00001001 (decimal 9), the parameter maxBitNumber is set to 3, since the msb is bit number 3. Alternatively, the parameter maxBitNumber may be set to be any value that is larger than the msb of the largest absolute value of the transform coefficients.
Further, in step 1706, a coding parameter, minBitNumber, is set to specify the coded image quality. In particular, this coding parameter specifies the precision of every coefficient in the transformed image and can be varied as required. For example, a minBitNumber of 3 provides a coarser reproduction of the original image than does a value of 1.
Optionally, the technique involves step 1708, which provides an output header in the coded representation of the input image. Thus, in a practical implementation, header information is output as part of the coded representation. For example, the output header may contain information about the source image, including the image height and width, the number of levels of the DWT, the mean value of the DC subband, the maxBitNumber parameter, and the minBitNumber parameter.
Beginning in step 1710, each level is separately encoded from low to high frequency in the following manner. Firstly, in step 1712, the DC subband is encoded.
In the next step 1714, the next highest frequency triplet is encoded. This is achieved by calling the function CodeTriplet(currentBitNumber, minBitNumber). After step 1714, the processing continues at decision block 1716. In the decision block 1716, a check is made whether all subbands have been encoded. If the decision block returns false (no), then the processing returns to step 1714. Otherwise, the processing terminates at step 1718.
Fig. 18 is a detailed flow diagram of the procedure "CodeTriplet(currentBitNumber, minBitNumber)" called in step 1714 of Fig. 17 for coding each triplet or region, where maxBitNumber is provided as the currentBitNumber. In step 1802, processing commences. The inputs to the triplet/region coding process of Fig. 18 include the currentBitNumber and minBitNumber parameters. Preferably, the method is implemented as a recursive technique where the process is able to call itself with a selected region or sub-region. However, the process may be implemented in a non-recursive manner.
In decision block 1804, a check is made to determine if the currentBitNumber parameter is less than the minBitNumber parameter. If decision block 1804 returns true (yes), nothing is done and processing returns to the calling procedure in step 1806. This condition indicates that every coefficient in the selected region has an msb number less than minBitNumber. If decision block 1804 returns false (no), processing continues at decision block 1808.
In decision block 1808, a check is made to determine if the selected region is a triplet, i.e. 1 x 1 x 3 pixels. If decision block 1808 returns true (yes), processing continues at step 1810. In step 1810, the triplet of 1 x 1 x 3 pixels is coded in the manner described above. In step 1812, processing returns to the calling procedure. Otherwise, if decision block 1808 returns false (no), the region includes more than one triplet and processing continues at decision block 1814.
In decision block 1814, the selected region is checked to determine if it is significant. That is, the significance of the region is tested. The region is said to be insignificant if the msb number of each coefficient in the region is less than the value of the currentBitNumber parameter.
If decision block 1814 returns false (no), processing continues at step 1816. In step 1816, a value of 0 (or first token) is output in the coded representation stream, and the currentBitNumber parameter is decremented by 1. That is, the next, lower bitplane of the region is selected for processing. Processing then continues at decision block 1804, where the region is again processed with the parameters currentBitNumber-1 and minBitNumber. Otherwise, if decision block 1814 returns true (yes), that is, the region is significant, processing continues at step 1818.
In step 1818, a value of 1 (or second token) is output in the coded representation stream. In step 1820, the selected region is partitioned into a predetermined number (preferably, 4) of subregions using a specified partitioning algorithm. The partitioning algorithm used is known to the decoder. In this method, square regions are used. A region is partitioned preferably into 4 equal-sized (square) subregions.
In step 1822, each subregion is then coded with the same currentBitNumber and minBitNumber parameters. This is preferably done by means of a recursive call to the procedure "CodeTriplet(currentBitNumber, minBitNumber)" of Fig. 18. This coding of subregions may be implemented in parallel or sequentially. In the latter case, the processing may commence from a low frequency subband to higher frequency subbands in turn.
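The region coding procedure of Fig. 18 may be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation of the embodiment: bits are numbered from 0 at the least significant bit, so that a coefficient has an msb number of at least b exactly when its magnitude shifted right by b is non-zero; a region is a square list of lists whose side is a power of two; and the coding of a single triplet, which is described earlier in the specification and not reproduced in this section, is assumed to emit the magnitude bits from currentBitNumber down to minBitNumber followed by a sign bit when those bits are not all zero. The names `code_triplet`, `code_single_triplet` and `is_significant` are hypothetical.

```python
from typing import List, Tuple

Triplet = Tuple[int, int, int]   # co-located coefficients from three subbands
Region = List[List[Triplet]]     # square region, side length a power of two


def is_significant(region: Region, bit: int) -> bool:
    """Block 1814: true if some coefficient has msb number >= bit."""
    return any(abs(c) >> bit for row in region for t in row for c in t)


def code_single_triplet(t: Triplet, current_bit: int, min_bit: int,
                        out: List[int]) -> None:
    """Step 1810 (ASSUMED format): magnitude bits, then a sign bit if non-zero."""
    for c in t:
        bits = [(abs(c) >> b) & 1 for b in range(current_bit, min_bit - 1, -1)]
        out.extend(bits)
        if any(bits):
            out.append(1 if c < 0 else 0)


def code_triplet(region: Region, current_bit: int, min_bit: int,
                 out: List[int]) -> None:
    """Recursive region coder following decision blocks 1804 to 1822."""
    while current_bit >= min_bit:                 # block 1804
        if len(region) == 1:                      # block 1808: single triplet
            code_single_triplet(region[0][0], current_bit, min_bit, out)
            return                                # step 1812
        if is_significant(region, current_bit):   # block 1814
            out.append(1)                         # step 1818: second token
            half = len(region) // 2               # step 1820: 4 square subregions
            for r0 in (0, half):
                for c0 in (0, half):
                    sub = [row[c0:c0 + half] for row in region[r0:r0 + half]]
                    code_triplet(sub, current_bit, min_bit, out)  # step 1822
            return
        out.append(0)                             # step 1816: first token
        current_bit -= 1                          # select next lower bitplane
    # currentBitNumber < minBitNumber: nothing more is output (step 1806)
```

The loop replaces the return to decision block 1804 after step 1816, while the recursion in step 1822 passes the unchanged currentBitNumber to each subregion, as the text requires. The subregions are visited sequentially here; the text also permits coding them in parallel.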
Fig. 19 is a flow diagram illustrating a method of decoding the coded representation of an image obtained using the process of Figs. 17 and 18. In step 1902, processing commences using the coded representation. In step 1904, the header information is read from the coded representation to determine the size of the original image, and hence the initial region size. Also, information such as maxBitNumber (equal to the initial currentBitNumber in the coding process) and minBitNumber are input. Further information includes the mean value of the DC subband.
In step 1906, the region is set to the DC subband and in step 1908 the DC subband is decoded. In the next step 1910, decoding is commenced by setting the selected region to each triplet subband in turn. In step 1912, the selected region is decoded using the maxBitNumber and minBitNumber parameters. This decoding mimics the encoding process of Fig. 18. In step 1914, the inverse DWT is applied to the decoded selected region. Processing terminates in step 1916.
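Because the decoding of step 1912 mimics the encoding process, a decoder can follow the same decision structure while reading tokens instead of writing them. The following sketch is illustrative only and rests on assumptions not fixed by this section: the coded stream is a sequence of single bits; a token of 0 means the region is insignificant at the current bitplane and a token of 1 means it is partitioned into 4 square subregions visited in row-major order; and a single triplet is stored as the magnitude bits of each of its three coefficients from the current bitplane down to minBitNumber, followed by a sign bit when the magnitude is non-zero. The names `decode_region` and `_decode` are hypothetical.

```python
from typing import Iterator, List


def decode_region(bits: Iterator[int], size: int, max_bit: int,
                  min_bit: int) -> List[List[List[int]]]:
    """Decode a size x size region of triplets, starting at maxBitNumber."""
    region = [[[0, 0, 0] for _ in range(size)] for _ in range(size)]
    _decode(bits, region, 0, 0, size, max_bit, min_bit)
    return region


def _decode(bits: Iterator[int], region: List[List[List[int]]],
            r0: int, c0: int, size: int, current_bit: int, min_bit: int) -> None:
    while current_bit >= min_bit:
        if size == 1:
            # Single triplet (ASSUMED format): magnitude bits, optional sign bit.
            for k in range(3):
                value = 0
                for b in range(current_bit, min_bit - 1, -1):
                    value |= next(bits) << b
                if value and next(bits):   # sign bit is read only if non-zero
                    value = -value
                region[r0][c0][k] = value
            return
        if next(bits):
            # Token 1: region is significant, so decode its 4 subregions
            # with the same bit-number parameters.
            half = size // 2
            for dr in (0, half):
                for dc in (0, half):
                    _decode(bits, region, r0 + dr, c0 + dc, half,
                            current_bit, min_bit)
            return
        # Token 0: region is insignificant here, move to the next lower bitplane.
        current_bit -= 1
    # Below minBitNumber: the coefficients of this region stay zero.
```

When minBitNumber is greater than zero, the bitplanes below it are never transmitted, so the decoded coefficients are approximations of the originals with those low-order bits set to zero.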
2.1 Preferred Embodiment of Apparatus(s)
The encoding and/or decoding processes are preferably practiced using a conventional general-purpose computer, such as the one shown in Fig. 8, wherein the processes of Figs. 13 to 19 may be implemented as software executing on the computer.
In particular, the steps of the encoding and/or decoding methods are effected by instructions in the software that are carried out by the computer. The software may be divided into two separate parts; one part for carrying out the encoding and/or decoding methods; and another part to manage the user interface between the latter and the user.
The software may be stored in a computer readable medium, including the storage devices described below, for example. The software is loaded into the computer from the computer readable medium, and then executed by the computer. A computer readable medium having such software or computer program recorded on it is a computer program product. The use of the computer program product in the computer preferably effects an advantageous apparatus for encoding and/or decoding representations of digital images in accordance with the embodiments of the invention.
The computer system 800 includes the computer 802, a video display 816, and input devices 818, 820. In addition, the computer system 800 can have any of a number of other output devices including line printers, laser printers, plotters, and other reproduction devices connected to the computer 802. The computer system 800 can be connected to one or more other computers via a communication interface 808c using an appropriate communication channel 830 such as a modem communications path, a computer network, or the like. The computer network may include a local area network (LAN), a wide area network (WAN), an Intranet, and/or the Internet.
The computer 802 itself includes a central processing unit(s) (simply referred to as a processor hereinafter) 804, a memory 806 which may include random access memory (RAM) and read-only memory (ROM), input/output (IO) interfaces 808a, 808b and 808c, a video interface 810, and one or more storage devices generally represented by a block 812 in Fig. 8. The storage device(s) 812 can include one or more of the following: a floppy disc, a hard disc drive, a magneto-optical disc drive, CD-ROM, magnetic tape or any other of a number of non-volatile storage devices well known to those skilled in the art. Each of the components 804 to 812 is typically connected to one or more of the other devices via a bus 814 that in turn can include data, address, and control buses.
The video interface 810 is connected to the video display 816 and provides video signals from the computer 802 for display on the video display 816. User input to operate the computer 802 can be provided by one or more input devices 808b. For example, an operator can use the keyboard 818 and/or a pointing device such as the mouse 820 to provide input to the computer 802.
The system 800 is simply provided for illustrative purposes and other configurations can be employed without departing from the scope and spirit of the invention. Exemplary computers on which the embodiment can be practiced include IBM-PC/ATs or compatibles, one of the Macintosh (TM) family of PCs, Sun Sparcstation or the like. The foregoing is merely exemplary of the types of computers with which the embodiments of the invention may be practiced. Typically, the processes of the embodiments, described hereinafter, are resident as software or a program recorded on a hard disk drive (generally depicted as block 812 in Fig. 8) as the computer readable medium, and read and controlled using the processor 804.
Intermediate storage of the program and pixel data and any data fetched from the network may be accomplished using the semiconductor memory 806, possibly in concert with the hard disk drive 812.
In some instances, the program may be supplied to the user encoded on a CD-ROM or a floppy disk (both generally depicted by block 812), or alternatively could be read by the user from the network via a modem device connected to the computer, for example. Still further, the software can also be loaded into the computer system 800 from other computer readable media including magnetic tape, a ROM or integrated circuit, a magneto-optical disk, a radio or infra-red transmission channel between the computer and another device, a computer readable card such as a PCMCIA card, and the Internet and Intranets including email transmissions and information recorded on websites and the like. The foregoing is merely exemplary of relevant computer readable media. Other computer readable media may be practiced without departing from the scope and spirit of the invention.
The method of coding and/or decoding may alternatively be implemented in dedicated hardware such as one or more integrated circuits performing the functions or sub functions of the encoding. Such dedicated hardware may include graphic processors, digital signal processors, or one or more microprocessors and associated memories.
The foregoing only describes a small number of embodiments of the present invention; however, modifications and/or changes can be made thereto by a person skilled in the art without departing from the scope and spirit of the invention. The present embodiments are, therefore, to be considered in all respects to be illustrative and not restrictive.

Claims (7)

  1. The claims defining the invention are as follows:
     1. A method of compressing digital data including the steps of: transforming said data into a series of sub-bands having low frequency components, high frequency components and a mixture of high and low frequency components each having a corresponding series of coefficients each of which are arranged in a predetermined spatial arrangement; combining the coefficients of said high frequency components and said mixture of high and low frequency components in a correlated manner into a single combined series of coefficients; and encoding said single combined series of coefficients utilizing a region hierarchical encoding method which encodes smaller and smaller regions of said combined series of coefficients.
     2. A method as claimed in claim 1, wherein said coefficients are stored as integers and said encoding further includes determining the maximum magnitude of the coefficients in a region and separately encoding the number of leading zeroes in said maximum magnitude for each coefficient of a region.
     3. A method as claimed in claim 1, wherein said region hierarchical encoding method includes dividing each region having coefficients exceeding a first predetermined threshold on a first level into a predetermined number of regions on lower levels and iteratively applying said division to regions on lower levels having coefficients exceeding a second predetermined threshold.
     4. A method as claimed in claim 3, wherein said dividing includes performing a quadtree subdivision of said series of coefficients.
     5. A method as claimed in claim 1, wherein said coefficients are stored as binary numbers.
     6. A method as claimed in claim 1, wherein said data includes image data.
     7. A method as claimed in claim 1, wherein said transform includes a wavelet transform of said data.
     8. A method as claimed in claim 1, wherein said correlated manner includes combining equivalently spatially located coefficients from each of said components.
     9. A method of decompressing digital data including the steps of: decoding previously encoded data to produce a single combined series of coefficients utilizing a region hierarchical encoding method; extracting coefficients of high frequency components and low frequency components and a mixture of high and low frequency components in a predefined order from said single combined series of coefficients; arranging said high frequency and low frequency components and a mixture of high and low frequency components in a series of sub-bands; and inverse transforming said series of sub-bands.
     10. An apparatus for compressing digital data including: transformation means for transforming said data into a series of sub-bands having low frequency components, high frequency components and a mixture of high and low frequency components each having a corresponding series of coefficients each of which are arranged in a predetermined spatial arrangement; combination means for combining the coefficients of said high frequency components and said mixture of high and low frequency components in a correlated manner into a single combined series of coefficients; and encoder means for encoding said single combined series of coefficients utilizing a region hierarchical encoding method which encodes smaller and smaller regions of said combined series of coefficients.
     11. An apparatus as claimed in claim 10, wherein said coefficients are stored as integers and said encoder means further includes determination means for determining the maximum magnitude of the coefficients in a region and separately encoding the number of leading zeroes in said maximum magnitude for each coefficient of a region.
     12. An apparatus as claimed in claim 10 wherein said region hierarchical encoding method includes dividing each region having coefficients exceeding a first predetermined threshold on a first level into a predetermined number of regions on lower levels and iteratively applying said division to regions on lower levels having coefficients exceeding a second predetermined threshold.
     13. An apparatus as claimed in claim 12 wherein said dividing includes performing a quadtree subdivision of said series of coefficients.
     14. An apparatus as claimed in claim 10, wherein said coefficients are stored as binary numbers.
     15. An apparatus as claimed in claim 10, wherein said data includes image data.
     16. An apparatus as claimed in claim 10, wherein said transform includes a wavelet transform of said data.
     17. An apparatus as claimed in claim 10, wherein said correlated manner includes combining equivalently spatially located coefficients from each of said components.
     18. An apparatus of decompressing digital data including: decoder means for decoding previously encoded data to produce a single combined series of coefficients utilizing a region hierarchical encoding method; extraction means for extracting coefficients of high frequency components and low frequency components and a mixture of high and low frequency components in a predefined order from said single combined series of coefficients; arrangement means for arranging said high frequency and low frequency components and a mixture of high and low frequency components in a series of sub-bands; and transformation means for inverse transforming said series of sub-bands.
     19. A computer program product including a computer readable medium having recorded thereon a computer program for compressing digital data, the computer program product including: transformation means for transforming said data into a series of sub-bands having low frequency components, high frequency components and a mixture of high and low frequency components each having a corresponding series of coefficients each of which are arranged in a predetermined spatial arrangement; combination means for combining the coefficients of said high frequency components and said mixture of high and low frequency components in a correlated manner into a single combined series of coefficients; and encoder means for encoding said single combined series of coefficients utilizing a region hierarchical encoding method which encodes smaller and smaller regions of said combined series of coefficients.
     20. A computer program product as claimed in claim 19, wherein said coefficients are stored as integers and said encoder means further includes determination means for determining the maximum magnitude of the coefficients in a region and separately encoding the number of leading zeroes in said maximum magnitude for each coefficient of a region.
     21. A computer program product as claimed in claim 19 wherein said region hierarchical encoding method includes dividing each region having coefficients exceeding a first predetermined threshold on a first level into a predetermined number of regions on lower levels and iteratively applying said division to regions on lower levels having coefficients exceeding a second predetermined threshold.
     22. A computer program product as claimed in claim 21 wherein said dividing includes performing a quadtree subdivision of said series of coefficients.
     23. A computer program product as claimed in claim 19, wherein said coefficients are stored as binary numbers.
     24. A computer program product as claimed in claim 19, wherein said data includes image data.
     25. A computer program product as claimed in claim 19, wherein said transform includes a wavelet transform of said data.
     26. A computer program product as claimed in claim 19, wherein said correlated manner includes combining equivalently spatially located coefficients from each of said components.
     27. A computer program product including a computer readable medium having recorded thereon a computer program for decompressing digital data, the computer program product including: decoder means for decoding previously encoded data to produce a single combined series of coefficients utilizing a region hierarchical encoding method; extraction means for extracting coefficients of high frequency components and low frequency components and a mixture of high and low frequency components in a predefined order from said single combined series of coefficients; arrangement means for arranging said high frequency and low frequency components and a mixture of high and low frequency components in a series of sub-bands; and transformation means for inverse transforming said series of sub-bands.
  2. 28. A method of compressing digital data, the method substantially as described herein with reference to Figs. 13 to 18 of the accompanying drawings.
  3. 29. A method of decompressing digital data, the method substantially as described herein with reference to Fig. 19 of the accompanying drawings.
  4. 30. An apparatus for compressing digital data, the method substantially as described herein with reference to Fig. 8 and Figs. 13 to 18 of the accompanying drawings.
  5. 31. An apparatus for decompressing digital data, the method substantially as described herein with reference to Fig. 8 and Fig. 19 of the accompanying drawings.
  6. 32. A computer program product including a computer readable medium having recorded thereon a computer program for compressing digital data, the computer program product substantially as described with reference to Fig. 8 and Figs. 13 to 18 of the accompanying drawings.
  7. 33. A computer program product including a computer readable medium having recorded thereon a computer program for decompressing digital data, the computer program product substantially as described with reference to Figs. 8 and 19 of the accompanying drawings.
     DATED this twenty-eighth Day of March 2000
     Canon Kabushiki Kaisha
     Canon Information Systems Research Australia Pty Ltd
     Patent Attorneys for the Applicants
     SPRUSON & FERGUSON
AU87096/98A 1997-09-29 1998-09-28 A method of digital image compression Ceased AU725719B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU87096/98A AU725719B2 (en) 1997-09-29 1998-09-28 A method of digital image compression

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
AUPO9515 1997-09-29
AUPO9515A AUPO951597A0 (en) 1997-09-29 1997-09-29 A method for digital data compression
AU87096/98A AU725719B2 (en) 1997-09-29 1998-09-28 A method of digital image compression

Publications (2)

Publication Number Publication Date
AU8709698A AU8709698A (en) 1999-04-15
AU725719B2 true AU725719B2 (en) 2000-10-19

Family

ID=25640792

Family Applications (1)

Application Number Title Priority Date Filing Date
AU87096/98A Ceased AU725719B2 (en) 1997-09-29 1998-09-28 A method of digital image compression

Country Status (1)

Country Link
AU (1) AU725719B2 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AUPP918699A0 (en) 1999-03-12 1999-04-15 Canon Kabushiki Kaisha Encoding method and appartus
AU746343B2 (en) * 1999-03-12 2002-04-18 Canon Kabushiki Kaisha Encoding method and apparatus

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5214502A (en) * 1991-01-11 1993-05-25 Sony Broadcast & Communications Limited Compression of video signals
US5223926A (en) * 1991-01-11 1993-06-29 Sony Broadcast & Communications Limited Compression of video signals
US5517581A (en) * 1989-05-04 1996-05-14 At&T Corp. Perceptually-adapted image coding system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5517581A (en) * 1989-05-04 1996-05-14 At&T Corp. Perceptually-adapted image coding system
US5214502A (en) * 1991-01-11 1993-05-25 Sony Broadcast & Communications Limited Compression of video signals
US5223926A (en) * 1991-01-11 1993-06-29 Sony Broadcast & Communications Limited Compression of video signals

Also Published As

Publication number Publication date
AU8709698A (en) 1999-04-15

Similar Documents

Publication Publication Date Title
EP0971544B1 (en) An image coding method and apparatus for localised decoding at multiple resolutions
US6266414B1 (en) Method for digital data compression
EP0905979B1 (en) A method for data compression
EP0855838B1 (en) A method for digital image compression using Discrete Wavelet Transform DWT
US6266450B1 (en) Encoding method and apparatus
US6259819B1 (en) Efficient method of image compression comprising a low resolution image in the bit stream
US6389074B1 (en) Method and apparatus for digital data compression
Boujelbene et al. A comparative study of recent improvements in wavelet-based image coding schemes
US6606416B1 (en) Encoding method and apparatus for representing a digital image
EP0920213A2 (en) Method and apparatus for decoding transform coefficients
AU725719B2 (en) A method of digital image compression
AU708489B2 (en) A method and apparatus for digital data compression
AU719749B2 (en) A method for digital data compression
AU728938B2 (en) A method for data compression
AU727434B2 (en) Method and apparatus for decoding
AU727869B2 (en) An efficient method of image compression comprising a low resolution image in the bit stream
AU736469B2 (en) An image coding method and apparatus for localized decoding at multiple resolutions
AU714202B2 (en) A method for digital image compression
AU727894B2 (en) An encoding method and apparatus
AU746400B2 (en) A method and apparatus for decoding a coded representation of a digital image
AU740066B2 (en) Method and apparatus for hierarchical encoding or decoding video images

Legal Events

Date Code Title Description
PC1 Assignment before grant (sect. 113)

Owner name: CANON KABUSHIKI KAISHA

Free format text: THE FORMER OWNER WAS: CANON KABUSHIKI KAISHA, CANON INFORMATION SYSTEMS RESEARCH AUSTRALIA PTY LTD

FGA Letters patent sealed or granted (standard patent)