AU749077B2

AU749077B2 - Digital image coding

Info

Publication number: AU749077B2
Application number: AU45220/00A
Authority: AU
Inventors: James Philip Andrew; Andrew Dorrell
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1999-07-12
Filing date: 2000-07-11
Publication date: 2002-06-20
Anticipated expiration: 2020-07-11
Also published as: AU4522000A

Description

S&F Ref: 512283

AUSTRALIA

PATENTS ACT 1990 COMPLETE SPECIFICATION FOR A STANDARD PATENT

ORIGINAL

Name and Address of Applicant: Canon Kabushiki Kaisha, 30-2, Shimomaruko 3-chome, Ohta-ku, Tokyo 146, Japan
Actual Inventor(s): James Philip Andrew, Andrew Dorrell
Address for Service: Spruson & Ferguson, St Martins Tower, 31 Market Street, Sydney NSW 2000
Invention Title: Digital Image Coding

ASSOCIATED PROVISIONAL APPLICATION DETAILS
[33] Country: AU
[31] Applic. No(s): PQ1563
[32] Application Date: 12 Jul 1999

The following statement is a full description of this invention, including the best method of performing it known to me/us:

DIGITAL IMAGE CODING

Technical Field of the Invention

The invention relates to data coding and in particular to digital image coding.

Background

The JPEG image compression standard is a successful standard in widespread use. However, since the design of JPEG, the use of digital image data has increased significantly. The original JPEG standard has been unable to meet all the needs of evolving applications and this has led to the development of many new image compression techniques.

Some of the motivating factors behind the development of new compression techniques have been the need to access different distinct image resolutions and quality levels as well as the need to access (and possibly decode) sub-regions of the entire encoded image. In addition, increased use of image data in slow network environments has increased the importance of progressive codes that can be decoded and refined while data is still being received. Still, an image code that provides a high degree of flexibility in terms of accessing different quality and resolution levels while providing spatial random access and a highly efficient codec structure has remained an open issue.

In terms of providing flexible data access, JPEG performs poorly. In fact, in order to achieve different kinds of access, JPEG must be used in independent modes of operation. Each one of these modes has its own distinct way of constructing the compressed bitstream. For example, if random access to different image resolutions is required then a completely separate JPEG mode is required. The same is true of the resolution and quality progressive modes of JPEG. To make matters worse, the multiresolution mode for JPEG introduces redundancies which compromise compression.

In the baseline JPEG mode, 8x8 blocks of image data are independently coded in (block) raster order. This mode can potentially provide good spatial random access, because of the small block size. However, the cost of encoding offsets to each of the encoded blocks is unacceptably high and this information is not included in the bit stream.

As a result, in order to decode an arbitrary spatial region of a baseline JPEG image it is necessary to substantially decode all of the blocks up to and including the last (bottom right) block of the desired region.

Wavelet coders such as EZW and SPIHT attempt to address the problem of multi-resolution access through the use of a global image transform structure which is explicitly multi-resolution. Generally such transforms are applied to the entire image. A practical problem with this approach is that the recursive nature of the transform results in the need to write out intermediate results to external memory. This implies an increase in the memory bandwidth requirements and as a result increases the cost and limits the throughput of the codec. By contrast, JPEG achieves a minimal memory bandwidth requirement of one. A memory bandwidth of one means that the only accesses to memory required during the coding process are for the purpose of reading in input values and writing out the coded data stream. Finally, due to the global nature of the transforms used in these wavelet coders, buffering of the whole image may be required just to perform the forward and inverse transform stages.

More recent wavelet transform methods have begun to address these issues but are still unable to achieve unity bandwidth and remain decoupled from the latter stages of compression processing (i.e. quantisation and entropy coding), thus adding again to the overall bandwidth requirements.

The Flashpix image format represents an attempt to address the problem of combining multi-resolution access with good spatial localisation. It achieves this by storing, in a single file, multiple copies of an image at different resolutions. Images are divided into tiles prior to coding (possibly using JPEG) thereby facilitating spatial random access. The biggest problem with the Flashpix approach is that it introduces considerable extra redundancy into the image code, which compromises compression performance and complicates management of the data. Moreover, Flashpix is not a bitstream format, rather it is a container and cannot fulfil many traditional applications of image codes. For example, it provides no explicit mechanism for progressive transmission of the stored resolutions and has no clear mechanism by which rate, quality and resolution tradeoffs can be achieved with consistency. As it uses JPEG as its compression backend it is also inherently limited in its ability to provide usable results at high compression rates.

SUMMARY OF THE INVENTION

It is an object of the present invention to substantially overcome, or at least ameliorate, one or more disadvantages of existing arrangements.

According to one aspect of the invention, there is provided a method of representing a digital image as a tiled multi-resolution representation, wherein said digital image comprises a number of said tiles of pixel coefficients, and each said tile is of a predetermined size and said size of each said tile is such that all the pixel coefficients of each said tile is able to be held in a local memory, the method performing the following steps for each tile in turn: transforming a current said tile of pixel coefficients to derive a current tile of transform coefficients; and coding said current tile of transform coefficients to provide a multi-resolution representation of said current tile such that a decoder is capable of decoding a resolution level of a said current tile substantially independently of any higher resolution level of said current tile and independently of any other tile; wherein during said coding substantially all data associated with said current tile is stored in said local memory.

According to another aspect of the invention, there is provided a method of decoding a coded representation of a digital image, wherein said coded representation comprises a number of coded tiles of transform coefficients, and each said tile is able to be held in a local memory, the method comprises the steps of: selecting one or more tiles and selecting a resolution level thereof; processing said coded representation, wherein said processing step performs the following sub-steps for each selected coded tile in turn: decoding in turn each resolution level up to and including said selected resolution level of a current coded tile of transform coefficients substantially independently of any higher resolution level of said current tile and independently of any other tile to provide a current tile of transform coefficients; inverse transforming said current tile of transform coefficients to derive a current tile of pixel coefficients; wherein during said decoding substantially all data associated with said current tile is stored in said local memory.

According to another aspect of the invention, there is provided apparatus for representing a digital image as a tiled multi-resolution representation, wherein said digital image comprises a number of said tiles of pixel coefficients, and each said tile is of a predetermined size and said size of each said tile is such that all the pixel coefficients of each said tile is able to be held in a local memory, the apparatus comprising: means for transforming a current said tile of pixel coefficients to derive a current tile of transform coefficients; and means for coding said current tile of transform coefficients to provide a multi-resolution representation of said current tile such that a decoder is capable of decoding a resolution level of a said current tile substantially independently of any higher resolution level of said current tile and independently of any other tile; wherein during said coding substantially all data associated with said current tile is stored in said local memory.

According to another aspect of the invention, there is provided apparatus for decoding a coded representation of a digital image, wherein said coded representation comprises a number of coded tiles of transform coefficients, and each said tile is able to be held in a local memory, the apparatus comprising: means for selecting one or more tiles and selecting a resolution level thereof; means for processing said coded representation, wherein said processing means performs the following steps for each selected coded tile in turn: decoding in turn each resolution level up to and including said selected resolution level of a current coded tile of transform coefficients substantially independently of any higher resolution level of said current tile and independently of any other tile to provide a current tile of transform coefficients; inverse transforming said current tile of transform coefficients to derive a current tile of pixel coefficients; wherein during said decoding substantially all data associated with said current tile is stored in said local memory.

According to another aspect of the invention, there is provided a computer program for representing a digital image as a tiled multi-resolution representation, wherein said digital image comprises a number of said tiles of pixel coefficients, and each said tile is of a predetermined size and said size of each said tile is such that all the pixel coefficients of each said tile is able to be held in a local memory, the computer program comprising: code for transforming a current said tile of pixel coefficients to derive a current tile of transform coefficients; and code for coding said current tile of transform coefficients to provide a multi-resolution representation of said current tile such that a decoder is capable of decoding a resolution level of a said current tile substantially independently of any higher resolution level of said current tile and independently of any other tile; wherein during said coding substantially all data associated with said current tile is stored in said local memory.

According to another aspect of the invention, there is provided a computer program for decoding a coded representation of a digital image, wherein said coded representation comprises a number of coded tiles of transform coefficients, and each said tile is able to be held in a local memory, the computer program comprising: code for selecting one or more tiles and selecting a resolution level thereof; code for processing said coded representation, wherein said processing code performs the following steps for each selected coded tile in turn: decoding in turn each resolution level up to and including said selected resolution level of a current coded tile of transform coefficients substantially independently of any higher resolution level of said current tile and independently of any other tile to provide a current tile of transform coefficients; inverse transforming said current tile of transform coefficients to derive a current tile of pixel coefficients; wherein during said decoding substantially all data associated with said current tile is stored in said local memory.

Brief Description of the Drawings

A number of preferred embodiments of the present invention will now be described with reference to the drawings, in which:

Fig. 1 is a schematic block diagram of a general purpose computer with which the preferred embodiments of the present invention can be practiced;
Fig. 2 is a flow diagram illustrating an encoding method for compressing an image in accordance with a preferred embodiment;
Fig. 3 is a flow diagram illustrating a decoding method for decompressing a compressed image in accordance with a preferred embodiment;
Figs. 4A and 4B depict two possible resolution groupings for DCT coefficient blocks used in the partitioning step 230 of the first preferred embodiment of Fig. 2;
Fig. 5 depicts the groupings for DCT coefficients into horizontal, vertical and diagonal groups used in the partitioning step 240 of the first preferred embodiment of Fig. 2;
Fig. 6 is a schematic block diagram of a lifting lattice for use in the transforming step 220 of the second preferred embodiment of Fig. 2;
Fig. 7 is an overview of the preferred quadtree sub-bit-plane encoding process for use in step 250 of Fig. 2;
Fig. 8 depicts a quadtree partition;
Fig. 9 shows the method for coding the list of insignificant regions (LIR) at bit-plane n used in Fig. 7;
Fig. 10 shows the method for coding the list of insignificant coefficients (LIC) at bit-plane n used in Fig. 7;
Fig. 11 shows the method for encoding the list of significant coefficients (LSC) at bit-plane n used in Fig. 7; and
Fig. 12 shows the format of the bitstream of a digital image encoded in accordance with the method of Fig. 2.

Detailed Description of Embodiments of the Invention

Where reference is made in any one or more of the accompanying drawings to steps and/or features, which have the same reference numerals, those steps and/or features have for the purposes of this description the same function(s) or operation(s), unless the contrary intention appears.

Preferred Apparatus

The embodiments of the invention can preferably be practiced using a general-purpose computer, such as the one shown in Fig. 1, wherein the processes of Figs. 2 to 13D may be implemented as software executing on the computer. In particular, the steps of the coding, decoding and/or transcoding methods are effected by instructions in the software that are carried out by the computer. The software may be stored in a computer readable medium, including the storage devices described below, for example. The software is loaded into the computer from the computer readable medium, and then executed by the computer. A computer readable medium having such software or computer program recorded on it is a computer program product. The use of the computer program product in the computer preferably effects an advantageous apparatus for encoding digital images, decoding or transcoding coded representations of digital images in accordance with the embodiments of the invention.

The computer system 100 consists of the computer 101, a video display 114, and input devices 102, 103. In addition, the computer system 100 can have any of a number of other output devices 115 including line printers, laser printers, plotters, and other reproduction devices connected to the computer 101. The computer system 100 can be connected to one or more other computers using an appropriate communication channel via a modem 116, a computer network 120, or the like. The computer network may include a local area network (LAN), a wide area network (WAN), an Intranet, and/or the Internet.

The computer 101 itself consists of a central processing unit(s) (simply referred to as a processor hereinafter) 105, a memory 106 which may include random access memory (RAM) and read-only memory (ROM), an input/output (IO) interface 108, a video interface 107, and one or more storage devices generally represented by a block 109 in Fig. 1. The storage device(s) 109 can consist of one or more of the following: a floppy disc 111, a hard disc drive 110, a magneto-optical disc drive, CD-ROM, magnetic tape or any other of a number of non-volatile storage devices well known to those skilled in the art. Each of the components 105 to 113 is typically connected to one or more of the other devices via a bus 104 that in turn can consist of data, address, and control buses.

The video interface 107 is connected to the video display 114 and provides video signals from the computer 101 for display on the video display 114. User input to operate the computer 101 can be provided by one or more input devices. For example, an operator can use the keyboard 102 and/or a pointing device such as the mouse 103 to provide input to the computer 101.

The system 100 is simply provided for illustrative purposes and other configurations can be employed without departing from the scope and spirit of the invention. Exemplary computers on which the embodiment can be practiced include IBM-PC/ATs or compatibles, one of the Macintosh (TM) family of PCs, Sun Sparcstation or the like. The foregoing are merely exemplary of the types of computers with which the embodiments of the invention may be practiced. Typically, the processes of the embodiments, described hereinafter, are resident as software or a program recorded on a hard disk drive (generally depicted as block 110 in Fig. 1) as the computer readable medium, and read and controlled using the processor 105. Intermediate storage of the program and pixel data and any data fetched from the network may be accomplished using the semiconductor memory 106, possibly in concert with the hard disk drive 110.

In some instances, the program may be supplied to the user encoded on a CD-ROM or a floppy disk (both generally depicted by block 109), or alternatively could be read by the user from the network via a modem device connected to the computer, for example. Still further, the software can also be loaded into the computer system 100 from other computer readable medium including magnetic tape, a ROM or integrated circuit, a magneto-optical disk, a radio or infra-red transmission channel between the computer and another device, a computer readable card such as a PCMCIA card, and the Internet and Intranets including email transmissions and information recorded on websites and the like.

The foregoing are merely exemplary of relevant computer readable mediums. Other computer readable mediums may be practiced without departing from the scope and spirit of the invention.

The preferred embodiments of the coding method may alternatively be implemented in dedicated hardware such as one or more integrated circuits performing the functions or sub-functions of the encoding, decoding or trans-coding processes. Such dedicated hardware may include ASICs and associated on-chip memories.

2.0 Overview of Preferred Encoding Method

Fig. 2 shows a flow diagram of an encoding method for compressing a digital image in accordance with the preferred method. The preferred method commences at step 200 during which any necessary parameters are initialised. During the next step 210, the original digital image 201 is input and divided into a number of tiles.

The preferred encoding method may be based on either wavelet encoding or discrete cosine transform encoding. In the case where the preferred method utilises wavelet encoding, the tiles preferably overlap. This leads to high quality wavelet encoding. Furthermore, the overlap is preferably fixed in spatial extent. This results in greatly simplified spatial random access when compared to other wavelet encoding methods. Moreover, the fixed overlap when combined with the wavelet transform method described herein in the section entitled "2.5 Transforming Step 220 Reduced Overlap DWT Second Preferred Embodiment", introduces no noticeable boundary artifacts in the reconstructed image. Where the preferred method utilises DCT encoding, no tile overlap is used.

Preferably, a base tile size of 256 x 256 pixels is used to which any overlap is added. When overlap is present, it occurs only on the lower and right tile boundaries. Thus, for a tile size of (256+overlap) x (256+overlap) pixels, where overlap = 1, the top left hand tile is of size 256 x 256. Otherwise the tiles in the top row of tiles are of size 256 x 257, and tiles in the left-most column of tiles are of size 257 x 256. The remaining tiles are of size 257 x 257, except for the tiles that abut the bottom and right edge of the image, which are no larger than 257 x 257. This tile size is purely exemplary and it is possible to use different tile sizes without departing from the scope and spirit of the invention.

A number of factors influence the actual choice of tile size. Preferably, the entire tile data is held in local memory cache, or in the case of an ASIC, on-chip memory during processing thereby minimising the need to write out intermediate results to external memory. As the tiles are encoded independently, this tile size limitation is important in achieving optimal memory bandwidth. From another perspective, as there is a fixed coding overhead with each tile the largest practical tile size is desirable. In addition, a larger tile size will permit a greater number of resolution levels to be encoded. Finally, the need for spatial random access places a limit on the size of the tiles used as these represent the unit size for this purpose. Thus, selection of tile size is a process of finding the best compromise between memory constraints and spatial random access requirements on the one hand, and coding overhead and the number of resolution levels on the other.

In other words, the tile size should be sufficiently large that it offers good compression and a good range of resolutions, while being sufficiently small to have good random access properties and to be contained in working (local) memory. It has been found that a base tile size of 256x256 pixels is of sufficient size to meet these requirements. Generally, tile sizes between 64x64 pixels and 519x519 pixels have been found to be acceptable. During the dividing step 210, the tiles are stored individually and in sequence in working memory for further processing by the method.
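As an illustration of the dividing step 210, the following is a minimal sketch for the case with no tile overlap (the DCT case). The function name `divide_into_tiles` and the clipping of tiles at the right and bottom image edges are assumptions made for this example, not details taken from the specification.

```python
def divide_into_tiles(width, height, base=256):
    """Return (x, y, w, h) tile rectangles in raster order, without overlap.

    Assumption: tiles on the right and bottom image edges are simply
    clipped to the image bounds.
    """
    tiles = []
    for y in range(0, height, base):
        for x in range(0, width, base):
            w = min(base, width - x)
            h = min(base, height - y)
            tiles.append((x, y, w, h))
    return tiles


# A hypothetical 600 x 300 image gives a 3 x 2 grid of tiles.
tiles = divide_into_tiles(600, 300)
```

Because each rectangle depends only on the image dimensions and the base tile size, a decoder could locate any tile's spatial extent without decoding its neighbours.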

In the case where the digital image is made up of multi-component images, preferably each component tile corresponding to each tile is encoded independently. These encoded component tiles enter the encoded stream as a concatenation of the codes for each component. In the case where image components have different sizes, the tile size is adjusted accordingly so that the tiles from each component refer to an identical region of the source image.

After the dividing step 210, the preferred method proceeds to the transforming step 220. During the transforming step 220, a local spatial frequency transformation decorrelates in turn each tile of the image. This transformation is constrained to act entirely within the bounds of the tile (including its overlap) currently being coded.

Preferably, the transforming step is based on either a wavelet encoding or a DCT encoding process.

After the transforming step 220, the preferred method proceeds to a partitioning step 230 where the tiles are partitioned into resolution levels. In the next step 240, these levels are then further partitioned into subbands. Essentially these two partitioning steps result in each tile of transform coefficients being divided into subbands corresponding to:

- A single low pass subband
- A number of distinct resolution levels each consisting of a subband of vertical detail, a subband of horizontal detail and a subband of diagonal detail.

These steps 220, 230 and 240 are described in more detail below. In a first embodiment of the encoding method, the transforming step 220 utilises a block-based Discrete Cosine Transform (DCT) and the partitioning steps 230 and 240 are adapted for use with blocks of transformed DCT data. The first embodiment of steps 220, 230 and 240 is described in more detail in the sections entitled "2.1 Transforming Step 220 Block based DCT First Preferred Embodiment", "2.2 Partitioning Step 230 Partitioning blocks into resolution levels First Preferred Embodiment", and "2.3 Partitioning Step 240 Partitioning Levels into Subbands First Preferred Embodiment".

In a second embodiment of the encoding method, the transforming step 220 utilises a tile based Discrete Wavelet Transform (DWT) and the partitioning steps 230 and 240 are adapted for use with DWT transformed data. The second embodiment of steps 220, 230 and 240 is described in more detail in the sections entitled "2.5 Transforming Step 220 Reduced Overlap DWT Second Preferred Embodiment" and "2.6 Partitioning Steps 230 and 240 - Partitioning into resolution and subbands Second Preferred Embodiment".

After step 240 is completed, the method proceeds to an entropy encoding step 250. In this step 250, each subband is coded independently using an embedded coding process producing a code segment that is progressive in quality (precision). The lowest resolution level always consists of the low frequency subband, which may alternatively be referred to as the DC subband. This level is treated a little differently to the other (detail) levels in that coefficients in this level are DPCM encoded prior to subsequent entropy coding.
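The DPCM treatment of the DC subband can be sketched as follows. The text does not state which predictor is used, so this example assumes the simplest one, the previous coefficient in raster scan order with an initial prediction of zero. The function names `dpcm_encode` and `dpcm_decode` are hypothetical.

```python
def dpcm_encode(subband):
    """DPCM-encode a 2-D subband (list of rows of ints) in raster order.

    Assumption: the predictor is the previous coefficient in raster scan
    order, starting from zero.
    """
    residuals = []
    previous = 0
    for row in subband:
        for value in row:
            residuals.append(value - previous)
            previous = value
    return residuals


def dpcm_decode(residuals, width):
    """Invert dpcm_encode and reshape into rows of the given width."""
    values = []
    previous = 0
    for residual in residuals:
        previous += residual
        values.append(previous)
    return [values[i:i + width] for i in range(0, len(values), width)]


dc_subband = [[100, 102, 101], [99, 99, 104]]
residuals = dpcm_encode(dc_subband)   # [100, 2, -1, -2, 0, 5]
restored = dpcm_decode(residuals, width=3)
```

Neighbouring DC coefficients tend to be similar, so the residuals are small in magnitude and cheaper to entropy code than the raw values.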

Preferably, the entropy encoding 250 uses a sub-bit-plane embedded quadtree coding process. However, other block based entropy encoding processes may be used without departing from the spirit and scope of the invention. The preferred entropy encoding process 250 is described in more detail in the section herein entitled "2.7 Entropy Coding Step 250 Sub-bit-plane Embedded Quadtree Coding". This particular encoding process facilitates one-pass rate control that is (optionally) implemented by a truncating step 260 after completion of the entropy encoding 250. This truncating step is able to truncate the various code segments at various termination points so that the total image code is of a predetermined size. Moreover, the truncating step 260 can also incorporate an optimisation stage allowing the generation of a rate-distortion optimal image code.
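To make the idea of truncating embedded code segments to a predetermined total size concrete, the sketch below uses a simple round-robin rule: each segment is extended to its next termination point in turn while the byte budget allows. This rule, the function name `truncate_to_budget` and the example numbers are illustrative assumptions only. It is not the rate-distortion optimisation mentioned above.

```python
def truncate_to_budget(termination_points, budget):
    """Choose a truncation length for each embedded code segment.

    termination_points: for each segment, an increasing list of byte
    lengths at which that segment may be cut.
    Assumption: a plain round-robin extension rule, not an optimal one.
    """
    chosen = [0] * len(termination_points)      # bytes kept per segment
    next_point = [0] * len(termination_points)  # next candidate cut index
    total = 0
    progress = True
    while progress:
        progress = False
        for s, points in enumerate(termination_points):
            if next_point[s] < len(points):
                extra = points[next_point[s]] - chosen[s]
                if total + extra <= budget:
                    chosen[s] = points[next_point[s]]
                    next_point[s] += 1
                    total += extra
                    progress = True
    return chosen


# Three hypothetical segments and a 60-byte budget.
lengths = truncate_to_budget([[10, 25, 40], [5, 12], [8, 30]], budget=60)
```

Because each segment is embedded (progressive in quality), cutting it at any termination point still leaves a decodable, lower-precision version of the subband.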

After the truncating step 260, the preferred method proceeds to a number of compilation steps 270, 280 and 290, wherein the truncated entropy encoded subbands (viz. the truncated code segments) are arranged into a bitstream. For ease of explanation, the bitstream construction in steps 270, 280, and 290 is described with additional reference to Fig. 12. In the compilation step 270, the truncated code segments 1210 of the subbands of each level are concatenated together to form level code segments 1220. The order of concatenation is such that subbands containing predominantly vertical information appear first, followed by subbands containing predominantly horizontal information, followed by subbands with predominantly diagonal information. This ordering is related in general terms to the order of visual importance of the subbands and produces a bitstream more suitable for progressive decoding. In the next compilation step 280, the level code segments 1220 of each tile are concatenated together to form a tile code segment 1230. The order of concatenation is in increasing order of resolution (spatial frequency). Steps 270 and 280 are described in more detail in the section herein entitled "2.8 Steps 270 and 280 - Construction of the compressed level and tile code segments". During the next step 290, the tile code segments 1230 are compiled, preferably in raster scan order, into a full image code 1240, which is described in more detail in the section herein entitled "2.9 Step 290 Construction of image code and global header".

After step 290, the method outputs the constructed bitstream and terminates at step 300. The constructed bitstream permits spatial random access on a tile basis as tiles are encoded independently and may be decoded independently. How this is achieved for blocks with overlap is described herein in the section entitled "Transform stage for the second embodiment-reduced overlap DWT". Information in the global header is sufficient to allow determination of the spatial location of each encoded block without having to decode individual blocks. Headers (either global or tile) also store offsets to the tile codes in the encoded bitstream.
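The compilation steps 270 to 290 and the use of stored tile offsets can be sketched as below. The subband keys ("DC", "V", "H", "D"), the function names and the in-memory offset list are assumptions for illustration. The actual header layout is described elsewhere in the specification and is not reproduced here.

```python
def build_tile_code(levels):
    """Concatenate one tile's code segments (steps 270 and 280).

    levels: list ordered from lowest to highest resolution; each entry is
    a dict mapping a hypothetical subband key to its truncated code
    segment (bytes). Within a level the order is vertical, horizontal,
    diagonal; the lowest level holds only the DC subband.
    """
    order = ("DC", "V", "H", "D")
    tile_code = b""
    for level in levels:
        for key in order:
            tile_code += level.get(key, b"")
    return tile_code


def build_image_code(tile_codes):
    """Concatenate tile codes in raster order (step 290).

    Returns (offsets, body), where offsets[i] is the byte position of
    tile i within body.
    """
    offsets = []
    body = b""
    for code in tile_codes:
        offsets.append(len(body))
        body += code
    return offsets, body


tile0 = build_tile_code([{"DC": b"\x01"},
                         {"V": b"\x02\x03", "H": b"\x04", "D": b"\x05"}])
tile1 = build_tile_code([{"DC": b"\x0a\x0b"},
                         {"D": b"\x0e", "H": b"\x0d", "V": b"\x0c"}])
offsets, body = build_image_code([tile0, tile1])

# Spatial random access: jump straight to the second tile's code.
second_tile = body[offsets[1]:]
```

With the offsets available, a decoder can seek directly to a chosen tile and stop reading after the resolution level it needs, since lower levels precede higher ones within each tile code.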

2.1 Transforming Step 220 Block Based DCT First Preferred Embodiment

During the transforming step 220 of the first preferred embodiment, each tile is divided into 8x8 blocks of pixels. Preferably, a tile size is selected that can be divided exactly into 8x8 blocks. For example, a tile size of 256 x 256 pixels is suitable. Furthermore, preferably, no tile overlap is used. During step 220, each of these blocks is then independently fed to the input of an 8x8 two-dimensional discrete cosine transform (DCT) transformer. The transformer generates 8x8 blocks of DCT coefficients, in the same manner as in the JPEG baseline standard. Subsequently a number of distinct methods for performing the lossy part of the compression are possible. The DCT coefficients may be quantised according to a quantisation table much in the same way as they are in the JPEG standard prior to subsequent entropy encoding (with an embedded quadtree encoder). Alternatively, unquantised DCT coefficients may be entropy encoded directly. The main benefit of the first approach is that, where the quadtree entropy coding is lossless, lossless interoperability with the existing JPEG standard is possible. That is, DCT coefficients may be transcoded directly to and from the standard JPEG bitstreams.

This transcoding is described in more detail in the section herein entitled "Transcoding with JPEG". In the second case, where DCT coefficients are not quantised, all loss is produced by a truncation of the bitstream produced by the embedded quadtree encoder.

Clearly it is also possible to use some combination where the DCT coefficients are quantised prior to lossy entropy coding and this may in fact be the most practical implementation for hardware systems. In this case, quantisation need not be performed based on a JPEG style quantisation table. Preferably, during the DCT transform step, each DCT coefficient is quantised according to the JPEG baseline standard with a high quality factor. Other alternative embodiments of quantisation may be practised without departing from the scope and spirit of the invention.
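The block-based transform of this section can be illustrated with a small pure-Python sketch. It splits a tile into 8x8 blocks and applies an orthonormal two-dimensional DCT-II, which has the same scaling as the JPEG forward DCT. Level shifting and quantisation are omitted, and the function names are hypothetical.

```python
import math

N = 8  # block size


def dct_1d(v):
    """Orthonormal 1-D DCT-II of a length-8 sequence."""
    out = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        acc = sum(v[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                  for n in range(N))
        out.append(scale * acc)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform each row, then each column.

    Returns coeffs[v][u] with v the vertical and u the horizontal
    frequency index.
    """
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d([rows[y][u] for y in range(N)]) for u in range(N)]
    return [[cols[u][v] for u in range(N)] for v in range(N)]


def tile_to_blocks(tile):
    """Split a tile (list of pixel rows) into a grid of 8x8 blocks."""
    h, w = len(tile), len(tile[0])
    return [[[row[x:x + N] for row in tile[y:y + N]]
             for x in range(0, w, N)]
            for y in range(0, h, N)]


# A flat 16x16 tile of value 10: all energy lands in the DC coefficient.
tile = [[10] * 16 for _ in range(16)]
blocks = tile_to_blocks(tile)
coeffs = dct_2d(blocks[0][0])
```

For a constant block of value c, the DC coefficient is 8c and every AC coefficient is zero up to floating-point rounding.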

2.2 Partitioning Step 230 Partitioning blocks into resolution levels First Preferred Embodiment

During step 230, the quantised DCT coefficients of the 8x8 blocks of a tile are partitioned into resolution levels. This step 230 is applied to each tile of the image. In practice there are a number of different ways of achieving this and two possible implementations are depicted in Figs. 4A and 4B.

Fig. 4A shows one arrangement of the 8x8 blocks of a tile in resolution levels. In this arrangement, M different resolution levels are obtained, where M is the size of the DCT blocks used. This arrangement of DCT coefficients is achieved by grouping together all DCT coefficients of corresponding frequency. For example, the DC components of all the 8x8 DCT blocks of a tile are arranged in the level 0 resolution. Similarly, the DC, AC(0,1), AC(1,0), AC(1,1) coefficients form the level 1 resolution. In this arrangement, a resolution level n encompasses n x n groups of DCT coefficients. Preferably, the DCT coefficients within each group are arranged in raster scan order of the 8x8 blocks.

Fig. 4B shows another preferred arrangement of the 8x8 blocks of a tile into resolution levels. In this arrangement, a dyadic grouping, reminiscent of the wavelet transform levels, is used. Similar to the previous arrangement, all DCT coefficients of corresponding frequency are grouped together. For example, the DC components of all the 8x8 DCT blocks are arranged in the level 0 resolution. In this arrangement, a resolution level n encompasses $2^n \times 2^n$ groups of coefficients.

These arrangements are useful for reconstruction (decoding) at reduced or increased resolutions. The decoding (reconstruction) of these arrangements is described in more detail in the sections "3.2.0 Reconstruction at reduced resolution - First Preferred Embodiment" and "3.2.1 Reconstruction at increased resolution - First Preferred Embodiment".
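The two groupings can be illustrated with a short sketch that maps the position (u, v) of a coefficient inside a block to a resolution level. The function names are hypothetical, and the sketch assumes zero-based levels in which level 0 holds only the DC coefficient and levels 0 through n together span (n+1) x (n+1) coefficients for Fig. 4A, or $2^n \times 2^n$ coefficients for Fig. 4B; this reading is chosen to match the level 0 and level 1 examples given above and is not taken from the figures themselves.

```python
def level_fig_4a(u, v):
    """Level of coefficient (u, v) in the M-level grouping (assumed zero-based)."""
    return max(u, v)


def level_fig_4b(u, v):
    """Level of coefficient (u, v) in the dyadic grouping (assumed zero-based)."""
    # bit_length() is 0 for DC, 1 for index 1, 2 for indices 2..3, 3 for 4..7.
    return max(u, v).bit_length()


def coefficients_in_level(level_fn, level, block_size=8):
    """Positions added by a level, listed in raster order within the block."""
    return [(u, v)
            for u in range(block_size)
            for v in range(block_size)
            if level_fn(u, v) == level]


if __name__ == "__main__":
    print(coefficients_in_level(level_fig_4a, 1))
    print(len(coefficients_in_level(level_fig_4b, 3)))
```

Under these assumptions the dyadic grouping of an 8x8 block has four levels (0 to 3), while the grouping of Fig. 4A has eight.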

2.3 Partitioning Step 240 - Partitioning Levels into Subbands - First Preferred Embodiment

During step 240, the DCT coefficients within each level are further classified into three orientation categories: horizontal, vertical and diagonal.

Fig. 5 depicts these groupings for the preferred level partition shown in Fig. 4B.

The level resolutions each comprise horizontal, vertical and diagonal subbands.

The preferred method produces a subband of coefficients by extracting corresponding coefficients in raster scan order from all the transform blocks in a tile.

Specifically, having defined orientation groupings on a block level, the first preferred embodiment generates these said subbands by extracting corresponding coefficient groups, in raster scan order, from all of the blocks within the tile. For example, the DC subband is the image formed by extracting the DC coefficients from each transformed block in raster scan order. Note that where more than one coefficient is extracted from each block interleaving schemes may be used in the generation of the subband and do not represent a departure from the spirit or intent of the invention.
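The extraction described above can be sketched as follows for the case of one coefficient per block. The data layout (a tile held as a grid of 8x8 blocks, each a list of lists) and the function name are assumptions made for the illustration only.

```python
def extract_subband(blocks, u, v):
    """Collect coefficient (u, v) from every block of a tile in raster scan order.

    blocks[r][c] is assumed to be the transformed block at block-row r and
    block-column c; the result has one sample per block.
    """
    return [[block[u][v] for block in block_row] for block_row in blocks]


if __name__ == "__main__":
    # A toy tile of 2 x 3 blocks whose values encode the block number.
    tile = [[[[100 * (r * 3 + c) + 8 * u + v for v in range(8)]
              for u in range(8)]
             for c in range(3)]
            for r in range(2)]
    print(extract_subband(tile, 0, 0))  # the DC subband
```

The DC subband obtained this way is a reduced-size image with one sample per 8x8 block.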

2.4 Transcoding with standard JPEG - First Preferred Embodiment

As mentioned previously, where the image is already in JPEG format, DCT coefficients may be extracted directly from the previously encoded stream. This prevents additional losses due to numerical inaccuracies in the forward and inverse transform and quantisation processes.

In this case it may also be desirable to force the subsequent encoding stage to be lossless as this would allow subsequent exact recovery of the original JPEG coefficients for transcoding back to JPEG. Quantisation table data must also be included in the global header if this embodiment is used.

2.5 Transforming Step 220 - Reduced Overlap DWT - Second Preferred Embodiment

During the transform step of the second preferred embodiment, the tiles are preferably of size (256+1) x (256+1), except for tiles that abut the edge of the image, which are appropriately smaller, as described in the section entitled "2.0 Overview of the Preferred Encoding Method".

Each (256+1) x (256+1) tile is transformed independently using a single sample overlap discrete wavelet transform. Following the conventional Mallat decomposition of an image using separable filtering, each row is analysed with a one-dimensional DWT and then each resulting column is analysed with a one-dimensional DWT to give a single level DWT image consisting of a DC subband and three AC subbands. This single level DWT engine is iterated on subsequent DC subbands to give a multi-level DWT image.

The single sample overlap DWT differs from a conventional DWT in that a single sample overlap one-dimensional DWT is used. The single sample overlap one-dimensional DWT differs from the conventional one-dimensional DWT in the way in which it handles boundary pixels.

The one-sample overlap one-dimensional DWT is now explained by way of an example. Consider a one-dimensional signal x, which is segmented into sub-signals nominally of length 9. The first segment consists of the first 8 samples, while subsequent segments consist of 9 samples, where the first sample of each such segment overlaps the previous segment. Thus, beginning with a 0 index, the first segment consists of signal samples 0, 1, ..., 7, the second segment of samples 7, 8, ..., 15, the third segment of samples 15, 16, ..., 23, and so on.
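The segmentation in this example can be written down as a short sketch. The function name and the handling of a short final segment (simply clipped to the end of the signal) are assumptions for illustration.

```python
def overlap_segments(num_samples, nominal_len=9):
    """Return (first, last) sample indices of each single sample overlap segment.

    The first segment holds nominal_len - 1 samples; every later segment
    starts on the last sample of the segment before it.
    """
    if num_samples < 1:
        raise ValueError("signal must contain at least one sample")
    segments = []
    first, last = 0, nominal_len - 2
    while True:
        last = min(last, num_samples - 1)  # clip the final segment
        segments.append((first, last))
        if last == num_samples - 1:
            return segments
        first, last = last, last + nominal_len - 1


if __name__ == "__main__":
    print(overlap_segments(24))
```

For a 24-sample signal this yields the three segments 0 to 7, 7 to 15 and 15 to 23 of the example.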

A lifting lattice for implementing a one-dimensional DWT of such a signal segment is illustrated in Fig. 6. The segment samples are indexed from -1 to 7. The indexing begins at -1 to reflect the assumption that sample $x_{-1}$ is the single sample that overlaps the previous segment of the signal and is typically associated with the previous segment. In a similar manner, sample $x_7$ is the single sample that overlaps the next signal segment, and is associated with the current segment.

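Labelling the highpass signal as d and the lowpass signal as s, the single sample overlap DWT is implemented as

$$d_n = x_{2n} + \alpha\,(x_{2n-1} + x_{2n+1}), \qquad n = 0, \ldots, N-1 \qquad (1)$$

and

$$s_n = x_{2n+1} + \beta\,(d_n + d_{n+1}), \qquad n = 0, \ldots, N-2 \qquad (2)$$

with boundary conditions

$$s_{-1} = x_{-1} \quad \text{and} \quad s_{N-1} = x_{2N-1} \qquad (3)$$

where $\alpha = -0.5$, $\beta = 0.25$ and, in our example, $2N = 8$. Apart from the boundary conditions, the DWT implemented by these equations is the transform sometimes known as the 5/3 DWT.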
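A minimal sketch of one lifting pass over an interior segment may help fix the indexing. It assumes the 5/3 lifting constants $\alpha = -0.5$ and $\beta = 0.25$, a segment of 2N+1 samples stored so that list position 0 holds $x_{-1}$, and a hypothetical function name; it does not cover the first or last segment of a signal.

```python
def sso_dwt_53(x, alpha=-0.5, beta=0.25):
    """Single sample overlap lifting step for one interior segment.

    x holds the 2N+1 samples x_{-1} .. x_{2N-1}, so x[i + 1] is x_i.
    Returns (s, d): s = [s_{-1}, s_0, ..., s_{N-1}] and d = [d_0, ..., d_{N-1}].
    """
    if len(x) < 3 or len(x) % 2 == 0:
        raise ValueError("segment must hold 2N+1 samples with N >= 1")
    n_high = (len(x) - 1) // 2
    sample = lambda i: x[i + 1]  # access by the signal index i = -1 .. 2N-1

    # Highpass samples sit on the even signal indices.
    d = [sample(2 * n) + alpha * (sample(2 * n - 1) + sample(2 * n + 1))
         for n in range(n_high)]

    # Lowpass samples sit on the odd indices; the two overlap samples are
    # copied unchanged (the boundary conditions).
    s = [sample(-1)]
    for n in range(n_high - 1):
        s.append(sample(2 * n + 1) + beta * (d[n] + d[n + 1]))
    s.append(sample(2 * n_high - 1))
    return s, d


if __name__ == "__main__":
    print(sso_dwt_53(list(range(9))))
```

Because the first and last lowpass outputs are copies of the overlap samples, two adjacent segments produce identical values for the sample they share, which is what allows the shared sample to be coded only once.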

In the general case the input signal segment is of length 2N+1, giving N+1 lowpass and N highpass samples. If the segment is the first segment in the signal then the sample $x_{-1}$ does not exist. In this case sample $x_{-1}$ and further samples $x_{-2}$, $x_{-3}$, ... are provided by extending the signal in the conventional manner. That is, for odd length symmetric DWT filters, $x_{-n} = x_n$. The initial boundary condition is not used. That is, equations (1) and (2) are used along with $s_{N-1} = x_{2N-1}$. Finally, in this case only N lowpass samples are produced, as we ignore $s_{-1}$. Similarly, if the segment is the last segment of the signal, the data is extended to the right in the conventional way and the final boundary condition is not used.

A one sample overlap one-dimensional DWT that uses Daubechies 9/7 filters requires a further highpass and lowpass update step. Equations (1) and (2) are used with $\alpha = -1.586134342$ and $\beta = -0.05298011854$, while equation (3) becomes

$$s_{-1} = B\,x_{-1} \quad \text{and} \quad s_{N-1} = B\,x_{2N-1} \qquad (4)$$

where $B$ is a constant. A second highpass and lowpass stage is required, giving highpass samples $d'_n$ for $n = 0, \ldots, N-1$ (5) and lowpass samples $s'_n$ for $n = 0, \ldots, N-2$ (6), with boundary conditions on $s'_{-1}$ and $s'_{N-1}$ (7). The output highpass signal is then $d'$, while the output lowpass signal is $s'$.

The single sample overlap two-dimensional DWT of the preferred method is performed using this single sample overlap one-dimensional DWT. For the preferable tile size of (256+1) x (256+1), each row is analysed into (128+1) lowpass and 128 highpass samples. The one sample overlap one-dimensional DWT with Daubechies 9/7 filters is preferably used. Each subsequent column is analysed into (128+1) lowpass samples and 128 highpass samples, giving a (128+1) x (128+1) DC subband and three AC subbands, the HL1, LH1 and HH1 subbands. The HL1 subband is of size (128+1) x 128 samples, the LH1 of size 128 x (128+1) samples and the HH1 subband of size 128 x 128 samples.

This (128+1) x (128+1) DC subband is analysed in turn, in the same way as the original (256+1) x (256+1) tile was analysed, to give a new (64+1) x (64+1) DC subband and three new AC subbands, the LH2, HL2 and HH2 subbands. The HL2 subband is of size (64+1) x 64 samples, the LH2 of size 64 x (64+1) samples and the HH2 subband of size 64 x 64 samples. Further levels of the DWT are performed in the same way, generating three new AC subbands at each level, preferably with 5 levels, or iterations, of the single level two-dimensional DWT engine in total. With 5 levels of transform the resulting final DC subband is of size (8+1) x (8+1) coefficients. Other initial tile sizes can be analysed in the same way, as long as each tile dimension (excluding the single sample overlap) is divisible by $2^J$, where J is the number of DWT levels used.

For the purposes of quantisation and coding, each subband generated by this process is normalised. That is, each coefficient in a subband is scaled multiplicatively by some constant value. This value is determined so that the impulse response of the synthesised subband has unit energy.

2.6 Partitioning Steps 230 and 240 - Partitioning into resolutions and subbands - Second Preferred Embodiment

The resolution levels occur naturally within the wavelet transform, with each level of the wavelet transform consisting of vertical, horizontal and diagonal detail (or AC or high frequency) subbands. The final level also consists of a DC subband, which can be considered as an independent resolution level. Thus, in this embodiment the partitioning steps 230 and 240 are effectively carried out by the second preferred transform step 220.

For the preferable tile size of (256+1) x (256+1) pixels, or $(2^8+1) \times (2^8+1)$ pixels, the subband HLj is of size $(2^{8-j}+1) \times 2^{8-j}$, where j is the DWT level. Thus HL3, the HL subband generated at level 3 of the DWT, is of size (32+1) x 32. In a similar manner, subband LHj is of size $2^{8-j} \times (2^{8-j}+1)$, and the DC subband is of size $(2^{8-J}+1) \times (2^{8-J}+1)$, where J is the total number of DWT levels. Before coding, the first row of the HLj subband is discarded, leaving $2^{8-j} \times 2^{8-j}$ samples. Similarly, the first column of the LHj subband is discarded, leaving $2^{8-j} \times 2^{8-j}$ samples. Finally, the first row and column of the DC subband are discarded, leaving $2^{8-J} \times 2^{8-J}$ samples. In this way there remain 256 x 256 subband samples in total.
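The sample counts quoted above can be checked with a small sketch. The function names are hypothetical; sizes are given as (rows, columns), following the convention that discarding the first row of HLj leaves a square subband.

```python
def subband_sizes(tile=256, levels=5):
    """Sizes (rows, cols) of the subbands of a (tile+1) x (tile+1) tile."""
    if tile % (1 << levels):
        raise ValueError("tile dimension must be divisible by 2**levels")
    sizes = {}
    n = tile
    for j in range(1, levels + 1):
        n //= 2
        sizes[f"HL{j}"] = (n + 1, n)
        sizes[f"LH{j}"] = (n, n + 1)
        sizes[f"HH{j}"] = (n, n)
    sizes["DC"] = (n + 1, n + 1)
    return sizes


def coded_sample_count(tile=256, levels=5):
    """Samples left after discarding the rows and columns shared with neighbours."""
    total = 0
    for name, (rows, cols) in subband_sizes(tile, levels).items():
        if name.startswith("HL") or name == "DC":
            rows -= 1  # first row is held by the tile above
        if name.startswith("LH") or name == "DC":
            cols -= 1  # first column is held by the tile to the left
        total += rows * cols
    return total


if __name__ == "__main__":
    print(subband_sizes()["HL3"], subband_sizes()["DC"], coded_sample_count())
```

With the preferred tile size and 5 levels, the count equals 256 x 256, so the single sample overlap adds no coded samples.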

The reason that these subband samples are discarded is that they are contained in neighbouring tiles. That is, the first row of subband HLj in the current tile is identical to the last row of subband HLj in the tile immediately above the current tile (and which overlaps the current tile by one row). Similarly, the first column of subband LHj in the current tile is identical to the last column of subband LHj in the tile immediately to the left of the current tile (and which overlaps the current tile by one column). Since this information is discarded prior to entropy coding, fully reconstructing the current tile requires decoding these neighbouring tiles as well as the current tile. However, as explained in the section entitled "3.3 Reconstruction at reduced resolution - Second Preferred Embodiment" below, the current tile can be approximately reconstructed by decoding only the code for the current tile.

For tiles in the first row or column of tiles, data is only discarded if it is contained in a previous tile, where previous is defined in terms of raster scan order. In this way the total number of subband coefficients is maintained at 256x256, for the preferable tile size of (256+1)x(256+1). Tiles that abut the lower and right hand edge of the image may be smaller than (256+1)x(256+1). In this case the one-dimensional DWT is modified to handle the smaller signals with an appropriate extension at the end of the signal. The row and column data is still discarded for the subbands in these tiles, however, as this data is contained in previous tiles.

2.7 Entropy Coding Step 250: Sub-Bit-Plane Embedded Quadtree Coding

Each subband is independently entropy encoded during step 250 by a sub-bit plane embedded quadtree coding. As far as the entropy coding process is concerned, each subband is treated as a single block of coefficients.

Before proceeding with a description of the entropy encoding process, a brief review of terminology used hereinafter is provided. For a binary integer representation of a number, "bit n" or "bit number n" refers to the binary digit n places to the left of the least significant bit (beginning with bit 0). For example, assuming an 8-bit binary representation, the decimal number 9 is represented as 00001001. In this number, bit 3 is equal to 1, while bits 2, 1, and 0 are equal to 0, 0, and 1, respectively. In addition, a block of coefficients may be represented as a matrix having coefficients arranged in rows and columns, with each coefficient represented by a bit sequence. Conceptually speaking, the matrix may be regarded as having three dimensions: one dimension in the row direction, a second dimension in the column direction and a third dimension in the bit sequence direction. A plane in this three-dimensional space that passes through each bit sequence at the same bit number is referred to as a "bitplane" or "bit plane". The term "bit plane number n" refers to that bit plane that passes through bit number n.
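The terminology can be made concrete with a short sketch. The helper names are hypothetical, and bits are taken from the coefficient magnitude, which assumes a signed magnitude representation.

```python
def bit(value, n):
    """Bit number n of the magnitude of value (bit 0 is the least significant)."""
    return (abs(value) >> n) & 1


def bit_plane(block, n):
    """Bit plane number n of a block given as a list of rows."""
    return [[bit(c, n) for c in row] for row in block]


if __name__ == "__main__":
    print([bit(9, n) for n in (3, 2, 1, 0)])
    print(bit_plane([[9, 2], [-4, 0]], 1))
```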

A region of a subband comprises a set of contiguous image coefficients. A set of coefficients or a region can be denoted as T or as $\{c_{i,j}\}$, where (i,j) is a coefficient coordinate and $c_{i,j}$ denotes the coefficient at coordinate (i,j). A set or region T of coefficients at a current bit plane is said to be insignificant if the MSB number of each coefficient in the region is less than the value of the current bit plane. To make the concept of region significance precise, a mathematical definition is given in Equation (8). A set or region T of pixels is said to be insignificant with respect to (or at) bit plane n if

$$|c_{i,j}| < 2^n \quad \text{for all } c_{i,j} \in T \qquad (8)$$

By a partition of a set T we mean a collection $\{T_m\}$ of non-overlapping subsets of T such that the whole of T is contained. More precisely:

$$T = \bigcup_m T_m, \qquad T_n \cap T_m = \emptyset \quad \forall\, n \neq m \qquad (9)$$

In other words, if $c_{i,j} \in T$ then $c_{i,j} \in T_m$ for one, and only one, of the subsets $T_m$.

For example, suppose T is a square region and the set $\{T_m\}$ is the set consisting of the four quadrants of T. Such a partition is referred to as a quadtree partition and is depicted in Fig. 8.
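The significance test and the quadtree partition can be sketched together. A region is represented here as a hypothetical (row, column, size) triple for a square sub-block, and the quadrants are returned in raster order, which is an assumption about the layout of Fig. 8.

```python
def is_insignificant(block, region, n):
    """True if every coefficient of the region has magnitude below 2**n."""
    row0, col0, size = region
    return all(abs(block[r][c]) < (1 << n)
               for r in range(row0, row0 + size)
               for c in range(col0, col0 + size))


def quadrants(region):
    """Quadtree partition of a square region into its four quadrants."""
    row0, col0, size = region
    half = size // 2
    return [(row0, col0, half), (row0, col0 + half, half),
            (row0 + half, col0, half), (row0 + half, col0 + half, half)]


if __name__ == "__main__":
    block = [[0] * 4 for _ in range(4)]
    block[3][3] = 5
    print(is_insignificant(block, (0, 0, 4), 3))  # 5 < 8
    print([is_insignificant(block, q, 2) for q in quadrants((0, 0, 4))])
```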

The preferred method encodes a region of coefficients in an embedded manner using a quadtree partition. The use of the term embedded is taken to mean that every bit in a higher bit plane is coded before any bit in a lower bit plane. For example, every bit is coded in bit plane 7 before any bit in bit plane 6. In turn, all bits in bit plane 6 are coded before any bit in bit plane 5, and so on. That is, bit plane n is coded and put into the coded bitstream before bit plane n-1. The use of the term sub-bit plane is taken to mean that each bit plane is coded in several passes. Preferably, each bit plane is coded in three passes: namely the LIC, LIR, and LSC passes, as will be discussed below.

Fig. 7 is a flow diagram of the preferred embedded quadtree coding process. The coefficients are assumed to be represented in a signed magnitude form with a finite number of bits. Preferably, 15 bits are used to represent the magnitude of the coefficients and an extra sign bit to give 16 bits in total.

In step 710 the most significant bit of all the coefficients in the block, nmax, is determined. That is, nmax is the smallest integer n satisfying

$$2^{n+1} > |c| \qquad (10)$$

for all coefficients c in the block. In step 720, the bit plane variable n is set to nmax.
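Step 710 amounts to finding the highest set bit of the largest magnitude in the block, as the following sketch shows. The function name is hypothetical, and the treatment of an all-zero block (returning 0, the smallest non-negative candidate) is an assumption, since the text does not address that case.

```python
def n_max(block):
    """Smallest non-negative n with 2**(n + 1) > |c| for every coefficient c."""
    largest = max(abs(c) for row in block for c in row)
    # int.bit_length() is the position of the highest set bit, plus one.
    return max(largest.bit_length() - 1, 0)


if __name__ == "__main__":
    print(n_max([[9, -3], [0, 1]]))
```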

In step 730, a list of insignificant coefficients (LIC), a list of significant coefficients (LSC) and a list of insignificant regions (LIR) are initialised. The LIC and LSC are initialised to be empty. The LIR is initialised to be the four quadrants of the block. The variable num_sig_coeffs_to_code is initialised to be 0. These lists, and how they are coded, are described in more detail below. If a list is empty, however, the process continues onto the next coding step without coding that empty list.

In step 740, bit n of each coefficient in the LIC is coded. Initially, bit n is set to nmax and is decremented for each pass of the loop 740 to 790.

At step 750, each region in the list of insignificant regions is coded at bit plane n.

At step 760, bit n of each coefficient in the list of significant coefficients (LSC) is coded.

At step 770, num_sig_coeffs_to_code is set to the number of coefficients in the LSC. This is the number of coefficients in the LSC that are coded at bit plane n-1. This variable is used so that the significant coefficients that are added to the LSC in steps 740 and 750 are not coded during the current pass. In decision blocks 745, 755 and 780 a check is made as to whether the current sub bit-plane is the last sub bit-plane to be coded. If any of these decision blocks returns a yes, processing terminates at step 795. That is, coding can terminate after any sub bit-plane. In the first preferred embodiment, to maintain compatibility with JPEG, the data that is entropy coded has been quantised to integer values, and hence the method terminates after coding all three sub-bit planes at bit plane 0. In this way each bit of each data integer is coded and hence the data is losslessly coded.

For the second preferred embodiment, or when compatibility with JPEG is not required, coding can terminate at any arbitrary sub-bit plane. Preferably a minimum sub bit-plane is determined prior to coding and all blocks terminate at this sub bit-plane.

If decision block 780 returns yes the process terminates at step 795, otherwise processing continues at step 790. At step 790, the current bit plane variable n is decremented and processing continues at step 740.

2.7.1 Encoding the LIR

The list of insignificant regions (LIR) is a list, or vector, of regions. A region is a sub-block of the block of coefficients. A region (within the block) can be described by the top left-hand corner coordinate of the region within the block and by the region size.

The list of insignificant regions is initialised with 4 regions: namely the four quadrants in the block.

Referring to Fig. 8, if 800 represents the block then the four regions are 810, 820, 830 and 840. These regions are put into the LIR in this order.

Referring to Fig. 9, the LIR is coded at bit plane n at step 750 of Fig. 7 as follows. In step 910, the current region R is set to the first region in the LIR, L is set to the number of regions in the LIR, and region_num, the index of the current region in the LIR, is set to 1. In decision block 912 a check is made to determine if region_num is less than or equal to L. If decision block 912 returns a yes, processing continues at step 914.

At step 914, the significance of region R is output. A coefficient c is insignificant at bit plane n if

$$|c| < 2^n \qquad (11)$$

A region is insignificant at bit plane n if all coefficients in the region are insignificant at bit plane n. A region or coefficient is significant at bit plane n if it is not insignificant at bit plane n. At step 914 the significance of R is coded by outputting a 1 if R is significant or outputting a 0 if R is insignificant. Processing then resumes at step 920. If decision block 912 returns a no, then processing skips immediately to step 920.

In decision block 920 a check is made to determine if R is significant at bit plane n. If decision block 920 returns a no, processing continues at step 950.

If decision block 920 returns a yes, processing continues at step 925. At step 925, R is removed from the LIR. In step 930 a 2x2 significance mask is coded with a level Huffman code. This step is further explained below. Decision block 935 checks if R is a region consisting of 2x2 coefficients. If decision block 935 returns a no, then processing continues at step 940. At step 940 R is partitioned into 4 regions, namely its four quadrants, and these are added to the end of the LIR. For example, if block 800 in Fig. 8 is the region R, then 810, 820, 830 and 840 are the four quadrants. The significance mask, coded in step 930, is a 2x2 binary mask indicating the significance (with respect to n) of each of the 2x2 quadrants in R. If, for example, 810, 820, and 840 are insignificant with respect to n, while 830 is significant with respect to n, then the significance mask would have a 1 in the position corresponding to 830 and a 0 in the positions corresponding to 810, 820 and 840, where 0 indicates insignificant and 1 significant. Note that there are only 15 possible different significance masks, as at least one quadrant must be significant.

Note that at step 940 the significance of each of the 4 regions that are added to the end of the LIR has already been coded at step 930, via the significance mask. This is why at step 912 a check is made if region_num is less than or equal to L. If region_num is greater than L, then the significance of the region has already been coded at step 930 during the coding of some previous region (whose index is less than region_num).

Returning to decision block 935, if said block returns a yes, then processing continues at step 945. If R is a 2x2 block of coefficients then the significance mask indicates the significance of each of the 2x2 coefficients. To continue with the example, in the case where R is a 2x2 region, 810, 820, 830 and 840 are 1x1 regions (namely individual coefficients). At step 945, if a coefficient in the 2x2 region R is significant, it is added to the list of significant coefficients (LSC) and a sign bit is output. That is, a 0 is output if the coefficient is positive or a 1 is output if the coefficient is negative. At step 945, if a coefficient in the 2x2 region R is insignificant, it is added to the list of insignificant coefficients (LIC). After steps 940 and 945 processing resumes at step 950.

In decision block 950 a check is made to determine if R is the last region in the LIR. If decision block 950 returns a yes then processing terminates at step 960. If decision block 950 returns a no, processing resumes at step 955. At step 955 the current region index, region_num, is incremented, and R is set to the next region in the LIR.

Processing then resumes at step 912.

2.7.2 Encoding the LIC

Referring to Fig. 10, the list of insignificant coefficients (LIC) is coded at bit plane n in step 740 of Fig. 7 as follows. The list of insignificant coefficients is simply a list of coefficients added by the LIR coding process. In step 1010 the current coefficient c is set to the first coefficient in the LIC. In step 1020 bit n of c is output. That is, a 1 is output if bit n of c is a 1, else a 0 is output. In decision block 1030 a check is made to determine if c is significant at bit plane n. If decision block 1030 returns a yes (that is, a 1 was output at step 1020), processing continues at step 1040. At step 1040, a sign bit is output and the coefficient c is removed from the LIC and added to the end of the LSC.

Processing then continues at step 1050. If decision block 1030 returns a no, processing resumes at step 1050. At decision block 1050 a check is made to determine if c is the last coefficient in the LIC. If decision block 1050 returns a yes, processing terminates at step 1070. If decision block 1050 returns a no, then processing continues at step 1060. At step 1060 the current coefficient c is set to the next coefficient in the LIC. Processing then continues at step 1020.

2.7.3 Encoding the LSC

Referring to Fig. 11, the list of significant coefficients (LSC) is coded at bit plane n in step 760 of Fig. 7 as follows. The list of significant coefficients is simply those coefficients added by the LIR and LIC coding processes. At step 1110 the current coefficient c is set to the first coefficient in the LSC and the current coefficient index, coefficient_num, is set to 1. At step 1120 bit n of c is output. That is, a 1 is output if bit n of c is a 1, else a 0 is output. At decision block 1130, a check is made to determine if coefficient_num is greater than num_sig_coeffs_to_code. The variable num_sig_coeffs_to_code is set in steps 730 and 770 and is used so that those coefficients that are added to the LSC at steps 740 and 750 for bit plane n are not coded again during the coding of the LSC at bit plane n. If decision block 1130 returns a yes, processing terminates at step 1150. If decision block 1130 returns a no, processing continues at step 1140. At step 1140 the current coefficient c is set to the next coefficient in the LSC and the current coefficient index, coefficient_num, is incremented. Processing then continues at step 1120.
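The three passes of sections 2.7.1 to 2.7.3 can be put together in a compact sketch. It is an illustration under several assumptions rather than the coder of Figs. 7 to 11: the function name is hypothetical, the block is square with a power-of-two side of at least 4, quadrants are taken in raster order, the 2x2 significance mask is written as four raw bits instead of a Huffman code, and coding stops only after a complete bit plane (last_plane) rather than after an arbitrary sub-bit plane.

```python
def encode_block(block, last_plane=0):
    """Return the output bits of an embedded quadtree coding of block."""
    size = len(block)
    mag = [[abs(c) for c in row] for row in block]

    def sign(i, j):
        return 1 if block[i][j] < 0 else 0

    def significant(region, n):
        r0, c0, s = region
        return any(mag[r][c] >> n for r in range(r0, r0 + s)
                   for c in range(c0, c0 + s))

    def quadrants(region):
        r0, c0, s = region
        h = s // 2
        return [(r0, c0, h), (r0, c0 + h, h), (r0 + h, c0, h), (r0 + h, c0 + h, h)]

    nmax = max(max(max(row) for row in mag).bit_length() - 1, 0)  # step 710
    bits, lic, lsc = [], [], []                                    # step 730
    lir = quadrants((0, 0, size))
    num_sig_coeffs_to_code = 0
    for n in range(nmax, last_plane - 1, -1):                      # steps 720, 790
        # LIC pass (step 740): one bit per coefficient, plus a sign bit
        # when the coefficient becomes significant.
        remaining = []
        for (i, j) in lic:
            b = (mag[i][j] >> n) & 1
            bits.append(b)
            if b:
                bits.append(sign(i, j))
                lsc.append((i, j))
            else:
                remaining.append((i, j))
        lic = remaining
        # LIR pass (step 750): regions appended during this pass already had
        # their significance coded through the mask of their parent.
        queue = [(region, True) for region in lir]
        lir, k = [], 0
        while k < len(queue):
            region, must_code = queue[k]
            k += 1
            sig = significant(region, n)
            if must_code:
                bits.append(int(sig))
            if not sig:
                lir.append(region)
                continue
            quads = quadrants(region)
            mask = [int(significant(q, n)) for q in quads]
            bits.extend(mask)  # raw mask bits (assumed, not a Huffman code)
            if region[2] == 2:
                for (i, j, _), m in zip(quads, mask):
                    if m:
                        lsc.append((i, j))
                        bits.append(sign(i, j))
                    else:
                        lic.append((i, j))
            else:
                queue.extend((q, False) for q in quads)
        # LSC pass (step 760): refine coefficients found in earlier bit planes.
        for (i, j) in lsc[:num_sig_coeffs_to_code]:
            bits.append((mag[i][j] >> n) & 1)
        num_sig_coeffs_to_code = len(lsc)                          # step 770
    return bits


if __name__ == "__main__":
    block = [[0] * 4 for _ in range(4)]
    block[3][3] = -5
    print(encode_block(block))
```

A decoder would follow the same control flow, reading each bit where this sketch appends one.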

2.8 Steps 270 and 280 - Construction of the Compressed Level and Tile Code Segments

For ease of explanation, the bitstream compilation is described with additional reference to Fig. 12. During step 270, level code segments are compiled. The code segment for a level consists of the concatenation of the code for each entropy encoded subband in said level, such that vertical detail information appears first, followed by horizontal detail and finally diagonal detail. The level codes are padded with one bits so that they consist of an integral number of bytes. During step 280, the tile code segments are compiled. During this step 280, the level codes are concatenated in order of increasing level to form the tile code.
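The padding of a level code to a whole number of bytes can be sketched as follows. The function name is hypothetical and the packing order (first bit into the most significant position of each byte) is an assumption; the text specifies only that the padding bits are ones.

```python
def pack_level_code(bits):
    """Pad a list of 0/1 values with one bits and pack it into bytes."""
    padded = list(bits) + [1] * (-len(bits) % 8)
    out = bytearray()
    for k in range(0, len(padded), 8):
        byte = 0
        for b in padded[k:k + 8]:
            byte = (byte << 1) | b  # assumed MSB-first packing
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    print(pack_level_code([1, 0, 1]).hex())
```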

Prior to concatenation, each subband in a tile can be truncated at an arbitrary sub-bit plane, as this information is coded into the compressed tile code header. For some entropy encoders that can tolerate buffering the whole compressed image, the decision (202, 260) at which point each subband in each tile is terminated can be made after entropy coding the whole image. For example, the set of termination points can be found that minimises, over all possible sets of termination points, the compressed image code size for a given decompressed image distortion, or minimises the distortion for a given compressed image code size, using Lagrangian constrained optimisation techniques. The subband codes are then terminated at the required points, and the compressed tile codes constructed with these termination points.
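One common way to realise such a Lagrangian selection is to choose, independently for each subband, the termination point that minimises distortion plus a multiplier times rate. The sketch below assumes that a list of hypothetical (rate, distortion) pairs has been measured for each subband during coding; the function name and the multiplier `lam` are illustrative, and the search for the multiplier that meets a particular size or distortion target is not shown.

```python
def choose_truncation_points(candidates, lam):
    """Pick, per subband, the index of the point minimising D + lam * R.

    candidates[k] is a list of (rate_bytes, distortion) pairs for subband k,
    one pair per possible termination point.
    """
    return [min(range(len(points)),
                key=lambda i: points[i][1] + lam * points[i][0])
            for points in candidates]


if __name__ == "__main__":
    measured = [[(0, 100.0), (10, 40.0), (20, 30.0)],
                [(0, 50.0), (5, 10.0)]]
    for lam in (0.0, 2.0, 100.0):
        print(lam, choose_truncation_points(measured, lam))
```

A larger multiplier favours shorter codes and a smaller one favours lower distortion.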

During step 280, a tile header is added. The tile header precedes the concatenated level codes, and consists of an integral number of bytes. Minimally, the header includes the number of the most significant bit plane and the number of sub-bit plane passes encoded for each subband. Preferably, each of these sets of values is encoded using a tree coding for efficiency, though this, along with the other fine details of the header format, is not considered critical to the current invention, and a run-length or similar code could be used in its place.

Optionally, to assist parsing or hardware implementation, the tile header might also include the length of each of the constituent level codes. In these cases, length is defined to be the number of bytes. No special coding of these lengths is proposed, so that they are more readily interpreted.

2.9 Step 290 - Construction of image code and Global Header

During step 290, the entire compressed image code is compiled. The compressed image code consists of all the tile code segments, in tile raster scan order, preceded by the global image header. Preferably, the global image header contains information about the image size, number of colour planes, colour space, component rates, transform type, quantisation table (if any), number of levels, tile size and overlap, as well as the length of the tile codes which comprise the compressed bit stream. The DCT block size can be inferred from the number of levels as $B = 2^n$ (B is the block size and n is the number of levels) and need not be added to the header. Thus it is possible to depart from an 8x8 block size. Again, length is defined to be the number of bytes. As an alternative, pointer information about the location of each tile code in the compressed image code can be used.

Under certain circumstances it may be desirable to move some of these fields from the global to the tile headers. This is especially the case where a high degree of independence of the tile codes is important, and does not represent a departure from the spirit or intent of the current invention. Rather, the grouping of these fields into a global header provides more efficiency with respect to overall compression.

Overview of the Decoder

Fig. 3 shows a block diagram of the decoding method of a compressed digital image of the original digital image or region thereof, where the compressed digital image has been encoded by the encoding method shown in Fig. 2. During decoding the global header is first read at step 310. Information on the tile code lengths, combined with the image dimensions and the region of image data required, is then used at step 320 to scan directly to the first required tile code. The required tiles are retrieved and stored individually and in sequence in working memory for further processing by the decoding method. Header information in this tile code is then extracted in step 330. For decoding at reduced resolutions, not all subband levels are decoded. This is controlled by step 303, labelled "Scale". The required resolutions or subband levels are decoded using a sub bit-plane quadtree decoder in step 340. The detailed operation of the sub bit-plane quadtree decoder is described in the section herein entitled "3.1 Entropy decoding: Embedded quadtree decoding".
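The seek performed at step 320 can be sketched from the tile code lengths held in the global header. The function names, the inclusive pixel bounds of the region and the omission of the single sample overlap between tiles are all assumptions made for the illustration.

```python
def tile_offsets(header_len, tile_code_lengths):
    """Byte offset of each tile code, given lengths in tile raster scan order."""
    offsets, position = [], header_len
    for length in tile_code_lengths:
        offsets.append(position)
        position += length
    return offsets


def tiles_covering(region, tile_size, tiles_per_row):
    """Raster indices of the tiles touched by region = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = region
    return [ty * tiles_per_row + tx
            for ty in range(y0 // tile_size, y1 // tile_size + 1)
            for tx in range(x0 // tile_size, x1 // tile_size + 1)]


if __name__ == "__main__":
    offsets = tile_offsets(16, [100, 50, 75, 60])
    wanted = tiles_covering((200, 0, 300, 100), tile_size=256, tiles_per_row=4)
    print([(t, offsets[t]) for t in wanted])
```

Only the tile codes at the listed offsets need to be read to decode the requested region.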

Where the image has been encoded in accordance with the first preferred embodiment (see sections 2.1 to 2.4), the image can be reconstructed at different resolutions. This reconstruction is described in more detail in the following sections "3.2.0 Reconstruction at reduced resolution - First Preferred Embodiment" and "3.2.1 Reconstruction at increased resolution - First Preferred Embodiment". Moreover, in the first preferred embodiment the tiles are divided into non-overlapping blocks.

Consequently, the steps 350, 355, 370 and 375 of the decoding method may be dispensed with in this embodiment. During the next step 360 of the first preferred embodiment, the blocks (or subband levels thereof) of the current tile are then inverse DCT transformed. The method then proceeds to decision block 370, where a check is made to determine if the whole region has been recovered. If decision block 370 returns a no, processing resumes at step 320, where the code for the next tile of interest is located in the compressed image code. If decision block 370 returns a yes, processing terminates and the decoded image is output.

Where the image has been encoded in accordance with the second preferred embodiment (see sections 2.5 and 2.6), the image can also be reconstructed at different resolutions. This reconstruction is described in more detail in the following section "3.3 Reconstruction at reduced resolution - Second Preferred Embodiment".

In the second preferred embodiment, after step 340, the method proceeds to decision block 350. Decision block 350 determines if any overlapping data is required to decode the current tile. If decision block 350 returns a yes, then at step 355 the required overlapping data is retrieved from previously decoded coefficients. This process is explained further in the following section entitled "3.3 Reconstruction at reduced resolution - Second Preferred Embodiment". If decision block 350 returns a no, then processing resumes at step 360. In step 360, the current tile (or subband levels thereof) is inverse DWT transformed, after which the method proceeds to decision block 370. In decision block 370 a check is made if any data in the current block is overlapping other blocks that will be decoded at a later stage. If decision block 370 returns a yes, then this data is buffered for future use in step 375. If decision block 370 returns a no, processing resumes at step 370 to check whether the whole region is covered, in the same manner as the first embodiment. If the decision block returns a yes, the decoding method terminates and the decoded image is output.

3.1 Entropy decoding: Embedded quadtree decoding

Given an embedded quadtree code for a block, the block can be reconstructed using the reverse of the preferred quadtree encoding process described previously. The decoding process follows essentially the same process as the encoder, except that the directions of the branching or decision points in the decoding process are now determined from the bits in the coded bit stream that were output by the encoding process at the corresponding points.

The coefficients are quantised by the encoding process by terminating the quadtree code at some sub bit-plane. Given an embedded quadtree code for a block the ease og 00 block can be reconstructed, up to a precision determined by the last sub bit-plane in the encoding process, using the reverse of the quadtree encoding procedure. The decoding 15 process follows essentially the same process. The direction of the branching or decision points in the decoding process are now determined from the bits in the coded bit stream, that were output by the encoding process at the corresponding points.


At the termination of any pass (LIC, LIR or LSC) the decoding process can determine each coefficient in the block up to a certain bit precision. For example, if the last pass was the LSC at bit plane n=3, bit 3 and above can be determined for each coefficient in the block by the decoding process, and we say that each decoded coefficient has a bit precision of 3. Preferably, the decoding process reconstructs each coefficient in the middle of the decoded coefficient's uncertainty interval. That is, suppose a decoded coefficient has a bit precision of n and the (decoded or actual) coefficient has a non-zero bit in bit plane n or higher. Let m be the magnitude of a number with zeros in bit planes 0 to n-1, and bits in higher bit planes according to the decoded bits for the coefficient.

Then, preferably, the magnitude of the decoded coefficient is given by $m + 2^{n-1}$. This reflects the fact that, as far as the decoder can currently ascertain, the original coefficient can have a magnitude between $m$ and $m + 2^{n}$. The interval $[m, m + 2^{n}]$ is called the uncertainty interval. For a bit precision of n, if the coefficient has no non-zero bits in bit plane n or higher, the decoded value is 0.
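The midpoint rule above can be illustrated with a minimal sketch. The function name `reconstruct_magnitude` and its arguments are hypothetical and are not part of the specification; the handling of n = 0 (full precision, nothing added) is an assumption, since the text only defines the offset for a non-trivial uncertainty interval.

```python
def reconstruct_magnitude(decoded_bits: int, n: int) -> int:
    """Reconstruct a coefficient magnitude decoded to bit precision n.

    decoded_bits carries the magnitude bits seen so far; any bits
    below bit plane n are treated as unknown and discarded.
    """
    # m has zeros in bit planes 0..n-1 and the decoded bits above them.
    m = (decoded_bits >> n) << n
    if m == 0:
        # No non-zero bit in plane n or higher: decoded value is 0.
        return 0
    if n == 0:
        # Assumption: at full precision there is no uncertainty left.
        return m
    # Place the value in the middle of the uncertainty interval.
    return m + (1 << (n - 1))


# Example: bits 0b101101 decoded down to bit plane 3 give m = 40,
# so the reconstructed magnitude is 40 + 4 = 44.
print(reconstruct_magnitude(0b101101, 3))
```

Only magnitudes are handled here; in a full decoder the sign would be applied separately once the coefficient is known to be non-zero.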

3.2.0 Reconstruction at reduced resolution - First Preferred Embodiment

In both cases (Figs. 4A and 4B), reduced resolution reconstruction requires all coefficients at levels up to and including the target level and is obtained efficiently using an inverse DCT with length equal to the number of coefficients present. For example, in the M level partition a 3/M resolution reconstruction (in both spatial planes) would be obtained by performing a 3x3 inverse DCT on the coefficients from levels 0 through 2.

For example, in the dyadic arrangement, a half resolution image (in both spatial planes) would be obtained by performing an inverse 4x4 DCT on the coefficients from levels 0 through 2. It is possible to reconstruct any resolution n/M (n an integer in the range 1 to M) from either grouping; however, in the dyadic grouping this may lead to some coefficients being thrown away. It should also be noted that this mechanism allows for independent image downsampling in each image plane.
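As an illustration of the reduced resolution reconstruction described above, the following sketch keeps the upper-left n x n coefficients of a block and applies an n x n inverse DCT. It assumes an orthonormal DCT-II; the gain factor n/N that preserves the mean intensity is an assumption tied to that normalisation and is not stated in the specification. All function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis; row k is the k-th basis function."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
                     for x in range(n)])
    return rows


def forward_dct2(block):
    """2-D DCT of a square block: C * block * C^T."""
    n = len(block)
    c = dct_matrix(n)
    tmp = [[sum(c[k][x] * block[x][y] for x in range(n)) for y in range(n)]
           for k in range(n)]
    return [[sum(tmp[k][y] * c[l][y] for y in range(n)) for l in range(n)]
            for k in range(n)]


def inverse_dct2(coeffs):
    """2-D inverse DCT of a square block: C^T * coeffs * C."""
    n = len(coeffs)
    c = dct_matrix(n)
    tmp = [[sum(c[k][x] * coeffs[k][l] for k in range(n)) for l in range(n)]
           for x in range(n)]
    return [[sum(tmp[x][l] * c[l][y] for l in range(n)) for y in range(n)]
            for x in range(n)]


def reduced_resolution(coeffs, n):
    """Reconstruct an n x n block from the lowest n x n coefficients."""
    big = len(coeffs)
    gain = n / big  # assumed rescaling for the orthonormal transform
    low = [[coeffs[k][l] * gain for l in range(n)] for k in range(n)]
    return inverse_dct2(low)


# Half resolution of an 8x8 block: 4x4 inverse DCT on the low coefficients.
block = [[100.0] * 8 for _ in range(8)]
half = reduced_resolution(forward_dct2(block), 4)
```

With a different DCT normalisation the gain factor would change, so it should be checked against the transform actually used by the encoder.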

3.2.1 Reconstruction at increased resolution - First Preferred Embodiment

A similar approach to the above can also be used for upsampling (interpolating) the image data for moderate upsample rates. This involves using the decoded block as the upper left set of coefficients in a larger virtual coefficient block. Subsequently a higher resolution image may be reconstructed using this larger block size for the inverse DCT.

For example, reconstruction at a size of 9/8 times the size of the original image is achieved when the original block size is 8x8 and these blocks are padded with zeros to a size of 9x9, then reconstructed using a 9x9 inverse DCT.
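The zero-padding approach can be sketched as follows. This is a minimal illustration assuming an orthonormal DCT; the gain factor (new size divided by old size) is an assumption that keeps the mean intensity unchanged under that normalisation, and the names `inverse_dct2` and `upsample_block` are hypothetical.

```python
import math


def inverse_dct2(coeffs):
    """2-D inverse of an orthonormal DCT-II for a square block."""
    n = len(coeffs)
    basis = [[math.sqrt((1.0 if k == 0 else 2.0) / n)
              * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
              for x in range(n)] for k in range(n)]
    tmp = [[sum(basis[k][x] * coeffs[k][l] for k in range(n))
            for l in range(n)] for x in range(n)]
    return [[sum(tmp[x][l] * basis[l][y] for l in range(n))
             for y in range(n)] for x in range(n)]


def upsample_block(coeffs, size):
    """Place coeffs in the upper left of a size x size block of zeros,
    then reconstruct with a size x size inverse DCT."""
    n = len(coeffs)
    gain = size / n  # assumed rescaling for the orthonormal transform
    padded = [[0.0] * size for _ in range(size)]
    for k in range(n):
        for l in range(n):
            padded[k][l] = coeffs[k][l] * gain
    return inverse_dct2(padded)


# An 8x8 coefficient block holding only a DC term (a flat block of 100)
# is reconstructed as a flat 9x9 block.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0] = 800.0
bigger = upsample_block(coeffs, 9)
```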

3.3 Reconstruction at reduced resolution - Second Preferred Embodiment

Random single tiles can be decoded at variable resolution from the compressed bit stream for the second preferred embodiment. Fully reconstructing a given tile requires decoding not only the code for the given tile but also the neighbouring tiles, in order to extract the overlap row and column information. Once the subband tile data is decoded, including the overlap data, the tile can be inverse transformed with a one-sample overlap inverse DWT to generate the output tile. Further, a partial inverse DWT can be used to generate the tile at reduced resolutions.

However, the tile can be reconstructed approximately by estimating the overlap data from the information in the given tile code only. The required extra row and column of data, for the DC subband, is preferably duplicated from the first row and column of data (available) in the DC subband. The extra data for the HL and LH subbands at each level is preferably set to zero.
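The approximation above can be sketched as follows. This is an illustrative sketch only: subbands are represented as lists of rows, the overlap data is assumed to sit before the first row and column of each subband, and the HL subbands are assumed to take an extra row while the LH subbands take an extra column. These layout choices and the name `estimate_overlap` are assumptions, not details confirmed by the text at this point.

```python
def estimate_overlap(dc, hl_levels, lh_levels):
    """Estimate missing overlap data from the tile's own subbands.

    dc        : DC subband as a list of rows
    hl_levels : list of HL subbands, one per level
    lh_levels : list of LH subbands, one per level
    """
    # DC subband: duplicate the first column, then the first row.
    dc_cols = [[row[0]] + list(row) for row in dc]
    dc_ext = [list(dc_cols[0])] + dc_cols

    # HL subbands: the extra row is set to zero (assumed orientation).
    hl_ext = [[[0] * len(hl[0])] + [list(r) for r in hl] for hl in hl_levels]

    # LH subbands: the extra column is set to zero (assumed orientation).
    lh_ext = [[[0] + list(r) for r in lh] for lh in lh_levels]

    return dc_ext, hl_ext, lh_ext


dc_ext, hl_ext, lh_ext = estimate_overlap(
    [[1, 2], [3, 4]], [[[5, 6], [7, 8]]], [[[9, 10], [11, 12]]])
```

The extended subbands could then be passed to the one-sample overlap inverse DWT in place of data decoded from neighbouring tiles.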

Multiple random tiles are preferably decoded in a similar manner. For example, a subset of the tiles in an image compressed with the second preferred embodiment is preferably decoded using the flow diagram of "3.0 Overview of the decoder". In this case the only tiles that are considered (in Step 320) are the tiles of interest. Preferably the tiles of interest are decoded in raster scan order. That is, the tiles of interest are ordered according to raster scan order, and decoded in this order. In Step 355, if the necessary data has been buffered then it is extracted. Preferably, if the required data has not been buffered in Step 375, the required extra row and column of DC data is duplicated from the first row and column of DC data for the current block, as described above, while the required extra data for the HL and LH subbands at each level is set to zero. In Step 375, if a later block will require the data, the last row of each of the HL subbands (at each level) and of the DC subband, and the last column of each of the LH subbands (at each level) and of the DC subband, are buffered. This extra column data is the overlapping data of the block to the right of the current block, and the extra row data is the overlapping data of the block below the current block. When decoding the block to the right of the current block, at some later stage, this extra column data is extracted at Step 355. When decoding the block below the current block, at some later stage, this extra row data is extracted at Step 355. In Step 360 the inverse one-sample overlap two dimensional DWT of the tile is performed.
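The buffering decisions of Steps 355 and 375 for a set of tiles of interest can be sketched as a small planning routine. This sketch does not decode anything: it only records, for each tile visited in raster scan order, whether its overlap column and row would come from buffered data or would have to be estimated. Tiles are identified by hypothetical (row, column) index pairs, and the function name `plan_overlap_sources` is an assumption made for illustration.

```python
def plan_overlap_sources(tiles_of_interest):
    """Return [(tile, column_source, row_source), ...] in raster order."""
    wanted = set(tiles_of_interest)
    buffered_cols = set()  # tiles whose overlap column has been buffered
    buffered_rows = set()  # tiles whose overlap row has been buffered
    plan = []
    # Sorting (row, col) pairs gives raster scan order.
    for row, col in sorted(wanted):
        # Step 355: use buffered data if present, otherwise estimate.
        col_src = "buffered" if (row, col) in buffered_cols else "estimated"
        row_src = "buffered" if (row, col) in buffered_rows else "estimated"
        plan.append(((row, col), col_src, row_src))
        # Step 375: buffer only what a later tile of interest will need.
        if (row, col + 1) in wanted:
            buffered_cols.add((row, col + 1))  # last column -> right tile
        if (row + 1, col) in wanted:
            buffered_rows.add((row + 1, col))  # last row -> tile below
    return plan


for entry in plan_overlap_sources([(1, 1), (0, 0), (0, 1)]):
    print(entry)
```

Because tiles are visited in raster order, the left and upper neighbours of a tile are always visited before the tile itself, which is why a single forward pass is enough to decide what to buffer.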

The foregoing describes only a small number of embodiments of the present invention; however, modifications and/or changes can be made thereto without departing from the scope and spirit of the invention. The present embodiments are, therefore, to be considered in all respects to be illustrative and not restrictive.

In the context of this specification, the word "comprising" means "including principally but not necessarily solely" or "having" or "including" and not "consisting only of". Variations of the word comprising, such as "comprise" and "comprises", have corresponding meanings.


Claims

1. A method of representing a digital image as a tiled multi-resolution representation, wherein said digital image comprises a number of said tiles of pixel coefficients, and each said tile is of a predetermined size and said size of each said tile is such that all the pixel coefficients of each said tile is able to be held in a local memory, the method performing the following steps for each tile in turn: transforming a current said tile of pixel coefficients to derive a current tile of transform coefficients; and coding said current tile of transform coefficients to provide a multi-resolution representation of said current tile such that a decoder is capable of decoding a resolution level of a said current tile substantially independently of any higher resolution level of said current tile and independently of any other tile; wherein during said coding substantially all data associated with said current tile is stored in said local memory.

2. A method as claimed in claim 1, wherein said digital image comprises a plurality of partially overlapping tiles.

3. A method as claimed in claim 1, wherein the size of each tile is within a range between 64 by 64 pixel coefficients and 519 by 519 pixel coefficients.

4. A method as claimed in claim 1, wherein the size of substantially all of said tiles is 257 by 257 pixel coefficients.

5. A method as claimed in claim 1, wherein the size of said tiles is 256 by 256 pixel coefficients.

6. A method as claimed in claim 1, wherein said transforming step uses a discrete wavelet transform.

7. A method as claimed in claim 1, wherein said transforming step uses a discrete cosine transform.

8. A method as claimed in claim 1, wherein said transforming step uses a single sample overlap DWT, wherein the tiles are overlapped by one sample.

9. A method as claimed in claim 1, wherein each tile comprises a plurality of 8 x 8 blocks of pixels and said transforming step performs respective DCT transforms on said blocks of pixels of each tile.

10. A method as claimed in claim 1, wherein said method further comprises the step of generating a codestream of said coded tiles, wherein said coded tiles are arranged in said tile order sequence and resolution levels within each tile in level order sequence.

11. A method as claimed in claim 10, wherein said generating step further includes generating a header for the codestream comprising pointer data pointing to the location of said coded tiles in the codestream to enable random access to said coded tiles.

12. A method as claimed in claim 1, wherein said coding step comprises coding regions of coefficients of said resolution levels in an embedded manner producing code segments that are progressive in quality.

13. A method as claimed in claim 1, wherein said coding step comprises a sub-bitplane embedded quadtree coding process.

14. A method as claimed in claim 13, wherein the method further comprises the following step of: truncating code segments of said coded tiles at various termination points.

15. A method as claimed in claim 1, wherein said method further comprises the step of: arranging said coded tiles to provide said tiled multi-resolution representation of said digital image.

16. A method as claimed in claim 15, wherein said arranging step comprises arranging into a bitstream each said coded tile in a contiguous manner to effect substantially random access to portions of the digital image.

17. A method as claimed in claim 16, wherein said bitstream includes a plurality of pointers to each resolution for each coded tile of the digital image.

18. A method of decoding a coded representation of a digital image, wherein said coded representation comprises a number of coded tiles of transform coefficients, and each said tile is able to be held in a local memory, the method comprises the steps of: selecting one or more tiles and selecting a resolution level thereof; processing said coded representation, wherein said processing step performs the following sub-steps for each selected coded tile in turn: decoding in turn each resolution level up to and including said selected resolution level of a current coded tile of transform coefficients substantially independently of any higher resolution level of said current tile and independently of any other tile to provide a current tile of transform coefficients; inverse transforming said current tile of transform coefficients to derive a current tile of pixel coefficients; wherein during said decoding substantially all data associated with said current tile is stored in said local memory.

19. A method as claimed in claim 18, wherein the size of each tile is within a range between 64 by 64 pixel coefficients and 519 by 519 pixel coefficients.

20. A method as claimed in claim 18, wherein the size of substantially all of said tiles is 257 by 257 pixel coefficients.

21. A method as claimed in claim 18, wherein the size of said tiles is 256 by 256 pixel coefficients.

22. A method as claimed in claim 18, wherein said inverse transforming step uses a discrete wavelet transform.

23. A method as claimed in claim 18, wherein said inverse transforming step uses a discrete cosine transform.

24. A method as claimed in claim 18, wherein said method further comprises arranging said tiles of pixel coefficients to form a resolution of said digital image or portion thereof.

25. Apparatus for representing a digital image as a tiled multi-resolution representation, wherein said digital image comprises a number of said tiles of pixel coefficients, and each said tile is of a predetermined size and said size of each said tile is such that all the pixel coefficients of each said tile is able to be held in a local memory, the apparatus comprising: means for transforming a current said tile of pixel coefficients to derive a current tile of transform coefficients; and means for coding said current tile of transform coefficients to provide a multi-resolution representation of said current tile such that a decoder is capable of decoding a resolution level of a said current tile substantially independently of any higher resolution level of said current tile and independently of any other tile; wherein during said coding substantially all data associated with said current tile is stored in said local memory.

26. Apparatus as claimed in claim 25, wherein said digital image comprises a plurality of partially overlapping tiles.

27. Apparatus as claimed in claim 25, wherein said transforming means uses a single sample overlap DWT, wherein the tiles are overlapped by one sample.

28. Apparatus as claimed in claim 25, wherein each tile comprises a plurality of 8 x 8 blocks of pixels and said transforming means performs respective DCT transforms on said blocks of pixels of each tile.

29. Apparatus as claimed in claim 25, wherein said apparatus further comprises: means for generating a codestream of said coded tiles, wherein said coded tiles are arranged in said tile order sequence and resolution levels within each tile in level order sequence.

30. Apparatus as claimed in claim 29, wherein said generating means includes means for generating a header for the codestream comprising pointer data pointing to the location of said coded tiles in the codestream to enable random access to said coded tiles.

31. Apparatus as claimed in claim 25, wherein said coding means comprises means for coding regions of coefficients of said resolution levels in an embedded manner producing code segments that are progressive in quality.

32. Apparatus as claimed in claim 25, wherein said coding means comprises means for sub-bitplane embedded quadtree coding.

33. Apparatus as claimed in claim 32, wherein the apparatus further comprises: means for truncating code segments of said coded tiles at various termination points.

34. Apparatus as claimed in claim 25, wherein said apparatus further comprises: means for arranging said coded tiles to provide said tiled multi-resolution representation of said digital image.

35. Apparatus as claimed in claim 34, wherein said arranging means comprises means for arranging into a bitstream each said coded tile in a contiguous manner to effect substantially random access to portions of the digital image.

36. Apparatus as claimed in claim 35, wherein said bitstream includes a plurality of pointers to each resolution for each coded tile of the digital image.

37. Apparatus for decoding a coded representation of a digital image, wherein said coded representation comprises a number of coded tiles of transform coefficients, and each said tile is able to be held in a local memory, the apparatus comprising: means for selecting one or more tiles and selecting a resolution level thereof; means for processing said coded representation, wherein said processing means performs the following steps for each selected coded tile in turn: decoding in turn each resolution level up to and including said selected resolution level of a current coded tile of transform coefficients substantially independently of any higher resolution level of said current tile and independently of any other tile to provide a current tile of transform coefficients; inverse transforming said current tile of transform coefficients to derive a current tile of pixel coefficients; wherein during said decoding substantially all data associated with said current tile is stored in said local memory.

38. Apparatus as claimed in claim 37, wherein said apparatus further comprises means for arranging said tiles of pixel coefficients to form a resolution of said digital image or portion thereof.

39. A computer program for representing a digital image as a tiled multi-resolution representation, wherein said digital image comprises a number of said tiles of pixel coefficients, and each said tile is of a predetermined size and said size of each said tile is such that all the pixel coefficients of each said tile is able to be held in a local memory, the computer program comprising: code for transforming a current said tile of pixel coefficients to derive a current tile of transform coefficients; and code for coding said current tile of transform coefficients to provide a multi-resolution representation of said current tile such that a decoder is capable of decoding a resolution level of a said current tile substantially independently of any higher resolution level of said current tile and independently of any other tile; wherein during said coding substantially all data associated with said current tile is stored in said local memory.

40. A computer program as claimed in claim 39, wherein said digital image comprises a plurality of partially overlapping tiles.

41. A computer program as claimed in claim 39, wherein said transforming code uses a single sample overlap DWT, wherein the tiles are overlapped by one sample.

42. A computer program as claimed in claim 39, wherein each tile comprises a plurality of 8 x 8 blocks of pixels and said transforming code performs respective DCT transforms on said blocks of pixels of each tile.

43. A computer program as claimed in claim 39, wherein said apparatus further comprises: code for generating a codestream of said coded tiles, wherein said coded tiles are arranged in said tile order sequence and resolution levels within each tile in level order sequence.

44. A computer program as claimed in claim 43, wherein said generating code includes code for generating a header for the codestream comprising pointer data pointing to the location of said coded tiles in the codestream to enable random access to said coded tiles.

45. A computer program as claimed in claim 39, wherein said coding code comprises code for coding regions of coefficients of said resolution levels in an embedded manner producing code segments that are progressive in quality.

46. A computer program as claimed in claim 39, wherein said coding code comprises code for sub-bitplane embedded quadtree coding.

47. A computer program as claimed in claim 46, wherein the computer program further comprises: code for truncating code segments of said coded tiles at various termination points.

48. A computer program as claimed in claim 39, wherein said computer program further comprises: code for arranging said coded tiles to provide said tiled multi-resolution representation of said digital image.

49. A computer program as claimed in claim 48, wherein said arranging code comprises code for arranging into a bitstream each said coded tile in a contiguous manner to effect substantially random access to portions of the digital image.

50. A computer program as claimed in claim 49, wherein said bitstream includes a plurality of pointers to each resolution for each coded tile of the digital image.

51. A computer program for decoding a coded representation of a digital image, wherein said coded representation comprises a number of coded tiles of transform coefficients, and each said tile is able to be held in a local memory, the computer program comprising: code for selecting one or more tiles and selecting a resolution level thereof; code for processing said coded representation, wherein said processing code performs the following steps for each selected coded tile in turn: decoding in turn each resolution level up to and including said selected resolution level of a current coded tile of transform coefficients substantially independently of any higher resolution level of said current tile and independently of any other tile to provide a current tile of transform coefficients; inverse transforming said current tile of transform coefficients to derive a current tile of pixel coefficients; wherein during said decoding substantially all data associated with said current tile is stored in said local memory.

52. A computer program as claimed in claim 51, wherein said computer program further comprises code for arranging said tiles of pixel coefficients to form a resolution of said digital image or portion thereof.

53. A method of encoding a digital image, the method substantially as described herein with reference to Fig. 2 and any one or more of Figs. 4A, 4B, 5, or 6, 7, 8, 9, 10, 11, and 12 of the accompanying drawings.

54. A method of decoding an encoded image, the method substantially as described herein with reference to Fig. 3 of the accompanying drawings.

55. Apparatus for encoding a digital image, the apparatus substantially as described herein with reference to Figs. 1 and 2, and any one or more of Figs. 4A, 4B, 5, or 6, 7, 8, 9, 10, 11, and 12 of the accompanying drawings.

56. Apparatus for decoding a digital image, the apparatus substantially as described herein with reference to Figs. 1 and 3 of the accompanying drawings.

57. A computer readable medium comprising a computer program for encoding a digital image, the computer program substantially as described herein with reference to Figs. 1 and 2, and any one or more of Figs. 4A, 4B, 5, or 6, 7, 8, 9, 10, 11, and 12 of the accompanying drawings.

58. A computer readable medium comprising a computer program for decoding a digital image, the computer program substantially as described herein with reference to Figs. 1 and 3 of the accompanying drawings.

Dated this ELEVENTH day of MARCH 2002
CANON KABUSHIKI KAISHA
Patent Attorneys for the Applicant
SPRUSON & FERGUSON