WO2000049571A2 - Method and system of region-based image coding with dynamic streaming of code blocks
- Publication number
- WO2000049571A2 (PCT/CA2000/000134)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- region
- data
- processing
- image
- coding
- Prior art date
Classifications
- All classifications fall under G06T (image data processing or generation, in general) and H04N19 (methods or arrangements for coding, decoding, compressing or decompressing digital video signals):
- G06T9/20—Image coding; contour coding, e.g. using detection of edges
- G06T9/40—Image coding; tree coding, e.g. quadtree, octree
- H04N19/115—Selection of the code volume for a coding unit prior to coding
- H04N19/117—Filters, e.g. for pre-processing or post-processing
- H04N19/122—Selection of transform size, e.g. 8x8 or 2x4x8 DCT; selection of sub-band transforms of varying structure or type
- H04N19/126—Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
- H04N19/127—Prioritisation of hardware or computational resources
- H04N19/129—Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
- H04N19/14—Coding unit complexity, e.g. amount of activity or edge presence estimation
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/147—Data rate or code amount at the encoder output according to rate distortion criteria
- H04N19/152—Data rate or code amount at the encoder output by measuring the fullness of the transmission buffer
- H04N19/17—Adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
- H04N19/176—Adaptive coding characterised by the coding unit, the region being a block, e.g. a macroblock
- H04N19/18—Adaptive coding characterised by the coding unit, the unit being a set of transform coefficients
- H04N19/186—Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
- H04N19/48—Compressed domain processing techniques other than decoding, e.g. modification of transform coefficients, variable length coding [VLC] data or run-length data
- H04N19/60—Transform coding
- H04N19/63—Transform coding using sub-band based transforms, e.g. wavelets
- H04N19/635—Sub-band based transform coding characterised by filter definition or implementation details
- H04N19/647—Sub-band based transform coding characterised by ordering of coefficients or of bits for transmission, using significance based coding, e.g. Embedded Zerotrees of Wavelets [EZW] or Set Partitioning in Hierarchical Trees [SPIHT]
- H04N19/70—Syntax aspects related to video coding, e.g. related to compression standards
Definitions
- the present invention relates generally to image coding, and more particularly to the compression and streaming of scalable and content-based, randomly accessible digital still images.
- JPEG 2000 should also offer unprecedented content based accessibility of the compressed format to support applications of various needs. It is highly preferred and advantageous that image content be accessed, manipulated, exchanged, and stored in compact form.
- In order for JPEG 2000 to be the standard coding foundation of new-generation image processing systems, it must provide for efficient coding of different types of still images (bi-level, gray-level, color) with different characteristics (natural images, scientific, medical imagery, rendered graphics, text, etc.) within a unified system. In addition to providing low bit rate operation with quality performance superior to existing standards, this new system should include many modern features as listed in the JPEG 2000 requirement document.
- RICS is Digital Accelerator Corporation's (DAC) Region-based Image Coding System.
- RICS has not only achieved rate-distortion performance competitive with the best known compression techniques, but also demonstrates a high degree of openness and flexibility that will accommodate most well known algorithms as well as new algorithms yet to be implemented.
- The features supported by RICS cover almost all of those listed in the JPEG 2000 Requirements document.
- the RICS architecture attempts to be an integrated platform that supports and facilitates a variety of applications that may have different characteristics and requirements. For example, providing efficient lossless and lossy compression in a single code stream; efficient processing of compound documents containing both bi-level (text) and color imagery; progressive transmission by pixel accuracy and/or by resolution; random access of arbitrary shaped regions; and so on.
- The new JPEG 2000 standard is intended to be a dynamic, rather than static, suite of coding tools that supports the new generation of imagery applications and, at the same time, keeps abreast of the progress of technology.
- The mathematical foundation for the leading candidates of new image coding methods (most notably the various types of multi-resolution analysis techniques, such as wavelet transforms) is relatively young and still under intensive investigation. If an architectural design is based on or restricted to one or several particular existing coding methods, it may become outdated very quickly.
- The RICS architecture organizes the image coding process into a set of functionally separable modules, such that each module can be developed and optimized individually. Furthermore, all modules in the system are designed to be functionally orthogonal to each other, in the sense that a new algorithm can effectively replace the base algorithm of any specific module without affecting the functionality of other modules.
- Content based accessibility is becoming an important feature in supporting applications such as multimedia database query, internet server-client interaction, content production, remote diagnostics, and interactive entertainment.
- Content-based accessibility requires that semantically meaningful visual objects be used as the basis for image data representation, explanation, manipulation, and exchange.
- With images being represented in compressed format, it is desirable to perform retrieval operations directly in the code space without requiring image reconstruction.
- Any search algorithm that requires image reconstruction will be infeasible from a practical viewpoint because of the huge number of images in most image databases.
- The previous JPEG and many existing coding techniques focus primarily on the issue of compression ratio, paying minimal attention to the need for content-based image retrieval.
- In the case of a user-specified Region of Interest (ROI), regions are usually primitive geometric shapes, such as rectangles or circles.
- The regions that define visual objects are usually of arbitrary shapes. Regions can also be generated as the result of certain mathematical operations or transform properties (e.g. significance of transformation coefficients) used to partition the image into disjoint regions of various types (e.g. tiles, hierarchical blocks, or arbitrary shapes).
- In designing the RICS system, DAC considered carefully the distinction between the concept of an object and that of a region.
- RICS is open to a very general definition of region. A region is a 2D spatial entity with a purely syntactic contribution to a code stream. In contrast, an object may contain semantic information. Therefore, the region is perceived as a more elementary and reliable description than the object.
- RICS is region based, not object based. RICS supports a rich set of region types, from primitive geometrical shapes, tiles, hierarchical blocks, to most general arbitrary shapes.
- the region based coding strategy effectively supports the content-based accessibility of imagery data. Specifically, this ability enables random access to the code stream as well as providing a processing channel for user defined regions of interest or tile based techniques. Supporting MPEG-4 object based accessibility is one of the main objectives of the RICS design. Furthermore, region-based coding provides a natural bridge for 'transcodability' with JPEG and JBIG.
- Scalable image coding involves generating a coded representation (code stream) in a manner that facilitates the reconstruction of the image at multiple resolutions or quality levels.
- the control of scalability should be centralized in a single module as the last stage on the encoder side right before the code stream is fed to the communication channel. Furthermore, it is desirable that the scalability control module can completely handle the required processing locally, without any further request propagating back to any previous stages in the encoder. In this way the scalability control module avoids the need for multi-pass computation.
- the RICS architecture is designed to support three types of scalability: scalability in terms of pixel precision, spatial resolution, and regions.
- the input image data after certain transformation, becomes a set of image primitives.
- This set of primitives can be wavelet transform coefficients, DCT coefficients, other transform coefficients, or even raw image data.
- the image primitives are grouped into regions. Region definition can come from user defined ROI, from other application modules, or from certain automatic segmentation algorithms running in the primitive space.
- Each region contains one or more independent coding units (ICUs).
- the primitives in an ICU are encoded and decoded independently, without reference to primitives of any other ICU. This procedure is called the intra-region coding.
- the outcome of an ICU operation is a code block.
- a multiplexer (MUX) is employed to integrate the code blocks into the final code stream.
- Figure 1 is a simplified RICS block diagram.
- Figure 2 is a detailed RICS block diagram.
- Figure 3 is subband decomposition schemes.
- Figure 4 is a Wavelet Filter Library.
- Figure 5 is a performance comparison of YUV and standardized color systems.
- Figure 6 is geometric shapes supported by RICS.
- Figure 7 is a hierarchical partitioning.
- Figure 8 is the relationship between regions and objects.
- Figure 9 is threshold masks obtained from Level 1 Lena Wavelet Decomposition.
- Figure 10 is a Level 1 common mask obtained by thresholding the combined data set.
- Figure 11 is individual region masks obtained by separating the raw common mask.
- Figure 12 is the spectra of the individual region masks obtained from the raw common mask.
- Figure 13 is the spectral magnitude of the zigzag spectrum as a function of position.
- Figure 14 is common mask spectrum filter sizes and captured coefficient numbers.
- Figure 15 is a coefficient banding concept used to quantize the common mask spectrum.
- Figure 16 is Quantization Band Sizes for Common Mask Spectrum of size 128 by 128.
- Figure 17 is Spectral Quantization Band Sizes for Other Common Mask Sizes.
- Figure 18 is a Mask Overhead for Grayscale Images.
- Figure 19 is a low pass Common Mask Approximation for general coefficient classification.
- Figure 20 is a Mask Quantization following the Spectral Content for Bit Allocation.
- Figure 21 is Translated Common Masks for Remaining Resolution Levels.
- Figure 22 is the three types of ICU structures.
- Figure 23 is a Pavement of a region using type-1 ICUs.
- Figure 24 is an Embedded quad tree flowchart.
- Figure 25 is VLC Codes for EQW.
- Figure 26 is the transcode with JBIG.
- Figure 27 is the 'trans-out' and 'trans-in' modes for transcoding with JBIG.
- Figure 28 is Wavelet Mallat Decomposition for Lossy Color Transform Data.
- Figure 29 is Level Priority Processing Order for Each Channel (Lossy Case).
- Figure 30 is Level Priority Processing Order (Lossy).
- Figure 31 is Level Priority Color Interleave Processing Order (Lossy).
- Figure 32 is Level Priority Color Processing Order (Lossless).
- Figure 33 is Level Priority Color Interleave Processing Order (Lossless).
- Figure 34 is Typical MUX List Data Structure.
- Figure 35 is SNR Progressive Bit Budget Distribution Scheme.
- Figure 36 is MUX List Overhead for Y-Channel 8-Level Decomposition.
- Figure 37 is MUX List Overhead for Square Images of Various Sizes.
- Figure 38 is MUX List Overhead for Square Images of Various Sizes.
- Figure 39 is General Region Level Priority Color Processing Order (Lossy).
- Figure 40 is Region Level Priority Color Processing Order (Lossy).
- Figure 41 is Color Interleave Region Level Priority Processing Order (Lossy).
- Figure 42 is Region Level Priority Color Processing Order (Lossless).
- Figure 43 is Transparent Region Level Priority Color Processing Order (Lossy).
- Figure 44 is Transparent Region Level Priority Color Processing Order (Lossless).
- Figure 45 is a graph representation of the transparent region level in color mode for the 4 DCT region channels.
- Figure 46 is a graph representation of the region priority level in color mode for the 4 DCT channels.
- Figure 47 is a graph representation of the absolute region priority level color mode over the 4 DCT region channels.
- Figure 48 is a graph representation of scaled region priority level color mode of the 4 DCT channels.
- Figure 49 is a graph representation of the region percentage priority level in color mode over the 4 DCT region channels.
- Figure 50 is a graph representation of the mixed processing multiplexer mode of operation over the 4 DCT channels.
- Figure 51 is a diagram of the basic lossless header structure.
- Figure 52 is a diagram of the basic lossy header structure.
- Figure 53 is a diagram of the basic header tag structure.
- Figure 54 is a block diagram of the basic bit stream syntax for normal modes of operation in the lossy case.
- Figure 55 is a block diagram of basic bit stream syntax for region modes of operation.
- Figure 56 is a block diagram of the basic bit stream syntax for mixed modes of operation.
- Figure 57 is a block diagram of the JPEG transcode/decode method.
- Figure 58 is a block diagram of the JBIG transcode/decode method.
- Figure 59 is a block diagram of the modified post filtering procedure.
- Figure 60 shows three gradations of post processing in digital still images.
- Figure 61 is a representation of edge areas where the de-ringing filtering is selectively applied.
- Figure 62 is a table of Test Image: Camera (256x256 grayscale) and Test Image: hk (256x256 color).
- Figure 63 is a table of Test Image hk.raw results
- Figure 64 is a table of Test Image camera.raw (grayscale) results
- Figure 65 is a table of Test Image hk.raw (color image) results
- Figure 66 is a table of Camera.raw (part a) results.
- Figure 67 is a table of Camera.raw (part b) results.
- Figure 68 is a table of hk.raw results.
- Figure 69 is a table of quality versus iterations.
- Figure 70 is a table of quality versus iterations.
A Detailed Description of the Drawings
- the architecture of the RICS system can be described as the scheme of dynamic streaming of code blocks.
- the image primitives of ICUs generate unit code blocks.
- a code block is a logically independent unit of coding and decoding which does not rely on information contained in any other blocks. Compression is achieved in each block coder, and the coding efficiency of the block coders determines the overall efficiency of a RICS system.
- the openness of the system is reflected in the different coding algorithms that can be used to produce different code blocks.
- the system uses a multiplex structure (a MUX) to assemble the code blocks into the code stream and realize the bit rate allocation.
- The RICS architecture allows for flexibility in the areas of compactness and openness of the block coders while, at the same time, allowing the multiplexer to handle the various schemes of scalability and random accessibility to the code stream.
- a code block may correspond to an 8x8 tile (in the case of JPEG mode coding), a pyramid data structure in zero tree like coding schemes, a rectangular area in block based coding schemes, or an arbitrary shaped area in the raw image buffer.
- the independence of encoding and decoding is the primary requirement of a code block.
- a code block is the outcome of an intra-region coding.
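By way of illustration only, the following is a minimal sketch (not code from the patent; the callables transform, form_regions, encode_icu and mux are hypothetical placeholders) of how the code-block-plus-multiplexer organization described above fits together.

```python
# Hypothetical sketch of a RICS-style encoding pipeline as described above.
# Each callable stands in for one architectural module; real block coders and
# a real MUX are of course far more involved.
def encode_image(image, transform, form_regions, encode_icu, mux):
    """Transform, group into regions and ICUs, code each ICU, then multiplex."""
    primitives = transform(image)        # wavelet/DCT coefficients, or raw data (NULL transform)
    regions = form_regions(primitives)   # tiles, user ROIs, or automatic partitions
    code_blocks = []
    for region in regions:
        for icu in region:               # each independent coding unit (ICU)
            # An ICU is coded with no reference to any other ICU, so every
            # resulting code block can be decoded on its own.
            code_blocks.append(encode_icu(icu))
    # The multiplexer orders the code blocks and performs bit rate allocation,
    # producing a progressively transmittable, truncatable code stream.
    return mux(code_blocks)
```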
- Figure 1 illustrates the simplified RICS coding architecture. A more detailed diagram is shown in Figure 2. A detailed description of the function of each module is given in this document. Because RICS is intended to be an open architecture, DAC does not limit each module to any specific algorithm. Instead, any individual algorithm can be placed into a module as long as it meets the functional requirement of that module. Supporting algorithms are included for certain modules, which reflect preferred operation of the overall system.
Types of Input Image
- the RICS architecture supports the coding of three types of image data: grayscale, color, and bi-level images.
Transformations
- In a typical multi-resolution coding scheme, an image is transformed via a multi-resolution decomposition process.
- Transforms such as KL, wavelet, wavelet packet, lifting scheme, etc. can be placed in the transformation module. These transforms produce a set of decomposition coefficients {Cij} at different resolution levels and in different spatial orientations.
- The RICS architecture also supports the DCT or windowed Fourier transform as a transformation technique. This is mainly for transcodability with JPEG. It should be noted that the Fourier-based transforms have been studied for more than a century; their theory is relatively complete and their mathematical and physical properties are well understood. In particular, their translation, scaling, and rotation properties may be very useful for content-based retrieval computations.
- Wavelet transforms are relatively new, and many of their properties require further investigation. As a result, the support of DCT may have an impact beyond the sole backward-compatibility consideration.
- RICS also allows the NULL transformation (that is, no transformation is applied at all). In this case, an identity transformation is applied to the raw image data as the transformation step.
- the NULL transformation is useful in several instances. For example, it is usually not beneficial to apply DCT or wavelet transforms to bi-level images (text) for compression purpose.
- Another example is the residual images in video coding. The information in a residual image is the difference between video blocks and has a high frequency spectral content already. It is highly questionable whether it is beneficial at all to apply another mathematical decomposition (DCT or wavelet) to this type of data for the purpose of compact coding.
- the function of this module is to partition the coefficients produced by the transformation module into a number of spatial regions.
- RICS supports three region schemes: 1. automatic partition based on the preordering of the coefficients; 2. user-specified regions of interest (ROI), including object-related shapes; 3. simple partition (tiling).
- the first partition scheme is also referred to as the region hierarchy formation process.
- the region hierarchy formation process partitions the set into a number of hierarchically disjoint subsets according to certain definitions of significance.
- RICS can perform highly flexible progressive transmission modes of operation that depend on data priorities set up for the code stream.
- Scheme 2 deals with user specified ROI, typically primitive geometrical shapes, such as rectangles or circles, as well as object related shapes.
- Object related shapes could come from a variety of sources, such as user input, motion analysis in video compression, etc.
- Scheme 3 is a simple partition and requires minimal overhead for shape coding. This scheme is essential in the JPEG mode. Some wavelet-based compression techniques utilize this scheme to explore coding efficiency. This scheme offers very little support for content-based accessibility.
- Hierarchically disjoint regions can be used in combination with user-defined ROIs in still image processing and with objects in video processing. However, providing a fair user partition for detail information is difficult in still image compression. Automatic partitioning or preordering techniques can be performed to control user selections in both arbitrary and block-based modes of operation.
- the research presented in this document introduces a new multi-level control architecture for advanced multi-level image processing. The ability to perform a host of partitioning and preordering techniques in both normal and region based modes of operation is encompassed in this architecture. Both arbitrary and automatic region formation schemes are handled in the same manner at a high level.
Region Shape Coding
- Three types of region shape are supported in RICS: tiling, primitive geometric shapes, and arbitrary shapes. 1. Tiling requires only a small set of parameters to describe the configuration, especially in sequence-based processing modes. However, the sequences can be organized and presented in many ways when packing them into the final code stream.
- A rectangle can be defined by four integers: (x_min, y_min) and (x_max, y_max), or (x_min, y_min) and (width, height), etc.
- the region shape code is included in the code stream when the region definition is determined at the encoder. In the case of decoder specified regions, the region shape coding has a whole new meaning.
Intra-Region Coding
- The function of intra-region coding is to arrange the transformation data in an arbitrarily shaped region into a one-dimensional code stream. Regardless of the region definition scheme or the region coding technique, this streaming process requires an intermediate state where a control architecture can be designed to tailor the region channels, whether dealing with a bitmap mask, an auto-detection routine, or any of numerous other classification techniques. At the decoding end, the inverse routine generates the same mask to unpack the values from the one-dimensional code stream and place them back into the correct position in each region.
- Intra-region coding is completed in block coders. Different block coders can be used to produce the 1D code stream. For JPEG mode, the zigzag scanning/quantization routine is called to pack an 8x8 DCT block. For wavelet-based coding, both explicit quantization and implicit quantization schemes can be used. In particular, embedded zero tree and embedded quad tree can be used as implicit quantization schemes. Furthermore, most implicit quantization schemes are implicitly decodable. In dealing with bi-level images, a JBIG routine can be called as a block coder. This effect can be staged by calling an implicit quantization scheme using one bit plane. Alternatively, efficient JBIG routines can be called as block coders in an embedded coding scheme at each of the multiple bit planes.
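As a concrete example of the simplest of these block packers, the following generic sketch performs a JPEG-style zig-zag scan of an 8x8 DCT block; it illustrates the idea only and is not DAC's scanning/quantization routine.

```python
import numpy as np

def zigzag_scan(block):
    """Flatten an 8x8 DCT block into a 1D sequence in zig-zag order, so that
    low-frequency coefficients come first. Generic illustration only."""
    assert block.shape == (8, 8)
    # Visit anti-diagonals (constant i + j) in turn, alternating direction.
    order = sorted(((i, j) for i in range(8) for j in range(8)),
                   key=lambda ij: (ij[0] + ij[1],
                                   ij[0] if (ij[0] + ij[1]) % 2 else ij[1]))
    return np.array([block[i, j] for i, j in order])
```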
- The function of the multiplexer is to assemble the code blocks derived from different subbands and different regions, in proper order, into a single code stream. Due to the richness of region definition and block coding schemes, there are many ways to pack the code blocks. Different ways of merging the data lead to different transmission priorities. For all transmission ordering modes, the final code stream can be transmitted progressively and can be truncated at any desired place.
- the notion of using a multiplexer has been adopted in some standardized processes such as MPEG. In contrast, the use of a multiplexer in the design of a still image coding system is rare.
- The region-based coding strategy opens the opportunity to systematically explore the syntactic and semantic richness in code stream ordering and the transmission medium. As image segmentation techniques improve in the future, region shapes will become more accurate in tracking objects in the image. In that case, the multiplexer will not only work as a syntactic composer, but also impose semantic meaning on the code stream.
- RICS is a true open architecture. It supports not only the algorithms DAC has developed, but can also accommodate most existing well-known compression algorithms. Users may include their own functions, appropriate to their application, in a number of modules such as transformation, region definition, region shape coding, and intra-region coding, all under the MUX control architecture. It is also inherently flexible to new technological advances.
- the RICS architecture is designed to support three categories of transformation: the DCT, wavelet transforms, and a special NULL transform.
- the RICS architecture supports the DCT transforms as defined in the baseline JPEG.
Wavelet Transform
- The RICS architecture supports various kinds of subband decomposition schemes, including the three schemes in Figure 3.
- The present version of RICS uses a total of 34 types of wavelet kernels, as given in the table in Figure 4.
- These include, for example, the biorthogonal filter Bior9-15 listed in the table.
- Both low pass filtering and high pass filtering are done using convolution.
- the wavelet coefficients are down sampled by 2. This process is repeated using the low pass part until the desired decomposition level is reached.
- For reconstruction, quadrature mirror filters (QMF) for low pass and high pass are used, and the coefficients are up-sampled by 2.
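For illustration, here is a minimal one-level sketch of the convolve-and-downsample analysis and the QMF upsample-and-filter synthesis just described. The Haar taps are placeholders chosen for brevity; they merely stand in for the kernels of the filter library in Figure 4.

```python
import numpy as np

LO = np.array([1.0, 1.0]) / np.sqrt(2)   # analysis low-pass taps (Haar, for brevity)
HI = np.array([-1.0, 1.0]) / np.sqrt(2)  # analysis high-pass taps
G0, G1 = LO, HI[::-1]                    # QMF synthesis taps (time-reversed)

def analyze(x):
    """Filter with the analysis taps, then down-sample each output by 2."""
    return np.convolve(x, LO)[1::2], np.convolve(x, HI)[1::2]

def synthesize(low, high):
    """Up-sample by 2 (zero insertion), filter with the QMF pair, and sum."""
    n = 2 * len(low)
    up_lo, up_hi = np.zeros(n), np.zeros(n)
    up_lo[::2], up_hi[::2] = low, high
    return np.convolve(up_lo, G0)[:n] + np.convolve(up_hi, G1)[:n]

# Repeating analyze() on the low-pass output yields the multi-level decomposition.
x = np.random.rand(256)
low, high = analyze(x)
assert np.allclose(synthesize(low, high), x)
```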
- The RICS system uses the Karhunen-Loeve transform for color transformation. Following the notation of statistics, we term this process color standardization.
- the KL color transform requires more computation than YUV transform.
- Figure 5 shows the PSNR comparison of the two color transforms. More test results are also available from DAC.
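A generic sketch of such a Karhunen-Loeve color standardization step is given below; it is plain PCA on the RGB channel covariance, offered as an assumption-laden illustration rather than DAC's actual implementation.

```python
import numpy as np

def kl_color_transform(rgb_image):
    """Decorrelate the three color channels with the eigenvectors of their
    covariance matrix (a generic KL/PCA step; illustrative only).

    rgb_image: array of shape (H, W, 3). Returns the transformed channels plus
    the basis and channel means needed to invert the step.
    """
    h, w, _ = rgb_image.shape
    samples = rgb_image.reshape(-1, 3).astype(np.float64)
    mean = samples.mean(axis=0)
    centered = samples - mean
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    basis = eigvecs[:, np.argsort(eigvals)[::-1]]   # strongest component first
    return (centered @ basis).reshape(h, w, 3), basis, mean

def inverse_kl(transformed, basis, mean):
    h, w, _ = transformed.shape
    return (transformed.reshape(-1, 3) @ basis.T + mean).reshape(h, w, 3)
```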
Region Definition and Shape Coding
- The notion of region processing plays a fundamental role in the operation of RICS.
- The choice of region-based coding is motivated by many application needs such as content-based retrieval, interactive multimedia applications, graphics object and image composition, and coding of dynamic objects in video compression.
- Region coding is divided into shape coding and intra-region coding (the content coding). Section 3 describes the various schemes for region definition and shape coding.
- the intra-region coding is discussed below.
- region definition does not have to be standardized.
- The schemes for forming regions that JPEG 2000 will need to support should be defined clearly.
- the RICS supports three types of region definition.
- Region definition is an optional step: an image can be coded without specifying any region (the non-region mode). In this case, the entire image area is considered as a single region.
- tiling block size may vary from one subband to another.
- The RICS architecture has a sound operational foundation. It can be operated in both tile-based and arbitrary processing modes, with optional arbitrary region formation schemes.
- Primitive geometrical shapes are ideal for supporting user interaction (i.e. user-specified ROIs) and for processing compound documents where the text areas can be well covered with one or more rectangular boxes.
- Geometric shapes currently supported by RICS are listed in the Table in Figure 6.
- Let C0 be the set of image primitives. Given an ordering relationship ≻, a generic binary partition of C0 into C10 and C11 is defined by the following conditions.
- Each of the newly generated subsets can be further partitioned into two subsets, resulting in a hierarchical structure to represent the partition (see Figure 7).
- A useful ordering relation popular in the compression community is "more significant than": a ≻ b means a is more significant than b.
- the definition of significance can be very general. One definition may be used to create an ordering, or several definitions are combined together to generate an ordering. This can be illustrated by the following example shown in Figure 8.
- the region masks shown in the leftmost image are generated by considering the magnitude of wavelet transform coefficients as the measure of significance.
- the region mask in the middle image is an object shape generated by another definition of importance that, for example, can be the intensive dynamic region in a video frame.
- the rightmost image shows the intersection of the two region masks, which is a new region partitioned according to a multiple definition of significance.
- the ordering relation is simply the numeric ">" relation.
- a set of thresholds is sufficient.
- The first threshold, say T0, separates the initial set C0 into two subsets, C10 and C11.
- Subsequent threshold values can be specified to recursively partition the new subsets.
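A minimal sketch of this recursive thresholding, assuming coefficient magnitude as the significance measure, might look as follows (illustrative only).

```python
def threshold_partition(coefficients, thresholds):
    """Split a coefficient set with a list of thresholds: each threshold peels
    the currently 'significant' elements off the remaining set, so the result
    is one subset per threshold plus a final remainder. Illustrative sketch."""
    subsets, remaining = [], list(coefficients)
    for t in thresholds:
        significant = [c for c in remaining if abs(c) >= t]
        remaining = [c for c in remaining if abs(c) < t]
        subsets.append(significant)
    subsets.append(remaining)
    return subsets

# Example: T0 = 20 splits C0 into C10 (>= 20) and C11 (< 20); T1 = 5 splits C11.
parts = threshold_partition([40, -3, 18, 7, -25, 1], thresholds=[20, 5])
```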
- The above partition scheme accurately produces a number of disjoint subsets which satisfy the strict ordering relationship ≻. However, it usually produces many small, scattered regions. Consequently, the representation of region shapes for these subsets can be potentially expensive. From the representation point of view, a smaller number of smoothly shaped regions is desirable. In the following, a less strict partition scheme is defined, which takes the above partition scheme as a special case.
- This definition defines a partition in which no element in C1 precedes any element of C0.
- Under this scheme, some elements originally belonging to C1 may now be placed in C0, but no element originally belonging to C0 can be placed in C1.
- The subset C0 will be enlarged, and the region shape of C0 will be smoother.
- the enlargement of C 0 is controlled to such an extent that the loss of classification accuracy is limited to a reasonably low level.
- Another scheme, symmetric to the one above, can be defined similarly for the following condition.
- the wavelet transform is used to decompose the original image information into a multi-resolution hierarchy of data that is more suitable for compression.
- The result of completing one pass of the 2D wavelet transform is one low-pass part (LL1) and three high-pass parts (HL1, LH1 and HH1).
- the transformation procedure is repeated using the low pass LL part as the starting point for each succeeding pass.
- the LL part is an approximation of the original image at a lower resolution.
- The HL parts carry the vertical, the LH parts the horizontal, and the HH parts the diagonal edge information.
- the level of detail information contained in any particular orientation is specific to the local response generated in the choice of wavelet kernel.
- the original image is transformed into a unique band pass response hierarchy of detail information. In any compression process, it is the highest energy coefficients that are considered first for making a contribution to the reconstructed image. It is this transformation property that is used to partition the coefficients into regions of interest corresponding to high energy locations in the original image space.
- the magnitude of the coefficients in a particular subband is related to the response that the original image information has to the transform kernel. Sharp edges in the original image (or a lower resolution LL part) are indicated by an increased kernel response in terms of coefficient magnitude, in the general vicinity of a specified area of concern. Other image changes that are not as abrupt will translate into a gradual response that is less significant. In those areas where little or no change occurs, the wavelet transform will indicate little or no response.
Level 1 Kernel Response Threshold Images
- DAC's automatic region formation scheme makes use of coefficient magnitude information to categorize the data.
- The starting point for the procedure is the highest resolution level of the wavelet transform hierarchy, looking at the three detail data sets (HL1, LH1, HH1).
- a 2 bit representation of each detail orientation for a 256 by 256 Lena image is given in Figure 9. These images are obtained by using threshold values to categorize the coefficients by magnitude.
- the wavelet kernel used in this case is the standard 9-7 filter set implemented in a lifting scheme.
- a decaying histogram procedure is used to determine the threshold values in each case.
- the coefficients are partitioned into regions according to decreasing levels of magnitude; 10% of coefficients are partitioned into region 1, 15% into region 2, 25% into region 3 and 50% into region 4. Note that the largest coefficients appear darker in the images while the smallest appear lighter.
- The threshold values are calculated automatically for each orientation. The values used to generate the partition in each case are given in Eq. III.4 (assume Ci is the magnitude of the coefficient under consideration).
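The sketch below shows how such percentage-based thresholds and the resulting categorization might be computed. It is a hypothetical stand-in for the decaying histogram procedure, not the routine itself.

```python
import numpy as np

def percentage_thresholds(coefficients, fractions=(0.10, 0.15, 0.25, 0.50)):
    """Magnitude thresholds so that the largest 10% of coefficients fall in the
    first region, the next 15% in the second, and so on. Illustrative only."""
    mags = np.sort(np.abs(np.asarray(coefficients)).ravel())[::-1]  # descending
    thresholds, taken = [], 0
    for frac in fractions[:-1]:            # the last region takes the remainder
        taken += int(round(frac * mags.size))
        idx = min(max(taken, 1), mags.size) - 1
        thresholds.append(mags[idx])       # smallest magnitude still included
    return thresholds

def classify(coefficients, thresholds):
    """Assign each coefficient a region index (0 = most significant region)."""
    mags = np.abs(np.asarray(coefficients))
    regions = np.full(mags.shape, len(thresholds), dtype=int)
    for i, t in enumerate(reversed(thresholds)):
        regions[mags >= t] = len(thresholds) - 1 - i
    return regions
```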
- the set of threshold images appearing in Figure 9 is motivational in determining a region formation scheme based on this approach. It is apparent that wavelet coefficients can be classified in this manner to form a mask. However, the large overhead required to code each individual mask must be solved in order to make this type of classification scheme viable for implementation.
- the first step taken in the development of a technique designed to reduce the mask overhead is to consider a common mask approach that can be used to capture the most important coefficients between all three orientations at the lowest resolution level. Given that the data range is different in each orientation, the data is normalized to the largest range in an absolute value sense. This step is taken in order to compensate for the range differences that exist between the data sets, and to ensure that corresponding pixels from each orientation can make an equal contribution in the formation of the common mask. This step is illustrated in Eq.III.5.
- NewRange(HL1) = ScaleRange(MaxRange(LH1, HL1, HH1)) (Eq. III.5)
- The next step is to generate a new normalized data set by taking the maximum value among the scaled orientations at each pixel location. Let C(i) be the new coefficient obtained in this procedure and let C_HL1, C_LH1 and C_HH1 be the individual scaled values from each orientation.
- the new data set is formed using the step illustrated in Eq.III.6.
- the final step in forming the common mask is to threshold the new coefficients based on the histogram decay and the percentage of the total data amount to encapsulate in each region category.
- The same values mentioned earlier are used in this case (10%, 15%, 25% and 50%).
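Putting the normalization (Eq. III.5), the pixel-wise maximum (Eq. III.6), and the percentage thresholding together, a hypothetical sketch of the common-mask formation could read as follows; the helper name common_mask and its details are assumptions, not the patent's code.

```python
import numpy as np

def common_mask(hl, lh, hh, fractions=(0.10, 0.15, 0.25, 0.50)):
    """Form a common region mask from the three level-1 detail orientations:
    scale each orientation to the largest absolute range, take the pixel-wise
    maximum, then threshold into regions by the given percentages. Sketch only."""
    bands = [np.abs(b).astype(np.float64) for b in (hl, lh, hh)]
    max_range = max(b.max() for b in bands)
    scaled = [b * (max_range / max(b.max(), 1e-12)) for b in bands]  # normalize ranges
    combined = np.maximum.reduce(scaled)                             # pixel-wise maximum

    order = np.sort(combined.ravel())[::-1]                          # magnitudes, descending
    mask = np.full(combined.shape, len(fractions) - 1, dtype=int)    # background by default
    taken = 0
    for region, frac in enumerate(fractions[:-1]):
        taken += int(round(frac * order.size))
        threshold = order[min(max(taken, 1), order.size) - 1]
        unassigned = mask == len(fractions) - 1
        mask[(combined >= threshold) & unassigned] = region
    return mask
```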
- the raw common mask categorization image appears in Figure 10.
- The threshold values used to generate the mask are given in Eq. III.7.
- the raw size overhead of the common mask obtained in the previous section is still too large to be included in the final bit stream for most image processing needs (i.e. an overhead of 2 Bpp for this particular 4 region mask).
- This section outlines the next step in the automatic region formation scheme.
- the DCT transform is used to reduce the overhead size to an acceptable amount.
- The DCT is not applied directly to the common mask data in its multi-valued form. Rather, the procedure starts by considering each region contained in the common mask as a binary data set, such that the sum of the individual data sets is equal to the multi-valued mask in functional form.
- By linearity, the DCT of the sum of functions is the same as taking the transform of each individual function and summing the results. Furthermore, there are unique spectra in each case that should be considered separately for a fully integrated analysis.
- the individual region masks obtained by decomposing the raw common mask for the Lena image appear in Figure 11. Notice that although there are only three images, the fourth region channel (the background) is implicit.
- the 2D fast DCT algorithm is used to transform each binary data set into the frequency domain for spectral analysis.
- the spectra for the upper three regions appear in Figure 12. Please note that a log scaling of the coefficients is used in the images to exaggerate the appearance for display purposes. Notice how the low energy content is dispersed through the spectrum in each case. Even at such low energy levels, the energy is localized in some areas.
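As a rough illustration of this spectral step, the snippet below (hypothetical code using SciPy's 2D DCT rather than the fast DCT implementation referred to above) splits each region out as a binary indicator image and transforms it, with a simple zig-zag low-pass filter standing in for the spectral filtering discussed next.

```python
import numpy as np
from scipy.fft import dctn, idctn   # 2D type-II DCT and its inverse

def region_spectra(common_mask, num_regions=4):
    """DCT each binary region mask; the last region (the background) is left
    implicit, as noted above. Illustrative sketch only."""
    spectra = []
    for r in range(num_regions - 1):
        binary = (common_mask == r).astype(np.float64)   # indicator image
        spectra.append(dctn(binary, norm="ortho"))
    return spectra

def low_pass_mask(spectrum, keep_diagonals):
    """Keep only the first `keep_diagonals` zig-zag diagonal rows of the
    spectrum and invert; a crude stand-in for the spectral filtering step."""
    rows, cols = np.indices(spectrum.shape)
    filtered = np.where(rows + cols < keep_diagonals, spectrum, 0.0)
    return idctn(filtered, norm="ortho")
```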
- the total mask spectrum must be filtered carefully to reduce the excess content that has little contribution to the region trends that exist in the retained coefficients.
- The main concern is to determine how to quantify the filter size such that a dynamic implementation may be obtained that will accommodate any arbitrary image size.
- the DAC has developed two techniques for quantizing the filtered mask spectrum.
- the first approach is basically a modified uniform quantization procedure that is used to verify the concepts. It proved useful in confirming the algorithms and developing the concepts, but it lacks robustness and flexibility.
- the second technique is a general approach that is much more robust and elegant. It is dynamic since it tracks the magnitude of the spectral content in determining how to quantize the coefficients.
A First Quantization Approach
- The first approach combines the filter dimension determination with the quantization procedure.
- A base mask size of 128 by 128 is used as the starting point in this approach.
- a quantization technique can be developed to exploit the relationship.
- the main premise is that there are bands of spectral coefficients that can be used to guide the quantization procedure. The importance of the coefficients in each band decreases along the spectrum. This concept is illustrated in Figure 15.
- the number of diagonal zigzag rows to include in each band is a function of the filter size and the original common mask size as well as the spectral content.
- a table look up technique is used to determine the number of rows to use in each band.
- the base band sizes determined experimentally to give reasonable partition and reconstruction results for a 128 by 128 mask are given in the table in Figure 16.
- The mask overhead for this quantization scheme is tabulated in the table in Figure 18 for the corresponding image sizes.
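The banding idea can be sketched as follows. The band sizes and bit depths in this snippet are made-up placeholders, not the experimentally determined values tabulated in Figure 16.

```python
import numpy as np

def band_quantize(spectrum, band_rows=(4, 8, 12, 20), band_bits=(16, 12, 8, 4)):
    """Quantize zig-zag diagonal bands of a DCT spectrum with decreasing bit
    depth; coefficients beyond the last band are dropped. Illustrative only."""
    h, w = spectrum.shape
    diag = np.add.outer(np.arange(h), np.arange(w))   # zig-zag row index i + j
    quantized = np.zeros_like(spectrum)
    start = 0
    for rows, bits in zip(band_rows, band_bits):
        band = (diag >= start) & (diag < start + rows)
        if not band.any():
            break
        peak = float(np.abs(spectrum[band]).max()) or 1.0
        step = peak / (2 ** (bits - 1) - 1)           # signed uniform quantizer
        quantized[band] = np.round(spectrum[band] / step) * step
        start += rows
    return quantized
```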
- Quantization is no longer tied to the filtering stage and the coefficient banding technique of the previous approach.
- the current technique follows the actual content of the spectrum in forming quantization categories. In this way, coefficients are grouped into categories according to the energy gradient of the spectral distribution. The concept is illustrated in Figure 20.
- The rate of spectral descent, together with position information within the spectrum, can be used to determine the number of bits used to categorize the data. This process can be controlled precisely.
- the procedure can be calibrated as required by studying the effects of changing image dimension in terms of an optimal quantization change.
- the previous quantization technique can be used to estimate the mask overhead for the new scheme proposed here.
- The spectral filter size in this case is 44 diagonal rows. If 4 quantization intervals are used at 16, 12, 8 and 4 bits each, the mask overhead can be estimated.
- Let N be the number of coefficients in each interval, b the number of bits used to quantize that interval, and OM the total mask overhead.
- A rough estimate of the new packed size is calculated in Eq. III.9.
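Given those definitions, a plausible reading of the estimate in Eq. III.9 (an assumption, since the equation itself is not reproduced here) is simply the sum of the per-interval contributions, $O_M \approx \sum_{i=1}^{4} N_i b_i$ bits, where $N_i$ and $b_i$ are the coefficient count and bit depth of interval $i$. For example, hypothetical intervals of 200, 400, 600 and 800 coefficients at 16, 12, 8 and 4 bits would give roughly 200(16) + 400(12) + 600(8) + 800(4) = 16000 bits, or about 2 kilobytes.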
The Common Mask and Multi-resolution Classification
- A translation technique is required to apply the region mask at other resolution levels.
- the simplest approach is to look at the significance of the mask coefficients in a small block to determine an appropriate value for the next resolution level. This process is repeated until a mask for each resolution is obtained.
- the most logical approach is to take the largest coefficient through to the next level.
- M_V(L+1),i,j = Max(M_V(L),2i,2j, M_V(L),2i+1,2j, M_V(L),2i,2j+1, M_V(L),2i+1,2j+1) (Eq. III.10)
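A small, hypothetical sketch of this translation step (carrying the most significant value of each 2x2 block through to the next level) follows; the convention that larger mask values denote more significant regions is an assumption taken from Eq. III.10.

```python
import numpy as np

def translate_mask(mask):
    """Carry the largest (most significant) mask value of each 2x2 block to the
    next resolution level, per Eq. III.10. Assumes larger mask values denote
    more significant regions; illustrative sketch only."""
    h, w = mask.shape
    blocks = mask[: h - h % 2, : w - w % 2].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

# Repeating this yields one translated mask per remaining resolution level,
# as illustrated in Figure 21.
```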
- the low pass filtering and spectral quantization steps can be tailored to deliver similar quality masks for most images in a dynamic implementation.
- the partition can be studied in regards to how the original grouping is affected by the filtering step.
- the original partition is 10/15/25/50%.
- the coverage ratios used for bit budget calculation and the code stream distribution changed to approximately 9/31/33/27%.
- There may be some advantages in adjusting the original partition such that the data coverage ratios are controllable in the final stages.
- The question of miscoverage must be addressed in all lossy region formation techniques. It is the subsequent modules that are affected by the misclassified information. Thus the interaction between the DCT region formation module and the subsequent sorting and packing modules must be understood in any attempt to obtain the best region channel results for this classification scheme.
Handling Misclassification in a Region Formation Scheme
- bit level processing can be considered as a second partitioning stage that deals with misclassification.
- Bit plane sorting is a well-known technique used to organize priority in a distributed list of data.
- 1D sorting of a list of information generates a map specific to that list in an optimal fashion. The largest values are mapped first as the list transposition routine progresses.
- The order of descent is the standard order used to classify information when considered from the normal processing standpoint where no regions are involved at all. As a result, most of the misclassified information is considered for further classification and is thus biased to be included first in the final code stream order.
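For illustration only (this is not DAC's sorting routine), a 1D sort by decreasing magnitude that also records the permutation map a decoder would need might look like this.

```python
import numpy as np

def sort_with_map(values):
    """Order a 1D list of coefficients by decreasing magnitude and return the
    permutation map needed to undo the ordering. Illustrative sketch only."""
    values = np.asarray(values)
    order = np.argsort(-np.abs(values))      # largest magnitudes first
    return values[order], order

def unsort(sorted_values, order):
    """Inverse mapping: place each value back into its original position."""
    restored = np.empty_like(sorted_values)
    restored[order] = sorted_values
    return restored
```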
- the only difference is that now there is a region description overhead as well as the final mapping overhead of sorting the region of interest partitions in a hierarchical fashion.
- DAC has developed a quad tree ordering technique (see EQW Ch.4) to track coefficient importance in both normal and region based processing modes of operation.
- the normal quad tree approach is to track the leading one position through each block under consideration in a recursive fashion.
- a map is generated in the process for each entry based on importance such that decoding follows the same order.
- the quad tree order is an efficient method of processing the multi-resolution decomposition image information in an effective manner.
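A bare-bones sketch of that recursive significance test for a single threshold (one bit plane) is shown below; it is illustrative only and not the EQW implementation of Figure 24.

```python
import numpy as np

def quadtree_significance(block, threshold, out=None):
    """Recursively record which quadrants contain a coefficient at or above
    `threshold`. The emitted 0/1 decisions form the map that a decoder would
    replay in the same order. Hypothetical sketch only."""
    if out is None:
        out = []
    significant = bool(np.any(np.abs(block) >= threshold))
    out.append(1 if significant else 0)
    if significant and block.size > 1:
        h, w = block.shape
        for quad in (block[: h // 2, : w // 2], block[: h // 2, w // 2 :],
                     block[h // 2 :, : w // 2], block[h // 2 :, w // 2 :]):
            quadtree_significance(quad, threshold, out)
    return out

# Halving the threshold and repeating tracks the leading-one position of every
# coefficient, bit plane by bit plane.
```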
- The tracking or positioning overhead introduced by mapping the data is generally smaller than in the 1D sorting case. Retaining vertical and horizontal positioning information is highly advantageous (in most image processing applications) since the underlying technology is generally consistent in this medium.
- The 1D sorting approach sacrifices the natural vertical correlation that may exist (although it can be partially recovered by bit-level entropy coding in the sorting routine). However, each of the two sorting techniques can be used from an operational standpoint to produce a similar net effect. One may produce slightly better results than the other, but the control architecture is consistent. Each final ordering routine can be tuned to meet specific processing needs.
- the ID sorting approach may be a useful candidate for bi- level image processing, and DAC is currently investigating this possibility.
- the quad tree approach has currently been modified to operate in arbitrary region processing modes.
- the 1D sorting routine tracks the importance in a list of information specific to the block under consideration. That is, the region partition is distributed in 1D hierarchical lists of information. The original preordering of the coefficients is captured in the common mask procedure. Thus the decoder map already exists for the inverse ordering.
- DAC's region processing quad tree begins by building an information map for each region channel to guide the classification.
- the core engine has been modified to drop blocks not contributing to the final ordering in each region category.
- the overhead comparisons for each sorting case have not been documented. The results will be available from DAC once the region based quad tree design implementation is finalized (currently under implementation at DAC.)
- the misclassified coefficients obtained from the original preorder mask partition are considered again for ordering in the sorting stages that follow.
- the most important misclassified coefficients still have the opportunity to make important contributions to the final code stream as well as improving the reconstruction quality measurements in the decoded image.
- the difference between this approach and most efficient classification approaches is that the original processing order used by the efficient scheme is somewhat distributed through the region channels.
- the misclassification in each region channel is forgiven to a certain extent by the subsequent bit level sorting module.
- One of the prices that must be paid for any automatic region classification scheme is that the optimal processing orders that exist in standard approaches must be considered in a different context.
- the original ordering concepts used for standard procedures must be extended to include other ordering techniques specifically tailored for operation in arbitrary region processing.
- DAC's region technology in its original form was designed to include region processing capabilities from the start. All internal modules can be operated in both normal and region based modes of operation.
- the MUX architecture (introduced in Chapter V) is designed to meet operational requirements in both normal and region processing modes. It is possible to extend the notion of block based image processing to a level where a block of information is interpreted in an arbitrary way that depends only on the underlying core processing technology. In this way, blocks can be considered as individual processing units each fully capable of making a contribution to the final code stream.
- control architecture can be developed to fully exploit the region processing capability. All region formation schemes, whether arbitrarily defined user regions of interest, automatic region formation schemes or any other specially designed region classification technique, can be processed in a common framework. The same overall processing architecture can be tailored to operate in an optimal fashion for both normal and region processing modes of operation.
- Sequence based mode When this mode of operation is used, each set of data encapsulated by a specific region is treated as a separate sequence. The sequences are coded independently.
- ROI from the middle In this mode of operation, a concern of coding an ROI in the middle of the bit stream is addressed. This is useful for progressive transmission of images where the user is given the opportunity to focus on a specific ROI before the complete set of image data arrives at the decoder. This is important for medical image processing.
- ROI without sending mask information
- This mode of operation is a special case of the scaling based mode. The coefficients for each region are scaled such that the magnitude of each ROI exceeds that of the ROIs of lesser importance. The decoder knows which ROI the coefficients belong to based on the magnitude.
- the ROI coding technique that DAC has developed generally falls into the sequence based category. But internal modules have been tailored to operate in both block based and arbitrary block sized processing units.
- DAC is currently studying the different mask and region formation techniques to determine the effect each ordering and partitioning approach has on the overall organizational structure required for region channel processing.
- the current implementation includes a scalable ROI mode of operation implemented without actually shifting the ROI coefficients encompassed by the classification. Data scaling is handled in the MUX control architecture. Internally, the scaled version is a special case that can be implemented in MUX control. Preliminary study has begun for the last of the four categories cited above. ROI from the middle implementations can be addressed by using a resolution progressive processing order. This type of ordering is supported in the MUX architecture.
- DAC is currently considering a design implementation for client side region of interest requests in a resolution progressive architecture for client-server region interaction. ROI without mask information makes sense in theory.
- ICUs Independent coding units
- ICU (1) Full data independence: from the notion of ICU, the encoding and decoding of a region is done without referring to the data of any other regions. (2) Allowance for multiple coding schemes: each ICU may be coded using different coding schemes. New coding schemes can be added to the system (modular openness.)
- JPEG and JBIG transcodability It is preferred to have a single JPEG 2000 coding platform that also accommodates JPEG and JBIG modes of coding.
- the ICUs are natural blocks for bit error localization.
- Parallelism There is no data dependence between ICUs and so the encoding and decoding of all ICUs can be performed in parallel.
- Intra-region coding in RICS involves the following steps.
- the current RICS design employs six categories of coding schemes.
- While being a logical concept, an ICU still has a geometric structure.
- the ICU structure is directly relevant to the coding scheme chosen.
- the RICS recognizes three types of ICU structures ( Figure 22).
- Type 1 ICU Structure 8x8 Blocks
- This type of ICU is designed exclusively for use in JPEG mode, although it is not necessary that DCT transform coefficients be coded this way. If there is no region defined in the image and the JPEG mode is chosen, the entire image is paved with type-1 ICUs.
- the set of ICUs that covers a region R forms a unique, minimal pavement for R, constructed by the following procedure (a code sketch is given after the steps).
- Step 1 Let y_min be the first row from the top that has at least one pixel in R, y_max the last row, x_min the first column from the left, and x_max the last column.
- Step 2 Let (x_min, y_min) and (x_max, y_max) be respectively the top-left and the bottom-right corners of the bounding rectangle of R.
- Step 3 Starting from the top left corner, pave the bounding box completely with type-1 ICUs. Step 4. Remove those ICUs that cover no pixels in R.
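- The sketch below walks through Steps 1-4 for 8x8 type-1 ICUs; the function name, the binary region mask input and the list-of-corners output format are assumptions made for illustration only.

```python
import numpy as np

def pave_region_with_8x8(region_mask: np.ndarray):
    """Pave the bounding box of a region R (given as a binary mask) with 8x8
    type-1 ICUs and keep only the blocks that cover at least one pixel of R."""
    ys, xs = np.nonzero(region_mask)
    y_min, y_max = ys.min(), ys.max()      # Step 1: first / last row of R
    x_min, x_max = xs.min(), xs.max()      #         first / last column of R
    icus = []                              # Step 2: bounding rectangle is implied
    for y in range(y_min, y_max + 1, 8):   # Step 3: pave the bounding box
        for x in range(x_min, x_max + 1, 8):
            if region_mask[y:y + 8, x:x + 8].any():
                icus.append((x, y))        # Step 4: keep ICUs that cover R
    return icus
```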
- Type 2 ICU Structure Rectangles Type-2 ICUs are rectangles. Except for the JPEG and embedded zero tree schemes, any other intra-region coding scheme uses type-2 ICUs to pave and code a region. There is no general restriction on the dimensions of a type-2 ICU. It is usually defined by the selected coding scheme. For example, both embedded quad tree and explicit block based quantization can use arbitrary sized rectangles, with some preferred embodiments (such as the 64x64 blocks used in EBCOT of VM 3.0 (A)). It should be noted that once a region is paved by type-2 ICUs, different coding schemes can be used for each ICU.
- type-2 ICU For subband decompositions, no type-2 ICU is allowed to cross subband borders. Essentially, this means that type-2 ICUs support only various types of intra-band coding methods.
- Type-3 ICU Structure Pyramids
- the third type of ICU is the pyramid structure, which is used by various inter-band coding methods, such as the embedded zero tree.
- region definition procedure for type-3 ICUs: a region must be specified in the top-down manner, from lower resolution to higher resolution in the decomposition pyramid. That is, for a given region definition R in LL, every element in R defines a set of type-3 ICUs (i.e. in three spatial orientations, respectively).
- Let C_LL(i, j) be a wavelet transform coefficient at spatial location (i, j) in the LL subband.
- C_LL(i, j) can be represented in the following fashion.
- Type-1 ICUs are defined exclusively for coding 8x8 DCT coefficient blocks. Coding of type-1 ICUs follows the JPEG baseline algorithms.
- each ICU is coded independently, and the collection of codes for all ICUs is the coded representation of that region.
- In this subsection we describe three techniques that may be used for coding type-2 ICUs.
- Embedded Quad tree Wavelet (EQW), illustrated in Figure 24, is an efficient and fast method for type-2 ICU coding.
- This technique implements an embedded progressive sorting scheme in a quad tree-like structure.
- the EQW explores the intra-band self-similarity of the wavelet decomposition coefficients.
- the EQW-produced code-stream realizes scalability by pixel-precision (the scalability by spatial resolution is realized by the multiplexer.)
- the EQW method can be used for both lossless and lossy coding.
- the primitives to be encoded in the ICUs are coefficients of a reversible wavelet transform.
- each transform coefficient is represented in a fixed-point binary format - typically with less than 16 bits - and is treated as an integer.
- the maximum coefficient magnitude M is determined.
- a value N is found which satisfies the condition 2^N <= M < 2^(N+1).
- the EQW works in a bit-plane manner: the initial bit-plane is set to 2^N, followed by 2^(N-1), 2^(N-2), and so on.
- a binary significance map is produced for every bit-plane by comparing coefficients in power of 2 increments.
- Figure IVJ illustrates the EQW encoding process on a single bit-plane.
- Two working lists are used.
- One is called the list of significant primitives (LSP).
- LSP significant primitives
- Each entry in LSP corresponds to an individual primitive in the ICU.
- the LSP is initialized as an empty list.
- Another list is called the list of insignificant blocks (LIB).
- Each entry in the LIB is registered with the coordinates of the top-left corner of the block, together with the width and height information.
- an LIB entry of width and height equal to 1 represents an individual primitive.
- the ICU to be encoded is put into the LIB.
- a temporary list of insignificant blocks (TLIB) is used for refreshing the LIB after each bit-plane coding pass is completed.
- the first method (depth-first quad tree coding) is to insert the four sub-blocks into the LIB at the position of their parent block. In this case, the four child blocks are evaluated immediately after the parent block. This rule is applied recursively until no more subdivision is possible. This means that control returns to the next entry in the LIB only after the present entry is completely encoded up to the highest resolution level.
- the second method, known as breadth-first quad tree coding, is to add the four sub-blocks under consideration to the end of the LIB. In using the breadth-first process, all parent blocks at the same level are processed first, followed by their respective children blocks.
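- The two LIB insertion strategies can be contrasted with the small sketch below; the block representation (top-left corner plus width and height, as registered in the LIB) follows the description above, while the helper names are illustrative.

```python
def subdivide(block):
    """Split a block (x, y, width, height) into its four quadrant sub-blocks."""
    x, y, w, h = block
    hw, hh = w // 2, h // 2
    return [(x, y, hw, hh), (x + hw, y, w - hw, hh),
            (x, y + hh, hw, h - hh), (x + hw, y + hh, w - hw, h - hh)]

def insert_children(lib, index, depth_first):
    """Insert the four sub-blocks of lib[index] into the LIB."""
    children = subdivide(lib[index])
    if depth_first:
        # Children are evaluated immediately after their parent block.
        lib[index + 1:index + 1] = children
    else:
        # Breadth-first: all blocks of one level are visited before any child.
        lib.extend(children)
```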
- the entries in TLIB can be reordered according to certain pre-defined set of rules (this is an optional step.)
- Experimental evidence suggests that a higher PSNR can be achieved by using an effective reordering scheme.
- the LIB is replaced with the TLIB and the coding resumes at the next bit-plane.
- the next step is the refinement pass.
- the nth bit is output for entries in the LSP.
- the process resumes using the new LIB and a new bit-plane set to 2^(n-1).
- EQW with VLC
- VLC variable length coding
- Step 1 Initialization: output n satisfying the inequality 2^n <= max|C| < 2^(n+1), set the list of significant primitives (LSP) as an empty list. Put the ICU to be coded into the LIB. (A code sketch of Steps 1-5 is given after Step 5.)
- Step 2 Bit-Plane Sorting: for each entry of the LIB, perform quad tree coding.
- Step 3 Reordering (optional).
- Step 4 Refinement: for each entry in the LSP, except those included in the last sorting pass (i.e. with the same n), output the nth most significant bit of |C|.
- Step 5 Quantization Update: decrement n by 1 and go to step 2.
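- A minimal sketch of Steps 1-5 follows, assuming a square ICU with power-of-two dimensions and depth-first subdivision; the optional reordering of Step 3 and the VLC stage are omitted, and the callback interface is an assumption for illustration.

```python
def eqw_encode(coeffs, emit_bit, emit_sign):
    """Sketch of the EQW bit-plane sorting and refinement loop (Steps 1-5).
    coeffs is a square list-of-lists of integer transform coefficients;
    emit_bit / emit_sign are caller-supplied output callbacks."""
    def subdivide(block):
        x, y, s = block                       # top-left corner and side length
        h = s // 2
        return [(x, y, h), (x + h, y, h), (x, y + h, h), (x + h, y + h, h)]

    def block_max(block):
        x, y, s = block
        return max(abs(coeffs[j][i]) for j in range(y, y + s)
                                     for i in range(x, x + s))

    size = len(coeffs)
    n = block_max((0, 0, size)).bit_length() - 1   # 2**n <= max|C| < 2**(n+1)
    lsp, lib = [], [(0, 0, size)]                  # Step 1: initialization
    while n >= 0:
        tlib, i = [], 0
        while i < len(lib):                        # Step 2: bit-plane sorting
            block = lib[i]
            if block_max(block) >= (1 << n):
                emit_bit(1)
                x, y, s = block
                if s == 1:                         # a single significant primitive
                    emit_sign(coeffs[y][x] >= 0)
                    lsp.append((x, y, n))
                else:                              # depth-first subdivision
                    lib[i + 1:i + 1] = subdivide(block)
            else:
                emit_bit(0)
                tlib.append(block)                 # revisit on the next bit-plane
            i += 1
        # Step 3 (optional reordering of tlib) is omitted in this sketch.
        for (x, y, n_sig) in lsp:                  # Step 4: refinement pass
            if n_sig != n:                         # skip entries from this pass
                emit_bit((abs(coeffs[y][x]) >> n) & 1)
        lib, n = tlib, n - 1                       # Step 5: quantization update
```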
- Some other researchers have also proposed embedded coding methods where a quad tree is used to explore the intra-band similarity. However, some major differences exist in how the quad tree operates. The work of [2] has suggested that empirically the quad tree decomposition should stop at size 16x16. However (as a result of our experience and experimentation) this size may not always be the optimal choice. Even without VLC, the efficiency of DAC's EQW is comparable with EZW. In fact, for a 2x2 block, if it is insignificant, one zero can represent four zeros. However, when a 2x2 block is significant, quad tree coding is inefficient.
- the embedded quad tree (EQW, EQPC, etc.) is a special case of the more general block-based quantization techniques.
- Other block-based quantization algorithms such as EBCOT, can also be very efficient for type-2 ICU coding.
- In the first mode, the transcode mode, the JBIG ICU coder and EQW ICU coder provide a two-way transcode between JPEG 2000 and JBIG ( Figure 26).
- In this transcode, the EQW procedure is called for bi-level type-2 ICUs, with the total number of bit-planes being set to one. Note that the sign bit coding in the EQW algorithm should be skipped.
- the proper JBIG routines are called as the bit-plane coder in the EQW routine for certain bit-planes and for certain decomposition sizes (the quad tree decomposition stops at this particular size and the coding is handed over to the JBIG routine). Since the algorithms for bi-level data coding are also evolving with the compression technology, having an embedded JBIG mode reserved in the JPEG 2000 system can make the Standard evolve with its sister technologies.
- Type-3 ICUs are defined to support various inter-band coding methods for wavelet transform coefficients.
- the methods of EZW, SPIHT, etc. are known to be efficient approaches for coding the type-3 ICUs.
- the standard versions must be modified slightly to make them fit into the RICS architecture.
- these algorithms implement the ICU coding and multiplexing steps into an integrated procedure.
- a simple modification is required to bridge the existing architecture to the RICS architecture.
- the two steps must be separated, which is straightforward. Since the multiplexing stage in RICS can effectively simulate the performance of zerotree-based schemes (refer to Ch. V), this particular split in functionality has virtually no impact on the attainable coding efficiency for standard schemes. Instead it adds more flexibility and openness to the existing architecture.
- the LL subband is usually encoded differently than the other high-pass subbands. Because different sets of primitives may possess different statistical characteristics, providing several coding scheme choices in the intra-region coding module may offer a higher coding efficiency. In practice, the following combinations are useful. • In a wavelet decomposition, coding the LL subband with type-2 ICUs while coding the other subbands with either type-3 or type-2 ICUs.
- Intra-region coding can be performed without specifying any ICUs in a particular region. In this case, the entire region is considered an ICU. If the natural geometric shape of the region fits into any of the three ICU categories, the appropriate ICU coder can be used. If the region has a very irregular shape, then the 1D coding method can be an efficient approach. Once the data for a particular region is scanned in a certain pre-defined order to form a 1D stream, the following 1D progressive sorting algorithms can be used to produce an embedded code-stream.
- Step 1 Initialization: output n satisfying the inequality 2^n <= max|C| < 2^(n+1), and set the LSP as an empty list.
- Step 3 Reordering (optional).
- Step 4 Refinement: for each entry in the LSP, except those included in the last sorting pass (i.e. with the same n), output the nth most significant bit of |C|.
- Step 5 Quantization Update: decrement n by 1 and go to step 2.
- the JPEG-2000 committee has been investigating many emerging compression technologies in order to define a new still image compression standard.
- the emerging standard will address both present and near future image compression needs.
- the task of selecting what technologies to include in the new standard is not easy given the rate at which technological advances occur in this area.
- the primary focus of DAC has been to develop a still image compression engine that addresses the issues set forth by the standards committee.
- MUX multiplexer
- Dynamic re-orderability is a data access issue relating to the degree to which encoded information can be reorganized to suit user-specific needs for an image under consideration. This type of data access is an important concern for many applications, especially in the medical field.
- the encoded bit stream must have a certain degree of error resilience to address this concern.
- Using a MUX to organize the encoded bit stream can address these issues.
- the first step is to divide the encoded information into logical groups. Organizing the data in ICUs leads naturally into a high degree of accessibility. Each ICU is an independent unit. The size and position information is self-contained within each ICU.
- One of the effects of using a multiplexed bit stream design is that it is inherently error resilient. If any logical unit incurs an error, only that unit needs to be recovered. If the encoded data is divided into an array of logical units in preparation for MUX encoding, dynamic re-ordering is much easier to achieve.
- DAC makes the distinction between region and non-region processing modes of operation in their design implementation.
- a hierarchy of list structures is used to control optimal bit budget distribution and flow of both non-region and region bit levels through MUX channels.
- the discussion presented in the next Section begins by introducing the concepts in normal processing modes.
- the normal MUX modes of operation are generalized later on to include specialized region and mixed processing modes.
- the complete level 4 Y-channel information, i.e. LL4-HL4-LH4-HH4
- U and V channel low pass information, i.e. LL3 in each case.
- a "level split" on the U / V data channels at the lowest inverse wavelet resolution level is used to exploit the natural relationships that exist for both inverse transform spaces. A similar order exists for the lossless case where full data sets are used.
- Color Interleave Processing Order (Lossy ) Another processing order that exists for color images in the wavelet transform domain is obtained by interleaving the color channel detail information. The order is illustrated in Figure 31. This type of ordering may be suitable for certain applications such as those that require the data to be encoded for a progressive download. As in the previous case, the down sampling relationships that exist are maintained in both color / wavelet transform spaces.
- the natural processing order for lossless color image processing is illustrated in Figure 32.
- the color interleave order for lossless image processing is illustrated in Figure 33. Note that in each case, the full decomposition data set for each channel is maintained.
- the inherent relationships that exist in the color transform space are maintained in the wavelet transform space for ordering and reconstruction of the original image.
- bit plane ordering techniques are generally classified as using implicit quantization for the wavelet data sets. Coefficients are packed into the final bit stream based on bit level priority.
- the bit plane processing technology will be used to introduce the data structure used in the MUX design. However, the MUX processing discussion that follows is not restricted to bit level processing algorithms.
- the general ordering and processing relationships are useful for introducing the MUX control architecture at a basic level.
- the key to the operation and many benefits of the MUX technique is in the data structure design used to encompass the natural ordering concepts.
- the first step is to define a list structure for each level, channel, and orientation of data that exists in the wavelet transform space. A typical example of this type of structure is given in Figure 34.
- a general list structure for each level data set can be utilized in many ways.
- a lossless "prepack" of multi-resolution hierarchy information is useful for exploiting the many relationships that exist in the data taken in MUX list processing context.
- the data and information contained in each list is used to control how the final bit stream is organized. All data packed into the final bit stream must conform to this structure. New fields may be added, but the basic operation of the structure remains the same.
- Most implicit quantization techniques are implicitly decodable. In other words, minimal header information is required for each list.
- the MUX architecture organizes the lists or units of information into a bit stream that is scaleable in terms of bit precision and resolution and is controllable by the many MUX modes of operation.
- EQW is DAC's current two-dimensional bit level processing design.
- As EQW progresses through each list, the fields in the corresponding list structure are updated.
- the bit stream at each bit level is appended to the MUX "prepack" buffer as well as its corresponding size being added to the total in the bit packing information field.
- the bit level position where processing begins is put in the high bit information field.
- the total packed list size is placed in the total packed field.
- the scheme used for the list is placed in the scheme field.
- Each MUX list is fully independent, implicitly contains a full set of statistical information available for determining the amount of data to pack for each list and is optimal for organizing the final compressed bit stream.
- each list structure is used to organize the final bit stream based on end user compression requirements.
- DAC has implemented numerous MUX modes of operation. Two common modes of operation are outlined in this Section.
- SNR Signal to Noise Ratio
- Fields that refer to the MUX list structure appear in italics while other local variables appear in normal typeface. Basically the idea is to loop through each list checking to see whether the current processing bit level is equal to the bit level of the list under consideration. If the two bit levels are equal, the channel bit budget is decreased by the amount of data available in this particular list for the bit level in question (from MUX field pliPackingInfo[cCurBitLevel]). The MUX field liCurBytesCount is incremented by the number of bytes available. Additional processing takes care of the remaining bits. The remaining bits will be considered in the channel bit budget on the next visit to this particular list. The MUX field cCurBitLevel is decreased by one such that it will again be enabled when the bit level in the main loop is decreased by one.
- each MUX list will contain a field indicating the amount of data to pack into the final bit stream in each case.
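- The loop described above can be sketched as follows; the field names mirror the MUX list fields cited in the text (cCurBitLevel, pliPackingInfo, liCurBytesCount), while the MuxList stand-in class, the surrounding control flow, and the use of min() in place of the "additional processing" for partially packed lists are simplifying assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class MuxList:
    """Stand-in for one MUX list structure (field names follow the text)."""
    channel: str                    # 'Y', 'U' or 'V'
    cCurBitLevel: int               # next bit level this list can contribute
    pliPackingInfo: Dict[int, int]  # bytes available per bit level ("prepack" info)
    liCurBytesCount: int = 0        # bytes committed to the final stream so far

def distribute_snr(mux_lists: List[MuxList],
                   channel_budgets: Dict[str, int],
                   max_bit_level: int) -> None:
    """SNR-progressive bit budget distribution (simplified sketch)."""
    bit_level = max_bit_level
    while bit_level >= 0 and any(b > 0 for b in channel_budgets.values()):
        for lst in mux_lists:
            if lst.cCurBitLevel != bit_level:
                continue                                # list not active at this level
            avail = lst.pliPackingInfo.get(bit_level, 0)
            take = min(avail, channel_budgets[lst.channel])
            channel_budgets[lst.channel] -= take        # charge the channel budget
            lst.liCurBytesCount += take
            lst.cCurBitLevel -= 1                       # re-enable at the next bit level
        bit_level -= 1
```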
- the resolution progressive mode of bit budget distribution is a simple extension of the pseudocode implementation of the SNR mode of the previous Section.
- the 'Wavelet Transform level' for loop is taken outside of the main BitBudget / BitPlane while loop such that resolution level is given priority. In this manner, a lower resolution image can be reconstructed in the inverse wavelet transform stage.
- the resolution of the ensuing image depends upon the number of levels required in the end user requirements.
- Composing the final bit stream is a simple matter once the bit budget distribution scheme has run to completion.
- the packing technique can follow one of the normal processing orders outlined earlier (e.g. color level priority or level priority color interleave) or another access technique as required by the end user.
- the amount of data to pack for each list has been determined in the bit budget distribution stage. If the data to be packed for a given list is greater than zero, then the total data size, scheme and high processing bit level are packed as header information into the final bit stream. If the data to be packed for a given list is zero (i.e. not required), then a smaller header is packed to indicate a zero length list.
- Currently a register type approach is being developed to handle zero length lists.
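- A simplified packing pass might look like the sketch below; the header layout and the scheme, high_bit_level and prepack_buffer attribute names are illustrative assumptions based on the list fields described earlier, not the exact bit stream syntax, and the 3 + 2 byte header split is only an illustration of the roughly 5 bytes per list cited later in the Chapter.

```python
def pack_bit_stream(mux_lists, write):
    """Compose the final bit stream from the per-list pack decisions.
    `write` is any callable accepting bytes."""
    for lst in mux_lists:                   # iterate in the chosen packing order
        if lst.liCurBytesCount > 0:
            # Full header: total data size, coding scheme, high processing bit level.
            write(lst.liCurBytesCount.to_bytes(3, "big"))
            write(bytes([lst.scheme, lst.high_bit_level]))
            write(lst.prepack_buffer[:lst.liCurBytesCount])
        else:
            write(b"\x00")                  # minimal header flags a zero-length list
```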
- the total Y-channel header size for this particular image is 443 bits.
- a similar calculation can be conducted on the U/V-Channels (assuming one less decomposition level for each) to yield 368 bits each.
- the original raw image size is 2048 x 2048 x 3 bytes.
- the MUX packing overhead in this particular example is about 1 header bit for every 854 bits of compressed data. In the lossless case where each channel has the same number of decomposition levels (or correspondingly, in the 8 bit grayscale case), the overhead is still very small at 1 header bit for every 757 bits of compressed data packed.
- the table in Figure 37 outlines the overhead size for square images with dimensions that are a power of 2, beginning at 16 by 16 and ending at 64k by 64k. Note that the overhead size is 44 bits in both the lossy and gray scale cases. The reason for this is that the wavelet transform is not used on the U and V channels when the original image dimensions are 16 by 16. The Y-channel is decomposed once. A plot of percentage overhead versus image dimension is given in Figure 38. Both lossy and lossless cases appear on the same plot. Note that the image size can be quite small while still maintaining a relatively low overhead in terms of the total MUX list header sizes.
- This particular transform technique is based upon a principal component analysis (PCA) implemented on the original raw color information in order to determine the optimal color redistribution for the three RGB channels.
- PCA principal component analysis
- two of the three channels are down sampled by a factor of 4 to 1 for lossy compression. In doing this, the amount of data that must be compressed is reduced by a factor of 2 with minimal loss in visual quality for the reconstructed image.
- the easiest bit budget distribution scheme to implement is one that follows the color transform down sampling ratios. If the transform under consideration is YUV-411, then 4 parts are allocated to the Y-channel for every 1 part allocated to the U / V- channels respectively. However, the energy distribution of the wavelet transform data generally does not follow this strict down sampling guideline. Instead it varies from one image to the next.
- DAC has developed a dynamic bit budget control architecture that is based on the implicit information contained in each MUX list.
- a bit budget allocation procedure will be introduced that can be used for optimal scalability in normal processing modes of operation. This technique is based implicitly on the amount of energy contained in each color channel of the wavelet decomposition data sets.
- a data ratio concept is used together with the prepack information contained in the MUX list structures to determine the budget for each channel.
- the next step is to calculate the pack ratios to be used for each channel of the wavelet decomposition hierarchy.
- the pack ratios are determined by taking the ratio of the two largest amounts of data to the smallest amount of data. Let the three pack ratios be denoted R_Y, R_U and R_V. The smallest data size will be in either the U or V channels because of the down sampling step. Note that one of the pack ratios will be unity.
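- The pack ratio calculation reduces to the following sketch; the function signature is an assumption, but the rule (each channel's prepacked data total divided by the smallest channel total) follows the description above.

```python
def pack_ratios(y_total: int, u_total: int, v_total: int):
    """Compute the per-channel pack ratios R_Y, R_U, R_V by dividing each
    channel's prepacked data total by the smallest channel total."""
    smallest = min(y_total, u_total, v_total)  # normally U or V after 4:1:1 downsampling
    return (y_total / smallest,                # R_Y
            u_total / smallest,                # R_U
            v_total / smallest)                # R_V  (one of the three is 1.0)
```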
- the pack ratios given in Eq.V.2 are used to determine the optimal amount of data to allocate for each color channel based on the user specified compressed file size. However, any additional overhead introduced into the final bit stream by the MUX must be taken into consideration. Let yh_l, uh_l and vh_l be the individual header sizes in each wavelet channel for a particular orientation level. Then the total header sizes for each wavelet channel are obtained by summing the individual totals.
- the packing overhead introduced by the mask must be taken into consideration in determining the bit budget for each channel.
- Let M_t be the total pack size for the ROI mask.
- the logical approach to take is to make an adjustment in the bit budget for each channel based upon the pack ratios of Eq.V.2. In this manner, the mask overhead is distributed proportionally to each wavelet channel.
- the unit adjustment factor U_a is determined from the total mask overhead and the total pack ratio.
- the adjustment size for each channel is determined by using the result of Eq.V.4 and the pack ratios of Eq.V.2.
- BppMax · ( Y_t + U_t + V_t + Y_h + U_h + V_h + Y_a + U_a + V_a )
- the unit bit budget U_bb is determined from the pack ratios and the total header sizes.
- the U-channel and V-channel bit budgets can be determined in a similar fashion using the unit bit budget, the individual pack ratios and the overhead parameters.
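- One plausible reading of this budget split is sketched below; the exact equations (Eq. V.2 through V.5) are not reproduced in the text above, so the formulas shown are an illustrative reconstruction that simply distributes the mask and header overhead in proportion to the pack ratios.

```python
def channel_bit_budgets(target_bytes, ratios, header_sizes, mask_total):
    """Illustrative budget split: distribute header and mask overhead in
    proportion to the pack ratios, then allocate the remaining bytes."""
    r_y, r_u, r_v = ratios                               # pack ratios R_Y, R_U, R_V
    yh, uh, vh = header_sizes                            # per-channel MUX header totals
    ratio_sum = r_y + r_u + r_v
    u_a = mask_total / ratio_sum                         # unit adjustment factor U_a
    adjustments = {"Y": u_a * r_y, "U": u_a * r_u, "V": u_a * r_v}
    u_bb = (target_bytes - (yh + uh + vh)) / ratio_sum   # unit bit budget
    return {"Y": u_bb * r_y - adjustments["Y"],
            "U": u_bb * r_u - adjustments["U"],
            "V": u_bb * r_v - adjustments["V"]}
```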
- This concept can be extended to form pack ratios for each subband if more accuracy is required.
- a similar data ratio technique can be implemented to control the bit budget in that case.
- the necessary information is contained in the MUX list structures.
- the cited distribution technique is sufficiently accurate.
- the user specified file size can be obtained within several bytes in implementing this technique.
- the maximum and minimum attainable file sizes are known prior to the final packing stage.
- a simple routine has been developed for distributing the final bit budget in the level where it is determined it will expire.
- the idea is quite simple. Given that the bit budget will expire on a particular bit plane in a certain transform level, then the bit budget is redistributed such that each orientation gets a proportional amount based on the amount of data that each orientation can take. The amount of data each orientation gets is determined by using ratios between orientations on the bit level under consideration. This is an important consideration since the net effect is approximately a 1 dB improvement in the PSNR.
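- The redistribution rule can be sketched as follows; the per-orientation availability figures would come from the MUX list packing information, and the proportional split shown is simply the straightforward interpretation of the description above.

```python
def split_expiring_budget(remaining_bytes, orientation_avail):
    """Split the last portion of a channel budget among the HL / LH / HH
    orientations of the level where the budget expires, in proportion to the
    data each orientation can still take on the current bit plane."""
    total = sum(orientation_avail.values())
    if total == 0:
        return {o: 0 for o in orientation_avail}
    return {o: int(remaining_bytes * avail / total)
            for o, avail in orientation_avail.items()}

# e.g. split_expiring_budget(450, {"HL": 400, "LH": 300, "HH": 200})
#      -> {"HL": 200, "LH": 150, "HH": 100}
```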
- the processing orders introduced in the previous Section are specific to the normal processing modes where one or more ROI are not specified.
- DAC has implemented a region processing design based on the natural extension of the MUX concepts introduced in the previous Sections. Both lossy and lossless processing orders are again considered.
- the region level processing order places priority on each ROI in a descending fashion followed by the normal level priority scheme of the original MUX design. Inherent in the MUX scheme is the general assumption that data of similar magnitude in each succeeding region is less important to the overall image reconstruction. A secondary priority key is placed on the wavelet transform level. Data of similar magnitude contained in a lower resolution level is more important to the reconstruction procedure than data at a higher resolution level.
- the intrinsic 4-1-1 color down sampling relationship is maintained for region processing MUX modes.
- the complete level 4 Y-channel information i.e. LL 4 -HL 4 -LH 4 -HH 4
- the U and V channel low pass information i.e. LL 3 in each case
- the color interleave processing order extends naturally to the MUX region processing modes.
- the order is illustrated in Figure 41. This type of ordering may be more suitable for certain applications that require the data to be encoded for a progressive download based on regions of importance. As in the previous case, the down sampling relationships that exist for the inverse color and the wavelet space are maintained.
- DAC has developed both SNR progressive and resolution progressive MUX modes of operation for region processing.
- regions can be defined automatically by the technique described in Chapter III, or user defined ROI can be used to group the data.
- User defined ROI can have arbitrary shape or can be defined in terms of simple geometric elliptical or rectangular region primitives.
- This particular region color processing order is useful for applications that require transparent region channels that have little or no influence on the ordering of the data.
- Either of the normal SNR and resolution progressive modes are overlaid with a complete region channel description.
- the technique is implemented simply by taking the region index to the inner processing loop in the pseudo code implementation example of Figure V.8.
- This ordering technique may be useful in video processing applications where a complete region mask description is required together with a compressed frame of information.
- the region description may be used to process subsequent frames in the sequence.
- One of the processing orders that falls into this MUX mode of operation is illustrated in Figure 43. In this case, the high bit plane of each region list is processed first causing the region classification of all coefficients to be transparent to the distribution and packing routines.
- the MUX organizational list structure concepts introduced for normal processing modes have been extended to include many region processing modes of operation.
- the total number of lists is a function of the number of ROI. If there are 4 ROI, then there are 4 times as many MUX lists.
- the basic operation of the MUX control architecture is similar in each case.
- Transparent region processing is mentioned above.
- the region index is placed in the inner most loop in the bit budget distribution function.
- the processing index placement is basically a method of incorporating a transparent region layer of processing into the normal MUX modes outlined earlier in the Chapter. Processing and packing in this mode gives ROI a low priority.
- the DCT auto regions and arbitrary or primitives user defined region types can be operated in this mode.
- In DAC's internal processing architecture, the user must select the region coverage technique used to categorize the wavelet coefficients.
- DAC has implemented the wavelet mask down sampling technique of the JPEG-2000 VM, and it can be used in any of the region processing MUX modes. In this manner a common mask can be used in each orientation level or the VM mask down sampling technique can be used for individual orientation level mask coverage, for both automatic or user defined ROI in lossy and lossless MUX modes of operation.
- the number of region categories can be selected as 2, 3, or 4 channels.
- the YUV-411 color transform is applied to a 24 bit 256 x 256 Glacier park mountain scenery image.
- the 9-7 kernel implemented in a lifting scheme is used as the wavelet transform.
- the DCT region formation technique of Chapter III is employed to generate the common masks for each wavelet transform level.
- the mean square error (MSE) is measured for a number of compression ratios. The result is illustrated in Figure 45 in a plot of MSE versus Bpp. The plot shows the break points in MSE for each automatic ROI used in the example.
- the rate of image degradation i.e. MSE and PSNR
- the MUX overhead is calculated based on the mask type and process selections. For arbitrary user constructed masks, the mask file can be loaded to guide the MUX. In the 4 ROI common mask case, the overhead is 0.5 Bpp in packing the entire mask. In the VM mask case, the overhead is 2.0 Bpp. If simple primitives are used to form the user mask instead of an arbitrary shape, the overhead is greatly reduced. The arbitrary mask overhead can be reduced with the addition of an entropy coding stage.
- the common DCT mask overhead is cited in Chapter III as approximately 1Kb (0.2 Bpp) for an 8 bit 256 x 256 image size. These are rough estimates that do not take the header sizes into account. However, they do serve as a comparative guideline for the current discussion.
- pack ratios for each wavelet channel are determined as in the normal processing mode. However, there is one important difference. In this particular mode of operation pack ratios are determined for each region channel. This implies that there are 12 pack ratios for a 4 ROI process.
- the bit budget distribution is based on the overall wavelet channel pack ratios and the ROI data totals in each region channel. The distribution function also takes the mask and list header size information into account in the overall calculation.
- This mode of MUX operation is designed to distribute the MSE and the PSNR image reconstruction measurements in an approximately uniform manner for all region channels. A proportional amount of overall bit budget is allocated to each region channel based on the region and color channel pack ratios.
- Figure 46 illustrates the result of using this mode of operation for the Glacier park image. Notice how the quality of each region channel degrades in comparison to the others. Auto detected DCT common masks are used to generate the result.
- the bit budget is divided into 12 budgets according to the total data ratio technique mentioned earlier.
- bit budget is distributed in a biased fashion giving the highest priority to the most important ROI. Some regions may not receive a budget based on the compression requirements.
- the bit budget is distributed region by region beginning at the most important ROI. Thus complete sections of the original image can be eliminated altogether if the bit budget is small. If only the important regions of an image need to be saved, this technique can be employed to partition the image.
- FIG 47 A plot of MSE versus Bpp for the absolute region priority color MUX mode is given in Figure 47 for the Glacier park image. Notice how the regions fade out very fast and sharply. This occurs when the region no longer has any budget allocated to it. At that point, the region is basically invalidated in terms of any contribution to the reconstructed image. That portion of the image basically fades out. Note that the character of the fade out depends on the down sampling technique used to translate the masks. The VM mask formation technique causes a very gradual fade out to occur. In the common mask approach, the effect is much more abrupt.
- each ROI contributes to the final bit stream.
- the initial bit budget for each region channel is calculated as before in that the specified compressed file size is split into 12 bit budgets to be spread between each of the 4 region channels.
- the allocation amounts are changed heuristically based on a priority factor that is set for each ROI. For example, suppose it is decided that ROI 1 is 50% more important than ROI 2, ROI 2 is 30% more important than ROI 3 and ROI 3 is 20% more important than ROI 4 (the background). Using this assumption as a starting point, the amount of data allocated to each region channel is changed slightly. The net effect is to cause the quality measurements in each region channel to degrade at slightly different rates. A plot illustrating this effect is given in Figure 48. Note how the MSE break points for regions 2 and 3 have shifted slightly towards the left.
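- A possible form of this heuristic re-weighting is sketched below; the specific weighting rule (dividing by one plus each priority step and rescaling so the overall budget is unchanged) is an assumption for illustration, not the exact adjustment used by DAC.

```python
def apply_region_priorities(region_budgets, priority_steps):
    """Re-weight per-region budgets by relative importance.  priority_steps[i]
    says how much more important region i is than region i+1, e.g. [0.5, 0.3, 0.2]
    for the 50/30/20 percent example above."""
    weights = [1.0]
    for step in priority_steps:
        weights.append(weights[-1] / (1.0 + step))
    weighted = [b * w for b, w in zip(region_budgets, weights)]
    scale = sum(region_budgets) / sum(weighted)   # preserve the total budget
    return [w * scale for w in weighted]
```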
- the result of using this particular MUX mode illustrates an important property that should be available for any ROI processing technique.
- the problem with many ROI processing techniques is that it is difficult to control how much each region contributes to the final bit stream.
- the technique outlined here can be used not only to implement this effect, but also to control the degradation for each region channel based solely on an importance factor associated with each ROI.
- Region Percentage Priority Level Color Mode In this mode of operation the user can specify a data percentage for each region based on the total amount available in each region channel.
- the distribution function is slightly different in this case.
- the data totals are determined for each region channel along with the wavelet color channel totals. There are still 3 bit budgets for each region channel. In this case however, the amount of data to include for each region channel can be set by the user as a percentage of the total in each case.
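- In sketch form the percentage-based allocation is a direct scaling of each region channel total; the nested dictionary layout used below is an assumption for illustration.

```python
def percentage_budgets(region_totals, percentages):
    """Region percentage priority mode: allocate a user-specified percentage of
    the data available in each region channel, per colour channel."""
    return {region: {ch: total * percentages[region] / 100.0
                     for ch, total in channel_totals.items()}
            for region, channel_totals in region_totals.items()}

# e.g. percentage_budgets({1: {"Y": 40000, "U": 9000, "V": 9000}}, {1: 25})
#      -> {1: {"Y": 10000.0, "U": 2250.0, "V": 2250.0}}
```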
- MUX for Mixed Processing So far both normal (non-region) and ROI processing modes have been discussed. In addition to these modes of MUX operation, DAC has developed mixed mode processing capabilities. The modes of operation discussed for both normal and ROI processing are extended such that they can be run simultaneously for an image under consideration. A wavelet transform level partition is conducted based on the desired number of region levels and the number of non-region levels.
- non-region levels can be defined for processing lower resolution levels and region levels can be defined for processing higher resolution levels.
- the implementation is not restricted by this distinction. It can be changed to regions over non-regions or to an interlaced combination of the two.
- the parameter that controls this distinction is termed the region start level. It can be set to any valid wavelet transform level (it is set internally to -1 to disable region processing.)
- the MUX technology can be used in exclusive non-region mode, exclusive region mode or a combination mixed mode of operation.
- bit budget distribution functions work as they did before, but in this case they exist simultaneously. In some modes of operation there may be as many as 15 bit budget definitions used to organize the final bit stream. The method used to determine them has not changed. The same ratio technique that has its basis in the MUX list structure is used to determine the appropriate allocation in each case.
- An importance factor can be attached to the non-region levels.
- the net effect of this parameter is to taper the amount of data included for the non-region levels.
- the non-region levels are processed first. Thus they are considered first by the distribution function.
- the importance factor allows the user to decrease the amount of data included for the non-region levels by a certain percentage.
- the delta amount is considered in the bit budget distribution for the region levels.
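- The tapering can be expressed in a few lines; applying the importance factor as a single fractional reduction of the non-region allocation is the straightforward reading of the description above.

```python
def mixed_mode_budgets(non_region_budget, region_budget, importance_factor):
    """Taper the non-region (lower resolution) allocation by the importance
    factor and hand the freed delta to the region levels."""
    delta = non_region_budget * importance_factor   # e.g. 0.10 for a 10 percent reduction
    return non_region_budget - delta, region_budget + delta
```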
- the region processing overhead is slightly smaller in this case. Depending on the region start level parameter, there will be less region header information required in the bit stream since there are fewer region levels. However, the header overhead for the lower resolution levels is quite small anyway.
- the region channel coverage is not an exact overlay. All masks used to group or categorize data must deal with the down sampling issue at different resolution levels, and the tight overhead restraints of the compression channel.
- the DCT region detection mode can be applied to threshold wavelet coefficients to form region channels.
- the amount of data packed for each channel can be controlled by the MUX.
- the initial data split used to create the partition can be controlled before the DCT mask procedure begins.
- the DCT approach shows much promise especially on the highest resolution level where the masks are the most accurate.
- One of the benefits of using the DCT approach is that it can be translated to other transform levels in the frequency domain. Or alternatively, the new masks could be generated at a different wavelet level.
- MUX modes are currently under development.
- the experimental results presented in this Chapter were obtained using DAC's 1D bit level sorting implementation.
- DAC is currently incorporating the 2D version based on EQW into the region processing channel.
- EQW sorting is implemented for normal (non-region) processing MUX modes.
- Both sorting modes are available for lossless compression.
- selectable wavelet kernels and color transforms are available for both lossy and lossless compression.
- the number of region channels can be set with a current maximum of 4. Primitives can be used as desired.
- the number of primitives is not restricted to 4; region overlaps are assigned to the most important region.
- Bit Stream Syntax DAC's current bit stream structure is rather dynamic given the different types of organizational strategies that exist in the underlying core technology: normal, region, and mixed processing modes, in addition to arbitrary wavelet, color, and entropy coding choices and various sorting stages in both lossy and lossless cases.
- bit stream syntax depends largely on the mode of operation selected by the user (i.e. regions, no regions, sorting, lossy/lossless etc.)
- normal processing modes can be selected for all transform levels, or for any number of lower resolution transform levels.
- Region levels can be defined for all transform levels, or the higher resolution levels used in combination with the lower resolution normal levels.
- the region and the normal levels can be mixed.
- Each unit list organized by the MUX has a header tag. This header tag carries the list size and the high bit plane processing level for the list.
- the current implementation uses 5 bytes per list. However, bit packing of the tag headers is currently under implementation. This will reduce the overhead by a significant amount.
- FIG. 53 A diagram illustrating the structure of the tag header is given in Figure 53. As other core technologies are added to the core engine or the core technology advances, other fields may be required in the tag headers.
- FIG 54 The basic structure for normal modes of operation is given in Figure 54.
- the diagram illustrates a lossy pack arrangement for a color image (YUV down sampled color transform assumed) according to the packing/processing order given in Figure 30 above.
- the code stream consists of the file header, followed by the header tags/data.
- FIG 55 The basic structure for region processing modes of operation is given in Figure 55.
- the diagram illustrates a lossy pack arrangement for a color image (YUV down sampled color transform assumed) according to the packing/processing order given in Figure 40 above.
- the code stream consists of the file header, region header, regions description and finally the header tags/data.
- JBIG bi-level images
- Figure 57 shows how a DCT-based code can be produced in RICS. With this transcodability, it will be straightforward to write a small conversion program to transcode an old JPEG file into a JPEG 2000 file. Although it is possible to transcode a JPEG 2000 file into a JPEG file (losing some scalability), it will probably be of very little use.
- Figure 58 shows the execution path illustrating how a bi-level image such as a text document can be efficiently handled in the RICS system.
- the NULL transform is applied (nothing has to be done).
- binary text documents contain mostly high frequency energy.
- Multi-resolution decompositions will not necessarily be a suitable basis for efficient coding.
- current well-known techniques specialized for binary images are applied directly in the spatial domain. While using a RICS system for processing a compound document with mixed grayscale/color/text information, a set of rectangular shapes suffices to enclose the text regions. This is roughly equivalent to run-length coding in existing binary image compression techniques to skip strings of zeros.
- the pixels within each rectangular region are then coded into a one- dimensional stream via 1-D algorithms or JBIG routines, as discussed above.
- de-ringing is usually a useful post processing procedure to improve (mainly) the visual quality.
- Nonadaptive procedures, which apply the de-ringing filtering to all pixels without discrimination, have the following problems. • While they can successfully remove ringing artifacts, they may also wash out details of the image. • They usually rely on too many parameters that are to be determined by human users.
- the RICS system employs an adaptive de-ringing algorithm for post processing. Since the ringing artifacts usually appear around edges where sharp changes occur, this adaptive filtering process is applied only to edge areas. As a result, the post processing removes artifacts around edges and prevents fine details from being smoothed out by the filter. Compared with non-adaptive methods, the processing time is remarkably reduced since the filtering is applied selectively to pixels.
- the diagram of Figure 59 shows the algorithm of the adaptive post processing filter. First of all, the pixels of reconstructed image are classified into edge pixels and non-edge pixels. Edge pixels are enlarged into edge regions since the ringing artifacts do not occur right at the edges but around the edges. Then the artifact removal filter is applied only to the edge areas.
- Figure 60 shows the result of the modified post processing. Artifacts are clearly visible in the first picture especially behind the cameraman's back and around the tripods. As we can see from the second picture, artifacts are removed, however, the details in the grass field and on the man's pants are also removed. In the last picture, the modified filter successfully removes artifacts and at the same time retains the details.
- Figure 61 shows the edge area (in white color) that was used in producing the third picture.
- the average power difference between the pixel and its neighboring pixels is measured against a threshold (bottom threshold).
- the threshold acts as a high pass filter that filters out the pixels with low power levels (Band pass filters can also be used for edge detection by introducing a proper ceiling threshold).
- Edges are thickened to form edge areas after the edge pixels are detected.
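- A sketch of the edge-area classification follows; the 4-neighbour average difference, the box dilation used for thickening and the NumPy representation are illustrative assumptions, with default values taken from the parameter list given later in this Section.

```python
import numpy as np

def edge_area(image: np.ndarray, bottom_threshold: float = 30.0,
              thicken: int = 12) -> np.ndarray:
    """Classify edge pixels by comparing the average power difference against a
    bottom threshold, then thicken them into edge areas."""
    img = image.astype(np.float32)
    # Average absolute difference between each pixel and its 4 neighbours.
    diff = np.zeros_like(img)
    diff[1:-1, 1:-1] = (abs(img[1:-1, 1:-1] - img[:-2, 1:-1]) +
                        abs(img[1:-1, 1:-1] - img[2:, 1:-1]) +
                        abs(img[1:-1, 1:-1] - img[1:-1, :-2]) +
                        abs(img[1:-1, 1:-1] - img[1:-1, 2:])) / 4.0
    edges = diff > bottom_threshold           # acts as a high pass filter
    # Thicken edge pixels into edge areas with a simple box dilation.
    area = np.zeros_like(edges)
    r = thicken // 2
    for y, x in zip(*np.nonzero(edges)):
        area[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1] = True
    return area
```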
- this modified filtering process can be applied to reconstructed images with compression ratios as low as 8:1. Since for most wavelet coded images no artifacts can be visually detected for compression ratios lower than 8:1, our results suggest that this filter can be applied to most reconstructed pictures regardless of their compression ratio.
- Quadratic Truncation: 1 comparison and 1 float multiplication
- Quadratic Truncation is the fastest and it performs the best for artifact removal with the least degradation to image quality. Therefore, the other two potential functions will not be used.
- Threshold (the threshold parameter in the potential function): the threshold parameter is examined using the quadratic truncation potential function (default value is 16).
- Filter length (F) determines the size of the sample set collected for pixel estimation. It is obvious that the larger the filter length, the longer the processing time.
- the default length given by VM3.0 (B) is 9 pixels.
- the default value for the filter length is chosen to be 7 pixels to increase image quality and to decrease processing time without degrading the ability for artifact removal.
- the adaptive de-ringing is implemented based on the post processing filter in VM3.0 (B). A few parameters have been eliminated and default values have been established to increase the usability of the post processing filter.
- Default values will be used for the parameters if the parameters are not specified in the command line.
- the default parameters are:
- mask lower threshold: lower threshold for edge detection [30]
- mask width: width of mask [12]
- thresh: threshold parameter in the potential function (for estimation) [12]
- f_length: filter length [7]
- constraints: constraints used in the clipping function [8]
- iterations: number of iterations [2]
- the modified post processing filter performs well with the above default parameters; however, these parameters can be changed at the command line if the user wishes.
- the post processing filter can now be applied to pictures with 8:1 compression ratio with very small increase in MSE.
- the modified post processing filter seems to improve image quality for compression ratio beyond 11:1.
- Decoder end The decoder will have full control of the post processing and of the associated parameters.
- the encoder will have no control over how the images will look at the decoder end. Accordingly, there will be no addition to the file header since no post processing parameters will be included.
- the encoder can also predetermine the post processing filter parameters.
- Post processing filter parameters will be stored in the image header, and the image will be restored according to these parameters.
- the user at the encoding end knows exactly how the image will look at the decoder end.
- adding these parameters in the header will increase the size of the compressed image.
- A total of 6 parameters will need to be packed in the header: mask lower threshold, mask width, estimation threshold, filter length, constraints and number of iterations.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Discrete Mathematics (AREA)
- Computing Systems (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Compression Of Band Width Or Redundancy In Fax (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
Description
Claims
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA 2363273 CA2363273A1 (en) | 1999-02-15 | 2000-02-15 | Method and system of region-based image coding with dynamic streaming of code blocks |
AU25299/00A AU2529900A (en) | 1999-02-15 | 2000-02-15 | Method and system of region-based image coding with dynamic streaming of code blocks |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA2,261,833 | 1999-02-15 | ||
CA002261833A CA2261833A1 (en) | 1999-02-15 | 1999-02-15 | Method and system of region-based image coding with dynamic streaming of code blocks |
Publications (3)
Publication Number | Publication Date |
---|---|
WO2000049571A2 true WO2000049571A2 (en) | 2000-08-24 |
WO2000049571A3 WO2000049571A3 (en) | 2001-04-05 |
WO2000049571A9 WO2000049571A9 (en) | 2001-06-21 |
Family
ID=4163291
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CA2000/000134 WO2000049571A2 (en) | 1999-02-15 | 2000-02-15 | Method and system of region-based image coding with dynamic streaming of code blocks |
Country Status (3)
Country | Link |
---|---|
AU (1) | AU2529900A (en) |
CA (1) | CA2261833A1 (en) |
WO (1) | WO2000049571A2 (en) |
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1274247A3 (en) * | 2001-06-27 | 2003-04-16 | Ricoh Company, Ltd. | JPEG 2000 for efficient imaging in a client/server environment |
FR2849329A1 (en) * | 2002-12-20 | 2004-06-25 | France Telecom | Video sequence successive image coding having an image mesh divided and wavelet coded with two wavelet types applied distinct image zones |
GB2395855B (en) * | 2002-11-28 | 2005-07-13 | Rolls Royce Plc | Wavelet compression |
FR2870615A1 (en) * | 2004-05-18 | 2005-11-25 | Canon Kk | METHODS AND DEVICES FOR HANDLING, TRANSMITTING AND DISPLAYING DIGITAL IMAGES |
US7116833B2 (en) | 2002-12-23 | 2006-10-03 | Eastman Kodak Company | Method of transmitting selected regions of interest of digital video data at selected resolutions |
EP1871094A1 (en) * | 2005-04-12 | 2007-12-26 | Olympus Corporation | Image processor, imaging apparatus, and image processing program |
EP1882233A1 (en) * | 2005-05-18 | 2008-01-30 | DTS (BVI) AZ Research Limited | Rate control of scalably coded images |
WO2009073730A2 (en) | 2007-12-03 | 2009-06-11 | Samplify Systems, Inc. | Compression and decompression of computed tomography data |
US8285062B2 (en) | 2009-08-05 | 2012-10-09 | Sony Corporation | Method for improving the performance of embedded graphics coding |
EP2187645A3 (en) * | 2007-07-18 | 2012-10-24 | Humax Co., Ltd. | Adaptive bit-precision entropy coding |
US8457425B2 (en) | 2009-06-09 | 2013-06-04 | Sony Corporation | Embedded graphics coding for images with sparse histograms |
US8964851B2 (en) | 2009-06-09 | 2015-02-24 | Sony Corporation | Dual-mode compression of images and videos for reliable real-time transmission |
US9185424B2 (en) | 2011-07-05 | 2015-11-10 | Qualcomm Incorporated | Image data compression |
US9294782B1 (en) | 2014-10-28 | 2016-03-22 | Sony Corporation | Image processing system with artifact reduction mechanism and method of operation thereof |
US9357237B2 (en) | 2014-10-28 | 2016-05-31 | Sony Corporation | Image processing system with bitstream reduction and method of operation thereof |
US9357232B2 (en) | 2014-10-28 | 2016-05-31 | Sony Corporation | Image processing system with binary decomposition and method of operation thereof |
US9674554B2 (en) | 2014-10-28 | 2017-06-06 | Sony Corporation | Image processing system with coding mode and method of operation thereof |
EP3185554A1 (en) * | 2015-12-21 | 2017-06-28 | Alcatel Lucent | Devices for video encoding and reconstruction with adaptive quantization |
US10063889B2 (en) | 2014-10-28 | 2018-08-28 | Sony Corporation | Image processing system with conditional coding and method of operation thereof |
US10223810B2 (en) | 2016-05-28 | 2019-03-05 | Microsoft Technology Licensing, Llc | Region-adaptive hierarchical transform and entropy coding for point cloud compression, and corresponding decompression |
US10356410B2 (en) | 2014-10-28 | 2019-07-16 | Sony Corporation | Image processing system with joint encoding and method of operation thereof |
US10694210B2 (en) | 2016-05-28 | 2020-06-23 | Microsoft Technology Licensing, Llc | Scalable point cloud compression with transform, and corresponding decompression |
CN113554722A (en) * | 2021-07-22 | 2021-10-26 | 辽宁科技大学 | Improved EZW-based image compression method for crown word number of RMB banknote |
US11297346B2 (en) | 2016-05-28 | 2022-04-05 | Microsoft Technology Licensing, Llc | Motion-compensated compression of dynamic voxelized point clouds |
CN114332261A (en) * | 2021-12-31 | 2022-04-12 | 易思维(杭州)科技有限公司 | Picture compression method |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109712068A (en) * | 2018-12-21 | 2019-05-03 | 云南大学 | Image Style Transfer and analogy method for cucurbit pyrography |
CN113012113B (en) * | 2021-03-01 | 2023-04-07 | 和远智能科技股份有限公司 | Automatic detection method for bolt looseness of high-speed rail contact network power supply equipment |
- 1999
  - 1999-02-15 CA CA002261833A patent/CA2261833A1/en not_active Abandoned
- 2000
  - 2000-02-15 WO PCT/CA2000/000134 patent/WO2000049571A2/en active Application Filing
  - 2000-02-15 AU AU25299/00A patent/AU2529900A/en not_active Abandoned
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5563960A (en) * | 1993-01-22 | 1996-10-08 | David Sarnoff Research Center, Inc. | Apparatus and method for emphasizing a selected region in the compressed representation of an image |
WO2000004721A1 (en) * | 1998-07-15 | 2000-01-27 | Digital Accelerator Corporation | Region-based scalable image coding |
Non-Patent Citations (4)
Title |
---|
EGGER O ET AL: "ARBITRARILY-SHAPED WAVELET PACKETS FOR ZEROTREE CODING" IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING - PROCEEDINGS. (ICASSP),US,NEW YORK, IEEE, vol. CONF. 21, 1996, pages 2335-2338, XP000681695 ISBN: 0-7803-3193-1 * |
FRAJKA T ET AL: "PROGRESSIVE IMAGE CODING WITH SPATIALLY VARIABLE RESOLUTION" PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING,US,LOS ALAMITOS, CA: IEEE,1997, pages 53-56, XP000792714 ISBN: 0-8186-8184-5 * |
TAUBMAN D: "High performance scalable image compression with EBCOT" PROCEEDINGS 1999 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (CAT. 99CH36348), PROCEEDINGS OF 6TH INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP'99), KOBE, JAPAN, 24-28 OCT. 1999, pages 344-348 vol.3, XP000921211 1999, Piscataway, NJ, USA, IEEE, USA ISBN: 0-7803-5467-2 * |
TAUBMAN D: "JPEG 2000 Verification Model VM3A, ISO/IEC JTC1/SC29/WG1 N1143" ISO/IEC JTC1/SC29/WG1: CODING OF STILL PICTURES - JBIG,JPEG, 1 February 1999 (1999-02-01), pages 1-89, XP002146099 cited in the application * |
Cited By (38)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1274247A3 (en) * | 2001-06-27 | 2003-04-16 | Ricoh Company, Ltd. | JPEG 2000 for efficient imaging in a client/server environment |
GB2395855B (en) * | 2002-11-28 | 2005-07-13 | Rolls Royce Plc | Wavelet compression |
WO2004059982A1 (en) * | 2002-12-20 | 2004-07-15 | France Telecom | Wavelet image-encoding method and corresponding decoding method |
CN100588252C (en) * | 2002-12-20 | 2010-02-03 | 法国电信公司 | Wavelet image-encoding method and corresponding decoding method |
FR2849329A1 (en) * | 2002-12-20 | 2004-06-25 | France Telecom | Video sequence successive image coding having an image mesh divided and wavelet coded with two wavelet types applied distinct image zones |
US7512283B2 (en) | 2002-12-23 | 2009-03-31 | Eastman Kodak Company | Method of transmitting selected regions of interest of digital video data at selected resolutions |
US7116833B2 (en) | 2002-12-23 | 2006-10-03 | Eastman Kodak Company | Method of transmitting selected regions of interest of digital video data at selected resolutions |
FR2870615A1 (en) * | 2004-05-18 | 2005-11-25 | Canon Kk | METHODS AND DEVICES FOR HANDLING, TRANSMITTING AND DISPLAYING DIGITAL IMAGES |
US7610334B2 (en) | 2004-05-18 | 2009-10-27 | Canon Kabushiki Kaisha | Method and device for distributing digital data in particular for a peer-to-peer network |
EP1871094A1 (en) * | 2005-04-12 | 2007-12-26 | Olympus Corporation | Image processor, imaging apparatus, and image processing program |
EP1871094A4 (en) * | 2005-04-12 | 2010-01-06 | Olympus Corp | Image processor, imaging apparatus, and image processing program |
EP1882233A1 (en) * | 2005-05-18 | 2008-01-30 | DTS (BVI) AZ Research Limited | Rate control of scalably coded images |
EP1882233A4 (en) * | 2005-05-18 | 2009-11-04 | Dts Bvi Az Res Ltd | Rate control of scalably coded images |
US7668380B2 (en) | 2005-05-18 | 2010-02-23 | Dts, Inc. | Rate control of scalably coded images |
EP2187645A3 (en) * | 2007-07-18 | 2012-10-24 | Humax Co., Ltd. | Adaptive bit-precision entropy coding |
EP2217149A4 (en) * | 2007-12-03 | 2012-11-14 | Samplify Systems Inc | Compression and decompression of computed tomography data |
EP2217149A2 (en) * | 2007-12-03 | 2010-08-18 | Samplify Systems, Inc. | Compression and decompression of computed tomography data |
WO2009073730A2 (en) | 2007-12-03 | 2009-06-11 | Samplify Systems, Inc. | Compression and decompression of computed tomography data |
US8457425B2 (en) | 2009-06-09 | 2013-06-04 | Sony Corporation | Embedded graphics coding for images with sparse histograms |
US8964851B2 (en) | 2009-06-09 | 2015-02-24 | Sony Corporation | Dual-mode compression of images and videos for reliable real-time transmission |
US8285062B2 (en) | 2009-08-05 | 2012-10-09 | Sony Corporation | Method for improving the performance of embedded graphics coding |
US9185424B2 (en) | 2011-07-05 | 2015-11-10 | Qualcomm Incorporated | Image data compression |
US9674554B2 (en) | 2014-10-28 | 2017-06-06 | Sony Corporation | Image processing system with coding mode and method of operation thereof |
US10063889B2 (en) | 2014-10-28 | 2018-08-28 | Sony Corporation | Image processing system with conditional coding and method of operation thereof |
US9357232B2 (en) | 2014-10-28 | 2016-05-31 | Sony Corporation | Image processing system with binary decomposition and method of operation thereof |
US9591330B2 (en) | 2014-10-28 | 2017-03-07 | Sony Corporation | Image processing system with binary adaptive Golomb coding and method of operation thereof |
US9294782B1 (en) | 2014-10-28 | 2016-03-22 | Sony Corporation | Image processing system with artifact reduction mechanism and method of operation thereof |
US10356410B2 (en) | 2014-10-28 | 2019-07-16 | Sony Corporation | Image processing system with joint encoding and method of operation thereof |
US9357237B2 (en) | 2014-10-28 | 2016-05-31 | Sony Corporation | Image processing system with bitstream reduction and method of operation thereof |
WO2017108573A1 (en) * | 2015-12-21 | 2017-06-29 | Alcatel Lucent | Devices for video encoding and reconstruction with adaptive quantization |
EP3185554A1 (en) * | 2015-12-21 | 2017-06-28 | Alcatel Lucent | Devices for video encoding and reconstruction with adaptive quantization |
US10223810B2 (en) | 2016-05-28 | 2019-03-05 | Microsoft Technology Licensing, Llc | Region-adaptive hierarchical transform and entropy coding for point cloud compression, and corresponding decompression |
US10694210B2 (en) | 2016-05-28 | 2020-06-23 | Microsoft Technology Licensing, Llc | Scalable point cloud compression with transform, and corresponding decompression |
US11297346B2 (en) | 2016-05-28 | 2022-04-05 | Microsoft Technology Licensing, Llc | Motion-compensated compression of dynamic voxelized point clouds |
CN113554722A (en) * | 2021-07-22 | 2021-10-26 | 辽宁科技大学 | Improved EZW-based image compression method for crown word number of RMB banknote |
CN113554722B (en) * | 2021-07-22 | 2023-08-15 | 辽宁科技大学 | Image compression method for crown word number of RMB paper currency based on improved EZW |
CN114332261A (en) * | 2021-12-31 | 2022-04-12 | 易思维(杭州)科技有限公司 | Picture compression method |
CN114332261B (en) * | 2021-12-31 | 2024-05-31 | 易思维(杭州)科技股份有限公司 | Picture compression method |
Also Published As
Publication number | Publication date |
---|---|
WO2000049571A9 (en) | 2001-06-21 |
CA2261833A1 (en) | 2000-08-15 |
WO2000049571A3 (en) | 2001-04-05 |
AU2529900A (en) | 2000-09-04 |
Similar Documents
Publication | Title |
---|---|
WO2000049571A2 (en) | Method and system of region-based image coding with dynamic streaming of code blocks |
EP0971544B1 (en) | An image coding method and apparatus for localised decoding at multiple resolutions |
US6941024B2 (en) | Coder matched layer separation and interpolation for compression of compound documents |
JP3853758B2 (en) | Image encoding device |
US6236758B1 (en) | Apparatus and method for encoding wavelet trees by backward predictive coding of wavelet transformed coefficients |
EP1110180B1 (en) | Embedded quadtree wavelets image compression |
Walker et al. | Wavelet-based image compression |
US6597739B1 (en) | Three-dimensional shape-adaptive wavelet transform for efficient object-based video coding |
Nguyen et al. | Rapid high quality compression of volume data for visualization |
EP1461941A1 (en) | Code matched layer separation for compression of compound documents |
Pan et al. | A fast and low memory image coding algorithm based on lifting wavelet transform and modified SPIHT |
Martin et al. | SPIHT-based coding of the shape and texture of arbitrarily shaped visual objects |
EP1095519B1 (en) | Region-based scalable image coding |
CA2363273A1 (en) | Method and system of region-based image coding with dynamic streaming of code blocks |
EP0920213A2 (en) | Method and apparatus for decoding transform coefficients |
Ranjan et al. | An Efficient Compression of Gray Scale Images Using Wavelet Transform |
AU708489B2 (en) | A method and apparatus for digital data compression |
Al-Sammaraie | Medical Images Compression Using Modified SPIHT Algorithm and Multiwavelets Transformation |
AU725719B2 (en) | A method of digital image compression |
Yin et al. | Archive image communication with improved compression |
Vrindavanam et al. | Wavelet and JPEG based image compression: an experimental analysis |
AU719749B2 (en) | A method for digital data compression |
Yew | Detail preserving image compression using wavelet transform |
AU736469B2 (en) | An image coding method and apparatus for localized decoding at multiple resolutions |
AU727434B2 (en) | Method and apparatus for decoding |
Legal Events
Code | Title | Description |
---|---|---|
AK | Designated states | Kind code of ref document: A2; Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
AL | Designated countries for regional patents | Kind code of ref document: A2; Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
AK | Designated states | Kind code of ref document: A3; Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
AL | Designated countries for regional patents | Kind code of ref document: A3; Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
AK | Designated states | Kind code of ref document: C2; Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
AL | Designated countries for regional patents | Kind code of ref document: C2; Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
COP | Corrected version of pamphlet | Free format text: PAGES 1-73, DESCRIPTION, REPLACED BY NEW PAGES 1-74; PAGES 74-76, CLAIMS, REPLACED BY NEW PAGES 75-77; PAGES 1/35-35/35, DRAWINGS, REPLACED BY NEW PAGES 1/35-35/35 |
ENP | Entry into the national phase | Ref document number: 2363273; Country of ref document: CA; Kind code of ref document: A |
WWE | Wipo information: entry into national phase | Ref document number: 2000903464; Country of ref document: EP |
WWW | Wipo information: withdrawn in national office | Ref document number: 2000903464; Country of ref document: EP |
REG | Reference to national code | Ref country code: DE; Ref legal event code: 8642 |
122 | Ep: pct application non-entry in european phase | |