WO2003056518A1 - Image compression - Google Patents

Image compression

Info

Publication number
WO2003056518A1
Authority
WO
WIPO (PCT)
Application number
PCT/CA2003/000036
Other languages
French (fr)
Other versions
WO2003056518A9 (en)
Inventor
Philippe St-Jean
Original Assignee
Zeugma Technologies Inc.
Application filed by Zeugma Technologies Inc.
Priority to AU2003201243A1
Publication of WO2003056518A1
Publication of WO2003056518A9

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/001 Model-based coding, e.g. wire frame

Definitions

  • a 6-connected dilation LSE operator is defined as follows: the 3 maxima along straight lines crossing a pixel are found, pixels 1-4-7, 2-5-7 and 3-6-7 being grouped as shown in Fig. 7; then the minimum value of these 3 maxima is taken.
  • the corresponding 6-connected erosion LSE is defined respectively as the maximum of the 3 minima.
  • the 6-COCO-LSE operator is then defined as the composition of a 6-connected erosion LSE followed by two iterations of 6-connected dilation LSE and one more 6-connected erosion LSE.
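Read literally (with pixel 7 as the centre and pixels 1-6 the neighbours of Figs. 6 and 7), these operators could be sketched as below. The brick-wall parity convention in `_line_triplets` is an assumption, since the figure is not reproduced here:

```python
import numpy as np

def _line_triplets(r, c):
    # The three line structuring elements through a pixel: each pairs two
    # diametrically opposite neighbours with the centre (assumed from Fig. 7).
    s = 0 if r % 2 else -1                      # assumed parity convention
    left, right = (r, c - 1), (r, c + 1)
    ul, ur = (r - 1, c + s), (r - 1, c + s + 1)
    ll, lr = (r + 1, c + s), (r + 1, c + s + 1)
    return [(left, right), (ul, lr), (ur, ll)]

def dilate_lse(img):
    """Dilation LSE: max along each of the 3 lines, then the min of the 3 maxima."""
    rows, cols = img.shape
    out = img.copy()
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            maxima = [max(img[a], img[b], img[r, c])
                      for a, b in _line_triplets(r, c)]
            out[r, c] = min(maxima)
    return out

def erode_lse(img):
    """The corresponding erosion LSE: the maximum of the 3 line minima."""
    rows, cols = img.shape
    out = img.copy()
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            minima = [min(img[a], img[b], img[r, c])
                      for a, b in _line_triplets(r, c)]
            out[r, c] = max(minima)
    return out

def coco_lse(img):
    """6-COCO-LSE: erosion LSE, two dilation LSE passes, one more erosion LSE."""
    return erode_lse(dilate_lse(dilate_lse(erode_lse(img))))
```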
  • the Masked Opening-Closing Operator takes both the H-image and the MADNM-image as inputs.
  • the MOCO-image is fed to a unicity operator which ensures that no two neighboring pixels have the exact same intensity value.
  • This operator consists of recursive iterations of matrix multiplication of the image with an N×N unitary matrix with most of its energy found on the diagonal, followed by testing for neighbor equalities.
  • Effective radius is an extra controllable parameter of the pre-processing. It allows every dilation and erosion to be iterated more than once in order to spatially extend the effect of the morphological filtering.
  • the output of these operations is a pre-processed image.
  • An example of such a pre-processed image is shown in Fig. 8.
  • the critical point detection step requires the pre-processed image as input.
  • a neighbor difference 6-bit image is created by computing the intensity difference between every pixel and its six neighbors, and keeping only the six signs in a 6-bit entry on each pixel (-1 for negative, 1 for positive).
  • the critical points can be detected from this 6-bit image, as local maxima (resp. minima) of the image correspond to pixels where all six values are 1 (resp. -1), while saddle points correspond to pixels where four or six sign changes are observed while visiting the six bits circularly. This is shown in Fig. 9.
  • Critical points are detected by computing the number of sign changes: zero sign changes indicate a maximum or a minimum, while four (or six) sign changes indicate a saddle point. This results in a critical point image, where all pixels are given intensity zero except those representing a critical point (of any kind), which are given a value of one.
  • An example of the critical points for the example shown earlier is shown in Fig. 10.
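The sign-change test is easy to transcribe. A minimal sketch follows; the circular ordering of the six neighbours assumes a brick-wall parity convention (the patent's Fig. 6 fixes the real indexing), and the stored sign is that of the centre-minus-neighbour difference:

```python
import numpy as np

def hex_neighbors_circular(r, c):
    # Six 6-connected neighbours in circular order on the brick-wall grid;
    # the parity convention here is an assumption.
    s = 0 if r % 2 else -1
    return [(r, c + 1), (r - 1, c + s + 1), (r - 1, c + s),
            (r, c - 1), (r + 1, c + s), (r + 1, c + s + 1)]

def critical_point_image(img):
    """1 on maxima, minima and saddle points, 0 elsewhere."""
    rows, cols = img.shape
    out = np.zeros((rows, cols), dtype=np.uint8)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # sign of (centre - neighbour); the unicity transform guarantees
            # no ties, so a strict comparison is safe
            signs = [1 if img[r, c] > img[i, j] else -1
                     for i, j in hex_neighbors_circular(r, c)]
            changes = sum(signs[k] != signs[k - 1] for k in range(6))  # circular
            if changes == 0 or changes >= 4:   # extremum, or 4/6-change saddle
                out[r, c] = 1
    return out
```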
  • the edge detection requires, as does the critical point detection, a pre-processed image as input in the preferred embodiment of the present invention; however, someone skilled in the art will understand that the choice of parameters is not necessarily the same as the one that was used to generate the pre-processed image used for the critical point detection.
  • the detection is achieved through a seeding-growing algorithm.
  • a Maximum Difference of Nearest-neighbor image (MDN-image) is first computed from the pre-processed image provided for the edge detection.
  • An example of this step is shown in Fig. 11.
  • the Maximum Difference of Nearest-neighbor image is obtained by computing the intensity differences between every pixel and its 6 neighbors, and by taking the maximum of these differences, without taking the absolute value.
  • This last image is thresholded so that all values below the threshold (typically 20 for an 8-bit depth original image) are set to zero, while others remain unchanged.
  • the local maxima of this MDN-image are detected using the same approach as described above, and only those maxima above a higher threshold (typically 40 in the preferred embodiment of the invention) are kept as seeds for the growing algorithm.
  • the growing algorithm begins by arbitrarily numbering these local maxima and choosing the first as the active pixel. Every time a pixel is made active, its coordinates are marked with an index corresponding to the seed number in a visited-mask image.
  • the algorithm looks for the neighbor with the highest MDN value (this value cannot be zero), and makes it the new active pixel (marking it as well with the same index). Again the maximum neighbor is looked for, although it is required not to be the preceding point nor one of its neighbors. The algorithm stops when there is no next-point candidate left (when all neighbors are either zero-valued or are neighbors of the preceding point).
  • the algorithm then comes back to the seed, and starts a second branch assuming that the second point of the first branch was the preceding point. Again, this ends when there is no more next point candidate.
  • the list of points forming the edge is then reordered so that the end point of the first branch becomes the beginning of the list, reversing order on the first branch until the seed is reached, while the second branch is simply concatenated to the first. This is done for every seed, by first testing that the seed has not been visited before (looking up in the visited-mask image).
  • the output of the edge detector is a list of edges where each edge is itself a list of connected point coordinates forming the edge. Fig. 12 shows the result of this step.
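Condensed into one sketch, the MDN computation, the two thresholds and the branch-growing loop might look like the following. The neighbour indexing is the same assumed parity convention as above, and the rule banning the preceding point's neighbours is simplified (the visited mask also stands in for the patent's per-seed indices):

```python
import numpy as np

def hex_nbrs(r, c, shape):
    s = 0 if r % 2 else -1                    # assumed parity convention (Fig. 6)
    cand = [(r, c - 1), (r, c + 1), (r - 1, c + s), (r - 1, c + s + 1),
            (r + 1, c + s), (r + 1, c + s + 1)]
    return [(i, j) for i, j in cand if 0 <= i < shape[0] and 0 <= j < shape[1]]

def mdn_image(img):
    """Maximum signed difference to the six nearest neighbours (no absolute value)."""
    out = np.zeros(img.shape)
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            out[r, c] = max(img[r, c] - img[i, j]
                            for i, j in hex_nbrs(r, c, img.shape))
    return out

def grow_edges(img, low=20, high=40):
    mdn = mdn_image(img.astype(float))
    mdn[mdn < low] = 0.0                      # low threshold (typically 20)
    # seeds: local maxima of the MDN image above the higher threshold
    seeds = [(r, c) for r in range(img.shape[0]) for c in range(img.shape[1])
             if mdn[r, c] >= high and
             mdn[r, c] >= max(mdn[i, j] for i, j in hex_nbrs(r, c, img.shape))]
    visited = np.zeros(img.shape, dtype=bool)
    edges = []
    for seed in seeds:
        if visited[seed]:
            continue
        branches = []
        for _ in range(2):                    # grow two branches from the seed
            path, prev, cur = [seed], None, seed
            visited[seed] = True
            while True:
                banned = set(hex_nbrs(*prev, img.shape)) | {prev} if prev else set()
                cand = [p for p in hex_nbrs(*cur, img.shape)
                        if mdn[p] > 0 and not visited[p] and p not in banned]
                if not cand:                  # no next-point candidate left
                    break
                prev, cur = cur, max(cand, key=lambda p: mdn[p])
                visited[cur] = True
                path.append(cur)
            branches.append(path)
        # reverse branch 1 so the edge runs end-to-end through the seed
        edges.append(branches[0][::-1] + branches[1][1:])
    return edges
```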
  • the reduction step refers to the complete elimination of certain edges and critical points from the compressed image data.
  • a prior reduction is obtained by computing the length of each edge and removing from the list the ones with length shorter than a predetermined threshold (typically 5 pixels in the preferred embodiment of the invention).
  • the edges are also geometrically extended to cover small areas around them, as shown in figure 13b. The extent of these areas on both sides of the edges is an external parameter (typically 0.2 pixel width in the preferred embodiment of the invention).
  • the critical points and the edge points are then joined in a list that is fed to a constrained Delaunay triangulation algorithm as shown in figure 14. This makes it possible to define a new nearest-neighborhood for all points (critical and edge).
  • a Maximum Absolute Difference list is obtained by computing maximum intensity differences with those new neighbors, and critical points with MAD values smaller than a given threshold (typically 5 in the preferred embodiment of the invention) are removed from the list of critical points.
  • the number of critical points is then further reduced by organizing them into one-dimensional structures (which can be seen as long ridges or river valleys), and storing them locally in the exact same way as the edge list.
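The triangulation-based pruning of the two preceding paragraphs can be approximated as follows. scipy offers no constrained Delaunay triangulation, so the edge-segment constraints are ignored in this sketch, and the MAD test is applied uniformly to the points passed in:

```python
import numpy as np
from scipy.spatial import Delaunay

def prune_critical_points(points, intensities, threshold=5.0):
    """Drop points whose Maximum Absolute Difference (MAD) of intensity with
    respect to their Delaunay neighbours is below `threshold` (typically 5).

    `points` is an (N, 2) array of coordinates and `intensities` the matching
    grey values.  Returns the indices of the points to keep."""
    tri = Delaunay(np.asarray(points, dtype=float))
    nbrs = [set() for _ in range(len(points))]
    for simplex in tri.simplices:            # each simplex lists 3 vertex ids
        for a in simplex:
            for b in simplex:
                if a != b:
                    nbrs[a].add(b)
    keep = []
    for i, nb in enumerate(nbrs):
        mad = max(abs(intensities[i] - intensities[j]) for j in nb)
        if mad >= threshold:
            keep.append(i)
    return np.asarray(keep)
```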
  • the edges are simplified by keeping only a few points for each edge; these kept points are sufficient to represent the edges well geometrically.
  • This simplification is performed using a standard Douglas-Peucker algorithm (David H. Douglas and Thomas K. Peucker, "Algorithms for the reduction of the number of points required to represent a digitized line or its caricature", The Canadian Cartographer, 10(2):112-122, Dec. 1973), with an interrupt criterion based on a maximal distance error (typically 2 pixel widths in the preferred embodiment of the invention).
  • This simplification step is also applied to ridges of critical points. An example of the simplification is shown in Fig. 15.
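The Douglas-Peucker recursion itself is standard; a version with the 2-pixel-width interrupt criterion:

```python
import numpy as np

def douglas_peucker(points, tol=2.0):
    """Polyline simplification: recursively keep the point farthest from the
    chord, stopping once the maximal distance error drops to `tol`."""
    pts = np.asarray(points, dtype=float)
    if len(pts) <= 2:
        return pts
    a, b = pts[0], pts[-1]
    ab = b - a
    norm = np.hypot(*ab)
    if norm == 0:                       # closed or degenerate chord
        dists = np.hypot(*(pts - a).T)
    else:                               # perpendicular distance to chord a-b
        dists = np.abs(ab[0] * (pts[:, 1] - a[1])
                       - ab[1] * (pts[:, 0] - a[0])) / norm
    k = int(np.argmax(dists))
    if dists[k] <= tol:                 # interrupt criterion: max error <= tol
        return np.array([a, b])
    left = douglas_peucker(pts[:k + 1], tol)
    right = douglas_peucker(pts[k:], tol)
    return np.vstack([left[:-1], right])
```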
  • the lists for left and right sides of edges and the list for critical points are fed into a Delaunay triangulation algorithm, which is then used in a Minimal Spanning Tree algorithm (F. James Rohlf, "A probabilistic Minimum Spanning Tree algorithm", Information Processing Letters, Vol. 7, No. 1, January 1978, pp. 44-48).
  • the connectivity found in the lists for edges and for ridges is considered as constraints for the Delaunay triangulation.
  • the minimal spanning tree yields an acyclic simply connected graph joining all points found in the 3 lists altogether.
  • Figure 16 shows the result of such step on the example.
  • This tree is then ordered using a standard parenthood algorithm. From this directed tree, a list of intensity differences between every critical point and its sole parent is computed. This list is then strongly quantized using a quantization table.
  • the table contains only 3 entries: zero (0), positive (+) and negative (-).
  • the table chosen is an external parameter. At reconstruction, these categories will represent different contrast values depending on whether they represent the difference between critical points or between left and right sides of edges. These contrast values are typically chosen by obtaining the median of absolute values of contrast for each of the different pair possibilities (edge-critical point, edge-edge, edge-ridge, etc.). If the table contains more than 3 entries, quantiles are used instead of the median.
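A simplified rendering of this quantization stage: a dense Euclidean MST stands in for the patent's modified, Delaunay-based nearest-neighbour variant, and every child-parent intensity difference collapses to the 3-entry table:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, breadth_first_order
from scipy.spatial.distance import pdist, squareform

def quantize_tree(points, intensities):
    """Connect the points with a Euclidean MST, orient it from node 0, and
    quantize each child-parent intensity difference to {-1, 0, +1}."""
    d = squareform(pdist(np.asarray(points, dtype=float)))
    mst = minimum_spanning_tree(d)               # sparse, undirected weights
    order, parents = breadth_first_order(mst, i_start=0, directed=False)
    symbols = {}
    for node in order[1:]:                       # the root has no parent
        diff = intensities[node] - intensities[parents[node]]
        symbols[node] = int(np.sign(diff))       # 3-entry table: -, 0, +
    return parents, symbols
```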
  • a list of points is created with the uppermost end point of all edges (lists for left and right sides are not used here), the uppermost end point of all ridges and all other individual critical points. This list is fed into a standard run-length algorithm. The positions of the lowermost end points of edges and ridges relative to the uppermost points are also put in a list, and intermediate points are stored using a reverse Douglas-Peucker algorithm in the preferred embodiment of the invention. These lists, along with the strongly quantized intensity list, are all entropy-coded using a standard Huffman code. They are concatenated with the Huffman tables and the contrast rules, and binary-coded into a file or a binary stream.
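The Huffman stage is the textbook algorithm; a generic table builder that works for any of the symbol lists above:

```python
import heapq
from collections import Counter

def huffman_table(stream):
    """Build a Huffman code (symbol -> bit string) for a symbol stream,
    e.g. the run-length vector or the quantized intensity differences."""
    freq = Counter(stream)
    if len(freq) == 1:                          # degenerate one-symbol stream
        return {next(iter(freq)): "0"}
    # heap entries: (weight, tiebreak, node), node = ("leaf", sym) or ("node", l, r)
    heap = [(w, i, ("leaf", s)) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        w1, _, n1 = heapq.heappop(heap)
        w2, _, n2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, count, ("node", n1, n2)))
        count += 1
    table = {}
    def walk(node, prefix):
        if node[0] == "leaf":
            table[node[1]] = prefix
        else:
            walk(node[1], prefix + "0")
            walk(node[2], prefix + "1")
    walk(heap[0][2], "")
    return table
```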
  • Reconstruction: decoding and tessellation
  • the decoding step is the composition of the decoding of the Huffman code, the unfolding of the run-length code and the reconstruction of the edges and ridges through the Douglas-Peucker coding. The edges representing left and right sides of compressed edges are reconstructed as well, with the same algorithm that was used in the reduction step.
  • the full tessellation is computed through the same Delaunay algorithm as in the quantization step; a Minimal Spanning Tree is also computed, which allows for the reconstruction of intensity values on all points (left and right sides of edges, ridges and isolated critical points).
  • contrast values used depend on the type of points involved in the child-parent relationship (e.g. points belonging to left and right side of the same edge, or pair of isolated critical points); these contrast values are listed in the contrast rules.
  • the interpolation step consists in filling the interior of the tessellation triangles according to intensity values; this is done on a grid of pixels whose resolution is an external parameter of the reconstruction.
  • the interpolation is linear: it is a simple planar interpolation on the 3 vertices of each triangle.
  • This last step results in an output image which can be stored or sent to a visualization unit.
  • referring to Fig. 17, there is shown an example of the result of this step.
  • the algorithm for the interpolation may be, for instance, Linear or Sub-Triangular Linear.
  • the Linear algorithm is a simple planar interpolation on the 3 vertices of each triangle. It is the fastest of the four, but might produce visible artifacts on the final image.
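The Linear mode amounts to fitting a plane through the three vertices and rasterizing the triangle's interior. A sketch, with `fill_triangle` a hypothetical helper name and degenerate triangles not handled:

```python
import numpy as np

def fill_triangle(canvas, verts, values):
    """Fit the plane z = a*x + b*y + c through three vertices (x, y) with
    intensities `values`, and evaluate it on every pixel inside the triangle."""
    v = np.asarray(verts, dtype=float)              # (3, 2) vertex positions
    A = np.column_stack([v, np.ones(3)])            # rows [x, y, 1]
    a, b, c = np.linalg.solve(A, np.asarray(values, dtype=float))
    T = np.column_stack([v[1] - v[0], v[2] - v[0]]) # barycentric basis
    Tinv = np.linalg.inv(T)                         # assumes non-degenerate
    xmin, ymin = np.maximum(np.floor(v.min(axis=0)), 0).astype(int)
    hi = np.array([canvas.shape[1] - 1, canvas.shape[0] - 1])
    xmax, ymax = np.minimum(np.ceil(v.max(axis=0)), hi).astype(int)
    for x in range(xmin, xmax + 1):
        for y in range(ymin, ymax + 1):
            l1, l2 = Tinv @ (np.array([x, y], dtype=float) - v[0])
            if l1 >= 0 and l2 >= 0 and l1 + l2 <= 1:  # pixel inside triangle
                canvas[y, x] = a * x + b * y + c
    return canvas
```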
  • the Sub-Triangular Linear algorithm divides each triangle into four smaller triangles, then computes intensity values for the 3 extra points created in the process, and applies the Linear algorithm on this new tessellation.
  • Extension to higher dimension
  • an N-dimensional neighborhood must be defined.
  • a rhombohedric neighborhood is used (i.e. a tight packing of spheres, which corresponds to a 12-nearest neighborhood).
  • a generalized version is used, i.e. a tight packing of N-spheres.
  • the "growing" part of the algorithm becomes a region-growing algorithm on an (N-1)-dimensional sub-manifold. It is also possible to use the 2-dimensional method on a family of 2-D sub-manifolds (e.g. frames in a video) and to further connect together neighboring edges in the extra dimensions. This choice is an external parameter.
  • Compression: reduction
  • an N-dimensional Delaunay triangulation may be used instead of the standard 2-dimensional one.
  • a decimation manifold-surface-simplification algorithm may be used instead of the Douglas-Peucker algorithm.
  • the Minimal Spanning Tree algorithm should be replaced by an N-dimensional version.
  • Extension to scalable compression
  • the original digitized image is oversampled by a factor of 4 both in height and width. Then the same pre-processing steps are performed. The critical point detection and the edge detection steps also work the same.
  • the reduction step is replaced with a partial-ordering step (a classification by importance).
  • Critical points are categorized in N classes (from the most important to the least), using the Maximum Absolute Difference list as a criterion to create the classes.
  • the limiting values that separate the classes are external parameters of the method, and control the relative size of each part of the scalable signal. The same is done with edges, although the criterion now is both length and average contrast.
  • the critical points and the edge points that were eliminated in the simplification of ridges and edges are coded in the same order as they appear in the Douglas-Peucker algorithm, with the error threshold progressively reduced to some minimal threshold (typically 0.5 pixel width in the preferred embodiment of the invention).


Abstract

A method and apparatus are disclosed for compressing the data of an image. The method comprises the step of detecting critical points of an image, each of which may be a maximum, a minimum or a saddle point. A detection of contour edges of the image is also performed. The coding of the data of the image is then performed using the detected edges and the detected critical points.

Description

IMAGE COMPRESSION
FIELD OF THE INVENTION
This invention relates to the field of data compression/decompression and more precisely to the field of image compression.
BACKGROUND OF THE INVENTION
Image compression methods are usually classified in terms of first and second generation image coding. First generation methods are described with keywords such as block methods, harmonic analysis, functional analysis, frequency-based, scale-space-based, etc., and regroup all methods based on the effort to create a compressed image as close as possible to the original image in terms of the L2 norm of the difference image, using a linear transform (i.e. a family of functions, or a basis) in which the representation of the image is somewhat sparse. Examples are Fourier Transform methods (like the Discrete Cosine Transform, used in the JPEG and MPEG-2 standards) and Wavelet methods (JPEG-2000 is based on wavelet analysis).
These methods achieve good results when only "slight" compression is necessary, after which they tend to produce strong artifacts that greatly degrade the perceived quality of the image.
Second generation methods refer to methods that try to include the Human Visual System (HVS) characteristics in order to define important features that should be reproduced in the compressed image. This is usually translated into edge or contour detection, image segmentation and, to a lesser extent, texture coding. Methods such as Bandlets, Wedgelets and segmentation-based methods (found in MPEG-4 for instance) fall in this category.
However, neither of these classes of methods refers to the true physics of imaging, in which an image is understood as a two-dimensional projection of the interaction of a lighting source with objects in a scene. This understanding comes prior to the functioning of the HVS, which had to learn how to perceive objects given, as input, images which are 2-D projections.
SUMMARY OF THE INVENTION
It is an object of the invention to provide a method for compressing an image in order to obtain a reasonable approximation to the image that can be compactly coded, for optimal storage or transmission;
Yet another object of the invention is to provide a method for compressing an image which allows a "soft failure" capability under very strong compression, where soft failure refers to the possibility of transmitting a compressed image that keeps a reasonable perceptual sense for the user who visualizes it after decompression, even if the quantity of information allowed to represent it is far below the range of operability of the existing state-of-the-art methods;
Yet another object of the invention is to provide an apparatus for compressing an image;
According to one aspect of the invention there is provided a method for compressing the data of an image, the method comprising the steps of detecting the critical points of an image, the critical points being a maximum, a minimum or a saddle point; detecting the edges of the denoised image; and coding the detected critical points and the detected edges to provide compressed data of the image.
According to another aspect of the invention, there is provided an apparatus for compressing the data of an image, the apparatus comprising a critical point detection unit receiving the image and providing the critical points of the image, an edge detection unit receiving the image and providing the edges of the image and a coding unit receiving the provided critical points and the provided edges and providing compressed data of the image;
First generation methods may be defined as methods that do not use the Human Visual System;
The second generation methods may be defined as methods that use the Human Visual System in a passive way;
An object of the invention is to use the Human Visual System in an active way.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention will be better understood by way of the following detailed description of a preferred embodiment and other embodiments with reference to the appended drawings, in which:
Figure 1 is a flow chart which shows the image compression technique in the preferred embodiment of the invention;
Figure 2 is a flow chart which shows the image decompression technique in the preferred embodiment of the invention;
Figure 3 is a block diagram which shows an apparatus for compressing an image in the preferred embodiment of the invention;
Figure 4 is an example of a gray-scale digitized image;
Figure 5a is a diagram which shows the original (canonical) sampling of the image;
Figure 5b is a diagram which shows how the image is transformed in the new sampling;
Figure 5c is a diagram which shows how the two samplings are spatially related to one another, with a shift of ±¼ pixel width for odd and even rows respectively;
Figure 6 is a diagram which shows the position and indexing of the nearest neighbor of a given pixel in the hexagonal resampling;
Figure 7 is a diagram which illustrates the three different line structuring elements used by the COCO-LSE algorithm;
Figure 8 is a pre-processed image which shows the results of the morphological denoising operators applied to the gray-scale digitized image;
Figure 9 is a diagram which shows the three different types of critical points that are detected;
Figure 10 is an image of the critical points detected on the pre-processed image;
Figure 11 shows the Maximum Difference of Nearest-neighbor (MADN) image obtained from the pre-processed image;
Figure 12 shows the list of edges detected by the algorithm using the Maximum Difference of Nearest-neighbor (MADN) image as input;
Figure 13a is a diagram which illustrates the expansion of edges by identifying neighbor pixels of edge pixels from which intensity values are taken;
Figure 13b is a diagram which illustrates the expansion of edges by assigning those values to an expansion of the edge pixels covering a small area around them;
Figure 14 is a picture which illustrates a constrained Delaunay triangulation applied to the list of critical points and edge points;
Figure 15 is a picture which illustrates the simplified version of the list of edges detected by the algorithm using the Maximum Difference of Nearest-neighbor (MADN) image as input;
Figure 16 is a picture which illustrates a Minimal Spanning Tree (MST) obtained from the constrained Delaunay triangulation applied to the list of critical points and edge points;
Figure 17 is a reconstructed version of the gray-scale digitized image using the invention.
DETAILED DESCRIPTION
The present invention provides a compression/decompression method and apparatus for digitized images and video signals that enables compact archiving and economical transmission of those digitized images and video signals.
The present invention is based on a model for a specific (but large) class of images and video signals: "natural" photographs and movies, acquired with common cameras, either digital or analog.
Most common images fall in this class. A few examples of images that would not be part of that class are: microscopy, medical images (such as x-rays, MRI, and endoscopy), range data, calorimetric images, etc. Nevertheless, someone skilled in the art may take advantage of the invention and collect reasonable results even for some of those out-of-class images.
The invention makes use of "a priori" knowledge of some important features of these images (the term "image" refers either to still picture or to video signal, unless specified), as perceived by the viewer.
"More precisely, the method relies on the fact that most of the represented objects can be modeled as surfaces upon which light is reflected back to the camera.
One easily sees that this is not the case for x-ray images, in which the light (the x-rays) goes through the imaged object and gets directly to the film. However, on a natural image, many of the objects that are perceived can be understood as such surfaces upon which light is reflected (a face, a wall, a car, etc.). For video signals, these surface representations will evolve in time, due to physical movement, camera movement, lighting changes, etc.
The invention exploits the following characteristics of how these surfaces are represented on an image: the continuity of pixel intensity throughout large parts of objects, the appearance of edges separating objects from the background, and information about the shape of an object from its interaction with lighting (specular points, gradual decay).
The fundamental idea behind the invention is that the important information in an image is defined through its treatment by the Human Visual System (HVS); however, the HVS itself interprets the image as the reflection of light on objects, which can often be roughly described as piecewise smooth surfaces. Hence a compression method can use this very general a priori information in order to discriminate important information on an image.
From this fundamental idea, a fundamental assumption is derived: two important characteristics of an image as far as viewer perception is concerned are the presence of edges and the local presentation of the topology of the excursion set on the image. An important aspect of the invention is thus to extract the edge information as well as information relative to the topology of the excursion set.
In an image, the edges can appear as the manifestation of different phenomena: either an actual physical edge of the object, a "projective" edge (e.g. due to the 2-dimensional representation of a 3-dimensional object in the case of a still picture), a shadow, or a boundary between two regions with different light reflectance properties. For video signals, discontinuities in the time dimension (e.g. the end of a sequence) are also considered as edges.
In the preferred embodiment of the invention, the excursion set is a binary set obtained by thresholding an image, giving a value of one to all pixels having intensity values above some predetermined threshold, and zero to the rest. If the image is understood as a topographic map, where high intensity regions (white regions) are thought of as mountains and dark regions as valleys, then the excursion set is the map of regions that would stay dry if the image was flooded up to a certain level (according to the predetermined threshold).
What is then referred to as the topology of the excursion set is the "presence" of dry regions (such as continents or islands) or flooded ones (such as seas or lakes).
In the preferred embodiment of the invention, the precise shapes of these regions are considered somewhat irrelevant, i.e. in the preferred embodiment of the invention some of this shape information is selected to be discarded, considering that it is less important than the actual presence or absence of the region.
The invention intends, in its preferred embodiment, to preserve the topology for all predetermined thresholds (i.e. for all levels of flooding), maintaining therefore a recognizable image regardless of significant loss of information. It will be appreciated by someone skilled in the art that the preservation of the topology is important because it represents the preservation of basic light reflection information as recorded by the camera.
The relative positions and intensities of edges and critical points (namely maxima, minima and saddle points) will therefore be sufficient to obtain a reasonable approximation to the image.
This is better understood with an example: assume that the excursion set at a certain threshold contains one island (i.e. one mountain that is not completely flooded). The raising of the threshold (which corresponds to flooding) will make the mountain disappear at some point. This will happen when the threshold reaches the top of the mountain, which happens to be a critical point (local maximum) of the image.
Someone could argue that this point somewhat "represents" the presence of the whole mountain. Now assume that this mountain has two peaks. At a low threshold, it still appears as a single island, but at some higher threshold the two peaks separate and give rise to two islands. One can show that this happens only when a saddle point is reached. Hence the saddle points can represent the separation of one island into two islands. This illustrates how the relative positions of those points carry most of the topological information about the image.
Once the information is gathered, a weight criterion allows the algorithm to select the most important points and edges of the initial points and edges, and to use the selected information in order to decimate the representation (i.e. to get rid of the least important critical points, and to reduce the quantity of information related to them) until a target file size or a minimal acceptable quality of the image is reached. In another embodiment, the algorithm may also organize the information in order of importance such that a coarse-to-fine version of the image may be transmitted or retrieved (this capability is usually referred to as "scalability" in the literature).
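The island/lake bookkeeping in this example is easy to make concrete. A minimal sketch that counts connected components of the excursion set at one flooding level, using square-grid 4-connectivity for illustration only (the patent works on the 6-connected grid):

```python
import numpy as np
from scipy import ndimage

def excursion_topology(img, threshold):
    """Count 'islands' (dry regions) and 'lakes' (flooded regions) of the
    excursion set at a given flooding level."""
    dry = img > threshold                    # the excursion set
    _, n_islands = ndimage.label(dry)
    _, n_lakes = ndimage.label(~dry)
    return n_islands, n_lakes

# Sweeping the threshold reproduces the mountain example: an island splits
# in two exactly when the level passes a saddle point, and vanishes when it
# passes a local maximum.
img = np.random.randint(0, 256, (64, 64))    # stand-in for a real image
for t in range(0, 256, 32):
    print(t, excursion_topology(img, t))
```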
General principles of the invention
Now referring to Fig. 1, there are shown the steps performed by the compression algorithm in the preferred embodiment of the invention.
According to step 10, a pre-processing of the image (topological "denoising") is performed.
According to steps 12 and 14, a detection of the critical points and edges of the pre-processed image is then performed.
Still referring to Fig. 1 and according to steps 16, 18, 20 of the invention, a simplification is then performed. According to step 22, a coding of the information is then performed.
Now referring to Fig. 2, there are shown the steps of the image decompression. According to step 26, a decoding of the coded information is performed.
According to step 28, a tessellation of the recorded data is performed using the decoded data.
Still referring to Fig. 2 and according to step 30, an interpolation of the data generated by the tessellation is then performed in the preferred embodiment of the invention.
Depending on the application, and as will be noticed by someone skilled in the art, some of these steps may be overridden as explained below.
In the following, the method is described for grayscale 2-dimensional digitized images (still pictures) for simplicity's sake. Someone skilled in the art will understand that the described method may be straightforwardly extended to higher dimensional objects, such as video signals (which can be seen as 3-dimensional signals, where the third dimension is time), or multi-view video signals, where camera displacements add extra dimensions. Details on how the 2-D approach shall be modified in order to accommodate those higher dimensional signals are presented below.
Now referring to step 10 of Fig. 1, there is shown the pre-processing step. Typically, natural images (photographic images) come in multiple forms: different qualities, formats, resolutions, lighting conditions, etc. In most cases, however, the image is polluted by some noise (which may or may not be perceptible to the viewer) which manifests itself on a digitized image through slight intensity changes from pixel to pixel; these noise components are unnecessary to the comprehension of the image and constitute a nuisance to most compression algorithms as they add no valuable information.
The object of the pre-processing step is therefore to greatly reduce the presence of such noise in the image, in a way that does not alter the truly important visual information found on the image.
Many methods use linear filtering and thresholding to achieve this step, with the inconvenience of altering edge information. In the preferred embodiment of the invention, a morphological filtering is used in order to avoid the above-mentioned side effect.
The morphological operators that are used in this step of pre-processing are designed specifically for the invention in order to prepare the image for the following steps. The morphological operators are not a straightforward application of well-known operators found in the public domain such as opening and closing operators, although they are intended to achieve somewhat similar tasks. These operators are controlled by a few parameters to adjust to specific contexts. This step, which is the first step of the compression method, is intended to work in real-time conditions; it should not therefore be considered as an independent processing of the image to be performed some time prior to the compression. According to step 12, the critical points are detected.
The critical points are detected locally on the pre-processed image through an immediate neighbor comparison algorithm, avoiding therefore "floating-point" type calculation. These critical points will be used in the next steps.
Edge detection
According to step 14 of Fig. 1 , the edges are detected. The edges are detected using an algorithm specifically designed for the invention; this algorithm is based on a seeding-growing approach applied to a nearest-neighbor finite difference image obtained from the pre-processed image.
Note that two different pre-processed images (i.e. pre-processed with different parameter values) can be used for critical point detection and for edge detection.
Compression: reduction
According to step 16 of Fig. 1, a reduction is performed. Even though every single critical point and edge carries some of the perceptual information on the image, not all of them are of the same importance. In the case where the final size of the image is limited to a known threshold, it is important to decide which critical points and edges will be removed from the list prior to coding. In cases where a scalable version of the compressed image is necessary, this constraint evolves into the necessity for a complete ordering of critical points and edges in terms of relative importance.
The "reduction" relates to the removal of those points and edges of lesser importance. In the case of edges, the reduction is based on two criterions in the preferred embodiment of the invention: the length of the individual edge and the average contrast. In the case of critical points, the reduction is based, in the preferred embodiment of the invention, on the absolute contrast difference with respect to other neighboring critical points or edges.
Compression: simplification
According to step 18 of Fig. 1 , a simplification is performed. For edges, an extra step of simplification is required in order to keep only those points representing an edge which are the most important regarding its geometry. This is done using a Douglas-Peucker algorithm in the preferred embodiment of the invention.
Compression: quantization
According to step 20 of Fig. 1, a quantization is performed. The last step of the compression functionality is an optimal quantizer of intensity values. To prepare the intensity values to be coded, a modified Minimal Spanning Tree algorithm is used in order to connect edges and critical points together in an oriented tree structure, in a nearest neighbor approach. This makes it possible to define a parenthood relationship for every point (critical and edge). The difference between the intensity of a point and that of its unique parent is then computed and strongly quantized prior to coding. This strong quantization is possible only with this invention's approach and constitutes one of the most important features of the invention.
Coding
According to step 22 of Fig. 1 , a coding is performed. The coding of the remaining information is then performed in order to achieve an optimal compression. The geometric information for the position of the critical points and the end points of edges is encoded using a run-length algorithm in the preferred embodiment of the invention. The positions of intermediate points on edges are coded using a reverse Douglas-Peucker algorithm in the preferred embodiment of the invention.
Both the run-length vector and the quantized intensity-tree vector are then entropy-coded using a Huffman code.
Reconstruction
The coded information may then be sent over a transmission channel. Upon reception and according to steps 26, 28 and 30 of Fig. 2, a reconstruction is performed; the reconstruction comprises the steps of decoding, tessellation (triangulation) and interpolation.
The decoding step consists in the simple inversion of the run-length algorithm to obtain critical point positions and in the reconstruction of the connection tree.
The reconstruction also involves a complete tessellation of the image domain according to step 28 of Fig. 2. Intensity values for all critical points can then be obtained.
Interpolation
The interpolation is then performed according to step 30. The interpolated values for the interior of the triangles contained in the tessellation must be computed; each triangle is considered as a subset of a plane, whose equation is derived from the three-dimensional positions of the three vertices of the triangle.
Interpolation need not be linear, and can be optimized for specific classes of images, depending on time and complexity constraints.
Now referring to Fig. 3, there is shown a block diagram of an apparatus to be used for the compression technique in the preferred embodiment of the invention.
The apparatus comprises a pre-processing unit 31, an edge detection unit 32, an edge selection unit 36, a critical point detection unit 34, a critical point selection unit 38, a coding unit 40 and a selection unit controller 42.
The pre-processing unit 31 performs a pre-processing as explained below on an image to compress. The pre-processed image is then provided to the edge detection unit 32 and to the critical point detection unit 34. In another embodiment the pre-processing unit provides two different images to the edge detection unit 32 and to the critical point detection unit 34.
The edge detection unit 32 detects the edges of the pre-processed image provided by the pre-processing unit 31. The critical point detection unit 34 detects the critical points of the pre-processed image provided by the pre-processing unit 31.
The edge selection unit 36 selects at least some of the detected edges of the pre-processed image. The edge selection unit 36 is controlled by the selection unit controller 42. The selection unit controller 42 may control the edge selection unit 36 to select more or fewer edges in the pre-processed image according to the bandwidth or the speed required for the process. The selection unit controller 42 provides an adaptive command on the edge selection unit 36. The critical point selection unit 38 selects at least some of the detected critical points of the pre-processed image. The critical point selection unit 38 is controlled by the selection unit controller 42. The selection unit controller may control the critical point selection unit 38 to select more or fewer critical points in the pre-processed image according to the bandwidth or the speed required for the process. The selection unit controller 42 provides an adaptive command on the critical point selection unit 38.
The following description presents an embodiment of the invention. In this embodiment, the invention is applied to compress/decompress digitized grayscale two-dimensional images of arbitrary size.
Detailed description to enable someone skilled in the art to modify this embodiment in order to accommodate the compression of a video signal or of some other types of signal will be found in the section named "Extension to higher dimension". The section named "Extension to scalable compression" will enable someone skilled in the art to adapt the method in order to meet scalability requirements.
Pre-processing
Prior to the actual morphological filtering, the original image is resampled to a 6-connected grid. In the preferred embodiment the 6-connected grid is a hexagonal grid, or more specifically a brick-wall pattern grid as shown in Fig. 5b. The resampling is achieved in only one of the two dimensions; it is done in the horizontal dimension, as shown on Fig. 5b. It will be appreciated by someone skilled in the art that this choice of horizontal vs. vertical is arbitrary. However, it is crucial, in the preferred embodiment of the invention, to resample the original rectangular grid with a shift of ±¼ of the pixel width, alternating the sign for even and odd rows. For instance, using a ½ shift for even rows and a 0 shift for odd ones would modify the frequency content of only every other row, which is unacceptable for the following steps. In another embodiment of the invention, other frequency-preserving resampling methods may be used; a Shannon-filter resampling approach may be adopted (Stéphane Mallat, "A Wavelet Tour of Signal Processing", Chapter III, Academic Press, 1998, pp. 39-63) for applications where the frequency content must be preserved in the horizontal and vertical directions simultaneously. Pixels are indexed in the brick-wall pattern as indicated on Fig. 5c. The output of this resampling is referred to as the H-image. In all of the following description, nearest-neighborhood always refers to the 6-connected pixels shown on Fig. 6, unless specified otherwise.
The morphological filtering step requires some operators to be defined. This step is divided into sub-steps of local-contrast masking, masked opening-closing, and unicity transform.
First, the H-image is fed into the Maximum Absolute Difference of Nearest-neighbors (MADN) masking operator. This operator computes the absolute difference value of each of the 6-connected pixels for each pixel of the H-image, and stores the maximum of those 6 values. This output is binarized through a threshold. The value of the threshold is an external parameter: on an 8-bit depth image, it is set to 30 by default in the preferred embodiment of the invention. Pixels with MADN values of 30 or less are set to zero (FALSE) while others are set to one (TRUE). This mask image (referred to as a MADNM-image) is required for the masked opening-closing operator, which needs to discriminate between regions of high and low contrast.
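By way of illustration, a minimal Python sketch of the MADN masking operator might look as follows. The neighbors6 helper encodes a brick-wall 6-neighborhood; since Fig. 6 is not reproduced here, the exact row-parity offsets are an assumption, as are all function names.

```python
import numpy as np

def neighbors6(r, c):
    """6-connected neighbors of (r, c) on the brick-wall grid, in circular
    order (E, NE, NW, W, SW, SE). The half-pixel shift direction per row
    parity is assumed; the patent's Fig. 6 fixes the actual layout."""
    o = 1 if r % 2 == 0 else -1
    hi, lo = max(o, 0), min(o, 0)
    return [(r, c + 1), (r - 1, c + hi), (r - 1, c + lo),
            (r, c - 1), (r + 1, c + lo), (r + 1, c + hi)]

def madn_mask(h_image, threshold=30):
    """Binarized Maximum Absolute Difference of Nearest-neighbors mask:
    TRUE where the largest absolute neighbor difference exceeds threshold."""
    img = h_image.astype(np.int32)
    rows, cols = img.shape
    mask = np.zeros((rows, cols), dtype=bool)
    for r in range(1, rows - 1):        # borders skipped for brevity
        for c in range(1, cols - 1):
            diffs = (abs(img[r, c] - img[n]) for n in neighbors6(r, c))
            mask[r, c] = max(diffs) > threshold
    return mask
```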
For the adapted opening-closing operator, two types of opening-closing are used: a 6-Connected Opening-Closing Operator (6-COCO) and a 6-Connected Opening-Closing Operator with Linear Structuring Element (6-COCO-LSE).
A 6-connected dilation (resp. erosion) operator is defined as follows: from an input image, every pixel is taken and replaced with its maximum (resp. minimum) intensity nearest-neighbor. A 6-COCO operator is the composition of a 6-connected erosion followed by two iterations of 6-connected dilation and one more 6-connected erosion.
A 6-connected dilation LSE operator is defined as follows: the 3 maxima along the straight lines crossing a pixel are found (the pixels are grouped as 1-4-7, 2-5-7 and 3-6-7, as shown in Fig. 7); then the minimum value of these 3 maxima is taken.
The corresponding 6-connected erosion LSE is defined, conversely, as the maximum of the 3 minima. The 6-COCO-LSE operator is then defined as the composition of a 6-connected erosion LSE followed by two iterations of 6-connected dilation LSE and one more 6-connected erosion LSE.
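The plain 6-connected operators lend themselves to a direct sketch; the fragment below reuses the hypothetical neighbors6 helper from the earlier sketch and, like it, skips border pixels for brevity.

```python
import numpy as np

def dilate6(img):
    """6-connected dilation: each pixel is replaced by its maximum-intensity
    nearest neighbor, as defined above."""
    out = img.copy()
    for r in range(1, img.shape[0] - 1):
        for c in range(1, img.shape[1] - 1):
            out[r, c] = max(img[n] for n in neighbors6(r, c))
    return out

def erode6(img):
    """6-connected erosion: each pixel is replaced by its minimum-intensity
    nearest neighbor."""
    out = img.copy()
    for r in range(1, img.shape[0] - 1):
        for c in range(1, img.shape[1] - 1):
            out[r, c] = min(img[n] for n in neighbors6(r, c))
    return out

def coco6(img):
    """6-COCO: one erosion, two dilations, then one more erosion."""
    return erode6(dilate6(dilate6(erode6(img))))
```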
The Masked Opening-Closing Operator (MOCO) takes both the H-image and the MADNM-image and applies the standard 6-COCO on the H-image for pixels with MADNM values of zero, and the 6-COCO-LSE on the rest of the H-image. This results in the MOCO-image.
Finally, the MOCO-image is fed to a unicity operator which ensures that no two neighboring pixels have the exact same intensity value. This operator consists of recursive iterations of matrix multiplication of the image with an N×N unitary matrix with most of its energy found on the diagonal, followed by testing for neighbor equalities.
The effective radius is an extra controllable parameter of the pre-processing: it allows every dilation and erosion to be iterated more than once in order to spatially extend the effect of the morphological filtering.
The output of these operations is a pre-processed image. An example of such a pre-processed image is shown in Fig. 8.
Critical point detection
The critical point detection step requires the pre-processed image as input.
In the preferred embodiment, a 6-bit neighbor-difference image is created by computing the intensity difference between every pixel and its six neighbors, and keeping only the six signs in a 6-bit entry for each pixel (-1 for negative, 1 for positive). The critical points can be detected from this 6-bit image, as local maxima (resp. minima) of the image correspond to pixels where all six values are 1 (resp. -1), while saddle points correspond to pixels where four or six sign changes are observed while visiting the six bits circularly. This is shown in Fig. 9.
Critical points are thus detected by computing the number of sign changes: zero sign changes indicate a maximum or a minimum, while four (or six) sign changes indicate a saddle point. This results in a critical point image, where all pixels are given intensity zero except those representing a critical point (of any kind), which are given a value of one. An example of the critical points for the example shown earlier is shown in Fig. 10.
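A sketch of this detection, again reusing the hypothetical neighbors6 helper (whose circular ordering of the 6 neighbors matters here):

```python
def detect_critical_points(img):
    """Classify interior pixels as 'max', 'min' or 'saddle' from the number
    of sign changes around the 6-neighborhood; the unicity transform
    guarantees that no two neighbors share an intensity, so there are no ties."""
    kinds = {}
    for r in range(1, img.shape[0] - 1):
        for c in range(1, img.shape[1] - 1):
            # sign of (pixel - neighbor): 1 for positive, -1 for negative
            signs = [1 if img[r, c] > img[n] else -1 for n in neighbors6(r, c)]
            changes = sum(signs[i] != signs[(i + 1) % 6] for i in range(6))
            if changes == 0:
                kinds[(r, c)] = 'max' if signs[0] == 1 else 'min'
            elif changes in (4, 6):
                kinds[(r, c)] = 'saddle'
    return kinds
```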
Edge detection
The edge detection, like the critical point detection, requires a pre-processed image as input in the preferred embodiment of the present invention; however, someone skilled in the art will understand that the choice of parameters is not necessarily the same as the one that was used to generate the pre-processed image used for the critical point detection. The detection is achieved through a seeding-growing algorithm.
In order to perform the edge detection, a Maximum Difference of Nearest-neighbors image (MDN-image) is first computed from the pre-processed image provided for the edge detection. An example of this step is shown in Fig. 11. The MDN-image is obtained by computing the intensity differences between every pixel and its 6 neighbors, and by taking the maximum of these differences, without taking the absolute value. This image is thresholded so that all values below the threshold (typically 20 for an 8-bit depth original image) are set to zero, while others remain unchanged. Then the local maxima of this MDN-image are detected using the same approach as described above, and only those maxima above a higher threshold (typically 40 in the preferred embodiment of the invention) are kept as seeds for the growing algorithm.
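Sketched in the same style (thresholds as quoted above; the helper names are ours):

```python
import numpy as np

def mdn_seeds(img, low=20, high=40):
    """Compute the MDN-image, zero out values below `low`, and keep local
    maxima above `high` as seeds for the growing algorithm."""
    rows, cols = img.shape
    mdn = np.zeros((rows, cols), dtype=np.int32)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            mdn[r, c] = max(int(img[r, c]) - int(img[n])
                            for n in neighbors6(r, c))
    mdn[mdn < low] = 0
    seeds = [(r, c) for r in range(1, rows - 1) for c in range(1, cols - 1)
             if mdn[r, c] > high
             and all(mdn[r, c] > mdn[n] for n in neighbors6(r, c))]
    return mdn, seeds
```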
The growing algorithm begins by numbering these local maxima arbitrarily and choosing the first as the active pixel. Every time a pixel is made active, its coordinates are marked, in a visited-mask image, with an index corresponding to the seed number.
The algorithm then looks for the neighbor with the highest MDN-value (this value cannot be zero) and makes it the new active pixel (marking it as well with the same index). Again the maximum neighbor is looked for, although it is required to be neither the preceding point nor one of its neighbors. The algorithm stops when there is no next-point candidate left (when all neighbors are either zero-valued or are neighbors of the preceding point).
The algorithm then comes back to the seed and starts a second branch, assuming that the second point of the first branch was the preceding point. Again, this ends when there is no next-point candidate left. The list of points forming the edge is then reordered so that the end point of the first branch becomes the beginning of the list, reversing the order of the first branch until the seed is reached, while the second branch is simply concatenated to the first. This is done for every seed, after first testing that the seed has not been visited before (by looking it up in the visited-mask image). The output of the edge detector is a list of edges, where each edge is itself a list of connected point coordinates forming the edge. Fig. 12 illustrates this step.
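A simplified sketch of this two-branch growing rule follows; the exclusion of the preceding point and its neighbors mirrors the text, while the bookkeeping details (visited mask as a dictionary, bounds guard) are our assumptions.

```python
def _in_bounds(p, shape):
    return 0 <= p[0] < shape[0] and 0 <= p[1] < shape[1]

def grow_branch(mdn, seed, first_prev, visited, index):
    """Follow the highest-MDN neighbor from `seed`, never stepping onto the
    preceding point or one of its neighbors, until no candidate remains."""
    branch, prev, cur = [seed], first_prev, seed
    while True:
        excluded = {prev, *neighbors6(*prev)} if prev else set()
        cand = [(mdn[p], p) for p in neighbors6(*cur)
                if _in_bounds(p, mdn.shape) and mdn[p] > 0
                and p not in excluded and p not in visited]
        if not cand:
            return branch
        prev, cur = cur, max(cand)[1]
        visited[cur] = index
        branch.append(cur)

def detect_edges(mdn, seeds):
    """Grow two branches per unvisited seed and splice them into one edge,
    with the end of the first branch becoming the head of the list."""
    visited, edges = {}, []
    for i, seed in enumerate(seeds):
        if seed in visited:
            continue
        visited[seed] = i
        b1 = grow_branch(mdn, seed, None, visited, i)
        b2 = grow_branch(mdn, seed, b1[1] if len(b1) > 1 else None, visited, i)
        edges.append(b1[::-1] + b2[1:])   # branch 1 reversed, branch 2 appended
    return edges
```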
Compression: reduction
The reduction step refers to the complete elimination of certain edges and critical points from the compressed image data.
For the edges, a prior reduction is obtained by computing the length of each edge and removing from the list the ones with a length shorter than a predetermined threshold (typically 5 pixels in the preferred embodiment of the invention). Before the critical point reduction takes place, it is necessary to further process the edge list in order to add the contrast information. It is possible to define a left and a right side for every edge by visiting its connected points from the first point to the last point. For each point of each edge, a left and a right neighbor are defined as shown in Fig. 13a. The intensity values taken from the pre-processed image (at the left and right neighbor coordinates) are stored in a list. The edges are also geometrically extended to cover small areas around them, as shown in Fig. 13b. The extent of these areas on both sides of the edges is an external parameter (typically 0.2 pixel widths in the preferred embodiment of the invention).
The critical points and the edge points are then joined in a list that is fed to a constrained Delaunay triangulation algorithm, as shown in Fig. 14. This makes it possible to define a new nearest-neighborhood for all points (critical points and edge points).
A Maximum Absolute Difference (MAD) list is obtained by computing the maximum intensity differences with those new neighbors, and critical points with MAD values smaller than a given threshold (typically 5 in the preferred embodiment of the invention) are removed from the list of critical points.
The number of critical points is then further reduced by organizing them into one-dimensional structures (which can be seen as long ridges or river valleys) and storing them locally in exactly the same way as the edge list.
Compression: simplification
The edges are simplified by keeping only a few points for each edge; these kept points are sufficient to represent the edges well geometrically. This simplification is performed using a standard Douglas-Peucker algorithm (David H. Douglas and Thomas K. Peucker, "Algorithms for the reduction of the number of points required to represent a digitized line or its caricature", The Canadian Cartographer, 10(2):112-122, Dec. 1973), with an interrupt criterion based on a maximal distance error (typically 2 pixel widths in the preferred embodiment of the invention). This simplification step is also applied to ridges of critical points. An example of the simplification is shown in Fig. 15.
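For reference, a compact sketch of the standard Douglas-Peucker recursion with the maximal-distance interrupt criterion (the tolerance default follows the text; everything else is generic):

```python
import math

def _perp_dist(p, a, b):
    """Distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    num = abs((bx - ax) * (ay - py) - (ax - px) * (by - ay))
    den = math.hypot(bx - ax, by - ay)
    return num / den if den else math.hypot(px - ax, py - ay)

def douglas_peucker(points, tol=2.0):
    """Keep the end points; recurse on the farthest intermediate point if it
    deviates by more than `tol` pixel widths, otherwise drop the interior."""
    if len(points) < 3:
        return list(points)
    dists = [_perp_dist(p, points[0], points[-1]) for p in points[1:-1]]
    k = max(range(len(dists)), key=dists.__getitem__)
    if dists[k] <= tol:
        return [points[0], points[-1]]
    left = douglas_peucker(points[:k + 2], tol)   # up to the split point
    right = douglas_peucker(points[k + 1:], tol)  # from the split point
    return left[:-1] + right
```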
Compression: quantization
The lists for the left and right sides of edges and the list of critical points are fed into a Delaunay triangulation algorithm, whose output is then used in a Minimal Spanning Tree algorithm (F. James Rohlf, "A probabilistic Minimum Spanning Tree algorithm", Information Processing Letters, Vol. 7, No. 1, January 1978, pp. 44-48). The connectivity found in the lists for edges and for ridges is treated as constraints for the Delaunay triangulation. The minimal spanning tree yields an acyclic, simply connected graph joining all points found in the 3 lists. Fig. 16 shows the result of this step on the example.
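As an unconstrained approximation of this step (SciPy's Delaunay does not support the connectivity constraints the text imposes, so those are omitted here), one might write:

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial import Delaunay

def delaunay_mst(points):
    """Triangulate the point set, weight each triangulation edge by its
    Euclidean length, and extract a minimum spanning tree over those edges."""
    pts = np.asarray(points, dtype=float)
    tri = Delaunay(pts)
    weights = lil_matrix((len(pts), len(pts)))
    for simplex in tri.simplices:
        for i in range(3):
            a, b = sorted((simplex[i], simplex[(i + 1) % 3]))
            weights[a, b] = np.linalg.norm(pts[a] - pts[b])
    return tri, minimum_spanning_tree(weights.tocsr())
```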
This tree is then ordered using a standard parenthood algorithm (ref). From this directed tree, a list of intensity differences between every critical point and its sole parent is computed. This list is then strongly quantized using a quantization table. In the preferred embodiment of the invention, the table contains only 3 entries: zero (0), positive (+) and negative (-). The chosen table is an external parameter. At reconstruction, these categories will represent different contrast values depending on whether they represent the difference between critical points or between the left and right sides of edges. These contrast values are typically chosen by taking the median of the absolute values of contrast for each of the different pair possibilities (edge-critical point, edge-edge, edge-ridge, etc.). If the table contains more than 3 entries, quantiles are used instead of the median; for instance, 7 entries (---, --, -, 0, +, ++, +++) will require quartiles. However, other contrast rules can be applied in order to obtain contrast values, as someone skilled in the art will note. Also, the set of contrast rules does not need to correspond exactly to the quantization table: even if the quantization has not been done through quantiles, the contrast rules can still use quantiles or any other user-defined rules.
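The 3-entry quantization itself is a one-liner; the optional dead zone below is our assumption, since the text only names the three categories:

```python
def quantize_differences(diffs, dead_zone=0):
    """Map parent-child intensity differences onto the {-, 0, +} table;
    differences within the (assumed) dead zone collapse to 0."""
    return [0 if abs(d) <= dead_zone else (1 if d > 0 else -1) for d in diffs]
```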
Coding
A list of points is created with the uppermost end point of all edges (the lists for the left and right sides are not used here), the uppermost end point of all ridges and all other individual critical points. This list is fed into a standard run-length algorithm. The positions of the lowermost end points of edges and ridges relative to the uppermost points are also put in a list, and intermediate points are stored using a reverse Douglas-Peucker algorithm in the preferred embodiment of the invention. Those lists, along with the strongly-quantized intensity list, are all entropy-coded using a standard Huffman code. These are concatenated with the Huffman tables and the contrast rules, and binary-coded into a file or a binary stream.
Reconstruction: decoding and tessellation
The decoding step is the composition of decoding the Huffman code, unfolding the run-length code and reconstructing the edges and ridges through the Douglas-Peucker coding. The edges representing the left and right sides of compressed edges are reconstructed as well, with the same algorithm that was used in the reduction step.
The full tessellation is computed through the same Delaunay algorithm as in the quantization step; a Minimal Spanning Tree is also computed and allows for the reconstruction of intensity values at all points (left and right sides of edges, ridges and isolated critical points).
The actual contrast values used depend on the type of points involved in the child-parent relationship (e.g. points belonging to left and right side of the same edge, or pair of isolated critical points); these contrast values are listed in the contrast rules.
Reconstruction: interpolation
The interpolation step consists in filling the interior of the tessellation triangles according to the intensity values; this is done on a grid of pixels whose resolution is an external parameter of the reconstruction.
In its simplest form the interpolation is linear: a simple planar interpolation on the 3 vertices of each triangle. This last step results in an output image which can be stored or sent to a visualization unit. Fig. 17 shows an example of the result of this step.
The algorithm for the interpolation may be, for instance, Linear, Sub-Triangular Linear, Cubic Spline or Thin-Plate Spline. The choice of interpolation algorithm is an external parameter of the reconstruction.
The Linear algorithm is a simple planar interpolation on the 3 vertices of each triangle. It is the fastest of the four, but might produce visible artifacts on the final image.
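A sketch of the Linear algorithm, solving the plane equation z = ax + by + c from the three vertices and rasterizing with a barycentric inside-test (non-degenerate triangles and an output array `out` are assumed):

```python
import numpy as np

def fill_triangle(vertices, values, out):
    """Planar interpolation of one tessellation triangle into the output
    pixel grid `out`; `vertices` are (x, y) pairs, `values` intensities."""
    (x0, y0), (x1, y1), (x2, y2) = vertices
    plane = np.linalg.solve(
        np.array([[x0, y0, 1.0], [x1, y1, 1.0], [x2, y2, 1.0]]),
        np.asarray(values, dtype=float))            # coefficients a, b, c
    den = (y1 - y2) * (x0 - x2) + (x2 - x1) * (y0 - y2)
    for x in range(max(int(min(x0, x1, x2)), 0),
                   min(int(max(x0, x1, x2)) + 1, out.shape[0])):
        for y in range(max(int(min(y0, y1, y2)), 0),
                       min(int(max(y0, y1, y2)) + 1, out.shape[1])):
            # barycentric coordinates of (x, y) with respect to the triangle
            l0 = ((y1 - y2) * (x - x2) + (x2 - x1) * (y - y2)) / den
            l1 = ((y2 - y0) * (x - x2) + (x0 - x2) * (y - y2)) / den
            if min(l0, l1, 1.0 - l0 - l1) >= 0:     # inside the triangle
                out[x, y] = plane @ np.array([x, y, 1.0])
    return out
```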
The Sub-Triangular Linear algorithm divides each triangle into four smaller triangles, then computes intensity values for the 3 extra points created in the process, and applies the Linear algorithm on this new tessellation.
Extension to higher dimension
Most of the method remains the same for signals representing objects of higher dimension. The modifications which are necessary to extend the method to these signals are presented in the following.
Pre-processing
All operators must be considered N-dimensional, and an N-dimensional neighborhood must be defined. In three dimensions, a rhombohedric neighborhood is used (i.e. a tight packing of spheres, which corresponds to a 12-nearest neighborhood). In higher dimensions, a generalized version is used, i.e. a tight packing of N-spheres.
The opening-closing operators are naturally extended with the same definition but on the new neighborhood, while the opening-closing-LSE operators use neighborhoods in N-1 dimensions to replace the line elements.
Critical point detection
In the N-dimensional case, the generalization of the "sign-change" approach to detecting the critical points lies in the definition of a winding number (a homotopy number) which also holds the information regarding the presence or absence of a critical point inside the neighborhood.
Edge detection
The procedure is the same as in the 2-dimensional case, although this time the "growing" part of the algorithm is a region-growing algorithm on an (N-1)-dimensional sub-manifold. It is also possible to use the 2-dimensional method on a family of 2-D sub-manifolds (e.g. frames in a video) and to further connect neighboring edges together in the extra dimensions. This choice is an external parameter.
Compression: reduction
An N-dimensional Delaunay triangulation may be used instead of the standard 2-D algorithm.
Compression: simplification
A decimation manifold-surface-simplification algorithm may be used instead of the Douglas-Peucker algorithm.
Compression: quantization
The Minimal Spanning Tree algorithm should be replaced by an N-dimensional Minimal Spanning Tree algorithm.
Coding
Nothing is changed in this step.
Reconstruction: decoding and tessellation
Again, the Delaunay and MST algorithms are generalized to their N-dimensional versions.
Reconstruction: interpolation
Nothing is changed in this step.
Extension to scalable compression
The algorithm described previously may be modified to account for scalability requirements.
First, the original digitized image is oversampled by a factor of 4 both in height and width. Then the same pre-processing steps are performed. The critical point detection and the edge detection steps also work the same.
The reduction step is replaced with a partial-ordering step (a classification by importance). Critical points are categorized in N classes (from the most important to the least), using the Maximum Absolute Difference list as a criterion to create the classes. The limiting values that separate the classes are external parameters of the method, and control the relative size of each part of the scalable signal. The same is done with edges, although the criterion now is both length and average contrast.
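A sketch of this partial-ordering step for the critical points; the descending class limits are the external parameters mentioned above, and the list-of-indices output format is our choice:

```python
def classify_points(mad_values, class_limits):
    """Assign each critical point to one of len(class_limits)+1 importance
    classes. `class_limits` must be sorted in descending order; class 0
    holds the most important (highest-MAD) points."""
    classes = [[] for _ in range(len(class_limits) + 1)]
    for idx, mad in enumerate(mad_values):
        rank = sum(mad < limit for limit in class_limits)
        classes[rank].append(idx)
    return classes

# e.g. classify_points([50, 12, 33], [40, 20]) -> [[0], [2], [1]]
```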
For both the most important critical points and the most important edges, the same steps of simplification, quantization and coding are performed. The Delaunay triangulation of those simplified most important points is kept as a "basic" triangulation for the next step.
The critical points and the edge points that were eliminated in the simplification of ridges and edges are coded in the same order as they appear in the Douglas-Peucker algorithm, with the error threshold progressively reduced to some minimal threshold (typically 0.5 pixel widths in the preferred embodiment of the invention).
Other points that were eliminated in the reduction steps and which are part of the second class are position-encoded using a run-length algorithm, and a new Delaunay triangulation and MST are obtained, using the basic one as a constraint. This new Delaunay triangulation becomes the new basic triangulation, and the new MST is used for the quantization of these second-class points. This is repeated until all classes have been coded.

Claims

What is claimed is:
1. A method for compressing the data of an image, the method comprising the steps of: detecting critical points of an image, the critical points being a maximum, a minimum or a saddle point; detecting contour edges of the image; coding the detected critical points and the detected edges to provide compressed data of the image.
2. The method as claimed in claim 1, further comprising the step of selecting at least some of the detected critical points of the image.
3. The method as claimed in claim 1, further comprising the step of selecting at least some of the detected edges of the image.
4. The method as claimed in claim 1, further comprising the step of pre-processing the image to provide a denoised image.
5. The method as claimed in claim 4, wherein the step of pre-processing the received image is performed using morphological filtering.
6. The method as claimed in claim 2, wherein the step of selecting at least one of the detected critical points is performed using the absolute contrast difference with respect to the other detected critical points or edges.
7. The method as claimed in claim 3, wherein the step of selecting at least one of the detected edges of the denoised image is performed using the length of the detected edges and the average contrast of the detected edges.
8. The method as claimed in claim 5, wherein morphological filtering comprises the steps of local-contrast masking, masked opening-closing and a unicity transformation.
9. An apparatus for compressing the data of an image, the apparatus comprising: a critical point detection unit receiving the image and providing the critical points of the image; an edge detection unit receiving the image and providing the edges of the image; a coding unit receiving the provided critical points and the provided edges and providing compressed data of the image.
10. The apparatus as claimed in claim 9, further comprising an edge selection unit receiving the detected edges from the edge detection unit and providing at least some of the detected edges to the coding unit.
11. The apparatus as claimed in claim 9, further comprising a critical point selection unit receiving the detected critical points from the critical point detection unit and providing at least some of the detected critical points to the coding unit.
12. A method for decompressing the data of an image, the method comprising the steps of: decoding the received data to provide a list of points comprising at least some edges and at least some critical points of the image; calculating pixel values using at least some neighboring edges of the pixels and at least some neighboring critical points of the pixels to provide the image.
13. The method as claimed in claim 12, further comprising the step of performing a tessellation using the provided at least some critical points and the provided at least some edges of the image, the tessellation providing a set of polygons, the set of polygons being used to perform the interpolation.
14. The method as claimed in claim 13, wherein the set of polygons is a set of triangles.
15. The method as claimed in claim 12, wherein each critical point and each edge has a third coordinate.
16. The method as claimed in claim 1, wherein the image is a three-dimensional image.
17. The method as claimed in claim 16, wherein one of the three dimensions is the temporal dimension.
18. The method as claimed in claim 1, further comprising the step of performing a quantization of the selected at least some of the detected edges of the image.
19. The method as claimed in claim 1, wherein the image is a five-dimensional image, the fourth and fifth dimensions of the image representing the elevation angle and the azimuth angle of the camera.

