JP2005218124A

JP2005218124A - Data compression system

Info

Publication number: JP2005218124A
Application number: JP2005044810A
Authority: JP
Inventors: F Keith Alexander; エフキースアレクサンダー; Edward L Schwartz; エルシュワルツエドワード; Ahmad Zandi; ザンディアーマド; Martin Boliek; ボーリックマーティン; J Gomissh Michael; ジェーゴーミッシュマイケル
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1996-05-03
Filing date: 2005-02-21
Publication date: 2005-08-11
Also published as: JP3989999B2; JPH1084484A

Abstract

PROBLEM TO BE SOLVED: To provide quantization appropriate to the characteristics of an output device without decompressing the code stream.

SOLUTION: A compressed bitstream is input to a parser. The parser selects and outputs all or part of the compressed bitstream according to a request. For example, coded data for displaying an image on a monitor is output by selecting the low-resolution compressed coefficients. Alternatively, compressed data is selected so that lossless decompression of a region of interest is possible. In one embodiment, the parser outputs, on request, the bits needed to go from a preview image to a printer-resolution image or a full-size medical monitor image.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

The present invention relates to the field of data compression and decompression systems, and more particularly to a data compression system that includes a parser.

Data compression is an extremely useful tool for storing and transmitting large amounts of data. For example, the time required to transmit an image, such as a facsimile transmission of a document, is reduced drastically when compression is used to decrease the number of bits required to recreate the image.

Many different data compression techniques exist in the prior art. Compression techniques can be divided into two broad categories: lossy coding and lossless coding. Lossy coding involves coding that results in the loss of information, so that perfect reconstruction of the original data is not guaranteed. The goal of lossy coding is that changes to the original data are made in such a way that they are not objectionable or noticeable. In lossless compression, all of the information is retained and the data is compressed in a manner that allows exact reconstruction.

In lossless compression, input symbols or intensity data are converted to output codewords. The input may include image data, audio data, one-dimensional data (for example, data changing spatially or temporally), two-dimensional data (for example, data changing in two spatial directions, or in one spatial and one temporal dimension), or multidimensional/multispectral data. If the compression is successful, the codewords are represented in fewer bits than the number required for the input symbols (or intensity data) before coding. Lossless coding methods include dictionary coding (for example, Lempel-Ziv), run-length coding, enumerative coding, and entropy coding. In lossless image compression, compression is based on prediction or contexts, plus coding. The JBIG standard for facsimile compression and DPCM for continuous-tone images (differential pulse code modulation, an option in the JPEG standard) are examples of lossless compression for images. In lossy compression, input symbols or intensity data are quantized before being converted to output codewords. Quantization is intended to preserve the important characteristics of the data while eliminating the unimportant ones. Prior to quantization, lossy compression systems often use a transform to provide energy compaction. JPEG is an example of a lossy coding method for image data.
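The distinction between the two categories can be illustrated with a minimal sketch. It assumes run-length coding as the lossless method and a uniform scalar quantizer as the lossy step; the function names and the step size of 16 are illustrative assumptions and do not come from this specification.

```python
from itertools import groupby


def rle_encode(symbols):
    """Run-length encode a sequence into (symbol, run_length) pairs."""
    return [(s, len(list(g))) for s, g in groupby(symbols)]


def rle_decode(pairs):
    """Expand (symbol, run_length) pairs back into the original sequence."""
    return [s for s, n in pairs for _ in range(n)]


def quantize(values, step):
    """Uniform quantization: map each value to the index of its bin."""
    return [v // step for v in values]


def dequantize(indices, step):
    """Reconstruct each value at the middle of its bin."""
    return [i * step + step // 2 for i in indices]


row = [0, 0, 0, 0, 255, 255, 17, 17, 17]

# Lossless: decoding returns exactly the input.
encoded = rle_encode(row)
restored = rle_decode(encoded)

# Lossy: the reconstruction is only close to the input.
approx = dequantize(quantize(row, 16), 16)
```

Here `restored` equals `row` exactly, while `approx` differs from `row` by at most half a quantization step per sample.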

Recent developments in image signal processing have focused on the pursuit of efficient and highly accurate forms of data compression coding. Various forms of transform or pyramidal signal processing have been proposed, including multi-resolution pyramidal processing and wavelet pyramidal processing. These two forms are also referred to as subband processing and hierarchical processing. Wavelet pyramidal processing of image data is a specific type of multi-resolution pyramidal processing that uses quadrature mirror filters (QMFs) to produce a subband decomposition of an original image. Other, non-QMF wavelet schemes also exist. For more information on wavelet processing, see Antonini, M., et al., "Image Coding Using Wavelet Transform", IEEE Transactions on Image Processing, Vol. 1, No. 2, April 1992, and Shapiro, J., "An Embedded Hierarchical Image Coder Using Zerotrees of Wavelet Coefficients", Proc. IEEE Data Compression Conference, pp. 214-223, 1993. For information on reversible transforms, see Said, A. and Pearlman, W., "Reversible Image Compression via Multiresolution Representation and Predictive Coding", Dept. of Electrical, Computer and System Engineering, Rensselaer Polytechnic Institute, Troy, NY, 1993.

Compression is often very time consuming and memory intensive. It is desirable to perform compression faster and/or with as little memory as possible. Some applications have never used compression because the quality could not be assured, the compression ratio was not high enough, or the data rate was not controllable. Nevertheless, the use of compression is desirable to reduce the amount of information to be transmitted and/or stored.

The prior art includes compression systems for handling natural continuous-tone images. One example is the International Standard DIS 10918-1, "Digital Compression and Coding of Continuous-Tone Still Images", CCITT Recommendation T.81, commonly referred to as JPEG. The prior art also includes compression systems for handling binary/noise-free/shallow pixel depth images. One example of such a system is the International Standard ISO/IEC 11544, "Information Technology-Coded Representation of Picture and Audio Information-Progressive Bi-level Image Compression", CCITT Recommendation T.82, commonly referred to as JBIG. However, the prior art lacks a system that handles both adequately. It would be desirable to have such a system.

Parsers are well known in computer science. A parser is responsible for assigning meaning to the different parts of an object whose structure is initially unknown. For example, a parser operating as part of a compiler might determine that one string of characters in a program file is an "identifier", that another string forms a reserved word, and that yet another string is part of a comment. The parser does not determine what the character strings "mean"; it determines only what type of part of the object each one is.
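The compiler example can be sketched as a toy classifier, purely for illustration. The reserved-word set and the `#` comment syntax are assumptions chosen for this sketch; they are not part of this specification.

```python
import re

# Illustrative subset of reserved words (an assumption for this sketch).
RESERVED = {"if", "else", "while", "return"}

# One alternative per kind of part; whitespace matches none and is skipped.
TOKEN = re.compile(
    r"(?P<comment>#[^\n]*)|(?P<word>[A-Za-z_]\w*)|(?P<other>\S)"
)


def classify(source):
    """Label each part of the text by type, without interpreting its meaning."""
    parts = []
    for match in TOKEN.finditer(source):
        text = match.group()
        if match.lastgroup == "comment":
            kind = "comment"
        elif match.lastgroup == "word":
            kind = "reserved" if text in RESERVED else "identifier"
        else:
            kind = "other"
        parts.append((kind, text))
    return parts


labels = classify("if x # note")
```

The classifier reports only the type of each part (reserved word, identifier, comment), which is the limited role the paragraph above attributes to a parser.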

Most image storage formats are single-use; that is, only a single resolution or a single quality level is available. Other image formats allow multiple uses. Some prior art multi-use image formats support two or three resolution/quality choices, while others allow only resolution or only quality to be specified, but not both. It is desirable to increase the number of resolution and quality choices available.

For example, Internet World Wide Web servers currently provide desired information from a large body of data. Typically, a user browses many images on the screen and may decide to print a few. Unfortunately, with browsing tools in their current state, the print quality is quite poor if the image was intended mainly for a monitor, and the browsing time is excessive if the image was intended mainly for printing. Obtaining a "lossless" image is either impossible or must be done as a completely separate download.

A general object of the present invention is to provide a lossy and lossless data compression system that uses a transform providing good energy compaction. More specifically, the object is to provide a data compression system including a parser that performs device-dependent quantization according to device characteristics supplied by an image output device.

The data compression system according to claim 1 comprises a memory that stores a code stream having a header with at least one marker, at least one output device, and a parser connected to the memory and connected to receive device characteristics from the at least one output device, the parser being operable to perform device-dependent quantization. According to claim 2, the code stream comprises losslessly compressed data. According to claim 3, the at least one marker indicates the number of components, the subsampling, and the alignment used for each tile in the code stream. According to claim 4, the code stream includes a main header, and a local header precedes each tile in the code stream. According to claim 5, the main header applies to all tiles in the code stream, and each local header applies only to its associated tile. According to claim 6, at least one of the local headers overrides the main header. According to claim 7, the parser uses the markers in the code stream to quantize the code stream. According to claim 8, at least one of the markers indicates frequency information. According to claim 9, the data compression system further includes a compressor for generating the code stream. According to claim 10, the parser comprises a quantization selector. According to claim 11, the quantization selector performs transformation and quantization of a set of images by discarding bit planes of various coefficients. According to claim 12, one of the tags indicates the importance levels within the data in each tile. According to claim 13, a tag indicates an importance level locator signal, and the parser truncates according to that signal. According to claim 14, a tag indicates the number of importance levels to be retained. According to claim 15, a tag indicates the number of bytes to be retained. According to claim 16, a tag includes, for each tile, an indication associating importance levels with byte counts. According to claim 17, the at least one marker indicates the number of bytes of each importance level in each tile.
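The relationship between the main header and the local headers (main header applies to every tile, a local header applies only to its own tile and may override the main header) can be sketched as a dictionary merge. The field names `components`, `subsampling`, and `alignment` echo the quantities named in claim 3, but the dictionary layout and values are hypothetical and do not reflect the actual code stream syntax.

```python
def effective_parameters(main_header, local_header):
    """Combine headers for one tile; local entries override main entries."""
    merged = dict(main_header)   # start from the stream-wide defaults
    merged.update(local_header)  # the tile's own header takes precedence
    return merged


# Hypothetical header contents, for illustration only.
main = {"components": 3, "subsampling": (1, 1), "alignment": "normal"}
tile_headers = [
    {},                  # tile 0 adds nothing: the main header applies as-is
    {"components": 1},   # tile 1 overrides a single field
]

params = [effective_parameters(main, header) for header in tile_headers]
```

Copying the main header before updating keeps it unchanged, so an override in one tile does not leak into the tiles that follow.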

As is apparent from the above description, the data compression system including the parser of the present invention has the effect, among others, that the code data stream can be quantized appropriately according to the characteristics of the image output device without decompressing the code data.
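A minimal sketch of this idea follows, assuming a code stream that has already been divided into chunks labeled with a resolution level and an importance level. The chunk layout, the field names, and the device limits are hypothetical; they only illustrate that the parser selects compressed data and never decodes it.

```python
def parse_for_device(chunks, max_resolution, importance_levels_kept):
    """Select chunks for a device without touching their compressed payloads.

    Resolution 0 is the coarsest level; importance 0 is the most important.
    """
    return [
        chunk
        for chunk in chunks
        if chunk["resolution"] <= max_resolution
        and chunk["importance"] < importance_levels_kept
    ]


# Hypothetical stream: 3 resolution levels x 2 importance levels.
stream = [
    {"resolution": r, "importance": i, "payload": bytes([r, i])}
    for r in range(3)
    for i in range(2)
]

# A monitor asks for low resolution; a printer asks for everything.
for_monitor = parse_for_device(stream, max_resolution=0, importance_levels_kept=2)
for_printer = parse_for_device(stream, max_resolution=2, importance_levels_kept=2)
```

The payload bytes pass through unchanged, so the selection costs no decompression or recompression.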

A method and apparatus for compression and decompression are described. In the following detailed description of the present invention, numerous specific details are set forth, such as types of coders, numbers of bits, and signal names, in order to provide a thorough understanding of the invention. However, it will be apparent to those skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form rather than in detail, in order to avoid obscuring the invention.

Much of the detailed description that follows is presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to convey the substance of their work most effectively to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulation of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.

It should be kept in mind, however, that all such terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to those quantities. As will be apparent from the following description, unless specifically stated otherwise, discussions using terms such as "processing", "computing", "calculating", "determining", or "displaying" refer to the actions and processes of a computer system, or a similar electronic computing device, that manipulates data represented as physical (electronic) quantities within the computer system's registers and memories and transforms it into other data similarly represented as physical quantities within the computer system's memories or registers or within other such information storage, transmission, or display devices.

The present invention also relates to apparatus for performing the operations described herein. This apparatus may be specially constructed for the required purposes, or it may be a general-purpose computer selectively activated or reconfigured by a stored program. The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose machines may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these machines will appear from the description below. In addition, the present invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the invention as described herein.

The following terms are used in the description below. Each of these terms already has an established meaning. However, the definitions given here should not be considered limited to the scope in which the terms are known in the art. They are provided to help in understanding the present invention.

Alignment:
The degree of shift of the transform coefficients in one frequency band relative to the other frequency bands.

Binary coding scheme:
A type of coding for bi-level, limited pixel depth, or noise-free data. In one embodiment, the binary coding scheme comprises Gray coding of the pixels and a specific context model.
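For reference, the standard binary-reflected Gray code can be sketched as follows. This specification does not state which Gray code variant is used or why, so both the formula and the motivation given here are assumptions: with this code, adjacent intensity values differ in exactly one bit, which tends to make individual bit planes smoother.

```python
def to_gray(n):
    """Convert a non-negative integer to its binary-reflected Gray code."""
    return n ^ (n >> 1)


def from_gray(g):
    """Invert to_gray by XOR-ing together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


# 127 and 128 differ in all 8 bits in plain binary,
# but in only one bit after Gray coding.
plain_difference = bin(127 ^ 128).count("1")
gray_difference = bin(to_gray(127) ^ to_gray(128)).count("1")
```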

Bit-significance:
A number representation, similar to sign-magnitude, in which the head bits are followed by the sign bit, which is followed by the tail bits, if any. The embedding encodes in bit-plane order with respect to this representation.
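The head, sign, and tail parts of this representation can be sketched as below. The fixed magnitude width, the 0/1 sign convention, and the handling of a zero coefficient (all head bits, no sign bit) are assumptions made for this sketch rather than details taken from this specification.

```python
def bit_significance(coefficient, magnitude_bits):
    """Split a coefficient into (head bits, sign bit, tail bits).

    Head bits run from the most significant magnitude bit down to and
    including the first non-zero bit; tail bits are everything after it.
    """
    magnitude = abs(coefficient)
    if magnitude >= 1 << magnitude_bits:
        raise ValueError("coefficient does not fit in magnitude_bits")
    # Magnitude bits, most significant first.
    bits = [(magnitude >> i) & 1 for i in range(magnitude_bits - 1, -1, -1)]
    if magnitude == 0:
        # Assumption: a zero coefficient has only head bits and no sign bit.
        return bits, None, []
    first_one = bits.index(1)
    head = bits[: first_one + 1]
    sign = 0 if coefficient > 0 else 1  # assumed convention: 1 means negative
    tail = bits[first_one + 1:]
    return head, sign, tail


head, sign, tail = bit_significance(-5, 6)  # magnitude 5 is 000101 in 6 bits
```

For this input the head bits are `[0, 0, 0, 1]`, the sign bit is `1`, and the tail bits are `[0, 1]`.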

Context model:
Causally available information relative to the current bit to be coded, which provides previously learned information about the current bit and enables conditional probability estimation for entropy coding.

Embedded quantization:
Quantization that is implied by the code stream. For example, if the importance levels are placed in order from the most important to the least important, quantization is performed by simple truncation of the code stream. The same functionality is available with tags, markers, pointers, or other signaling.
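Quantization by truncation can be illustrated with a sketch that treats each magnitude bit plane as one importance level, most significant plane first. That mapping, and the use of raw (unencoded) bit planes, are simplifying assumptions; the actual code stream would carry entropy-coded data.

```python
def to_bitplanes(magnitudes, depth):
    """Return bit planes of the magnitudes, most significant plane first."""
    return [[(m >> p) & 1 for m in magnitudes] for p in range(depth - 1, -1, -1)]


def from_bitplanes(planes, depth, count):
    """Rebuild magnitudes from the leading planes; missing planes read as 0."""
    values = [0] * count
    for k, plane in enumerate(planes):
        shift = depth - 1 - k
        values = [v | (bit << shift) for v, bit in zip(values, plane)]
    return values


magnitudes = [13, 6, 1, 9]
planes = to_bitplanes(magnitudes, depth=4)

# Truncating the stream after two planes quantizes every value to a
# multiple of 4, with no requantization step.
truncated = from_bitplanes(planes[:2], depth=4, count=len(magnitudes))
complete = from_bitplanes(planes, depth=4, count=len(magnitudes))
```

Keeping all planes reproduces the input, and keeping fewer planes yields a coarser version of the same data from a prefix of the same stream.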

Entropy coder:
A device that encodes or decodes a current bit based on a probability estimate. An entropy coder may also be referred to herein as a multi-context binary coder. The context of the current bit is some chosen configuration of "neighboring" bits, and it allows probability estimation for the best representation of the current bit (or multiple bits). In one embodiment, the entropy coder includes a binary coder or a Huffman coder.

Fixed length:
A scheme that converts a specific block of data into a specific block of compressed data, for example BTC (block truncation coding) and some forms of VQ (vector quantization). Fixed-length codes suit fixed-rate and fixed-size applications, but their rate-distortion performance is often poor compared with variable-rate schemes.

Fixed rate:
An application or scheme that must maintain a certain pixel rate and has a bandwidth-limited channel. Achieving this goal requires compression that holds on a local average rather than only on a global average. For example, MPEG requires a fixed rate.

Fixed size:
An application or scheme that has a limited-size buffer. To achieve this goal, compression on a global average suffices, for example for a print buffer. (An application can be fixed-rate, fixed-size, or both.)

Frequency band:
Each frequency band describes a group of coefficients resulting from the same sequence of filtering operations.

Head bits:
In bit-significance representation, the head bits are the magnitude bits from the most significant bit up to and including the first non-zero bit.

Horizontal context model:
A context model for use with embedded wavelet coefficients and a binary entropy coder (in one embodiment).

Idempotent:
Coding that allows an image to be decompressed in a lossy form and then recompressed to the same lossy codewords.

Image tile:
A rectangular region chosen so that a grid of contiguous, non-overlapping sub-images, each with identical parameters, can be defined. Image tiles affect the buffer size needed to compute the transform in wavelet-based coding. Image tiles can be randomly addressed. The coding operations process the pixel and coefficient data within one image tile. Because of this, image tiles can be parsed or decoded out of order; that is, they can be randomly addressed, or decoded to different levels of distortion for region-of-interest decompression. In one embodiment, all image tiles are the same size except for those at the top and bottom. An image tile may be of any size up to and including the size of the whole image.

Importance level:
By defining a specific system, the input data (pixel data, coefficients, error signals, etc.) is logically divided into groups with the same visual impact. For example, the most significant bit plane or planes are probably more visually important than the lower bit planes. Likewise, low-frequency information is generally more important than high-frequency information. Most working definitions of "visual importance", including that of the present invention as described below, are tied to some error metric. Better visual metrics, however, could be incorporated into the system definition of visual importance. Different data types have different importance levels; for example, audio data has audio importance levels.

Overlapped transform:
A transform in which a single source sample point contributes to multiple coefficients of the same frequency. Examples include many wavelets and the Lapped Orthogonal Transform.

Progressive:
A code stream ordered so that a coherent decompressed result is available from part of the coded data and can be refined by adding more data. Also, a code stream in which the bit planes of the data are ordered from shallowest to deepest; in this case it usually refers to wavelet coefficient data.

Progressive pixel depth:
A code stream in which the bit planes of the data are ordered from shallowest to deepest.

Progressive pyramid:
A succession of resolution components in which each lower resolution is a linear factor of two smaller (a factor of four in area).

Reversible transform:
In one embodiment, an efficient transform implemented with integer arithmetic whose compressed results can be reconstructed into the original.

S-transform:
A specific reversible wavelet filter pair consisting of one 2-tap low-pass filter and one 2-tap high-pass filter.
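One commonly used integer formulation of a 2-tap/2-tap reversible pair is sketched below: the low-pass output is the floored average of a sample pair and the high-pass output is their difference. The exact normalization and rounding used by this specification are not stated in this section, so this particular form is an assumption.

```python
def s_forward(x0, x1):
    """Forward step on one sample pair: floored average and difference."""
    smooth = (x0 + x1) // 2  # 2-tap low-pass output (floor division)
    detail = x0 - x1         # 2-tap high-pass output
    return smooth, detail


def s_inverse(smooth, detail):
    """Exact inverse of s_forward using only integer arithmetic."""
    x0 = smooth + (detail + 1) // 2
    x1 = x0 - detail
    return x0, x1


pair = (5, 2)
coefficients = s_forward(*pair)
recovered = s_inverse(*coefficients)
```

The inverse is exact because the bit discarded by the floored average equals the parity of the difference, so it can be recovered from the high-pass output.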

Tail:
In bit-significance representation, the tail bits are the magnitude bits of lower significance than the most significant non-zero bit.

Tail information:
In one embodiment, four states possible for a coefficient represented in bit-significance representation. It is a function of the coefficient and the current bit plane, and it is used by the horizontal context model.

Tail-on:
In one embodiment, two states depending on whether the tail information state is zero or non-zero. It is used by the horizontal context model.

Tile data segment:
The portion of the code stream that fully describes one image tile. In one embodiment, all of the data from the tag defining the start of an image tile (SOT) up to the next SOT or the end-of-image (EOI) tag.
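The segment boundaries described in this definition can be sketched as follows. The code stream is modeled as a list of `(tag, payload)` records rather than as bytes; that representation and the `MAIN` and `DATA` tag names are hypothetical. Only SOT and EOI come from the text.

```python
def split_tile_segments(records):
    """Group records into tile data segments, one per SOT tag.

    A segment runs from an SOT record up to, but not including,
    the next SOT or the EOI record.
    """
    segments = []
    current = None  # stays None until the first SOT is seen
    for tag, payload in records:
        if tag == "SOT":
            if current is not None:
                segments.append(current)
            current = [(tag, payload)]
        elif tag == "EOI":
            break
        elif current is not None:
            current.append((tag, payload))
    if current is not None:
        segments.append(current)
    return segments


stream = [
    ("MAIN", "header"),  # hypothetical main-header record, not in any tile
    ("SOT", 0), ("DATA", b"aa"),
    ("SOT", 1), ("DATA", b"bb"), ("DATA", b"cc"),
    ("EOI", None),
]
segments = split_tile_segments(stream)
```

Once the segments are located, any one of them (for example `segments[1]`) can be handed to a decoder on its own.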

Transform coefficient:
The result of applying the wavelet transform. In wavelet transforms, the coefficients represent a logarithmically divided frequency scale.

TS-transform:
Two-Six transform. A specific reversible wavelet filter pair consisting of one 2-tap low-pass analysis filter and one 6-tap high-pass analysis filter. The synthesis filters are quadrature mirror filters of the analysis filters.

TT-transform:
Two-Ten transform. A specific reversible wavelet filter pair consisting of one 2-tap low-pass analysis filter and one 10-tap high-pass analysis filter. The synthesis filters are quadrature mirror filters of the analysis filters.

Unified lossless/lossy:
The same compression system provides a code data stream capable of lossless or lossy reconstruction.

Wavelet filters:
The high-pass and low-pass synthesis and analysis filters used in the wavelet transform.

Wavelet transform:
A transform with constraints in both the "frequency" and "time (or space)" domains. In the described embodiment, it is a transform consisting of one high-pass filter and one low-pass filter. The resulting coefficients are decimated 2:1 (critically filtered), and the filters are then applied to the low-pass coefficients.
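The filter, decimate, and recurse structure can be sketched in one dimension. The 2-tap filters used here (floored average and difference of each sample pair) are an assumed stand-in chosen for brevity; the specification's own filters may differ. The signal length is assumed to be divisible by 2 raised to the number of levels.

```python
def analyze(signal):
    """One level: filter each sample pair and keep one output per pair.

    Operating on non-overlapping pairs performs the 2:1 decimation.
    """
    pairs = list(zip(signal[0::2], signal[1::2]))
    low = [(a + b) // 2 for a, b in pairs]  # assumed low-pass filter
    high = [a - b for a, b in pairs]        # assumed high-pass filter
    return low, high


def decompose(signal, levels):
    """Apply the filter pair repeatedly to the low-pass coefficients."""
    low = list(signal)
    high_bands = []
    for _ in range(levels):
        low, high = analyze(low)
        high_bands.append(high)
    return low, high_bands


signal = [10, 12, 14, 16, 20, 20, 8, 0]
low, high_bands = decompose(signal, levels=2)
coefficient_count = len(low) + sum(len(band) for band in high_bands)
```

Because of the 2:1 decimation, the total number of coefficients equals the number of input samples, and each additional level halves the length of the low-pass band.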

Wavelet tree:
The group of coefficients associated with a single coefficient in the LL portion of the highest-level wavelet decomposition. The number of coefficients is a function of the number of levels. The span of a wavelet tree depends on the number of decomposition levels. For example, with one level of decomposition a wavelet tree spans 4 pixels, with two levels it spans 16 pixels, and so on.
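The two examples given (4 pixels for one level, 16 pixels for two) are consistent with a span of 4 raised to the number of levels, since each two-dimensional decomposition level halves both image dimensions. The closed form below is an inference from those two examples, not a formula stated in this specification.

```python
def wavelet_tree_span(levels):
    """Pixels covered by one wavelet tree: (2 ** levels) squared."""
    side = 2 ** levels  # each level doubles the side covered by one LL coefficient
    return side * side


spans = {levels: wavelet_tree_span(levels) for levels in (1, 2, 3)}
```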

Overview of the Present Invention
The present invention provides a compression/decompression system having an encoding portion and a decoding portion. The encoding portion is responsible for encoding input data to create compressed data, while the decoding portion is responsible for decoding previously encoded data to produce a reconstructed version of the original input data. The input data may comprise a variety of data types, such as images (still or video) and audio. In one embodiment, the data is digital signal data; however, digitized analog data, text data formats, and other formats are also possible. The source of the data may be, for example, a memory or a channel for the encoding portion and/or the decoding portion.

In the present invention, the components of the encoding portion and/or the decoding portion may be implemented in hardware or in software running on a computer system. The present invention provides a lossless compression/decompression system. The present invention may also be configured to perform lossy compression/decompression. The present invention may further be configured to parse compressed data without decompressing it.

Overview of the System of the Present Invention
The present invention represents the smooth edges and flat regions found in natural images very well. Using reversible embedded wavelets, the present invention compresses images with deep pixel depth. However, reversible embedded wavelets, other wavelet transform systems, and sinusoidal transform systems are not good at representing the sharp edges found in text and graphic images. This type of image can be compressed well by Gray coding followed by context-based bit-plane coding, such as JBIG. Furthermore, noise-free computer-generated images are well modeled by the binary scheme.

The present invention provides a binary scheme for the compression of binary and graphic images. This binary scheme also improves compression for certain images that do not use the full dynamic range. In the binary scheme, the present invention encodes the bit planes of the image without using a transform.

FIG. 1 is a block diagram of one embodiment of the compression system of the present invention that employs the binary scheme. Note that the decoding portion of the system operates in reverse order, with a corresponding data flow. In FIG. 1, input image 101 is input to multi-component processing mechanism 111. Multi-component processing mechanism 111 provides optional color space conversion and optional processing of subsampled image components. Scheme selection mechanism 110 determines whether the image is a continuous-tone image or a binary image, or which portions of the image have such characteristics. The image data is sent to scheme selection mechanism 110, which forwards the image data, or portions of it, to either the wavelet scheme processing (blocks 102, 103, 105) or the binary scheme processing (block 104). In the present invention, the decision as to which mode to use is data dependent. In one embodiment, scheme selection mechanism 110 comprises a multiplexer. Scheme selection mechanism 110 is not used during decoding.

In the wavelet scheme, reversible wavelet transform block 102 performs a reversible wavelet transform. The output of block 102 is a series of coefficients. Embedded order quantization block 103 places the coefficients in a bit-significance representation and then labels them, in order to create an alignment of all the coefficients of input image 101 (as generated by reversible wavelet transform block 102).

Image data 101 is received and (after any appropriate multi-component processing) transformed using reversible wavelets in reversible wavelet transform block 102, as described later, to produce a series of coefficients representing a multi-resolution decomposition of the image. The reversible wavelet transform of the present invention is not computationally complicated. The transform may be performed in software or hardware with no systematic error. Furthermore, the wavelets of the present invention are excellent for energy compaction and compression performance. These coefficients are received by embedded order quantization block 103.

Embedded order quantization block 103 performs embedded order quantization, as described later. The result is an embedded data stream. An embedded data stream allows the resulting codestream to be quantized at encode time, transmission time, or decode time. In one embodiment, embedded order quantization block 103 orders the coefficients and converts them into sign-magnitude format.

The embedded data stream is received by horizontal context model block 105. Horizontal context model block 105 models the data in the embedded data stream based on its importance (described later). In the transform mode, the "bit planes" are importance-level planes of the transform coefficients, and horizontal context model block 105 arranges the wavelet coefficients in a bit-significance representation.

The results of ordering and modeling are decisions (or symbols) to be coded by entropy coder 106. In one embodiment, all decisions are sent to a single coder. In another embodiment, decisions are labeled by importance, and the decisions for each importance level are processed by different (physical or virtual) coders. The bitstream is encoded in order of importance by entropy coder 106. In one embodiment, entropy coder 106 comprises one or more binary entropy coders. In another embodiment, Huffman coding is used.

In the binary scheme, Gray coding block 104 performs Gray coding on the pixels of input image 101. Gray coding is a bit operation that takes advantage of some of the correlation between the bit planes of a pixel, because for any values x and x+1, gray(x) and gray(x+1) differ by only one bit in their radix-2 representations. In one embodiment, Gray coding block 104 performs a point-wise transform on 8-bit pixels:

$$\mathrm{gray}(x) = x \oplus \lfloor x/2 \rfloor$$

where $\oplus$ denotes bitwise exclusive OR. The present invention is neither limited to this form of Gray coding nor required to use 8-bit pixels. However, using the above equation has the advantage that a pixel can be reconstructed from only some of the most significant bits available, as in progressive-by-bit-plane transmission. In other words, this form of Gray coding preserves the bit-significance ordering.
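The Gray coding described above can be sketched in a few lines. This is a minimal illustration, assuming non-negative integer pixel values; the function names `gray` and `gray_inverse` are hypothetical and do not come from the patent.

```python
def gray(x: int) -> int:
    """Gray-code a non-negative integer: x XOR floor(x / 2)."""
    return x ^ (x >> 1)


def gray_inverse(g: int) -> int:
    """Recover x from gray(x) by XOR-ing together all right shifts of g."""
    x = 0
    while g:
        x ^= g
        g >>= 1
    return x


# Adjacent 8-bit values differ in exactly one bit after Gray coding.
for value in range(255):
    assert bin(gray(value) ^ gray(value + 1)).count("1") == 1

# The top bits of gray(x) depend only on the top bits of x, which is
# why a pixel can be rebuilt from its most significant bit planes alone.
for value in range(256):
    assert gray(value) >> 3 == gray(value >> 3)
```

The second loop checks the bit-significance property: dropping the three least significant bit planes before or after Gray coding gives the same result.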

In the binary scheme, the data is encoded by bit plane using Gray coding block 104 and entropy coder 106. In one embodiment, the context model within Gray coding block 104 conditions the current bit using spatial and importance-level information.

In the binary scheme, a JBIG-like context model is used on the Gray-coded pixels. In one embodiment, each bit plane of an image tile is coded separately, with each bit conditioned and coded in raster order using the values of ten surrounding pixels. FIG. 2 shows the geometric relationship of the context model for each bit of each bit plane in the binary scheme. This conditioning of the bits leads to an adaptive probability estimate for each unique pattern. Note that when a binary entropy coder is used for bit-plane entropy coding of the Gray-coded values, several different templates may be used for the context model of the binary entropy coder. FIG. 3 shows seven pixels and two bits of bit-plane information for 2^9 context bins.
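As a rough sketch of how ten neighboring pixels can condition the current bit, the following packs the neighbors into a 10-bit context index. The template geometry here is an assumption, loosely modeled on a JBIG-style three-line causal template; the actual geometry is the one shown in FIG. 2. The names `TEMPLATE` and `context_index` are hypothetical.

```python
# Hypothetical causal template: (row offset, column offset) of ten
# already-coded neighbors relative to the current bit.
TEMPLATE = [
    (-2, -1), (-2, 0), (-2, 1),
    (-1, -2), (-1, -1), (-1, 0), (-1, 1), (-1, 2),
    (0, -2), (0, -1),
]


def context_index(plane, row, col):
    """Pack the ten template bits of a bit plane into an integer 0..1023.

    `plane` is a list of rows of 0/1 values. Neighbors outside the
    plane are treated as 0 (an assumed boundary rule).
    """
    ctx = 0
    for d_row, d_col in TEMPLATE:
        r, c = row + d_row, col + d_col
        inside = 0 <= r < len(plane) and 0 <= c < len(plane[0])
        bit = plane[r][c] if inside else 0
        ctx = (ctx << 1) | bit
    return ctx
```

An adaptive coder would keep one probability estimate per context index (1024 of them for this template) and update it after each coded bit.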

Using this context and the value of the current bit, entropy coder 106 produces a bitstream. The same binary entropy coder 106 is used to code data for both the transform mode and the binary scheme. In one embodiment, entropy coder 106 comprises a finite state machine implemented with a look-up table. Note that the present invention may be used with any binary entropy coder, such as the Q-coder, the QM-coder, or a high-speed parallel coder.

Because entropy coder 106 is the same for both schemes and the binary context model is simple, very few additional resources are required to implement both the binary scheme and the transform scheme in the same system. Furthermore, although the context model configurations differ, the resource requirements for both modes are the same. That is, both modes use the same memory for storing contexts and the same binary entropy coder.

The present invention may be performed on the entire image or, more commonly, on tiled segments of the image. Some tiles compress better with the transform scheme and others with the binary scheme. Any number of algorithms may be used to choose which mode to use. When tiles are used, random access on a tile basis is possible. Also, regions of interest can be decoded separately to a higher fidelity. Finally, the choice between the transform scheme and the binary scheme can be made on a tile-by-tile basis.

Note also that the image is still progressive by bit plane when the dual-mode system of the present invention is used, and it may be encoded in a hierarchical format as taught by JBIG.

For decoding, one bit in the tile header may be used to indicate the scheme used to encode the data. Scheme selection mechanism 110 is not used. A lossless mapping from the original dynamic range to a lower dynamic range, for example by histogram compaction (described later), can help further when it is possible. Look-ahead, as in JBIG, may be used. This look-ahead may employ typical prediction or deterministic prediction, as in JBIG.

Selection of the Binary Scheme or the Transform Scheme
Scheme selection mechanism 110 selects between the binary scheme and the transform scheme. In one embodiment, the input image is encoded with both schemes, and scheme selection mechanism 110 selects the scheme that yields the lower bit rate (assuming lossless compression). In other words, the mode that compresses better is chosen. This method may seem costly, but it is not especially so, because both the binary scheme and the transform scheme are relatively fast in software and small in hardware. A variation of this method bypasses the coder and uses entropy values to determine the lower bit rate.
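The entropy-based variation can be illustrated with a small sketch. It assumes that each scheme has already produced a sequence of symbols to be coded, and it uses the zero-order empirical entropy as a stand-in for the bit rate. That is only a rough proxy, since the real coder conditions on contexts. The function names and the tie-breaking rule are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy_bits(symbols):
    """Zero-order empirical entropy of a symbol sequence, in total bits."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c * log2(c / total) for c in counts.values())


def select_scheme(binary_symbols, transform_symbols):
    """Pick the scheme whose symbols have the lower estimated cost.

    Ties go to the binary scheme (an arbitrary choice for this sketch).
    """
    if entropy_bits(binary_symbols) <= entropy_bits(transform_symbols):
        return "binary"
    return "transform"
```

Applied per tile, the same comparison gives the tile-by-tile selection mentioned earlier.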

In another embodiment, the present invention generates a complete (or partial) histogram of the pixel values of the image, or a histogram of the differences between pairs of adjacent pixel values. In the case of the histogram of pixel values, a statistical analysis of the data is used; for example, if the peaks of the histogram occur at values far smaller than the dynamic range of the pixel depth, the binary scheme is selected.

In one embodiment, the present invention generates a complete (or partial) histogram of the first-order differences between pairs of adjacent pixels. For a typical image, such a histogram closely follows a Laplacian distribution, and the wavelet scheme would be used. However, if the histogram does not have the peak of a Laplacian distribution, the binary scheme is used.

Both types of histograms may be generated and used together to select the scheme.

As described later, the d(n) filter outputs of the TS-transform and the TT-transform are close to the first-order statistics. This suggests a method in which the transform is performed and the histogram is generated from its output. The scheme is selected based on that histogram. If the transform scheme is selected, the system continues processing the transform coefficients already generated. If the binary scheme is selected, the transform coefficients are discarded (or inverse transformed, depending on whether the pixels were saved), and the system starts the binary scheme.

In another embodiment, prior knowledge of region segmentation and/or document type may help determine which scheme to select.

If more encoding time is available, the tile size can be chosen to maximize the benefit of the two schemes.

Note that in one embodiment, the system of the present invention does not include binary scheme coding and therefore uses only compression with reversible embedded wavelets (CREW) and the corresponding decompression.

Wavelet Decomposition
The present invention first performs a decomposition of an image (in the form of image data) or another data signal using reversible wavelets. In the present invention, a reversible wavelet transform implements, in integer arithmetic, an exact-reconstruction system, so that a signal with integer coefficients can be losslessly recovered. An efficient reversible transform is one whose transform matrix has a determinant of 1 (or nearly 1).

By using reversible wavelets, the present invention can provide lossless compression with finite-precision arithmetic. The result of applying the reversible wavelet transform to the image data is a series of coefficients.

The reversible wavelet transform of the present invention may be implemented using a set of filters. In one embodiment, the filters are a two-tap low-pass filter and a six-tap high-pass filter. In one embodiment, these filters are implemented using only addition and subtraction (plus hardwired bit shifting).

One embodiment of the present invention that uses the Hadamard transform is an exact-reconstruction system.

For more information on the Hadamard transform, see Anil K. Jain, "Fundamentals of Image Processing", p. 155. A reversible version of the Hadamard transform is referred to herein as the S-transform.

The S-transform may be defined by its outputs, using a generic index n, as follows.

Note that the factor of 2 in the transform coefficient addressing is the result of an implied subsampling by two. This transform is reversible, and its inverse is as follows.

The notation $\lfloor \cdot \rfloor$ means rounding down, that is, truncation, and is sometimes called the floor function. Similarly, the ceiling function $\lceil \cdot \rceil$ means rounding up to the nearest integer.
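A minimal sketch of a reversible S-transform pair follows. It assumes the commonly published form $s(n) = \lfloor (x(2n) + x(2n+1))/2 \rfloor$, $d(n) = x(2n) - x(2n+1)$, with inverse $x(2n) = s(n) + \lfloor (d(n)+1)/2 \rfloor$ and $x(2n+1) = s(n) - \lfloor d(n)/2 \rfloor$. The function names are hypothetical.

```python
def s_forward(x):
    """Forward S-transform of an even-length list of integers.

    Returns (s, d): the smooth (low-pass) and detail (high-pass)
    outputs, each half the length of x.
    """
    half = len(x) // 2
    s = [(x[2 * n] + x[2 * n + 1]) // 2 for n in range(half)]
    d = [x[2 * n] - x[2 * n + 1] for n in range(half)]
    return s, d


def s_inverse(s, d):
    """Exactly recover the original samples from (s, d)."""
    x = []
    for s_n, d_n in zip(s, d):
        x.append(s_n + (d_n + 1) // 2)  # x(2n)
        x.append(s_n - d_n // 2)        # x(2n+1)
    return x
```

Python's `//` operator rounds toward negative infinity, which matches the floor function. In a language whose integer division truncates toward zero, negative values of d(n) need explicit handling.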

Another example of an exact-reconstruction system is the two-six (TS) transform. The reversible TS-transform is defined by the following expressions for the two outputs of the low-pass and high-pass filters.

The TS-transform is reversible, and its inverse is as follows.

Here, p(n) must first be computed according to the following equation.

The results from the low-pass filter can be used twice (in the first and second terms) in the high-pass filter. Therefore, only two additional additions are needed to obtain the result of the high-pass filter.
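The reuse of low-pass results in the high-pass filter can be sketched as follows. This is an illustration under stated assumptions, not the patent's exact definition: the detail output is taken as $d(n) = x(2n) - x(2n+1) + \lfloor (-s(n-1) + s(n+1) + 2)/4 \rfloor$, indexed so that d(n) is centered on sample pair n, and smooth values beyond the signal ends are clamped to the nearest valid one. All names are hypothetical.

```python
def _smooth_at(s, n):
    """s(n), with out-of-range indices clamped (assumed boundary rule)."""
    return s[min(max(n, 0), len(s) - 1)]


def _correction(s, n):
    """Term built from two neighboring low-pass outputs, rounded by +2."""
    return (-_smooth_at(s, n - 1) + _smooth_at(s, n + 1) + 2) // 4


def ts_forward(x):
    """Forward TS-style transform of an even-length list of integers."""
    half = len(x) // 2
    s = [(x[2 * n] + x[2 * n + 1]) // 2 for n in range(half)]
    d = [x[2 * n] - x[2 * n + 1] + _correction(s, n) for n in range(half)]
    return s, d


def ts_inverse(s, d):
    """Inverse: compute p(n) first, then undo the two-tap step."""
    x = []
    for n in range(len(s)):
        p = d[n] - _correction(s, n)   # p(n) = x(2n) - x(2n+1)
        x.append(s[n] + (p + 1) // 2)  # x(2n)
        x.append(s[n] - p // 2)        # x(2n+1)
    return x
```

Because the correction term depends only on the smooth outputs, which the decoder already has, subtracting it recovers p(n) exactly and the remaining step is the same as the S-transform inverse.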

Another example of an exact-reconstruction system is the two-ten (TT) transform. The reversible TT-transform is defined by the following expressions for the two outputs of the low-pass and high-pass filters.

The expression for d(n) can be simplified by using s(n) (moreover, the integer division by 64 can be rounded by adding 32 to the numerator). This results in the following.

The TT-transform is reversible, and its inverse is as follows.

Here, p(n) must first be computed according to the following equation.

In both the TS-transform and the TT-transform, as in the S-transform, the low-pass filter is implemented so that the range of the input signal x(n) is the same as the range of the output signal s(n). That is, there is no growth in the smooth output. If the input signal is b bits deep, the smooth output is also b bits deep. For example, if the signal is an 8-bit image, the output of the low-pass filter is also 8 bits. This is an important property for a pyramidal system, in which the smooth output is decomposed further by, for example, applying the low-pass filter successively. In prior-art systems, the range of the output signal is greater than that of the input signal, which makes successive applications of the filter difficult. Also, because there is no systematic error due to rounding when the transform is performed in integer arithmetic, all of the error in a lossy system can be controlled by quantization. In addition, the low-pass filter has only two taps, which makes it a non-overlapping filter. This property is important for hardware implementation.

In one embodiment, multiplication by 3 and by 22 is implemented with shifts and adds, as shown in FIG. 22. In FIG. 22, the s(n) input is coupled to multiplier 1501, which multiplies it by 2. In one embodiment, this multiplication is implemented as a left shift of the s(n) signal by one bit. The output of multiplier 1501 is added to the s(n) signal by adder 1502, whose output is the 3s(n) signal. The output of adder 1502 is also multiplied by 2 by multiplier 1503, which is implemented as a one-bit left shift. Adder 1505 adds the output of multiplier 1503 to the output of multiplier 1504, which multiplies the s(n) signal by 16 using a four-bit left shift. The output of adder 1505 is the 22s(n) signal.
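The shift-and-add data path of FIG. 22 can be traced in code. The sketch below mirrors the description one stage at a time; the function name is hypothetical, and the comments map each line to the numbered multiplier or adder.

```python
def times_3_and_22(s: int):
    """Return (3*s, 22*s) using only shifts and additions."""
    doubled = s << 1                  # multiplier 1501: 2*s
    three_s = doubled + s             # adder 1502: 3*s
    six_s = three_s << 1              # multiplier 1503: 6*s
    sixteen_s = s << 4                # multiplier 1504: 16*s
    twenty_two_s = six_s + sixteen_s  # adder 1505: 6*s + 16*s = 22*s
    return three_s, twenty_two_s
```

Two adders and three fixed shifts suffice because 22 = 2 * 3 + 16, so the 3s(n) result is reused to form 22s(n).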

The strict reversibility requirement for filters can be relaxed by noting the following. High-pass coefficients are encoded and decoded in a certain order. Pixel values corresponding to previously decoded high-pass coefficients are known exactly, so they can be used in the current high-pass filtering.

The TS-transform and the TT-transform have non-overlapping low-pass analysis and high-pass synthesis filters. Only the high-pass analysis filter and the low-pass synthesis filter are overlapping filters.

The TS filter has good properties with respect to tile boundaries. Consider the case where the tile size is a multiple of the tree size, and consider applying the transform to a segment of a signal, as happens when an image is divided into tiles. Because the low-pass analysis filter is not an overlapping filter, the low-pass coefficients are not affected by tiling. That is, if the segment of the signal has an even number of samples, its low-pass coefficients are the same as they would be if the entire signal were transformed.

During decoding, if no high-pass coefficients are present because of quantization and the image is reconstructed at the highest compression using only the SS coefficients, the low-pass synthesis filter may be used across tile boundaries, and the inverse transform is performed on the entire signal using the low-pass coefficients. Because the SS coefficients are not changed by tiling, the result is exactly the same as when tiling is not used. This eliminates the artifacts caused by performing the forward transform on segments of the signal.

During decoding, if high-pass coefficients are present (but are quantized so that their values have some uncertainty), the following can be done for samples where the overlapping 1D low-pass synthesis filter operation crosses a boundary into another segment. First, for the sample, the minimum and maximum possible reconstruction values are determined based on the filters actually used, which do not cross the tile boundary. Next, the reconstruction value that would have been obtained for that sample using only the low-pass coefficients (and the low-pass filter), and crossing the tile boundary, is determined; this is the overlap estimate. If the overlap estimate lies between the minimum and maximum possible reconstruction values (inclusive), the overlap estimate is used. Otherwise, whichever of the minimum and maximum possible reconstruction values is closer to the overlap estimate is used. This reduces the artifacts caused by performing the forward transform on segments of the signal.
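The selection rule in this paragraph amounts to clamping the overlap estimate to the interval of values that remain consistent with the quantized coefficients. A minimal sketch, with hypothetical names:

```python
def choose_reconstruction(overlap_estimate, min_value, max_value):
    """Pick the reconstruction value for one boundary sample.

    Uses the overlap estimate when it lies in [min_value, max_value];
    otherwise uses whichever bound is closer to the estimate.
    """
    return min(max(overlap_estimate, min_value), max_value)
```

Clamping guarantees that the chosen value is always one the encoder could actually have produced, so the smoothing across the tile boundary never contradicts the decoded data.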

A reconstruction value is chosen each time a 1D filter operation is performed. If this is done correctly, each high-pass coefficient is given exactly one valid reconstruction value, and the choices cannot propagate errors through multiple levels of the transform.

Nonlinear Image Models
One embodiment of the present invention uses wavelet filters that are reversible approximations of linear filters, such as the TS-transform or the TT-transform. In another embodiment, reversible nonlinear filters may be used. One class of nonlinear filters similar to the TS-transform and the TT-transform is as follows.

The inverse transform is the same as for the TS-transform and the TT-transform, except that p(n) is as follows.

In this embodiment, q(n) is an estimate of x(2n) - x(2n+1) from the smooth coefficients (and, if necessary, previous detail coefficients). This estimate uses a nonlinear image model. In one embodiment, the nonlinear image model is a Huber-Markov random field. The nonlinear image model is exactly the same in the forward and inverse transforms. For an iterative image model, the number and order of the iterations are also the same.

One example of a nonlinear image model is as follows. For a fixed number of iterations, each value (pixel or low-pass coefficient) x(k) is adjusted to a new value x'(k), where k is 2n or 2n+1. Any number of iterations may be used; in one embodiment, three iterations are used. The value of q(n) is obtained from the final iteration.

In each iteration, a change y(k) is computed for each x(k).

Here, A is the rate of change, which may be any positive value. In one embodiment, A = 1. B_i is a difference detector. For example, in the one-dimensional case, B_i is as follows.

In the two-dimensional case, there would be four B_i values, detecting differences in the horizontal, vertical, and two diagonal directions. Other difference operators may be used for B_i.

Here, T is a threshold, which may be any positive value. T represents the difference that constitutes an edge in the image. In one embodiment, T = 8.

Subject to the constraint x'(2n) + x'(2n+1) = x(2n) + x(2n+1), the pair of values x'(2n), x'(2n+1) is adjusted by the pair of changes y(2n), y(2n+1). This is accomplished by combining the changes y(2n) and y(2n+1) into a single change y'(2n), which is the largest change supported by both changes.

If y(2n) and y(2n+1) are both positive or both negative, then

$$y'(2n) = 0.$$

Otherwise, if $|y(2n)| < |y(2n+1)|$, then

$$y'(2n) = y(2n),$$

and if not, then

$$y'(2n) = -y(2n+1).$$

Then

$$x'(2n) = x(2n) + y'(2n)$$
$$x'(2n+1) = x(2n+1) - y'(2n).$$

For more information on Huber-Markov random fields, see R. R. Schultz and R. L. Stevenson, "Improved definition image expansion", Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, vol. III, pp. 173-176, San Francisco, March 1992.
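The pair-adjustment rule can be written out directly. The sketch below takes the two proposed changes as given (how they are computed from the image model is not shown) and applies the combination rule; the function names are hypothetical.

```python
def combine_changes(y_even, y_odd):
    """Merge the changes y(2n) and y(2n+1) into a single change y'(2n)."""
    same_sign = (y_even > 0 and y_odd > 0) or (y_even < 0 and y_odd < 0)
    if same_sign:
        return 0          # both samples want to move the same way
    if abs(y_even) < abs(y_odd):
        return y_even
    return -y_odd


def adjust_pair(x_even, x_odd, y_even, y_odd):
    """Return (x'(2n), x'(2n+1)); the sum of the pair is preserved."""
    y = combine_changes(y_even, y_odd)
    return x_even + y, x_odd - y
```

Adding the change to one sample and subtracting it from the other keeps x(2n) + x(2n+1) constant, so the low-pass coefficient of the pair is unaffected by the adjustment.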

In some embodiments, the transform is extended from one dimension to two dimensions by first performing the transform separately on each dimension with q(n) = 0. Then the three q(n) values for the LH, HL, and HH values are computed by a similar application of the image model.

Two-Dimensional Wavelet Decomposition
Multi-resolution decomposition is performed using the low-pass and high-pass filters of the present invention. The number of decomposition levels is variable and may be any number; currently, however, the number of decomposition levels is from two to five. The maximum number of levels is $\log_2$ of the larger of the length and the width.

画像のような２次元データに対し変換を実行する最も普通のやり方は、１次元フィルタを別々に適用する方法、つまり、行に沿って適用したのち列に沿って適用するという方法である。第１レベルの分解により４つの異なった係数バンドが得られ、これら係数バンドは本明細書ではＬＬ，ＨＬ，ＬＨ，ＨＨと呼ぶ。これら文字は、前に定義した平滑(smooth)フィルタと詳細(detail)フィルタの適用を意味するロー（Ｌ）とハイ（Ｈ）を表す。したがって、ＬＬバンドは行方向及び列方向の平滑フィルタより得られた係数からなっている。ウェーブレット係数を図４乃至図７のような形に配置するのが一般的なやりかたである。 The most common way to perform a transform on two-dimensional data such as an image is to apply the one-dimensional filters separately, that is, along the rows and then along the columns. The first level of decomposition yields four different bands of coefficients, referred to herein as LL, HL, LH, and HH. The letters stand for low (L) and high (H), denoting application of the smooth and detail filters defined earlier. The LL band therefore consists of coefficients obtained from the smooth filter in both the row and column directions. It is common practice to arrange the wavelet coefficients in the form shown in FIGS. 4 to 7.

ウェーブレット分解の各周波数サブバンドはさらに分解することができる。最も普通のやりかたはＬＬサブブロックだけをさらに分解する方法であり、これは各分解レベルのＬＬ周波数サブバンドが生成された時にそれをさらに分解することを含むであろう。このような多重分解はピラミッド分解と呼ばれる（図４乃至図７）。記号ＬＬ，ＬＨ，ＨＬ，ＨＨと分解レベル番号によって各分解を示す。なお、本発明のＴＳフィルタ、ＴＴフィルタのいずれによっても、ピラミッド分解は係数サイズを増加させない。 Each frequency subband of the wavelet decomposition can be further decomposed. The most common way is to further decompose only the LL sub-block, which would involve further decomposition of the LL frequency subband for each decomposition level as it is generated. Such multiple decomposition is called pyramid decomposition (FIGS. 4 to 7). Each decomposition is indicated by the symbols LL, LH, HL, HH and the decomposition level number. Note that pyramid decomposition does not increase the coefficient size with either the TS filter or the TT filter of the present invention.

例えば、可逆ウェーブレット変換が再帰的に１つの画像に適用されると、第１レベルの分解は最も細かいディテールもしくは解像度に対し作用する。第１分解レベルで、画像は４つのサブ画像（すなわちサブバンド）に分解される。各サブバンドは、１つの空間周波数帯域を表わしている。第１レベルのサブバンドはＬＬ0，ＬＨ0，ＨＬ0，ＨＨ0と表される。元の画像を分解するプロセスは、水平，垂直の両次元における１／２サブサンプリングを含むので、図４に示されるように、第１レベルのサブバンドＬＬ0，ＬＨ0，ＨＬ0，ＨＨ0はそれぞれ、入力が持っていた画像の画素（または係数）の個数の４分の１の個数の係数を持つ。 For example, when the reversible wavelet transform is applied recursively to an image, the first level of decomposition operates on the finest detail, or resolution. At the first decomposition level, the image is decomposed into four sub-images (i.e., subbands). Each subband represents one band of spatial frequencies. The first-level subbands are denoted LL0, LH0, HL0, and HH0. Since the process of decomposing the original image includes subsampling by two in both the horizontal and vertical dimensions, each of the first-level subbands LL0, LH0, HL0, and HH0 has, as shown in FIG. 4, one-fourth as many coefficients as the input has pixels (or coefficients).

サブバンドＬＬ0は、水平方向の低い周波数情報と垂直方向の低い周波数情報を同時に含んでいる。一般に、画像エネルギーの大部分は当該サブバンドに集中している。サブバンドＬＨ0は、水平方向の低い周波数情報と垂直方向の高い周波数情報（例えば水平方向エッジの情報）を含んでいる。サブバンドＨＬ0は、水平方向の高い周波数情報と垂直方向の低い周波数情報（例えば垂直方向エッジの情報）を含んでいる。サブバンドＨＨ0は、水平方向の高い周波数情報と垂直方向の高い周波数情報（例えばテクスチャ又は斜めエッジの情報）を含んでいる。 The subband LL0 includes low frequency information in the horizontal direction and low frequency information in the vertical direction at the same time. In general, most of the image energy is concentrated in the subband. The subband LH0 includes low frequency information in the horizontal direction and high frequency information in the vertical direction (for example, horizontal edge information). The subband HL0 includes high frequency information in the horizontal direction and low frequency information in the vertical direction (for example, vertical edge information). The subband HH0 includes high frequency information in the horizontal direction and high frequency information in the vertical direction (for example, texture or oblique edge information).

この後に続く第２、第３、第４の下位分解レベルはそれぞれ、前レベルの低周波数ＬＬサブバンドを分解することによって作られる。第１レベルの当該サブバンドＬＬ0が分解されることによって、図５に示すように、やや精細な第２レベルのサブバンドＬＬ1，ＬＨ1，ＨＬ1，ＨＨ1が作られる。同様に、サブバンドＬＬ1が分割されることによって、図６に示すように、精細度の粗い第３レベルのサブバンドＬＬ2，ＬＨ2，ＨＬ2，ＨＨ2が生成される。また、図７に示すように、サブバンドＬＬ2が分割されることにより、精細度がより粗い第４レベルのサブバンドＬＬ3，ＬＨ3，ＨＬ3，ＨＨ3が作られる。２：１のサブサンプリングにより、第２レベルの各サブバンドは、原画像の１６分の１の大きさである。このレベルの各標本（つまり画素）は、原画像中の同一位置のやや細いディテールを表す。同様に、第３レベルの各サブバンドは、原画像の６４分の１の大きさである。第３レベルでの各画素は、原画像中の同一位置のかなり粗いディテールを表す。また、第４レベルの各サブバンドは原画像の２５６分の１の大きさである。 Each of the succeeding second, third, and fourth lower decomposition levels is produced by decomposing the low-frequency LL subband of the preceding level. Decomposing the first-level subband LL0 produces the moderately detailed second-level subbands LL1, LH1, HL1, and HH1, as shown in FIG. 5. Similarly, decomposing subband LL1 produces the coarse-detail third-level subbands LL2, LH2, HL2, and HH2, as shown in FIG. 6, and decomposing subband LL2 produces the still coarser fourth-level subbands LL3, LH3, HL3, and HH3, as shown in FIG. 7. Because of the 2:1 subsampling, each second-level subband is one-sixteenth the size of the original image. Each sample (i.e., pixel) at this level represents moderately fine detail at the same location in the original image. Similarly, each third-level subband is 1/64 the size of the original image, and each pixel at the third level represents fairly coarse detail at the same location in the original image. Each fourth-level subband is 1/256 the size of the original image.

分解画像はサブサンプリングのため原画像より物理的に小さいので、原画像の格納のために使用されたメモリを利用して、分解サブバンド全部を格納できる。つまり、３レベル分割の場合、原画像と分解サブバンドＬＬ0，ＬＬ1は捨てられ、保存されない。 Because subsampling makes the decomposed images physically smaller than the original image, the memory used to store the original image can be reused to store all of the decomposed subbands. That is, in a three-level decomposition, the original image and the decomposed subbands LL0 and LL1 are discarded and not stored.
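
As a quick arithmetic check of the sizes quoted above, the following sketch computes the share of the original sample count held by one subband at each level, and the total share of the subbands that remain stored after a pyramid decomposition. It assumes image dimensions divisible by 2 to the power of the number of levels; the function names are hypothetical.

```python
from fractions import Fraction


def subband_fraction(level):
    """Share of the original sample count in ONE subband at `level` (1-based)."""
    return Fraction(1, 4 ** level)


def stored_fraction(levels):
    """Total share of all subbands kept after a `levels`-deep pyramid.

    LH, HL and HH are kept at every level; LL is kept only at the last
    level, because each earlier LL band was decomposed further.
    """
    detail = sum(3 * subband_fraction(k) for k in range(1, levels + 1))
    return detail + subband_fraction(levels)


for level in range(1, 5):
    print(level, subband_fraction(level))
print(stored_fraction(3))
```

The kept subbands always add up to exactly the original sample count, which is why the memory that held the original image suffices for the decomposition.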

４つのサブバンド分解レベルだけを示したが、個々のシステムの要件に応じて、それ以上のレベルを生成することも可能である。また、ＤＣＴのような他の変換又は一次元配置のサブバンドによって、様々な親子関係を定義してもよい。 Although only four subband decomposition levels are shown, higher levels can be generated depending on the requirements of the individual system. Various parent-child relationships may be defined by other transforms such as DCT or one-dimensionally arranged subbands.

ピラミッド分解 Pyramid decomposition
ウェーブレット分解の各周波数サブバンドはさらに分解することができる。一実施例においては、ＬＬ周波数サブバンドだけが分解される。本明細書では、このような分解はピラミッド分解と呼ばれる。記号ＬＬ，ＬＨ，ＨＬ，ＨＨと分解レベル番号で各分解を表示する。なお、本発明のウェーブレット・フィルタによれば、ピラミッド分解は係数サイズを増大させない。 Each frequency subband of the wavelet decomposition can be decomposed further. In one embodiment, only the LL frequency subband is decomposed. Such a decomposition is referred to herein as a pyramid decomposition. Each decomposition is denoted by the symbols LL, LH, HL, HH and the decomposition level number. Note that with the wavelet filters of the present invention, pyramid decomposition does not increase the coefficient size.

別の実施例では、ＬＬに加えて他のサブバンドも分解されるかもしれない。以下の説明において、”ＬＬ”なる用語は”ＳＳ”（”Ｌ”＝”Ｓ”）と入れ替えて用いてよい。同様に、”Ｈ”なる用語も”Ｄ”と入れ替えて用いてもよい。 In other embodiments, other subbands may be decomposed in addition to LL. In the following description, the term "LL" may be used interchangeably with "SS" ("L" = "S"). Similarly, the term "H" may be used interchangeably with "D".

ウェーブレットのツリー構造 Wavelet tree structure
ピラミッド分解のウェーブレット係数には自然で有用なツリー構造がある。なお、ＬＬ周波数サブブロックは最終の分解レベルに対応したただ一つしかない。これに対し、ＬＨ，ＨＬ，ＨＨのバンドはレベル数と同数存在する。このツリー構造により、ある周波数帯域内の係数の親は、それより低い解像度の同じ周波数帯域内の係数であり、かつ同じ空間位置関係にあることが明らかになる。 The wavelet coefficients of a pyramid decomposition have a natural and useful tree structure. Note that there is only one LL frequency sub-block, corresponding to the final decomposition level, whereas there are as many LH, HL, and HH bands as there are levels. Under this tree structure, the parent of a coefficient in a given frequency band is the coefficient in the same frequency band at a lower resolution and at the same spatial location.

各ツリーのルートは、純粋に平滑な係数である。画像のような２次元の信号の場合、ツリーのルートは３つの”子”を持ち、ほかのノードはそれぞれ４つの子を持つ。この階層的ツリーは２次元信号に限定されない。例えば、１次元信号の場合、ルートは１つの子を持ち、ルート以外のノードはそれぞれ２つの子を持つ。これ以上高い次元は、１次元の場合及び２次元の場合より導かれる。 The root of each tree is a purely smooth coefficient. In the case of a two-dimensional signal such as an image, the root of the tree has three “children” and the other nodes each have four children. This hierarchical tree is not limited to two-dimensional signals. For example, in the case of a one-dimensional signal, the root has one child, and each node other than the root has two children. Higher dimensions are derived from the one-dimensional case and the two-dimensional case.

図８は連続した２レベル間の親子関係を表している。図８において、Ａの係数は、Ｂ，Ｃ，Ｄに対する直接の親であるが、Ｂ，Ｃ，Ｄを親とする係数（ＥとＨ，ＦとＩ，ＧとＪ）に対する親でもある。例えば、Ｂは、Ｅ付近の４係数、Ｈ付近の１６係数、等々に対する親である。 FIG. 8 shows a parent-child relationship between two consecutive levels. In FIG. 8, the coefficient A is a direct parent to B, C, and D, but is also a parent to coefficients (E and H, F and I, G and J) having B, C, and D as parents. For example, B is the parent for 4 coefficients near E, 16 coefficients near H, and so on.
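
The parent-child relation can be sketched as index arithmetic. The sketch below assumes the common convention that the coefficient at (row, col) in one level covers the 2x2 block starting at (2*row, 2*col) in the next finer level of the same orientation; that indexing is an assumption for illustration and is not taken from FIG. 8.

```python
def children(row, col):
    """Four children of a non-root coefficient, one level finer."""
    return [(2 * row + dr, 2 * col + dc) for dr in (0, 1) for dc in (0, 1)]


def descendants(row, col, generations):
    """List the descendants of (row, col), one list per generation."""
    current = [(row, col)]
    result = []
    for _ in range(generations):
        current = [child for (r, c) in current for child in children(r, c)]
        result.append(current)
    return result


generations = descendants(1, 2, 2)
print([len(g) for g in generations])  # 4 children, then 16 grandchildren
```

The counts of 4 and 16 match the example of coefficient B being parent to 4 coefficients near E and 16 coefficients near H.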

多重解像度分解のプロセスは、フィルタ系列を使って遂行し得る。 The process of multi-resolution decomposition can be performed using a filter sequence.

１次元の模範的フィルタを使って実現される１次元２レベル変換の例については、米国特許出願第０８／４９８，６９５号（１９９５年６月３０日受理、“Ｍethod and Ａpparatus Ｆor Ｃompression Ｕsing Ｒeversible Ｗavelet Ｔransforms and an Ｅmbedded Ｃodestream”）及び米国特許出願第０８／４９８，０３６号（１９９５年６月３０日受理”Ｒeversible Ｗavelet Ｔransform and Ｅmbedded Ｃode-stream Ｍanipulation”）を見られたい。 For an example of a one-dimensional, two-level transform implemented with one-dimensional exemplary filters, see U.S. patent application Ser. No. 08/498,695 (filed June 30, 1995, "Method and Apparatus For Compression Using Reversible Wavelet Transforms and an Embedded Codestream") and U.S. patent application Ser. No. 08/498,036 (filed June 30, 1995, "Reversible Wavelet Transform and Embedded Code-stream Manipulation").

フォワード・ウェーブレット変換の実行 Performing the forward wavelet transform
本発明では、水平方向の１−Ｄ（次元）パス、次で垂直方向の１−Ｄパスによってウェーブレット変換が実行される。レベル数で反復回数が決まる。図９は前に定義したようなフォワードＴＴ変換フィルタを使う４レベル分解を表す。 In the present invention, the wavelet transform is performed by a one-dimensional (1-D) pass in the horizontal direction followed by a 1-D pass in the vertical direction. The number of levels determines the number of iterations. FIG. 9 shows a four-level decomposition using the forward TT-transform filter defined earlier.

別の実施例では、任意のレベルの水平又は垂直のウェーブレット変換の任意の時点のＴＴ変換の代わりに、Ｓ変換のような別の可逆ウェーブレット変換フィルタを用いることができる。一実施例では、水平方向と垂直方向の両方にＴＴ変換を利用して４レベル分解が実行される。一実施例では、４レベル分解において、４つのＴＴ変換の中の２つがＳ変換で置き換えられる。これは圧縮の損失は少ないが、メモリ使用量に対する効果は大きい。水平方向の変換と垂直方向の変換は交互に適用されるであろう。 In another embodiment, another reversible wavelet transform filter, such as the S-transform, can be substituted for the TT-transform at any point in the wavelet transform, at any level, horizontal or vertical. In one embodiment, a four-level decomposition is performed using the TT-transform in both the horizontal and vertical directions. In one embodiment, two of the four TT-transforms in a four-level decomposition are replaced with S-transforms. This costs little in compression but has a large effect on memory usage. The horizontal and vertical transforms are applied alternately.

なお、Ｓ変換とＴＴ変換の組合せが水平方向及び垂直方向の変換を実施するために用いられてもよい。変換の順序は雑多でよいが、完全に可逆的であるためには、復号化器はその順序を知って逆の順序で逆操作を実行しなければならない。後述のように、復号化器は変換順序をヘッダで知らされるかもしれない。 Note that a combination of S-transforms and TT-transforms may be used to implement the horizontal and vertical transforms. The transforms may be applied in any mixed order, but for the system to be fully reversible the decoder must know that order and perform the inverse operations in reverse order. As described later, the decoder may be informed of the transform order in the header.
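
The requirement that the decoder undo the transform steps in reverse order can be illustrated with a one-dimensional, multi-level sketch. It uses a common integer form of the S-transform (smooth = floor((a + b) / 2), detail = a - b); that formula is an assumption made here for illustration, this section does not restate the filter definitions, and the TT-transform is not shown.

```python
def s_forward(x):
    """One S-transform level on an even-length list of integers."""
    smooth = [(x[i] + x[i + 1]) // 2 for i in range(0, len(x), 2)]
    detail = [x[i] - x[i + 1] for i in range(0, len(x), 2)]
    return smooth, detail


def s_inverse(smooth, detail):
    """Exact inverse of s_forward."""
    out = []
    for s, d in zip(smooth, detail):
        a = s + (d + 1) // 2  # floor division restores the dropped half
        out += [a, a - d]
    return out


def decompose(x, levels):
    """Apply s_forward repeatedly to the smooth output (pyramid order)."""
    details = []
    for _ in range(levels):
        x, d = s_forward(x)
        details.append(d)
    return x, details


def reconstruct(smooth, details):
    """Undo the levels in the reverse of the order they were applied."""
    for d in reversed(details):
        smooth = s_inverse(smooth, d)
    return smooth


signal = [5, 2, 7, 7, 0, 255, 128, 3]
smooth, details = decompose(signal, 2)
assert reconstruct(smooth, details) == signal
```

If `reconstruct` iterated over `details` in forward order instead, the lengths would not even match, which is a small-scale view of why the order has to be known to the decoder.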

埋め込み順序付け Embedded ordering
本発明では、ウェーブレット分解の結果として生成された係数はエントロピー符号化される。本発明においては、係数は最初に埋め込み符号化（embedded coding）を施されるが、この符号化では、視覚的に重要な順に係数が順序付けられ、または、より一般的に、何等かの誤差規準（例えば、歪み規準）を考慮して係数が順序付けられる。誤差または歪みの規準には、ピーク誤差と平均２乗誤差（ＭＳＥ）が含まれる。また、ビット・シグニフィカンス空間配置（bit-significance spatial location）より、データベース照会のための妥当性を優先させるように、また方向別に（垂直、水平、斜め等）順序付けてもよい。 In the present invention, the coefficients generated by the wavelet decomposition are entropy coded. The coefficients first undergo embedded coding, in which they are ordered by visual significance or, more generally, with respect to some error criterion (e.g., a distortion criterion). Error or distortion criteria include peak error and mean squared error (MSE). The ordering may also be chosen to favor relevance for database querying over bit-significance spatial location, or may be done by direction (vertical, horizontal, diagonal, etc.).

データの順序付けは、符号ストリームの埋め込み量子化したものを生成するために行われる。本発明においては、２つの順序付け方法が用いられる。その一つは係数を順序付けするためのものであり、もう一つは係数中の２進値を順序付けするためのものである。本発明の順序付けは、ビットストリームを生成し、このビットストリームはその後にバイナリ・エントロピー・コーダにより符号化される。 Data ordering is performed to generate an embedded quantized version of the code stream. In the present invention, two ordering methods are used. One is for ordering the coefficients, and the other is for ordering the binary values in the coefficients. The ordering of the present invention produces a bitstream that is then encoded by a binary entropy coder.

タイル Tiles
本発明においては、変換と符号化の前に、画像はタイルに分割される。タイルは、完全独立に符号化される全体画像の部分画像であり、図１１のように番号がつけられた画像上に配置された規則的な矩形格子により定義される。右端と下端のタイルは、原画像及びタイル・サイズに応じて色々なサイズになる。 In the present invention, an image is divided into tiles before transformation and encoding. A tile is a sub-image of the whole image that is encoded completely independently; tiles are defined by a regular rectangular grid placed on the image and numbered as shown in FIG. 11. The tiles on the right and bottom edges vary in size depending on the original image and the tile size.

タイルは画像サイズ以下の任意の高さ、幅にしてよいが、タイル・サイズの選び方は性能に影響する。小さなタイル、特にラスタ順画像の垂直方向の寸法が小さいタイルは、作業域用メモリを減らすことができる。しかし、タイルが小さすぎると、合図のための（signaling）オーバーヘッド、タイル境界での変換効率の損失、エントロピー・コーダの立ち上がり適応という３つの要因により圧縮効率が下がる。タイルの寸法を、最も低い周波数成分の大きさの倍数（ＣＲＥＷツリー）、つまりレベル数の関数（２のレベル数乗）にするのが有利である。原画像のサイズによるが、１２８×１２８又は２５６×２５６のタイルが多くのアプリケーションに適当と思われる。 Tiles may have any height and width up to the image size, but the choice of tile size affects performance. Small tiles, especially tiles that are short in the vertical dimension for raster-ordered images, reduce the working memory required. If the tiles are too small, however, compression efficiency falls for three reasons: signaling overhead, loss of transform efficiency at tile boundaries, and the start-up adaptation of the entropy coder. It is advantageous to make the tile dimensions a multiple of the extent of the lowest-frequency component (the CREW tree), which is a function of the number of levels (2 raised to the number of levels). Depending on the size of the original image, tiles of 128×128 or 256×256 appear suitable for many applications.
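
A minimal sketch of how a regular tile grid partitions an image follows. Numbering tiles in raster order is an assumption (FIG. 11 is not reproduced here), and the function names are hypothetical.

```python
import math


def tile_grid(width, height, tile_w, tile_h):
    """Return (index, x, y, w, h) for each tile, numbered in raster order."""
    cols = math.ceil(width / tile_w)
    rows = math.ceil(height / tile_h)
    tiles = []
    for r in range(rows):
        for c in range(cols):
            # Tiles on the right and bottom edges may be smaller.
            w = min(tile_w, width - c * tile_w)
            h = min(tile_h, height - r * tile_h)
            tiles.append((r * cols + c, c * tile_w, r * tile_h, w, h))
    return tiles


def fits_tree(tile_size, levels):
    """True if the tile size is a multiple of 2**levels (one tree extent)."""
    return tile_size % (2 ** levels) == 0


for tile in tile_grid(600, 400, 256, 256):
    print(tile)
print(fits_tree(256, 4))
```

For a 600×400 image with 256×256 tiles this yields a 3×2 grid whose right column is 88 pixels wide and whose bottom row is 144 pixels high.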

タイルは一連の画像の圧縮のために利用されるかもしれない。したがって、タイルされた画像は、時間的に異なる画像（映画のような）又は空間的に異なる画像（ＭＲＩの如き３Ｄ断面のような）かもしれない。これを知らせる格別の方法はないが、ＣＭＴが利用されるかもしれない。 Tiles may be used for compression of a series of images. Thus, a tiled image may be a temporally different image (such as a movie) or a spatially different image (such as a 3D cross section such as MRI). There is no special way to signal this, but CMT may be used.

変換、コンテキスト・モデル、エントロピー符号化は１つの画像タイルの画素及び係数だけに作用する。それゆえに、画像タイルは順序によらずに解析又は復号化することができ、すなわちランダムにアドレスすることができ、あるいは、注目領域の伸長のため様々な歪みレベルに復号化することができる。 Transforms, context models, and entropy coding work only on the pixels and coefficients of one image tile. Therefore, image tiles can be analyzed or decoded out of order, i.e., randomly addressed, or decoded to various distortion levels for region of interest expansion.

１つの画像タイルの画素データは全部、符号化器で一度に利用できる、例えばメモリにバッファされる。ひとたび画素データが変換されれば、全ての係数データを水平コンテキスト・モデルに利用できる。全ての係数をランダムにアクセスできるので、画像タイル内部の埋め込みの順序は、符号化器及び復号化器が知っている限り任意でよい。エントロピー・コーダは、この順序付けに関し無頓着であるから、その順序は圧縮率に大きな影響を及ぼすので注意して選ばねばならない。 All of the pixel data of one image tile is available to the encoder at once, for example buffered in memory. Once the pixel data has been transformed, all of the coefficient data is available to the horizontal context model. Since every coefficient can be accessed randomly, the embedding order within an image tile may be arbitrary as long as it is known to both the encoder and the decoder. The entropy coder itself is indifferent to this ordering, but the order strongly affects the compression ratio and must therefore be chosen with care.

画像タイルは矩形配置されたツリー（ＬＬ係数及びその全ての子孫）の番号により定義される。各ツリー内の画素の数はウェーブレット分解のレベル数の関数である。 An image tile is defined in terms of the number of trees (an LL coefficient and all of its descendants) arranged in a rectangle. The number of pixels in each tree is a function of the number of levels of wavelet decomposition.

画像基準格子は、各成分の大きさが格子点の整数倍である最小の格子面である。このことは、殆どの画像で、画像基準格子が最頻成分と同一であることを暗に意味する。 The image reference grid is the smallest grid plane on which the extent of each component is an integer multiple of grid points. For most images, this implies that the image reference grid is the same as the most frequent component.

１成分の画像又は全成分が同一サイズの画像の場合、画像基準格子は画像と同一サイズである（例えば格子点は画像の画素である）。複数の成分の全部が同一サイズというわけではない画像の場合、そのサイズは画像基準格子点の整数倍と定義される。例えば、ＣＣＩＲ601 ＹＣrＣb色成分系は、各Ｃr及びＣb成分に対し２つのＹ成分を持つように定義される。したがって、Ｙ成分は画像基準格子を定義し、Ｃr成分とＣb成分はそれぞれ水平方向に２単位を、垂直方向に１単位をカバーする。 When a one-component image or all components have the same size, the image reference grid is the same size as the image (for example, grid points are image pixels). In the case of an image in which all of the plurality of components are not the same size, the size is defined as an integer multiple of the image reference grid point. For example, the CCIR601 YCrCb color component system is defined to have two Y components for each Cr and Cb component. Therefore, the Y component defines the image reference grid, and the Cr component and the Cb component each cover 2 units in the horizontal direction and 1 unit in the vertical direction.

ビット・シグニフィカンス表現 Bit-significance representation
一実施例では、係数内の２進値に対し用いられる埋め込み順序はビットプレーン順である。係数はビット・シグニフィカンス表現で表される。ビット・シグニフィカンス表現は、最上位ビット（ＭＳＢ）ではなくて符号（sign）ビットが、最初の非ゼロの絶対値ビットと共に符号化される符号・絶対値表現である。 In one embodiment, the embedding order used for the binary values within the coefficients is bit-plane order. The coefficients are expressed in bit-significance representation. Bit-significance representation is a sign-magnitude representation in which the sign bit, rather than being the most significant bit (MSB), is encoded together with the first non-zero magnitude bit.

ビット・シグニフィカンス形式で表現される数には３種類のビット、すなわちヘッド(head)ビット、テール(tail)ビット及び符号(sign)ビットがある。ヘッドビットとは、ＭＳＢから最初の非ゼロ絶対値ビットまでの全てのゼロビットと、その最初の非ゼロ絶対値ビットである。その最初の非ゼロ絶対値ビットが存在するビットプレーンで、係数の重要性が定まる。最初の非ゼロ絶対値ビットの後からＬＳＢまでのビットがテールビットである。符号ビットは符号(sign)を表わすにすぎない。ＭＳＢが非ゼロビットの数、例えば±２nは、ヘッドビットを１ビットしか持たない。ゼロの係数は、テールビットも符号ビットも持たない。図１２にビット・シグニフィカンス表現の例を示す。 A number expressed in bit-significance form has three types of bits: head bits, tail bits, and a sign bit. The head bits are all of the zero bits from the MSB down to the first non-zero magnitude bit, plus that first non-zero magnitude bit itself. The bit plane in which the first non-zero magnitude bit occurs determines the significance of the coefficient. The bits after the first non-zero magnitude bit down to the LSB are the tail bits. The sign bit simply denotes the sign. A number whose MSB is non-zero, for example ±2^n, has only one head bit. A zero coefficient has neither tail bits nor a sign bit. FIG. 12 shows examples of bit-significance representation.

画素の輝度に関連して起こるような、値が非負整数の場合、採用し得る順序はビットプレーン順（例えば、最上位ビットプレーンから最下位ビットプレーンへの順）である。２の補数による負整数も許容される実施例では、符号ビットの埋め込み順序は、整数の絶対値の最初の非ゼロビットと同じである。したがって、１つの非ゼロビットが符号化されるまで、符号ビットは考慮されない。例えば、符号・絶対値表記法によれば、−７の１６ビット数は
１００００００００００００１１１
である。ビットプレーン・ベースで、初めの１２デシジョン（decision）は“無意味”すなわちゼロとなる。最初の１のビットは１３番目のデシジョンに見つかる。次に符号ビット（“負”）が符号化される。符号ビットが符号化された後、テールビットが処理される。１４番目と１５番目のデシジョンは共に“１”である。
When the values are non-negative integers, as occurs with pixel intensities, the order that can be used is bit-plane order (e.g., from the most significant to the least significant bit plane). In embodiments where two's complement negative integers are also allowed, the embedding order of the sign bit is the same as that of the first non-zero bit of the absolute value of the integer. The sign bit is therefore not considered until a non-zero bit has been coded. For example, in sign-magnitude notation the 16-bit number -7 is
1000000000000111
On a bit-plane basis, the first 12 decisions are "insignificant", i.e., zero. The first 1 bit is found at the 13th decision. The sign bit ("negative") is coded next. After the sign bit has been coded, the tail bits are processed. The 14th and 15th decisions are both "1".
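
The decision sequence for the -7 example can be reproduced with a short sketch. The function name and the tuple labels are hypothetical; the sign is listed as its own entry immediately after the first non-zero magnitude bit, so the 14th and 15th magnitude decisions in the text appear here as the two entries that follow the sign entry.

```python
def bit_significance_decisions(value, magnitude_bits=15):
    """List the coding decisions for one coefficient, MSB plane first."""
    magnitude = abs(value)
    decisions = []
    significant = False
    for plane in range(magnitude_bits - 1, -1, -1):
        bit = (magnitude >> plane) & 1
        if not significant:
            decisions.append(("head", bit))
            if bit:
                # The sign is coded right after the first non-zero bit.
                decisions.append(("sign", "-" if value < 0 else "+"))
                significant = True
        else:
            decisions.append(("tail", bit))
    return decisions


for kind, bit in bit_significance_decisions(-7):
    print(kind, bit)
```

For -7 this gives twelve zero head bits, one head bit equal to 1, the negative sign, and two tail bits equal to 1. A zero coefficient produces only zero head bits, with no sign or tail entries.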

係数は最上位のビットプレーンから最下位のビットプレーンへと符号化されるので、データのビットプレーン数が正確にわからなければならない。本発明においては、データから計算される、又は画像の深度及びフィルタ係数から導き出される係数値の絶対値の上限を見つけることによって、ビットプレーン数が決定される。例えば、その上限が１４９のときには、有意な８ビットつまり８つのビットプレーンがある。ソフトウエアの速度のため、ビットプレーン符号化は用いられないかもしれない。別の実施例では、ビットプレーンが符号化されるのは、係数が２進数として意味をなす時だけである。 Since the coefficients are coded from the most significant bit plane to the least significant bit plane, the number of bit planes in the data must be known exactly. In the present invention, the number of bit planes is determined by finding an upper bound on the magnitudes of the coefficient values, either computed from the data or derived from the depth of the image and the filter coefficients. For example, if the upper bound is 149, there are 8 significant bits, i.e., 8 bit planes. For the sake of software speed, bit-plane coding may not be used. In another embodiment, a bit plane is coded only when a coefficient becomes significant as a binary number.
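
The bit-plane count for a given upper bound is simply the bit length of that bound. A minimal sketch, with a hypothetical function name, using Python's built-in `int.bit_length`:

```python
def bit_planes(upper_bound):
    """Number of magnitude bit planes needed for values up to upper_bound."""
    return abs(int(upper_bound)).bit_length()


def bit_planes_from_data(coefficients):
    """Same count, with the bound computed from the data itself."""
    return bit_planes(max(abs(c) for c in coefficients))


print(bit_planes(149))
print(bit_planes_from_data([3, -149, 12]))
```

An upper bound of 149 lies between 128 and 255, so both calls report 8 bit planes, matching the example in the text.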

係数アラインメント（alignment） Coefficient alignment
本発明は、ビットプレーン符号化の前に係数相互のアラインメントを行う。これは、ＦＦＴやＤＣＴと同様、異なった周波数サブバンド内の係数は異なった周波数を表すからである。本発明は、係数のアラインメントを行うことにより量子化を可能にする。量子化の重さが小さい係数ほど早いビットプレーン側へアラインメントされる（例えば左へシフトされる）。よって、ストリームが打ち切りされる場合、これらの係数は、それを定義するビットが、より重く量子化された係数に比べ多くなる。 The present invention aligns the coefficients with respect to one another before bit-plane coding. This is because, as with the FFT or the DCT, coefficients in different frequency subbands represent different frequencies. Aligning the coefficients is what makes quantization possible. Coefficients that are to be quantized less heavily are aligned toward the earlier bit planes (e.g., shifted to the left). Thus, if the stream is truncated, these coefficients have more bits defining them than the more heavily quantized coefficients.

一実施例では、係数はＳＮＲ又はＭＳＥの見地から最高のレート・歪み性能が得られるようにアラインメントがなされる。ＭＳＥのような統計的誤差基準から見てほぼ最適のアラインメントを含め、多くのアラインメントが可能である。あるいは、アラインメントは係数データの物理視覚的（psychovisual)量子化を許すかもしれない。アラインメントは画像品質に（換言すればレート・歪み曲線に）相当な影響を及ぼすが、非損失性システムの最終的な圧縮率には殆ど影響しない。他のアラインメントは、特殊な量子化である注目領域忠実度符号化や解像度プログレッシブアラインメントに対応するかもしれない。 In one embodiment, the coefficients are aligned for the best rate-distortion performance in terms of SNR or MSE. Many alignments are possible, including ones that are nearly optimal with respect to a statistical error criterion such as MSE. Alternatively, the alignment may allow psychovisual quantization of the coefficient data. The alignment has a considerable effect on image quality (in other words, on the rate-distortion curve) but almost no effect on the final compression ratio of the lossless system. Other alignments may correspond to specialized quantization, such as region-of-interest fidelity coding or resolution-progressive alignment.

アラインメントは、圧縮データのヘッダで通知されるかもしれない。係数はビット・シグニフィカンス順に符号化されるが、最上位の重要性レベルは符号化単位中の係数から導き出される。各係数の符号ビットは、その係数が非ゼロの絶対値ビットを持つ最も上位の重要性レベルまで符号化されない。これは、絶対値がゼロの係数の符号ビットを符号化しないという利点がある。また、符号ビットは、埋め込み符号ストリーム中のそれが関連する点まで符号化されない。様々なサイズの係数に関するアラインメントは、符号化器及び復号化器のいずれも分かっているので、エントロピー・コーダの効率に全く影響を与えない。 The alignment may be signaled in the header of the compressed data. The coefficients are coded in bit-significance order, with the highest importance level derived from the coefficients in the coding unit. The sign bit of each coefficient is not coded until the highest importance level at which that coefficient has a non-zero magnitude bit. This has the advantage that no sign bit is coded for a coefficient whose magnitude is zero. In addition, a sign bit is not coded until the point in the embedded code stream where it becomes relevant. Because the alignment of the variously sized coefficients is known to both the encoder and the decoder, it has no effect on the efficiency of the entropy coder.

ｂビット／画素の画像の２レベルのＴＳ変換及びＴＴ変換分解における係数のビット深度を図１３に示す。図１４は、本発明における係数アラインメントに用いられる周波数帯域用乗数の例である。係数のアラインメントのために、1-HH係数のサイズが基準として用いられ、このサイズに対し相対的にシフトが与えられる。 FIG. 13 shows the bit depth of the coefficients in the two-level TS transform and TT transform decomposition of an image of b bits / pixel. FIG. 14 is an example of a frequency band multiplier used for coefficient alignment in the present invention. For coefficient alignment, the size of the 1-HH coefficient is used as a reference and a shift is given relative to this size.

一実施例では、画像中の全ての係数のアラインメントを生成するため、係数は最大の係数の絶対値を考えてシフトされる。アラインメント後の係数は、次に、重要性レベルと呼ばれるビットプレーン単位で、最上位の重要性レベル（ＭＳＩＬ）より最下位の重要性レベル（ＬＳＩＬ）へと処理される。符号(sign)ビットは、ＭＳＩＬの一部ではないので、各係数の最後のヘッドビットまで符号化されない。重要なことは、アラインメントはエントロピー・コーダへビットが送られる順序を制御するに過ぎないことである。割増の０のビットのパッディング、シフト、格納、符号化が実際に行われるわけではない。 In one embodiment, the coefficients are shifted to account for the absolute value of the largest coefficient to produce an alignment of all coefficients in the image. The aligned coefficients are then processed in bit plane units, called importance levels, from the highest importance level (MSIL) to the lowest importance level (LSIL). Since the sign bit is not part of the MSIL, it is not encoded until the last head bit of each coefficient. Importantly, alignment only controls the order in which bits are sent to the entropy coder. There is no actual padding, shifting, storing, or encoding of the extra zero bits.
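
The idea that alignment only controls the order in which bits reach the entropy coder, without any padded zero bits being stored or coded, can be sketched as follows. The per-coefficient shift values used here are purely hypothetical; the actual multipliers of FIG. 14 are not reproduced in this section, and sign bits are left out for brevity.

```python
def importance_levels(coeffs):
    """Group magnitude bits by importance level, highest level first.

    coeffs: list of (name, magnitude, bit_depth, shift), where `shift`
    is the (hypothetical) left shift assigned by the alignment.
    """
    top = max(depth + shift for _, _, depth, shift in coeffs)
    levels = []
    for level in range(top - 1, -1, -1):
        row = []
        for name, magnitude, depth, shift in coeffs:
            plane = level - shift
            # A coefficient contributes a bit only where it really has one;
            # no padding zeros are generated above or below its own bits.
            if 0 <= plane < depth:
                row.append((name, (magnitude >> plane) & 1))
        levels.append(row)
    return levels


example = [("a", 0b101, 3, 2), ("b", 0b011, 3, 0)]
for row in importance_levels(example):
    print(row)
```

In this example coefficient "a" is shifted left by two, so truncating the stream after the first three importance levels keeps all three bits of "a" but only one bit of "b", which is the quantization effect described above.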

表１はアラインメントの例を示す。 Table 1 shows an example of alignment.

様々なサイズの係数に関するアラインメントは、符号化器と復号化器の両方に分かっているので、エントロピー・コーダの効率にはまったく影響を与えない。 Since the alignment for coefficients of various sizes is known to both the encoder and decoder, it does not affect the efficiency of the entropy coder at all.

同じデータセットの符号化単位が異なったアラインメントを持っても構わないことに注意されたい。 Note that the coding units of the same data set may have different alignments.

符号ストリームの順序付け Code stream ordering
図１５は、符号化ストリームの順序付けと符号化単位内における順序付けを示している。図１５において、ヘッダ１００１の後に、符号化単位１００２が最も上の帯域より最も下の帯域へと順に続く。符号化単位の内部では、ＬＬ係数１００３は符号化されずにラスター（ライン）順に格納される。ＬＬ係数の後に、重要性レベルが、１ビットプレーンずつ、最上位のビットプレーンから最下位のビットプレーンへと順にエントロピー符号化される。この時、すべての係数の第１ビットプレーンが符号化され、次に第２ビットプレーンが符号化され、以下同様に符号化される。一実施例では、アラインメントはヘッダ１００１中に指定される。 FIG. 15 shows the ordering of the code stream and the ordering within a coding unit. In FIG. 15, the header 1001 is followed by the coding units 1002 in order from the top band to the bottom band. Within a coding unit, the LL coefficients 1003 are stored uncoded in raster (line) order. After the LL coefficients, the importance levels are entropy coded one bit plane at a time, from the most significant bit plane to the least significant. That is, the first bit plane of all coefficients is coded, then the second bit plane, and so on. In one embodiment, the alignment is specified in the header 1001.

一実施例では、ＬＬ係数は、それが８ビット値のときは、符号化されずにラスター順に格納されるにすぎない。ＬＬ係数のサイズが８ビット未満のときには、ＬＬ係数はパッディングにより８ビットにされる。ＬＬ係数が８ビットより長いときには、ＬＬ係数は次のように格納される。まず、各係数の最上位の８ビットが符号化されずラスター順に格納される。次に、係数の残りの下位ビットはパックされてラスター順に格納される。例えば、１０ビットのＬＬ係数の場合、４個のＬＬ係数の最下位ビットは１バイトにパックされる。このようにして、実際の画像の深度にかかわらず、係数毎に８ビットのＬＬデータを得られるので、簡略画像もしくはプレビュー画像の素早い生成が可能になる。 In one embodiment, LL coefficients that are 8-bit values are simply stored uncoded in raster order. If the LL coefficients are smaller than 8 bits, they are padded to 8 bits. If the LL coefficients are longer than 8 bits, they are stored as follows. First, the most significant 8 bits of each coefficient are stored uncoded in raster order. Then the remaining low-order bits of the coefficients are packed and stored in raster order. For example, with 10-bit LL coefficients, the two least significant bits of each of four LL coefficients are packed into one byte. In this way, 8 bits of LL data are available for each coefficient regardless of the actual image depth, which allows thumbnail or preview images to be generated quickly.
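
The storage scheme for LL coefficients deeper than 8 bits can be sketched as below. The sketch assumes non-negative coefficients and packs the low-order bits most significant first; both points, and the function name `pack_ll`, are assumptions for illustration. Coefficients of 8 bits or fewer are simply stored one per byte here, since the text does not say how the padding is applied.

```python
def pack_ll(coeffs, depth):
    """Split LL coefficients into one preview byte each plus packed low bits."""
    if depth <= 8:
        # Each value fits in one byte; the exact padding rule is unspecified.
        return bytes(coeffs), b""
    extra = depth - 8
    # Most significant 8 bits of every coefficient, in raster order.
    top = bytes(c >> extra for c in coeffs)
    # Remaining low-order bits, packed back to back into bytes.
    acc, nbits, packed = 0, 0, bytearray()
    for c in coeffs:
        acc = (acc << extra) | (c & ((1 << extra) - 1))
        nbits += extra
        while nbits >= 8:
            nbits -= 8
            packed.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1
    if nbits:
        packed.append((acc << (8 - nbits)) & 0xFF)  # zero-fill the last byte
    return top, bytes(packed)


top, low = pack_ll([0b1111111101, 0b0000000010, 0b1000000011, 0b0101010100], 10)
print(list(top), list(low))
```

With four 10-bit coefficients, the first part holds four preview bytes and the second part holds a single byte containing the four 2-bit remainders, as in the example in the text.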

各ビットプレーン期間内に係数が処理される順序は、低い解像度より高い解像度へ、かつ低い周波数より高い周波数への順である。各ビットプレーン内の係数サブバンドの順序は、高いレベル（低解像度、低周波数）より低いレベル（高分解能、高周波数）への順である。各周波数サブバンドの内部において、符号化はある決まった順序でなされる。一実施例では、その順序はラスター順、２×２ブロック順、ジグザグ順、Ｐeanoスキャン順、等々である。 The order in which the coefficients are processed within each bit plane period is from lower resolution to higher resolution and lower frequency to higher frequency. The order of coefficient subbands within each bit plane is from higher level (low resolution, low frequency) to lower level (high resolution, high frequency). Within each frequency subband, the encoding is done in a certain order. In one embodiment, the order is raster order, 2 × 2 block order, zigzag order, Peano scan order, and so on.

図１５の符号ストリームを用いる４レベル分解の場合、その順序は次のとおりである。 In the case of a four-level decomposition using the code stream of FIG. 15, the order is as follows.
4-LL, 4-HL, 4-LH, 4-HH, 3-HL, 3-LH, 3-HH, 2-HL, 2-LH, 2-HH, 1-HL, 1-LH, 1-HH
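
The same ordering can be generated for any number of levels. A minimal sketch, with a hypothetical function name:

```python
def subband_order(levels):
    """Subband coding order within a bit plane, coarsest level first."""
    order = [f"{levels}-LL"]
    for level in range(levels, 0, -1):
        order += [f"{level}-{band}" for band in ("HL", "LH", "HH")]
    return order


print(", ".join(subband_order(4)))
```

For four levels this reproduces the thirteen subbands listed above, from 4-LL down to 1-HH.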

符号ストリーム・データを重要性によって分けると、データを媒体に格納したり、ノイズのある通信路により伝送するのに有利である。データの異なった部分に、異なった冗長度を持つ誤り訂正／検出符号を用いることができる。最も高い冗長度の符号をＬＬ係数のヘッダに用いることができる。（重要度を基準にして）重要性の低いエントロピー符号化データは、冗長度の低い誤り訂正／検出符号を用いることができる。訂正不可能な誤りが生じたときには、データのパケットを廃棄（量子化）すべきか、または、通信路より再送信すべきか、もしくは記憶装置から再読みだしすべきかを決定するために、そのデータの重要性レベルも利用できる。例えば、高冗長度の誤り訂正ＢＣＨ符号（Ｒeed-Ｓolomon符号など）が、ヘッダデータ、ＬＬデータ、エントロピー符号化データの最も重要な４分の１のために用いられるかもしれない。エントロピー符号化データの残りの４分の３は、低冗長度の誤り検出チェックサム又はＣＲＣ（巡回冗長検査）によって保護されるであろう。一実施例では、ＢＣＨ符号を用いるパケットは常に再送信されて廃棄されず、一方、チェックサム又はＣＲＣ符号を持つパケットは再送信されず、データの送信を試みて失敗した後に廃棄されるであろう。 Separating the code stream data by importance is advantageous when storing the data on a medium or transmitting it over a noisy channel. Error correcting/detecting codes with different amounts of redundancy can be used for different parts of the data. The code with the highest redundancy can be used for the header and the LL coefficients. Entropy-coded data of lower importance (as ranked by importance level) can use error correcting/detecting codes with less redundancy. When an uncorrectable error occurs, the importance level of the data can also be used to decide whether a packet of data should be discarded (quantized), retransmitted over the channel, or reread from storage. For example, a high-redundancy error-correcting BCH code (such as a Reed-Solomon code) may be used for the header data, the LL data, and the most important quarter of the entropy-coded data. The remaining three quarters of the entropy-coded data would be protected by a low-redundancy error-detecting checksum or CRC (cyclic redundancy check). In one embodiment, packets that use the BCH code are always retransmitted and never discarded, whereas packets with a checksum or CRC are not retransmitted and are discarded after a failed attempt to transmit the data.

一実施例においては、ヘッダデータがデータの各部分に使用される誤り訂正／検出符号を指定する。言い換えれば、ヘッダ中の情報が、誤り訂正符号を切り替えるべき時点を指示する。一実施例では、誤り訂正／検出符号は、通信路に利用されるパケット間で、又は記憶媒体に利用されるブロック間で変更されるだけである。 In one embodiment, the header data specifies an error correction / detection code that is used for each piece of data. In other words, the information in the header indicates when to switch the error correction code. In one embodiment, the error correction / detection code is only changed between packets used for the communication path or between blocks used for the storage medium.

図２６は、ヘッダ１９０１の後に符号化されないＬＬ係数（１９０２）とエントロピー符号化データ１９０３が埋め込み順に続く符号ストリームを示す。図示のように、ヘッダ１９０１とＬＬ係数１９０２は最高冗長度の符号を使用するが、エントロピー符号化データ１９０３は最低冗長度の符号を使用する。本発明は、最高の冗長度から最低の冗長度までの多くの異なった符号が使用されるスライディングスケールを採用することもできる。 FIG. 26 shows a code stream in which an LL coefficient (1902) that is not encoded and entropy encoded data 1903 follow the header 1901 in the embedding order. As shown, the header 1901 and the LL coefficient 1902 use the code with the highest redundancy, while the entropy encoded data 1903 uses the code with the lowest redundancy. The present invention can also employ a sliding scale in which many different codes are used, from the highest redundancy to the lowest redundancy.

水平コンテキストモデル Horizontal context model
本発明に利用される水平コンテキストモデルの一実施例を以下に説明する。このモデルは、係数の空間及びスペクトル従属性に基づいて符号化単位内のビットを利用する。隣接した係数及び親係数の利用可能な２進値を、コンテキストを生成するために使用してもよい。しかし、コンテキストはデコーダビリティを左右し、また、多少は効率的適応に影響を及ぼす。 One embodiment of the horizontal context model used in the present invention is described below. The model uses bits within a coding unit based on the spatial and spectral dependencies of the coefficients. The available binary values of neighboring coefficients and parent coefficients may be used to create contexts. The contexts, however, determine decodability and, to some extent, affect efficient adaptation.

Coefficient Modeling with the Horizontal Context Model
The present invention provides a context model for modeling the codestream produced by coefficients in embedded bit-significance order for a binary entropy coder. In one embodiment, the context model consists of a run-length count, a spatial model, a sign bit model, and a tail bit model. The run-length count measures runs of bits in the same state. The spatial model contains information from neighboring coefficients and parent coefficients for the head bits.

FIG. 16 shows the neighboring coefficients of every coefficient in a coding unit. In FIG. 16, the neighboring coefficients are denoted by an easy-to-read geographical notation (for example, N = north, NE = northeast, and so on). Given a coefficient, such as P in FIG. 16, and a current bit-plane, the context model can use any information from all of the coding unit prior to that bit-plane. In this context model, the parent coefficient of the coefficient of interest is also used.

Horizontal Head Bit Context Model
Head bits are the most compressible data. Therefore, a large amount of context, or conditioning, is used to improve compression. Rather than using the values of the neighboring or parent coefficients directly to determine the context for the current bit of the coefficient of interest, that information is condensed into two bits, referred to herein as tail information. This information may be stored in memory or computed dynamically from the neighboring or parent coefficients. The tail information indicates whether the first non-zero magnitude bit has already been observed (that is, whether the first "on" bit has been observed) and, if so, how many bit-planes ago that occurred. Table 2 describes the tail information bits.

From the two bits of tail information, a one-bit "tail-on" bit is synthesized that indicates whether the tail information is zero or not. In one embodiment, the tail information and the tail-on bit are updated immediately after a coefficient has been coded. In another embodiment, the update is performed later to allow parallel context generation.

In addition, two bits are used to indicate the importance level being coded. The first two bit-planes use the value 0, the second two bit-planes use the value 1, the third two bit-planes use the value 2, and the remaining bit-planes use the value 3. In addition, there is run-length coding of head bits that are all zero.

The 10 bits of context for the head bits consist of 2 bits of information each from the parent coefficient and the W coefficient, 1 bit of information from each of the N, E, SW, and S coefficients, and 2 bits of importance level information.
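A minimal sketch of how such a 10-bit head-bit context might be assembled is given below. The function names, the order in which the fields are packed, and the convention that bit-plane index 0 is the first bit-plane coded are all assumptions made for illustration; the text only fixes which fields contribute and how wide each one is.

```c
#include <assert.h>

/* Importance level: two bit-planes per level, saturating at 3.
   plane_index is assumed to count from 0 at the first bit-plane coded. */
unsigned importance_level(unsigned plane_index)
{
    unsigned level = plane_index / 2;
    return level > 3 ? 3 : level;
}

/* One-bit "tail-on" flag synthesized from the 2-bit tail information. */
unsigned tail_on(unsigned tail_info)
{
    return (tail_info & 3u) != 0 ? 1u : 0u;
}

/* Pack a 10-bit context: 2 bits each from the parent and W coefficients,
   1 tail-on bit each from N, E, SW and S, and 2 bits of importance level.
   The bit order is an arbitrary choice for this sketch. */
unsigned head_context(unsigned parent_tail, unsigned w_tail,
                      unsigned n_tail, unsigned e_tail,
                      unsigned sw_tail, unsigned s_tail,
                      unsigned plane_index)
{
    unsigned ctx = parent_tail & 3u;
    ctx = (ctx << 2) | (w_tail & 3u);
    ctx = (ctx << 1) | tail_on(n_tail);
    ctx = (ctx << 1) | tail_on(e_tail);
    ctx = (ctx << 1) | tail_on(sw_tail);
    ctx = (ctx << 1) | tail_on(s_tail);
    ctx = (ctx << 2) | importance_level(plane_index);
    assert(ctx < 1024u);                 /* 2 + 2 + 4 + 2 = 10 bits */
    return ctx;
}
```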

In one embodiment, the tail information is not used for some or all frequency bands. This allows a frequency band to be decoded without first decoding its parent.

In another embodiment, one alignment is used to assign the bit-planes of each frequency band to importance levels. A second alignment is used to determine the tail-on information of the parent, and it uses fewer bit-planes of the parent than have actually been coded. This allows some bit-planes of a frequency band to be decoded without decoding the corresponding parent bit-planes of the same importance level (see FIG. 51). For example, an image can be encoded with pyramidal alignment regardless of parent tail-on information based on MSE alignment (see FIG. 50). This allows the decoder to decode in pyramidal alignment, to simulate MSE alignment, or to simulate any alignment between pyramidal and MSE alignment.

FIG. 29 shows the context dependencies. A child is conditioned on its parent. Therefore, a parent must be decoded before its children are decoded, particularly when decoding with an alignment different from the one used during encoding.

Horizontal Sign Bit Context Model
After the last head bit, the sign bit is coded. There are three contexts for the sign, depending on whether the N coefficient is positive, is negative, or has not yet had its sign coded.

Horizontal Tail Bit Context Model
There are three contexts for the tail bits, depending on the value of the tail information of the coefficient of interest. (Note that when a tail bit is being coded, the tail information value can only be 1, 2, or 3.)
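The two three-way context choices just described might be sketched as follows. The function names and the numbering of the contexts (0, 1, 2) are assumptions for illustration; the sign context uses the north coefficient's tail information to tell whether its sign has been coded yet.

```c
#include <assert.h>

/* Sign context: one of three, chosen from the north (N) neighbor.
   north_tail_info == 0 is taken to mean the sign of N is not yet coded. */
unsigned sign_context(unsigned north_tail_info, int north_is_negative)
{
    if (north_tail_info == 0)
        return 0;                        /* sign of N unknown so far */
    return north_is_negative ? 2u : 1u;  /* 1 = positive, 2 = negative */
}

/* Tail-bit context: one of three, chosen from the coefficient's own
   tail information, which can only be 1, 2 or 3 for a tail bit. */
unsigned tail_context(unsigned tail_info)
{
    assert(tail_info >= 1 && tail_info <= 3);
    return tail_info - 1;
}
```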

Steps for the Horizontal Context Model
The context model of the system uses up to 11 bits to describe a context. This number need not be fully specified. The meaning of each bit position depends on the preceding binary values. First, a single context is used to provide a form of "run coding" of head bits. When there is no run of head bits, each bit is coded with neighboring and parent coefficients contributing to a context. One embodiment of the steps is as follows.

1) Determine whether to perform a look-ahead.
If the tail information of the next N coefficients and of their north neighbors is all zero, the system proceeds to step 2. Otherwise, it proceeds to step 3 for the next N coefficients. In one embodiment, N = 16.

2) Perform the look-ahead procedure.
If the bits of the current bit-plane to be coded are zero for all of the next N coefficients, a single 0 is coded and the system returns to step 1 for the following N coefficients. Otherwise, a single 1 is coded and the system proceeds to step 3 for these N coefficients.

3) Determine the state of the coefficient of interest and code it.
If the tail information of the coefficient of interest is 0, the bit of the current bit-plane of that coefficient is coded with one of 1024 possible contexts formed from the two bits of tail information of the west coefficient and (optionally) of the parent coefficient, the tail-on bits of the north, east, southwest, and south coefficients, and the two bits of importance level information, and the system proceeds to step 4. Note that in one embodiment the parent coefficient is not used, so the context is formed from the neighboring coefficients and the importance level information only. If the tail information of the coefficient of interest is not 0, the bit of the current bit-plane of that coefficient is a tail bit and is coded with one of three contexts formed from the two bits of tail information of the coefficient of interest.

4) Determine the state of the current head bit and, if necessary, code the sign bit.
If the bit of the current bit-plane of the coefficient of interest is 1, the sign of that coefficient is coded with one of three possible contexts formed from the tail-on bit and the sign bit of the north coefficient.
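Steps 1 and 2 can be sketched in C as shown here. This is a sketch under assumptions: the function names and array layout are hypothetical, coefficients are assumed to be held as magnitudes, and the handling of the first row (which has no north neighbors) is left out.

```c
#include <assert.h>
#include <stddef.h>

enum { LOOKAHEAD_N = 16 };   /* N = 16 in one embodiment */

/* Step 1: look-ahead applies only if the tail information of the next n
   coefficients and of their north neighbors is all zero. */
int lookahead_applies(const unsigned char *tail,
                      const unsigned char *north_tail, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tail[i] != 0 || north_tail[i] != 0)
            return 0;
    return 1;
}

/* Step 2: the single decision coded for the group. It is 0 when every
   coefficient in the group has a 0 in the current bit-plane, else 1. */
int lookahead_decision(const long *magnitude, size_t n, unsigned plane)
{
    assert(plane < 31);
    for (size_t i = 0; i < n; i++)
        if ((magnitude[i] >> plane) & 1L)
            return 1;
    return 0;
}
```

When `lookahead_decision` returns 0, one coded decision stands in for N head bits, which is where the run coding saves work; a 1 sends the same N coefficients on to step 3.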

FIG. 17 is a flowchart of the process described above. In FIG. 17, the white blocks are not associated with coding, while the black blocks are. Although not shown in the figure, one context is defined for each entropy-coded decision. The operation and flow described above will be understood by those skilled in the art.

One example of the horizontal context model, together with one example of a sign/magnitude unit that converts input coefficients into sign/magnitude format, is described in U.S. patent application Ser. No. 08/498,695, filed June 30, 1995, entitled "Method and Apparatus For Compression Using Reversible Wavelet Transforms and an Embedded Codestream," and in U.S. patent application Ser. No. 08/498,036, filed June 30, 1995, entitled "Reversible Wavelet Transform and Embedded Codestream Manipulation."

Entropy Coding
In one embodiment, the entropy coding of the present invention is performed by a binary entropy coder. In one embodiment, entropy coder 106 comprises a Q-coder, a QM-coder, a finite state machine coder, a high-speed parallel coder, or the like. A single coder may be used to produce a single output codestream. Alternatively, multiple (physical or virtual) coders may be used to produce multiple (physical or virtual) data streams.

In one embodiment, the binary entropy coder of the present invention comprises a Q-coder. For information on the Q-coder, see Pennebaker, W. B., et al., "An Overview of the Basic Principles of the Q-coder Adaptive Binary Arithmetic," IBM Journal of Research and Development, Vol. 32, pp. 717-26, 1988. In another embodiment, the binary entropy coder is a QM-coder, which is a well-known and efficient binary entropy coder. The QM-coder is particularly efficient on bits with very high probability skew. The QM-coder is used in both the JPEG and JBIG standards.

The binary entropy coder may be a finite state machine (FSM) coder. Such a coder provides a simple conversion from a probability and an outcome to a compressed bitstream. In one embodiment, the finite state machine coder is implemented using table look-ups for both the encoder and the decoder. A variety of probability estimation methods can be used with such a finite state machine coder. Compression is excellent for probabilities close to 0.5. Compression for highly skewed probabilities depends on the size of the look-up table used. Like the QM-coder, a finite state machine coder is useful with embedded bitstreams because the decisions are coded in the order in which they occur. Because the outputs are defined by a look-up table, there is no possibility of a "carry-over" problem. In fact, unlike the Q-coder and QM-coder, there is a bounded maximum delay between encoding and the production of a compressed output bit. In one embodiment, the finite state machine coder of the present invention comprises the B-coder described in U.S. Pat. No. 5,272,478, entitled "Method and Apparatus for Entropy Coding," issued December 21, 1993.

In one embodiment, the binary entropy coder of the present invention comprises a high-speed parallel coder. Both the QM-coder and the FSM coder require that one bit be encoded or decoded at a time. A high-speed parallel coder handles several bits in parallel. In one embodiment, the high-speed parallel coder is implemented in VLSI hardware or on multiprocessor computers without sacrificing compression performance. One example of a high-speed parallel coder that may be used in the present invention is described in U.S. Pat. No. 5,381,145, entitled "Method and Apparatus for Parallel Decoding and Encoding of Data," issued January 10, 1995.

Most efficient binary entropy coders are limited in speed by fundamental feedback loops. One possible solution is to divide the incoming data stream into multiple streams and feed these to parallel encoders. The outputs of the encoders are multiple streams of variable-length coded data. One problem with this type of approach is how to transmit the data on a single channel. The high-speed parallel coder described in U.S. Pat. No. 5,381,145 solves this problem with a method of interleaving these coded data streams.

Many of the contexts used in the present invention have fixed probabilities, which makes a finite state machine coder such as the B-coder especially useful. Note that when a system uses probabilities close to 0.5, both the high-speed parallel coder and the finite state machine coder disclosed in the patents cited above operate more efficiently than the Q-coder. Hence, both of these coders have an inherent compression advantage with the context model of the present invention.

In another embodiment, both a binary entropy coder and a fast m-ary coder are used. The fast m-ary coder may be a Huffman coder.

Encoding and Decoding Processes of the Present Invention
The flowcharts of FIGS. 18 through 20 show examples of the encoding and decoding processes of the present invention. The processing logic may be implemented in software and/or hardware.

FIG. 18 shows one embodiment of the encoding process of the present invention. In FIG. 18, the encoding process begins with processing logic acquiring the input data for one tile (processing block 1201).

Processing logic then determines whether binary coding needs to be performed (processing block 1202). If binary coding is to be performed, the process continues to processing block 1211, where processing logic performs Gray coding on the input data, and then models each bit of each coefficient with a binary style context model (processing block 1212). Processing continues to processing block 1208.
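The Gray coding step, and the inverse applied by the decoder, can be illustrated with the standard binary-reflected Gray code. Whether the embodiment uses exactly this variant is an assumption, and the function names are hypothetical.

```c
#include <assert.h>

/* Binary-reflected Gray code: adjacent values differ in exactly one bit. */
unsigned long to_gray(unsigned long v)
{
    return v ^ (v >> 1);
}

/* Inverse Gray coding: XOR together all right shifts of the code word. */
unsigned long from_gray(unsigned long g)
{
    unsigned long v = g;
    while (g >>= 1)
        v ^= g;
    return v;
}
```

Because neighboring sample values differ in a single bit after Gray coding, the bit-planes of smoothly varying binary-style data tend to be more uniform, which generally helps a bit-plane context model.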

If binary coding is not to be performed, the process continues to processing block 1203, where processing logic applies a reversible filter to the data. After applying the reversible filter, processing logic determines whether another decomposition level is desired (processing block 1204). If another decomposition level is desired, processing logic applies the reversible filter to the LL coefficients (processing block 1205) and processing loops back to processing block 1204, where the test is repeated. If another decomposition level is not desired, the process continues to processing block 1206, where processing logic converts the coefficients to sign/magnitude form. Processing logic then models each bit of each coefficient with the horizontal context model (processing block 1207), and the process continues to processing block 1208.

At processing block 1208, processing logic codes each bit of each coefficient. Processing logic then transmits or stores the coded data (processing block 1209).

Processing logic then determines whether there are more tiles in the image (processing block 1210). If there are more tiles in the image, processing logic loops back to processing block 1201 and the process is repeated; if there are no more tiles, the process ends.

FIG. 19 shows one embodiment of the decoding process of the present invention. In FIG. 19, the process begins by acquiring the coded data for one tile (processing block 1301). Next, processing logic entropy decodes the coded data (processing block 1302). Processing logic then determines whether the data is to undergo binary decoding (processing block 1303). If the data is to undergo binary decoding bit by bit, the process continues to processing block 1311, where processing logic models each bit of each coefficient with a binary style context model, and then performs inverse Gray coding on the data (processing block 1312). After the inverse Gray coding, the process continues to processing block 1309.

If binary decoding is not to be performed, the process continues to processing block 1304, where processing logic models each bit of each coefficient with the horizontal context model. Processing logic then converts each coefficient to a form suitable for filtering (processing block 1305) and applies a reversible filter to the coefficients (processing block 1306).

After applying the reversible filter, processing logic determines whether there is another level of decomposition (processing block 1307). If there is another level of decomposition, the process continues to processing block 1308, where processing logic applies a reversible filter to the coefficients, and the process loops back to processing block 1307. If there is no further level of decomposition, the process continues to processing block 1309, where the reconstructed data is either transmitted or stored.

Next, processing logic determines whether there are more tiles in the image (processing block 1310). If there are more tiles in the image, processing loops back to processing block 1301 and the process is repeated; if there are no more tiles, the process ends.

FIG. 20 shows one embodiment of the process for modeling bits according to the present invention. In FIG. 20, the bit modeling process begins by setting a coefficient variable C to the first coefficient (processing block 1401). Then a test determines whether |C| > 2^S (processing block 1402). If so, processing continues to processing block 1403, where processing logic codes bit S of coefficient C using the model for tail bits, and processing continues to processing block 1408. The model for tail bits may be a stationary (non-adaptive) model. If |C| is not greater than 2^S, processing continues to processing block 1404, where processing logic applies a template for head bits (the leading zeros and the first "1" bit). After applying the template, processing logic codes bit S of coefficient C (processing block 1405). Possible templates are shown in FIG. 21.

Next, a test determines whether bit S of coefficient C is on (processing block 1406). If bit S of coefficient C is not on, processing continues to processing block 1408. If bit S of coefficient C is on, processing continues to processing block 1407, where processing logic codes the sign bit. Processing then continues to processing block 1408.

At processing block 1408, a test determines whether coefficient C is the last coefficient. If coefficient C is not the last coefficient, processing continues to processing block 1409, where the coefficient variable C is set to the next coefficient, and processing continues from processing block 1402. If coefficient C is the last coefficient, processing continues to processing block 1410, where a test determines whether S is the last bit-plane. If S is not the last bit-plane, the bit-plane variable S is decremented by 1 (processing block 1411) and processing continues from processing block 1401. If S is the last bit-plane, processing ends.
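The loop structure of FIG. 20 can be sketched as follows. Instead of performing entropy coding, this sketch only counts how many head, tail, and sign decisions the loop would produce. It classifies a bit as a tail bit when a more significant bit of the magnitude is already on, which is how head bits are defined in the text (leading zeros and the first "1" bit) and is the role played by the test at processing block 1402. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct { long head, tail, sign; } BitCounts;

/* Walk bit-planes from top_plane down to 0 and, within each bit-plane,
   every coefficient, classifying the bit that would be coded. */
BitCounts classify_bits(const long *coef, size_t count, int top_plane)
{
    BitCounts c = {0, 0, 0};
    assert(top_plane >= 0 && top_plane < 30);
    for (int s = top_plane; s >= 0; s--) {
        for (size_t i = 0; i < count; i++) {
            unsigned long mag = (unsigned long)labs(coef[i]);
            if ((mag >> (s + 1)) != 0) {
                c.tail++;                /* first "1" already seen */
            } else {
                c.head++;                /* leading zero or first "1" */
                if ((mag >> s) & 1UL)
                    c.sign++;            /* sign follows the first "1" */
            }
        }
    }
    return c;
}
```

Each coefficient contributes at most one sign decision, immediately after its last head bit, and every remaining lower bit is a tail bit.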

TS-Transform Design
In one embodiment, the present invention computes the TS-transform in place in a buffer memory. In doing so, no extra lines of memory and no extra time for rearranging the computed values are required. Although the TS-transform is described here, the technique applies to critically sampled overlapped transforms in general. In another embodiment, the TT-transform is used.

FIGS. 24(A) through 24(C) show the memory manipulation employed by the present invention while computing the transform. FIG. 24(A) shows the initial state of the memory. In FIG. 24(A), the first row of memory contains the smooth ("S") and detail ("D") coefficients for the previous value (n-1), which have already been computed, the smooth ("S") coefficient and a partially completed detail coefficient ("B") for the current value (n), and the values of four input samples ("X"): X_{2n+2}, X_{2n+3}, X_{2n+4}, and X_{2n+5}. The intermediate result of the transform computation is shown in the same row of memory in FIG. 24(B). Note that the only change to the row is in the fifth and sixth storage elements, where the values X_{2n+2} and X_{2n+3} have been replaced by the values S_{n+1} and B_{n+1}. Thus, by replacing stored values that are no longer needed with results produced during the transform computation, the present invention saves memory space. FIG. 24(C) shows the same row of memory after the transform has been completed and the detail output D_n has been produced. The only change from FIG. 24(B) is that the partially completed detail coefficient B_n has been replaced by the detail output D_n.

After the detail output for n has been computed, the transform computation process continues along the row to compute the detail output D_{n+1}.

The following code examples may be used to perform the transform. Note that the horizontal code for both the forward and the inverse transform is included.

In the following, the variable soo refers to the value S_{n-1}, the variable oso refers to the value S_n, and the variable oos refers to the value S_{n+1}.

An example of code (in C) for one embodiment of the forward TS-transform is as follows.
/*
TSForward_1()
Forward TS-transform of one row, computed in place.
Assumes width is even and at least 4, and that >> on a negative
long is an arithmetic shift.
*/

void TSForward_1(long *x, int width)
{
    long *start = x;
    long *ox = x + 2;
    long soo;   /* S(n-1) */
    long oso;   /* S(n)   */
    long oos;   /* S(n+1) */

    oso = (*x + *(x + 1)) >> 1;
    oos = (*ox + *(ox + 1)) >> 1;
    soo = oos;

    /* first pair: no overlap correction */
    *(x + 1) = *x - *(x + 1);
    *x = oso;

    while ((ox + 2) - start < width) {
        x = ox;
        ox += 2;

        soo = oso;
        oso = oos;

        oos = (*ox + *(ox + 1)) >> 1;

        *(x + 1) = *x - *(x + 1) + ((oos - soo + 2) >> 2);
        *x = oso;
    }

    x = ox;

    soo = oso;
    oso = oos;

    oos = soo;

    /* last pair: no overlap correction */
    *(x + 1) = *x - *(x + 1);
    *x = oso;
}

An example of code (in C) for one embodiment of the inverse TS-transform is as follows.
/*
TSReverse_1()
Inverse TS-transform of one row, computed in place.
Assumes width is even and at least 4, and that >> on a negative
long is an arithmetic shift.
*/

void TSReverse_1(long *x, int width)
{
    long *start = x;
    long *d = x + 1;
    long ns = *(x + 2);   /* stands in for S(n-1) at the left edge,
                             so the first correction is zero */
    long p;

    while (x + 2 - start < width) {
        p = *d - ((*(x + 2) - ns + 2) >> 2);
        ns = *x;

        *d = *x - (p >> 1);
        *x += ((p + 1) >> 1);

        x += 2;
        d = x + 1;
    }

    /* last pair: no overlap correction */
    p = *d;
    *d = *x - (p >> 1);
    *x += ((p + 1) >> 1);
}

Although only a one-dimensional example is shown, the present invention may be used with multiple dimensions and multiple levels. Note that this technique can be used for any overlapped transform in which values that are no longer needed for other computations are replaced, one for one, by partial or final results.
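The arithmetic that the forward and inverse listings carry out in place can also be written out of place, which makes the formulas easier to check. The sketch below is an illustration under assumptions, not the patent's code: the names are hypothetical, the overlap correction is taken to be zero for the first and last pairs (the boundary handling the listings appear to use), and floor division is done with a helper so that the result does not depend on how the compiler shifts negative values.

```c
#include <assert.h>

/* floor(v / 2^k) for any sign of v, without relying on arithmetic shift. */
static long floor_shift(long v, int k)
{
    return v >= 0 ? v >> k : -((-v + (1L << k) - 1) >> k);
}

/* Overlap correction for pair n: floor((S(n+1) - S(n-1) + 2) / 4),
   taken as zero at the first and last pairs. */
static long ts_correction(const long *s, int n, int pairs)
{
    if (n == 0 || n == pairs - 1)
        return 0;
    return floor_shift(s[n + 1] - s[n - 1] + 2, 2);
}

/* Forward: S(n) = floor((x(2n) + x(2n+1)) / 2),
            D(n) = x(2n) - x(2n+1) + correction(n). */
void ts_forward(const long *x, long *s, long *d, int pairs)
{
    assert(pairs >= 1);
    for (int n = 0; n < pairs; n++)
        s[n] = floor_shift(x[2 * n] + x[2 * n + 1], 1);
    for (int n = 0; n < pairs; n++)
        d[n] = x[2 * n] - x[2 * n + 1] + ts_correction(s, n, pairs);
}

/* Inverse: remove the correction, then undo the sum/difference pair. */
void ts_inverse(const long *s, const long *d, long *x, int pairs)
{
    assert(pairs >= 1);
    for (int n = 0; n < pairs; n++) {
        long p = d[n] - ts_correction(s, n, pairs);
        x[2 * n] = s[n] + floor_shift(p + 1, 1);
        x[2 * n + 1] = s[n] - floor_shift(p, 1);
    }
}
```

For the samples 10, 3, 7, 7, 0, 255 the smooth outputs are 6, 7, 127 and the detail outputs are 7, 30, -255, and the inverse recovers the samples exactly, which is what makes the transform reversible.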

FIG. 25 shows a two-dimensional representation of the memory buffer for three levels. In FIG. 25, the memory locations in each of blocks 1801 to 1804 contain coefficient values. That is, each of blocks 1801 to 1804 is an 8 × 8 block of coefficient values.

The coefficients are located at intervals that are natural-number powers of 2. Any coefficient can be accessed given its level and its S or D offset. Therefore, access can be made by selecting a particular level and the horizontal and vertical frequencies. The buffer may also be accessed in raster order.
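One way such addressing could work for an in-place, multi-level transform is sketched below. The layout is an assumption inferred from in-place computation: at level L the coefficients are spaced 2^L apart, with smooth (S) coefficients at offset 0 and detail (D) coefficients at offset 2^(L-1). The function names and the row-major buffer layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Position along one dimension of the k-th coefficient at a given level.
   is_detail selects the D (high-pass) or S (low-pass) output. */
size_t coef_offset(unsigned level, int is_detail, size_t k)
{
    assert(level >= 1 && level < 16);
    return (k << level) + (is_detail ? ((size_t)1 << (level - 1)) : 0);
}

/* Index into a row-major buffer, selecting the vertical and horizontal
   frequency (S or D) independently. */
size_t coef_index_2d(unsigned level, int vert_detail, int horiz_detail,
                     size_t row_k, size_t col_k, size_t line_width)
{
    return coef_offset(level, vert_detail, row_k) * line_width
         + coef_offset(level, horiz_detail, col_k);
}
```

Under this layout, three levels give a spacing of 8 at the coarsest level, which is consistent with the 8 × 8 blocks of the three-level buffer described for FIG. 25.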

Unit Buffer Implementation
In one embodiment of the present invention, a single buffer supports the transform, context model, and encoding blocks of the compression system. This buffer is a two-dimensional scrolling memory buffer that allows efficient access to the coefficients and requires no additional memory. Each line of the buffer is accessed through a pointer stored in a line access buffer. FIGS. 23(A) and 23(B) show the arrangement of this scrolling buffer, in which line access buffer 1601 contains pointers to each line of buffer 1602.

Scrolling is achieved by rearranging the pointers stored in the line access buffer. An example is shown in FIGS. 23(A) and 23(B). FIG. 23(A) shows the initial state of the buffer. Referring to FIG. 23(B), after lines A, B, and C have been removed from the buffer and replaced by lines G, H, and I respectively, the pointers in the line access buffer are changed so that the first pointer points to line D in the buffer, the second pointer points to line E, and the third pointer points to line F, which gives the buffer the effect of a scrolling buffer. The pointers to lines G, H, and I then occupy the last three positions of the line access buffer. Note that the present invention is not limited to a buffer of six lines; this is used only as an example. A buffer with more lines is typically used, as will be well known to those skilled in the art. Thus, access through the line access buffer gives the appearance that the unit buffer scrolls, without the stored data having to be physically moved. This allows a minimal amount of memory to be used without sacrificing speed.
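The pointer rearrangement can be sketched in C as follows, using the six-line example from the text. The type and function names are hypothetical, and the sketch assumes the caller refills the lines that move to the end.

```c
#include <assert.h>
#include <stddef.h>

enum { LINES = 6 };   /* six lines, as in the example */

typedef struct {
    long *line[LINES];   /* line access buffer: one pointer per line */
} ScrollBuffer;

/* Scroll by count lines. The storage of the oldest count lines moves to
   the end of the line access buffer, ready to be refilled with new data.
   Only pointers move; the coefficient data itself is not copied. */
void scroll_lines(ScrollBuffer *b, size_t count)
{
    long *tmp[LINES];
    assert(count <= LINES);
    for (size_t i = 0; i < LINES; i++)
        tmp[i] = b->line[(i + count) % LINES];
    for (size_t i = 0; i < LINES; i++)
        b->line[i] = tmp[i];
}
```

After `scroll_lines(&b, 3)` on a buffer holding lines A to F, the first three pointers refer to D, E, and F, and the last three refer to the storage that held A, B, and C, which is where lines G, H, and I would be written.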

本発明においては、そのようなユニット・バッファを用いることにより、常に画像の１つの帯域のみメモリに記憶しつつ、画像全体に対するオーバーラップ変換の適用をサポートする。これを達成するため、少なくとも１バンドのウェーブレット・ユニットを構成するウェーブレット係数のセットを完全に計算するために必要な数分の画像のラインに対してだけウェーブレット変換を適用する。このような場合には、計算の完了したウェーブレット係数のセットを、モデル化し、エントロピー符号化し、そしてウェーブレット・ユニット・バッファの該当部分より取り除くことができる。計算が未完のウェーブレット係数は、次の反復で計算を完了させるために、そのまま残る。そして、ライン・ポインタを並べ替えることによりウェーブレット・ユニット・バッファをスクロールし、ウェーブレット・ユニット・バッファの空き部分にほかのデータを入れることができる。これで、計算途中のウェーブレット係数を完全に計算することができる。 In the present invention, the use of such a unit buffer supports applying an overlapped transform to the entire image while only one band of the image is held in memory at any time. To achieve this, the wavelet transform is applied only to as many lines of the image as are needed to fully compute the set of wavelet coefficients that make up at least one band of wavelet units. The fully computed set of wavelet coefficients can then be modeled, entropy coded, and removed from the corresponding portion of the wavelet unit buffer. Wavelet coefficients whose computation is incomplete remain in place so that they can be completed in the next iteration. The wavelet unit buffer is then scrolled by rearranging the line pointers, and new data can be placed in the freed portion of the buffer. The partially computed wavelet coefficients can then be completed.

一例として、ハイパスフィルタがカレント係数及び次のローパスフィルタ係数に依存するオーバーラップ変換を適用することを考える。この例の場合、たった２レベルの分解が画像データに適用されるであろうが、これは１つのウェーブレット・ユニットが４エレメント長であることを暗に示す。 As an example, consider applying an overlapped transform in which the high-pass filter depends on the current coefficient and the next low-pass filter coefficient. In this example only two levels of decomposition are applied to the image data, which implies that one wavelet unit is 4 elements long.

少なくとも１つの帯域のウェーブレット・ユニットを構成するウェーブレット係数のセットを完全に計算するために、ウェーブレット・ユニット・バッファの高さは少なくとも８ラインつまり２ウェーブレット・ユニットである。 In order to fully compute the set of wavelet coefficients that make up at least one band of wavelet units, the height of the wavelet unit buffer is at least 8 lines or 2 wavelet units.

２次元のウェーブレット・ユニット・バッファに対しウェーブレット変換を適用する際には、まず１次元のウェーブレット変換がバッファの各行（ライン）に対し適用される。それから、１次元のウェーブレット変換がバッファの各列に対し適用される。 When applying wavelet transform to a two-dimensional wavelet unit buffer, first, the one-dimensional wavelet transform is applied to each row (line) of the buffer. A one-dimensional wavelet transform is then applied to each column of the buffer.

ウェーブレット・ユニット・バッファの各列に対し１次元ウェーブレット変換を適用する時に、ユニット・バッファに格納されていない画像のエレメントに依存する各列の最後のエレメントについては、ハイパスフィルタの計算を部分的にしか終わらせることができない。これが図５５に示されている。 When the one-dimensional wavelet transform is applied to each column of the wavelet unit buffer, the high-pass filter calculation can be only partially completed for the last element of each column, because that element depends on image elements not stored in the unit buffer. This is shown in FIG. 55.

第２レベルのウェーブレット分解を実行する時に、再び、各列の最後のエレメントについては、ハイパスフィルタの計算を部分的にしか終わらせることができない。これが図５６に示されている。 When the second-level wavelet decomposition is performed, the high-pass filter calculation can again be only partially completed for the last element of each column. This is shown in FIG. 56.

なお、一実施例においては、多くの分解レベルを利用する時に、ＳＳ係数（第２分解レベルの場合は図５５の１ＳＳ、第３レベルの場合は図５６の２ＳＳ）にのみウェーブレット変換が適用されるだろう。このような場合、ユニット・バッファの行及び列のロケーションは、バッファの適切なエントリーの読み書きを保証するようスキップされるだろう。 In one embodiment, when many decomposition levels are used, the wavelet transform is applied only to the SS coefficients (1SS in FIG. 55 for the second decomposition level, 2SS in FIG. 56 for the third level). In that case, row and column locations of the unit buffer are skipped so that the appropriate buffer entries are read and written.

この例では、バッファの上半分には計算が完了したウェーブレット係数のセットが入っており、このウェーブレット係数セットは１帯域のウェーブレット・ユニットを構成しており、モデル化し、エントロピー符号化し、そしてバッファから取り除くことができる。 In this example, the upper half of the buffer now holds a set of fully computed wavelet coefficients. This set makes up one band of wavelet units and can be modeled, entropy coded, and removed from the buffer.

バッファの上半分が空くと、バッファを高さの半分だけスクロールすることができる。ここで、画像の次の４ラインをバッファに読み込むことができる。バッファに格納された新たなラインそれぞれに対し、１次元ウェーブレット変換を適用することができる。バッファの列方向に、部分的に計算された係数を完全に計算することができるが、再び、各列の最後のエレメントは部分的にしか計算することができない。 Once the upper half of the buffer is free, the buffer can be scrolled by half its height. The next four lines of the image can then be read into the buffer. The one-dimensional wavelet transform can be applied to each new line stored in the buffer. Along the columns of the buffer, the partially computed coefficients can now be completed, but again the last element of each column can be only partially computed.

第２レベルのウェーブレット分解のためにも同様のことが行われる。再び、バッファの上半分は計算の完了したウェーブレット係数が入り、その段階でプロセスはほかに処理すべき画像のラインがなくなるまで反復する。 The same is done for the second level wavelet decomposition. Again, the upper half of the buffer contains the calculated wavelet coefficients, at which stage the process repeats until there are no more image lines to process.

ライン・アクセス・バッファ内のライン・ポインタの並べ替えは、様々なやり方で行うことができる。一つのやり方は、新しいライン・アクセス・バッファを作り、それに旧ライン・アクセス・バッファからポインタをコピーする方法である。ウェーブレット・ユニット・バッファの高さを法としてスクロールするためには、旧ライン・アクセス・バッファのエレメントｉに格納されているポインタが、インデックス（ｉ＋ライン数）にコピーされることになろう。 The line pointers in the line access buffer can be reordered in various ways. One way is to create a new line access buffer and copy the pointers into it from the old line access buffer. To scroll, the pointer stored in element i of the old line access buffer is copied to index (i + number of lines), taken modulo the height of the wavelet unit buffer.
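The copy rule just described can be written out directly. The sketch below is a hedged illustration: the function name `reorder_line_pointers` and the parameter name `offset` (standing for the "number of lines" in the text) are assumptions, and pointers are represented by plain labels.

```python
def reorder_line_pointers(old_pointers, offset):
    """Build a new line access buffer from the old one.

    The pointer in element i of the old buffer is copied to index
    (i + offset) modulo the buffer height, as the text describes.
    """
    height = len(old_pointers)
    new_pointers = [None] * height
    for i, pointer in enumerate(old_pointers):
        new_pointers[(i + offset) % height] = pointer
    return new_pointers


# Six-line buffer scrolled by half its height (3 lines).
scrolled = reorder_line_pointers(list("ABCDEF"), offset=3)
```

With a six-line buffer and an offset of 3, the rows that held D, E, and F move to the first three positions and the rows that held A, B, and C (now free for new lines) move to the last three. For a scroll of exactly half the height, as in the example of the text, scrolling up and scrolling down give the same arrangement, so the direction implied by the formula does not matter there.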

なお、圧縮システムの３つのステージは、バッファ内のデータに対し、そのデータがバッファより出される前に遂行されるので、係数は一般的に様々に順序つけられること留意すべきである。ラスター順データ操作が行われる場合には、本発明のスクロール・バッファは最小限のメモリで間に合う。 Note that because the three stages of the compression system operate on the data in the buffer before that data leaves the buffer, the coefficients can generally be ordered in a variety of ways. When the data is processed in raster order, the scrolling buffer of the present invention makes do with a minimal amount of memory.

ソフトウエア（及び／又はハードウエア）によりライン・アクセス・バッファを管理してポインタを操作する。このソフトウエアも、バッファ内のどのデータが処理を完了しバッファから出してよいか知っている。 Software (and/or hardware) manages the line access buffer and manipulates the pointers. This software also knows which data in the buffer has completed processing and may be removed from the buffer.

アラインメント法 Alignment Method
本発明は、係数値を左へ任意量だけシフトする。一実施例では、このアラインメントは、仮想的アラインメント法によって行われる。この仮想的アラインメント法は、係数を実際にシフトしない。その代わりとして、係数を１ビットプレーンずつ処理する間に、特定の係数についてアラインメントが必要とされる実際のビットプレーンが計算される。重要性レベルと特定の係数に対し適用されるべきシフト量とが与えられれば、本発明は、その係数の所望の絶対ビットプレーンを、それが可能なビットプレーンの範囲内にあればアクセスする。すなわち、特定の係数の所望の絶対ビットプレーンは、カレント重要性レベルから該係数に適用されるべきシフト量を差し引いたものにより与えられる。この所望ビットプレーンは、それが最小の有効な絶対ビットプレーンより大きいか又は等しく、かつ最大の有効な絶対ビットプレーンより小さいか又は等しいときに有効とみなされる。 The present invention shifts coefficient values to the left by an arbitrary amount. In one embodiment, this alignment is performed by a virtual alignment method. The virtual alignment method does not actually shift the coefficients. Instead, while the coefficients are processed one bit plane at a time, the actual bit plane required by the alignment is computed for each coefficient. Given the importance level and the amount of shift to be applied to a particular coefficient, the present invention accesses the desired absolute bit plane of that coefficient, provided it lies within the range of available bit planes. That is, the desired absolute bit plane of a particular coefficient is the current importance level minus the amount of shift to be applied to that coefficient. The desired bit plane is considered valid when it is greater than or equal to the smallest valid absolute bit plane and less than or equal to the largest valid absolute bit plane.
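The virtual alignment rule above can be sketched as follows. This is a minimal Python illustration, not the patent's implementation: the function name `aligned_bit`, the default plane limits of 0 and 15, and the use of a non-negative coefficient magnitude are all assumptions.

```python
def aligned_bit(magnitude, importance_level, shift, min_plane=0, max_plane=15):
    """Return the bit of `magnitude` that belongs to `importance_level`.

    The coefficient is never shifted. Instead the absolute bit plane is
    computed as importance_level - shift, and the bit is read only if that
    plane lies within [min_plane, max_plane]. Otherwise None is returned,
    meaning the coefficient contributes nothing at this importance level.
    """
    plane = importance_level - shift
    if plane < min_plane or plane > max_plane:
        return None
    return (magnitude >> plane) & 1


# A coefficient with magnitude 0b1010 that is aligned 2 planes to the left.
bit_at_level_5 = aligned_bit(0b1010, importance_level=5, shift=2)  # reads plane 3
bit_at_level_4 = aligned_bit(0b1010, importance_level=4, shift=2)  # reads plane 2
bit_at_level_1 = aligned_bit(0b1010, importance_level=1, shift=2)  # plane -1: invalid
```

Because only an index is computed, the stored coefficient keeps its original width, which is the reason the text gives for the absence of extra memory.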

２つのアラインメント法が普通である。その中の第１の方法は、平均２乗誤差（ＭＳＥ）法と呼ばれるもので、フルフレーム（full-frame）の再構成画像を原画像と比較した時にＭＳＥが縮小もしくは最小化されるように、係数をアラインメントする方法である。図５０は、このアラインメントの一例である。図５１も参照のこと。 Two alignment methods are common. The first method, called the mean square error (MSE) method, is such that the MSE is reduced or minimized when a full-frame reconstructed image is compared with the original image. This is a method for aligning coefficients. FIG. 50 is an example of this alignment. See also FIG.

第２の方法は、ピラミッド型のアラインメント法であり、画像がピラミッド・レベルのサイズに再構成される場合に良好なレート・歪み性能を提供する。ここでは、隣接レベルの係数は共通の重要レベルを持たない、つまりオーバーラップがない。図５１の左側にあるアラインメントは、３レベルＴＳ変換のための厳密にピラミッド型のアラインメントを表す。図５１の右側は、レベル２のピラミッド型アラインメントを表す。（図５１の厳密にピラミッド型の部分は、レベル３と２のピラミッド型アラインメントと呼んでよかろう。）各場合において、レベル内部の係数はＭＳＥに関連してアラインメントされる。 The second method is pyramidal alignment, which provides good rate-distortion performance when the image is reconstructed at the size of a pyramid level. Here, coefficients of adjacent levels share no importance level; that is, there is no overlap. The alignment on the left side of FIG. 51 is a strictly pyramidal alignment for a 3-level TS transform. The right side of FIG. 51 shows a pyramidal alignment at level 2. (The strictly pyramidal part of FIG. 51 could be called a pyramidal alignment at levels 3 and 2.) In each case, the coefficients within a level are aligned with respect to MSE.

図５２は、メモリ記憶係数と一つのアラインメントとの間の典型的な関係を示す。 FIG. 52 shows an exemplary relationship between memory storage coefficients and one alignment.

本発明によれば、実際のシフト操作を行う必要がないため、メモリサイズの制約がなくなる。さらに、本発明は、余分なメモリを必要とせず、任意のアラインメント法を簡単に実施できる。 According to the present invention, since there is no need to perform an actual shift operation, there is no memory size limitation. Furthermore, the present invention does not require extra memory and can easily implement any alignment method.

ヒストグラム圧縮 Histogram Compression
本発明はヒストグラム圧縮を利用してもよい。一実施例では、変換又はバイナリ方式の処理を施される前にヒストグラム圧縮が利用される。ヒストグラム圧縮は、一部の画像に対する圧縮率を向上させる。そのような画像は通常、ダイナミックレンジの一部の値がどの画素にも使われない画像である。言い換えれば、画像のダイナミックレンジにギャップが存在する。例えば、ある画像が合計２５６の値の中の０と２５５の値しかとらないときには、その原画像と一対一対応を持つが、ダイナミックレンジのずっと小さな新たな画像を生成できる。この画像生成は、整数を画像のとる値に写像する増加関数を定義することにより達成される。例えば、画像が０と２５５の値しか使わないときには、マッピングで０を０に、１を２５５に写像する。別の実施例では、画像が偶数（又は奇数）画素しか持たないときには、画素値を０〜１２８の値に再写像する。 The present invention may use histogram compression. In one embodiment, histogram compression is applied before the transform style or binary style processing. Histogram compression improves the compression ratio for some images. These are typically images in which some values of the dynamic range are not used by any pixel; in other words, there are gaps in the dynamic range of the image. For example, when an image takes only the values 0 and 255 out of a total of 256 possible values, a new image can be generated that has a one-to-one correspondence with the original image but a much smaller dynamic range. This is accomplished by defining an increasing function that maps integers to the values taken by the image. For example, when an image uses only the values 0 and 255, the mapping sends 0 to 0 and 1 to 255. In another embodiment, when the image has only even (or only odd) pixel values, the pixel values are remapped to values from 0 to 128.

ヒストグラム圧縮を行った後に、画像データに本発明の可逆埋め込みウェーブレットによる圧縮を施してよい。このように、ヒストグラム圧縮は前処理モードで用いられる。一実施例では、ヒストグラムは、ブーリアン（Boolean）ヒストグラムを基礎としており、値が生じるか否かのリストを保持するようなものである。まず、全ての生起数が昇順に記録される。次に、それぞれの値は０から順に写像される。 After histogram compression, the image data may be compressed with the reversible embedded wavelets of the present invention. Histogram compression is thus used as a preprocessing step. In one embodiment, the histogram is a Boolean histogram, that is, a list recording whether or not each value occurs. First, all values that occur are recorded in ascending order. Each of these values is then mapped to consecutive integers starting from 0.
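The remapping described above can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the function names `build_mapping`, `compact`, and `expand`, and the flat list representation of the image, are hypothetical.

```python
def build_mapping(pixels, levels=256):
    """Build a Boolean histogram and the increasing map derived from it."""
    used = [False] * levels            # Boolean histogram: does value v occur?
    for p in pixels:
        used[p] = True
    values = [v for v, flag in enumerate(used) if flag]   # occurring values, ascending
    forward = {v: code for code, v in enumerate(values)}  # original value -> compact code
    return used, values, forward


def compact(pixels, forward):
    """Preprocessing: replace each pixel value by its compact code."""
    return [forward[p] for p in pixels]


def expand(codes, values):
    """Postprocessing at the decoder: restore the original pixel values."""
    return [values[c] for c in codes]


image = [0, 255, 255, 0, 0, 255]
used, values, forward = build_mapping(image)
codes = compact(image, forward)
restored = expand(codes, values)
```

For the two-valued example of the text, `values` is `[0, 255]`, so code 0 stands for 0 and code 1 stands for 255, and `expand` recovers the original image exactly.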

一実施例では、誤差の影響を減らすためガード（guard）画素値が用いられる。隣接した再写像画素値が大きなギャップで隔てられた実画素値に対応するかもしれないので、再写像値の小さな誤差が実値の大きな誤差をもたらすことがある。再写像値の近くに割増値を追加することにより、そのような誤差の影響は減少するであろう。 In one embodiment, guard pixel values are used to reduce the effect of errors. Because adjacent remapped pixel values may correspond to actual pixel values separated by a large gap, a small error in a remapped value can cause a large error in the actual value. Adding extra values around the remapped values reduces the effect of such errors.

原画像を再構成するために、マッピングが利用されたことが復号化器へ通知される。このマッピングをヘッダ中に指示してもよい。これにより、復号化器において後処理のために同様のテーブルを作成できるようになる。一実施例では、復号化器はレンジをタイル毎に通知される。一実施例では、本発明は、まず当該マッピングが行われることを知らせ、次に欠落した値（例えば上例の２５４）の数を知らせる。ヒストグラム圧縮の利用の有無を知らせるためのコストは、たったの１ビットである。このビットの後に、全ての再写像値のテーブルが続くことになろう。 So that the original image can be reconstructed, the decoder is informed that the mapping was used. The mapping may be signaled in the header, which allows the decoder to build the same table for postprocessing. In one embodiment, the decoder is informed of the range for each tile. In one embodiment, the present invention first signals that the mapping is in use and then signals the number of missing values (e.g., 254 in the example above). Signaling whether histogram compression is used costs only 1 bit. This bit is followed by a table of all remapped values.

一実施例では、１タイルずつヒストグラム圧縮を実行する時の通知量を減らすため、１つのビットで、新しいブーリアン・ヒストグラムが直前に使用したブーリアン・ヒストグラムと同じか違うかを知らせる。このような場合、新たなブーリアン・ヒストグラムが復号化器に通知されるのは、それが直前のヒストグラムと異なるとき（のみ）である。新しいブーリアン・ヒストグラムが前のものと違う場合でも、その間に類似点があるのが普通である。より詳しくいえば、２つのヒストグラムの排他的論理和はエントロピー・コーダによる圧縮性がより高いので、それを生成して復号化器へ通知してもよい。 In one embodiment, to reduce the amount of signaling when histogram compression is performed tile by tile, one bit indicates whether the new Boolean histogram is the same as or different from the Boolean histogram used immediately before. In that case, the new Boolean histogram is signaled to the decoder only when it differs from the previous one. Even when the new Boolean histogram differs from the previous one, the two are usually similar. More specifically, because the exclusive OR of the two histograms is more compressible by the entropy coder, it may be generated and signaled to the decoder instead.
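A short sketch may help make the per-tile signaling concrete. It is written in Python under stated assumptions: the function names, the polarity of the flag (0 for "same", 1 for "different"), and the representation of a Boolean histogram as a list of 0/1 integers are hypothetical, and the entropy coding of the XOR pattern is left out.

```python
def encode_histogram_update(previous, new):
    """Return (flag, delta) for one tile.

    flag is 0 when the histogram is unchanged (nothing else is sent) and 1
    when it differs, in which case delta is the XOR of the two histograms.
    """
    if new == previous:
        return 0, None
    return 1, [a ^ b for a, b in zip(previous, new)]


def decode_histogram_update(previous, flag, delta):
    """Rebuild the tile's Boolean histogram from the previous one."""
    if flag == 0:
        return list(previous)
    return [a ^ d for a, d in zip(previous, delta)]


prev_hist = [1, 0, 0, 1, 0, 0, 0, 1]
new_hist = [1, 0, 1, 1, 0, 0, 0, 1]   # one additional value occurs in this tile

flag, delta = encode_histogram_update(prev_hist, new_hist)
rebuilt = decode_histogram_update(prev_hist, flag, delta)
```

When consecutive tiles use similar sets of values, `delta` is mostly zeros, which is the property that makes it cheaper to entropy code than the histogram itself.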

ヒストグラムは、サイズのダイナミックレンジと同じくらいのビット数（例えば８ビット深度の場合は２５６ビット）を送信することにより通知できる。そのビットの並び順は画素値に対応する。この場合、あるビットが１であるのは、それに対応した値が画像中に用いられるときである。ヘッダのコストを削減もしくは最小化するため、このビット列を、第１次マルコフ（Markov）コンテキストモデルのもとでエントロピー符号化してもよい。 The histogram can be signaled by sending as many bits as the size of the dynamic range (for example, 256 bits for a depth of 8 bits). The order of the bits corresponds to the pixel values. A bit is 1 when the corresponding value is used in the image. To reduce or minimize the header cost, this bit sequence may be entropy coded under a first-order Markov context model.

別の実施例では、欠落した値が過半数のときには発生した値が順に記録されるであろうが、そうでないときには欠落した値が順に記録される。 In another embodiment, when the majority of values are missing, the values that occur are listed in order; otherwise, the missing values are listed in order.

一実施例では、本発明のバイナリ方式が、パレット化された画像の圧縮のために利用されるであろう。パレットはヘッダに格納されるであろう。しかし、パレット化画像は埋め込まれないであろうから、損失性伸長のための量子化は妥当な結果をもたらさない。別の実施例では、パレット化画像は連続階調（カラー又はグレースケール）画像に変換され、各成分が変換方式又はバイナリ方式により圧縮されるかもしれない。これは妥当な損失性圧縮が可能である。 In one embodiment, the binary style of the present invention is used to compress palettized images. The palette is stored in the header. However, because a palettized image is not embedded, quantization for lossy decompression does not give reasonable results. In another embodiment, the palettized image may be converted to a continuous-tone (color or grayscale) image, and each component compressed with the transform style or the binary style. This permits reasonable lossy compression.

ある種の画像は、１つの指定色（又は指定色の小サブセット）が特定の目的のために使用された連続階調画像である。その特定目的色は注記用かもしれない。例えば、グレースケールの医用画像には、その画像を識別するためのコンピュータ生成のカラー文字があるかもしれない。もう一つの特定目的色は、オーバーレイ画像において画素が透過であって、下側の画像の画素が代わりに表示されることを示すかもしれない。禁止された色は別の成分画像に分解されるであろう。そして、連続階調成分及び特殊色成分は、変換方式又はバイナリ方式で圧縮／伸長されるであろう。 Some images are continuous-tone images in which one designated color (or a small subset of designated colors) is used for a special purpose. The special-purpose color may be used for annotation. For example, a grayscale medical image may contain computer-generated color text that identifies the image. Another special-purpose color may indicate that a pixel in an overlay image is transparent, so that the pixel of the underlying image is displayed instead. The forbidden colors are separated into a separate component image. The continuous-tone component and the special-color component are then compressed and decompressed with the transform style or the binary style.

なお、変換方式及びバイナリ方式は輝度データのために利用されることが多いが、アルファ混合用のアルファ・チャネルのような他の形式の２次元データが用いられてもよい。 Note that although the transform style and the binary style are most often used for intensity data, they may also be applied to other kinds of two-dimensional data, such as an alpha channel for alpha blending.

パーサ（parser） Parser
本発明によれば、符号ストリームを、伸長することなく、送信又は復号化の前に構文解析できるようになる。この構文解析は、ビットストリームを打ち切り、特定の量子化のために必要な量の情報だけを伝達することができるパーサによって実行される。このパーサを支援するため、マーカー（marker）及びポインタがビットストリーム内の符号化単位の各ビットプレーンの位置を決定する。 According to the present invention, the code stream can be parsed before transmission or decoding without being decompressed. This parsing is performed by a parser that can truncate the bitstream and pass on only the amount of information needed for a particular quantization. To assist the parser, markers and pointers locate each bit plane of a coding unit within the bitstream.

本発明は、画像圧縮システムにおいて構文解析によって実施される装置依存の量子化を提供する。圧縮システムにおいてマーカーを用いることにより、符号化後に装置に応じ選択される量子化が可能になる。出力装置はその特性をパーサに報告し、パーサはその特定装置向けに符号化済みファイルを量子化する。この量子化は、ファイルの一部を省くことによる。可逆ウェーブレット変換の利用は、画像の非損失復元、あるいは、装置に応じた視覚的に非損失の色々な歪みでの画像の復元を可能にする。 The present invention provides device-dependent quantization implemented by parsing in an image compression system. The use of markers in the compression system allows the quantization to be selected according to the device after encoding. The output device reports its characteristics to the parser, and the parser quantizes the encoded file for that particular device. This quantization consists of omitting parts of the file. The use of the reversible wavelet transform allows the image to be reconstructed losslessly, or reconstructed at various distortions that are visually lossless for the device in question.

本発明によれば、量子化を符号化後に実行することが可能になる。図２７及び図２８は、パーサを備えた圧縮システムのブロック図である。図２７及び図２８において、元の圧縮されていない画像２１０１が本発明の圧縮装置２１０２に入力される。圧縮装置２１０２は、画像２１０１を圧縮ビットストリーム２１０３へ非損失圧縮するとともに、圧縮ビットストリーム２１０３にマーカーを付加する。 According to the present invention, quantization can be performed after encoding. FIGS. 27 and 28 are block diagrams of a compression system that includes a parser. In FIGS. 27 and 28, the original uncompressed image 2101 is input to the compressor 2102 of the present invention. The compressor 2102 losslessly compresses the image 2101 into the compressed bitstream 2103 and adds markers to the compressed bitstream 2103.

圧縮ビットストリーム２１０３はパーサ２１０４に入力し、パーサ２１０４は圧縮ビットストリーム２１０３のある部分を出力として提供する。そのある部分は、圧縮ビットストリーム２１０３の全部であるかもしれないし、その一部分だけかもしれない。要求側のエージェントもしくは装置は、伸長画像が必要とされる時に、その装置特性をパーサ２１０４に与える。それに応じて、パーサ２１０４は圧縮ビットストリーム２１０３の適切な部分を選択して送出する。パーサ２１０４は画素もしくは係数レベルの演算もエントロピー符号化／復号化も行わない。別の実施例では、パーサ２１０４はそのような働きを多少は果たすかもしれない。 The compressed bitstream 2103 is input to a parser 2104, which provides some portion of the compressed bitstream 2103 as its output. That portion may be the entire compressed bitstream 2103 or only part of it. When a decompressed image is needed, the requesting agent or device supplies its device characteristics to the parser 2104. In response, the parser 2104 selects and sends the appropriate portion of the compressed bitstream 2103. The parser 2104 performs no pixel-level or coefficient-level computation and no entropy encoding or decoding. In another embodiment, the parser 2104 may perform some of these functions.

パーサ２１０４は、低解像度用の圧縮係数を選択することにより、モニタに画像を表示するための符号化データを提供することができる。それと異なる要求に対しては、パーサ２１０４は注目領域（ＲＯＩ）の非損失性伸長を可能にするように圧縮データを選択する。一実施例では、パーサ２１０４は、要求に従って、プレビュー画像からプリンタ解像度画像又はフルサイズの医用モニタ画像（恐らく１６ビットの画素深度を持つ）までの変化に必要なビットを送出する。 The parser 2104 can provide the coded data for displaying an image on a monitor by selecting the compressed coefficients for a low resolution. For a different request, the parser 2104 selects compressed data that allows lossless decompression of a region of interest (ROI). In one embodiment, in response to a request, the parser 2104 sends the bits needed to go from a preview image to a printer-resolution image or to a full-size medical monitor image (perhaps with a pixel depth of 16 bits).

パーサ２１０４により提供されたデータは通信路及び／又は記憶装置２１０６へ出力される。伸長装置２１０７は、そのデータにアクセスし、圧縮データを伸長する。伸長された、すなわち再構成されたデータは、伸長画像２１０８として出力される。 The data provided by the parser 2104 is output to the communication path and / or the storage device 2106. The decompression device 2107 accesses the data and decompresses the compressed data. The decompressed, that is, reconstructed data is output as a decompressed image 2108.

図２９において、２ＨＨ周波数帯域内のビットプレーンは３ＨＨ周波数帯域からの情報を利用して符号化される。ビットプレーンをより明瞭に示すため、図５０及び図５１に図２９が書き直されている。図５０のように係数が格納される場合（ＭＳＥ）には、圧縮ビットストリームの打ち切りはＭＳＥレート・歪み最適量子化とほとんど同一である。この打ち切りが、図５０に陰をつけたマーカーで示されている。図３０を調べると、この配列はプリンタには適切かもしれないが、モニタには十分でないであろう。図５１に示すように、係数が”ピラミッド状に”格納される場合、つまり、ある周波数帯域の全ビットが最初に格納される場合には、ビットストリームの打ち切りは様々な解像度の画像を提供する。 In FIG. 29, the bit planes in the 2HH frequency band are encoded using information from the 3HH frequency band. FIG. 29 is redrawn in FIGS. 50 and 51 to show the bit planes more clearly. When the coefficients are stored as in FIG. 50 (MSE alignment), truncating the compressed bitstream is almost identical to MSE rate-distortion optimal quantization. This truncation is indicated by the shaded markers in FIG. 50. As an examination of FIG. 30 shows, this ordering may be appropriate for a printer but may not be adequate for a monitor. When the coefficients are stored in "pyramid" order as shown in FIG. 51, that is, when all the bits of one frequency band are stored first, truncating the bitstream yields images of different resolutions.

マーカーを巧みに使えば、両タイプの打ち切りが可能になり、解像度の低い、忠実度の低い画像を発生するであろう。図５０及び図５１中の濃淡の変化は、最高解像度で低忠実度のビットストリームを発生するであろうビットストリームの打ち切りを表す。ＬＨ，ＨＬ，ＨＨ係数の全部をさらに打ち切れば、画像の解像度を下げることになろう。 Skillful use of the markers will allow for both types of truncation and will produce low resolution and low fidelity images. The shading changes in FIGS. 50 and 51 represent the bitstream truncation that would produce the highest resolution and low fidelity bitstream. If all of the LH, HL, and HH coefficients are further truncated, the resolution of the image will be lowered.

多くの画像圧縮の応用では、画像は、一度だけ圧縮されるが、何度も伸長されるかもしれない。あいにく、たいていの圧縮システムは、許容される損失量と適当な量子化とが符号化の時点で決定されなければならない。プログレッシブ・システムは段々に精細になる一揃いの画像を与えるが、非損失性再構成は一般にできない、すなわちプログレッシブ・ビルドアップと関係のない非損失な方法で符号化された“差分画像”を送り出すことにより、非損失性再構成が提供される。 In many image compression applications, an image is compressed only once but may be decompressed many times. Unfortunately, most compression systems require the allowable amount of loss and the appropriate quantization to be decided at encoding time. Progressive systems deliver a sequence of progressively finer images, but lossless reconstruction is generally not possible within the progressive stream itself; where it is offered, it is provided by sending a "difference image" coded losslessly in a manner unrelated to the progressive build-up.

本発明においては、符号化器は、種々の係数を周波数及びビットプレーン要素に分解するに足るだけの情報を保存する。一実施例では、次のエントロピー符号化データ単位に何が入っているか知らせるためのマーカーがビットストリームに挿入される。例えば、マーカーは、次のエントロピー符号化データ単位に、最上位より３番目のビットプレーンのためのＨＨ周波数情報が入っていることを示すかもしれない。 In the present invention, the encoder preserves enough information to separate the various coefficients into frequency and bit-plane components. In one embodiment, markers are inserted into the bitstream to indicate what the next unit of entropy-coded data contains. For example, a marker may indicate that the next unit of entropy-coded data contains the HH frequency information for the third most significant bit plane.
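To make the role of the markers concrete, the following Python sketch shows a parser that selects marked units without looking inside their payloads. Everything in it is an assumption made for illustration: the `Segment` record, its field names, the convention that bit plane 0 is the most significant, and the `parse_stream` function are hypothetical and are not the marker syntax of the patent.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One entropy-coded data unit together with the marker that describes it."""
    band: str        # frequency band label, e.g. "3LL" or "1HH"
    level: int       # decomposition level (a larger number means a coarser level)
    bitplane: int    # 0 is the most significant bit plane
    payload: bytes   # entropy-coded data, opaque to the parser


def parse_stream(segments, max_bitplane=None, min_level=None):
    """Keep only the segments a device needs, based solely on their markers."""
    selected = []
    for seg in segments:
        if max_bitplane is not None and seg.bitplane > max_bitplane:
            continue  # drop less significant bit planes: lower fidelity
        if min_level is not None and seg.level < min_level:
            continue  # drop fine-detail levels: lower resolution
        selected.append(seg)
    return selected


stream = [
    Segment("3LL", 3, 0, b"\x01"),
    Segment("1HH", 1, 0, b"\x02"),
    Segment("1HH", 1, 2, b"\x03"),
]

lossless = parse_stream(stream)                      # everything
low_fidelity = parse_stream(stream, max_bitplane=1)  # top bit planes only
low_resolution = parse_stream(stream, min_level=2)   # coarse levels only
```

The payload bytes are copied through untouched, which reflects the point of the text that selection requires no entropy decoding or coefficient arithmetic.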

誰かがモニタ上で画像を調べたいとすると、低解像度のグレースケール画像を生成するために必要な情報を要求するであろう。そのユーザがその画像を印刷したいときには、高解像度の２値画像を生成するために必要な情報が要求されるであろう。最後に、そのユーザが圧縮試験を実施したいとき、あるいはセンサ雑音の統計分析や医療診断を行いたいときには、その画像の非損失版が要求されるであろう。 A user who wants to examine the image on a monitor requests the information needed to produce a low-resolution grayscale image. A user who wants to print the image requests the information needed to produce a high-resolution binary image. Finally, a user who wants to run compression tests, perform a statistical analysis of sensor noise, or make a medical diagnosis requests the lossless version of the image.

図３１は、パーサ、復号化器、出力装置とのやりとりに関するブロック図である。図３１において、パーサ２４０２は、マーカー付の非損失性圧縮データ、並びに１つ以上の出力装置、例えば図示のディスプレイ・モジュール２４０５の装置特性を受け取るように接続される。この装置特性に基づき、パーサ２４０２は圧縮データの適当な部分を選択し、それを通信路２４０３へ送り、通信路２４０３はそのデータを伸長装置２４０４へ転送する。伸長装置２４０４はそのデータを復号化し、復号化データをディスプレイ・モジュール２４０５に与える。 FIG. 31 is a block diagram showing the interaction among the parser, the decoder, and the output device. In FIG. 31, the parser 2402 is connected to receive the losslessly compressed data with markers, as well as the device characteristics of one or more output devices, such as the display module 2405 shown. Based on these device characteristics, the parser 2402 selects the appropriate portion of the compressed data and sends it to the channel 2403, which transfers the data to the decompressor 2404. The decompressor 2404 decodes the data and provides the decoded data to the display module 2405.

本発明は、ワールド・ワイド・ウェブ、その他の画像サーバーに対する改良したサポートをデータストリームに与える。データストリームのある部分はモニタ用の低空間解像度・高画素深度の画像をサポートすることができる。別の部分は、高空間解像度・低画素深度のプリンタをサポートすることができる。データストリーム全体は非損失性伝送を提供する。これら３つの使い方が同じ圧縮データによりサポートされるので、ブラウザがモニタ画像、印刷画像及び非損失性画像を順番に要求するならば、余分なデータを全く送る必要がない。伝送されたモニタ画像用情報で印刷画像のために必要とされるものは、印刷画像のために再利用できる。伝送されたモニタ画像と印刷画像のための情報は、非損失画像のために再利用できる。本発明は、閲覧のための伝送時間（伝送費用）を減らし、また、サーバーに格納しなければならないデータ量を最小にする。 The present invention gives the data stream improved support for the World Wide Web and other image servers. One portion of the data stream can support a low-spatial-resolution, high-pixel-depth image for a monitor. Another portion can support a high-spatial-resolution, low-pixel-depth printer. The entire data stream provides lossless transmission. Because all three uses are supported by the same compressed data, no redundant data need be sent if a browser requests the monitor image, the print image, and the lossless image in turn. Information already transmitted for the monitor image that is also needed for the print image can be reused for the print image. Information transmitted for the monitor image and the print image can be reused for the lossless image. The present invention reduces the transmission time (transmission cost) for browsing and minimizes the amount of data that must be stored on the server.

本発明のシステムにおいては、画像は１度だけ圧縮されるが、データが何であるかを示すために様々なマーカーが格納される。その後、ワールド・ワイド・ウェブ（ＷＥＢ）サーバーは、表示の要求を受信し、必要な係数を提供するであろう。ＷＥＢサーバーは、圧縮とか伸長とか何もする必要がなく、送るべきビットストリームの適当な部分を選択するだけである。 In the system of the present invention, the image is compressed only once, but various markers are stored to indicate what the data is. A World Wide Web (WEB) server will then receive the request for display and provide the necessary coefficients. The WEB server does not need to do any compression or decompression, it just selects the appropriate part of the bitstream to send.

このような構文解析システムによる量子化は、可逆ウェーブレット及びコンテキストモデルによる高度の非損失性圧縮がないとしても、帯域幅の実質的な増加をもたらす。この構文解析システムは高品質の注目領域の選択にも利用できる。 Quantization by such a parsing system yields a substantial increase in effective bandwidth, even without the high degree of lossless compression provided by the reversible wavelets and the context model. The parsing system can also be used to select regions of interest at high quality.

図３２は量子化選択装置を示す。一実施例では、この選択装置はソフトウエアによって、各種装置のための適切な量子化プロファイルを決定するよう構成される。画像は変換され、様々な周波数帯域のビットプレーンを捨てることによって量子化される。次に逆ウェーブレット変換が実行される。再構成画像は表示に矛盾しない何らかの方法で処理される。モニタに表示される高解像度画像については、その処理はある種のスケーリングであろう。プリンタの場合には、ある種の閾値処理又はディザ処理かもしれない。同じ処理が原画像に適用され、圧縮画像と比較される。平均２乗誤差が例として用いられたが、どのような視覚的差異基準を用いてもよい。様々なビットプレーンの量子化による誤差を利用し、ビットレートの節減による歪みが最低となるようにビットプレーンを選択し量子化する。このプロセスは所望のビットレート又は歪みに達するまで続けられるであろう。各種画像処理操作のための代表的な量子化がいったん決まったならば、量子化をシミュレートする必要はなく、代表値が用いられるであろう。 FIG. 32 shows a quantization selection apparatus. In one embodiment, this apparatus is implemented in software and determines an appropriate quantization profile for each type of device. The image is transformed and then quantized by discarding bit planes of various frequency bands. The inverse wavelet transform is then performed. The reconstructed image is processed in a manner consistent with the display. For a high-resolution image shown on a monitor, this processing would be some form of scaling. For a printer, it might be some form of thresholding or dithering. The same processing is applied to the original image, and the result is compared with that of the compressed image. Mean squared error is used as an example, but any measure of visual difference may be used. Using the error caused by quantizing each of the various bit planes, the bit plane to quantize next is chosen so that the distortion incurred for the bit rate saved is lowest. This process continues until the desired bit rate or distortion is reached. Once typical quantizations have been determined for the various image processing operations, the quantization no longer needs to be simulated and the typical values can be used.
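The selection loop described above can be sketched as a simple greedy procedure. This is an illustrative Python fragment under loud assumptions: the per-bit-plane savings and distortions are taken as given inputs (in the text they come from simulating the device processing), each candidate's distortion is treated as independent of the others, and the function name `choose_discards` and the labels in the example are hypothetical.

```python
def choose_discards(candidates, total_bits, target_bits):
    """Greedily pick bit planes to discard until the target bit count is met.

    candidates: list of (label, bits_saved, added_distortion) tuples.
    Candidates are taken in order of least added distortion per bit saved.
    Returns the discarded labels and the resulting bit count.
    """
    ordered = sorted(candidates, key=lambda c: c[2] / c[1])
    discarded = []
    bits = total_bits
    for label, bits_saved, _distortion in ordered:
        if bits <= target_bits:
            break
        discarded.append(label)
        bits -= bits_saved
    return discarded, bits


# Hypothetical measurements for three least significant bit planes.
measured = [
    ("1HH plane 0", 100, 1.0),
    ("1HL plane 0", 80, 4.0),
    ("2HH plane 0", 50, 0.25),
]
dropped, remaining_bits = choose_discards(measured, total_bits=1000, target_bits=860)
```

In this example the 2HH plane is dropped first (least distortion per bit), then the 1HH plane, which brings the stream to 850 bits and stops the loop. A fuller version would re-measure distortion after each discard, since the text describes an iterative simulation.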

勿論、スケーリングのような単純な画像処理操作に関しては、様々な周波数帯域の量子化の効果を解析的に決定できる。その他のディザ処理やコントラスト・マスキングのような操作に関しては、シミュレーションによりほぼ最適な量子化を見つけるのはずっと簡単である。 Of course, for simple image processing operations such as scaling, the effects of quantization in various frequency bands can be analytically determined. For other operations such as dithering and contrast masking, it is much easier to find a nearly optimal quantization by simulation.

図３２において、符号ストリーム２５００は、量子化を含む伸長２５０１及び非損失性伸長２５０３を施される。その伸長結果に対し、画像処理又は歪みモデル２５０２，２５０４が適用される。その出力は画像であり、ＭＳＥ又はＨＶＳ差モデル２５０５のような差モデルを適用される。その差判定の結果に基づき、アラインメントが調節され（２５０４）、したがって量子化も調節される。 In FIG. 32, the code stream 2500 undergoes decompression with quantization 2501 and lossless decompression 2503. An image processing or distortion model 2502, 2504 is applied to each decompression result. The outputs are images, to which a difference model such as the MSE or HVS difference model 2505 is applied. Based on the result of this difference determination, the alignment is adjusted (2504), and with it the quantization.

構文解析を容易にするため、本発明は一連のヘッダによる合図を利用する。一実施例では、本発明の符号ストリーム構造は、１つ以上のタグ値を持つ主ヘッダを含む。主ヘッダ中のタグは、符号ストリーム中のすべてのタイルのために用いられた成分の数、サブサンプリング及びアラインメント等の情報を知らせる。一実施例では、符号ストリーム中の各タイルの前に、そのヘッダがある。タイル・ヘッダの情報は、当該タイルに対してのみ適用され、また主ヘッダの情報をくつがえすかもしれない。 To facilitate parsing, the present invention uses signaling through a series of headers. In one embodiment, the code stream structure of the present invention includes a main header with one or more tag values. The tags in the main header signal information such as the number of components, the subsampling, and the alignment used for all tiles in the code stream. In one embodiment, each tile in the code stream is preceded by its own header. The information in a tile header applies only to that tile and may override the information in the main header.

ヘッダはそれぞれ１つ以上のタグを含む。一実施例では、インライン・マーカーはない。ヘッダ・タグは、ある既知の点からユーザがコーダをリセットする所までの圧縮データの量を示す。一実施例では、どのタグもみな１６の倍数のビット数である。したがって、主ヘッダ及びタイル・ヘッダはどれもみな１６の倍数のビット数である。なお、どのタグも、１６以外の数の倍数のビット数でも構わないことに注意されたい。どのタイルデータ・セグメントも、１６の倍数のビット数になるよう適当数の０が挿入される。 Each header contains one or more tags. In one embodiment, there are no inline markers. Header tags indicate the amount of compressed data from a known point to the point where the user resets the coder. In one embodiment, every tag is a multiple of 16 bits long, so every main header and tile header is also a multiple of 16 bits long. Note that tags could instead be a multiple of some number of bits other than 16. Every tile data segment is padded with the appropriate number of zeros so that its length is a multiple of 16 bits.
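The zero padding of tile data segments amounts to rounding the bit count up to the next multiple of 16. A minimal Python sketch follows; the function name `pad_to_multiple` and the representation of a segment as a string of "0" and "1" characters are assumptions made for readability.

```python
def pad_to_multiple(bits, multiple=16):
    """Append zeros so that len(bits) becomes a multiple of `multiple`."""
    remainder = len(bits) % multiple
    if remainder:
        bits += "0" * (multiple - remainder)
    return bits


segment = "1011001"                 # 7 bits of tile data
padded = pad_to_multiple(segment)   # padded with 9 zeros to reach 16 bits
```

A segment that is already aligned is returned unchanged, and the `multiple` parameter covers alignments other than 16 bits.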

一実施例では、各タイル・ヘッダはそのタイル・サイズを指示するかもしれない。別の実施例では、各タイルは、どこで次のタイルが始まるか指示するかもしれない。なお、符号ストリームのバックトラックが可能なときには、そのような情報をすべて主ヘッダに挿入することにより、符号化が簡単になるかもしれない。パーサは、符号ストリームに関する情報をその量子化を実行するために利用することができる。 In one embodiment, each tile header may indicate its tile size. In another embodiment, each tile may indicate where the next tile begins. When backtracking of the code stream is possible, encoding may be simplified by inserting all such information into the main header. The parser can use information about the code stream to perform its quantization.

一実施例では、タイル・ヘッダはタイルがウェーブレット方式とバイナリ方式のいずれで符号化されたか指示するかもしれない。重要性レベル・インディケータは、タイルのデータと重要性レベルとを関係付ける。重要性レベル・ロケータ（locator）は、可能な打ち切り位置を知らせる。例えば、各タイルに対し同じ歪みが望まれるときには、どの重要性レベルがその所望の歪みレベルと等しいか分かれば、パーサは符号ストリームを適切な位置で打ち切ることができる。一実施例では、各タイルは、同じビット数を持つのではなく、ほぼ同じ歪みを持つ。 In one embodiment, the tile header may indicate whether the tile was coded with the wavelet style or the binary style. The importance level indicator relates the data of a tile to importance levels. The importance level locator signals the possible truncation points. For example, when the same distortion is desired for every tile, the parser can truncate the code stream at the appropriate place once it knows which importance level corresponds to the desired distortion level. In one embodiment, each tile has approximately the same distortion rather than the same number of bits.
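The way a locator lets the parser cut every tile at the same importance level can be sketched as follows. This Python fragment is hypothetical throughout: a tile is modeled as a dictionary with a `data` byte string and a `locator` mapping each importance level to the byte offset at which that level ends, which is not the tag layout of the patent.

```python
def truncate_tiles(tiles, target_level):
    """Cut each tile at the end of `target_level`, using only its locator.

    If a tile has no entry for the requested level, the whole tile is kept.
    """
    truncated = []
    for tile in tiles:
        cut = tile["locator"].get(target_level, len(tile["data"]))
        truncated.append(tile["data"][:cut])
    return truncated


tiles = [
    {"data": b"abcdefgh", "locator": {0: 2, 1: 5, 2: 8}},
    {"data": b"xyz", "locator": {0: 1, 1: 3}},
]
same_distortion = truncate_tiles(tiles, target_level=1)
```

The two tiles end up with different lengths (5 bytes and 3 bytes) even though they are cut at the same importance level, which matches the remark that tiles share a distortion level rather than a bit count.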

本発明は、重要性レベル・ロケータ・タグを持つことにより、複数のタイルと、各タイルのどこで終わるべきかの指示を持ち得るようにしている。 The present invention has an importance level locator tag so that it can have multiple tiles and an indication of where to end each tile.

タグとポインタ Tags and pointers
復号化又は構文解析に用いられる構文解析用マーカー及びその他の情報が、タグに入れられてもよい。一実施例では、ヘッダは以下のルールに従うタグによって制御情報を与える。 Parsing markers and other information used for decoding or parsing may be placed in tags. In one embodiment, the headers supply control information through tags that obey the following rules.

タグは固定サイズでも可変サイズでもよい。成分数、タイル数、レベル数、又はリセットもしくは所望情報の数により、タグは長さが変わってもよい。 Tags may be of fixed or variable size. The length of a tag may vary with the number of components, the number of tiles, the number of levels, or the number of resets or items of information desired.

画像が構文解析され量子化されるならば、それらのタグは新しい画像特性を表すように変更される。 If the image is parsed and quantized, the tags are changed to represent new image characteristics.

データストリーム中のリセット点は、８の倍数のビット数になるように０を挿入される。エントロピー・コーダを符号ストリームのある点でリセットできるが、その点は符号化時に決定される（しかし、それを行うことができるのは、一つの重要性レベルの符号化の終わりでだけである）。このリセットは、エントロピー・コーダにおける全ての状態情報（コンテキスト及び確率）が既知の初期状態に戻されることを意味する。次に、符号ストリームは次の８の倍数のビット数まで０を挿入される。 Reset points in the data stream are padded with zeros to a multiple of 8 bits. The entropy coder can be reset at certain points in the code stream; these points are decided at encode time, and a reset can occur only at the end of coding an importance level. A reset means that all state information in the entropy coder (contexts and probabilities) is returned to a known initial state. The code stream is then padded with zeros up to the next multiple of 8 bits.
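
The zero padding at a reset point can be sketched as follows. This is a minimal illustration only: the bit-string representation and the function name `pad_to_multiple` are assumptions made for the example and do not come from the specification.

```python
def pad_to_multiple(bits: str, multiple: int = 8) -> str:
    """Append '0' bits until the length is a multiple of `multiple`.

    `bits` is a string of '0'/'1' characters standing in for the coded
    data emitted so far (a hypothetical representation for illustration).
    """
    remainder = len(bits) % multiple
    if remainder == 0:
        return bits  # already aligned, nothing to add
    return bits + "0" * (multiple - remainder)


# 13 coded bits before a reset are padded up to the next multiple of 8.
coded = "1011001110101"
aligned = pad_to_multiple(coded)
print(len(coded), "->", len(aligned))
```

The same helper with `multiple=16` would model the padding of a tile data segment to a multiple of 16 bits.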

パーサは、画像を量子化する際に符号ストリーム・タグだけを手がかりとして用いる。一実施例では、この量子化処理のために、タイル長、成分長、リセット、ビット対重要性レベル、及び、重要性レベル・ロケータの各タグが用いられる。 The parser uses only the code stream tag as a clue when quantizing the image. In one embodiment, tile length, component length, reset, bit to importance level, and importance level locator tags are used for this quantization process.

パーサによって画像が量子化された後、そのタグはすべて新しい符号ストリームを反映するよう変更される。これは画像と、タイルサイズ、成分数、成分のスパン、全ての長さとポインタ等々に影響を及ぼすのが普通である。さらに、画像がどのように量子化されたかを記述する情報タグも含まれる。 After the image is quantized by the parser, all of its tags are changed to reflect the new code stream. This usually affects the image, tile size, number of components, component span, all lengths and pointers, etc. Further included is an information tag that describes how the image was quantized.

表３は、本発明の一実施例における全てのタグの一覧表である。説明と用語はしばしばＪＰＥＧとは異なるが、可能な場合には同じマーカーと識別子が用いられる。どの画像も少なくとも２つのヘッダ、すなわち、画像の始まりにある主ヘッダと、各タイルの始まりにあるタイルヘッダとを有する。（どの符号ストリームもみな少なくとも１つのタイルを含む） Table 3 lists all the tags in one embodiment of the present invention. The descriptions and terminology often differ from JPEG, but the same markers and identifiers are used where possible. Every image has at least two headers: the main header at the beginning of the image and a tile header at the beginning of each tile. (Every code stream contains at least one tile.)
３種類のタグ、すなわち区切りタグ、機能タグ及び情報タグも用いられる。区切りタグは、ヘッダ及びデータのフレーミングのために用いられる。機能タグは、利用される符号化機能を記述するために使われる。情報タグはデータに関するオプションの情報を提供する。 Three kinds of tags are also used: delimiter tags, function tags, and information tags. Delimiter tags frame the headers and the data. Function tags describe the coding functions used. Information tags provide optional information about the data.

なお、”ｘ”は当該タグが当該ヘッダ中で用いられないことを意味する。ヘッダ中のＴＬＭタグ又は各タイル中のＴＬＴタグのどちらかが必要とされるが、その両方は必要でない。成分ポインタが必要であるのは、２つ以上の成分があるときだけである。 Note that “x” means that the tag is not used in that header. Either the TLM tag in the main header or a TLT tag in each tile is required, but not both. Component pointers are needed only when there is more than one component.

図３３は本発明の符号ストリームにおける区切りタグの配置を表す。各符号ストリームは、１つのＳＯＩタグ、１つのＳＯＣタグ、１つのＥＯＩタグだけ（及び少なくとも１つのタイル）を持つ。各タイルは１つのＳＯＴタグと１つのＳＯＳタグを持つ。各区切りタグは１６ビットで、長さ情報を含まない。 FIG. 33 shows the arrangement of the delimiter tags in the code stream of the present invention. Each code stream has exactly one SOI tag, one SOC tag, and one EOI tag (and at least one tile). Each tile has one SOT tag and one SOS tag. Each delimiter tag is 16 bits long and carries no length information.

ＳＯＩタグは、ＪＰＥＧファイルの始まりを示し、１６ビットのＪＰＥＧマジックナンバー（magic number）である。 The SOI tag indicates the start of a JPEG file and is a 16-bit JPEG magic number.

ＳＯＣタグはファイルの始まりを示し、ＳＯＩタグの直後にくる。ＳＯＩタグとＳＯＣタグは全体として、ユニーク数を形成する１６ビットとなる。 The SOC tag indicates the beginning of the file and comes immediately after the SOI tag. The SOI tag and the SOC tag as a whole are 16 bits forming a unique number.

ＳＯＴタグはタイルの始まりを示す。符号ストリームには少なくとも１つのタイルがある。ＳＯＴはストリームがまだ同期していることを保証するためのチェックとして働く。 The SOT tag indicates the beginning of a tile. There is at least one tile in the code stream. SOT serves as a check to ensure that the stream is still synchronized.

ＳＯＳタグは”スキャン”の始まりを示し、その後にタイルの実画像データが続く。ＳＯＳはタイル・ヘッダの終わりを示し、また、ＣＲＥＷ符号ストリームには少なくとも１つのＳＯＳがなければならない。ＳＯＳとその次のＳＯＴ又はＥＯＩ（画像の終わり）との間のデータは１６の倍数のビット数であり、この符号ストリームは必要であれば０が挿入される。 The SOS tag indicates the start of a “scan” and is followed by the actual image data of the tile. The SOS marks the end of the tile header, and a CREW code stream must contain at least one SOS. The data between an SOS and the next SOT or EOI (end of image) is a multiple of 16 bits long; the code stream is padded with zeros as needed.

ＥＯＩタグは画像の終わりを示す。ＥＯＩはストリームがまだ同期していることを保証するためのチェックとして働く。符号ストリーム中に少なくとも１つのＥＯＩがある。 The EOI tag indicates the end of the image. The EOI serves as a check to ensure that the stream is still synchronized. There is at least one EOI in the code stream.
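
The delimiter layout described above (SOI, then SOC, then one SOT/SOS pair per tile, then EOI) can be checked with a short sketch. The tags are represented here by their names as strings, because the actual 16-bit tag values are not given in this passage; the function name `delimiters_well_formed` is likewise hypothetical, and function and information tags are assumed to have been filtered out beforehand.

```python
def delimiters_well_formed(tags: list[str]) -> bool:
    """Check the order SOI, SOC, one or more (SOT, SOS) pairs, EOI.

    `tags` holds only the delimiter tags of a code stream, by name,
    in the order in which they occur.
    """
    # The shortest legal sequence has one tile: SOI SOC SOT SOS EOI.
    if len(tags) < 5 or tags[:2] != ["SOI", "SOC"] or tags[-1] != "EOI":
        return False
    body = tags[2:-1]
    if len(body) % 2 != 0:
        return False  # every tile needs both an SOT and an SOS
    return all(
        body[i] == "SOT" and body[i + 1] == "SOS"
        for i in range(0, len(body), 2)
    )


print(delimiters_well_formed(["SOI", "SOC", "SOT", "SOS", "SOT", "SOS", "EOI"]))
print(delimiters_well_formed(["SOI", "SOC", "SOT", "EOI"]))
```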

機能タグは、タイル又は画像の全体を符号化するために利用される機能を記述する。これらタグの中には、主ヘッダに用いられるが、個々のタイルの符号化中に別の値を持つ同じタグを使って覆すことができるものもある。ＳＩＺタグは画像格子の幅と高さ、タイルの幅と高さ、成分数、色空間変換（必要なとき）、各成分のサイズ（画素深度）、及び成分が基準格子をどのように埋めるかを定義する。このタグは主ヘッダにのみ出現し、タイル・ヘッダ中には出現しない。各タイルは、その成分のすべてに同じ特性を提供させる。ここで定義されたパラメータの多くは他のタグのためにも用いられるため、ＳＩＺタグはＳＯＣタグのすぐ後に続かねばならない。このタグの長さは、ＳＩＺの後の最初のフィールドであるＬsizに保存されるが、成分数に依存する。図３４はＳＩＺタグに関する画像とタイルのサイズのシンタックスを示す。 Function tags describe the functions used to code an entire tile or image. Some of these tags are used in the main header but can be overridden for individual tiles by using the same tag with different values. The SIZ tag defines the width and height of the image grid, the width and height of the tiles, the number of components, the color space conversion (if any), the size (pixel depth) of each component, and how the components span the reference grid. This tag appears only in the main header, never in a tile header. Every tile has all of its components present with the same characteristics. Because many of the parameters defined here are used by other tags, the SIZ tag must immediately follow the SOC tag. The length of this tag, stored in Lsiz, the first field after SIZ, depends on the number of components. FIG. 34 shows the image and tile size syntax of the SIZ tag.

以下は各要素のサイズと値の説明リストである。 The following is an explanation list of the size and value of each element.

ＳＩＺ：マーカー SIZ: Marker

Ｌｓｉｚ：マーカを含めない、バイト数で表したタグの長さ（偶数でなければならない）。 Lsiz: Tag length in bytes, not including markers (must be an even number).

Ｘｓｉｚ：画像基準格子の幅（１成分の画像又は共通のサブサンプリングによる色成分を持つ画像では画像幅と同じ）。 Xsiz: the width of the image reference grid (the same as the image width in a one-component image or an image having color components by common sub-sampling)

Ｙｓｉｚ：画像基準格子の高さ（１成分の画像又は共通のサブサンプリングによる色成分を持つ画像では画像の高さと同じ）。 Ysiz: The height of the image reference grid (the same as the image height in the case of a one-component image or an image having color components by common subsampling)

ＸＴｓｉｚ：１タイル画像基準格子の幅。タイルは、あらゆる成分の１標本を持てるだけの幅でなければならない。画像幅内のタイルの数は $\lceil Xsiz / XTsiz \rceil$ に等しい。 XTsiz: The width of one tile on the image reference grid. A tile must be wide enough to hold one sample of every component. The number of tiles across the image width is equal to $\lceil Xsiz / XTsiz \rceil$.

ＹＴｓｉｚ：１タイル画像基準格子の高さ。タイルは、あらゆる成分の１標本を持てるだけの高さでなければならない。画像高さ内のタイルの数は $\lceil Ysiz / YTsiz \rceil$ に等しい。 YTsiz: The height of one tile on the image reference grid. A tile must be tall enough to hold one sample of every component. The number of tiles down the image height is equal to $\lceil Ysiz / YTsiz \rceil$.

Ｃｓｉｚ：画像中の成分の数。 Csiz: the number of components in the image.

ＣＳｓｉｚ：色空間変換の種類（必要なとき）。このタグは包括的ではない。（多くの多成分空間変換はここでは指定できない。それらは本発明のファイルフォーマット内でない、ほかの場所で指示される必要がある）。表４に色空間変換のための値を示す。 CSsiz: Color space conversion type (when necessary). This tag is not comprehensive. (Many multi-component spatial transformations cannot be specified here. They are not within the file format of the present invention and need to be indicated elsewhere). Table 4 shows values for color space conversion.

このタグにおいて記述されるサブサンプリングは、各成分に最高解像度を利用できない画像に適用される。本発明のシステムは、最高解像度を利用できる時に重要性の低い成分のサイズを縮小する別の方法がある。 The subsampling described in this tag is applied to images where the highest resolution is not available for each component. The system of the present invention has another way to reduce the size of less important components when the highest resolution is available.

Ｓｓｉｚｉ：第ｉ成分の精度（画素深度）。このパラメータ、ＸＲｓｉｚ及びＹＲｓｉｚは全ての成分のために繰り返される。 Ssizi: Precision (pixel depth) of the i-th component. This parameter, together with XRsiz and YRsiz, is repeated for every component.

ＸＲｓｉｚｉ：第ｉ成分のＸ次元の大きさ。例えば、数字２は当該成分が２つの水平基準格子点に寄与することを意味する。このパラメータと、Ｓｓｉｚ、ＹＲｓｉｚは全ての成分のために繰り返される。 XRsizi: Extent of the i-th component in the X dimension. For example, the number 2 means that the component contributes to two horizontal reference grid points. This parameter, together with Ssiz and YRsiz, is repeated for every component.

ＹＲｓｉｚｉ：第ｉ成分のＹ次元の大きさ。例えば、数字２は当該成分が２つの垂直基準格子点に寄与することを意味する。このパラメータと、Ｓｓｉｚ、ＸＲｓｉｚは全ての成分のために繰り返される。 YRsizi: Extent of the i-th component in the Y dimension. For example, the number 2 means that the component contributes to two vertical reference grid points. This parameter, together with Ssiz and XRsiz, is repeated for every component.

ｒｅｓ：必要なときに最後に置かれる０の埋め草バイト。 res: A zero padding byte placed last when needed.
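
The tile counts described under XTsiz and YTsiz can be sketched as follows. The sketch assumes that the count in each dimension is the reference grid size divided by the tile size, rounded up so that partial tiles at the right and bottom edges are counted; the function name `tile_grid` is hypothetical.

```python
def tile_grid(xsiz: int, ysiz: int, xtsiz: int, ytsiz: int) -> tuple[int, int]:
    """Return (tiles across, tiles down) for an image reference grid.

    Integer ceiling division is used so that a partial tile at the
    right or bottom edge still counts as one tile (an assumption).
    """
    if xtsiz <= 0 or ytsiz <= 0:
        raise ValueError("tile dimensions must be positive")
    across = (xsiz + xtsiz - 1) // xtsiz
    down = (ysiz + ytsiz - 1) // ytsiz
    return across, down


# A 1000 x 600 reference grid with 256 x 256 tiles needs a 4 x 3 tile grid.
print(tile_grid(1000, 600, 256, 256))
```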

ＣＯＤタグは、画像又はタイルに用いられたバイナリ方式やウェーブレット方式といった符号化方式、変換フィルタ及びエントロピー・コーダを記述する。このタグは主ヘッダに含まれ、またタイルヘッダにも使用できる。このタグの長さは成分数に依存する。図３５は、符号化方式シンタックスを示す。表６は符号化方式のためのサイズと値を示す。 The COD tag describes the coding style used for the image or tile, such as the binary style or the wavelet style, together with the transform filters and the entropy coder. This tag is included in the main header and may also be used in tile headers. The length of this tag depends on the number of components. FIG. 35 shows the coding style syntax. Table 6 shows the sizes and values for the coding style.

ＣＯＤ：マーカー。 COD: marker.

Ｌｃｏｄ：マーカーを含めない、バイト数で表したタグの長さ（偶数でなければならない）。 Lcod: Tag length in bytes, not including markers (must be an even number).

Ｃｃｏｄｉ：各成分の符号化方式。 Ccodi: Coding method for each component.

各成分毎に、ＡＬＧタグはピラミッドレベル数と係数のアラインメントを記述する。ＡＬＧは主ヘッダに用いられ、またタイルヘッダにも用いることができる。このタグの長さは、成分数に依存し、場合によってはレベル数にも依存する。図３６は、本発明の成分アラインメント・シンタックスの一例を示す。図３６において、以下の成分が含まれる。 For each component, the ALG tag describes the number of pyramid levels and the alignment of the coefficients. ALG is used in the main header and may also be used in tile headers. The length of this tag depends on the number of components and, in some cases, on the number of levels. FIG. 36 shows an example of the component alignment syntax of the present invention. FIG. 36 contains the following elements.

ＡＬＧ：このマーカーは、成分アラインメント・パラメータのサイズと値を示す。 ALG: This marker indicates the size and value of the component alignment parameter.

Ｌａｌｇ：マーカーを含めない、バイト数で表したタグの長さ（偶数である）。 Lalg: Tag length in bytes, not including markers (even).

Ｐａｌｇi：第ｉ成分のピラミッド分解レベル数。このパラメータとＡａｌｇ、場合によってはＳａｌｇも、各成分毎に１レコードとして繰り返される。 Palgi: Number of pyramid decomposition levels of the i-th component. This parameter, Aalg, and possibly Salg are also repeated as one record for each component.

Ａａｌｇi：第ｉ成分のアラインメント。このテーブル・エントリーは、係数のアラインメントを記述し、あらゆる成分のために繰り返される。表９にＡａｌｇパラメータの値を表す。 Aalgi: Alignment of the i-th component. This table entry describes the alignment of the coefficients and is repeated for every component. Table 9 shows the values of the Aalg parameter.

Ｐａｌｇ、場合によってはＳａｌｇも、各成分毎に１レコードとして繰り返される。 Palg and possibly Salg are also repeated as one record for each component.

Ｔａｌｇi：表１０にテール情報選択方法を示す。 Talgi: Table 10 shows the tail information selection method.

Ｓａｌｇij：第ｉ成分の第ｊサブブロックのアラインメント値であり、当該成分のためのＡａｌｇiの値が”カスタム・アラインメント”であるときにのみ用いられる。この数は、どのカスタム・アラインメントが選ばれるかにより８ビット又は１６ビットであり、また、当該成分に関し、画像のあらゆる周波数帯域のために順番に繰り返される。（バイナリ方式に関しては、Ｓａｌｇijは第ｉピラミッドレベルのアラインメント値である） Salgij: The alignment value of the j-th sub-block of the i-th component, used only when the value of Aalgi for that component is “custom alignment”. This number is 8 bits or 16 bits, depending on which custom alignment is chosen, and is repeated in order for every frequency band of the image for that component. (For the binary style, Salgij is the alignment value of the i-th pyramid level.)
Ｓａｌｇijが用いられる時には、Ｓａｌｇijと、Ａａｌｇ及びＰａｌｇは各成分毎に１レコードとして繰り返される。 When Salgij is used, Salgij, Aalg, and Palg are repeated as one record for each component.

ＴＬＭタグは、画像中のあらゆるタイルの長さを記述する。各タイルの長さは、ＳＯＴタグの第１バイトから（次のタイルの）次のＳＯＴタグの第１バイト、又はＥＯＩ（画像の終わり）までを測った長さである。言い換えれば、この長さはタイルへのポインタのリスト又はデイジーチェーンである。 The TLM tag describes the length of every tile in the image. The length of each tile is measured from the first byte of its SOT tag to the first byte of the next tile's SOT tag, or to the EOI (end of image). In other words, these lengths form a list, or daisy chain, of pointers to the tiles.

符号ストリームは、単一のＴＬＭタグ又は各タイル毎のＴＬＴタグのいずれかを含むが、その両方は含まない。主ヘッダ中にＴＬＭタグが使用される時には、ＴＬＴタグは用いられない。逆に、各タイルがＴＬＴタグで終わるときには、ＴＬＭタグは用いられない。ＴＬＭヘッダ中の個々のタイル長の値は、ＴＬＭが使われないとしたならば対応ＴＬＴタグのために用いられるであろう値と同じである。ＴＬＭタグの長さは、画像中のタイル数に依存する。図３７はタイル長・主ヘッダのシンタックスの一例を示す。 The code stream contains either a single TLM tag or a TLT tag for each tile, but not both. When the TLM tag is used in the main header, no TLT tags are used. Conversely, when each tile ends with a TLT tag, the TLM tag is not used. The individual tile length values in the TLM tag are the same as the values that the corresponding TLT tags would carry if TLM were not used. The length of the TLM tag depends on the number of tiles in the image. FIG. 37 shows an example of the tile length, main header syntax.

ＴＬＭ：表１１に、タイル長・主ヘッダパラメータのサイズと値を示す。 TLM: Table 11 shows the sizes and values of tile length / main header parameters.

Ｌｔｌｍ：マーカーを含めない、バイト数で表したタグの長さ（偶数である）。 Ltlm: The length of the tag in bytes, not including a marker (even number).

Ｐｔｌｍｉ：第ｉタイルのＳＯＴマーカーから次のＳＯＴ（又はＥＯＩ）マーカーまでのバイト数で表した長さ。これは、画像中のあらゆるタイルのために繰り返される。 Ptlmi: Length in bytes from the SOT marker of the i-th tile to the next SOT (or EOI) marker. This is repeated for every tile in the image.
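
Because each Ptlm value is the distance from one SOT marker to the next (or to EOI), a parser can turn the list of lengths into absolute tile positions without scanning the coded data. The sketch below illustrates this; the function name `tile_offsets` and the idea of passing in the offset of the first SOT are assumptions made for the example.

```python
from itertools import accumulate


def tile_offsets(first_sot_offset: int, ptlm: list[int]) -> list[int]:
    """Return the byte offset of every SOT tag, followed by the EOI offset.

    `first_sot_offset` is where the first tile's SOT tag starts in the
    code stream; `ptlm` holds the per-tile lengths in bytes.
    """
    # Running sum: each tile starts where the previous one ended.
    return list(accumulate(ptlm, initial=first_sot_offset))


# Three tiles of 100, 250 and 80 bytes, the first starting at byte 40.
offsets = tile_offsets(40, [100, 250, 80])
print(offsets[:-1], "EOI at", offsets[-1])
```

With these offsets a parser could, for instance, extract only the second tile by slicing the code stream from `offsets[1]` to `offsets[2]`.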

ＴＬＴタグはカレント・タイルの長さを記述するが、この長さは、ＳＯＴタグの第１バイトから次のタイルのＳＯＴタグの第１バイトまで（又はＥＯＩまで）を測った長さである。言い換えれば、ＴＬＴは次のタイルへのポインタである。ＴＬＴシンタックスの一例を図３８に示す。 The TLT tag describes the length of the current tile, measured from the first byte of its SOT tag to the first byte of the next tile's SOT tag (or to the EOI). In other words, the TLT is a pointer to the next tile. An example of the TLT syntax is shown in FIG. 38.

ＴＬＭタグかＴＬＴタグのいずれかが必要とされ、両方は必要とされない。ＴＬＴタグは、使用される時には、全てのタイルヘッダに必要とされ、そしてＴＬＭタグは使われない。これらのタイル長の値は両マーカーとも同一である。 Either a TLM tag or a TLT tag is required, not both. When used, the TLT tag is required for all tile headers and the TLM tag is not used. These tile length values are the same for both markers.

ＴＬＴ：表１２に、タイル長・タイルヘッダパラメータのサイズと値を示す。 TLT: Table 12 shows the size and value of the tile length / tile header parameter.

Ｌｔｌｔ：マーカーを含めない、バイト数で表したタグの長さ（偶数である）。 Ltlt: Tag length in bytes, not including markers (even number).

Ｐｔｌｔ：タイルのＳＯＴマーカーから次のＳＯＴマーカー（又はＥＯＩマーカー）までの長さ（バイト数）。 Ptlt: Length (in bytes) from the SOT marker of the tile to the next SOT marker (or EOI marker).

ＣＰＴタグは、ＳＯＴの第１バイトより、タイル中の第１成分以外のすべての成分の第１バイトを指し示す。成分符号化データは各タイル内にノンインターリーブ形式で配置され、８ビット境界から始まる。この点でエントロピー・コーダはリセットされる。 The CPT tag points, relative to the first byte of the SOT tag, to the first byte of every component in the tile other than the first. The coded data of the components is arranged in non-interleaved form within each tile and starts on an 8-bit boundary. The entropy coder is reset at this point.

画像が２つ以上の成分を含むときに、このタグはあらゆるタイルのタイルヘッダに使用される。この可変長タグのサイズは、画像中の成分数に依存する。成分ポインタのシンタックスの一例を図３９に示す。 When the image contains more than one component, this tag is used in the tile header of every tile. The size of this variable-length tag depends on the number of components in the image. An example of the component pointer syntax is shown in FIG. 39.

ＣＰＴ：表１３に、成分ポインタのパラメータのサイズと値を示す。 CPT: Table 13 shows the size and value of the parameter of the component pointer.

Ｌｃｐｔ：マーカーを含めない、バイト数で表したタグの長さ（偶数である）。 Lcpt: Tag length in bytes, not including markers (even number).

Ｐｃｐｔi：カレント・タイルのＳＯＴタグから次の成分の始まりまでのバイト数。第１成分のデータはＳＯＳタグの直後に始まるため、Ｐｃｐｔ値の数は成分数より小さい。新たな成分データは８ビット境界上で始まる。 Pcpti: Number of bytes from the current tile's SOT tag to the start of the next component. Because the data of the first component begins immediately after the SOS tag and needs no pointer, there is one fewer Pcpt value than there are components. New component data begins on an 8-bit boundary.

ＩＲＳタグは、カレント・タイルのＳＯＴタグの第１バイトよりデータ中のリセットを指し示す。これらのリセットは、符号化が完了した重要レベルの終わりの後の８ビット境界に見出される。リセットが生じる点の成分は、ＣＰＴタグ値とリセット・ポインタとの間の関係によって決定できる。このタグの長さは、復号化器に利用されたリセットの数に依存する。重要性レベル・リセット・シンタックスの一例を図４０に示す。 The IRS tag points, relative to the first byte of the current tile's SOT tag, to the resets in the data. These resets are found on 8-bit boundaries after the end of a completely coded importance level. The component in which a reset occurs can be determined from the relationship between the CPT tag values and the reset pointer. The length of this tag depends on the number of resets used by the decoder. An example of the importance level reset syntax is shown in FIG. 40.

ＩＲＳ：表１４に重要性レベル・リセットのパラメータのサイズと値を表す。 IRS: Table 14 shows the size and value of the importance level reset parameter.

Ｌｉｒｓ：マーカーを含めない、バイト数で表したタグの長さ（偶数である）。 Lirs: Tag length in bytes, not including markers (even).

Ｉｉｒｓⁱ：第ｉリセットでのカレント重要性レベルの番号。このＩｉｒｓタグと、対応したＰｉｒｓタグとが一種のレコードを形成し、これは各リセット毎に繰り返される。これらのレコードは、リセットを持つ最も高い重要性レベルからリセットを持つ最低の重要性レベルへと続き、その次の成分の重要性レベルのものが続き、同様にして最後の成分まで続く順序である。 Iirsⁱ: Number of the current importance level at the i-th reset. This Iirs value and the corresponding Pirs value form a record that is repeated for each reset. The records run from the highest importance level that has a reset to the lowest importance level that has a reset, followed by the importance levels of the next component, and so on through the last component.

Ｐｉｒｓⁱ：カレント・タイルのＳＯＴタグから第ｉリセットのバイトまでのバイト数。このＰｉｒｓタグとＩｉｒｓタグとが一種のレコードを形成し、これは各リセットに対し繰り返される。これらのレコードは、最小のポインタから最大のポインタへの順序でなければならない。すなわち、これらのレコードは、各リセットバイトを符号ストリーム中で出現した順に指し示す（数が小さくなるほど、物理的には先に出現するバイトを指す）。 Pirsⁱ: Number of bytes from the current tile's SOT tag to the byte of the i-th reset. This Pirs value and the Iirs value form a record that is repeated for each reset. The records must be ordered from the smallest pointer to the largest; that is, they point to the reset bytes in the order in which they occur in the code stream (a smaller number points to a byte that physically occurs earlier).

特定の情報タグがもっぱら情報目的のために含まれる。これらの情報タグは、復号化器のためには必要ではないが、パーサの助けとなろう。 Specific information tags are included solely for information purposes. These information tags are not necessary for the decoder, but will help the parser.

例えば、ＶＥＲタグはメジャー・バージョン番号及びマイナー・バージョン番号を記述する。このタグは、主ヘッダに使われる。このタグは、規定されてはいるが、画像の復号化に必要とされる機能レベルを意味しない。実は、その目的は、あらゆる復号化器及びパーサを、本発明のどのバージョンの符号ストリームも復号化及び構文解析できるようにすることである。本発明のバージョン番号のシンタックスの一例を図４１に示す。 For example, the VER tag describes the major and minor version numbers. This tag is used in the main header. Although this tag is provided, it does not imply the level of capability required to decode the image. Indeed, the goal is for every decoder and parser to be able to decode and parse code streams of every version of the present invention. An example of the version number syntax of the present invention is shown in FIG. 41.

ＶＥＲ：表１５にバージョン番号パラメータのサイズと値を示す。 VER: Table 15 shows the size and value of the version number parameter.

Ｌｖｅｒ：マーカーを含めない、バイト数で表したタグの長さ（偶数である）。 Lver: Tag length in bytes, not including the marker (even number).

Ｖｖｅｒ：メジャー・バージョン番号。 Vver: Major version number.

Ｒｖｅｒ：マイナー・バージョン番号。 Rver: minor version number.

ＢＶＩタグは、画像幅を基準にして、ビットの数を重要性レベルに関連付ける。このオプションのタグは、主ヘッダに用いられる。この可変長タグのサイズは、符号化器によって数え上げられた重要性レベルの数に依存する。ビット対重要性レベル・シンタックスの一例を図４２に示す。 The BVI tag relates the number of bits to importance levels on an image-width basis. This optional tag is used in the main header. The size of this variable-length tag depends on the number of importance levels enumerated by the encoder. An example of the bit versus importance level syntax is shown in FIG. 42.

ＢＶＩ：表１６に、タイル長主ヘッダ・パラメータのサイズと値を示す。 BVI: Table 16 shows the size and value of the tile length main header parameter.

Ｌｂｖｉ：マーカーを含めない、ビット数で表したタグの長さ（偶数である）。 Lbvi: Tag length in bits, not including the marker (even number).

Ｃｂｖｉⁱ：これは、どの成分データが記述されるのか知らせる。このＣｂｖｉパラメータはＩｂｖｉ及びＰｂｖｉと共に、１レコードを形成し、これは記述されたすべての成分及び重要性レベルについて繰り返される。最初の成分の全ての重要性レベル記述、次の成分の全ての重要性レベル記述、等々と続くような順序でなければならない。 Cbviⁱ: Indicates which component's data is being described. This Cbvi parameter, together with Ibvi and Pbvi, forms a record that is repeated for every component and importance level described. The records must be ordered so that all importance level descriptions for the first component come first, followed by all importance level descriptions for the next component, and so on.

Ｉｂｖｉⁱ：カレント成分において、Ｐｂｖｉi内のバイト数につき符号化された重要性レベルの番号。この番号（１つ又は複数）は、レート・歪み曲線の関心点を伝えるために符号化時に選択される。このＩｂｖｉパラメータはＣｂｖｉ及びＰｂｖｉとともに１レコードを形成し、これは記述されたすべての成分及び重要性レベルについて繰り返される。 Ibviⁱ: Number of the importance level, in the current component, that is coded by the number of bytes in Pbviⁱ. This number (or numbers) is selected at encode time to communicate points of interest on the rate-distortion curve. This Ibvi parameter, together with Cbvi and Pbvi, forms a record that is repeated for every component and importance level described.

Ｐｂｖｉⁱ：主ヘッダとタイルヘッダ、及び、Ｉｂｖｉi内の重要性レベルの数に関連した全てのデータを含む符号化ファイル中のバイト数。このＰｂｖｉパラメータはＣｂｖｉ及びＩｂｖｉとともに１レコードを形成し、これは記述されたすべての成分及び重要性レベルについて繰り返される。 Pbviⁱ: Number of bytes in the coded file that includes the main header, the tile headers, and all data associated with the importance level number in Ibviⁱ. This Pbvi parameter, together with Cbvi and Ibvi, forms a record that is repeated for every component and importance level described.

ＩＬＬタグは、符号化データの重要性レベルの終わりに対応した符号ストリームへのポインタを記述する。ＩＬＬタグは、ＩＲＳタグと似ているけれども、リセットも８ビット境界へのビット挿入もないデータを指し示す。このタグにより、パーサは、画像幅基準でほぼ同じひずみのタイルを見つけて打ち切ることが可能になる。このタグは、オプションであり、タイルヘッダ中でだけ使われる。このタグの長さは、数え上げられた重要性レベルの数に依存する。重要性レベル・ロケータのシンタックスの一例を図４３に示す。 The ILL tag describes pointers into the code stream that correspond to the ends of importance levels of coded data. The ILL tag is similar to the IRS tag, except that it points to data where there is neither a reset nor padding to an 8-bit boundary. This tag lets the parser locate and truncate tiles at approximately the same distortion on an image-width basis. This tag is optional and is used only in tile headers. The length of this tag depends on the number of importance levels enumerated. An example of the importance level locator syntax is shown in FIG. 43.

ＩＬＬ：マーカー。表１７に、重要性レベル・ロケータのパラメータのサイズと値を示す。 ILL: marker. Table 17 shows the size and value of the importance level locator parameters.

Ｌｉｌｌ：マーカーを含めない、バイト数で表したタグの長さ（偶数である）。 Lill: Tag length in bytes, not including markers (even).

Ｉｉｌｌⁱ：Ｐｉｌｌi内のバイト数に関し符号化される重要性レベルの番号。それら番号はそれぞれ、レート・歪み曲線の関心点を伝達するため符号化時に選択される。このＩｉｌｌ番号はＰｉｌｌパラメータと共に１レコードを形成するが、これは最も速い成分において最高重要度レベルから最低重要性レベルへの順に繰り返され、以下、後の成分における重要な最高重要性レベルから最低重要レベルまでを特定する同様レコードが続く。 Iillⁱ: Number of the importance level that is coded by the number of bytes in Pillⁱ. Each of these numbers is selected at encode time to communicate points of interest on the rate-distortion curve. This Iill number, together with the Pill parameter, forms a record that is repeated from the highest importance level to the lowest importance level of the first component, followed by similar records running from the highest to the lowest importance level of interest in each later component.

Ｐｉｌｌⁱ：カレント・タイルのＳＯＴの第１バイトより、当該タイルの符号化データ中のＩｉｌｌiの重要性レベルが完了するバイトを指し示す。このＰｉｌｌ数はＩｉｌｌパラメータと共に１レコードを形成し、これは最も速い成分において最高の重要性レベルから最低の重要性レベルへの順に繰り返され、以下、後の成分における重要な最高重要性レベルから最低重要レベルまでを特定する同様レコードが続く。 Pillⁱ: Points, relative to the first byte of the current tile's SOT tag, to the byte in the tile's coded data at which the importance level in Iillⁱ is completed. This Pill number, together with the Iill parameter, forms a record that is repeated from the highest importance level to the lowest importance level of the first component, followed by similar records running from the highest to the lowest importance level of interest in each later component.
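
The way a parser might use the (Iill, Pill) records to truncate a tile can be sketched as follows. This is an illustration under several assumptions that the text does not fix: the records are given as `(level, pointer)` pairs for a single component, the pointer is treated as an exclusive end offset counted from the first byte of the SOT tag, and the names `truncation_offset` and `truncate_tile` are hypothetical. Updating the tags to describe the shortened code stream, which the text requires after quantization, is left out.

```python
def truncation_offset(ill_records, target_level):
    """Return the Pill pointer of the record whose Iill equals target_level.

    Returns None when the encoder did not list that importance level.
    """
    for level, pointer in ill_records:
        if level == target_level:
            return pointer
    return None


def truncate_tile(tile_bytes: bytes, ill_records, target_level) -> bytes:
    """Keep the tile data up to the end of target_level (assumed exclusive)."""
    end = truncation_offset(ill_records, target_level)
    if end is None:
        return tile_bytes  # no locator for this level: leave the tile intact
    return tile_bytes[:end]


# Hypothetical records: importance level -> byte where that level completes.
records = [(5, 120), (4, 300), (3, 650)]
tile = bytes(1000)  # placeholder tile, SOT tag assumed at index 0
print(len(truncate_tile(tile, records, 4)))
```

Applying the same target level to every tile gives tiles of roughly equal distortion rather than equal size, which is the use the ILL tag is described as supporting.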

ＲＸＹタグは、実寸法に関する画像基準格子のＸ解像度及びＹ解像度を定義する。このタグは主ヘッダにのみ用いられる。解像度（画素／単位）のシンタックスの一例を図４４に示す。 The RXY tag defines the X and Y resolution of the image reference grid with respect to real dimensions. This tag is used only in the main header. An example of the resolution (pixels per unit) syntax is shown in FIG. 44.

ＲＸＹ：表１８に、解像度（画素／単位）を指定するためのパラメータのサイズと値を示す。 RXY: Table 18 shows the size and value of the parameter for designating the resolution (pixel / unit).

Ｌｒｘｙ：マーカーを含めない、バイト数で表したタグの長さ（偶数である）。 Lrxy: Tag length in bytes, not including markers (even).

Ｘｒｘｙ：単位あたりの基準格子画素数。 Xrxy: the number of reference grid pixels per unit.

Ｙｒｘｙ：単位あたりの基準格子ライン数。 Yrxy: number of reference grid lines per unit.

ＲＸｒｘｙ：Ｘ次元の単位。したがって、水平方向の解像度は、Ｘｒｘｙ格子画素／１０^（ＲＸｒｘｙ−１２８）メートルである。 RXrxy: The unit for the X dimension. The horizontal resolution is therefore Xrxy grid pixels per 10^(RXrxy−128) meters.

ＲＹｒｘｙ：Ｙ次元の単位。したがって、垂直方向の解像度はＹｒｘｙ格子ライン／１０^（ＲＹｒｘｙ−１２８）メートルである。 RYrxy: The unit for the Y dimension. The vertical resolution is therefore Yrxy grid lines per 10^(RYrxy−128) meters.
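
As a worked example of the resolution fields, the sketch below converts an (Xrxy, RXrxy) pair to pixels per meter and pixels per inch. It assumes that the unit is 10 raised to the power (RXrxy − 128) meters, so that a unit code of 126 means centimeters; the function names are hypothetical.

```python
def pixels_per_meter(count: int, unit_code: int) -> float:
    """Resolution in pixels per meter for `count` pixels per unit.

    The unit is assumed to be 10**(unit_code - 128) meters. Multiplying by
    10**(128 - unit_code) is the same as dividing by the unit length.
    """
    return count * 10.0 ** (128 - unit_code)


def pixels_per_inch(count: int, unit_code: int) -> float:
    """Same resolution expressed per inch (1 inch = 0.0254 m)."""
    return pixels_per_meter(count, unit_code) * 0.0254


# 118 grid pixels per centimeter (unit code 126), roughly a 300 dpi scan.
print(pixels_per_meter(118, 126), round(pixels_per_inch(118, 126), 2))
```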

ＣＭＴタグはヘッダ内の非構造化データを許す。このタグは、主ヘッダとタイルヘッダのいずれにも使用できる。このタグの長さは、コメントの長さに依存する。コメントのシンタックスの一例を図４５に示す。 The CMT tag allows unstructured data in a header. This tag may be used in either the main header or a tile header. The length of this tag depends on the length of the comment. An example of the comment syntax is shown in FIG. 45.

ＣＭＴ：表１９にコメント・パラメータの大きさと値を示す。 CMT: Table 19 shows the size and value of the comment parameter.

Ｌｃｍｔ：マーカーを含めない、バイト数で表したタグの長さ（偶数である）。 Lcmt: Tag length in bytes, not including markers (even number).

Ｒｃｍｔ：タグのレジストレーション（registration）値。表２０に、レジストレーション・パラメータの値を示す。 Rcmt: Tag registration value. Table 20 shows registration parameter values.

Ｃｃｍｔｉ：非構造化データのバイト。任意に繰り返される。 Ccmti: bytes of unstructured data. Repeated arbitrarily.

ｒｅｓ：必要なときに、最後に置かれる０の埋め草バイト。 res: zero padding byte placed last when needed.

ＱＣＳタグは、量子化符号データがどこまで量子化済みかを記述する。パーサ又は符号化器により量子化が実行される時に、このタグは、符号化器が、重要性レベルに関しどこまで符号化すべきかを大まかに判断するのを助ける。このタグは、オプションであり、タイルヘッダにのみ使用される。量子化符号ストリームのシンタックスの一例を図４６に示す。 The QCS tag describes how far the quantized code data has been quantized. When quantization is performed by the parser or the encoder, this tag helps the encoder judge roughly how far it should code in terms of importance levels. This tag is optional and is used only in tile headers. An example of the quantized code stream syntax is shown in FIG. 46.

ＱＣＳ：表２１に、量子化符号ストリームのパラメータのサイズと値を示す。 QCS: Table 21 shows the sizes and values of the parameters of the quantized code stream.

Ｌｑｃｓ：マーカーを含めない、バイト数で表したタグの長さ（偶数である）。 Lqcs: Tag length in bytes, not including markers (even number).

Ｃｉｌｌⁱ：カレント成分の番号。このＣｉｌｌ番号はＩｑｃｓとともに１レコードを形成し、これは最も速い成分における最高重要性レベルから最低重要性レベルへの順に繰り返され、以下、後の成分における重要な最高重要性レベルから最低重要レベルまでを特定する同様レコードが続く。 Cillⁱ: Number of the current component. This Cill number, together with Iqcs, forms a record that is repeated from the highest importance level to the lowest importance level of the first component, followed by similar records running from the highest to the lowest importance level of interest in each later component.

Ｉｑｃｓi：これは、符号化データの少なくとも一部分が残っている重要性レベルである。当該点から次のリセットまでに残っているデータは全て、打ち切り済み（量子化済み）である。 Iqcsi: The importance level at which at least part of the coded data remains. All of the data from that point up to the next reset has been truncated (quantized).

ｒｅｓ：必要に応じて最後に置かれる０の埋め草バイト。 res: A zero padding byte placed last if necessary.

損失性係数再構成 Lossy coefficient reconstruction
本発明は、一実施例において、値を所定の整数値の集合に丸めることで損失性再構成を行う。例えば、０と３１の間の全ての係数は０に量子化され、３２〜６３の間の全ての係数は３２に量子化される等々である。図４７は、量子化しないときの係数の代表的分布を示す。各係数の最も下のビットが分かっていない場合に、そのような量子化が行われるかもしれない。別の実施例では、各値域の中央の値が、その係数群を表すより正確な値を提供するかもしれない。例えば、６４と１２７の間の全ての係数が９５に量子化される。値がある点へ量子化されるとき、その点は再構成点と呼ばれる。 In one embodiment, the present invention performs lossy reconstruction by rounding values to a predetermined set of integer values. For example, all coefficients between 0 and 31 are quantized to 0, all coefficients between 32 and 63 are quantized to 32, and so on. FIG. 47 shows a typical distribution of coefficients without quantization. Such quantization may be performed when the lowest bits of each coefficient are not known. In another embodiment, a value in the middle of each range may provide a more accurate value to represent the group of coefficients. For example, all coefficients between 64 and 127 are quantized to 95. The point to which values are quantized is called the reconstruction point.
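
The two reconstruction choices described above, the bottom of each range and the middle of each range, can be sketched as follows. The sketch handles non-negative magnitudes only, and it keeps the range containing zero at zero even in the midpoint variant; both of these, along with the function name `reconstruct`, are assumptions made for the example.

```python
def reconstruct(magnitude: int, unknown_bits: int, midpoint: bool = False) -> int:
    """Return the reconstruction point for a non-negative coefficient magnitude.

    `unknown_bits` is the number of low-order bits the decoder never saw,
    so all magnitudes within a range of width 2**unknown_bits look alike.
    """
    if magnitude < 0:
        raise ValueError("this sketch handles magnitudes only")
    width = 1 << unknown_bits
    low = (magnitude // width) * width  # bottom of the range
    if not midpoint or low == 0:
        return low  # the zero range stays at zero (assumption)
    return low + (width - 1) // 2  # integer middle of the range


print(reconstruct(45, 5))                   # range 32..63, bottom of range
print(reconstruct(100, 6, midpoint=True))   # range 64..127, middle of range
```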

画像間の差異により、得られる分布は形がゆがむ。例えば、図４７中の曲線２７０１と曲線２７０２を比較されたい。 Because images differ, the resulting distributions are skewed. For example, compare curves 2701 and 2702 in FIG. 47.

本発明においては、再構成点は、その分布に基づいて選ばれる。一実施例では、分布が推定され、その推定分布に基づき再構成点が選ばれる。推定分布は、既に分かったデータに基づき生成される。データを収集する以前は、デフォルトの再構成点が用いられるであろう。このように、本発明は、適応的な損失性再構成方法を提供する。さらに、本発明は、係数再構成を改善する非反復の方法である。分布の差異によって値域の使用が不均一になることを補償するため、本発明は次のように規定する。 In the present invention, the reconstruction point is selected based on its distribution. In one embodiment, a distribution is estimated and a reconstruction point is selected based on the estimated distribution. The estimated distribution is generated based on already known data. Prior to collecting data, the default reconstruction point will be used. Thus, the present invention provides an adaptive lossy reconstruction method. Furthermore, the present invention is a non-iterative method that improves coefficient reconstruction. In order to compensate for the non-uniform use of the range due to the difference in distribution, the present invention defines as follows.

ただし、S^2は利用できるデータを基に復号化器により測定された標本分散であり、Ｑは復号化器に知らされた量子化である。次に、非ゼロ係数を０から遠ざけることによって、それを修正する。 Here S^2 is the sample variance measured by the decoder from the available data, and Q is the quantization known to the decoder. The non-zero coefficients are then corrected by moving them away from zero.

ただし、ｉは任意の整数である。 Here i is any integer.

一実施例では、全部の復号化が完了した後に、非ゼロ係数はすべて、ある再構成レベルに調整される。この調整をするためには、各係数を読み込み、恐らく修正し、そして書き込むことが必要である。 In one embodiment, all non-zero coefficients are adjusted to a certain reconstruction level after all decoding is complete. To make this adjustment, it is necessary to read, possibly modify, and write each coefficient.

別の実施例では、各係数の各ビットプレーンが処理される時に、その係数が非ゼロならば、その係数の適当な再構成値が記憶される。復号化が止まった時に、全係数がそれらの適当な再構成値に設定される。こうすることにより、再構成レベルの設定のため別にメモリを経由する必要がなくなる。 In another embodiment, as each bit plane of each coefficient is processed, the appropriate reconstruction value is stored for the coefficient if it is non-zero. When decoding stops, every coefficient is already set to its appropriate reconstruction value. This removes the need for a separate pass through memory to set the reconstruction levels.

カラー Color
本発明は、カラー画像（及びデータ）に適用できる。図１の多成分処理機構１０１は、カラーデータのために必要とされる処理を実行する。例えば、ＹＵＶ色空間には、３つの成分、つまりＹ成分、Ｕ成分、Ｖ成分があり、各成分は別々に符号化される。 The present invention can be applied to color images (and data). The multi-component processing mechanism 101 of FIG. 1 performs the processing required for color data. For example, the YUV color space has three components, Y, U, and V, and each component is coded separately.

一実施例では、各成分のエントロピー符号化データは、他の成分のエントロピー符号化データから分離される。この実施例においては、成分のインターリービングはない。成分別にデータを分けることは、ピラミッド・アラインメントと組み合わされると、復号化器又はパーサが異なった成分を容易に別々に量子化できるようにするのに役立つ。 In one embodiment, the entropy coded data of each component is kept separate from the entropy coded data of the other components. In this embodiment, the components are not interleaved. Separating the data by component, when combined with pyramid alignment, helps the decoder or parser quantize the different components separately with ease.

他の実施例では、異なった成分のエントロピー符号化データが周波数帯域単位又は重要性レベル単位でインターリーブされる。これは、ＭＳＥアラインメントと組み合わされると、共通の打ち切りを全成分のデータの量子化に利用できるので有益である。このインターリービング方式のためには、符号化器が異なった成分の周波数帯域間又は重要性レベル間の関係を提供する必要がある。周波数帯域又は重要性レベルはかなり大量の符号化データであろうから、パーサ又は復号化器はマーカーを利用し成分を独立に量子化できるであろう。 In other embodiments, different components of entropy encoded data are interleaved in frequency band units or importance level units. This is beneficial when combined with MSE alignment because a common truncation can be used for quantization of all component data. For this interleaving scheme, the encoder needs to provide a relationship between frequency bands or importance levels of different components. Since the frequency band or importance level will be a fairly large amount of encoded data, the parser or decoder will be able to quantize the components independently using markers.

さらに別の実施例では、異なった成分のエントロピー符号化データは、画素毎又は係数毎にインターリーブされる。これは、ＭＳＥアラインメントと組み合わされると、全成分に共通の打ち切りが作用するので有益である。画素単位のインターリービングの場合、復号化器及びパーサは符号化器で定義されのと同じ成分間関係を利用しなければならない。 In yet another embodiment, the entropy-coded data of the different components is interleaved pixel by pixel or coefficient by coefficient. This is beneficial when combined with MSE alignment, because a single truncation then acts on all components. With pixel-by-pixel interleaving, the decoder and parser must use the same relationship between components as the one defined in the encoder.

本発明によれば、同じシステムでサブサンプリングを実行できる。 According to the present invention, subsampling can be performed in the same system.

一実施例では、各成分は別々に記憶される。伸長装置及びパーサを使うことにより、損失性出力画像を生成する時には、別々の成分メモリのそれぞれから分解レベル及び成分の選択されたものだけが取得されるであろう。例えば、ＹＵＶ色空間において、Ｙ色成分については分解レベルの全部が取得されるであろうが、Ｕ成分とＶ成分については第１分解レベル以外の分解レベルがすべて取得されるであろう。結果として得られる画像の組合せは、４：１：１画像である。なお、メモリに格納されているデータの異なった部分を用いることにより、別の型式の画像を得ることもできる。 In one embodiment, each component is stored separately. Using the decompressor and parser, only selected decomposition levels and components are retrieved from each of the separate component memories when a lossy output image is generated. For example, in the YUV color space, all decomposition levels are retrieved for the Y component, but only the decomposition levels other than the first are retrieved for the U and V components. The resulting combination is a 4:1:1 image. Other types of images can be obtained by using different portions of the data stored in memory.
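The level selection described above can be sketched as follows. This is a minimal illustration only: the function name `select_levels`, the component labels, and the convention that level 1 is the finest (highest-resolution) decomposition level are assumptions made for the example, not details taken from the specification.

```python
def select_levels(num_levels, components=("Y", "U", "V"), full=("Y",)):
    """Return {component: decomposition levels to fetch from its memory}.

    Assumption: level 1 holds the finest detail, so dropping it halves the
    resolution of that component in each direction.
    """
    selection = {}
    for name in components:
        if name in full:
            # Luminance keeps every decomposition level.
            selection[name] = list(range(1, num_levels + 1))
        else:
            # Chrominance skips the first (finest) decomposition level.
            selection[name] = list(range(2, num_levels + 1))
    return selection


if __name__ == "__main__":
    print(select_levels(4))
```

Passing a different `full` tuple, or skipping more than one level, would yield the other image types mentioned in the text from the same stored data.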

多くの型式の多成分画像を処理可能である。画像データは、ＹＵＶのほかに、ＲＧＢ（赤、緑、青）、ＣＭＹ（シアン、マゼンタ、黄）、ＣＭＹＫ（シアン、マゼンタ、黄、黒）又はＣＣＩＲ601 ＹＣｒＣｂでもよい。多重スペクトル画像データ（例えば、リモートセンシング・データ）も用い得る。ＲＧＢやＣＭＹのような視覚的データに対しては、米国特許出願第０８／４３６，６６２号（１９９５年５月８日受理、“Method and Apparatus for Reversible Color Compression”）に述べられているような非損失性色空間変換を利用できる。 Many types of multi-component images can be processed. In addition to YUV, the image data may be RGB (red, green, blue), CMY (cyan, magenta, yellow), CMYK (cyan, magenta, yellow, black) or CCIR 601 YCrCb. Multispectral image data (e.g., remote sensing data) may also be used. For visual data such as RGB and CMY, a lossless color space transform such as the one described in US patent application Ser. No. 08/436,662 (filed May 8, 1995, "Method and Apparatus for Reversible Color Compression") can be used.

ビット抽出
本発明は、ビット抽出を高めるようにコンテキストモデルを計算しビットを符号化することができる。具体的には、ヘッドビットのためのコンテキストモデルは、隣接画素より与えられる情報を基礎にしている。しばしば、特に損失性圧縮を行う時に、このコンテキストは０である。ヘッドビット・コンテキストの近似統計量のため、本発明はヘッドビットのためのコンテキストを保持する機構を提供する。 Bit Extraction The present invention can calculate a context model and encode bits to enhance bit extraction. Specifically, the context model for head bits is based on information provided by neighboring pixels. Often this context is zero, especially when performing lossy compression. Due to the approximate statistics of the headbit context, the present invention provides a mechanism for maintaining the context for the headbit.

一実施例では、符号化に先だってメモリがクリアされる。コンテキストは、その親、隣接画素の一つ、又は注目画素が変わるまで、そのままである。変化した時に、影響を受ける全てのコンテキストに関しコンテクスト・メモリが更新される。テール情報を利用するときには、隣接画素と子だけが更新される。ヘッドビットがオンの時に１係数につき１度だけメモリが更新される。 In one embodiment, the memory is cleared prior to encoding. The context remains until its parent, one of the neighboring pixels, or the pixel of interest changes. When changed, the context memory is updated for all affected contexts. When using tail information, only neighboring pixels and children are updated. When the head bit is on, the memory is updated only once per coefficient.

一実施例では、各係数は、符号(sign)の１ビット、テールオン情報の４ビット、コンテキストの８ビット、その後に続く係数の１９ビットからなる３２ビット整数として記憶される。係数の一例を図４８に示す。 In one embodiment, each coefficient is stored as a 32-bit integer consisting of 1 sign bit, 4 bits of tail-on information, 8 context bits, and then 19 bits for the coefficient magnitude. An example of such a coefficient is shown in FIG. 48.
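The 32-bit layout just described (1 + 4 + 8 + 19 bits) can be illustrated with a small packing sketch. The field order follows the text, but placing the sign in the most significant bit and the names `pack_coefficient` and `unpack_coefficient` are assumptions made for illustration.

```python
SIGN_BITS, TAIL_BITS, CTX_BITS, MAG_BITS = 1, 4, 8, 19  # totals 32 bits


def pack_coefficient(sign, tail_on, context, magnitude):
    """Pack the four fields into one 32-bit word, sign in the top bit (assumed)."""
    if not (0 <= sign < 2 and 0 <= tail_on < 16
            and 0 <= context < 256 and 0 <= magnitude < (1 << MAG_BITS)):
        raise ValueError("field out of range")
    word = sign
    word = (word << TAIL_BITS) | tail_on
    word = (word << CTX_BITS) | context
    word = (word << MAG_BITS) | magnitude
    return word


def unpack_coefficient(word):
    """Inverse of pack_coefficient: returns (sign, tail_on, context, magnitude)."""
    magnitude = word & ((1 << MAG_BITS) - 1)
    word >>= MAG_BITS
    context = word & ((1 << CTX_BITS) - 1)
    word >>= CTX_BITS
    tail_on = word & ((1 << TAIL_BITS) - 1)
    word >>= TAIL_BITS
    return word & 1, tail_on, context, magnitude


if __name__ == "__main__":
    w = pack_coefficient(1, 0b0011, 0xA5, 12345)
    print(hex(w), unpack_coefficient(w))
```

Keeping the context bits inside the coefficient word means that updating a neighbor's context is a single read-modify-write on that neighbor's word.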

一実施例では、テールオン情報の４ビットを利用して５つの異なったケースを生成する。 In one embodiment, the four bits of tail-on information are used to distinguish five different cases.

テールオン情報の４ビットの値が０のケースにおいては、カレント係数の絶対値ビットのカレント・ビットプレーンのビットは、コンテキストビットを利用して符号化される。該ビットが０ならば、プロセスは終了する。該ビットが１ならば、係数の符号が符号化される。それから、テールオン情報の第１ビットが反転され、北、北東、西、南、東及び４つの子のコンテキストが更新され、プロセスは終了する。 When the 4-bit value of the tail-on information is 0, the bit of the current bit plane of the absolute value bit of the current coefficient is encoded using the context bit. If the bit is 0, the process ends. If the bit is 1, the coefficient sign is encoded. The first bit of tail-on information is then inverted, the north, northeast, west, south, east and four child contexts are updated and the process ends.

テールオン情報の４ビットの値が１のケースにおいては、カレント係数の絶対値ビットのカレント・ビットプレーンのビットは当該ケースのための一定のコンテキストを使って符号化される。テールオン情報の第２ビットが反転される。カレント係数の東と子のコンテキストが更新される。プロセスは終了する。 In the case where the 4-bit value of the tail on information is 1, the bits of the current bit plane of the absolute value bits of the current coefficient are encoded using the constant context for that case. The second bit of the tail on information is inverted. The east and child contexts of the current coefficient are updated. The process ends.

テールオン情報の４ビットの値が７のケースにおいては、カレント係数の絶対値ビットのカレント・ビットプレーンのビットは当該ケースのための一定のコンテキストを使って符号化される。テールオン情報の第３ビットが反転される。どのコンテキストも更新不要である。プロセスは終了する。 In the case where the 4-bit value of the tail-on information is 7, the bits of the current bit plane of the absolute value bits of the current coefficient are encoded using the constant context for that case. The third bit of the tail on information is inverted. No update is required for any context. The process ends.

テールオン情報の４ビットの値が３のケースにおいては、カレント係数の絶対値ビットのカレント・ビットプレーンのビットは、当該ケース用の一定のコンテキストを用いて符号化される。テールオン情報の第４ビットが反転される。カレント係数の東と子のコンテキストが更新される。プロセスは終了する。 In the case where the 4-bit value of the tail-on information is 3, the bits of the current bit plane of the absolute value bits of the current coefficient are encoded using a certain context for the case. The fourth bit of the tail on information is inverted. The east and child contexts of the current coefficient are updated. The process ends.

テールオン情報の４ビットの値が１５のケースにおいては、カレント係数の絶対値ビットのカレント・ビットプレーンのビットは、当該ケースのための一定のコンテキストを使って符号化される。テールオン情報のどのビットも反転不要である。プロセスは終了する。 In the case where the 4-bit value of the tail-on information is 15, the bits of the current bit plane of the absolute value bits of the current coefficient are encoded using the constant context for that case. No bits of tail-on information need be inverted. The process ends.

図４８は本発明の係数の例を示す。図４８において、係数２８０１は、符号ビット２８０２と、それに続くテールオン情報ビット２８０３、それに続くコンテキストビット２８０４、それに続く係数絶対値ビット２８０５とからなる。前述のプロセスが図４９のフローチャートに示されている。 FIG. 48 shows an example of a coefficient of the present invention. In FIG. 48, coefficient 2801 consists of a sign bit 2802, followed by tail-on information bits 2803, followed by context bits 2804, followed by coefficient magnitude bits 2805. The process described above is illustrated in the flowchart of FIG. 49.

変化が生じた時に全コンテキストを更新する当該手法を使うことにより、ヘッドビットが圧倒的に０である限り、コンテキスト・モデリングが高速に働く。特に損失性符号化の場合にそうである。 By updating all affected contexts only when a change occurs, context modeling runs fast as long as the head bits are predominantly zero. This is especially true for lossy coding.

可逆ウェーブレット係数のハフマン符号化
本発明は、一実施例において、ハフマン符号化を使ってウェーブレット係数を符号化する。ハフマン符号化のためのアルファベットは２つの部分からなる。第１の部分は０係数のランの長さに等しく、第２の部分は０でないターミネータ（terminator）係数のハッシュ値である。図５３にアルファベット・フォーマットを示すが、これは０係数の数、換言すれば、そのランの長さを示す４ビットと、それに続く０から１５までのハッシュ値を表す４ビットとからなる。 Huffman Coding of Reversible Wavelet Coefficients The present invention, in one embodiment, encodes wavelet coefficients using Huffman coding. The alphabet for Huffman coding consists of two parts. The first part is the length of a run of zero coefficients, and the second part is a hash value of the non-zero terminator coefficient. FIG. 53 shows the alphabet format: 4 bits giving the number of zero coefficients, that is, the run length, followed by 4 bits giving a hash value from 0 to 15.

このハッシュ値は値Ｎであり、このＮは０でないターミネータ係数の絶対値の、２を底とする対数の整数部分である。一実施例では、このハッシュ値は値Ｎを表すのに必要なビット数である。例えば、Ｎ＝−１，１の場合、ハッシュ値は１である。他方、Ｎ＝−３，−２，２，３の場合、値Ｎを表すのに必要なビット数は２である。同様の対応はＪＰＥＧに用いられている。 This hash value is the value N, which is the integer part of the logarithm base 2 of the absolute value of the non-zero terminator coefficient. In one embodiment, this hash value is the number of bits required to represent the value N. For example, when N = −1, 1, the hash value is 1. On the other hand, if N = −3, −2, 2 and 3, the number of bits required to represent the value N is two. Similar correspondence is used in JPEG.
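The worked values above (1 bit for ±1, 2 bits for ±2 and ±3) match the bit length of the magnitude, which is one more than the integer part of its base-2 logarithm. A minimal sketch of that mapping, with `hash_value` as a hypothetical name:

```python
def hash_value(coefficient):
    """Number of bits needed to represent |coefficient| (JPEG-style size category)."""
    if coefficient == 0:
        raise ValueError("the terminator coefficient must be non-zero")
    # int.bit_length() equals floor(log2(|x|)) + 1 for non-zero x.
    return abs(coefficient).bit_length()


if __name__ == "__main__":
    for value in (-3, -2, -1, 1, 2, 3, 4):
        print(value, hash_value(value))
```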

このようなシチュエーションでは、許容される０係数のランの最大長は１５である。ランが１５を超えるときには、０が１６個のランの後に新たなランが続くことを表すため特殊なトークンが使われるであろう。このような例外トークンの一つは、最初の４ビットと最後の４ビットの両方とも全部０である。一実施例では、２番目の４ビットが０の１６個のトークンが全部、例外ケースのために用いられる。したがって、２５６個の８ビットのハフマン・トークンがある。 In this situation, the maximum permitted length of a run of zero coefficients is 15. When a run exceeds 15, a special token is used to indicate a run of 16 zeros followed by a new run. One such exception token has all zeros in both its first 4 bits and its last 4 bits. In one embodiment, all 16 tokens whose second 4 bits are zero are used for exception cases. There are therefore 256 8-bit Huffman tokens.
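The run-length tokenization can be sketched as below. This is an illustrative sketch under assumptions: the function name `tokenize`, the choice of the all-zero token `0x00` as the "sixteen zeros, run continues" marker, and the handling of trailing zeros (returned as a leftover count) are not specified in this form by the text.

```python
ZERO_RUN_16 = 0x00  # assumed exception token: sixteen zeros, then a new run


def tokenize(coefficients):
    """Turn a coefficient sequence into 8-bit (run << 4 | hash) tokens.

    Returns (tokens, leftover), where tokens is a list of
    (token, terminator) pairs (terminator is None for exception tokens)
    and leftover is the count of trailing zeros not yet terminated.
    """
    tokens = []
    run = 0
    for c in coefficients:
        if c == 0:
            run += 1
            if run == 16:
                tokens.append((ZERO_RUN_16, None))
                run = 0
            continue
        size = abs(c).bit_length()  # hash value of the terminator
        if size > 15:
            raise ValueError("terminator too large for a 4-bit hash value")
        tokens.append(((run << 4) | size, c))
        run = 0
    return tokens, run


if __name__ == "__main__":
    print(tokenize([0, 0, 5, 0, -1]))
```

Each token would then be replaced by its Huffman codeword, so frequent (run, hash) pairs cost fewer bits than rare ones.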

一実施例では、ハフマン・トークンに関するテーブルが作られる。一実施例では、そのテーブルが全ての画像に対して用いられる。別の実施例では、多くのテーブルが作成され、量子化に応じて１つの特定のテーブルが選ばれる。各テーブルは、量子化しようとするビット数に基づいて選択される。すなわち、量子化するビット数が１ビット、２ビット、３ビット等々であるかによって、テーブルがそれぞれ選択されるわけである。別の実施例では、ハフマン符号は特定画像向けのものであり、画像と一緒に記憶／伝送される。 In one embodiment, a table for Huffman tokens is created. In one embodiment, the table is used for all images. In another embodiment, many tables are created and one particular table is chosen depending on the quantization. Each table is selected based on the number of bits to be quantized. That is, the table is selected depending on whether the number of bits to be quantized is 1 bit, 2 bits, 3 bits, or the like. In another embodiment, the Huffman code is for a specific image and is stored / transmitted with the image.

テーブルを利用するために、一つのハフマン・トークンが生成される。そして、このトークンが、それが符号化されるテーブルに送られる。 One Huffman token is generated to use the table. This token is then sent to the table in which it is encoded.

ハフマン・トークンは０のランの長さ及び非０のターミネータ・シンボルのハッシュ値を特定するが、ターミネータ・シンボルを一意的に特定するために割増のビットが必要になる。本発明の一実施例は、これら割増ビットを用意する。ハフマン・トークンが（例えばテーブル等から得られる）ハフマン符号語で置き換えられた後、ターミネータ・シンボルのハッシュ値に等しい割増ビットが書かれる。例えば、−１，１のケースでは割増の１ビットが書かれるが、−３，−２，２，３のケースにおいては割増の２ビットが書かれる。このように、本発明は、ターミネータ・シンボルを一意的に特定する、割増ビットによってサイズが可変のハフマン符号化を提供する。 The Huffman token specifies the length of the zero run and the hash value of the non-zero terminator symbol, but extra bits are needed to identify the terminator symbol uniquely. One embodiment of the present invention supplies these extra bits. After the Huffman token is replaced with a Huffman codeword (obtained, for example, from a table), a number of extra bits equal to the hash value of the terminator symbol is written. For example, 1 extra bit is written in the case of −1 and 1, whereas 2 extra bits are written in the case of −3, −2, 2 and 3. The present invention thus provides variable-size Huffman coding in which extra bits uniquely identify the terminator symbol.
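The text fixes how many extra bits are written but not their values. The sketch below borrows the convention used by baseline JPEG (positive values written as-is, negative values offset so their leading bit is 0); that convention and the names `extra_bits` and `from_extra_bits` are assumptions for illustration, not something this specification states.

```python
def extra_bits(value):
    """Return (count, bits): the extra-bit count and the bit pattern as an int."""
    if value == 0:
        raise ValueError("the terminator must be non-zero")
    count = abs(value).bit_length()  # equals the hash value
    # JPEG-style: negative values are offset by (2**count - 1).
    bits = value if value > 0 else value + (1 << count) - 1
    return count, bits


def from_extra_bits(count, bits):
    """Recover the terminator from its hash value and extra bits."""
    if bits >= 1 << (count - 1):  # leading bit 1 means positive
        return bits
    return bits - (1 << count) + 1


if __name__ == "__main__":
    for v in (-3, -2, -1, 1, 2, 3):
        n, b = extra_bits(v)
        print(v, n, format(b, "0{}b".format(n)))
```

With this convention the hash value selects a range of magnitudes and the extra bits pick one signed value inside it, so no separate sign bit is required.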

なお、他のｍ元コーダを用いてもよい。例えば、あるアルファベットとｍ元符号を０係数のために用い、別のアルファベットとｍ元符号をハッシュ値のために用いてもよい。 Other m-ary coders may also be used. For example, one alphabet and m-ary code may be used for the runs of zero coefficients, and a separate alphabet and m-ary code may be used for the hash values.

一実施例では、量子化レベル毎のハフマン・テーブルのセットが予め計算され、殆どの画像に対して利用される。様々なテーブル間で選択するために、あるテーブルを使用中に圧縮がモニタされるであろう。そのテーブルを使用した結果に基づいて、スキューがもっと大きい又は小さいテーブルへの切り替えが行われるであろう。 In one embodiment, a set of Huffman tables for each quantization level is pre-calculated and used for most images. The compression will be monitored while using one table to select between the various tables. Based on the results of using that table, a switch to a table with more or less skew will occur.

本発明の係数はすべて、あるバッファに入れられる。各バッファ毎に、どのテーブルを使用すべきかの決定がなされるであろう。８つのハフマン・テーブルのどれを利用すべきか指示するため、３ビットと１つのヘッダが用いられるかもしれない。しかして、そのヘッダを知らせることによって、テーブル選択がなされるであろう。 All the coefficients of the present invention are placed in a buffer. For each buffer, a determination will be made as to which table to use. Three bits and one header may be used to indicate which of the eight Huffman tables should be used. Thus, the table selection will be made by informing the header.

係数が符号化される順序は重要である。従来技術の係数符号化では、例えばＪＰＥＧでは、係数はジグザグ順に圧縮されることに注意されたい。本発明においては、係数全部があるバッファ内にあるので、ジグザグ順にすることはできない。ジグザグ順は、低い周波数から高い周波数への順序と理解されるなら、埋め込みウェーブレットによる圧縮（ツリー順）に拡張することができる。 The order in which the coefficients are encoded is important. Note that in prior art coefficient coding, for example in JPEG, the coefficients are compressed in zigzag order. In the present invention, since all the coefficients are in a buffer, they cannot be in zigzag order. The zigzag order can be extended to compression with an embedded wavelet (tree order) if understood as an order from low to high frequency.

一実施例では、バッファ全体について直線的な順序で係数が符号化される。そのような例を図５４に示す。なお、この実施例において、平滑係数の最初のブロックは除外されることに注意されたい。 In one embodiment, the coefficients are encoded in linear order through the entire buffer. Such an example is shown in FIG. 54. Note that in this embodiment the first block of smooth coefficients is excluded.

別の実施例では、すべてのブロックは、低い周波数のブロックより高い周波数のブロックへと、ラスター順に符号化される。そのような例を図５４（Ｂ）に示す。メモリの制約のため、１つの周波数パスの全部は、別の周波数パスが始まる前に完了しないかもしれない。メモリによって制限される場合、もう一つの方法は１つのツリーを一度に符号化する方法である。ルートから初めて、すべてのツリーが横方向に符号化される。ただし、平滑係数であるところのルートは含めない。この方法が図５４（Ｃ）に示されている。図５４（Ｃ）には最初のツリーが示されており、最初のサブブロックのセットより１ラインが取られ、その次のサブブロックのセットより２ラインが取られ、その次のサブブロックのセットより４ラインが取られる。これらラインは他のラインが利用可能になる以前に利用可能であるため、このような実施例が可能である。 In another embodiment, every block is encoded in raster order, proceeding from the lower-frequency blocks to the higher-frequency blocks. Such an example is shown in FIG. 54(B). Because of memory constraints, one frequency pass may not be completed in full before another frequency pass begins. When memory is the limit, another method is to encode one tree at a time. Starting from the root, each tree is encoded breadth-first, except that the root itself, which is a smooth coefficient, is not included. This method is shown in FIG. 54(C), which depicts the first tree: one line is taken from the first set of sub-blocks, two lines from the next set of sub-blocks, and four lines from the set after that. Such an embodiment is possible because these lines become available before the other lines do.

残りのツリーが０係数からなることを示すため、例外トークンを保存してもよい。これは、１６個の０を示す同じトークンが何度も何度も使用されないようにする。 An exception token may be reserved to indicate that the rest of the tree consists of zero coefficients. This avoids using the same token for 16 zeros over and over again.

一実施例では、全ての重要性レベルがハフマン符号化によって符号化される。別の実施例では、複数の重要性レベルからなる１又は複数のグループがハフマン符号化によって符号化される。別々のグループ毎に全ての重要性レベルをハフマン符号化により符号化してもよいし、あるいは、一部の重要性レベルをハフマン符号化で符号化し、残りの重要性レベルを水平コンテキストモデルとバイナリ・エントロピー・コーダにより符号化してもよい。 In one embodiment, all importance levels are encoded by Huffman coding. In another embodiment, one or more groups of importance levels are encoded by Huffman coding. All importance levels may be encoded with Huffman encoding for each separate group, or some importance levels may be encoded with Huffman encoding, and the remaining importance levels may be encoded with horizontal context models and binary Encoding may be performed by an entropy coder.

重要性レベルの１グループのハフマン符号化による符号化は、以下のように行われる。そのグループ内の重要性レベルの係数のビットが全てヘッドビットのときには、その係数は０係数として（多分、ラン・カウントの一部として）ハフマン符号化される。その係数のビットが全てテールビットならば、それらビットは（多分、ランを終結させる）割増ビットとして符号化される。ハフマン符号語は使われない。その係数のビットが（ヘッドビット又はテールビットのほかに）に符号(sign)ビットを含んでいるときには、（多分、ランを終結させる）ハフマン符号語と割増ビットの両方が符号化される。 Coding by Huffman coding of one group of importance levels is performed as follows. When all bits of importance level coefficients in the group are head bits, the coefficients are Huffman coded as zero coefficients (possibly as part of the run count). If the bits of the coefficient are all tail bits, they are encoded as extra bits (possibly terminating the run). Huffman codewords are not used. When the coefficient bits contain sign bits (in addition to head bits or tail bits), both the Huffman codeword and the extra bits (possibly terminating the run) are encoded.
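The three cases above can be sketched as a classification of one coefficient against a group of bit planes. This is a simplified illustration: it treats an importance level as a single bit plane of the magnitude, and the function name `classify` and its return labels are hypothetical.

```python
def classify(magnitude, hi, lo):
    """Classify a coefficient's bits in planes hi..lo (inclusive, hi >= lo).

    "tail":        the first 1 bit lies above the group, so every bit in the
                   group is a tail bit and only extra bits are written.
    "zero":        no 1 bit at or above plane lo, so every bit in the group
                   is a head bit and the coefficient joins a zero run.
    "significant": the first 1 bit (and hence the sign) falls inside the
                   group, so a Huffman codeword and extra bits are written.
    """
    if magnitude >> (hi + 1):
        return "tail"
    if (magnitude >> lo) == 0:
        return "zero"
    return "significant"


if __name__ == "__main__":
    for m in (0b100000, 0b000011, 0b001010):
        print(format(m, "06b"), classify(m, hi=3, lo=2))
```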

複数の重要性レベルをハフマン符号化すれば、実行コストは減少する。しかし、ハフマン符号化データの中途での打ち切りは、レート・歪みの悪化を招く。重要性レベルのグループをハフマン符号化すれば、レート・歪みが良好になるようグループの始まり／終わりでの打ち切りが可能になる。用途によっては、限定数の必要とされる量子化点が符号化時に分かっている。量子化点のない重要性レベルは、それに続くレベルと一緒にしてハフマン符号化することができる。 Huffman coding several importance levels together reduces the execution cost. However, truncating Huffman-coded data in the middle degrades rate-distortion performance. Huffman coding the importance levels in groups allows truncation at the beginning or end of a group, which gives good rate-distortion. In some applications, a limited number of required quantization points is known at encoding time. Importance levels that contain no quantization point can be Huffman coded together with the levels that follow them.

用途
本発明は多くの用途に利用できる。そのような用途のいくつかを例として以下に述べる。具体的には、解像度が高く画素深度が大きいハイエンドの用途及びアーティファクト(artifact)を許容しない用途に、本発明を利用できる。本発明によれば、ハイエンドの用途は高品質環境で最高品質を維持でき、同時に、帯域幅、データ記憶又は表示機能がさらに制限される用途でも同じ圧縮データを利用可能である。これはまさに、ウェブ・ブラウザのような近頃の画像応用分野に一般に要求される装置独立な表現である。 Applications The present invention can be used in many applications, some of which are described below by way of example. Specifically, the present invention can be used for high-end applications with high resolution and deep pixels, and for applications that do not tolerate artifacts. With the present invention, high-end applications can maintain the highest quality in high-quality environments, while applications with more limited bandwidth, data storage or display capability can use the same compressed data. This is precisely the device-independent representation commonly required by modern imaging applications such as web browsers.

画素深度の深い画像（１０ビット〜１６ビット／画素）に対する本発明の優れた非損失圧縮性能は、医用画像のために理想的である。非損失性圧縮のみならず、本発明は、ブロックベース圧縮装置に知られている多くのアーティファクトのない真の損失性圧縮装置である。本発明を利用することに由来する損失性アーティファクトは、急峻なエッジに沿う傾向があるので、人間の視覚系の視覚マスキング現象によって見えないことが多い。 The superior lossless compression performance of the present invention for deep pixel depth images (10 bits to 16 bits / pixel) is ideal for medical images. In addition to lossless compression, the present invention is a true lossy compression device that is free of many artifacts known to block-based compression devices. Lossy artifacts resulting from the use of the present invention tend to be along sharp edges and are therefore often not visible due to the visual masking phenomenon of the human visual system.

本発明は、画像が非常に高解像度で高い画素深度を持つことの多いプリプレス(pre-press)業に関連した用途に利用できる。本発明のピラミッド分解によれば、プリプレス・オペレータが（モニタ上の）画像の低解像度損失性バージョンに対し画像処理操作を行うのが容易である。操作が終わったならば、同じ操作を非損失性バージョンに対して実行できる。 The present invention can be used in applications related to the pre-press industry where images are often very high resolution and have a high pixel depth. The pyramid decomposition of the present invention makes it easy for a prepress operator to perform image processing operations on a low resolution lossy version of an image (on a monitor). Once the operation is over, the same operation can be performed on the lossless version.

本発明は、圧縮しないと送信に要する時間があまりに長くなりやすいファクシミリ文書の用途にも適用可能である。本発明によれば、様々な空間解像度及び画素解像度のファクス装置より、非常に高品位の画像出力が可能になる。 The present invention is also applicable to facsimile documents, which tend to take too long to transmit unless they are compressed. The present invention enables very high-quality image output from fax machines with a variety of spatial and pixel resolutions.

本発明は、圧縮を必要とする画像アーカイブシステムに、特に記憶容量を増加させるために、利用することもできる。本発明の装置独立な出力は、帯域幅が異なる資源、メモリ及びディスプレイを持つシステムにより画像アーカイブシステムをアクセスでき、有益である。本発明のプログレッシブ伝送機能は、ブラウジングのためにも有益である。最後に、画像アーカイブシステムの出力装置用に望ましい非損失性圧縮が本発明により提供される。 The present invention can also be used in image archiving systems that require compression, particularly to increase storage capacity. The device independent output of the present invention is beneficial because it allows the image archiving system to be accessed by systems with different bandwidth resources, memory and display. The progressive transmission function of the present invention is also useful for browsing. Finally, the present invention provides a lossless compression that is desirable for output devices of image archiving systems.

本発明の非損失性又は高品質損失性データストリームの階層プログレッシブ性により、本発明はワールド・ワイド・ウェブ用に、特に装置独立性、プログレッシブ伝送及び高品質が必須な場合に理想的である。 Due to the hierarchical progressive nature of the non-lossy or high quality lossy data stream of the present invention, the present invention is ideal for the World Wide Web, especially where device independence, progressive transmission and high quality are essential.

本発明は、衛星画像、特に高画素深度及び高解像度になる傾向のある衛星画像にも適用できる。さらに、衛星画像の用途は通信路の帯域幅が制限される。本発明はフレキシビリティがあり、またプログレッシブ伝送特性があるので、本発明を利用すれば人間による画像のブラウジング又はプレビューが可能になろう。 The present invention is also applicable to satellite images, particularly those that tend to have deep pixels and high resolution. In addition, satellite imagery applications have bandwidth-limited channels. Because the present invention is flexible and supports progressive transmission, it can allow a person to browse or preview the images.

ＡＴＭネットワークのような“固定レート”で帯域幅が制限される用途は、データが利用可能な帯域幅をオーバーフローしたときにデータを減少させる手段を必要とする。しかしながら、十分な帯域幅があるときには（あるいはデータが高度に圧縮可能なときには）、品質上の不利益があってはならない。同様に、コンピュータや他の画像装置におけるメモリが制限されたフレーム記憶装置のような“固定サイズ”の用途も、メモリが満杯になったときにデータを減少させる手段を必要とする。繰り返すが、適当なメモリ量に非損失圧縮することが可能な画像に対して不利益があってはならない。 Applications where bandwidth is limited at a "fixed rate", such as ATM networks, require a means to reduce data when the data overflows available bandwidth. However, there should be no quality penalty when there is sufficient bandwidth (or when the data is highly compressible). Similarly, "fixed size" applications such as memory limited frame stores in computers and other imaging devices also require a means to reduce data when the memory is full. Again, there should be no penalty for images that can be losslessly compressed to an appropriate amount of memory.

本発明の埋め込み符号ストリームは、これら両方の用途にかなう。埋め込み操作は、損失性画像の伝送又は記憶のために符号ストリームが切り捨てもしくは打ち切りされることを無条件に許す。切りつめもしくは打ち切りが必要でなければ、画像は非損失で届く。 The embedded code stream of the present invention serves both of these applications. Embedding implicitly allows the code stream to be trimmed or truncated for transmission or storage of a lossy image. If no trimming or truncation is needed, the image arrives losslessly.

要するに、本発明は、単一連続階調画像圧縮システムを提供する。本発明のシステムは、同じ符号ストリームに対して非損失性かつ損失性であり、埋め込みの量子化（符号ストリームに含まれる）を利用する。本発明のシステムはまた、ピラミッド型であり、プログレッシブであり、補間手段を提供し、かつ、実施が容易である。したがって、本発明はフレキシブルな“装置独立の”圧縮システムを提供する。 In summary, the present invention provides a single continuous tone image compression system. The system of the present invention is lossless and lossy for the same code stream, and utilizes embedded quantization (included in the code stream). The system of the present invention is also pyramid-type, progressive, provides interpolation means and is easy to implement. Thus, the present invention provides a flexible “device independent” compression system.

統合型の損失性及び非損失性圧縮システムは非常に有用である。同じシステムで最新の損失性及び非損失性圧縮を実行でき、その上、同じ符号ストリームである。このシステムは、画像の非損失性符号を保存するか打ち切って損失性バージョンにするかを、符号化中、符号ストリームの格納又は伝送中あるいは復号化中に決定することができる。 Integrated lossy and lossless compression systems are very useful. The latest lossy and lossless compression can be performed on the same system, as well as the same code stream. The system can determine whether the lossless code of the image is preserved or truncated into a lossy version during encoding, during storage or transmission of the code stream, or during decoding.

本発明により提供される損失性圧縮は、埋め込み量子化によって達成される。すなわち、符号ストリームは量子化を含んでいる。実際の量子化（又は視覚的重要性）レベルは、復号化器又は通信路との相関点要素であることもあり、必ずしも符号化器との相関的要素ではない。バンド幅、記憶及びディスプレイ資源が許すなら、画像は非損失で復元される。そうでないならば、画像は最も制約された資源に要求されるだけ量子化される。 The lossy compression provided by the present invention is achieved by embedded quantization; that is, the code stream includes the quantization. The actual quantization (or visual importance) level may be a function of the decoder or the channel, and is not necessarily a function of the encoder. If bandwidth, storage and display resources allow, the image is recovered losslessly. Otherwise, the image is quantized only as much as the most constrained resource requires.

本発明に用いられるウェーブレットはピラミッド型であり、差分画像のない、画像の１／２分解が実行される。これは非常に特殊な階層分解である。画像のブラウジングのため又は低解像度装置による表示のために縮小画像(thumbnails)を必要とする用途に、本発明のピラミッド性は理想的である。 The wavelet used in the present invention is pyramidal: the image is decomposed by a factor of two at each level, with no difference images. This is a more specific form of hierarchical decomposition. The pyramidal nature of the present invention is ideal for applications that need thumbnails for browsing images or for display on low-resolution devices.

本発明における埋め込みの使い方はプログレッシブであり、より具体的にはビットプレーン順である、すなわちＭＳＢの後に下位ビットが続く順である。具体的には本発明はウェーブレット領域においてプログレッシブであるが、空間領域及びウェーブレット領域の両方ともプログレッシブに分解してもよい。プリンタのような、空間解像度はあるが画素解像度は低い用途にとって、本発明におけるビットのプログレッシブな順序づけは理想的である。これらの特徴を同一符号ストリームで得られる。 The use of embedding in the present invention is progressive, more specifically in bit-plane order, that is, the order in which the MSB is followed by the lower bits. Specifically, although the present invention is progressive in the wavelet domain, both the spatial domain and the wavelet domain may be decomposed progressively. For applications that have spatial resolution but low pixel resolution, such as printers, the progressive ordering of bits in the present invention is ideal. These features are obtained with the same code stream.
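Bit-plane ordering and truncation can be illustrated with a toy sketch that ignores entropy coding, signs and the wavelet domain altogether. The names `bitplanes` and `reconstruct` are hypothetical; the point is only that any prefix of the plane sequence decodes to a coarser quantization of the same values.

```python
def bitplanes(magnitudes, depth):
    """Split magnitudes into bit planes, most significant plane first."""
    return [[(m >> b) & 1 for m in magnitudes] for b in range(depth - 1, -1, -1)]


def reconstruct(planes, depth):
    """Rebuild magnitudes from however many leading planes were received."""
    if not planes:
        return []
    values = [0] * len(planes[0])
    for index, plane in enumerate(planes):
        shift = depth - 1 - index  # bit position carried by this plane
        values = [v | (bit << shift) for v, bit in zip(values, plane)]
    return values


if __name__ == "__main__":
    planes = bitplanes([5, 12, 3, 9], depth=4)
    print(reconstruct(planes, 4))      # all planes: lossless
    print(reconstruct(planes[:2], 4))  # two planes: coarser values
```

Dropping the last planes is equivalent to quantizing with a power-of-two step, which is why a device with low pixel resolution can simply stop reading early.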

本発明は、ソフトウエアでもハードウエアでも比較的容易に実施できる。ウェーブレット変換は、ハイパス、ローパスの各係数ペアにつき４つの加算／減算操作と、いくつかのシフトだけで計算することができる。埋め込み及び符号化は、単純な”コンテキストモデル”とバイナリ又はｍ元”エントロピー・コーダ”によって実行される。このエントロピー・コーダは、有限状態マシン、並列コーダ又はハフマン・コーダによって実現できる。 The present invention can be implemented relatively easily with either software or hardware. The wavelet transform can be calculated with only four addition / subtraction operations and a few shifts for each high-pass and low-pass coefficient pair. Embedding and encoding is performed by a simple “context model” and a binary or m-ary “entropy coder”. This entropy coder can be realized by a finite state machine, a parallel coder or a Huffman coder.
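To show how a reversible transform can be computed with only additions, subtractions and shifts, the sketch below uses the well-known S-transform on one pair of samples. It is offered as the simplest example of the idea; it is not the TT-transform, whose definition lies outside this passage, and the function names are hypothetical.

```python
def s_forward(a, b):
    """Reversible S-transform of one sample pair: (smooth, detail)."""
    smooth = (a + b) >> 1  # floor of the average (low-pass)
    detail = a - b         # difference (high-pass)
    return smooth, detail


def s_inverse(smooth, detail):
    """Exact integer inverse of s_forward."""
    # The parity lost by the floor is recoverable from the detail value.
    a = smooth + ((detail + 1) >> 1)
    b = a - detail
    return a, b


if __name__ == "__main__":
    for pair in ((5, 2), (2, 5), (4, 2), (0, 255)):
        print(pair, s_forward(*pair), s_inverse(*s_forward(*pair)))
```

Because the inverse is exact in integer arithmetic, applying it after full decoding returns the original samples, which is what makes lossless reconstruction from the same code stream possible.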

本発明の圧縮システムの一実施例のブロック図である。 A block diagram of one embodiment of the compression system of the present invention.
バイナリ方式における各ビットプレーンの各ビットに対するコンテキストモデルの可能な幾何学的関係の一例を示す図である。 A diagram showing an example of a possible geometric relationship of the context model for each bit of each bit plane in the binary scheme.
バイナリ方式における各ビットプレーンの各ビットに対するコンテキストモデルの可能な幾何学的関係の一例を示す図である。 A diagram showing an example of a possible geometric relationship of the context model for each bit of each bit plane in the binary scheme.
第１レベルの分解を示す図である。 A diagram showing a first-level decomposition.
第２レベルの分解を示す図である。 A diagram showing a second-level decomposition.
第３レベルの分解を示す図である。 A diagram showing a third-level decomposition.
第４レベルの分解を示す図である。 A diagram showing a fourth-level decomposition.
前後２レベル間の親子関係を示す。 A diagram showing the parent-child relationship between two successive levels.
ＴＴ変換だけを利用するウェーブレット分解過程の一例を示す図である。 A diagram showing an example of a wavelet decomposition process that uses only the TT-transform.
ＴＴ変換とＳ変換を利用するウェーブレット分解過程の一例を示す図である。 A diagram showing an example of a wavelet decomposition process that uses the TT-transform and the S-transform.
画像のタイリングの説明図である。 An explanatory diagram of image tiling.
ビット・シグニフィカンス表現の例を示す図である。 A diagram showing an example of the bit-significance representation.
本発明における係数サイズを示す図である。 A diagram showing coefficient sizes in the present invention.
本発明における係数アラインメントのために使われる周波数帯域用乗数の例を示す図である。 A diagram showing an example of the frequency-band multipliers used for coefficient alignment in the present invention.
符号ストリームの構成の一例を示す図である。 A diagram showing an example of the structure of a code stream.
係数（又は画素）間の隣接関係を示す図である。 A diagram showing the neighborhood relationship between coefficients (or pixels).
テール・ビット処理プロセスのフローチャートである。 A flowchart of the tail-bit processing process.
本発明の符号化プロセスの一例のフローチャートである。 A flowchart of an example of the encoding process of the present invention.
本発明の復号化プロセスの一例のフローチャートである。 A flowchart of an example of the decoding process of the present invention.
本発明のモデリング・プロセスのフローチャートである。 A flowchart of the modeling process of the present invention.
モデリング・プロセスに利用可能なテンプレートを示す図である。 A diagram showing templates that can be used for the modeling process.
ＴＴ変換フィルタの一部分の一例を示すブロック図である。 A block diagram showing an example of a portion of a TT-transform filter.
本発明のスクロール・バッファの説明図である。 An explanatory diagram of the scroll buffer of the present invention.
本発明に採用されるメモリ操作の説明図である。 An explanatory diagram of the memory operations employed in the present invention.
３レベル用メモリ・バッファの２次元表現を示す図である。 A diagram showing a two-dimensional representation of the memory buffer for three levels.
本発明の符号ストリームの一例を示す図である。 A diagram showing an example of the code stream of the present invention.
パーサを備えた圧縮システムのブロック図である A block diagram of a compression system provided with a parser.
図２７の圧縮システムに対応する伸長システムのブロック図である。 A block diagram of a decompression system corresponding to the compression system of FIG. 27.
コンテキスト従属関係を示す図である。 A diagram showing context dependencies.
画素深度及び空間解像度の面から定義された用途を示す図である。 A diagram showing applications defined in terms of pixel depth and spatial resolution.
パーサ、復号化器及びそれらの出力装置との相互作用の一例を示すブロック図である。 A block diagram showing an example of a parser, a decoder, and their interaction with output devices.
量子化選択装置の一例を示すブロック図である。 A block diagram showing an example of a quantization selection device.
符号ストリーム中の区切りタグの配置を示す図である。 A diagram showing the placement of delimiter tags in the code stream.
ＳＩＺタグの説明図である。 An explanatory diagram of the SIZ tag.
ＣＯＤタグの説明図である。 An explanatory diagram of the COD tag.
ＡＬＧタグの説明図である。 An explanatory diagram of the ALG tag.
ＴＬＭタグの説明図である。 An explanatory diagram of the TLM tag.
ＴＬＴタグの説明図である。 An explanatory diagram of the TLT tag.
ＣＰＴタグの説明図である。 An explanatory diagram of the CPT tag.
ＩＲＳタグの説明図である。 An explanatory diagram of the IRS tag.
ＶＥＲタグの説明図である。 An explanatory diagram of the VER tag.
ＢＶＩタグの説明図である。 An explanatory diagram of the BVI tag.
ＩＬＬタグの説明図である。 An explanatory diagram of the ILL tag.
ＲＸＹタグの説明図である。 An explanatory diagram of the RXY tag.
ＣＭＴタグの説明図である。 An explanatory diagram of the CMT tag.
ＱＣＳタグの説明図である。 An explanatory diagram of the QCS tag.
損失性再構成のための典型的分布を示すグラフである。 A graph showing a typical distribution for lossy reconstruction.
典型的な係数を示す図である。 A diagram showing a typical coefficient.
テール情報解析プロセスのフローチャートである。 A flowchart of the tail-information analysis process.
ＭＳＥアラインメント法を説明するための図である。 A diagram for explaining the MSE alignment method.
ピラミッド・アラインメント法を説明するための図である。 A diagram for explaining the pyramid alignment method.
メモリ記憶係数とアラインメントの間の典型的な関係を示す図である。 A diagram showing a typical relationship between coefficients stored in memory and alignment.
符号語の一例を示す図である。 A diagram showing an example of a codeword.
ハフマン符号化法による係数の構文解析の方法を説明するための図である。 A diagram for explaining how coefficients are parsed under the Huffman coding method.
ユニットバッファを用い第２レベルのウェーブレット分解を実行する場合の２Ｄメモリの中間形式を示す図である。 A diagram showing the intermediate form of the 2D memory when a second-level wavelet decomposition is performed using a unit buffer.
ユニットバッファを用い第３レベルのウェーブレット分解を実行する場合の２Ｄメモリの中間型式を示す図である。 A diagram showing the intermediate form of the 2D memory when a third-level wavelet decomposition is performed using a unit buffer.

Explanation of symbols

101 Input image data
102 Reversible wavelet transform block
103 Embedded order quantization block
104 Gray coding block
105 Horizontal context model block
106 Entropy coder
110 Method selection mechanism
111 Multi-component processing mechanism
1001 Header
1002 Coding unit
1003 LL coefficients
1004 First bit plane
1005 Second bit plane
1006 Final bit plane
1501 Multiplier
1502 Adder
1503 Multiplier
1504 Multiplier
1505 Adder
1601 Line access buffer
1602 Buffer
1901 Header
1902 LL coefficients
1903 Entropy-coded data
2101 Uncompressed original image
2102 Compression device
2103 Lossless compressed bitstream with markers
2104 Parser
2106 Channel or storage device
2107 Decompression device
2108 Decompressed image
2401 Lossless compressed data with markers
2402 Parser
2403 Channel
2404 Decompression device
2405 Display module
2500 Code stream
2501 Decompression including quantization
2502 Image processing or distortion model
2503 Lossless decompression
2504 Image processing or distortion model
2505 MSE or HVS difference model
2506 Alignment adjustment
2801 Coefficient
2802 Sign bit
2803 Tail-on information
2804 Context bit
2805 Coefficient bit

Claims

1. A data compression system comprising:
a memory for storing a code stream having a header with at least one marker;
at least one output device; and
a parser connected to the memory and connected to receive device characteristics from the at least one output device, the parser being operable to perform device-dependent quantization.

2. The data compression system according to claim 1, wherein the code stream is composed of lossless compressed data.

3. The data compression system according to claim 1, wherein the at least one marker indicates the number of components, the subsampling, and the alignment used for each tile in the code stream.

4. The data compression system according to claim 1, wherein the code stream includes a main header, and a local header is placed before each tile in the code stream.

5. The data compression system according to claim 4, wherein the main header applies to all tiles in the code stream and each local header applies only to its associated tile.

6. The data compression system according to claim 5, wherein at least one of the local headers has priority over the main header.

7. The data compression system according to claim 1, wherein the parser uses markers in the code stream to quantize the code stream.

8. The data compression system according to claim 7, wherein at least one of the markers indicates frequency information.

9. The data compression system according to claim 1, further comprising a compression device for generating the code stream.

10. The data compression system according to claim 1, wherein the parser comprises a quantization selection device.

11. The data compression system according to claim 10, wherein the quantization selection device performs conversion and quantization of a set of images by discarding bit planes of various coefficients.

12. The data compression system according to claim 1, wherein one of the tags indicates a level of importance within the data in each tile.

13. The data compression system according to claim 1, wherein the tag indicates an importance level locator signal, and the parser terminates according to the signal.

14. The data compression system according to claim 1, wherein the tag indicates the number of importance levels to be stored.

15. The data compression system according to claim 1, wherein the tag indicates the number of bytes to be stored.

16. The data compression system according to claim 1, wherein the tag includes an instruction for associating the importance level with the number of bytes in each tile.

17. The data compression system according to claim 7, wherein the at least one marker indicates the number of bytes of each importance level in each tile.
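
By way of illustration only, and not as part of the claims, the following Python sketch shows one way a parser might use per-tile byte counts for each importance level to truncate a tile to fit an output device. The `Tile` layout, the names `levels_within_budget` and `parse_tile`, and the idea of expressing the device characteristic as a single byte budget are all assumptions made for this example; the publication does not specify them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tile:
    # Hypothetical layout: level_bytes[i] is the byte count of importance
    # level i (most important first), as a tag in the tile header might record.
    level_bytes: List[int]
    payload: bytes


def levels_within_budget(level_bytes: List[int], byte_budget: int) -> int:
    """Return how many leading importance levels fit within byte_budget."""
    total = 0
    kept = 0
    for size in level_bytes:
        if total + size > byte_budget:
            break
        total += size
        kept += 1
    return kept


def parse_tile(tile: Tile, byte_budget: int) -> Tile:
    """Keep only whole importance levels that fit; drop the remainder.

    Dropping trailing importance levels is the quantization step here:
    the less important data is simply not forwarded to the device.
    """
    kept = levels_within_budget(tile.level_bytes, byte_budget)
    cut = sum(tile.level_bytes[:kept])
    return Tile(level_bytes=tile.level_bytes[:kept], payload=tile.payload[:cut])
```

In this sketch a low-resolution monitor would supply a small `byte_budget` and receive only the first few importance levels, while a device that supplies a budget at least as large as the whole tile receives the payload unchanged, which corresponds to lossless delivery.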