JP2000188692A

JP2000188692A - Data processing method

Info

Publication number: JP2000188692A
Application number: JP10363700A
Authority: JP
Inventors: Akira Saito; 明斉藤
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1998-12-22
Filing date: 1998-12-22
Publication date: 2000-07-04
Anticipated expiration: 2018-12-22
Also published as: JP3839604B2

Abstract

PROBLEM TO BE SOLVED: To provide a data compression method that can compress, at high speed, data for which comparatively long match lengths are obtained. SOLUTION: When an image data stream corresponding to a plurality of lines of line length L is coded and compressed, a specific symbol in the stream and a plurality of symbols headed by that specific symbol are taken as the coding target symbols. The comparison target symbols are the symbol at offset 1, adjacent to the specific symbol on the upstream side, and the symbols at offsets L+n to L-n, which lie upstream and downstream of the symbol at offset L, one line length away from the specific symbol. The coding target symbols and the comparison target symbols are compared sequentially, and match-length detection ends as soon as a predetermined maximum match length is obtained.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a data processing method, and more specifically to a data processing method that compresses image data efficiently using data compression techniques based on dictionary-based methods, as typified by LZ77 and LZ78.

[0002]

[Description of the Related Art] The origin of current dictionary-based data compression methods can be found in the paper "A Universal Algorithm for Sequential Data Compression", published by Abraham Lempel and Jacob Ziv in IEEE Transactions on Information Theory in 1977. The method is commonly known as the sliding-dictionary method of Lempel-Ziv coding, or the LZ77 method.

[0003] It is introduced in, for example, Seiji Munakata: "Ziv-Lempel data compression method", Information Processing, Vol. 26, No. 1 (1985).

[0004] The LZ77 algorithm divides the data to be encoded into the longest sequences that match a sequence starting at any position in the past data, and encodes each one as a copy of that past sequence.

[0005] Specifically, as shown in FIG. 2, the method uses a moving window that stores input data already encoded and a look-ahead buffer that stores the data about to be encoded. The data sequence in the look-ahead buffer is matched against every subsequence of the data in the moving window to find the longest matching subsequence in the moving window.

[0006] To identify this longest subsequence in the moving window, the triple consisting of "the start position of the longest subsequence", "the match length", and "the next symbol, which caused the mismatch" is encoded.

[0007] Next, the data sequence just encoded is moved from the look-ahead buffer into the moving window, and a new data sequence of the same length is read into the look-ahead buffer.

[0008] By repeating this process, the data is decomposed into subsequences and encoded.
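The search described in paragraphs [0005] to [0008] can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular codec: the type `lz77_token`, the function `lz77_longest_match`, and the parameters `window` and `max_len` are hypothetical names introduced here. The sketch tries every offset in the moving window, which is exactly the exhaustive comparison whose cost motivates the invention.

```c
#include <assert.h>
#include <stddef.h>

/* One LZ77 token: where the match starts, how long it is, and the
   literal symbol that follows the match (the symbol that broke it). */
typedef struct {
    size_t offset;       /* distance back from the encoding position; 0 = no match */
    size_t length;       /* number of matching bytes */
    unsigned char next;  /* first symbol after the match */
} lz77_token;

/* Exhaustively search up to `window` bytes before `pos` for the longest
   match with the data starting at `pos`. Hypothetical helper. */
lz77_token lz77_longest_match(const unsigned char *data, size_t n,
                              size_t pos, size_t window, size_t max_len)
{
    assert(pos < n);
    lz77_token best = {0, 0, data[pos]};
    size_t limit = pos < window ? pos : window;

    for (size_t off = 1; off <= limit; off++) {
        size_t len = 0;
        /* Keep one symbol after the match in reserve to serve as `next`. */
        while (len < max_len && pos + len + 1 < n &&
               data[pos + len - off] == data[pos + len]) {
            len++;
        }
        if (len > best.length) {
            best.offset = off;
            best.length = len;
            best.next = data[pos + len];
        }
    }
    return best;
}
```

An encoder would emit the returned triple, advance the encoding position by `length + 1`, and repeat. The work per token grows with the window size multiplied by the match length.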

[0009] Many improvements to this basic data compression technique have been proposed.

[0010] One example is the LZSS coding scheme (T. C. Bell, "Better OPM/L Text Compression", IEEE Trans. Commun., Vol. COM-34, No. 12, Dec. (1986)), which adds a flag that distinguishes a coded token from raw data and outputs the raw data whenever the coded token would be longer than the raw data.

[0011] Another reference is M. Nelson: Data Compression Handbook, revised 2nd edition, Toppan (1996), ISBN4-8101-8605-9.

[0012] In recent years, office automation (OA) systems such as scanners, printers, and digital copiers have become widespread and are moving toward higher speed and higher resolution.

[0013] These devices must process large volumes of image data at high speed, so it is essential to reduce the amount of data to be processed by applying fast, high-ratio data compression.

[0014] Conventional techniques for such compression include standardized methods such as MMR and JBIG, but the compression ratio of MMR tends to deteriorate on fine-detail images.

[0015] JBIG, which is close to the best in terms of compression ratio, is fundamentally a pixel-by-pixel process, so there is a limit to how fast it can run, and it could not be adopted in high-speed systems.

[0016] The dictionary-based compression methods described above, by contrast, are fundamentally byte-by-byte processes. They can therefore run far faster than JBIG, and their compression ratio does not deteriorate as much as that of MMR on fine-detail images, which makes them suitable for high-speed, high-resolution OA systems.

[0017]

[Problems to Be Solved by the Invention] However, in a conventional LZ77-based data compression apparatus, finding the longest matching partial data string in the moving window during encoding requires comparing the data string about to be encoded against the data strings at every position in the moving window.

[0018] That is, as shown in FIG. 2, the data string about to be encoded is compared with the data string starting at offset 1 in the moving window, the data string starting at offset 2, ..., and the data string starting at offset n (where n is the size of the moving window), in order to find the offset that yields the maximum match length.

[0019] A scheme that searches for the maximum match length in this way has the drawback that processing slows down when a long match is obtained at every offset. For example, when a white area of an image is encoded, the comparison at every offset yields the longest match (for example, 256). Counting the comparison of one data item as one comparison, 256 comparisons are performed per offset, so the time spent comparing data strings increases dramatically.

[0020] The present invention has been made in view of the above circumstances. Its object is to provide a data compression method that can compress, at high speed, data in which identical values run consecutively, that is, data for which comparatively long match lengths are obtained.

[0021]

[Means for Solving the Problems] To solve the above problems and achieve the object, the data compression method of the present invention is as follows.

[0022] The present invention is a data processing method that encodes, and thereby compresses, an image data stream obtained by scanning an image and corresponding to a plurality of lines of line length L of that image, wherein: a specific symbol contained in the image data stream and a plurality of symbols headed by the specific symbol are taken as the encoding target symbols; the symbol at offset 1, adjacent to the specific symbol on the upstream side, and the plurality of symbols at offsets L+n to L-n, on the upstream and downstream sides of the stream centered on the symbol at offset L, one line length away from the specific symbol, are taken as the comparison target symbols; and, when the encoding target symbols and the comparison target symbols are compared sequentially to detect a match length, detection of the match length ends at the point when a predetermined maximum match length is obtained, and encoding is performed based on the detected match length.

[0023] The present invention is a data processing method that encodes, and thereby compresses, an image data stream obtained by scanning an image and corresponding to a plurality of lines of line length L of that image, wherein: a specific symbol contained in the image data stream and a plurality of symbols headed by the specific symbol are taken as the encoding target symbols; the symbol at offset 1, adjacent to the specific symbol on the upstream side, and the plurality of symbols at offsets L+n to L-n, on the upstream and downstream sides of the stream centered on the symbol at offset L, one line length away from the specific symbol, are taken as the comparison target symbols; and, when the encoding target symbols and the comparison target symbols are compared sequentially and encoded, the symbols at offsets 1, L, L+1, and L-1 are preferentially selected as the targets of comparison for the specific symbol contained in the encoding target symbols.

[0024] The present invention is a data processing method that encodes, and thereby compresses, an image data stream obtained by scanning an image and corresponding to a plurality of lines of line length L of that image, wherein: a specific symbol contained in the image data stream and a plurality of symbols headed by the specific symbol are taken as the encoding target symbols; the symbol at offset 1, adjacent to the specific symbol on the upstream side, and the plurality of symbols at offsets L+n to L-n, on the upstream and downstream sides of the stream centered on the symbol at offset L, one line length away from the specific symbol, are taken as the comparison target symbols; when the encoding target symbols and the comparison target symbols are compared sequentially and encoded, the comparison priority for the specific symbol contained in the encoding target symbols is, from highest priority downward, the symbols at offsets L, 1, L-1, L+1, L-2, L+2, ..., L-n, and L+n; and, when the encoding target symbols and the comparison target symbols are compared sequentially to detect a match length, detection of the match length ends at the point when a predetermined maximum match length is obtained, and encoding is performed based on the detected match length.

[0025]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings.

[0026] First, the selection of comparison points from two-dimensionally nearby positions is described.

[0027] As described for the prior art, when LZ77-based compression is realized in software, the simplest implementation compares the data string starting at the encoding position with the data strings starting at every position in the moving window, and detects the longest matching position. With this approach, processing speed drops markedly when the moving window is made large.

[0028] In the first invention, therefore, while still based on LZ77, the data string about to be encoded is not compared with the data strings starting at every position in the moving window. Only data strings starting at positions with a high likelihood of matching are compared, which improves processing speed. For example, this is realized with about 16 or 32 comparison positions.

[0029] Simply reducing the number of comparison positions, however, would be expected to lower the chance of a match and hence the compression ratio. In the first invention, the positions to compare are selected by focusing on the periodicity of image data. That is, rather than comparing every substring in the moving window, only the data positions that are likely to match, given the periodicity of image data, are compared.

[0030] The principle of the first invention is described below with reference to FIG. 1. Here the unit of data to be compressed is the byte. Given the two-dimensional locality of image data, the bytes most similar to a given byte are those directly above, below, left, and right of it. Assuming the usual raster scan from top left to bottom right as the input order of the image data, the right and lower neighbors of a given byte have yet to be input and are not yet in the moving window, so they cannot be used as comparison targets. The left neighbor is the most recent data, input immediately before in input order, and sits at offset 1 in the moving window. The upper neighbor sits at offset L in the moving window, where L is the line length (width) of the input image data in bytes. This requires the size of the moving window to be at least L. Conventional LZ-family codecs, even when the input was image data, ignored its periodicity and in effect searched for matching points in the leftward direction only. Here, in addition to the leftward direction, the position adjacent in the upward direction and its surroundings are chosen as comparison points.

In FIG. 1, the diamond indicates the first byte of the data about to be encoded, the equals signs indicate the data sequence about to be encoded, the filled squares indicate byte positions in the moving window that are used as comparison points, and the open squares indicate byte positions in the moving window that are not. That is, the data strings starting at 16 offset positions (1, L-7, L-6, L-5, L-4, L-3, L-2, L-1, L, L+1, L+2, L+3, L+4, L+5, L+6, L+7) are chosen as comparison targets. L is the line length of the image (its size in the main scanning direction) and is set externally in advance.
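As a rough illustration of how the comparison positions in FIG. 1 might be generated, the sketch below builds the list of offsets 1 and L-n through L+n. The function names `candidate_offsets` and `offset_usable` and the usability check are assumptions introduced here; the patent text itself only lists the offsets for n = 7.

```c
#include <assert.h>
#include <stddef.h>

/* Fill `out` with offset 1 followed by L-n ... L+n, the order in which the
   text lists them. Returns the number of offsets written (2n + 2).
   `out` must have room for 2n + 2 entries, and L must exceed n + 1 so that
   offset 1 and offset L-n are distinct. Hypothetical helper. */
size_t candidate_offsets(size_t L, size_t n, size_t *out)
{
    assert(L > n + 1);
    size_t k = 0;
    out[k++] = 1;                        /* left neighbor */
    for (size_t off = L - n; off <= L + n; off++)
        out[k++] = off;                  /* upper neighbor and its surroundings */
    return k;
}

/* A candidate can be used only if that many bytes have already been
   encoded and the offset still lies inside the moving window. */
int offset_usable(size_t off, size_t pos, size_t window)
{
    return off <= pos && off <= window;
}
```

With L = 100 and n = 7 this yields the 16 offsets 1, 93, 94, ..., 107. Near the start of the image some of them point before the first byte, which is why a check such as `offset_usable` is likely to be needed in practice.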

[0031] Reducing the number of comparison positions is also advantageous in that the offset code can be made shorter. For example, with a moving window of 2 kB, the conventional approach needs 2k distinct offset codes, whereas the example of FIG. 1 needs only 16. If a simple code is chosen, the offset code is 11 bits long in the conventional approach but only 4 bits in the present invention.
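The 11-bit and 4-bit figures follow if the "simple code" is read as a fixed-length binary code, which is an assumption made here. Under that reading, a hypothetical helper `fixed_code_bits` gives the smallest code length that can distinguish a given number of offsets:

```c
#include <assert.h>

/* Smallest number of bits b such that 2^b >= count, i.e. the length of a
   fixed-length code able to distinguish `count` offsets.
   The upper bound on `count` keeps the shift well defined. */
unsigned fixed_code_bits(unsigned long count)
{
    assert(count > 0 && count <= (1UL << 30));
    unsigned bits = 0;
    while ((1UL << bits) < count)
        bits++;
    return bits;
}
```

For a 2 kB window, `fixed_code_bits(2048)` gives 11, and for the 16 comparison positions `fixed_code_bits(16)` gives 4.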

[0032] Furthermore, in the present invention, the order in which the match length is determined at each offset is combined with a condition for cutting off the longest-match search, in order to speed up compression. For the 16 comparison points of the present invention, the straightforward way of finding the longest match is as follows.

[0033] Module that simply finds the longest match over the 16 comparison points
{
  Find the match length between the data string at the encoding position and the data string at offset 1; call the result len1
  Find the match length between the data string at the encoding position and the data string at offset L-7; call the result len(L-7)
  Find the match length between the data string at the encoding position and the data string at offset L-6; call the result len(L-6)
  ...
  Find the match length between the data string at the encoding position and the data string at offset L+6; call the result len(L+6)
  Find the match length between the data string at the encoding position and the data string at offset L+7; call the result len(L+7)
  Return the maximum of len1, len(L-7), len(L-6), ..., len(L+7) and the corresponding offset
}

When the match length is determined at each offset, an upper limit is set by the structure of the match-length code, and comparison is cut off at that limit even if a longer match is available. For example, if the maximum length of the match-length code is 256, comparison stops once the detected match reaches 256, and 256 is taken as the match length.
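The module of paragraph [0033] can be sketched in C as follows. This is an illustrative reading, not the patent's code: `match_len`, `longest_match_naive`, and the `compares` counter are hypothetical names, and the counter is included only so that the cost discussed in the text can be observed.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MATCH 256   /* upper limit given by the match-length code */

/* Match length between data[pos...] and data[pos-off...], capped at
   MAX_MATCH. *compares is incremented once per byte comparison. */
static size_t match_len(const unsigned char *data, size_t n, size_t pos,
                        size_t off, unsigned long *compares)
{
    size_t len = 0;
    while (len < MAX_MATCH && pos + len < n) {
        (*compares)++;
        if (data[pos + len - off] != data[pos + len])
            break;
        len++;
    }
    return len;
}

/* Try every candidate offset and keep the longest match. Candidates
   larger than pos are skipped because they point before the data.
   Returns the best length and stores its offset in *best_off (0 = none). */
size_t longest_match_naive(const unsigned char *data, size_t n, size_t pos,
                           const size_t *offsets, size_t n_offsets,
                           size_t *best_off, unsigned long *compares)
{
    assert(pos < n);
    size_t best_len = 0;
    *best_off = 0;
    for (size_t i = 0; i < n_offsets; i++) {
        if (offsets[i] == 0 || offsets[i] > pos)
            continue;
        size_t len = match_len(data, n, pos, offsets[i], compares);
        if (len > best_len) {
            best_len = len;
            *best_off = offsets[i];
        }
    }
    return best_len;
}
```

On a uniform (all-white) region every one of the 16 offsets reaches the 256-byte cap, so the counter ends at 16 x 256 = 4096 comparisons for a single token.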

[0034] The conventional approach of finding the maximum match length has the drawback that processing slows down when a long match is obtained at every offset. For example, when a white area of an image is encoded, the comparison at every offset yields the longest match (for example, 256). Counting the comparison of one data item as one comparison, 256 comparisons are made per offset, for a total of 4096 comparisons over the 16 offsets.

[0035] The present invention exploits the fact that 256 is the maximum match length: as soon as a match length of 256 is obtained, the comparisons at the remaining offsets are skipped, which speeds up processing. However, Huffman codes are normally used for both the offset code and the match-length code, so short offset codes are assigned to the offsets likely to yield long matches. For example, with the offsets chosen as in the present invention, positions closer to the encoding position are considered to have higher similarity, so it is best to assign the shortest code to L and then progressively longer codes in the order 1, L-1, L+1, L-2, L+2, ..., L-7, L+7. If cut-off at the maximum match length is then introduced naively, as follows, an offset other than the one with the shortest code may be selected, which is not optimal.

[0036] Module that simply finds the longest match over the 16 comparison points of the present invention, with cut-off introduced
{
  Find the match length between the data string at the encoding position and the data string at offset 1; call the result len1
  If len1 = 256, end the module with offset = 1, match length = 256
  Find the match length between the data string at the encoding position and the data string at offset L-7; call the result len(L-7)
  If len(L-7) = 256, end the module with offset = L-7, match length = 256
  Find the match length between the data string at the encoding position and the data string at offset L-6; call the result len(L-6)
  If len(L-6) = 256, end the module with offset = L-6, match length = 256
  ...
  Find the match length between the data string at the encoding position and the data string at offset L+6; call the result len(L+6)
  If len(L+6) = 256, end the module with offset = L+6, match length = 256
  Find the match length between the data string at the encoding position and the data string at offset L+7; call the result len(L+7)
  If len(L+7) = 256, end the module with offset = L+7, match length = 256
  Return the maximum of len1, len(L-7), len(L-6), ..., len(L+6), len(L+7) and the corresponding offset
}

When the search runs in this order, offset 1 is selected in the white areas of an image. Because the shortest code was assigned to offset L, the resulting code is not optimal. To improve on this, the search order should follow the offset codes from shortest to longest (non-decreasing length), as follows.

[0037] Cut-off introduced and search order improved (the scheme of the present invention)
{
  Find the match length between the data string at the encoding position and the data string at offset L; call the result lenL
  If lenL = 256, end the module with offset = L, match length = 256
  Find the match length between the data string at the encoding position and the data string at offset 1; call the result len1
  If len1 = 256, end the module with offset = 1, match length = 256
  Find the match length between the data string at the encoding position and the data string at offset L-1; call the result len(L-1)
  If len(L-1) = 256, end the module with offset = L-1, match length = 256
  Find the match length between the data string at the encoding position and the data string at offset L+1; call the result len(L+1)
  If len(L+1) = 256, end the module with offset = L+1, match length = 256
  ...
  Find the match length between the data string at the encoding position and the data string at offset L-7; call the result len(L-7)
  If len(L-7) = 256, end the module with offset = L-7, match length = 256
  Find the match length between the data string at the encoding position and the data string at offset L+7; call the result len(L+7)
  If len(L+7) = 256, end the module with offset = L+7, match length = 256
  Return the maximum of len1, len(L-7), len(L-6), ..., len(L+6), len(L+7) and the corresponding offset
}

In this case, even if a match of 256 occurs and the remaining match-length search is abandoned, the offset with the shortest offset code is always the one selected, so improved coding efficiency and faster processing are achieved together. For example, for a white area of an image the original method needed a total of 4096 comparisons, whereas the scheme of the present invention needs only 256, and the code generated is the same.
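A minimal sketch of the priority-ordered search with cut-off might look like this. All names (`priority_offsets`, `longest_match_cutoff`, `cutoff`, `compares`) are assumptions introduced for illustration. The `cutoff` parameter is a generalization made here: setting it equal to `max_len` reproduces the scheme of paragraph [0037], and the sketch stops once a match of at least `cutoff` bytes has been found.

```c
#include <assert.h>
#include <stddef.h>

/* Match length between data[pos...] and data[pos-off...], capped at
   max_len. *compares counts individual byte comparisons. */
static size_t match_len(const unsigned char *data, size_t n, size_t pos,
                        size_t off, size_t max_len, unsigned long *compares)
{
    size_t len = 0;
    while (len < max_len && pos + len < n) {
        (*compares)++;
        if (data[pos + len - off] != data[pos + len])
            break;
        len++;
    }
    return len;
}

/* Offsets in priority order L, 1, L-1, L+1, ..., L-n, L+n.
   `out` needs 2n + 2 entries; returns the number written. */
size_t priority_offsets(size_t L, size_t n, size_t *out)
{
    assert(L > n + 1);
    size_t k = 0;
    out[k++] = L;
    out[k++] = 1;
    for (size_t d = 1; d <= n; d++) {
        out[k++] = L - d;
        out[k++] = L + d;
    }
    return k;
}

/* Search the offsets in the given order and stop as soon as the best
   match so far is at least `cutoff` bytes long. */
size_t longest_match_cutoff(const unsigned char *data, size_t n, size_t pos,
                            const size_t *offsets, size_t n_offsets,
                            size_t max_len, size_t cutoff,
                            size_t *best_off, unsigned long *compares)
{
    assert(pos < n && cutoff > 0 && cutoff <= max_len);
    size_t best_len = 0;
    *best_off = 0;
    for (size_t i = 0; i < n_offsets; i++) {
        if (offsets[i] == 0 || offsets[i] > pos)
            continue;                       /* points before the data */
        size_t len = match_len(data, n, pos, offsets[i], max_len, compares);
        if (len > best_len) {
            best_len = len;
            *best_off = offsets[i];
        }
        if (best_len >= cutoff)
            break;                          /* skip the remaining offsets */
    }
    return best_len;
}
```

On a uniform region with `max_len` and `cutoff` both 256, the first offset tried (L) already reaches the cap, so the search ends after 256 comparisons and reports offset L, the one assumed to carry the shortest code.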

[0038] To gain speed at some cost in coding efficiency, the search cut-off condition of the present invention may be set lower than the maximum value of the match-length code. For example, when the maximum of the match-length code is 256, stopping the search once a match exceeding 128 is obtained achieves higher speed at a slight loss of coding efficiency.

[0039] Next, the second invention is described.

[0040] The module that finds the longest matching position compares the data string at the encoding position with the data string starting at each offset position to determine the match length. In the present invention this comparison is made for 16 offsets in total. When the encoding unit is the byte, a simple implementation conventionally compares one byte at a time. That is acceptable when compression runs on an 8-bit CPU, but for the 32-bit and similar CPUs now common it can be sped up as follows. Taking the natural data length of the CPU to be 32 bits, the second invention is implemented as shown below.

  int search_maechlen(BYTE *offset, BYTE *cp) {
    count = 0
    (If the difference between offset and cp is not a multiple of 4 bytes, compare byte by byte as in the conventional example; otherwise do the following)
    (Compare 1 to 3 bytes until offset and cp reach a 4-byte boundary)
    while (*(int *)offset == *(int *)cp) { count += 4; offset += 4; cp += 4; }  // compare in 4-byte units
    (Check whether 1 to 3 bytes match within the final mismatching 4 bytes, and if so add them to count)
    return (count)
  }

If the offset is not a multiple of 4 bytes, comparison in 4-byte units is not possible, so the comparison is done byte by byte as before. If the offset is a multiple of 4 bytes, the portion aligned to 4-byte boundaries can be compared in 4-byte units, so first 1 to 3 bytes are compared byte by byte to check whether the strings match up to the 4-byte boundary. Once aligned to a 4-byte boundary, the comparison proceeds quickly in 4-byte units and continues until a mismatch occurs or the upper limit of the match-length code is reached. When a mismatch occurs, 1 to 3 bytes may still match within the last 4-byte unit, so those bytes are compared one at a time.
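The word-wise comparison described above could be written in portable C roughly as follows. This is a sketch under assumptions: the names `load32` and `match_len_words` are introduced here, the cap `max_len` is passed explicitly, and 4-byte words are loaded through `memcpy` rather than through a pointer cast, which sidesteps alignment and aliasing problems while typically compiling to a single load. The caller is assumed to guarantee that `max_len` bytes are readable at both positions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load 4 bytes as one 32-bit word. */
static uint32_t load32(const unsigned char *p)
{
    uint32_t w;
    memcpy(&w, p, sizeof w);
    return w;
}

/* Match length between cp[0...] and ref[0...], capped at max_len.
   `ref` is the earlier (offset) position, `cp` the encoding position. */
size_t match_len_words(const unsigned char *ref, const unsigned char *cp,
                       size_t max_len)
{
    assert(cp > ref);
    size_t count = 0;
    size_t diff = (size_t)(cp - ref);

    if (diff % 4 == 0) {
        /* Byte steps until cp sits on a 4-byte boundary (0 to 3 bytes). */
        while (count < max_len && ((uintptr_t)(cp + count) % 4) != 0) {
            if (ref[count] != cp[count])
                return count;
            count++;
        }
        /* Word steps while whole 4-byte units still match. */
        while (count + 4 <= max_len &&
               load32(ref + count) == load32(cp + count))
            count += 4;
    }
    /* Byte steps: the whole comparison when the offset is not a multiple
       of 4, otherwise the 0 to 3 bytes left inside the last unit. */
    while (count < max_len && ref[count] == cp[count])
        count++;
    return count;
}
```

The result is the same as a plain byte loop; only the number of comparison operations changes when the offset is a multiple of 4.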

[0041] On a typical 32-bit CPU, a 1-byte comparison and a 4-byte comparison take the same number of cycles. Performing the comparison as in the present invention can therefore achieve a speedup of up to four times.

[0042] Next, the third invention is described.

【００４３】 In the decompression process, a code is decoded to obtain a match offset and a match length, data of the match length is copied from the match offset in the data buffer, and the copied data is appended to the data output as newly decoded data; this operation is repeated. A straightforward conventional implementation copies memory one byte at a time, as shown below.

    void matchl_copy(BYTE *offset, BYTE *cp, int length) {
        memcpy_in_BYTE(cp, offset, length)
    }

As with the second invention, given that recent implementations target CPUs of 32 bits or the like, the copy can be sped up as follows. Assuming that the natural data length of the CPU is 32 bits, the third invention is implemented as shown below.

    void matchl_copy(BYTE *offset, BYTE *cp, int length) {
        (If the difference between offset and cp is not a multiple of 4 bytes, copy in 1-byte units as in the conventional example; otherwise perform the following processing.)
        (Copy 1 to 3 bytes until offset and cp reach a 4-byte boundary.)
        memcpy_in_4BYTE(cp, offset, length)    // memory copy in 4-byte units
        (If there is a remainder, copy the remaining 1 to 3 bytes.)
    }

If the offset is not a multiple of 4 bytes, memory copy in 4-byte units is not possible, so the copy is performed in 1-byte units as in the conventional example. If the offset is a multiple of 4 bytes, the part aligned to a 4-byte boundary can be copied in 4-byte units, so 1 to 3 bytes are first copied in 1-byte units up to the 4-byte boundary. Once the pointers are aligned to the 4-byte boundary, high-speed memory copy is performed in 4-byte units until the match length is reached. Finally, if a remainder shorter than 4 bytes is left, it is copied in 1-byte units.
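A minimal C sketch of such a word-wise match copy is given below. The function name `match_copy_words` and its `distance` parameter are assumptions made for illustration. Unlike the text, which requires the offset to be a multiple of 4 so that both pointers can be aligned, this sketch moves each word through `memcpy` and only requires the distance to be at least 4 bytes, so that a source word never overlaps the word being written.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy `length` bytes of an LZ77 match to `dst`, where
 * the source starts `distance` bytes before dst inside the same output
 * buffer. The caller guarantees distance >= 1 and enough room after dst. */
void match_copy_words(uint8_t *dst, size_t distance, size_t length)
{
    const uint8_t *src = dst - distance;
    size_t i = 0;

    /* 4-byte steps are safe only when the source word lies entirely in
     * data that has already been written, i.e. when distance >= 4. */
    if (distance >= 4) {
        for (; i + 4 <= length; i += 4) {
            uint32_t w;
            memcpy(&w, src + i, sizeof w);   /* load one word  */
            memcpy(dst + i, &w, sizeof w);   /* store one word */
        }
    }

    /* Remainder (1 to 3 bytes), or the whole match when distance < 4:
     * byte by byte, so an overlapping match such as distance 1 repeats
     * the preceding data as an LZ77 decoder expects. */
    for (; i < length; i++)
        dst[i] = src[i];
}
```

The byte-wise fallback matters for runs: a match with distance 1 and length 7 after a single `x` must produce eight `x` bytes, which a plain non-overlapping block copy would not guarantee.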

【００４４】 On a typical 32-bit CPU, a 1-byte memory copy and a 4-byte memory copy are processed in the same number of cycles. Performing the memory copy as in the third invention can therefore achieve a speedup of up to four times.

【００４５】 Next, the main points of the present invention are summarized.

【００４６】 When the compression and decompression of the Lempel-Ziv method (sliding window method) are expressed in software:

<Compression> If a long match is obtained at an offset examined earlier, the other offsets are not examined.

(Conventional)

    for (from offset 1 to offset N) {
        len 1 = search_matchlen(offset 1, current_pointer)
        len 2 = search_matchlen(offset 2, current_pointer)
        len 3 = search_matchlen(offset 3, current_pointer)
        …
    }
    maxlen = max(len 1, len 2 ...)

(Present invention)

    for (from offset 1 to offset N) {
        if ((len 1 = search_matchlen(offset 1, current_pointer)) >= thresh_len) break
        if ((len 2 = search_matchlen(offset 2, current_pointer)) >= thresh_len) break
        if ((len 3 = search_matchlen(offset 3, current_pointer)) >= thresh_len) break
        …
    }
    maxlen = max(len 1, len 2 ...)

The search_matchlen function performs its comparison in the native word length of the processor (4 bytes for a 32-bit processor).

<Decompression> When the original image is reconstructed from match codes, if the word boundaries coincide, the memcpy operation is executed with 4-byte copy instructions.
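The early-terminating search summarized above might look like the following C sketch. The names `match_t`, `match_len`, and `find_match` are hypothetical, and the byte-wise `match_len` stands in for whatever match-length routine is actually used; the point illustrated is only the `break` once a match of at least `thresh_len` is found.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t offset;   /* distance back from the current position */
    size_t length;   /* number of matching bytes                */
} match_t;

/* Byte-wise match length between the data `offset` bytes before cur and
 * the data at cur, capped at max_len. */
static size_t match_len(const uint8_t *cur, size_t offset, size_t max_len)
{
    const uint8_t *ref = cur - offset;
    size_t n = 0;
    while (n < max_len && ref[n] == cur[n])
        n++;
    return n;
}

/* Try the candidate offsets in the given priority order and stop as soon
 * as one of them yields a match of at least thresh_len bytes. The caller
 * guarantees that every offset stays inside the buffer and that max_len
 * bytes are readable at cur. */
match_t find_match(const uint8_t *cur, const size_t *offsets, size_t n_offsets,
                   size_t max_len, size_t thresh_len)
{
    match_t best = {0, 0};
    for (size_t i = 0; i < n_offsets; i++) {
        size_t len = match_len(cur, offsets[i], max_len);
        if (len > best.length) {
            best.offset = offsets[i];
            best.length = len;
        }
        if (len >= thresh_len)
            break;   /* long enough: the remaining offsets are skipped */
    }
    return best;
}
```

Setting `thresh_len` equal to `max_len` stops only when the maximum match length expressible by the code is reached; a smaller value trades some compression ratio for speed.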

【００４７】 As described above, when an LZ77-based compression apparatus is implemented in software, this invention exploits the periodicity of image data to search efficiently for the offset that gives the longest match, and ends the match length search as soon as that offset is found. Unnecessary match length searches can thus be omitted, and the compression processing speed can be improved. In addition, when the match length is examined at each offset and the unit of compression (for example, 1 byte) differs from the native word length of the processor on which the compression software runs (for example, 4 bytes for a 32-bit processor), processing efficiency can be improved by performing the match comparison in the native word length of the processor. Furthermore, when the decompression processing is implemented in software, processing efficiency can also be improved by performing the memory copy in the native word length of the processor when decoding match codes.

【００４８】[0048]

[Effects of the Invention] According to the present invention, it is possible to provide a data compression method capable of compressing at high speed compression target data in which the same data continues, that is, compression target data from which a relatively long match length is obtained.

【００４９】 (1) When a conventional LZ77-based data compression apparatus is implemented in software, the data string to be encoded is compared with the data strings at all of the candidate offset positions, so wasteful comparisons are performed in areas that are entirely white, such as the margins of a document. This invention focuses on the periodicity of image data and compares starting from the offset positions that are closest in the two-dimensional sense. As soon as an offset position is found that reaches the maximum match length defined by the match code, comparison with the other offset positions is abandoned, which shortens the compression processing time.
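One way to produce the "two-dimensionally close" comparison order is to enumerate the offsets as L, 1, L-1, L+1, L-2, L+2, ..., L-n, L+n, the priority order recited in claim 3. The C sketch below is illustrative only: the function name `candidate_offsets` is hypothetical, and it assumes L > n + 1 so that every generated offset is positive and distinct.

```c
#include <stddef.h>

/* Fill out[] with the candidate offsets in priority order:
 *   L, 1, L-1, L+1, L-2, L+2, ..., L-n, L+n
 * where L is the line length in symbols. Returns the number of offsets
 * written, which is 2n + 2; out[] must have room for that many entries.
 * Assumption: L > n + 1. */
size_t candidate_offsets(size_t L, size_t n, size_t *out)
{
    size_t k = 0;
    out[k++] = L;   /* symbol directly above, one line back      */
    out[k++] = 1;   /* immediately preceding symbol on this line */
    for (size_t d = 1; d <= n; d++) {
        out[k++] = L - d;   /* d symbols downstream of the symbol above */
        out[k++] = L + d;   /* d symbols upstream of the symbol above   */
    }
    return k;
}
```

For example, with L = 100 and n = 2 the order is 100, 1, 99, 101, 98, 102. The resulting list can be fed to a search loop that stops at the first offset reaching the maximum match length.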

【００５０】 A similar effect is obtained even when the maximum match length defined by the match code is not reached: a threshold for the match length is set in advance, and once a match length equal to or greater than the threshold is obtained, comparison with the subsequent offsets is abandoned. In this case the optimum compression ratio may not be obtained, but the loss in compression ratio can be limited by adjusting the threshold.

【００５１】 (2) When a conventional LZ77-based data compression apparatus is implemented in software, the data string to be encoded must be compared with the data string starting at the candidate offset position to determine the match length. Conventionally, when the processing unit of compression is 1 byte, the two data strings are compared in 1-byte units to obtain the match length. In this invention, when the native word length of the processor on which the compression software runs (for example, 4 bytes for a 32-bit processor) is larger than the unit of compression (for example, 1 byte), processing efficiency is improved by comparing the data strings in the native word length of the processor.

【００５２】 (3) When a conventional LZ77-based data decompression apparatus is implemented in software, decoding a match code requires copying original data of the match length from the offset position indicated by the match code, and this memory copy has conventionally been performed in the processing unit of compression (for example, 1 byte). In this invention, when the native word length of the processor on which the software runs (for example, 4 bytes for a 32-bit processor) is larger than the unit of compression (for example, 1 byte), processing efficiency is improved by performing the memory copy in the native word length of the processor.

[Brief description of the drawings]

FIG. 1 is a diagram for explaining an outline of the data compression method according to the present invention.

FIG. 2 is a diagram for explaining an outline of a conventional data compression method.

Claims

[Claims]

1. A data processing method for encoding and compressing an image data stream that is obtained by scanning an image and corresponds to a plurality of lines each having a line length L, wherein a specific symbol included in the image data stream and a plurality of symbols beginning with the specific symbol are taken as encoding target symbols; the symbol at offset 1, adjacent to the specific symbol on the upstream side, and a plurality of symbols at offsets L+n to L-n, located on the upstream and downstream sides of the stream around the symbol at offset L, which is one line length away from the specific symbol, are taken as comparison target symbols; and, when the encoding target symbols and the comparison target symbols are sequentially compared to detect a match length, detection of the match length is terminated when a predetermined maximum match length is obtained, and encoding is performed based on the detected match length.

2. A data processing method for encoding and compressing an image data stream that is obtained by scanning an image and corresponds to a plurality of lines each having a line length L, wherein a specific symbol included in the image data stream and a plurality of symbols beginning with the specific symbol are taken as encoding target symbols; the symbol at offset 1, adjacent to the specific symbol on the upstream side, and a plurality of symbols at offsets L+n to L-n, located on the upstream and downstream sides of the stream around the symbol at offset L, which is one line length away from the specific symbol, are taken as comparison target symbols; and, when the encoding target symbols and the comparison target symbols are sequentially compared and encoded, the symbols at offsets 1, L, L+1, and L-1 are preferentially selected as the symbols to be compared with the specific symbol included in the encoding target symbols.

3. A data processing method for encoding and compressing an image data stream that is obtained by scanning an image and corresponds to a plurality of lines each having a line length L, wherein a specific symbol included in the image data stream and a plurality of symbols beginning with the specific symbol are taken as encoding target symbols; the symbol at offset 1, adjacent to the specific symbol on the upstream side, and a plurality of symbols at offsets L+n to L-n, located on the upstream and downstream sides of the stream around the symbol at offset L, which is one line length away from the specific symbol, are taken as comparison target symbols; the symbols to be compared with the specific symbol included in the encoding target symbols are, in descending order of priority, the symbols at offsets L, 1, L-1, L+1, L-2, L+2, ..., L-n, and L+n; and, when the encoding target symbols and the comparison target symbols are sequentially compared to detect a match length, detection of the match length is terminated when a predetermined maximum match length is obtained, and encoding is performed based on the detected match length.

4. The data processing method according to claim 1, wherein, when the arithmetic processing unit responsible for the compression processing is an n-bit arithmetic processing unit, the data is compressed in units of n bits.

5. The data processing method according to claim 1, wherein, when one symbol is 8-bit data and the arithmetic processing unit responsible for the compression processing is an n-bit arithmetic processing unit, the data is compressed in units of n/8 symbols.

6. The data processing method according to claim 1, wherein, when part of the data included in an encoded data stream is copied in order to decompress the encoded data stream, and the arithmetic processing unit that copies the data is an n-bit arithmetic processing unit, the data is copied in units of n bits.

7. The data processing method according to claim 1, wherein, when part of the data included in an encoded data stream is copied in order to decompress the encoded data stream, one symbol is 8-bit data, and the arithmetic processing unit that copies the data is an n-bit arithmetic processing unit, the data is copied in units of n/8 symbols.