JP3034016B2 - Data compression and decompression method

Info

Publication number: JP3034016B2
Application number: JP31520790A
Authority: JP
Inventors: 佳之岡田; 広隆千葉; 茂吉田; 泰彦中野
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1990-11-20
Filing date: 1990-11-20
Publication date: 2000-04-17
Anticipated expiration: 2015-04-17
Also published as: JPH04185118A

Description

Detailed Description of the Invention

[Table of Contents]

Summary
Industrial applications
Conventional technology (FIGS. 9 to 13)
Problems to be solved by the invention
Means for solving the problem (FIG. 1)
Action
Embodiments
(a) Description of the first embodiment (FIGS. 2 to 5)
(b) Description of the second embodiment (FIG. 6)
(c) Description of the third embodiment (FIG. 7)
(d) Description of the fourth embodiment (FIG. 8)
(e) Description of other embodiments
Effects of the invention

[Summary]

The invention relates to a data compression and decompression method that compresses data by universal coding. Its object is to prevent data that spans unrelated data sequences from being registered, and thereby to improve the compression ratio. The starting point is a data compression method in which already-encoded data is divided into mutually distinct substrings that are registered in a dictionary, the dictionary is searched with the input data, and the input data is encoded by the reference number of the longest matching registered substring and then registered in the dictionary. In this method, local data delimiters and global data delimiters are detected in the input data. Between local delimiters, a new substring formed by appending the next input data to the matched substring is registered in the dictionary. Between global delimiters, each substring running from the first data of the global segment toward its last data is registered in the dictionary in succession.

[Industrial applications]

The present invention relates to a data compression and decompression method that compresses data by universal coding.

In recent years, computers have come to handle many kinds of data, such as character codes, vector information, and images, and the amount of data handled is growing rapidly. When large amounts of data are handled, removing the redundant parts of the data to compress it reduces the required storage capacity and shortens transmission time.

Universal coding has been proposed as a way to compress many kinds of data with a single scheme. The present invention is not limited to compressing character codes and can be applied to various kinds of data. In the following, however, the terminology of information theory is adopted: one word of data is called a character, and a sequence of any number of words is called a character string.

A representative universal code is the Ziv-Lempel code (for details see, for example, Munakata, "Ziv-Lempel Data Compression Method," Information Processing, Vol. 26, No. 1, 1985).

Two algorithms have been proposed for the Ziv-Lempel code: the universal type and the incremental parsing type. The LZSS code is an improvement of the universal algorithm (see T. C. Bell, "Better OPM/L Text Compression," IEEE Trans. on Commun., Vol. COM-34, No. 12, Dec. 1986). The LZW (Lempel-Ziv-Welch) code is an improvement of the incremental parsing algorithm (see T. A. Welch, "A Technique for High-Performance Data Compression," Computer, June 1984).

Data compression methods that use such universal coding are required to achieve higher compression ratios.

[Conventional technology]

FIG. 13 is an explanatory diagram of the prior art.

Universal coding is by nature an information-preserving (lossless) data compression method. Because it does not assume the statistical properties of the information source in advance, it can be applied to data of various types (character codes, object codes, and so on).

In a document image, the strokes of characters resemble one another in their straightness and curvature. In a halftone image, halftone dots are spread over the whole image, so an enormous number of change points appear; yet because the dots are periodic and identical in shape, the connection relationships of their contour lines are similar. Universal coding can remove the redundancy contained in this similarity and compress such images effectively.

Because universal coding is effective for image data in this way, a method has already been proposed that further raises the compression efficiency of universal coding for images. As shown in FIG. 9, it first performs preprocessing that converts the image data into two-dimensional information (intermediate data) by the pattern run-length method or a similar method, and then performs main processing that applies universal coding such as the LZW code.

The previously proposed methods include the following.

(1) The image data is viewed two-dimensionally and converted into information consisting of the type of black-and-white pattern and the number of times that pattern repeats, and universal coding is then applied (pattern run-length method).

(2) To trace the contour lines of the image data, the data is converted into information on the connection relationships of changing pixels, that is, pixels that change from white to black or from black to white, and universal coding is then applied (contour line method).

(3) The input image is converted into relative address data of the changing pixels. Statistical properties of the converted data, such as the continuity of changing pixels in the scan-line direction, are learned by universal coding so that the code is optimized and images of various characteristics are compressed efficiently (modified MMR method).

This universal coding is explained below using the LZW code as an example.

FIG. 10 is a flowchart of the LZW encoding process, and FIG. 11 is a flowchart of the LZW decoding process.

LZW coding uses a rewritable dictionary. It divides the input character codes (data) into mutually distinct character strings, numbers these strings in order of appearance, and registers them in the dictionary. The string currently being input is encoded as the number (index) of the longest matching string registered in the dictionary.

The encoding process is explained with reference to the flowchart of FIG. 10.

First, in step S1 (the word "step" is omitted below), a string consisting of a single character is registered in the dictionary in advance for every character as an initial value, and encoding then starts. In S1, the dictionary is searched with the first input character K to obtain its reference number ω, which becomes the prefix string.

Next, S2 reads the next character K of the input data, and S3 checks whether the character input has ended. The process then goes to S4 and searches the dictionary for the string (ωK) formed by appending the character K read in S2 to the prefix string ω obtained in S1 or S5.

If the string (ωK) is not in the dictionary at S4, the process goes to S6. There the current reference number ω is output as the codeword code(ω), the string (ωK) is registered in the dictionary under a new reference number, ω is replaced with the reference number of the character K read in S2, the dictionary address n is incremented, and the process returns to S2 to read the next character K.

If, on the other hand, the string (ωK) is in the dictionary at S4, S5 replaces ω with the reference number of (ωK) and the process returns to S2. The search for the longest match continues in this way until the string (ωK) can no longer be found in the dictionary.
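Steps S1 to S6 above can be sketched as follows. This is a minimal illustration of textbook LZW encoding, not a transcription of FIG. 10: the function name `lzw_encode`, the Python dictionary keyed by byte strings, and the unbounded dictionary size are assumptions made for the example.

```python
def lzw_encode(data: bytes) -> list:
    """Encode bytes with plain LZW (illustrative sketch of steps S1-S6)."""
    # S1: register every single-character string as the initial dictionary.
    dictionary = {bytes([i]): i for i in range(256)}
    if not data:
        return []
    omega = data[:1]              # prefix string, starts as the first character K
    codes = []
    for i in range(1, len(data)):         # S2/S3: read characters until input ends
        k = data[i:i + 1]
        candidate = omega + k
        if candidate in dictionary:       # S4 -> S5: keep extending the match
            omega = candidate
        else:                             # S4 -> S6: emit code, register (omega K)
            codes.append(dictionary[omega])
            dictionary[candidate] = len(dictionary)
            omega = k                     # restart matching from K
    codes.append(dictionary[omega])       # flush the final match
    return codes


print(lzw_encode(b"ABABABA"))  # [65, 66, 256, 258]
```

In the printed example, the strings "AB", "BA", and "ABA" are registered as indexes 256, 257, and 258, so the seven input characters are represented by four codes.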

The decoding process of FIG. 11 performs the reverse of the encoding of FIG. 10.

In the decoding of FIG. 11, as in encoding, a string consisting of a single character is registered in the dictionary in advance for every character as an initial value, and decoding then starts.

First, S1 reads the first code (reference number) and sets the current CODE as OLDcode. Because the first code always corresponds to the reference number of one of the single characters already registered in the dictionary, the character code(K) that matches the input code CODE is looked up and the character K is output. The output character K is also set in FINchar for the exception handling of S8 described later.

Next, the process goes to S2, reads the next code, and sets it in CODE as INcode.

S3 checks whether there is a new code, that is, whether the code input has ended. The process then goes to S4 and checks whether the code CODE read in S2 is defined (registered) in the dictionary.

Normally the input codeword has already been registered in the dictionary by earlier processing, so the process goes to S5 and reads the string code(ωK) corresponding to CODE from the dictionary. S6 temporarily pushes the character K on a stack, makes the reference number code(ω) the new CODE, and returns to S5. Steps S5 and S6 are repeated recursively until the reference number ω reduces to a single character. Finally, S7 pops the characters stacked in S6 in LIFO (Last In, First Out) order and outputs them. Also in S7, the string expressed as the pair (ω, K), formed from the previously used code ω and the first character K of the string restored this time, is registered in the dictionary under a new reference number.
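The decoding steps above, including the S8 exception, can be sketched as follows. This is an illustrative sketch, not the flowchart of FIG. 11: for brevity it stores whole strings in the dictionary instead of (ω, K) pairs unwound through a stack, and the name `lzw_decode` is an assumption made for the example.

```python
def lzw_decode(codes) -> bytes:
    """Decode a list of plain-LZW codes (illustrative sketch)."""
    # Initial dictionary: every single-character string, as in encoding.
    dictionary = {i: bytes([i]) for i in range(256)}
    if not codes:
        return b""
    old = codes[0]                        # S1: first code is a single character
    out = bytearray(dictionary[old])
    for code in codes[1:]:                # S2/S3: read codes until input ends
        if code in dictionary:            # S4: normal case, code already registered
            entry = dictionary[code]
        elif code == len(dictionary):
            # S8 exception: the encoder used the entry it had only just
            # registered, so rebuild it from the previous string.
            entry = dictionary[old] + dictionary[old][:1]
        else:
            raise ValueError("invalid LZW code: %d" % code)
        out += entry
        # S7: register (previous string + first character of this string).
        dictionary[len(dictionary)] = dictionary[old] + entry[:1]
        old = code
    return bytes(out)


print(lzw_decode([65, 66, 256, 258]))  # b'ABABABA'
```

The last code in the example, 258, is not yet in the decoder's dictionary when it arrives, which exercises the exception path.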

A method of preprocessing image data that increases the benefit of such universal coding of images is explained with reference to FIGS. 12 and 13, taking a previously proposed method as an example.

FIG. 12 is an explanatory diagram of the previously proposed two-dimensional coding of image data, and FIG. 13 is an explanatory diagram of a coding example.

Taking an 8-line image such as that of FIG. 12 as an example, the previously proposed pattern run-length method encodes the pattern formed by the 8 lines in the vertical direction and combines it with the horizontal length (run length) over which the same pattern continues.

In other words, the two-dimensional image is coded as a vertical 8-line pattern plus its horizontal run length (RL).

For example, let white be "0" and black be "1" and express the black-and-white state of each line. If all 8 lines in the vertical direction are white, the pattern code is "00000000". If this pattern continues for a horizontal length of 127, the run-length code is "01111111". The two codes together are coded as one unit.

The image of FIG. 12 is coded into the input symbols shown in FIG. 13.

This method treats the image as two-dimensional blocks, and the combination of pattern and run length provides two-dimensional coding.
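The pattern run-length conversion described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `pattern_run_length`, the input representation (eight strings of '0' and '1'), the bit order (top line as the most significant bit), and the cap of 255 on a run so that it fits in one byte are all choices made for the example and are not taken from FIGS. 12 and 13.

```python
def pattern_run_length(rows) -> bytes:
    """Convert an 8-line bilevel image into (pattern, run length) byte pairs.

    rows: eight equal-length strings of '0' (white) and '1' (black),
    top line first. Bit order and the 255 run cap are assumptions.
    """
    if len(rows) != 8 or len({len(r) for r in rows}) != 1:
        raise ValueError("expected 8 lines of equal length")
    width = len(rows[0])
    symbols = bytearray()
    x = 0
    while x < width:
        # Vertical 8-pixel column -> one pattern byte (top line = MSB).
        pattern = int("".join(row[x] for row in rows), 2)
        run = 1
        # Extend the run while the next column repeats the same pattern.
        while (x + run < width and run < 255
               and all(row[x + run] == row[x] for row in rows)):
            run += 1
        symbols += bytes([pattern, run])
        x += run
    return bytes(symbols)


# 127 all-white columns followed by 3 all-black columns.
image = ["0" * 127 + "1" * 3] * 8
print(list(pattern_run_length(image)))  # [0, 127, 255, 3]
```

The first pair, pattern 0 with run 127, corresponds to the pattern code "00000000" and run-length code "01111111" of the example in the text.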

When input symbols of this unit length (one byte) are universally coded (for example, with the LZW code), the result is as shown in FIG. 13: each dictionary entry consists of the output code (the longest matching code) plus the next symbol.

For example, the input symbol "00000000" is encoded as index 0, and its combination with the next input symbol "01111111" is registered in the dictionary as index 256.

[Problems to be solved by the invention]

Conventional universal coding, however, pays no attention to data sequences: it simply registers in the dictionary, one after another, the longest matching data with the next single symbol appended.

As a result, even for data that has similarity, such as the intermediate data described above, combinations of completely unrelated data sequences are also registered, such as registration numbers 257, 259, 261, 263, 265, 267, 269, and so on in FIG. 13.

The prior art therefore had the following problems.

(1) Because data spanning unrelated data sequences is also combined and registered, many entries with extremely low frequency of occurrence are registered, learning is slowed, and the initial compression ratio falls.

(2) Because there are many useless registrations, the number of bits in the index (reference number) increases, which lowers the compression ratio.
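The effect of dictionary size on index width can be illustrated with a short calculation. The sketch below assumes fixed-width indexes just wide enough to address every dictionary entry; the source does not state how index widths are chosen, and the name `index_bits` is hypothetical.

```python
def index_bits(dictionary_size: int) -> int:
    """Bits needed for a fixed-width index addressing dictionary_size entries.

    Assumes the index width is the smallest one that can address every entry.
    """
    if dictionary_size < 1:
        raise ValueError("dictionary must contain at least one entry")
    return max(1, (dictionary_size - 1).bit_length())


for size in (256, 257, 512, 513):
    print(size, index_bits(size))
# 256 8
# 257 9
# 512 9
# 513 10
```

Under this assumption, every registration that pushes the dictionary past a power of two makes all subsequent codes one bit longer, whether or not the new entry is ever used.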

An object of the present invention is therefore to provide a data compression and decompression method that prevents data spanning unrelated data sequences from being registered and thereby improves the compression ratio.

[Means for solving the problem]

FIG. 1 is a diagram illustrating the principle of the present invention.

As shown in FIG. 1, the present invention concerns a data compression method in which already-encoded data is divided into mutually distinct substrings that are registered in a dictionary, the dictionary is searched with the input data, and the input data is encoded by the reference number of the longest matching registered substring and then registered in the dictionary. In this method, local data delimiters and global data delimiters are detected in the input data. Between local delimiters, a new substring formed by appending the next input data to the matched substring is registered in the dictionary. Between global delimiters, each substring running from the first data of the global segment toward its last data is registered in the dictionary in succession.

The present invention also concerns a data decompression method in which the encoded data of claim (1) is compared with the reference numbers of the substrings registered in a dictionary and restored to the substring with the matching reference number. In this method, local data delimiters and global data delimiters are detected in the restored substrings. Between local delimiters, a new substring formed by appending the next input data to the matched substring is registered in the dictionary. Between global delimiters, each substring running from the first data of the global segment toward its last data is registered in the dictionary in succession.

[Action]

Because the present invention registers dictionary entries from the data between local data delimiters, combinations of data that belong to different local data sequences are not registered. Useless registrations are avoided and learning is accelerated.

Viewed globally, on the other hand, the data contains global data sequences such as groups of images.

If the data is divided only at the local delimiters described above, the number of symbols in each registered combination is limited to the span between delimiters and cannot grow long.

The data is therefore also divided into global data sequences. Within each global sequence, the data is registered continuously from the first symbol, in a way that does not combine data across different local data sequences. Longer data is thus registered and the compression ratio improves.

[Embodiments]

(a) Description of the first embodiment

FIG. 2 is a processing flowchart of the first embodiment of the present invention.

The image data is first preprocessed by converting it into two-dimensional information, as described with reference to FIGS. 12 and 13.

Next, the main processing recognizes (detects) the global and local breaks (delimiters) in the converted data and performs the universal coding described above while selecting what to register in the dictionary.

The main processing is performed as follows.

(1) The dictionary is searched for the same character (data) sequence as the input character (data) sequence (dictionary search step).

(2) The input is encoded as the index of the character (data) sequence that matched in the dictionary (index encoding step).

(3) The global and local breaks in the character (data) string are recognized, and strings are registered in the dictionary accordingly (dictionary registration step).

FIG. 3 is a flowchart of the encoding process of the first embodiment of the present invention, FIG. 4 is a diagram explaining its operation, and FIG. 5 is a flowchart of its decoding process.

In both encoding and decoding, a processor (not shown) creates the dictionary in memory and executes the process.

The encoding process is explained with reference to FIG. 3.

Steps that are the same as those shown in FIG. 10 are denoted by the same symbols.

S1) To initialize the dictionary, the first 256 characters (words) are registered and the head address N of the dictionary is set to 256.

Next, the first character K is input, and its address in the dictionary is assigned to the registered string ω₁ and the code string ω₂.

S2) The next character K is input, and the process goes to S4.

S4) Whether the combination of the string ω₁ and the character K is in the dictionary is checked.

S5) If ω₁K exists in the dictionary, the registered address of ω₁K is assigned to ω₁ and the registered address of ω₂K is assigned to ω₂.

S3) Next, whether the data has ended is checked. If it has, the process goes to S7; if not, it returns to S2.

In this way the longest matching registered string in the dictionary is found.

S7) If the data has ended, the address ω₂ is output as a code and the process ends.

S8) If ω₁K does not exist in the dictionary at step S4, whether the boundary between ω₁ and K is a global break in the data is determined.

When the input symbols (intermediate data) are coded by pattern run length, for example, a global delimiter (break) is the boundary that separates patterns containing black from white patterns, as shown in FIG. 4. It can be found by interpreting the pattern codes.

S9) If there is a global break, the address ω₂ is output as a code, nothing is registered, the dictionary address of K is assigned to ω₁ and ω₂, and the process goes to step S3.

Thus, when a global break is detected, the string ω₁K spanning the break is not registered. The symbol K following the break is set in ω₁ and ω₂, and search and registration start again from K.

S10) If there is no global break, whether the combination of the code string ω₂ and the character K is in the dictionary is determined.

S11) If ω₂K is in the dictionary, ω₁K is registered in the dictionary, so that the string from the head of the global segment up to the character K is registered (this is called continuous registration). The reference address N is incremented to N + 1, and the dictionary address of ω₁K is assigned to ω₁.

S12) Next, whether the boundary between ω₂ and K is a local break in the data is determined.

Local breaks in the data are set, for example, by treating each pair of a pattern and a run length as one unit, as indicated by the ↓ marks in the figure.

S13) If there is no local break, the dictionary address of ω₂K is assigned to ω₂ for registration between local breaks, and the process goes to step S3.

S14) If there is a local break, the address ω₂ is output as a code. Registration has already been completed at S11, so the dictionary address of the character K is set in ω₂ to make the character K that follows the break the head of the code string, and the process goes to step S3.

S15) If ω₂K is not in the dictionary at S10, the address ω₂ is output as a code, ω₁K is registered in the dictionary as in S11, and the reference address N is incremented to N + 1.

S16) Next, whether ω₂ = ω₁ is determined. If ω₂ = ω₁, ω₂K does not need to be registered, so the process goes to step S18.

S17) If ω₂ ≠ ω₁, ω₂K is registered in the dictionary in order to register the string between local delimiters, the reference address N is incremented to N + 1, and the process goes to step S18.

S18) To extend the registered string ω₁, the registered address of ω₁K is assigned to ω₁. The dictionary address of the character K is assigned to the code string ω₂, and the process goes to step S3.

A concrete explanation follows with reference to FIG. 4.

The input symbols of FIG. 4 are produced by the pattern run-length two-dimensional coding described above and are coded as pairs of a pattern and a run length.

Local delimiters are set at the boundaries of these pairs, and global delimiters are set by distinguishing white patterns from all other patterns, which contain black.

In FIG. 4, a local delimiter is set every 2 bytes, and a global delimiter is set at each boundary between a white pattern and a pattern containing black.

This is because a continuous run of patterns containing black is regarded as one image.
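The delimiter rules described for FIG. 4 can be sketched as follows. This is an illustrative sketch under stated assumptions: the symbol stream is taken to be a sequence of 2-byte (pattern, run length) pairs, a pattern byte of 0 is taken to mean an all-white pattern, and every global break is assumed to coincide with a local break. The name `find_breaks` is hypothetical.

```python
def find_breaks(symbols: bytes):
    """Return (local_breaks, global_breaks) as byte offsets into symbols.

    symbols: pattern byte, run-length byte, pattern byte, run-length byte, ...
    A break at offset p lies between symbols[p - 1] and symbols[p].
    """
    if len(symbols) % 2 != 0:
        raise ValueError("expected whole (pattern, run length) pairs")
    # Local break: every boundary between two (pattern, run length) pairs.
    local_breaks = list(range(2, len(symbols), 2))
    global_breaks = []
    for pos in local_breaks:
        prev_is_white = symbols[pos - 2] == 0   # pattern byte of previous pair
        next_is_white = symbols[pos] == 0       # pattern byte of next pair
        # Global break: switch between a white pattern and one containing black.
        if prev_is_white != next_is_white:
            global_breaks.append(pos)
    return local_breaks, global_breaks


stream = bytes([0x00, 127, 0x18, 2, 0x3C, 1, 0x00, 50])
print(find_breaks(stream))  # ([2, 4, 6], [2, 6])
```

In the example, the two pairs with patterns 0x18 and 0x3C both contain black, so they fall in the same global segment and are separated only by a local break.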

Where there is no global break, that is, within a global segment, S8 and S9 have set ω₁ to the first character K after the preceding break. Continuous registration starting from that first character K is therefore performed at S11 and S15, as with registration numbers 258, 259, 261, 262, 264, 265, and so on.

At a local break, on the other hand, S14 sets ω₂ to the character K, so that S17 performs normal registration of the data between breaks, as with registration numbers 257, 260, and so on.

With registration number 265, a data sequence that is globally meaningful as an image is registered as a single data sequence, so the data can be compressed efficiently.

In addition, nothing is registered across a local delimiter, so useless registrations are avoided.

Next, the decoding process is explained with reference to FIG. 5.

Steps that are the same as those shown in FIG. 11 are denoted by the same symbols.

S1) As in encoding, the first 256 characters are registered in the dictionary in advance and the head address n of the dictionary is set to 256, and decoding then starts.

First, the first code (reference number) is read and the current CODE is set as OLDcode. Because the first code always corresponds to the reference number of one of the single characters already registered in the dictionary, the character code(K) that matches the input code CODE is looked up and the character K is output. The output character K is also set in FINchar for the later exception handling.

S2) Next, the process goes to S2, reads the next code, and sets it in CODE as INcode.

S4) Next, the process goes to S4 and checks whether the code CODE read in S2 is defined (registered) in the dictionary.

S5) Normally the input codeword has already been registered in the dictionary by earlier processing, so the process goes to S5 and reads the string code(ω₁K) corresponding to CODE from the dictionary.

S6) In S6 the character K is temporarily pushed on a stack, the reference number code(ω₁) becomes the new CODE, and the process returns to S5. Steps S5 and S6 are repeated recursively until the reference number ω₁ reduces to a single character.

S9) Finally, the process goes to S9, outputs the character K, sets K in FINchar, and then pops the characters stacked in S6 in LIFO (Last In, First Out) order and outputs them.

S10) Next, the restored characters are parsed in the same way as in step S8 of the encoding to check whether there is a global break in the data.

If there is a global break, the process goes to step S11; if not, it goes to step S12.

S11) If there is a global break in the data, no dictionary registration is performed. INcode is set in OLDcode, the code string ω₂ is set to 0, and the process goes to step S3.

S3) Whether the data has ended is checked. If not, the process returns to S2; if so, decoding ends.

S12）データの大局的切れ目でなければ，局所的切れ
目かを判定する。S12) If the data is not a global break, determine whether it is a local break.

At a local break the process proceeds to step S13; otherwise it proceeds to step S14.

S13) At a local break, the character K is assigned to ω₂ so that K becomes the head of the code string ω₂, and the process proceeds to step S17.

S14) If the position is not a local break, the process determines whether ω₂K is in the dictionary.

S15) If ω₂K is not in the dictionary, it is unregistered, so it is registered in the dictionary, the reference address n is incremented to n + 1, and the process proceeds to step S16.

This carries out the normal registration between local breaks.

S16) Next, the dictionary address of ω₂K is assigned to ω₂.

S17) In S17, the string expressed as the pair (OLD code, K), formed from the previously used code OLD code and the first character K of the string restored this time, is registered in the dictionary under a new reference number. The reference number (address) n is incremented to n + 1, the combination of OLD code and K is set in OLD code, and the process proceeds to step S3.

S8) If S4 finds that the code is not registered (which happens when the encoder referred to the immediately preceding reference number), S8 outputs FINchar, sets OLD code in CODE, sets code(OLD code, FINchar) in INcode, and then proceeds to S5.

In this way, decoding is performed in the same manner as encoding.
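The core of steps S2, S4 to S6, S8, S9 and S17 can be sketched as a conventional LZW decoder. This is a simplified illustration under stated assumptions, not the flow of FIG. 5 itself: it omits the global and local break handling of S10 to S16, registers one entry per code as in S17, and treats an unregistered code in the conventional LZW way by stacking FINchar so that it is emitted last. The function name `lzw_decode`, the `alphabet` parameter, and the `(prefix_code, last_symbol)` table layout are hypothetical.

```python
def lzw_decode(codes, alphabet):
    """Decode LZW codes; single symbols occupy codes 0..len(alphabet)-1.

    Simplified sketch: no global/local break handling (S10 to S16 omitted).
    """
    table = {}                     # code -> (prefix_code, last_symbol), i.e. code(w1 K)
    next_code = len(alphabet)      # next free reference number n
    out = []
    old_code = None                # OLD code
    fin_char = None                # FINchar
    for in_code in codes:          # S2: CODE and INcode
        code = in_code
        stack = []
        if code >= next_code:      # S4 -> S8: code not registered yet
            stack.append(fin_char)
            code = old_code
        while code >= len(alphabet):   # S5/S6: walk the prefix chain
            prefix, k = table[code]
            stack.append(k)            # stack the trailing symbol K
            code = prefix
        k = alphabet[code]         # S9: the first symbol of the string
        fin_char = k
        out.append(k)
        while stack:               # pop in LIFO order
            out.append(stack.pop())
        if old_code is not None:   # S17: register (OLD code, K)
            table[next_code] = (old_code, k)
            next_code += 1
        old_code = in_code
    return out


decoded = "".join(lzw_decode([0, 1, 2, 4], "AB"))
```

In the usage line, code 4 is not yet in the table when it arrives, so the S8 branch is exercised.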

(b) Description of the second embodiment

FIG. 6 is an explanatory diagram of the second embodiment of the present invention.

This example incorporates the contour method into the preprocessing and is illustrated using the image example of FIG. 12.

The contour method represents a contour line of an image by giving its position on one line with the horizontal mode and a run length RL, and then giving each of the other lines in turn by the length ZL of its horizontal offset.

For example, the left contour of the black image on the left side of FIG. 12 is represented in FIG. 6 as contour line 1, and its right contour as contour line 2.

That is, contour line 1 is expressed as follows: on line 1 the boundary lies at the 128th address from the left end; line 2 has an offset of "0" (ZL0) relative to line 1; line 3 has an offset of "1" (ZL−1) relative to line 2; line 4 has an offset of "0" (ZL0) relative to line 3; and line 5 onward is white (pass mode P).
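The code sequence for contour line 1 can be reproduced with a short sketch. It assumes the boundary address on each line is known and that each vertical offset is the current address minus the previous one; the sign convention and the addresses 127 used for lines 3 and 4 are assumptions chosen so that the result matches ZL0, ZL−1, ZL0. The function name `encode_contour` and the tuple layout are hypothetical.

```python
def encode_contour(positions):
    """Turn per-line boundary addresses into contour-method codes.

    positions[0] is coded in horizontal mode (H) with its run length;
    every later line is coded in vertical mode (V) as an offset from the
    line before; pass mode (P) closes the contour.
    Assumption: offset = current address - previous address.
    """
    codes = [("H", positions[0])]
    for prev, cur in zip(positions, positions[1:]):
        codes.append(("V", cur - prev))
    codes.append(("P", None))
    return codes


contour1 = encode_contour([128, 128, 127, 127])
# contour1 == [("H", 128), ("V", 0), ("V", -1), ("V", 0), ("P", None)]
```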

In FIG. 6, local data breaks are marked with ↓ and global breaks with a separate symbol. The global breaks here are based on the group of data representing each contour line (the codes from H through P), on the view that this group makes up one image contour. The local breaks reflect the distinction between the horizontal code and the vertical and pass codes: the horizontal code was considered to capture horizontal correlation, and the vertical and pass codes vertical correlation. The number of the line on which each contour line lies and the number of the adjacent contour line are only appended as output codes and are not registered in the dictionary here, to keep the number of indexes from growing.

For contour line 1, for example, the sequence running from the horizontal code through the three vertical codes to the pass code is registered in the dictionary as one continuous piece of image data (registration number 264). A data sequence that is globally meaningful as an image is thus registered, so compression is efficient.
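The successive registration between global breaks can be sketched as registering every prefix of the segment, from its first symbol up to the whole segment. This is an illustration under assumptions: the dictionary is a plain mapping from symbol tuples to reference numbers, single symbols are taken to be registered already, and the starting reference number is arbitrary, so the sketch does not attempt to reproduce registration number 264. The name `register_global_segment` is hypothetical.

```python
def register_global_segment(dictionary, segment, next_ref):
    """Register each prefix (length >= 2) of one global segment.

    dictionary maps tuples of symbols to reference numbers.
    Returns the next free reference number.
    """
    for end in range(2, len(segment) + 1):
        prefix = tuple(segment[:end])
        if prefix not in dictionary:      # skip prefixes already learned
            dictionary[prefix] = next_ref
            next_ref += 1
    return next_ref


dictionary = {}
contour = ["H128", "V0", "V-1", "V0", "P"]
next_ref = register_global_segment(dictionary, contour, 300)
```

After this call the whole contour is available as a single dictionary entry, so a later contour with the same codes can be emitted as one reference number.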

(c) Description of the third embodiment

FIG. 7 is an explanatory diagram of the third embodiment of the present invention.

This example shows a mode-separated form of the contour method of FIG. 6, in which the horizontal, vertical, and pass modes of each contour line are separated from the run lengths.

In FIG. 7, local data breaks are marked with ↓ and global breaks with a separate symbol. For the global breaks here, the group of data representing each contour line (the codes from H through P) is divided into mode codes and run-length codes to make up the image contour. For the local breaks, the distinction between the horizontal code and the vertical and pass codes is applied to the run-length codes only.

For contour line 1, for example, the mode codes are registered in the dictionary together (registration number 260), so the subsequent contour lines 2, 3, and 4 can be expressed by this registration number 260. A data sequence that is globally meaningful as an image is thus registered as a single data sequence, so compression is efficient.
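The mode separation can be sketched as splitting each contour's codes into a mode stream and a run-length stream. The representation of a code as a `(mode, value)` tuple and the name `separate_modes` are assumptions made for illustration; the second contour's values are invented. Contours with the same shape but different positions then share one mode string, which is what lets a single dictionary entry cover several contour lines.

```python
def separate_modes(codes):
    """Split contour codes into (mode string, list of run lengths/offsets).

    codes is a list of (mode, value) tuples; pass mode carries no value.
    """
    modes = "".join(mode for mode, _ in codes)
    lengths = [value for _, value in codes if value is not None]
    return modes, lengths


contour1 = [("H", 128), ("V", 0), ("V", -1), ("V", 0), ("P", None)]
contour2 = [("H", 140), ("V", 1), ("V", 0), ("V", 2), ("P", None)]
modes1, lengths1 = separate_modes(contour1)
modes2, lengths2 = separate_modes(contour2)
# modes1 and modes2 are both "HVVVP"; only the run-length streams differ.
```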

(d) Description of the fourth embodiment

FIG. 8 is a processing flowchart of the fourth embodiment of the present invention.

In contrast to the embodiment of FIG. 2, this embodiment inserts break (delimiter) codes at the global and local break positions when the image data is converted into two-dimensional information. The main processing then detects these break codes and performs the encoding and decoding described in the first to third embodiments.

Because no parsing is needed to detect the breaks, this has the advantage of shortening the encoding and decoding time.
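Break detection with inserted delimiter codes reduces to a single pass that compares each symbol against the two delimiters. The sketch below assumes the delimiters are distinct reserved symbols in the stream; the values `LOCAL` and `GLOBAL`, the function name `split_on_breaks`, and the sample stream are hypothetical and not taken from FIG. 8.

```python
LOCAL = "|"     # hypothetical local break code
GLOBAL = "||"   # hypothetical global break code


def split_on_breaks(stream):
    """Group a symbol stream into global segments of local segments."""
    segments, current_global, current_local = [], [], []
    for sym in stream:
        if sym in (LOCAL, GLOBAL):
            if current_local:                 # close the open local segment
                current_global.append(current_local)
                current_local = []
            if sym == GLOBAL and current_global:
                segments.append(current_global)
                current_global = []
        else:
            current_local.append(sym)
    if current_local:                         # flush whatever remains
        current_global.append(current_local)
    if current_global:
        segments.append(current_global)
    return segments


stream = ["H128", LOCAL, "V0", "V-1", "V0", "P", GLOBAL, "H200", LOCAL, "V0", "P"]
segments = split_on_breaks(stream)
```

No knowledge of the code syntax is needed here, which is the source of the time saving described above.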

(e) Description of other embodiments

In addition to the embodiments described above, the present invention can be modified as follows.

Image data was used as the target, but other data may be used.

The image data is converted into two-dimensional information before encoding, but the original data may be encoded directly.

The present invention has been described above by way of embodiments, but various modifications are possible within the gist of the invention, and these are not excluded from the invention.

[Effects of the invention]

As described above, the present invention provides the following effects.

Because dictionary registration is performed on the data between local data breaks, combinations of data from different local data sequences are prevented from being registered. Wasteful registration is avoided and the learning effect is promoted.

Viewed globally, the data contains global data sequences such as groups of images. When the data is divided by the local data sequences described above, however, the number of symbols combined in a registered entry is limited to the span between breaks and cannot grow long.

The data is therefore also divided into global data sequences, and the data within each is registered consecutively from its first symbol, in a way that produces no combination of data across the different local data sequences described above. Longer data is thereby registered and the compression ratio improves.

[Brief description of the drawings]

FIG. 1 is a diagram of the principle of the present invention; FIG. 2 is a processing flowchart of the first embodiment; FIG. 3 is an encoding processing flowchart of the first embodiment; FIG. 4 is a diagram explaining the operation of the first embodiment; FIG. 5 is a decoding processing flowchart of the first embodiment; FIG. 6 is an explanatory diagram of the second embodiment; FIG. 7 is an explanatory diagram of the third embodiment; FIG. 8 is a processing flowchart of the fourth embodiment; and FIGS. 9 to 13 explain the prior art.

Continuation of the front page
(72) Inventor: Yasuhiko Nakano, c/o Fujitsu Limited, 1015 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa
(56) References: JP-A-60-116228 (JP, A); JP-A-61-242122 (JP, A)
(58) Field searched (Int. Cl.⁷, DB name): H03M 7/42

Claims

(57) [Claims]

1. A data compression method in which already-encoded data is divided into mutually different subsequences that are registered in a dictionary, the subsequences registered in the dictionary are searched with input data, and the input data is encoded by designating the reference number of the longest-matching subsequence among those registered in the dictionary and is registered in the dictionary, the method being characterized by: detecting local data breaks and global data breaks in the input data; between the local data breaks, registering in the dictionary a new subsequence formed by appending the next input data to the matched subsequence; and between the global data breaks, successively registering in the dictionary each subsequence running from the first data to the last data of the global data break.

2. A data restoration method in which encoded data is compared with the reference numbers of subsequences registered in a dictionary and is restored to the subsequence whose reference number matches, the method being characterized by: detecting local data breaks and global data breaks in that subsequence; between the local data breaks, registering in the dictionary a new subsequence formed by appending the next input data to the matched subsequence; and between the global data breaks, successively registering in the dictionary each subsequence running from the first data to the last data of the global data break.