JP3231105B2

JP3231105B2 - Data encoding method and data restoration method

Info

Publication number: JP3231105B2
Application number: JP31957992A
Authority: JP
Inventors: 泰彦中野; 佳之岡田; 茂吉田; 広隆千葉
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1992-11-30
Filing date: 1992-11-30
Publication date: 2001-11-19
Anticipated expiration: 2016-11-19
Also published as: JPH06168096A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a data encoding method that compresses data by universal coding using Ziv-Lempel codes, and to a data restoration method.

[0002]

[Prior Art] In recent years, with the development of OA (office automation), it has become possible to mix various media such as characters, figures, and images in a single document. Mixed information such as character codes and black-and-white binary images is now handled, together with its layout information, as document data by G4 facsimile, optical disk file systems, and the like, and the volume of such information is growing rapidly. When document information composed of these multiple media is used as digital data, the data volume of image information is generally ten to several tens of times that of character codes. For this reason, when image information is handled in data storage, data transmission, and so on, the redundant parts of the data are omitted to compress the data volume, which reduces storage capacity and makes transmission more efficient.

[0003] However, in large-capacity file systems and document databases, the character code information in the document data also becomes large in total, so compression is needed not only for image information but also for character code information.

[0004] Universal coding is known as a method that can compress various kinds of data, such as character codes and image data, with a single scheme. A representative example is the Ziv-Lempel code (see Seiji Munakata, "Ziv-Lempel Data Compression Method," Information Processing, Vol. 26, No. 1, Jan. 1985).

[0005] The Ziv-Lempel code has two algorithms: the universal type and the incremental parsing type.

[0006] Furthermore, as an improvement of the universal-type algorithm, there is the LZSS code (see T.C. Bell, "Better OPM/L Text Compression," IEEE Trans. on Commun., Vol. COM-34, No. 12, Dec. 1986).

[0007] The incremental parsing algorithm likewise has an improved version, the LZW code (see T.A. Welch, "A Technique for High-Performance Data Compression," Computer, June 1984).

[0008] Of these coding schemes, the LZW code has recently come to be used for purposes such as compressing files stored in storage devices, because it allows high-speed processing and its algorithm is simple.

[0009] The two algorithms of the Ziv-Lempel code, the universal type and the incremental parsing type, which are representative methods of universal coding, are described here.

1. Universal-type algorithm

This algorithm requires a large amount of computation but achieves a high compression ratio. It partitions the data to be encoded into the longest sequences (substrings) that match a sequence starting at an arbitrary position in the past data, and encodes each one as a copy of the past sequence.

[0010] FIG. 15(a) shows the basic concept of encoding with this universal-type Ziv-Lempel code. The P buffer shown in FIG. 15(a) holds the already-encoded input data "...abc...", which is the past data sequence. The Q buffer holds the data (character string) "abcdef" that is about to be encoded.

[0011] In this state, to encode the data in the Q buffer, the data sequence in the P buffer is scanned using the data sequence in the Q buffer as a key, and the longest substring in the P buffer that matches the data sequence in the Q buffer ("abc" in the example of FIG. 15(a)) is found. Then, to designate this longest substring in the P buffer, a set of information in the format shown in FIG. 15(b) is encoded. This set consists of three items: the start position of the longest matching sequence in the P buffer (the address of "a" in the example of FIG. 15(a)), the match length ("3" in the example), and the next symbol ("d" in the example).

[0012] Next, the encoded sequence in the Q buffer (here "abc") is moved into the P buffer to form a new past data sequence. The same operation is then repeated for the remaining data "def" in the Q buffer: the remaining data is decomposed into substrings already stored in the P buffer and encoded as described above, and the data sequence in the P buffer is updated.
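The universal-type procedure described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the `window` parameter, and the choice to move the "next symbol" into the P buffer together with the matched substring are assumptions made for illustration. Each output item is the triple (start position in the P buffer, match length, next symbol).

```python
def lz77_encode(data, window=4096):
    """Encode `data` as (start, length, next_symbol) triples.

    `data[:i]` plays the role of the P buffer (already encoded) and
    `data[i:]` the role of the Q buffer (still to be encoded).
    """
    out = []
    i = 0
    while i < len(data):
        p_start = max(0, i - window)
        best_pos, best_len = 0, 0
        # Always leave one symbol to serve as the "next symbol".
        max_len = len(data) - i - 1
        for pos in range(p_start, i):
            length = 0
            # The match must lie entirely inside the P buffer.
            while (length < max_len and pos + length < i
                   and data[pos + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_pos, best_len = pos, length
        out.append((best_pos, best_len, data[i + best_len]))
        i += best_len + 1  # matched substring plus next symbol join the P buffer
    return out


def lz77_decode(triples):
    """Rebuild the original string from (start, length, next_symbol) triples."""
    out = []
    for pos, length, sym in triples:
        out.extend(out[pos:pos + length])  # copy from the past data
        out.append(sym)
    return "".join(out)


print(lz77_encode("abcabcd"))
# [(0, 0, 'a'), (0, 0, 'b'), (0, 0, 'c'), (0, 3, 'd')]
```

In the last triple, the P buffer holds "abc", the longest match starts at the address of "a", its length is 3, and the next symbol is "d", which mirrors the example of FIG. 15 as described in the text. The exhaustive scan over every start position is what makes this type costly in computation.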

[0013] 2. Incremental parsing algorithm

This algorithm has a lower compression ratio than the universal type, but it is simple and easy to compute, so it allows high-speed processing.

[0014] LZW coding, a representative method of this algorithm, is described with reference to the flowchart in FIG. 16, the dictionary (learning dictionary) in FIG. 17, and the schematic diagram of data conversion in FIG. 18.

[0015] LZW coding uses one rewritable dictionary (learning dictionary). It divides the input character string into mutually distinct strings (substrings) and registers them in the dictionary with reference numbers assigned in order of appearance. The string currently being input is encoded by representing it with the reference number assigned to the longest matching string already registered in the dictionary. In the following description, following the terminology of information theory, one word of data is called a character, and a run of any number of words is called a character string.

[0016] In LZW encoding, first, in step S1, the dictionary D_C is initialized by registering in advance a single-character string for every character. For example, when one character is encoded with an 8-bit code, the single-character strings for all of the up to 256 characters are initially registered at addresses 0 to 255 of the dictionary D_C. As a result, as shown for example in FIG. 17, the letters "a", "b", "c", ..., as well as hiragana, katakana, digits, and so on, are registered at addresses 0, 1, 2, ..., 255 of the dictionary D_C. The character string table B1 shown on the left side of the figure is included only as an aid to the explanation.

[0017] For clarity, the following description uses the example in which the input character string shown in FIG. 18 is input. First, in step S1, the write start address n of the dictionary D_C is set to "256", the address following the storage address of the last initially registered string, as the address at which a newly registered string will be stored in the dictionary D_C.

[0018] Next, also in step S1, the dictionary D_C is searched using the first input character K as key data (index) to obtain the reference number ω (the reference number of the character K registered in the dictionary D_C), which becomes the prefix string. Thus, if the input character string is, for example, "ababcbababaaaaaaa" as shown in FIG. 18, the dictionary D_C is searched with the first character K, "a", as the index, the reference number "0" of "a" is obtained as ω, and this reference number "0" becomes the prefix string (see the output code column in FIG. 18).

[0019] Next, in step S2, the next character K of the input string is read. Here the character "b" following the first input character "a" is read. Then, in step S3, it is determined whether there is a character K, that is, whether the input character string has not yet ended.

[0020] For the input string shown in FIG. 18, the character "b" following "a" is read in step S2, so the string has not yet ended. Step S3 therefore yields Yes, and in step S4 the dictionary D_C is searched to see whether the string "ωK" is registered.

[0021] That is, it is checked whether the string "0b", formed by appending the character K read in step S2 (here "b") to the prefix string ω obtained in step S1 (here reference number "0"), is registered in the dictionary D_C.

[0022] If the result of this search is No, the process proceeds to step S6, where the code "code(ω)" for the reference number ω of the character K obtained in step S1 is output, and the string "ωK" is given a new reference number n and registered at address n of the dictionary D_C.

[0023] For the input string in FIG. 18, the code for "0", the reference number ω of "a", is output first, and the string "0b", which was not found, is given reference number "256" and registered at address 256 of the dictionary D_C.

[0024] Then, also in step S6, the reference number of the input character K read in step S2 becomes the new reference number ω, the address n of the dictionary D_C is incremented by 1, and the process returns to step S2 to read the next character K.

[0025] In the example of FIG. 18, the reference number ω is replaced with "1", the reference number of "b", and the registration address n for the next newly registered string in the dictionary D_C is incremented to "257".

[0026] If, on the other hand, the string "ωK" is registered in the dictionary D_C in step S4, the process proceeds to step S5, where the reference number of the string "ωK" becomes the new ω. The process then returns to step S2, and steps S2 to S5 are repeated until the string "ωK" can no longer be found in the dictionary D_C in step S4, continuing the search for the longest matching string.

[0027] The LZW encoding process performed in this way is now described concretely using the input string "ababcbababaaaaaaa" shown in FIG. 18. First, when the first character "a" is input, the dictionary D_C contains no matching string other than "a", so the code code(0) for the reference number "0" assigned to "a" is output. The extended string "ab" is then given reference number "256" and registered in the dictionary D_C. In the actual dictionary it is registered in the form of the string "0b", as shown on the right side of FIG. 13.

[0028] Next, the second character "b" becomes the head of a new search string. Because the dictionary D_C contains no matching string other than the character "b", the code code(1) for the reference number "1" assigned to "b" is output. The extended string "ba" is also not yet registered in the dictionary D_C, so "ba" is represented as "1a", given reference number "257", and registered in the dictionary D_C. Next, the third character "a" becomes the head of the next search string "ωK". Continuing in the same way, the input string "ababcbababaaaaaaa" shown in FIG. 18 is converted into and output as the code sequence "0, 1, 256, 2, 257, 260, 0, 262, 263" shown in the output code column of the figure, and the input string is thereby compressed.
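The encoding steps S1 to S6 can be sketched in Python as follows. This is an illustrative sketch, not the patent's implementation: the function name and the `symbols` and `first_free` parameters are assumptions, and only "a", "b", "c" are pre-registered (at reference numbers 0, 1, 2) so that the numbers match the worked example. Dictionary entries are keyed by the pair (prefix reference number ω, character K), as in the "0b" notation of the text.

```python
def lzw_encode(text, symbols="abc", first_free=256):
    """LZW-encode `text`, returning a list of reference numbers."""
    # Step S1: register every single character; None marks "no prefix".
    table = {(None, ch): i for i, ch in enumerate(symbols)}
    n = first_free                      # next free dictionary address
    if not text:
        return []
    omega = table[(None, text[0])]      # prefix string = first character
    codes = []
    for k in text[1:]:                  # S2/S3: read K while input remains
        if (omega, k) in table:         # S4 Yes -> S5: extend the match
            omega = table[(omega, k)]
        else:                           # S4 No -> S6
            codes.append(omega)         # output code(omega)
            table[(omega, k)] = n       # register "omega K" at address n
            n += 1
            omega = table[(None, k)]    # K starts the next search string
    codes.append(omega)                 # input ended: flush the pending prefix
    return codes


print(lzw_encode("ababcbababaaaaaaa"))
# [0, 1, 256, 2, 257, 260, 0, 262, 263, 0]
```

The first nine codes agree with the sequence given in the text. The sketch additionally emits the code of the prefix still pending when the input ends, which accounts for the trailing 0; how the end of input is handled in FIG. 16 is not visible in this passage, so that final flush is an assumption.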

[0029] Next, the algorithm for restoring code data that has been LZW encoded as described above is explained with reference to the flowchart in FIG. 19. As a concrete example of this restoration, the LZW-encoded output code sequence "0, 1, 256, 2, 257, 260, 0, 262, 263" shown in FIG. 18 is reproduced in FIG. 20(a) as the input code sequence to aid the explanation.

[0030] First, in step S11, as in LZW encoding, a single-character string for every character is initially registered in the dictionary D_d. In the concrete example described below, the single characters "a", "b", "c", ... are registered in the dictionary D_d with reference numbers "0", "1", "2", ..., respectively. The write start address n of the dictionary D_d is set to "256", the address following the storage address of the last initially registered string, as the storage address n for newly registered strings.

[0031] Next, also in step S11, the first code CODE is read, and the reference number corresponding to this code is set in OLDω. In the example input code sequence of FIG. 20(a), the first input code, code(0) for reference number "0", is read, converted to the reference number "0", and set in OLDω.

[0032] Then, also in step S11, the character K corresponding to the reference number OLDω is restored. Because the first input code CODE always corresponds to one of the single-character reference numbers initially registered in the dictionary D_d as described above, the code code(K) matching the input code is looked up in the dictionary D_d and the corresponding character K is output. The output character K is also set in FINchar in preparation for the exception processing that may be needed later.

[0033] In the example of FIG. 20(a), the character "a" corresponding to reference number "0" is first restored and output, and is also set in FINchar.

[0034] Next, in step S12, the next input code CODE is read. In the example of FIG. 20(a), code(1) for reference number "1" is read as the new input code. Then, in step S13, it is determined whether a newly read code CODE exists, that is, whether the code input has ended.

[0035] If there is a new input code CODE, the process proceeds to step S14, where the reference number ω corresponding to this input code is set in INω. In the example of FIG. 20(a), the reference number "1" is set in INω.

[0036] Next, in step S15, it is determined whether the reference number ω is already registered in the dictionary D_d, by testing whether ω ≥ n (ω ≥ n means that it is not yet registered). Normally the code read has already been registered in the dictionary D_d by earlier processing, so ω < n and the process proceeds to step S16. There the dictionary D_d is searched, the string ω′K corresponding to the reference number ω is read from the dictionary D_d, and it is determined whether the string corresponding to ω is a two-element string "ω′K". If it is, the character K is temporarily pushed onto a stack in step S17, the reference number ω′ becomes the new reference number ω, and the process returns to step S16. Steps S16 and S17 are repeated recursively until the string corresponding to ω is a single character K. The process then proceeds to step S18, where the last restored character K is output first, and then all the characters stacked in step S17 are popped and output in LIFO (last in, first out) order; this restores and outputs the code CODE read in step S12. Also in step S18, the first character K of the restored string is set in FINchar, and the string represented by the pair (OLDω, K), formed from the previously restored reference number OLDω and the first character K of the string restored this time, is given a new reference number n and registered at address n of the dictionary D_d. The address n is then incremented by 1, so that n + 1 becomes the registration address for the next string to be registered in the dictionary D_d. Finally, the reference number ω corresponding to the code restored this time, which was set in INω, is assigned to OLDω, and the process returns to step S12.
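The normal path of the restoration (steps S11 to S18, with ω < n at every step) can be sketched as follows. This is a hedged illustration rather than the patent's implementation: the names `expand` and `lzw_decode_registered`, and the `symbols` and `first_free` parameters, are assumptions, and the exception branch of step S19 is deliberately left out and reported as an error. Each dictionary entry stores the pair (ω′, K), and `expand` follows those links while stacking characters.

```python
def expand(table, omega):
    """Steps S16 to S18: turn reference number `omega` into its string."""
    stack = []
    prefix, k = table[omega]
    while prefix is not None:      # entry is a two-element string (omega', K)
        stack.append(k)            # S17: stack K, continue with omega'
        prefix, k = table[prefix]
    # k is now the single first character; pop the stack in LIFO order.
    return k + "".join(reversed(stack))


def lzw_decode_registered(codes, symbols="abc", first_free=256):
    """Decode codes that are all already registered when they arrive."""
    table = {i: (None, ch) for i, ch in enumerate(symbols)}   # S11
    n = first_free
    if not codes:
        return ""
    old = codes[0]                         # OLD omega
    out = [expand(table, old)]
    for code in codes[1:]:                 # S12/S13
        if code >= n:                      # S15: not yet registered
            raise ValueError("code %d needs the exception processing of S19" % code)
        s = expand(table, code)            # S16 to S18
        out.append(s)
        table[n] = (old, s[0])             # register (OLD omega, K) at address n
        n += 1
        old = code
    return "".join(out)


print(lzw_decode_registered([0, 1, 256, 2, 257]))
# ababcba
```

The first five codes of the worked example decode without any exception; the sixth code, 260, would arrive while n is still 260 and therefore raises the error here.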

[0037] In the case of the input code sequence in FIG. 20(a), as shown in FIG. 20(b), the character "b" is restored and output from the second code read, CODE (= code(1)) for reference number "1", and this character "b" is set in FINchar. The string "0b", formed by joining the reference number "0" corresponding to the previously restored code (= code(0)) with the single character "b" restored this time, is given the new reference number "256" and registered in the dictionary D_d.

[0038] After the registration address n of the dictionary D_d is updated to "257", the reference number "1" corresponding to the code restored this time (= code(1)) is set in OLDω, and the third code, code(256), is read in step S12.

[0039] The string "0b" obtained by searching the dictionary D_d is then replaced with the string "ab", and "ab" is output. At the same time, the string "1a" (= "ba"), formed by combining the reference number "1" corresponding to the previously restored code(1) with the first character "a" of the string restored this time, is given the new reference number "257" and registered at address "257" of the dictionary D_d.

[0040] If, on the other hand, step S15 finds that the code code(ω) that was read has not been registered in the dictionary D_d by the processing so far (ω ≥ n), the process proceeds to step S19 for exception processing. In this exception processing, the first character FINchar of the previously restored string is output first, and the reference number OLDω corresponding to the previously restored code CODE is set as the reference number ω. The string "OLDω, FINchar" is then formed by appending the first character FINchar of the previously restored string, the reference number corresponding to this new string is set in INω, and the process proceeds to step S16.

[0041] For example, in the input code sequence of FIG. 20(a), the reference number "260" corresponding to the sixth input code, code(260), is not yet defined in the dictionary D_d at that point. In this case, in step S19, the first character (FINchar) of the string "ba" corresponding to the previously restored code(257) is output first. The string "257b" is then formed by appending the first character "b" of the previously restored string "ba" to the reference number "257" corresponding to the previously restored code(257), the reference number "260" is assigned to this string, and this reference number is set in INω. Next, by repeating steps S16 and S17, the characters "a" and "b" are stacked one at a time in that order. In step S18, the string "ab" is output by the pop operation, so that code(260) is finally restored and output as the string "bab", and the string "257b" is given reference number "260" and registered in the dictionary D_d (see FIG. 20(b) to (e)).
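A complete decoder including the exception case can be sketched as below. Again this is an assumption-laden illustration, not the patent's implementation: the function names and parameters are hypothetical, and the sketch handles step S19 by first registering the pair (OLDω, FINchar) at address n and then expanding the code in the ordinary way, a rearrangement chosen here because it yields the same string ("ba" followed by "b" gives "bab") with less bookkeeping.

```python
def expand(table, omega):
    """Follow (prefix, character) links and return the string for `omega`."""
    stack = []
    prefix, k = table[omega]
    while prefix is not None:
        stack.append(k)
        prefix, k = table[prefix]
    return k + "".join(reversed(stack))


def lzw_decode(codes, symbols="abc", first_free=256):
    """LZW-decode a list of reference numbers, including the S19 case."""
    table = {i: (None, ch) for i, ch in enumerate(symbols)}
    n = first_free
    if not codes:
        return ""
    old = codes[0]
    s = expand(table, old)
    finchar = s[0]                    # first character of the last restored string
    out = [s]
    for code in codes[1:]:
        if code >= n:                 # S15 -> S19: code not yet registered
            if code != n:
                raise ValueError("invalid code %d" % code)
            # The missing entry must be (OLD omega, FINchar); register it
            # first, then expand it like any other code.
            table[n] = (old, finchar)
            s = expand(table, code)
        else:                         # normal path, S16 to S18
            s = expand(table, code)
            table[n] = (old, s[0])
        n += 1
        finchar = s[0]
        old = code
        out.append(s)
    return "".join(out)


print(lzw_decode([0, 1, 256, 2, 257, 260, 0, 262, 263]))
# ababcbababaaaaaa
```

In this run the codes 260, 262, and 263 each arrive before their dictionary entries exist, so all three go through the exception branch and are restored as "bab", "aa", and "aaa".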

[0042] By repeating the same processing in sequence, the input code sequence shown in FIG. 20(a) is restored to the character string shown in FIG. 20(e).

[0043]

[Problems to Be Solved by the Invention] Data compression by the Ziv-Lempel coding described above does not, unlike other schemes, assume the statistical properties or stationarity of the target data in advance. It is an information-preserving data compression method in which decoding restores the original information completely, and it is therefore suited to compressing data that requires complete restoration, such as character codes and program source code or object code.

[0044] In addition, because the Ziv-Lempel code can be applied directly to any symbol sequence, image data can also be compressed by Ziv-Lempel coding if it is divided into units of a fixed size and those units are handled in the same way as character codes. It is therefore possible to use Ziv-Lempel coding to compress information in which multiple kinds of data with different properties, such as character codes and image data, are mixed.
Image data can also be compressed by Jib-Lempel coding. Therefore, it is possible to compress information in which a plurality of types of data having different properties, such as a character code and image data, are mixed by Jib-Lempel coding.

[0045] However, conventional data compression by Ziv-Lempel coding uses only one dictionary. This dictionary is created and updated while the input data is encoded. When the dictionary becomes full, it is either cleared (initialized) immediately, or cleared once the compression ratio has deteriorated after it became full, and dictionary registration starts again from the beginning. Consequently, if the dictionary capacity is small, the local properties of the input data can be captured, but learning is insufficient and the compression ratio of the input data does not improve much.

[0046] On the other hand, if the dictionary capacity is made too large, the global properties of the input data are captured, but the response to local changes in the input data becomes sluggish, and the data compression ratio deteriorates in this respect.
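The single-dictionary policy of clearing the dictionary as soon as it is full can be sketched as follows, to make the effect of a small capacity concrete. The function name, the `capacity` parameter, and the use of string keys are assumptions for illustration; the sketch initializes 256 single-character entries and assumes `capacity` is greater than 256 and that every input character has a code point below 256.

```python
def lzw_encode_limited(data, capacity):
    """LZW encoding with one dictionary that is cleared when it becomes full."""
    def fresh():
        # Initial dictionary: one entry per single character.
        return {chr(i): i for i in range(256)}

    table = fresh()
    codes = []
    w = ""                                # current longest match
    for ch in data:
        if w + ch in table:
            w += ch
            continue
        codes.append(table[w])            # output the code of the longest match
        table[w + ch] = len(table)        # register the extended string
        if len(table) >= capacity:        # dictionary full: clear immediately
            table = fresh()
        w = ch
    if w:
        codes.append(table[w])            # flush the pending match
    return codes


data = "ab" * 50
small = lzw_encode_limited(data, capacity=258)     # room for only 2 learned strings
large = lzw_encode_limited(data, capacity=4096)
print(len(small), len(large) < len(small))
# 100 True
```

With room for only two learned strings, the dictionary is cleared before any learned string can be reused, so each of the 100 input characters is emitted as its own code; the larger dictionary learns progressively longer strings and emits fewer codes. This illustrates only the small-capacity side of the trade-off. A matching decoder would have to clear its dictionary at the same points, which is not shown here.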

[0047] The present invention was made in view of these conventional problems. Its object is to perform Ziv-Lempel coding using a plurality of rewritable dictionaries with different registration capacities, thereby achieving efficient Ziv-Lempel coding that responds both to the global properties of the input data and to local changes in it, and improving the compression ratio of data compression by Ziv-Lempel coding.

[0048]

[Means for Solving the Problems] FIG. 1 is a diagram of the principle of the present invention (the first invention). The first invention is a data encoding method that compresses data by universal coding using Ziv-Lempel codes, characterized by comprising: a first dictionary 1 that is initialized each time the input data has been Ziv-Lempel encoded over a certain fixed section; a second dictionary 2 that can register all the dictionary data generated in the Ziv-Lempel encoding process over a plurality of consecutive such sections of the input data; first encoding means 3 that Ziv-Lempel encodes the input data using the first dictionary 1 and registers new dictionary data arising in this encoding process in the first dictionary 1; second encoding means 4 that Ziv-Lempel encodes the data input to the first encoding means 3 using the second dictionary 2 and registers new dictionary data arising in this encoding process in the second dictionary 2; and compressed data output means 5 that, for each such section, compares the data volume of the Ziv-Lempel code sequence obtained by the first encoding means 3 with that of the Ziv-Lempel code sequence obtained by the second encoding means 4, and outputs the code sequence with the smaller data volume together with a flag indicating the dictionary used for that encoding.
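The arrangement of two dictionaries, two encoders, and an output selector can be sketched in Python under several simplifying assumptions that do not come from the patent: a section is a fixed number of input characters (`section_len`), both encoders flush their pending match at every section boundary, the data volume of a code sequence is estimated as the number of codes times the bit width needed for the current dictionary size, and ties go to the first dictionary. All function names are hypothetical.

```python
def fresh_table():
    """Initial dictionary: single-character strings for code points 0 to 255."""
    return {chr(i): i for i in range(256)}


def lzw_section(section, table):
    """LZW-encode one section, registering new strings in `table` in place."""
    codes = []
    w = ""
    for ch in section:
        if w + ch in table:
            w += ch
        else:
            codes.append(table[w])
            table[w + ch] = len(table)
            w = ch
    if w:
        codes.append(table[w])        # flush at the section boundary
    return codes


def code_bits(codes, table):
    """Estimated data volume: one fixed-width code per reference number."""
    return len(codes) * (len(table) - 1).bit_length()


def encode_two_dictionaries(data, section_len=8):
    """Return a list of (flag, codes) pairs, one per section."""
    dict2 = fresh_table()             # second dictionary: kept across sections
    out = []
    for start in range(0, len(data), section_len):
        section = data[start:start + section_len]
        dict1 = fresh_table()         # first dictionary: initialized per section
        codes1 = lzw_section(section, dict1)
        codes2 = lzw_section(section, dict2)
        if code_bits(codes1, dict1) <= code_bits(codes2, dict2):
            out.append((1, codes1))   # flag 1: first dictionary was used
        else:
            out.append((2, codes2))   # flag 2: second dictionary was used
    return out


for flag, codes in encode_two_dictionaries("abcdefgh" * 4):
    print(flag, codes)
```

For this repetitive input the first section is a tie and is output with flag 1, while the later sections are output with flag 2 because the second dictionary has already learned strings from earlier sections. Note that the second dictionary is updated even for sections whose output comes from the first dictionary, so a decoder following this sketch would need to reproduce those updates from the restored data; how the patent keeps the decoder's dictionaries in step is not covered by this passage.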

【００４９】上記第１の発明において前記一定区間は、
例えば、請求項２記載のように前記第１の辞書１が初期
設定されてから前記第２の符号化手段４がジブ・レンペ
ル符号化したデータ数が「０」からある特定の個数にな
るまでの期間であるように定義してもよい。In the first invention, the fixed section may be defined, for example, as described in claim 2, as the period from when the first dictionary 1 is initialized until the number of data items Ziv-Lempel encoded by the second encoding means 4 goes from "0" to a specific number.

【００５０】また、さらには、前記一定区間は、例え
ば、請求項３記載のように、前記第１の辞書１が初期設
定されてから、その登録容量が一杯になるまでの期間で
あるように定義してもよい。Furthermore, the fixed section may be defined, for example, as described in claim 3, as the period from when the first dictionary 1 is initialized until its registration capacity becomes full.

【００５１】また、さらに、前記一定区間は、例えば請
求項４記載のように、前記第１の辞書１の登録容量が一
杯になってから、前記第１の符号手段３から出力される
ジブ・レンペル符号列の入力データに対する圧縮率があ
る下限値まで低下するまでの期間であるように定義して
もよい。Furthermore, the fixed section may be defined, for example, as described in claim 4, as the period from when the registration capacity of the first dictionary 1 becomes full until the compression ratio, relative to the input data, of the Ziv-Lempel code string output from the first encoding means 3 falls to a certain lower limit.

【００５２】さらに、前記一定区間は、例えば、請求項
５記載のように、第１の符号化手段３と第２の符号化手
段４が１個のジブ・レンペル符号を出力する期間である
ように設定してもよい。Furthermore, the fixed section may be set, for example, as described in claim 5, as the period in which the first encoding means 3 and the second encoding means 4 output one Ziv-Lempel code.

【００５３】そして、上記のような、各種構成におい
て、前記第１の符号化手段３と前記第２の符号化手段４
は、並行して同一入力データのジブ・レンペル符号化を
行うような構成にしてもよい。In each of the configurations described above, the first encoding means 3 and the second encoding means 4 may be configured to Ziv-Lempel encode the same input data in parallel.

【００５４】次に図２は、もう１つの本発明（第２の発
明）の原理図である。この第２の発明は、上記第１の発
明のデータ符号化方式によってジブ・レンペル符号化さ
れた圧縮データを復元する復元方式であって、圧縮デー
タがある特定の一定区間にわたって復元される毎に初期
設定される第１の辞書１１と、前記圧縮データの連続す
る複数の前記一定区間にわたって復元過程で生成される
辞書データを登録できる第２の辞書１２と、前記フラグ
を参照して、復元すべき圧縮データの圧縮の際に用いら
れた辞書が、上記第１の発明の前記第１の辞書１または
前記第２の辞書２のいずれかであるかを判断して、第１
の辞書１１または第２の辞書１２のいずれか一方を選択
し、この辞書を用いて前記圧縮データを復元すると共
に、必要に応じて上記辞書にこの復元により得られた辞
書データを登録する復元手段１３と、を備えたことを特
徴とする。Next, FIG. 2 is a principle diagram of the other present invention (second invention). The second invention is a restoration method for restoring compressed data that was Ziv-Lempel encoded by the data encoding method of the first invention, characterized by comprising: a first dictionary 11 that is initialized every time the compressed data has been restored over a specific fixed section; a second dictionary 12 capable of registering the dictionary data generated in the restoration process over a plurality of consecutive fixed sections of the compressed data; and restoration means 13 that refers to the flag to determine whether the dictionary used to compress the compressed data to be restored was the first dictionary 1 or the second dictionary 2 of the first invention, selects the first dictionary 11 or the second dictionary 12 accordingly, restores the compressed data using the selected dictionary, and registers in that dictionary, as necessary, the dictionary data obtained by this restoration.

【００５５】この第２の発明は、上記構成において、例
えば、前記一定区間は、前記第１の辞書１１が初期設定
されてから前記復元手段１３により復元された復元デー
タのデータ長がある特定の値に等しくなるまでの期間で
あるように定義してもよい。In the above configuration of the second invention, the fixed section may be defined, for example, as the period from when the first dictionary 11 is initialized until the data length of the data restored by the restoration means 13 becomes equal to a specific value.

【００５６】また、さらに、前記一定区間は、例えば、
請求項９記載のように、前記第１の辞書１１が初期設定
されてから、その登録容量が一杯になるまでの期間であ
るように定義してもよい。Furthermore, the fixed section may be defined, for example, as described in claim 9, as the period from when the first dictionary 11 is initialized until its registration capacity becomes full.

【００５７】また、さらには、前記一定区間は、例え
ば、請求項１０記載のように前記復元手段１３が１つの
ジブ・レンペル符号を入力してから、このジブ・レンペ
ル符号を復元するまでの期間であるように定義してもよ
い。Furthermore, the fixed section may be defined, for example, as described in claim 10, as the period from when the restoration means 13 inputs one Ziv-Lempel code until it restores that code.

【００５８】そして、上記各種構成において、前記復元
手段１３は、前記第１の辞書１２を用いた前記圧縮デー
タの復元と前記第２の辞書１２を用いた前記圧縮データ
の復元を、並行して同時に行うような構成にしてもよ
い。In each of the above configurations, the restoration means 13 may be configured to restore the compressed data using the first dictionary 11 and to restore the compressed data using the second dictionary 12 simultaneously, in parallel.

【００５９】[0059]

【作用】まず、図１に示す第１の発明においては、例え
ば、第１の辞書１及び第２の辞書２に、予めアルファベ
ット、かな、英数字等の１文字が対応するジブ・レンペ
ル符号と、対応付けられて登録されている（初期設
定）。First, in the first invention shown in FIG. 1, single characters such as letters of the alphabet, kana, and alphanumerics are, for example, registered in advance in the first dictionary 1 and the second dictionary 2 in association with their corresponding Ziv-Lempel codes (initial setting).

【００６０】そして、データが入力されると、第１の符
号化手段３は第１の辞書１を、第２の符号化手段４は第
２の辞書２を参照して、その入力データに対応するジブ
・レンペル符号がそれぞれの辞書に登録されてあるか調
べる。そして、登録されてあれば、次のデータを入力
し、先の入力データに今度の入力データを加えた文字列
がそれぞれの辞書に登録されてあるか調べる。第１の符
号化手段１と第２の符号化手段２は、このような辞書
１，２の検索処理を、入力した文字列がそれぞれの辞書
に登録されていないことが分かるまで繰り返す。そし
て、第１の符号化手段１と第２の符号化手段２は、現在
までに入力した文字列がそれぞれの辞書１，２に登録さ
れていないことが分かると、前回までの入力文字列に対
応するジブ・レンペル符号を出力すると共に、今回まで
の入力文字列にまだ未使用のジブ・レンペル符号を割り
当て、それぞれの辞書１，２に登録すると共に、今度
は、今回入力した文字（最新の入力文字）を次にジブ・
レンペル符号化する文字列の先頭文字として、上述した
処理を繰り返す。When data is input, the first encoding means 3 refers to the first dictionary 1 and the second encoding means 4 refers to the second dictionary 2, and each checks whether a Ziv-Lempel code corresponding to the input data is registered in its dictionary. If it is registered, the next data item is input, and each means checks whether the character string formed by appending the new input to the previous input is registered in its dictionary. The first encoding means 3 and the second encoding means 4 repeat this search of the dictionaries 1 and 2 until they find that the input character string is not registered. When the character string input so far is found not to be registered in the respective dictionary 1 or 2, each means outputs the Ziv-Lempel code corresponding to the character string up to the previous input, assigns a still-unused Ziv-Lempel code to the character string up to the current input, and registers it in its dictionary. It then repeats the above processing with the character just input (the latest input character) as the first character of the next character string to be Ziv-Lempel encoded.
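As an illustration only, the search-and-register loop described above can be sketched in Python roughly as follows. The function name `lzw_encode`, the integer codes starting at 256, the 256 single-byte initial entries, and the final flush of the pending string are assumptions made for this sketch and are not taken from the specification. A single unbounded dictionary is shown, whereas the first and second encoding means would each run such a loop against their own dictionary.

```python
def lzw_encode(data: bytes) -> list[int]:
    # Initial setting (assumed): every single byte is pre-registered with its own code.
    dictionary = {bytes([i]): i for i in range(256)}
    codes = []
    current = b""  # longest string matched so far
    for value in data:
        candidate = current + bytes([value])
        if candidate in dictionary:
            # Still registered: keep extending the string with the next input.
            current = candidate
        else:
            # Not registered: output the code of the string up to the previous input,
            # assign the next unused code to the string up to the current input,
            # and restart from the latest input character.
            codes.append(dictionary[current])
            dictionary[candidate] = len(dictionary)
            current = bytes([value])
    if current:
        # Flush the pending string at end of input (assumption of this sketch).
        codes.append(dictionary[current])
    return codes


print(lzw_encode(b"ABABABA"))  # [65, 66, 256, 258]
```

In this example the strings "AB", "BA", and "ABA" receive codes 256, 257, and 258, so the seven input bytes are represented by four codes.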

【００６１】このような処理が、何度も繰り返される
と、第１及び第２の辞書１，２には、どんどん新たな文
字列が登録されてゆく。そして、やがて、容量が小さい
第１の辞書１の登録容量が一杯になる。As this processing is repeated many times, new character strings are registered one after another in the first and second dictionaries 1 and 2. Eventually the first dictionary 1, which has the smaller capacity, becomes full.

【００６２】第１の辞書１がこのような状態になると、
第１符号化手段１は、所定のタインミングで、第１の辞
書１を、上述のように初期設定する。したがって、第１
の辞書１には、これから入力されるデータに対応する文
字列を再び登録されるようになる。When the first dictionary 1 reaches this state, the first encoding means 3 initializes the first dictionary 1 at a predetermined timing, as described above. Character strings corresponding to the data input from then on can therefore be registered in the first dictionary 1 again.

【００６３】一方、容量の大きい第２の辞書２には、最
初の入力データから最新の入力データまでの文字列の中
に現れる未登録がどんどん登録されていく。このため、
第２の辞書２には、入力データの大局的な性質を示す多
量の文字列がどんどん蓄積される。The second dictionary 2, which has the larger capacity, meanwhile keeps registering the unregistered character strings that appear anywhere between the first input data and the latest input data. The second dictionary 2 therefore accumulates a large number of character strings that reflect the global properties of the input data.

【００６４】他方、第１の辞書１は、入力データの一定
区間毎に初期設定されて、新たな登録を開始するので、
登録される文字列は入力データの局所的な性質を反映し
たものとなる。したがって第１の辞書１は入力データの
性質（種類）の変化に対応し易く、入力データの性質が
変化した場合には、第１の辞書１を用いた方が第２の辞
書２を用いた場合よりも圧縮率が高くなる。The first dictionary 1, by contrast, is initialized for each fixed section of the input data and then starts registering anew, so the character strings registered in it reflect the local properties of the input data. The first dictionary 1 therefore adapts easily to changes in the properties (type) of the input data, and when those properties change, using the first dictionary 1 gives a higher compression ratio than using the second dictionary 2.

【００６５】このため、第１の符号化手段３と第２の符
号化手段４から出力される入力データの圧縮データ（ジ
ブ・レンペル符号列）のデータ量を、上記一定区間毎に
比較して、よりデータ量の少ない圧縮データを選択出力
することにより、データの圧縮率を従来よりも向上させ
ることができる。また、圧縮データ出力手段５は、上記
圧縮データの出力の際に、この圧縮データが第１の辞書
１または第２の辞書２のいずれの辞書を用いてジブ・レ
ンペル符号化されたものであるかを示すフラグも出力す
る。このことにより、圧縮データの復元側は、復元時に
どの辞書を用いればよいのか、復元前に知ることができ
るので、入力する圧縮データの復元を正しく行うことが
できる。Accordingly, by comparing, for each fixed section, the data amounts of the compressed data (Ziv-Lempel code strings) output from the first encoding means 3 and the second encoding means 4, and selectively outputting the compressed data with the smaller data amount, the data compression ratio can be improved over the conventional method. When outputting the compressed data, the compressed data output means 5 also outputs a flag indicating whether the data was Ziv-Lempel encoded using the first dictionary 1 or the second dictionary 2. The restoration side can thus know before restoration which dictionary to use, so the input compressed data can be restored correctly.

【００６６】次に、図２に示す上記第２の発明において
は、まず、第１の辞書１１と第２の辞書１２が、上記第
１の発明の辞書１１，１２と同様に初期設定される。こ
の初期設定が終了すると、復元手段１３は上記第１の発
明により生成された圧縮データの復元を開始する。この
圧縮データは、（フラグ、圧縮データ）の複数の組の系
列から成り、復元手段１３は、まずフラグを入力して、
続いて入力する圧縮データが第１の辞書１または第２の
辞書２のいずれの辞書を用いて圧縮されたものであるか
を判断し、第１の辞書１の場合には第１の辞書１１を、
第２の辞書２の場合には第２の辞書１２を参照して、以
後入力する圧縮データを前記一定区間単位で復元してい
く。また、復元手段１３は、この復元過程において、上
記第１の発明の第１の符号化手段３及び第２の符号化手
段４と同様にして、第１の辞書１及び第２の辞書２に、
未登録の復元データをそのジブ・レンペル符号と対応付
けて登録していく。また、復元手段１３は、上記第１の
発明の場合と同様に、一定区間のデータ復元が終了する
毎に第１の辞書１１を初期設定する。したがって、この
第１の辞書１１も、上記第１の発明の第１の辞書１と同
様にして、登録と初期設定が行われる。したがって、復
元手段３は、上記第１の発明から出力される圧縮データ
を正確に復元することができる。Next, in the second invention shown in FIG. 2, the first dictionary 11 and the second dictionary 12 are first initialized in the same way as the dictionaries 1 and 2 of the first invention. When this initialization is complete, the restoration means 13 starts restoring the compressed data generated by the first invention. This compressed data consists of a sequence of (flag, compressed data) pairs. The restoration means 13 first inputs the flag and determines whether the compressed data that follows was compressed using the first dictionary 1 or the second dictionary 2. It then refers to the first dictionary 11 in the former case and to the second dictionary 12 in the latter, and restores the subsequently input compressed data in units of the fixed section. During this restoration, the restoration means 13 registers unregistered restored data in the first dictionary 11 and the second dictionary 12 in association with its Ziv-Lempel code, in the same way as the first encoding means 3 and the second encoding means 4 of the first invention. As in the first invention, the restoration means 13 also initializes the first dictionary 11 every time the data of one fixed section has been restored. The first dictionary 11 is thus registered and initialized in the same way as the first dictionary 1 of the first invention, so the restoration means 13 can correctly restore the compressed data output by the first invention.
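The restoration flow described above might be sketched in Python as follows. This is a hedged illustration, not the patented implementation: the flag values `"first"` and `"second"`, the function names, the capacities, and the 256 single-byte initial entries are all hypothetical. The sketch also assumes that the encoder flushes its pending string at the end of every fixed section, and it keeps the second dictionary in step with the encoder by replaying each restored section through the encoder's registration rule, a detail the text does not spell out.

```python
def initial_table():
    # Initial setting (assumed): every single-byte string is pre-registered.
    return {bytes([i]): i for i in range(256)}


def register_block(table, block, capacity):
    # Replay the encoder's registration rule on restored data so that this
    # dictionary ends up identical to the encoder's after the same section.
    current = b""
    for value in block:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            if len(table) < capacity:
                table[candidate] = len(table)
            current = bytes([value])


def decode_block(codes, table, capacity):
    # Ordinary LZW decoding, starting from a snapshot of `table`.
    if not codes:
        return b""
    strings = {code: string for string, code in table.items()}
    previous = strings[codes[0]]
    restored = bytearray(previous)
    for code in codes[1:]:
        # A code not yet known can only be "previous + its own first byte".
        entry = strings.get(code, previous + previous[:1])
        if len(strings) < capacity:
            strings[len(strings)] = previous + entry[:1]
        restored += entry
        previous = entry
    return bytes(restored)


def decode(stream, first_capacity=512, second_capacity=4096):
    second_table = initial_table()  # persists across sections
    restored = bytearray()
    for flag, codes in stream:
        if flag == "second":
            block = decode_block(codes, second_table, second_capacity)
        else:
            # The first dictionary is initialized for every section.
            block = decode_block(codes, initial_table(), first_capacity)
        register_block(second_table, block, second_capacity)
        restored += block
    return bytes(restored)


stream = [("first", [97, 98, 256, 256]), ("second", [258, 257, 98])]
print(decode(stream))  # b'abababababab'
```

Because the first dictionary is reinitialized for every section, only the second dictionary needs state that persists between sections in this sketch.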

【００６７】[0067]

【実施例】以下、図面を参照しながら、本発明の実施例
を説明する。図３は、本発明の一実施例のデータ符号化
システムの基本構成図である。Embodiments of the present invention will be described below with reference to the drawings. FIG. 3 is a basic configuration diagram of a data encoding system according to one embodiment of the present invention.

【００６８】同図において、原データ１１０は、文字コ
ード、画像情報等の複数種類の情報が混在したデータで
あり、１つのファイルに格納されている。また、ＬＺＷ
符号化Ａ系１２０は、原データ１１０のＬＺＷ符号化を
行うＬＺＷ符号器Ａと大局辞書１２１とから成り、原デ
ータ１１０の大局的な性質を大局辞書１２１により学習
しながら、ＬＺＷ符号器Ａにより上記原データ１１０の
ＬＺＷ符号化を行う。大局辞書１２１は、十分に大きな
辞書容量を持つ書き換え可能な辞書であり、ＬＺＷ符号
器Ａによる上記原データ１１０のＬＺＷ符号化の過程
で、原データ１１０の大局的な性質が反映された辞書と
成る。In the figure, original data 110 is data in which several types of information, such as character codes and image information, are mixed, and it is stored in a single file. The LZW encoding system A 120 consists of an LZW encoder A, which LZW-encodes the original data 110, and a global dictionary 121; it LZW-encodes the original data 110 with the LZW encoder A while learning the global properties of the original data 110 in the global dictionary 121. The global dictionary 121 is a rewritable dictionary with a sufficiently large capacity, and in the course of the LZW encoding of the original data 110 by the LZW encoder A it becomes a dictionary that reflects the global properties of the original data 110.

【００６９】一方、ＬＺＷ符号化Ｂ系１３０は、原デー
タ１１０のＬＺＷ符号化を行うＬＺＷ符号器Ｂと局所辞
書１３１から成り、原データ１１０の各部の性質の変化
（局所的な変化）を局所辞書１３１により把握・学習し
ながら、上記原データ１１０のＬＺＷ符号化を行う。局
所辞書１３１は、上記大局辞書１２１よりも小さな辞書
容量を持つ書き換え可能な辞書であり、ＬＺＷ符号器Ｂ
による上記原データ１１０のＬＺＷ符号化の過程で、あ
る特定の入力データ数（以後、ブロックと表現する）単
位で、何度もクリア（初期設定）されながら新登録を繰
り返すことにより、原データ１１０の局所的な変化に対
応した辞書と成る。The LZW encoding system B 130, on the other hand, consists of an LZW encoder B, which LZW-encodes the original data 110, and a local dictionary 131; it LZW-encodes the original data 110 while tracking and learning, in the local dictionary 131, the changes in the properties of each part of the original data 110 (local changes). The local dictionary 131 is a rewritable dictionary with a smaller capacity than the global dictionary 121. In the course of the LZW encoding of the original data 110 by the LZW encoder B, it is cleared (initialized) repeatedly, once per specific number of input data items (hereinafter called a block), and registration starts anew each time, so that it becomes a dictionary that follows the local changes of the original data 110.

【００７０】これらＬＺＷ符号化Ａ系１２０とＬＺＷ符
号化Ｂ系１３０は、それぞれＬＺＷ符号器Ａ、ＬＺＷ符
号器Ｂにより原データ１１０のＬＺＷ符号化を上記ブロ
ック単位で並行して行い、それぞれのＬＺＷ符号化によ
り得られた上記原データ１１０の各ブロックの圧縮デー
タＡ、圧縮データＢをそれぞれ、バッファＡ１５０、バ
ッファＢ１６０に格納する。圧縮率比較器１４０は、バ
ッファＡ１５０に格納されているＬＺＷ符号データのデ
ータ量とバッファＢ１６０に格納されているＬＺＷ符号
データのデータ量を基に、ＬＺＷ符号器Ａによる原デー
タ１１０の各ブロックのデータの圧縮率ＡとＬＺＷ符号
器Ｂによる原データ１１０の各ブロックのデータの圧縮
率Ｂを、同一ブロック同士で比較し、上記圧縮データＡ
と上記圧縮データＢの内、圧縮率の高い方の圧縮データ
を選択する旨を選択信号ａによりＭＰＸ（マルチプレク
サ）１７０に指示する。The LZW encoding system A 120 and the LZW encoding system B 130 LZW-encode the original data 110 in parallel, block by block, with the LZW encoder A and the LZW encoder B respectively, and store the resulting compressed data A and compressed data B for each block of the original data 110 in buffer A 150 and buffer B 160. Based on the data amount of the LZW code data stored in buffer A 150 and that stored in buffer B 160, the compression ratio comparator 140 compares, block for block, the compression ratio A achieved by the LZW encoder A with the compression ratio B achieved by the LZW encoder B for each block of the original data 110. It then uses the selection signal a to instruct the MPX (multiplexer) 170 to select whichever of compressed data A and compressed data B has the higher compression ratio.

【００７１】ＭＰＸ１７０は、上記圧縮率比較器１４０
から加わる上記選択信号ａの指示に従って、上記バッフ
ァＡ１５０または上記バッファＢ１６０のいずれか一方
に格納されているデータ量のより小さい方の圧縮データ
を選択出力する。In accordance with the selection signal a supplied from the compression ratio comparator 140, the MPX 170 selects and outputs whichever compressed data, stored in buffer A 150 or buffer B 160, has the smaller data amount.

【００７２】ＭＰＸ１７０は、上記圧縮データの選択出
力を、上記原データ１１０の各ブロック毎に行うので、
ＭＰＸ１７０から出力される圧縮データは、離散した時
系列のデータとなる。また、ＭＰＸ１７０は、圧縮デー
タを選択出力する際、その先頭に、上記圧縮データが大
局辞書１２１または局所辞書１３１のいずれの辞書を用
いて得られたものであるかを示す辞書フラグを付けて出
力する。Because the MPX 170 performs this selective output for each block of the original data 110, the compressed data output from the MPX 170 is a discrete time series. When it selects and outputs compressed data, the MPX 170 prefixes it with a dictionary flag indicating whether the compressed data was obtained using the global dictionary 121 or the local dictionary 131.

【００７３】このＭＰＸ１７０から出力される圧縮デー
タ系列１８０の構成を図４に示す。同図に示す（辞書フ
ラグ１８１，圧縮データ１８２）の組は、上記圧縮率
Ａ，Ｂを比較する原データ１１０のＬＺＷ符号化のブロ
ック単位（符号化ブロック単位）で、ＭＰＸ１７０から
出力される。FIG. 4 shows the structure of the compressed data sequence 180 output from the MPX 170. Each (dictionary flag 181, compressed data 182) pair shown in the figure is output from the MPX 170 per LZW encoding block (encoding block unit) of the original data 110, which is the unit over which the compression ratios A and B are compared.

【００７４】続いて、本発明の一実施例のデータ符号化
（データ圧縮）方式のアルゴリズムを図５のフローチャ
ートを参照しながら説明する。尚、この例では、ＬＺＷ
符号化Ａ系１２０のＬＺＷ符号器ＡとＬＺＷ符号化Ｂ系
１３０のＬＺＷ符号器Ｂは、共に不図示の入力カウンタ
を備えている。この入力カウンタは、入力データである
原データ１１０の符号化単位である１ブロックのデータ
数（例えば、１００Ｋバイト）を計数するために使用さ
れる減算カウンタである。また、原データ１１０からの
データ入力は、１バイト単位で行い、この１バイトデー
タを文字コード以外のデータ（例えば、画像情報）であ
っても１文字として取り扱い、複数の文字から成る入力
データを文字列と表現する。Next, the algorithm of the data encoding (data compression) method according to one embodiment of the present invention is described with reference to the flowchart of FIG. 5. In this example, the LZW encoder A of the LZW encoding system A 120 and the LZW encoder B of the LZW encoding system B 130 each have an input counter (not shown). This input counter is a down counter used to count the number of data items in one block (for example, 100 Kbytes), which is the encoding unit of the original data 110, the input data. Data is input from the original data 110 one byte at a time; each byte is treated as one character even if it is not a character code (for example, image information), and input data consisting of several characters is referred to as a character string.

【００７５】まず、ＬＺＷ符号器ＡとＬＺＷ符号器Ｂは
共に、それぞれの入力カウンタに、原データ１１０の符
号化単位である１ブロックのデータ数（ブロックサイ
ズ）を、初期値として設定する（ステップＳ７０１）。First, the LZW encoder A and the LZW encoder B each set the number of data items in one block (the block size), which is the encoding unit of the original data 110, as the initial value of their input counter (step S701).

【００７６】続いて、ＬＺＷ符号器ＡとＬＺＷ符号器Ｂ
は、該当入力ファイルから原データ１１０の最初のデー
タを入力する（ステップＳ７０２）。次に、ＬＺＷ符号
器ＡとＬＺＷ符号器Ｂは、上記入力データが上記該当入
力ファイルの終了を示す「ＥＯＦ」(End of File) であ
るか否か判別し（ステップＳ７０３）、「ＥＯＦ」であ
れば全ての原データ１１０のＬＺＷ符号化が終了したの
で、直ちに処理を終了するが、入力データが「ＥＯＦ」
でなければ、入力カウンタを「１」デクリメントして
（ステップＳ７０４）、このデクリメントされた入力カ
ウンタの値が「０」（ＣＴ＝０）になったか否か判別す
る（ステップＳ７０５）。このステップＳ７０５の判別
処理は、原データ１１０におけるある１ブロックの入力
データのＬＺＷ符号化が終了したか否かを判別する処理
である。Next, the LZW encoder A and the LZW encoder B input the first data item of the original data 110 from the input file (step S702). They then determine whether the input data is "EOF" (End of File), which marks the end of the input file (step S703). If it is "EOF", LZW encoding of all the original data 110 is complete and processing ends immediately. If it is not "EOF", the input counter is decremented by 1 (step S704), and it is determined whether the decremented value of the input counter has reached "0" (CT = 0) (step S705). The determination in step S705 checks whether LZW encoding of the input data of one block of the original data 110 has finished.
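The block-boundary detection in steps S701 to S705 can be illustrated with a short Python sketch. The generator name `read_blocks` is hypothetical, and two details are assumptions of the sketch rather than statements of the specification: the byte that brings the counter to zero is counted as part of the current block, and a partial final block is still yielded when "EOF" is reached.

```python
import io


def read_blocks(stream, block_size):
    counter = block_size              # S701: preset the down counter
    block = bytearray()
    while True:
        data = stream.read(1)         # S702: input one byte
        if not data:                  # S703: "EOF" reached
            if block:
                yield bytes(block)    # assumption: emit the partial last block
            return
        block += data
        counter -= 1                  # S704: decrement the counter
        if counter == 0:              # S705: one block has been consumed
            yield bytes(block)
            block.clear()
            counter = block_size      # preset again for the next block


print(list(read_blocks(io.BytesIO(b"abcdefg"), 3)))  # [b'abc', b'def', b'g']
```

In the embodiment the block size is on the order of 100 Kbytes; the value 3 here is chosen only to keep the example readable.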

【００７７】この判別で、ＣＴ＝０でないときは、ＬＺ
Ｗ符号器ＡとＬＺＷ符号器Ｂは、原データ１１０のある
１ブロックデータのＬＺＷ符号化がまだ終了していない
ものと判断し、上記ステップＳ７０２で入力されたデー
タのＬＺＷ符号化を同時に並行して開始する（Ｓ７０
６）。If this determination finds that CT is not 0, the LZW encoder A and the LZW encoder B judge that LZW encoding of the current block of the original data 110 has not yet finished, and they start LZW encoding the data input in step S702 simultaneously and in parallel (S706).

【００７８】この並列処理において、ＬＺＷ符号器Ａは
大局辞書１２１を用いて上記入力データのＬＺＷ符号化
を行い（ステップＳ７０７Ａ）、ＬＺＷ符号器Ｂは局所
辞書１３１を用いて上記入力データのＬＺＷ符号化を行
う（ステップＳ７０７Ｂ）。In this parallel processing, the LZW encoder A LZW-encodes the input data using the global dictionary 121 (step S707A), and the LZW encoder B LZW-encodes the input data using the local dictionary 131 (step S707B).

【００７９】そして、ＬＺＷ符号器Ａは、大局辞書Ａに
上記入力データに対応するＬＺＷ符号が登録されていれ
ば（Ｓ７０８Ａ，ＹＥＳ）、その入力データも加えた文
字列のＬＺＷ符号化を試みるため、再びステップＳ７０
２に戻り、次のデータを入力する。If an LZW code corresponding to the input data is registered in the global dictionary 121 (S708A, YES), the LZW encoder A returns to step S702 and inputs the next data item, in order to attempt LZW encoding of a longer character string that includes this input data.

【００８０】したがって、ＬＺＷ符号器Ａは、上記ステ
ップＳ７０８Ａで入力データ（１文字または文字列）が
大局辞書１２１に登録されていないと判断するまで、上
記ステップＳ７０２〜Ｓ７０６、ステップＳ７０７Ａ〜
Ｓ７０９Ａを繰り返す。The LZW encoder A therefore repeats steps S702 to S706 and steps S707A to S709A until it determines in step S708A that the input data (one character or a character string) is not registered in the global dictionary 121.

【００８１】そして、ＬＺＷ符号器Ａは、上記ステップ
Ｓ７０８Ａで、大局辞書１２１に今までに入力されたデ
ータ列（文字列）が登録されていないと判別すると（Ｓ
７０８１Ａ，ＮＯ）、前回のステップＳ７０７Ａで大局
辞書１２１から読み出したＬＺＷ符号をバッファＡ１５
０に格納すると共に、今回のステップＳ７０７Ａで大局
辞書１２１に登録されていないことが判明した入力デー
タ（一文字または文字列）に新たなＬＺＷ符号を割り当
て、これらの（入力データ，ＬＺＷ符号）の組を上記入
力データをインデックスとして大局辞書１２１に登録し
た後（ステップＳ７０９Ａ）、再びステップＳ７０２に
戻り上記該当入力ファイルから次にＬＺＷ符号化すべき
原データ１１０の新たなデータの入力を開始する。When the LZW encoder A determines in step S708A that the data string (character string) input so far is not registered in the global dictionary 121 (S708A, NO), it stores in buffer A 150 the LZW code read from the global dictionary 121 in the previous step S707A. It also assigns a new LZW code to the input data (one character or a character string) found in the current step S707A not to be registered in the global dictionary 121, and registers this (input data, LZW code) pair in the global dictionary 121 with the input data as the index (step S709A). It then returns to step S702 and starts inputting, from the input file, the new data of the original data 110 to be LZW-encoded next.

【００８２】ＬＺＷ符号器Ａは、以上のような動作を、
前記ステップＳ７０５で入力カウンタ値が「０」、すな
わち原データの最初のブロックのＬＺＷ符号化が全て終
了したと判断するまで行って、原データ１１０の最初の
ブロックの全データをＬＺＷ符号化し、このＬＺＷ符号
化により得られた上記原データ１１０の最初のブロック
の全データの圧縮データＡ（ＬＺＷ符号列）をバッファ
Ａ１５０に格納する。The LZW encoder A continues the above operation until the input counter value is found to be "0" in step S705, that is, until LZW encoding of the first block of the original data is judged to be complete. It thereby LZW-encodes all the data in the first block of the original data 110 and stores the resulting compressed data A (LZW code string) for that block in buffer A 150.

【００８３】一方、ＬＺＷ符号器Ｂも、ＬＺＷ符号器Ａ
と並行に、局所辞書１３１を用いて上述したＬＺＷ符号
器Ａと同様な処理を行い、原データ１１０の最初のブロ
ックの全データをＬＺＷ符号化し、このＬＺＷ符号化に
より得られた上記原データ１１０の最初のブロックの全
データの圧縮データＢをバッファＢ１６０に格納する
（ステップＳ７０２〜Ｓ７０６，ステップＳ７０７Ｂ〜
Ｓ７０９Ｂ）。In parallel with the LZW encoder A, the LZW encoder B performs the same processing using the local dictionary 131: it LZW-encodes all the data in the first block of the original data 110 and stores the resulting compressed data B for that block in buffer B 160 (steps S702 to S706, steps S707B to S709B).

【００８４】以上のようにして、ＬＺＷ符号器ＡとＬＺ
Ｗ符号器Ｂが、並行処理により原データ１１０の最初の
ブロックのＬＺＷ符号化を終了すると（Ｓ７０５，ＹＥ
Ｓ）、圧縮率比較器１４０は、バッファＡ１５０に格納
されているＬＺＷ符号器Ａにより生成された上記圧縮デ
ータＡとバッファＢ１６０に格納されているＬＺＷ符号
器Ｂにより生成された上記圧縮データＢの容量の大きさ
を比較し（ステップＳ７１０）、圧縮データＡの容量が
圧縮データＢの容量より小であれば（ステップＳ７１
０，ＹＥＳ）、ＭＰＸ１７０に対し、圧縮データＡの方
を選択する旨を選択信号ａにより指示する。When the LZW encoder A and the LZW encoder B have finished LZW encoding the first block of the original data 110 in parallel in this way (S705, YES), the compression ratio comparator 140 compares the size of the compressed data A generated by the LZW encoder A and stored in buffer A 150 with the size of the compressed data B generated by the LZW encoder B and stored in buffer B 160 (step S710). If the compressed data A is smaller than the compressed data B (step S710, YES), it uses the selection signal a to instruct the MPX 170 to select the compressed data A.

【００８５】ＭＰＸ１７０は、この選択信号ａを入力す
ると、まず大局辞書１２１を用いた圧縮データであるこ
とを示すフラグＡを出力ファイルに出力した後（ステッ
プＳ７１２）、バッファＡ１５０に格納されている大局
辞書１２１を用いた圧縮データＡを上記出力ファイルに
選択出力する（ステップＳ７１３）。On receiving this selection signal a, the MPX 170 first outputs to the output file a flag A indicating that the data was compressed using the global dictionary 121 (step S712), and then selects and outputs to the output file the compressed data A, obtained using the global dictionary 121 and stored in buffer A 150 (step S713).

【００８６】一方、圧縮率比較器１４０は、上記ステッ
プＳ７１０で、圧縮データＢの容量が圧縮データＡの容
量以下であると判断すると（ステップＳ７１０，Ｎ
Ｏ）、ＭＰＸ１７０に対し、圧縮データＢの方を選択出
力する旨を選択信号ａにより指示する。If, on the other hand, the compression ratio comparator 140 determines in step S710 that the size of the compressed data B is less than or equal to that of the compressed data A (step S710, NO), it uses the selection signal a to instruct the MPX 170 to select and output the compressed data B.

【００８７】ＭＰＸ１７０は、この選択信号ａを入力す
ると、まず局所辞書１３１を用いた圧縮データであるこ
とを示すフラグＢを上記出力ファイルに出力した後（Ｓ
７１３）、バッファＢ１６０に格納されている局所辞書
１３１を用いて得られた圧縮データＢを上記出力ファイ
ルに選択出力する（Ｓ７１４）。On receiving this selection signal a, the MPX 170 first outputs to the output file a flag B indicating that the data was compressed using the local dictionary 131 (S713), and then selects and outputs to the output file the compressed data B, obtained using the local dictionary 131 and stored in buffer B 160 (S714).

【００８８】このようにして、（フラグ、圧縮データ）
の組から成る最初のブロックの圧縮データが出力され
る。以上の処理が、原データ１１０の第２ブロック以降
の各ブロックについてブロック単位で行われ、ＬＺＷ符
号器Ａ並びにＬＺＷ符号器Ｂが、上記該当入力ファイル
内の原データ１１０の全データについてＬＺＷ符号化に
よるデータ圧縮が終了したと判断すると（ステップＳ７
０３，ＹＥＳ）、原データ１１０の全データのＬＺＷ符
号化を終了する。In this way, the compressed data of the first block, consisting of a (flag, compressed data) pair, is output. The same processing is carried out block by block for the second and subsequent blocks of the original data 110. When the LZW encoder A and the LZW encoder B determine that data compression by LZW encoding is complete for all of the original data 110 in the input file (step S703, YES), LZW encoding of the original data 110 ends.

【００８９】このようにして、原データの各ブロック毎
に大局辞書１２１を用いたＬＺＷ符号器Ａによる圧縮デ
ータＡと局所辞書１３１を用いたＬＺＷ符号器Ｂによる
圧縮データＢの作成が並行して行われ、各ブロック毎に
圧縮データＡまたは圧縮データＢのいずれか一方の圧縮
率の高い方の圧縮データがその圧縮に用いられた辞書を
示すフラグ（フラグＡまたはフラグＢ）と共に上記出力
ファイルに格納される。In this way, for each block of the original data, compressed data A is produced by the LZW encoder A using the global dictionary 121 and compressed data B is produced by the LZW encoder B using the local dictionary 131, in parallel. For each block, whichever of compressed data A and compressed data B has the higher compression ratio is stored in the output file together with a flag (flag A or flag B) indicating the dictionary used for the compression.
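The block-by-block selection summarized above could be sketched in Python along the following lines. The class and function names, the capacities, and the string flags `"A"` and `"B"` are hypothetical. The sketch runs the two encoders one after the other rather than in parallel, flushes the pending string at the end of each block, and compares the number of codes as a stand-in for the data amount, which implicitly assumes that both encoders use the same code width. None of these simplifications is stated in the specification.

```python
class LzwCoder:
    def __init__(self, capacity):
        self.capacity = capacity
        self.clear()

    def clear(self):
        # Initial setting (assumed): all 256 single-byte strings.
        self.table = {bytes([i]): i for i in range(256)}

    def encode_block(self, block):
        codes, current = [], b""
        for value in block:
            candidate = current + bytes([value])
            if candidate in self.table:
                current = candidate
                continue
            codes.append(self.table[current])
            if len(self.table) < self.capacity:   # stop registering when full
                self.table[candidate] = len(self.table)
            current = bytes([value])
        if current:
            codes.append(self.table[current])     # flush at block end (assumed)
        return codes


def encode(data, block_size, local_capacity=512, global_capacity=4096):
    coder_a = LzwCoder(global_capacity)   # global dictionary, never cleared
    coder_b = LzwCoder(local_capacity)    # local dictionary, cleared per block
    output = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        coder_b.clear()
        codes_a = coder_a.encode_block(block)
        codes_b = coder_b.encode_block(block)
        # Step S710: choose A only if it is strictly smaller, otherwise B.
        if len(codes_a) < len(codes_b):
            output.append(("A", codes_a))
        else:
            output.append(("B", codes_b))
    return output


print(encode(b"ababab" * 2, block_size=6))
# [('B', [97, 98, 256, 256]), ('A', [258, 257, 98])]
```

In this small example both dictionaries produce the same four codes for the first block, so the tie goes to flag B; for the second block the global dictionary already holds "ab", "ba", and "aba" and needs only three codes, so flag A is chosen.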

【００９０】したがって、原データ１１０を、各ブロッ
クのデータの性質に応じて各ブロック毎に最適なＬＺＷ
符号化を行い、原データ１１０を非常に高い圧縮率で効
率よく圧縮することができる。Each block of the original data 110 is thus LZW-encoded in the way best suited to the properties of its data, so the original data 110 can be compressed efficiently at a very high compression ratio.

【００９１】尚、上記実施例では、バッファＡ１５０と
バッファＢ１６０に格納されている圧縮データＡ，Ｂの
容量を比較して圧縮率の優劣を判断するようにしている
が、ＬＺＷ符号化Ａ系１２０並びにＬＺＷ符号化Ｂ系１
３０内に圧縮率を算出する手段を設け、この手段が、圧
縮率比較器１６０に対し直接に、圧縮データＡ，Ｂのそ
れぞれの圧縮率Ａ，Ｂを出力するような構成にしてもよ
い。In the above embodiment, the compression ratios are compared by comparing the sizes of the compressed data A and B stored in buffer A 150 and buffer B 160. Alternatively, means for calculating the compression ratio may be provided in the LZW encoding system A 120 and the LZW encoding system B 130, and these means may output the compression ratios A and B of the compressed data A and B directly to the compression ratio comparator 140.

【００９２】さらに、上記の場合には、多量の入力デー
タから成るブロック単位でデータ圧縮率の比較を行って
いるが、１回のＬＺＷ符号化毎に符号化された入力デー
タ列（検索一致文字列）のデータ長の比較を行い、デー
タ長の短い方のＬＺＷ符号を、逐次、圧縮データとして
出力していくような構成にしてもよい。Furthermore, in the above case the data compression ratios are compared in units of blocks consisting of a large amount of input data, but the configuration may instead compare, at each single LZW encoding, the data lengths of the encoded input data strings (the matched character strings) and successively output the LZW code with the shorter data length as compressed data.

【００９３】図６は、上記データ符号化方式により原デ
ータ１１０をブロック単位でＬＺＷ符号化する際の大局
辞書１２１と局所辞書１３１の使用方法の一例を示す図
である。FIG. 6 is a diagram showing an example of a method of using the global dictionary 121 and the local dictionary 131 when the original data 110 is subjected to LZW encoding in block units by the data encoding method.

【００９４】同図（ａ）は、上記データ符号化方式によ
り原データ１１０をブロック単位でＬＺＷ符号化してい
く過程での大局辞書１２１と局所辞書１３１の登録個数
の変化を示す図であり、横軸がＬＺＷ符号化したデータ
数（入力データ数）に、縦軸が大局辞書１２１及び局所
辞書１３１の登録個数に対応している。また、縦軸のＬ
_T ，Ｌ_K の各目盛は、それぞれ、大局辞書１２１及び局
所辞書１３１の登録容量を示している。FIG. 6(a) shows how the numbers of entries registered in the global dictionary 121 and the local dictionary 131 change as the original data 110 is LZW-encoded block by block with the above data encoding method. The horizontal axis is the number of data items LZW-encoded (the number of input data items), and the vertical axis is the number of entries registered in the global dictionary 121 and the local dictionary 131. The marks L_T and L_K on the vertical axis indicate the registration capacities of the global dictionary 121 and the local dictionary 131, respectively.

【００９５】また、同図（ｂ）は、同図（ａ）と同じデ
ータ符号化過程における大局辞書１２１を用いた場合と
局所辞書１３１を用いた場合での、各ブロック毎のデー
タ圧縮率の比較を示す図であり、横軸が同図（ａ）と同
じ入力データ数、縦軸が各ブロックでのデータ圧縮率に
対応している。FIG. 6(b) compares, for the same data encoding process as FIG. 6(a), the data compression ratio of each block when the global dictionary 121 is used and when the local dictionary 131 is used. The horizontal axis is the same number of input data items as in FIG. 6(a), and the vertical axis is the data compression ratio in each block.

【００９６】この図６に示す例では、ＬＺＷ符号器Ｂ
は、局所辞書１３１への登録個数が各ブロックのＬＺＷ
符号化の途中で一杯になった場合には、辞書登録を直ち
に停止する。また、ＬＺＷ符号器Ｂは、各ブロックのＬ
ＺＷ符号化を開始する前に、たえず局所辞書１３１をク
リア（初期設定）する。In the example shown in FIG. 6, if the local dictionary 131 becomes full partway through the LZW encoding of a block, the LZW encoder B immediately stops registering entries in it. The LZW encoder B also always clears (initializes) the local dictionary 131 before it starts LZW encoding each block.

【００９７】この局所辞書１３１のクリアは、例えば、
一文字から成る全文字を初期の辞書情報として登録する
初期化処理である。同図に示すように、１ブロック目の
ＬＺＷ符号化においては、局所辞書１３１と大局辞書１
２１の登録情報は同一であり、しかも局所辞書１３１の
登録個数が原データ１１０の１ブロック目のＬＺＷ符号
化過程の途中で満杯になるため、１ブロック目の圧縮率
は、大局辞書１２１を用いて得られるＬＺＷ符号器Ａの
出力する圧縮データＡの方が、局所辞書１３１を用いて
得られるＬＺＷ符号器Ｂの出力する圧縮データＢより
も、少しばかり圧縮率が高くなる。Clearing the local dictionary 131 is, for example, an initialization process that registers every single-character string as the initial dictionary information. As the figure shows, in the LZW encoding of the first block the local dictionary 131 and the global dictionary 121 hold the same registered information, but the local dictionary 131 becomes full partway through the LZW encoding of the first block of the original data 110. For the first block, the compressed data A output by the LZW encoder A using the global dictionary 121 therefore has a slightly higher compression ratio than the compressed data B output by the LZW encoder B using the local dictionary 131.
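The behaviour of the local dictionary 131 in the FIG. 6 example, namely stopping registration once it is full and clearing it before each block, might be modelled as in the following Python sketch. The class name `LocalDictionary`, its method names, and the capacity of 258 entries are hypothetical and chosen only to make the saturation visible with two registrations.

```python
class LocalDictionary:
    def __init__(self, capacity):
        self.capacity = capacity
        self.clear()

    def clear(self):
        # Clearing re-registers every single-character (single-byte) string.
        self.codes = {bytes([i]): i for i in range(256)}

    def register(self, string):
        # Once the dictionary is full, further registration is simply skipped.
        if len(self.codes) >= self.capacity:
            return False
        self.codes[string] = len(self.codes)
        return True


local = LocalDictionary(capacity=258)
print(local.register(b"ab"))   # True
print(local.register(b"ba"))   # True
print(local.register(b"aba"))  # False, the dictionary is full
local.clear()                  # done before the next block starts
print(len(local.codes))        # 256
```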

【００９８】また、原データ１１０の２ブロック目及び
３ブロック目のＬＺＷ符号化では、大局辞書１２１を用
いて得られる圧縮データＢの圧縮率の方が、局所辞書１
３１を用いて得られる圧縮データＡよりもさらに圧縮率
が高くなる。しかしながら、原データ１１０の４ブロッ
ク目のＬＺＷ符号化においては、入力データの性質が変
化したため、局所辞書１３１を用いて得られる圧縮デー
タＢの方が大局辞書１２１を用いて得られる圧縮データ
Ａよりも圧縮率が少しばかり高くなる。In the LZW encoding of the second and third blocks of the original data 110, the compressed data A obtained using the global dictionary 121 has an even higher compression ratio relative to the compressed data B obtained using the local dictionary 131. In the LZW encoding of the fourth block of the original data 110, however, the properties of the input data have changed, so the compressed data B obtained using the local dictionary 131 has a slightly higher compression ratio than the compressed data A obtained using the global dictionary 121.

【００９９】Accordingly, the compressed data sequence output from the MPX 170 is (flag A, compressed data A₁, flag A, compressed data A₂, flag A, compressed data A₃, flag B, compressed data B₄, ...). The subscript i (i = 1, 2, 3, 4, ...) attached to compressed data A and compressed data B is the number of the block whose compressed data it denotes.
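The per-block selection that produces such a sequence might be sketched as follows. This is a minimal sketch under stated assumptions: the names `select_per_block`, `lzw_encode`, `fresh_dictionary` and `code_width` are hypothetical, the data amount is estimated as the number of codes multiplied by a fixed code width sized for a full dictionary, and ties are resolved in favour of the global dictionary. None of these details is taken from the specification.

```python
def fresh_dictionary():
    """Initial dictionary: every single-character string."""
    return {bytes([i]): i for i in range(256)}


def lzw_encode(block, dictionary, max_entries):
    """LZW-encode one block, updating `dictionary` in place (frozen when full)."""
    codes, w = [], b""
    for byte in block:
        wc = w + bytes([byte])
        if wc in dictionary:
            w = wc
        else:
            codes.append(dictionary[w])
            if len(dictionary) < max_entries:
                dictionary[wc] = len(dictionary)
            w = bytes([byte])
    if w:
        codes.append(dictionary[w])
    return codes


def code_width(max_entries):
    """Bits per code, assuming fixed-width codes sized for a full dictionary."""
    return (max_entries - 1).bit_length()


def select_per_block(data, block_size, local_max, global_max):
    """Return [(flag, codes), ...], keeping the smaller encoding of each block."""
    global_dict = fresh_dictionary()  # persists across blocks
    stream = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        codes_a = lzw_encode(block, global_dict, global_max)
        codes_b = lzw_encode(block, fresh_dictionary(), local_max)  # cleared per block
        bits_a = len(codes_a) * code_width(global_max)
        bits_b = len(codes_b) * code_width(local_max)
        stream.append(("A", codes_a) if bits_a <= bits_b else ("B", codes_b))
    return stream
```

When both dictionaries have the same capacity, a repetitive input is served best by the global dictionary from the second block onward, because it already holds the strings learned in the earlier blocks.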

【０１００】Thus, in the example shown in FIG. 6, the number of entries of the local dictionary 131 is set to a small capacity that saturates in the latter half of the LZW encoding of each block, whereas the number of entries of the global dictionary 121 is set to a large capacity that does not saturate even when LZW encoding is performed across multiple blocks. FIGS. 7 and 9 show other ways of handling the case where the local dictionary 131 becomes full partway through the LZW encoding of a block.

【０１０１】In the example shown in FIG. 7, when the local dictionary 131 becomes full (saturated), either a pruning process or a frequency-based deletion process is performed to free space for registration, and subsequent new character strings are registered in the freed space. The pruning process deletes the components at the lowest layer (lowest level) of the decomposition-component tree in which the registered strings are stored, together with the characters attached to the branches connected to those components (nodes); for the levels shown in FIG. 8, these are the components X₃, X₉ and X₈ belonging to level 3 and the characters attached to the branches connected to them. The frequency-based process deletes registered strings preferentially, starting with those referenced least often, based on the reference frequency of each registered string. The structure of the decomposition-component tree is described in detail in section 3.1, "Incremental decomposition of the input sequence", of the previously cited Munakata, "Ziv-Lempel data compression method".
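The two deletion processes might be sketched as follows, treating the dictionary as a mapping from registered string to code so that a string's level in the tree equals its length. The function names, the `ref_counts` mapping and the restriction of frequency-based deletion to leaf strings are assumptions of this sketch; the specification does not state how freed codes are reused or how a decoder mirrors the deletion.

```python
def prune_deepest_level(entries: dict[bytes, int]) -> list[int]:
    """Delete every string on the deepest tree level; return the freed codes."""
    depth = max((len(s) for s in entries), default=0)
    if depth <= 1:
        return []  # only single-character entries remain; nothing to prune
    victims = [s for s in entries if len(s) == depth]
    return sorted(entries.pop(s) for s in victims)


def prune_least_referenced(entries: dict[bytes, int],
                           ref_counts: dict[bytes, int],
                           n: int) -> list[int]:
    """Delete up to n leaf strings, least-referenced first; return the freed codes."""
    # A string has a child if some registered string extends it by one character.
    has_child = {s[:-1] for s in entries if len(s) > 1}
    leaves = [s for s in entries if len(s) > 1 and s not in has_child]
    leaves.sort(key=lambda s: ref_counts.get(s, 0))
    return [entries.pop(s) for s in leaves[:n]]
```

Restricting the second function to leaves is a design choice of the sketch: deleting an inner node would leave its descendants unreachable when matches are extended one character at a time.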

【０１０２】In the example shown in FIG. 9, on the other hand, when the local dictionary 131 becomes full, the LRU (Least Recently Used) scheme is applied: the single oldest registered character string is deleted and a new character string is registered in its place.
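A replacement policy of this kind might be sketched with Python's `collections.OrderedDict`. The class name `LruDictionary` and its methods are hypothetical. The sketch assumes that a successful lookup counts as a use; if "oldest" is meant strictly in order of registration, the `move_to_end` call would simply be dropped. Single-character entries are never evicted, and the evicted string's code is reused for the new string.

```python
from collections import OrderedDict


class LruDictionary:
    """LZW dictionary that evicts the least recently used string when full."""

    def __init__(self, capacity: int):
        if capacity <= 256:
            raise ValueError("capacity must exceed the 256 single-character entries")
        self.capacity = capacity
        self.singles = {bytes([i]): i for i in range(256)}  # never evicted
        self.entries = OrderedDict()  # multi-character strings, oldest use first

    def lookup(self, s: bytes):
        """Return the code of s, or None if s is not registered."""
        if len(s) == 1:
            return self.singles[s]
        code = self.entries.get(s)
        if code is not None:
            self.entries.move_to_end(s)  # mark as most recently used
        return code

    def register(self, s: bytes) -> int:
        """Register s, evicting the least recently used string if full."""
        if 256 + len(self.entries) < self.capacity:
            code = 256 + len(self.entries)
        else:
            _, code = self.entries.popitem(last=False)  # reuse the evicted code
        self.entries[s] = code
        return code
```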

【０１０３】Furthermore, in the example shown in FIG. 10, a sufficiently large registration capacity is reserved for the local dictionary 131 in advance so that it never becomes full (saturated) during the LZW encoding of any block. In this case, the data compression ratio of the LZW encoder B is in most cases higher than with the local dictionary 131 of any of the other schemes described above.

【０１０４】Furthermore, in the example shown in FIG. 11, as soon as the local dictionary 131 becomes full it is cleared, even if LZW encoding has progressed only partway through the block, and dictionary registration then starts afresh.
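A minimal sketch of the clear-when-full policy of FIG. 11 follows. It assumes that the clear is triggered at the moment a new string no longer fits, and that the new string is then registered as the first entry of the cleared dictionary; the specification does not fix this timing. The function name and the returned clear count are illustrative only.

```python
def lzw_encode_clear_on_full(data: bytes, max_entries: int):
    """LZW-encode data, clearing the dictionary whenever a new entry no longer fits."""

    def fresh():
        return {bytes([i]): i for i in range(256)}

    dictionary = fresh()
    codes, clears, w = [], 0, b""
    for byte in data:
        wc = w + bytes([byte])
        if wc in dictionary:
            w = wc
            continue
        codes.append(dictionary[w])
        if len(dictionary) >= max_entries:
            dictionary = fresh()  # full: clear immediately, mid-block if need be
            clears += 1
        dictionary[wc] = len(dictionary)  # registration restarts after the clear
        w = bytes([byte])
    if w:
        codes.append(dictionary[w])
    return codes, clears
```

A decoder can stay in step without extra signalling as long as it applies the same capacity test at the same points.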

【０１０５】Furthermore, in the example shown in FIG. 12, when the local dictionary 131 becomes full (saturated), registration of new entries is stopped and the data compression ratio is monitored continuously from then on. When the data compression ratio falls below a predetermined reference value, the compression is judged to have deteriorated; the local dictionary 131 is cleared at that point and dictionary registration is resumed.
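The monitoring policy of FIG. 12 might be sketched as follows. The sketch assumes that the compression ratio is measured as input bits divided by output bits over a window of `window` codes emitted after the dictionary has saturated, with fixed-width codes of `code_bits` bits. The window, the code width, the threshold and all names are assumptions, since the specification states only that the ratio is compared with a predetermined reference value.

```python
def fresh_dictionary():
    """Initial dictionary: every single-character string."""
    return {bytes([i]): i for i in range(256)}


def lzw_encode_monitored(data, max_entries=512, code_bits=9, window=16, threshold=1.0):
    """LZW-encode data; once full, clear the dictionary when the ratio drops too low."""
    dictionary = fresh_dictionary()
    codes, clears, w = [], 0, b""
    win_bytes = win_codes = 0  # input bytes / codes seen since the last check
    for byte in data:
        wc = w + bytes([byte])
        if wc in dictionary:
            w = wc
            continue
        codes.append(dictionary[w])
        if len(dictionary) < max_entries:
            dictionary[wc] = len(dictionary)
        else:
            # Dictionary is full: registration is stopped, monitoring is active.
            win_bytes += len(w)
            win_codes += 1
            if win_codes >= window:
                ratio = (8 * win_bytes) / (code_bits * win_codes)
                if ratio < threshold:
                    dictionary = fresh_dictionary()  # compression has deteriorated
                    clears += 1
                win_bytes = win_codes = 0
        w = bytes([byte])
    if w:
        codes.append(dictionary[w])
    return codes, clears
```

Input that no longer matches the frozen dictionary is encoded one character per code, which drives the ratio below 1 and triggers a clear; highly repetitive input keeps the ratio high and the dictionary intact.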

【０１０６】Next, FIG. 13 shows the basic configuration of a data restoration system that restores the compressed data sequence 180, which is output from the above data encoding system and has the structure shown in FIG. 4, to the original data. The LZW decompression system A 220 consists of a global dictionary 221 with a large registration capacity and an LZW decompressor A that restores the compressed data 182 using the global dictionary 221.

【０１０７】The LZW decompression system B 230 consists of a local dictionary 231 with a small registration capacity and an LZW decompressor B that restores the compressed data 182 using the local dictionary 231.

【０１０８】The LZW decompression system A 220 and the LZW decompression system B 230 restore the compressed data 182 simultaneously and in parallel. The LZW decompressor A and the LZW decompressor B output the restored data A and the restored data B, restored using the global dictionary 221 and the local dictionary 231 respectively, to the MPX (multiplexer) 250.

【０１０９】The dictionary flag discriminator 240 receives the dictionary flag 181 of the compressed data sequence 180 and uses it to determine which dictionary (the global dictionary 121 or the local dictionary 131) was used to create the compressed data 182 currently being restored in parallel by the LZW decompression system A 220 and the LZW decompression system B 230. It then outputs to the MPX 250 a selection signal d indicating the dictionary that was used for the LZW encoding of that compressed data 182.

【０１１０】In accordance with the selection signal d input from the dictionary flag discriminator 240, the MPX 250 selects and outputs the restored data A from the LZW decompressor A if the global dictionary 121 is indicated, or the restored data B from the LZW decompressor B if the local dictionary 131 is indicated.

【０１１１】Next, the method by which the data restoration system configured as above restores the compressed data sequence 180 (see FIG. 4) output from the data encoding system shown in FIG. 3 is described with reference to the flowchart of FIG. 14.

【０１１２】First, the LZW decompressor A and the LZW decompressor B each set their built-in input counter to the block size (the number of data items in one block) that the data encoding system used when encoding the original data 110 (step S801).

【０１１３】Next, the dictionary flag discriminator 240 reads, from the file (the output file generated by the data encoding system), the data placed at the head of the compressed data 182 (the first input data), namely the dictionary flag 181 (see FIG. 4) (step S802).

【０１１４】Next, the LZW decompressor A and the LZW decompressor B simultaneously read from the file the first item of the compressed data 182 following the dictionary flag 181 (step S803) and determine whether this input is "EOF" (End of File), which marks the end of the file (step S804).

【０１１５】If the LZW decompressor A and the LZW decompressor B determine that the input is not "EOF" (step S804, NO), the dictionary flag discriminator 240 uses the previously read dictionary flag 181 to determine whether the input was LZW-encoded using the global dictionary 121 or the local dictionary 131, and outputs to the MPX 250 a selection signal d indicating the dictionary that was used. Meanwhile, the LZW decompressor A and the LZW decompressor B restore the input in parallel using the global dictionary 221 and the local dictionary 231 respectively, and output the results to the MPX 250 as restored data A and restored data B.

【０１１６】The MPX 250 selects and outputs whichever of the restored data A and the restored data B corresponds to the selection signal d supplied by the dictionary flag discriminator 240. If the restored data is not yet registered in the respective dictionary 221 or 231, the LZW decompressor A and the LZW decompressor B each register the pair (the restored data, the LZW code corresponding to the restored data) in their dictionary 221 or 231 (the above constitutes step S805).

【０１１７】Next, the LZW decompressor A and the LZW decompressor B each decrement their input counter by the number of characters in the restored data, so that each input counter holds the number of characters still to be restored in the block currently being restored (the current block) (step S807).

【０１１８】The LZW decompressor A and the LZW decompressor B then determine whether the value of their input counter is "0", that is, whether restoration of the current block is complete (step S808).

【０１１９】If the LZW decompressor A and the LZW decompressor B determine that the value of their input counter is not "0" and restoration of the current block is therefore not yet complete (step S808, NO), processing returns to step S803, the next data is read from the file, and the remaining data of the current block is restored (steps S803 to S808 are repeated).
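The role of the input counter in steps S801 to S808 might be sketched as follows. To isolate the counter logic, the sketch assumes a static code-to-string table and omits both the dictionary flag (step S802) and dictionary registration; the function name `restore_block` and the use of an iterator exhausted at "EOF" are assumptions.

```python
def restore_block(code_iter, table, block_size):
    """Restore codes until block_size characters have been produced (or EOF)."""
    remaining = block_size            # S801: set the input counter
    out = []
    while remaining > 0:              # S808: the block ends when the counter is 0
        code = next(code_iter, None)  # S803: read the next code
        if code is None:              # S804: EOF reached
            break
        restored = table[code]        # S805: restore the string (table is static here)
        out.append(restored)
        remaining -= len(restored)    # S807: decrement by the restored length
    return b"".join(out)
```

Because block boundaries are defined by the count of restored characters, the compressed stream needs no explicit end-of-block marker. Calling the function repeatedly on the same iterator yields successive blocks.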

【０１２０】When the LZW decompressor A and the LZW decompressor B determine in step S808 that all of the compressed data 182 of the current block has been restored (step S808, YES), the LZW decompressors A and B, the dictionary flag discriminator 240 and the MPX 250 repeat steps S801 to S808, restoring the compressed data of all remaining blocks stored in the file until the LZW decompressor A and the LZW decompressor B read "EOF" from the file in step S804.

【０１２１】When the LZW decompressor A and the LZW decompressor B determine that the compressed data 182 of all blocks stored in the file has been restored (step S804, YES), the restoration of the compressed data 182 ends.

【０１２２】In this way, data that was LZW-encoded and compressed block by block by the data encoding method shown in the flowchart of FIG. 5 is restored sequentially, block by block, with reference to the dictionary flag 181 placed at the head of the compressed data 182 of each block.
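A round trip through flag-based encoding and restoration might be sketched as follows. Several points are assumptions rather than statements of the specification: blocks are passed as explicit (flag, codes) pairs instead of a file; the data amount is compared by code count; and, instead of running both decompressors on every block, the sketch decodes only with the dictionary named by the flag and, for a block flagged "B", replays the restored block through the encoder routine so that the global dictionary stays in step with the encoder. The text does not spell out how the unused dictionary is kept synchronized, so this replay is one possible approach. All function names are hypothetical.

```python
def fresh():
    """Initial dictionary: every single-character string."""
    return {bytes([i]): i for i in range(256)}


def lzw_encode(block, dictionary, max_entries):
    """LZW-encode one block, updating `dictionary` in place (frozen when full)."""
    codes, w = [], b""
    for byte in block:
        wc = w + bytes([byte])
        if wc in dictionary:
            w = wc
        else:
            codes.append(dictionary[w])
            if len(dictionary) < max_entries:
                dictionary[wc] = len(dictionary)
            w = bytes([byte])
    if w:
        codes.append(dictionary[w])
    return codes


def lzw_decode(codes, dictionary, max_entries):
    """Restore one block, applying the same registrations the encoder made."""
    inverse = {code: s for s, code in dictionary.items()}
    out, prev = [], None
    for code in codes:
        if code in inverse:
            entry = inverse[code]
        elif prev is not None and code == len(inverse):
            entry = prev + prev[:1]  # code registered by the encoder one step ahead
        else:
            raise ValueError("invalid LZW code")
        if prev is not None and len(dictionary) < max_entries:
            new = prev + entry[:1]
            dictionary[new] = len(dictionary)
            inverse[dictionary[new]] = new
        out.append(entry)
        prev = entry
    return b"".join(out)


def encode_stream(data, block_size, local_max, global_max):
    global_dict, stream = fresh(), []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        codes_a = lzw_encode(block, global_dict, global_max)
        codes_b = lzw_encode(block, fresh(), local_max)  # local: cleared per block
        stream.append(("A", codes_a) if len(codes_a) <= len(codes_b) else ("B", codes_b))
    return stream


def decode_stream(stream, local_max, global_max):
    global_dict, out = fresh(), []
    for flag, codes in stream:
        if flag == "A":
            block = lzw_decode(codes, global_dict, global_max)
        else:
            block = lzw_decode(codes, fresh(), local_max)
            lzw_encode(block, global_dict, global_max)  # replay to sync the global dictionary
        out.append(block)
    return b"".join(out)
```

For example, `decode_stream(encode_stream(data, 16, 300, 300), 300, 300)` is expected to return `data` unchanged for any byte string `data`.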

【０１２３】[0123]

[Effects of the Invention] According to the first invention, set forth in claims 1 and 6, two dictionaries of different capacity are provided, a small-capacity first dictionary and a large-capacity second dictionary, and the small-capacity first dictionary is initialized for each fixed section of the input data. Dictionary information that tracks local changes in the nature of the input data, which the second dictionary alone handles poorly, can therefore be registered in the first dictionary. As a result, the second dictionary enables LZW encoding adapted to the global nature of the input data while the first dictionary enables LZW encoding matched to its local nature. Input data in which several kinds of data with different properties, such as character codes and black-and-white image data, are mixed can therefore, on the whole, be compressed at a higher compression ratio than with conventional LZW encoding.

【０１２４】According to the second invention, set forth in claims 7 to 10, first and second dictionaries with functions equivalent to those of the first invention are provided. By referring to the flag that the first invention attaches to each piece of compressed data to indicate the dictionary used to create it, the restoration means can therefore restore data compressed by the first invention completely to the original data, referencing the corresponding dictionary and updating and registering entries in it.

[Brief description of the drawings]

FIG. 1 is a principle diagram (part 1) of the present invention.

FIG. 2 is a principle diagram (part 2) of the present invention.

FIG. 3 is a basic configuration diagram of a data encoding system according to an embodiment of the present invention.

FIG. 4 is a diagram illustrating the structure of a compressed data sequence according to the present invention.

FIG. 5 is a flowchart illustrating the algorithm of an embodiment of the data encoding method (data compression method) of the present invention.

FIG. 6 is a diagram showing a first example of how the global dictionary and the local dictionary are used.

FIG. 7 is a diagram (part 1) showing another example of how the local dictionary is used.

FIG. 8 is a diagram showing the structure of the decomposition-component tree in which registered strings are stored.

FIG. 9 is a diagram (part 2) showing another example of how the local dictionary is used.

FIG. 10 is a diagram (part 3) showing another example of how the local dictionary is used.

FIG. 11 is a diagram (part 4) showing another example of how the local dictionary is used.

FIG. 12 is a diagram (part 5) showing another example of how the local dictionary is used.

FIG. 13 is a basic configuration diagram of a data restoration system according to an embodiment of the present invention.

FIG. 14 is a flowchart illustrating the algorithm of the data restoration method performed by the above data restoration system.

FIG. 15 is a diagram illustrating the basic concept of universal Ziv-Lempel encoding.

FIG. 16 is a flowchart illustrating the LZW encoding algorithm.

FIG. 17 is a diagram illustrating the structure of a dictionary used for LZW encoding.

FIG. 18 is a schematic diagram illustrating the LZW encoding method.

FIG. 19 is a flowchart illustrating the algorithm for restoring LZW codes.

FIG. 20 is a schematic diagram illustrating a specific example of restoring LZW codes.

[Explanation of symbols]

1, 11: First dictionary
2, 12: Second dictionary
3: First encoding means
4: Second encoding means
5: Compressed data output means
13: Restoration means

Continuation of the front page
(72) Inventor: Hirotaka Chiba, c/o Fujitsu Limited, 1015 Kamiodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa
(56) References: JP-A-5-66917 (JP, A); JP-A-4-43720 (JP, A); JP-A-2-34038 (JP, A); JP-A-1-158825 (JP, A); JP-A-62-209948 (JP, A); JP-U-2-73876 (JP, U)
(58) Fields searched (Int. Cl.⁷, DB name): G06F 5/00; G06T 9/00; H03M 7/30 - 7/46; H04N 1/41; H04N 1/413

Claims

(57) [Claims]

1. A data encoding method for compressing data by universal encoding using Ziv-Lempel codes, comprising: a first dictionary (1) that is initialized each time input data has been Ziv-Lempel encoded over a specific fixed section; a second dictionary (2) capable of registering all dictionary data generated in the Ziv-Lempel encoding process over a plurality of consecutive such sections of the input data; first encoding means (3) for Ziv-Lempel encoding input data using the first dictionary (1) and registering new dictionary data generated in the Ziv-Lempel encoding process in the first dictionary (1); second encoding means (4) for Ziv-Lempel encoding, using the second dictionary (2), the same data as that input to the first encoding means (3) and registering new dictionary data generated in the encoding process in the second dictionary (2); and compressed data output means (5) for comparing, for each fixed section, the data amount of the Ziv-Lempel code string obtained by the first encoding means (3) with the data amount of the Ziv-Lempel code string obtained by the second encoding means (4), and outputting the Ziv-Lempel code string with the smaller data amount as compressed data together with a flag indicating the dictionary used for its encoding.

2. The data encoding method according to claim 1, wherein the fixed section is the period from when the first dictionary (1) is initialized until the number of data items Ziv-Lempel encoded by the second encoding means (4) goes from "0" to a specific number.

3. The data encoding method according to claim 1, wherein the fixed section is the period from when the first dictionary (1) is initialized until its registration capacity becomes full.

4. The data encoding method according to claim 1, wherein the fixed section is the period from when the registration capacity of the first dictionary (1) becomes full until the compression ratio, relative to the input data, of the Ziv-Lempel code string output from the first encoding means (3) falls to a certain lower limit.

5. The data encoding method according to claim 1, wherein the fixed section is the period in which the first encoding means (3) and the second encoding means (4) each output one Ziv-Lempel code.

6. The data encoding method according to claim 1, 2, 3, 4 or 5, wherein the first encoding means (3) and the second encoding means (4) Ziv-Lempel encode the same input data in parallel.

7. A data restoration method for restoring compressed data that has been Ziv-Lempel encoded by the data encoding method according to claim 1, comprising: a first dictionary (11) that is initialized each time the compressed data has been restored over a specific fixed section; a second dictionary (12) capable of registering dictionary data generated in the restoration process over a plurality of consecutive such sections of the compressed data; and restoration means (13) for determining, from the flag, whether the dictionary used to compress the compressed data to be restored was the first dictionary (1) or the second dictionary (2), selecting the first dictionary (11) or the second dictionary (12) accordingly, restoring the compressed data using the selected dictionary, and registering dictionary data obtained by this restoration in the dictionary as necessary.

8. The data restoration method according to claim 7, wherein the fixed section is the period from when the first dictionary (11) is initialized until the data length of the restored data restored by the restoration means (13) equals a specific value.

9. The data restoration method according to claim 7, wherein the fixed section is the period from when the first dictionary (11) is initialized until its registration capacity becomes full.

10. The data restoration method according to claim 7, wherein the fixed section is the period from when the restoration means (13) inputs one Ziv-Lempel code until it restores that Ziv-Lempel code.

11. The data restoration method according to claim 7, wherein the restoration means (13) performs restoration of the compressed data using the first dictionary (11) and restoration of the compressed data using the second dictionary (12) simultaneously and in parallel.