JPH05127866A

JPH05127866A - Image data compression system

Info

Publication number: JPH05127866A
Application number: JP28745091A
Authority: JP
Inventors: Yoshiyuki Okada; 佳之岡田; Shigeru Yoshida; 茂吉田; Yasuhiko Nakano; 泰彦中野; Hirotaka Chiba; 広隆千葉
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1991-11-01
Filing date: 1991-11-01
Publication date: 1993-05-25

Abstract

PURPOSE: To improve the compression ratio of image data by approximately searching a dictionary for registered character strings, treating similar patterns and run lengths as matches, encoding the retrieved character strings, and registering character strings formed by appending the input character to those registered strings. CONSTITUTION: A dictionary search means 14 searches a dictionary 12 for the character string input from a pre-processing means 10. If the dictionary 12 contains no character string that matches the string including the input character, an approximate dictionary search means 16 reads out a character that approximates the input character and then searches for the longest character string that matches the string including the approximate character. A coding means 18 encodes the character string retrieved by means 14 or 16 as its reference number in the dictionary 12. When no character string matching the string including the input character can be found in the dictionary 12 and no approximate character is available, the character string formed by appending the input character to the reference number of the character string encoded immediately before is registered in the dictionary 12 under a new reference number.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an image data compression system that applies universal coding after converting an image into patterns of two-dimensional information and run lengths. In recent years, with the progress of office automation, documents have come to be handled as black-and-white binary image information in facsimile machines, optical disk filing systems, and the like. When document information is used as digital data, the data volume of image information is far larger than that of character images, ten-odd to several tens of times as large.

[0002] Recently, to improve image quality, facsimile resolution has been rising from about 200 dpi on conventional G3 machines to 300 dpi or 400 dpi on the next-generation G4 machines, and the amount of data is increasing accordingly. Therefore, to handle image information efficiently in storage, transmission, and so on, it is essential to reduce the data volume through efficient data compression.

[0003]

[Prior Art] The present inventors have already proposed a method in which a black-and-white binary image is universally coded after preprocessing that captures its two-dimensional properties. This method achieves compression ratios comparable to those of the two representative conventional methods known for compressing black-and-white binary images, namely the MMR method and the predictive coding method.

[0004] This method does not depend on the type of data and can compress data efficiently with a single scheme, that is, with a simple circuit. FIG. 8 shows one example of preprocessing that captures the two-dimensional properties of a black-and-white binary image. FIG. 8(a) shows an original image. The original image is converted, for each pixel position taken four lines at a time, into information consisting of the type of black-and-white pattern and the number of times the same pattern repeats consecutively, that is, the run length, yielding the data shown in FIG. 8(b). The converted data is, for example, 8-bit data in which the upper 4 bits indicate the pattern type and the lower 4 bits indicate the run length.
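The conversion described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not say how the 4-bit pattern code is assigned, so the sketch simply uses the four pixels of a column (top to bottom, 1 = black) as the pattern code, and the function name `encode_strip` is hypothetical.

```python
def encode_strip(rows):
    """Convert a 4-line strip of a binary image into 8-bit characters.

    rows: four equal-length sequences of 0 (white) / 1 (black).
    Returns a list of ints: upper 4 bits = pattern, lower 4 bits = run length.
    """
    if len(rows) != 4 or len({len(r) for r in rows}) != 1:
        raise ValueError("expected four rows of equal length")
    chars = []
    prev, run = None, 0
    for column in zip(*rows):
        # Assumption: the pattern code is the 4-pixel column read top to bottom.
        pattern = (column[0] << 3) | (column[1] << 2) | (column[2] << 1) | column[3]
        if pattern == prev and run < 15:  # a 4-bit run length cannot exceed 15
            run += 1
        else:
            if prev is not None:
                chars.append((prev << 4) | run)
            prev, run = pattern, 1
    if prev is not None:
        chars.append((prev << 4) | run)  # flush the last run
    return chars
```

A run longer than 15 columns is split into several characters here, which is one possible way to respect the 4-bit run-length field.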

[0005] By learning statistical properties such as the continuity of the converted data, which consists of pattern and run-length pairs, through universal coding, the code is optimized and images of various kinds can be compressed efficiently. FIG. 9 shows the tree structure of the dictionary registered by universal coding of the converted data of FIG. 8(b).

[0006] Universal coding is by nature an information-preserving (lossless) data compression method. Because it assumes no statistical properties of the information source in advance, it can be applied to various types of data such as character codes and object code. In document images, character outlines and character spacing show similarities. In halftone images, the dot periodicity and dot shapes are likewise similar. Universal coding removes the redundancy inherent in these similarities, enabling effective compression.

[0007] The LZW code, one form of universal coding, is considered here (see T. A. Welch, "A Technique for High-Performance Data Compression", Computer, June 1984). In the LZW code, the next symbol is incorporated into the next substring, so that encoding can be done with indexes (dictionary reference numbers) alone.

[0008] FIG. 10 shows the tree structure of the dictionary built when LZW coding is performed with 256 characters initially registered, and FIG. 11 shows the data (character strings) in the LZW code and the indexes (dictionary reference numbers) obtained as code words by LZW coding. FIG. 12 is a flowchart showing the detailed algorithm of conventional LZW coding.

[0009] LZW coding has a rewritable dictionary. It divides the input character string into distinct character strings, numbers them in order of appearance, and registers them in the dictionary, while encoding the character string currently being input as the number of the longest matching string registered in the dictionary. In the LZW coding process of FIG. 12, step S1 first registers a one-character string for every character as the initial value, and coding then begins. In step S2, the first input character K is taken as the reference number ω used for dictionary search, and this becomes the prefix string. Step S3 then reads the next character K of the input data, and step S4 searches whether the string (ωK), formed by appending the character K read in step S3 to the prefix string ω obtained in step S2, is in the current dictionary.

[0010] If the string (ωK) is in the dictionary in step S4, step S5 makes the string (ωK) the new reference number ω, step S6 checks whether the input data has ended, and the process returns to step S3, continuing the longest-match search until the string (ωK) can no longer be found in the dictionary. If the string (ωK) is not in the dictionary in step S4, the process proceeds to step S7, where the reference number ω of the string obtained in step S2 is output as the code word code(ω), the string (ωK) is registered in the dictionary with a new reference number, the input character K of step S2 becomes the new reference number ω, and the dictionary address N is incremented. After the check of step S5, the process returns to step S2 and reads the next character K.
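The encoding loop described above can be sketched as follows. This is a minimal illustration, not the flowchart of FIG. 12 itself: the function name `lzw_encode` is hypothetical, reference numbers are assumed to start at 1, and the dictionary stores whole strings as keys instead of (reference number, character) pairs for brevity.

```python
def lzw_encode(data, alphabet):
    """Encode `data` (a sequence of symbols) into dictionary reference numbers."""
    # S1: register every single character; reference numbers start at 1 (assumption).
    dictionary = {(ch,): i for i, ch in enumerate(alphabet, start=1)}
    next_ref = len(dictionary) + 1
    codes = []
    it = iter(data)
    try:
        omega = (next(it),)          # S2: the first character becomes the prefix
    except StopIteration:
        return codes                 # empty input produces no codes
    for k in it:                     # S3: read the next character K
        candidate = omega + (k,)
        if candidate in dictionary:  # S4/S5: the match can be extended
            omega = candidate
        else:                        # S7: emit code(w), register wK, restart from K
            codes.append(dictionary[omega])
            dictionary[candidate] = next_ref
            next_ref += 1
            omega = (k,)
    codes.append(dictionary[omega])  # flush the final prefix at end of input
    return codes
```

For example, `lzw_encode("abab", "abc")` emits the codes for a, b, and then the newly registered string ab.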

[0011] Next, LZW coding is described concretely with reference to FIGS. 14 and 15. To simplify the explanation, FIGS. 14 and 15 take the case of compressing data made up of combinations of the three characters a, b, and c, which are initially registered in advance. The input data of FIG. 14 is read from left to right. When the first character a is input, the dictionary contains no matching string other than the character a itself, so the output code (reference number ω) is output as the code word. The extended string ab is then given reference number 4 and registered in the dictionary. The actual registration takes the form (1b).

[0012] Next, the second character b becomes the head of a string. Since the dictionary contains no matching string other than b, reference number 2 is output as the code word, and the extended string ba is registered in the dictionary with reference number 5, actually in the form 2a. The third character a becomes the head of the next string, and the process continues in the same way. Next, the LZW decoding of FIG. 13 is described.

[0013] Decoding performs the reverse of the encoding of FIG. 12. As in encoding, step S1 first registers a one-character string for every character in the dictionary as the initial value, and decoding then begins. Step S2 reads the first code (reference number) and sets the current CODE as OLDcode. Because the first code always corresponds to the reference number of one of the single characters already registered in the dictionary, the character code(K) matching the input code CODE is looked up and the character K is output. The output character K is also set in char for the exception handling described later.

[0014] The process then proceeds to step S3, where the next code is read and set in CODE as NEWcode. Step S4 checks whether the code CODE input in step S3 is defined (registered) in the dictionary. Normally the input code word has already been registered in the dictionary by the preceding processing, so the process proceeds to step S5, where the string code(ωK) corresponding to CODE is read from the dictionary. In step S6 the character K is pushed onto a stack and the reference number code(ω) becomes the new CODE, after which the process returns to step S5. Steps S5 and S6 are repeated recursively until the reference number ω denotes a single character. Finally, the process proceeds to step S7, where the characters stacked in step S6 are popped in LIFO (last in, first out) order and output.

[0015] Also in step S7, the string expressed as the pair (ω, K), consisting of the previously used code ω and the first character K of the string just restored, is registered in the dictionary with a new reference number. If the code is found in step S4 not to be registered (which happens when encoding referred to the reference number registered immediately before), step S9 sets OLDcode back into CODE and sets code(OLDcode, char) as NEWcode, after which the process proceeds to step S5.
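The decoding procedure, including the handling of a code that is not yet registered, can be sketched as follows. This is a simplified illustration rather than the exact flow of FIG. 13: the name `lzw_decode` is hypothetical, reference numbers are assumed to start at 1, and the unregistered-code case is resolved by appending the first character of the previous string to that string.

```python
def lzw_decode(codes, alphabet):
    """Rebuild the symbol sequence from LZW reference numbers."""
    # S1: same initial dictionary as the encoder, stored as ref -> (prefix_ref, char).
    entries = {i: (None, ch) for i, ch in enumerate(alphabet, start=1)}
    next_ref = len(entries) + 1

    def expand(ref):
        # S5/S6: follow prefix links while stacking characters, then pop in LIFO order.
        stack = []
        while ref is not None:
            ref, ch = entries[ref]
            stack.append(ch)
        return stack[::-1]

    out = []
    old = None
    for code in codes:
        if code in entries:
            string = expand(code)          # normal case: the code is registered
        elif old is not None and code == next_ref:
            # S9: unregistered code; it must be the previous string plus its first char.
            prev = expand(old)
            string = prev + [prev[0]]
        else:
            raise ValueError(f"invalid code {code}")
        if old is not None:
            # S7: register (previous code, first character of the current string).
            entries[next_ref] = (old, string[0])
            next_ref += 1
        out.extend(string)
        old = code
    return out
```

Because each new entry is registered one step later than in the encoder, the only code that can arrive unregistered is the one about to be assigned, which is why the sketch checks `code == next_ref`.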

[0016] The decoding process is described concretely with reference to FIG. 16. To simplify the explanation, FIG. 16 takes the case of data made up of combinations of the three characters a, b, and c, which are initially registered in advance. In FIG. 16, the first input code is 1. Since the single characters a, b, and c are already registered in the dictionary as reference numbers 1, 2, and 3, as shown in FIG. 15, the dictionary lookup replaces code 1 with the string a of the matching reference number, and a is output. The next code 2 is likewise replaced with the character b and output. At this point, (1b), the combination of the previously processed code and the first character b just decoded, is registered in the dictionary with the new reference number 4.

[0017] The third code 4 is found in the dictionary as 1b and replaced with ab, and the string ab is output. At the same time, the string 2a (= ba), the combination of the previously processed code 2 and the first character a of the string just decoded, is registered in the dictionary with the new reference number 5. The process is repeated in the same way. The decoding of FIG. 16, however, involves the following exception handling, which arises in decoding the sixth input code 8. Code 8 is not yet defined in the dictionary at the time of decoding and cannot be decoded directly. In this case, the string 5b is formed by appending the first character b of the previously decoded string ba to the previously processed code 5; it is then expanded to 2ab and then bab, and output. After the string is output, the string 5b, formed by appending the character b of the string just decoded to the previous code word 5, is registered in the dictionary with reference number 8.

[0018] This exception handling is performed through steps S4 and S9 of the decoding flow of FIG. 13, and finally step S7 outputs the string and registers the new string in the dictionary with a reference number. The encoding and decoding processes of FIGS. 12 and 13 build identical dictionaries as they proceed.

[0019]

[Problem to Be Solved by the Invention] However, the conventional image data compression system, which performs LZW coding after preprocessing a black-and-white binary image into data consisting of pattern and run-length pairs, preserves the image information completely. As a result, even the preprocessed data of very similar images such as those shown in FIG. 8 is universally coded as entirely different data, which has hindered improvement of the compression ratio.

[0020] The present invention has been made in view of these conventional problems, and its object is to provide an image data compression system that improves the compression ratio by allowing approximate matching in the dictionary search of universal coding.

[0021]

[Means for Solving the Problem] FIG. 1 illustrates the principle of the present invention. The present invention is directed to an image data compression system that applies universal coding to two-dimensional image information. The system comprises: preprocessing means 10 for converting image data into characters, each consisting of a pattern and run-length pair; dictionary search means 14 for searching a dictionary 12 for the longest string that matches the string including the character input from the preprocessing means 10; approximate dictionary search means 16 which, when the dictionary search by the dictionary search means 14 finds no string in the dictionary 12 matching the string ωK that includes the input character K, reads out an approximate character K' that can approximate the input character K, searches again for the longest string matching the string ωK' that includes the approximate character K', aborts the dictionary search if no such string exists, and continues the dictionary search using the approximate string ωK' if one does; coding means 18 for encoding the string ω found by the dictionary search means 14 or the approximate dictionary search means 16 as its reference number in the dictionary 12; and dictionary registration means 20 which, when no string matching the string ωK including the input character K can be found in the dictionary 12 and no approximate character K' exists either, registers the string ωK, formed by appending the input character K to the reference number of the string ω encoded immediately before, in the dictionary 12 under a new reference number.
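The interplay of means 14, 16, 18, and 20 can be sketched as a modified LZW encoder. This is a hedged illustration only: the name `approx_lzw_encode`, the dictionary-of-tuples representation, the 0-based reference numbers, and the shape of `approx_table` (a plain mapping from K to candidate K' values standing in for table 22) are all assumptions, not details given in the text.

```python
def approx_lzw_encode(chars, approx_table, alphabet_size=256):
    """LZW encoding with approximate matching (lossy).

    chars: iterable of ints in range(alphabet_size).
    approx_table: dict mapping a character K to a sequence of candidate K' values
                  (hypothetical stand-in for table 22).
    Returns the list of dictionary reference numbers.
    """
    # Assumption: single characters use their own value as reference number.
    dictionary = {(c,): c for c in range(alphabet_size)}
    next_ref = alphabet_size
    codes = []
    omega = None
    for k in chars:
        if omega is None:
            omega = (k,)
            continue
        if omega + (k,) in dictionary:           # exact match (means 14)
            omega = omega + (k,)
            continue
        # Approximate search (means 16): try each K' in place of K.
        approx = next((omega + (a,) for a in approx_table.get(k, ())
                       if omega + (a,) in dictionary), None)
        if approx is not None:
            omega = approx                       # continue the search with wK'
            continue
        codes.append(dictionary[omega])          # means 18: emit code(w)
        dictionary[omega + (k,)] = next_ref      # means 20: register wK
        next_ref += 1
        omega = (k,)
    if omega is not None:
        codes.append(dictionary[omega])          # flush the final prefix
    return codes
```

With an empty `approx_table` the sketch reduces to ordinary LZW. When an approximate character is accepted, an unmodified LZW decoder reproduces K' rather than K, so the reconstructed image differs slightly from the original.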

[0022] The approximate dictionary search means 16 has a table 22 that predefines, for an input character K consisting of a pattern and run-length pair, an approximate character K' having a similar pattern and run-length pair, and reads out the corresponding approximate character K' by referring to the table 22 with the input character K alone. Alternatively, the approximate dictionary search means 16 has a table 22 that predefines the approximate character K' from the input character K₁ and the input characters K₀ and K₂ preceding and following it, and reads out the approximate character K' of the input character currently being processed by referring to the table 22 with the input character K₁ and its neighbors K₀ and K₂.

[0023] Further, the approximate dictionary search means 16 reads out, as the approximate character, a character with the same run length and a similar pattern, or a character with the same pattern and a similar run length. Furthermore, when reading out an approximate character K' with the same pattern and a similar run length from the input character K₁ and its neighbors K₀ and K₂, the approximate dictionary search means 16 corrects the run length of the following input character K₂ so as to compensate for the run-length mismatch between the input character K₁ and the approximate character K'.
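One way to read the run-length correction is that the following character gives back whatever the approximation added or removed, so the total width stays unchanged. The sketch below illustrates that reading only; the function name `compensate_next`, the 8-bit layout (pattern in the upper 4 bits, run length in the lower 4 bits), and returning `None` when the corrected run length would leave the range 1 to 15 are assumptions.

```python
def split(ch):
    """Split an 8-bit character into (pattern, run_length)."""
    return ch >> 4, ch & 0x0F


def compensate_next(k1, k1_approx, k2):
    """Adjust the run length of K2 to offset replacing K1 with K'.

    Returns the corrected K2, or None if the corrected run length would fall
    outside 1..15 (assumed to mean the approximation cannot be used).
    """
    p1, r1 = split(k1)
    pa, ra = split(k1_approx)
    if p1 != pa:
        raise ValueError("K' must have the same pattern as K1")
    p2, r2 = split(k2)
    r2_new = r2 - (ra - r1)  # give back what K' added, or take what it removed
    if not 1 <= r2_new <= 15:
        return None
    return (p2 << 4) | r2_new
```

For instance, if K₁ has run length 3 and K' has run length 4, the run length of K₂ is reduced by one.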

[0024]

[Operation] With the image data compression system of the present invention configured in this way, black-and-white binary image data is subjected, as shown in FIG. 2, to preprocessing, universal approximate search, universal approximate coding, and universal approximate registration.

[0025] In the universal approximate search applied to the data obtained by preprocessing, that is, to the input character string of pattern and run-length pairs, when an input character is not found in the dictionary, it is replaced with an approximate character under conditions such as the run length being identical and the pattern similar, or the pattern being identical and the run length similar, and the dictionary is searched again. The input can thereby be converted to the dictionary reference number of a longer matching string, which further improves the compression ratio of universal coding.

[0026] The approximate character may be read out not only by storing in a table an approximate character corresponding one-to-one to each input character, but also by building the approximate-character table from the relationship between the input character and the characters before and after it.

[0027]

[Embodiment] FIG. 3 is a block diagram showing one embodiment of the present invention. In FIG. 3, reference numeral 24 denotes a CPU serving as control means, to which a program memory 26 and a data memory 30 are connected. The program memory 26 holds control software 28, preprocessing software 10, dictionary search software 14, approximate dictionary search software 16, coding software 18, and dictionary registration software 20.

[0028] The control software 28 performs overall control of the preprocessing of image data and of LZW coding. The preprocessing software 10 converts black-and-white binary image data such as that shown in FIG. 8(a), four lines at a time, into pattern and run-length data, where the run length is the number of repetitions of the same pattern; these are the input characters of the LZW conversion. As shown in FIG. 8(b), each input character produced by this preprocessing is an 8-bit character code with the pattern in the upper 4 bits and the run length RL in the lower 4 bits.

[0029] The dictionary search software 14 searches the already registered dictionary 12 for the longest string matching the string that includes the input character, a pattern and run-length pair obtained by the preprocessing software 10. The present invention additionally provides the approximate dictionary search software 16. When the dictionary search by the dictionary search software 14 finds no registered string in the dictionary 12 matching the string that includes the input character, the approximate dictionary search software 16 reads from the approximation information table 22 an approximate character that can approximate that input character and searches the dictionary 12 for the longest string matching the string that includes the approximate character. If there is still no matching string, the search is aborted; if a matching string is found, the string search continues using the approximate string.

[0030] The coding software 18 performs coding by outputting, as the code, the index (dictionary reference number) in the dictionary 12 of the string found by the dictionary search software 14 or the approximate dictionary search software 16. The dictionary registration software 20 operates when the dictionary search software 14 cannot find a string in the dictionary 12 matching the string that includes the input character and the approximate dictionary search software 16 cannot read out an approximate character even by referring to the approximation information table 22. It then registers in the dictionary 12, under a new index, the string formed by appending the input character to the reference number of the string encoded immediately before.

[0031] The data memory 30 holds the dictionary 12, the approximation information table 22, and a data buffer 24. The dictionary 12 is built by the dictionary registration software 20 as LZW coding proceeds. The data buffer 24 stores the image data about to be preprocessed and the preprocessed character strings about to be LZW coded.

[0032] The approximation information table 22 holds, registered in advance, the approximate characters that the approximate dictionary search software 16 reads out based on the input character, or on the input character together with the characters before and after it. FIG. 4 shows a first embodiment of the approximation conditions stored in the approximation information table 22 of FIG. 3. In this embodiment, the approximation conditions are preset for the case where the run lengths are the same and the patterns are similar. There are two such conditions: judging the approximation from the input data alone, as in FIG. 4(a), and judging it by also looking at the preceding and following data, as in FIG. 4(b).

[0033] When the approximation is judged from the input data alone as in FIG. 4(a), suppose, for example, that the input character K has a pattern of white = 2, black = 2 and a run length RL = 2. As approximate characters of this input character K, the characters K₁' and K₂' are defined, which have the same run length RL = 2 but a pattern that differs by one pixel. The approximate character K₁' has one more black pixel than K, black = 3, and the approximate character K₂' has one fewer black pixel than K, black = 1.
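A table entry of this kind could be generated along the following lines. The sketch is an assumption-laden illustration: it treats the 4-bit pattern as a column bitmask (1 = black) and regards any single-pixel flip as "similar", which yields up to four candidates rather than the two named for FIG. 4(a); which neighbors the actual table holds is not specified in the text. The name `pattern_neighbours` is hypothetical.

```python
def pattern_neighbours(ch):
    """Characters with the same run length whose pattern differs by one pixel.

    ch: 8-bit character, pattern in the upper 4 bits (assumed to be a column
        bitmask with 1 = black), run length in the lower 4 bits.
    """
    pattern, run = ch >> 4, ch & 0x0F
    # Flip each of the four pixels in turn; the run length is left unchanged.
    return [((pattern ^ (1 << bit)) << 4) | run for bit in range(4)]


# A single-character approximation table built from the rule above.
approx_table = {ch: pattern_neighbours(ch) for ch in range(256)}
```

Each flip changes the number of black pixels by exactly one, so every candidate has either one more or one fewer black pixel than the input character.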

[0034] With the approximation condition of FIG. 4(a) set in the approximation information table 22, the approximate character K₁' or K₂' is read out by referring to the table with the input character K, and the approximate dictionary search is performed. In FIG. 4(b), for the case where the run lengths are the same and the patterns are similar, the approximation is judged by also looking at the preceding and following data. That is, if the current input character is K₁, the preceding input character is K₀, and the next input character is K₂, then the character K', whose neighbors K₀ and K₂ are the same and whose current character has one more black pixel, is taken as the approximate character.

[0035] With the condition of FIG. 4(b), which judges the approximation by also looking at the preceding and following data, preset in the approximation information table 22, the table is referenced with the input character K₁ currently being processed and its neighbors K₀ and K₂, and K' is read out as the approximate character of K₁. FIG. 5 shows a second embodiment of the approximation conditions stored in the approximation information table 22 of FIG. 3. In this embodiment, approximation conditions are set for the case where the patterns are the same and the run lengths are similar.

[0036] FIG. 5(a) shows the case where the patterns are the same and the run lengths are similar, and the approximate character is judged from the input data alone. That is, if the current input character K has a pattern of white = 1, black = 3 and a run length RL = 3, then the character K₁′ with RL = 2 and the character K₂′ with RL = 4, which have the same pattern and a run length within ±1 of that of K, are each taken as approximate characters of K.
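As an illustration only, the two input-only conditions (FIG. 4(a): same run length, black count differing by one; FIG. 5(a): same pattern, run length within ±1) could be generated as in the following Python sketch. The `Char` type, the function names, and the fixed 4-pixel pattern width are assumptions made for this example; the text only gives the white and black counts of the two sample characters.

```python
from typing import List, NamedTuple

PATTERN_WIDTH = 4  # assumption: white + black pixels per pattern always total 4


class Char(NamedTuple):
    """Hypothetical model of one preprocessed character."""
    white: int
    black: int
    run_length: int


def pattern_neighbours(k: Char) -> List[Char]:
    """FIG. 4(a)-style candidates: same run length, one more or one fewer black pixel."""
    out = []
    for black in (k.black + 1, k.black - 1):
        if 0 <= black <= PATTERN_WIDTH:
            out.append(Char(PATTERN_WIDTH - black, black, k.run_length))
    return out


def run_length_neighbours(k: Char) -> List[Char]:
    """FIG. 5(a)-style candidates: same pattern, run length one shorter or one longer."""
    return [
        Char(k.white, k.black, rl)
        for rl in (k.run_length - 1, k.run_length + 1)
        if rl >= 1
    ]
```

With these definitions, the sample character of FIG. 4(a) (white = 2, black = 2, RL = 2) yields candidates with black = 3 and black = 1, and the sample character of FIG. 5(a) (white = 1, black = 3, RL = 3) yields candidates with RL = 2 and RL = 4.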

[0037] FIG. 5(b) shows the condition under which the approximation is judged by also examining the preceding and following data when the patterns are the same and the run lengths are similar. That is, if the current input character is K₁, the preceding character is K₀, and the following character is K₂, then, as shown on the right-hand side of the figure, the approximate character K′ is defined as the character that has the same pattern as the three characters K₀, K₁, K₂ and whose run length, in the position of the input character K₁ currently being processed, is one greater, RL = 4.

[0038] In the case of FIG. 5(b), the run lengths of the characters K₀, K₁, K₂ before approximation sum to 11, whereas the run lengths of the three characters K₀, K′, K₂ after approximation sum to 12. The run length RL = 7 of the next input character K₂ is therefore corrected to one less, RL = 7 − 1 = 6, so that the overall run length is not thrown off by the approximation.
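The correction arithmetic can be checked with a small Python sketch. The function name and the list representation are assumptions for illustration; the run length of 1 used for K₀ is inferred from the stated total of 11 and the stated run lengths 3 and 7.

```python
from typing import List


def approximate_run_length(run_lengths: List[int], index: int, delta: int = 1) -> List[int]:
    """Lengthen run_lengths[index] by delta and shorten the following run by delta.

    The total run length is left unchanged, which is the purpose of the
    correction described for FIG. 5(b).
    """
    if not 0 <= index < len(run_lengths) - 1:
        raise ValueError("a following run is needed to absorb the correction")
    corrected = list(run_lengths)
    corrected[index] += delta
    corrected[index + 1] -= delta
    if corrected[index] < 1 or corrected[index + 1] < 1:
        raise ValueError("correction would produce an empty run")
    return corrected


before = [1, 3, 7]                               # K0, K1, K2 (total 11)
after = approximate_run_length(before, index=1)  # K0, K', corrected K2
```

Here `after` holds the run lengths 1, 4, 6, so the total remains 11 instead of growing to 12.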

[0039] For the input-only approximation of FIG. 4(a) and FIG. 5(a), the example given was that of reading out the approximate character K₁′ or K₂′ by looking up the approximation information table 22 with the input character K. The approximation relation among these three characters holds mutually, however, so if the input character is, for example, K₁′, then K or K₂′ is read out as the approximate character.
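One way to obtain such a mutual relation is to build the table from groups of interchangeable characters, so that every member of a group maps to all the others. The sketch below is an assumption about how the table might be populated, with hypothetical names; the text does not describe the table layout.

```python
from typing import Dict, Hashable, Iterable, List


def build_mutual_table(groups: Iterable[List[Hashable]]) -> Dict[Hashable, List[Hashable]]:
    """Map each character to the other members of its approximation group."""
    table: Dict[Hashable, List[Hashable]] = {}
    for group in groups:
        for member in group:
            others = [m for m in group if m != member]
            table.setdefault(member, []).extend(others)
    return table


# Characters are named by strings here purely for readability.
table = build_mutual_table([["K", "K1'", "K2'"]])
```

Looking up `"K1'"` in this table returns `["K", "K2'"]`, matching the example in the paragraph above.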

[0040] FIG. 6 is a flowchart showing the LZW encoding of the present invention when the approximation is judged from the input data alone. In FIG. 6, the dictionary is first initialized in step S1 so as to contain the first character, and the starting address n of the dictionary is set, for example, to n = 256 when the input data (input characters), each a preprocessed pair of pattern and run length, is 8 bits wide. The first character K is then input and taken as the index (prefix string) ω.

[0041] Next, in step S2, the next character K is input. Step S3 then checks whether the character string ωK exists in the dictionary. If it does, the process proceeds to step S4, where the character string ωK becomes the new prefix string ω, passes the end-of-data check in step S5, and returns to step S2 to repeat the search for the longest matching character string.

[0042] If the character string ωK does not exist in the dictionary in step S3 and the search for the longest matching string has therefore ended, the process proceeds to step S6, and an approximate search is performed in step S6 and the following steps. Step S6 judges, from conditions such as those of FIG. 4(a) and FIG. 5(a), whether the approximation information table prepared in advance contains an approximate character candidate K′ similar to the input character K; if there is no such candidate, the process proceeds to step S11. In step S11, the current prefix string ω is output as the code code(ω), the character string ωK obtained by appending the character K to the prefix string ω is registered in the dictionary, the character K becomes the new prefix string ω, and the dictionary address n is incremented by one.

[0043] If, on the other hand, the approximation information table contains an approximation candidate K′ in step S6, the candidate K′ is read out in step S7, and step S8 checks whether the approximate character string ωK′ exists in the dictionary. If it does, the approximation is accepted and in step S9 the approximate character string ωK′ becomes the prefix string ω. The next input character is then read in step S10, and the process returns to step S3 to continue the dictionary search. If a run-length correction is needed, as shown in FIG. 5(b), step S10 corrects the run-length value of the next input character K before returning to step S3; in the input-only case of FIG. 6, this correction is not particularly necessary.

[0044] If the approximate character string ωK′ does not exist in the dictionary in step S8, the process returns to step S6 and looks for a new approximation candidate K′. When no approximation candidate K′ remains, the process proceeds to step S11, outputs the code code(ω), and registers the character string ωK at dictionary address n. The character K is then assigned to the prefix string ω, the address n is incremented by one, and the process returns to step S2 via the check in step S5. When step S5 determines that the final data has been reached, the process proceeds to step S12, outputs the last code code(ω), and ends.
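The input-only flow described above (steps S1 to S12) can be summarized in a short Python sketch. It is a minimal illustration under stated assumptions, not the patented implementation: characters are modeled as integers 0 to 255, the dictionary is pre-loaded with all 256 single characters so that the first free address is n = 256, and `approx_table` is a hypothetical mapping from an input character to its list of candidate approximate characters.

```python
from typing import Dict, List, Sequence, Tuple


def encode_approx(data: Sequence[int], approx_table: Dict[int, List[int]]) -> List[int]:
    """LZW encoding with an approximate dictionary search (input-only variant)."""
    # S1: initialise the dictionary; assumption: all 256 single characters are present.
    dictionary: Dict[Tuple[int, ...], int] = {(c,): c for c in range(256)}
    n = 256
    codes: List[int] = []
    if not data:
        return codes
    stream = iter(data)
    w = (next(stream),)                      # S1: first character becomes the prefix ω
    for k in stream:                         # S2 (or S10): read the next character K
        if w + (k,) in dictionary:           # S3: exact match for ωK?
            w = w + (k,)                     # S4: extend the prefix
            continue
        for k_approx in approx_table.get(k, ()):   # S6/S7: try each candidate K′
            if w + (k_approx,) in dictionary:      # S8: is ωK′ registered?
                w = w + (k_approx,)                # S9: continue with the approximate string
                break
        else:
            # S11: no usable candidate: emit code(ω), register ωK, restart from K.
            codes.append(dictionary[w])
            dictionary[w + (k,)] = n
            n += 1
            w = (k,)
    codes.append(dictionary[w])              # S12: flush the last prefix
    return codes
```

For example, with the input `[1, 2, 1, 3]` and a table that lets character 3 be approximated by 2, the final pair `1, 3` is matched against the already registered string `1, 2`, so three codes are emitted instead of the four produced without approximation. The output is lossy by design: the approximated character, not the original, is what the emitted code represents.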

[0045] FIG. 7 is a flowchart showing the LZW encoding of the present invention when the approximation is judged from the input character and the characters before and after it, as shown in FIG. 4(b) and FIG. 5(b). In FIG. 7, the dictionary is first initialized in step S1 so as to contain the first character, and the starting address n of the dictionary is set to n = 256 on the assumption that the character data is 8 bits wide. The first character K is then input and taken as the index (prefix string) ω.

[0046] Next, in step S2, the next character K₁ is input. The process then proceeds to step S3, which checks whether the character string ωK₁ exists in the dictionary. If it does, the process proceeds to step S4, where the character string ωK₁ becomes the new prefix string ω and the character K₁ replaces the previously input character K₀. The process then passes the end-of-data check in step S5 and returns to step S2 to continue the search for the longest matching character string.

[0047] If, on the other hand, the character string ωK₁ does not exist in the dictionary in step S3 and the search for the longest matching string has therefore ended, the process proceeds to step S6, and an approximate search is performed in step S6 and the following steps. Step S6 reads the next character K₂. Step S7 then examines the characters K₀, K₁, K₂ under approximation conditions such as those of FIG. 4(b) and FIG. 5(b) and judges whether the approximation information table prepared in advance contains an approximation candidate K′ similar to the input character K₁. If the table contains none, the process proceeds to step S12 and performs dictionary registration.

[0048] If there is an approximation candidate K′ in step S7, the process proceeds to step S8 and reads out the candidate K′, and step S9 checks whether the approximate character string ωK′ exists in the dictionary. If it does, the approximation is accepted, and in step S10 the approximate character string ωK′ becomes the prefix string ω and the approximation candidate K′ becomes the previously input character K₀.

[0049] Then, in step S11, if the preceding approximation requires a run-length correction as in FIG. 5(b), the run-length value of the character K₂ is corrected to give K₂′, the corrected character K₂′ is taken as the input character K₁ to be processed next, and the process proceeds to step S3 to continue the dictionary search. If the approximate character string ωK′ does not exist in the dictionary in step S9, the process returns to step S7 and looks for a new approximation candidate K′. When no approximation candidate K′ remains, the process proceeds to step S12, outputs the code code(ω), and then registers the character string ωK₁ at dictionary address n. The character K₁ is then assigned to the prefix string ω, the dictionary address n is incremented by one, and the process returns to step S2 via step S5. When step S5 determines that the final data has been reached, the process proceeds to step S13, outputs the last code code(ω), and ends.
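The context-based flow (steps S1 to S13) can be sketched in Python in the same spirit. Several details below are assumptions that the text leaves open and are marked as such in the comments: each 8-bit character is taken to pack the pattern in the high 4 bits and the run length in the low 4 bits; `context_table` is a hypothetical mapping from the triple (K₀, K₁, K₂) to candidate characters K′; a candidate is rejected if the corrected run length of K₂ would leave the range 1 to 15; and when no approximation applies, K₁ is also recorded as the new K₀.

```python
from typing import Dict, List, Sequence, Tuple


def run_length(c: int) -> int:
    return c & 0x0F                  # assumption: low 4 bits hold the run length


def with_run_length(c: int, rl: int) -> int:
    return (c & 0xF0) | rl           # keep the pattern bits, replace the run length


def encode_with_context(
    data: Sequence[int],
    context_table: Dict[Tuple[int, int, int], List[int]],
) -> List[int]:
    """LZW encoding with approximation judged from K0, K1 and K2."""
    dictionary: Dict[Tuple[int, ...], int] = {(c,): c for c in range(256)}  # S1
    n = 256
    codes: List[int] = []
    if not data:
        return codes
    buf = list(data)                 # copy, because corrected characters are written back
    w, k0 = (buf[0],), buf[0]        # S1: first character is the prefix and K0
    i = 1
    while i < len(buf):
        k1 = buf[i]                                  # S2
        if w + (k1,) in dictionary:                  # S3
            w, k0 = w + (k1,), k1                    # S4
            i += 1
            continue
        chosen = corrected_k2 = None
        if i + 1 < len(buf):                         # S6: look ahead to K2, if any
            k2 = buf[i + 1]
            for cand in context_table.get((k0, k1, k2), ()):    # S7/S8
                delta = run_length(cand) - run_length(k1)
                new_rl = run_length(k2) - delta
                fits = delta == 0 or 1 <= new_rl <= 15          # assumption: reject otherwise
                if fits and w + (cand,) in dictionary:          # S9
                    chosen = cand
                    corrected_k2 = k2 if delta == 0 else with_run_length(k2, new_rl)
                    break
        if chosen is None:                           # S12: emit code(ω), register ωK1
            codes.append(dictionary[w])
            dictionary[w + (k1,)] = n
            n += 1
            w, k0 = (k1,), k1                        # assumption: K0 follows K1 here
        else:
            w, k0 = w + (chosen,), chosen            # S10
            buf[i + 1] = corrected_k2                # S11: corrected K2 is the next K1
        i += 1
    codes.append(dictionary[w])                      # S13
    return codes
```

As a usage example, take the characters `0xA1`, `0xA3`, `0xA4`, `0xA7` (one pattern, run lengths 1, 3, 4, 7). For the input `[0xA1, 0xA4, 0xA1, 0xA3, 0xA7]` and a table that maps the context `(0xA1, 0xA3, 0xA7)` to the candidate `0xA4`, the second occurrence of `0xA1` is followed by the approximated `0xA4`, which matches the string registered earlier, and the final character is corrected from run length 7 to 6.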

[0050] The approximation conditions for the approximate dictionary search of the present invention are not limited to those of FIG. 4 and FIG. 5; any suitable similarity conditions may be set. Likewise, although 8-bit data was taken as an example of the preprocessed data consisting of a pattern and run-length pair obtained by preprocessing, the bit length of the preprocessed data may be any bit length suited to the device that performs the processing.

【００５１】[0051]

[Effects of the Invention] As described above, according to the present invention, when data consisting of pattern and run-length pairs obtained by preprocessing image data is universally encoded, the dictionary is searched approximately by taking the similarity of the data into account. This further improves the compression ratio achieved when image data is encoded and compressed with a universal code such as LZW.

[0052] In addition, because the approximate search is confined to a similarity range that does not impair the original image, an image similar to the original image data can be reproduced when the code data is decoded.

[Brief description of drawings]

FIG. 1 is an explanatory diagram of the principle of the present invention.

FIG. 2 is an explanatory diagram of the operation of the present invention.

FIG. 3 is a configuration diagram of an embodiment of the present invention.

FIG. 4 is an explanatory diagram showing the approximation conditions of the present invention for the case where the run lengths are the same and the patterns are similar.

FIG. 5 is an explanatory diagram showing the approximation conditions of the present invention for the case where the patterns are the same and the run lengths are similar.

FIG. 6 is a flowchart showing the LZW encoding of the present invention in which the approximation is judged from the input data alone.

FIG. 7 is a flowchart showing the LZW encoding of the present invention in which the approximation is judged including the preceding and following data.

FIG. 8 is an explanatory diagram of conventional preprocessing of black-and-white binary data and of the preprocessed data.

FIG. 9 is an explanatory diagram showing the tree structure of a dictionary created by universal encoding of the preprocessed data of FIG. 8.

FIG. 10 is a tree structure diagram of a dictionary in conventional LZW encoding.

FIG. 11 is an explanatory diagram showing the relationship between character strings and codes in conventional LZW encoding.

FIG. 12 is a flowchart showing conventional LZW encoding.

FIG. 13 is a flowchart showing conventional LZW decoding.

FIG. 14 is an explanatory diagram showing a specific example of conventional LZW encoding.

FIG. 15 is an explanatory diagram specifically showing the dictionary configuration created by conventional LZW encoding.

FIG. 16 is an explanatory diagram showing a specific example of conventional LZW decoding.

[Explanation of symbols]

10: Preprocessing means (preprocessing software)
12: Dictionary
14: Dictionary search means (dictionary search software)
16: Approximate dictionary search means (approximate dictionary search software)
18: Encoding means (encoding software)
20: Dictionary registration means (dictionary registration software)
22: Table (approximation information table)
24: CPU
26: Program memory
28: Control software
30: Data memory
32: Data buffer

Continuation of the front page: (72) Inventor: Hirotaka Chiba, c/o Fujitsu Limited, 1015 Kamiodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa

Claims

[Claims]

1. An image data compression method that applies universal coding to two-dimensional information of an image, comprising: a preprocessing means (10) for converting image data into characters each composed of a pattern and a run length; a dictionary search means (14) for searching a dictionary (12) for the longest character string that matches a character string including the characters input from the preprocessing means (10); an approximate dictionary search means (16) which, when the dictionary search by the dictionary search means (14) finds no character string in the dictionary (12) matching the character string ωK including the input character K, reads out an approximate character K′ that can approximate the input character K, searches again for the longest match of the character string ωK′ including the approximate character K′, stops the search if no identical character string exists, and continues the dictionary search using the approximate character string ωK′ if an identical character string exists; an encoding means (18) for encoding the character string ω retrieved by the dictionary search means (14) or the approximate dictionary search means (16) as a reference number of the dictionary (12); and a dictionary registration means (20) which, when no character string matching the character string ωK including the input character K can be retrieved from the dictionary (12) and no approximate character K′ exists, registers in the dictionary (12), under a new reference number, the character string ωK obtained by adding the input character K to the reference number of the character string ω encoded immediately before.

2. The image data compression method according to claim 1, wherein the approximate dictionary search means (16) comprises a table (22) in which approximate characters K′, each having a pattern and run-length pair close to that of an input character K composed of a pattern and run-length pair, are determined in advance, and reads out the corresponding approximate character K′ by referring to the table (22) using only the input character K.

3. The image data compression method according to claim 1, wherein the approximate dictionary search means (16) comprises a table (22) in which an approximate character K′ is determined in advance from an input character K₁ and the input characters K₀ and K₂ before and after it, and reads out the approximate character K′ of the input character K₁ currently being processed by referring to the table (22) using the input character K₁ and the input characters K₀ and K₂ before and after it.

4. The image data compression method according to claim 2 or 3, wherein the approximate dictionary search means (16) reads out, as the approximate character, a character having the same run length and a similar pattern.

5. The image data compression method according to claim 2 or 3, wherein the approximate dictionary search means (16) reads out, as the approximate character, a character string having the same pattern and a similar run length.

6. The image data compression method according to claim 5, wherein, when the approximate dictionary search means (16) reads out an approximate character K′ having the same pattern and a similar run length from the input character K₁ and the input characters K₀ and K₂ before and after it, the run length of the subsequent input character K₂ is corrected so as to compensate for the run-length mismatch between the input character K₁ and the approximate character K′.