JP2001044850A

JP2001044850A - Data compression method, data decoding method and information processing unit

Info

Publication number: JP2001044850A
Application number: JP2000166584A
Authority: JP
Inventors: Ryuji Omoto; 隆二大本; Hikonosuke Uei; 彦之介上井
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 1993-06-22
Filing date: 2000-06-02
Publication date: 2001-02-16
Anticipated expiration: 2018-09-22
Also published as: JP3449338B2

Abstract

PROBLEM TO BE SOLVED: To provide a data compression method that achieves a high compression ratio and fast decoding while allowing only the required characters to be decoded freely. SOLUTION: In this data compression method, data are compressed by using a dictionary in which registered data strings are registered in association with registration numbers, and combinations of two or more data strings are replaced with the registration number. In steps A1 and A2 a dictionary is generated and updated, and in step A3 it is determined whether an optimum dictionary has been obtained. The dictionary is updated by an incremental decomposition method, an additive decomposition method, or the like. As shown in steps A4 and A5, when the optimum dictionary has been generated, the dictionary is output as the final static dictionary for decoding, the data are compressed by using this static dictionary, and the compressed data strings are output as the final compressed data for decoding.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to a data compression method that uses a dictionary in which registered data strings are registered in association with registration numbers, a data restoration method for restoring the compressed data, and an information processing apparatus.

[0002]

2. Description of the Related Art In information processing apparatuses such as printers, there has recently been a growing demand to add value by supplying bitmap fonts and outline fonts in various print sizes, and to maintain high print quality at every print size. Consequently, in the field of printers and the like, data compression techniques for efficiently storing the large amount of information that makes up such font data have recently been attracting attention.

[0003] Known conventional data compression techniques for storing or transferring a large amount of information in as little capacity as possible include techniques that convert fixed-length bit data into variable-length bit codes, such as Huffman coding, and techniques that compress by exploiting matches between a data string that appeared in the past and the data string about to be compressed, as in the so-called Lempel-Ziv patent (U.S. Pat. No. 4,464,650) and the LZW patent (U.S. Pat. No. 4,558,302).

[0004] However, these conventional techniques are data compression methods that use a so-called dynamic dictionary. That is, the target data to be compressed are analyzed, the data are registered in a dictionary structure while their frequency of appearance and the like are examined, and the data are compressed at the same time using this dictionary. A characteristic of this approach is that the dictionary changes continuously in real time. With data compression based on a dynamic dictionary, only the compressed target data remain as the product. When the data are restored, the characteristics of the history data at the time of compression must be examined and the dictionary rebuilt while each subsequent piece of data is restored. The compressed target data must therefore be processed in order from the beginning, which is a problem.
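The sequential dependency described in [0004] can be seen in a minimal LZW-style sketch. This is a textbook-style illustration, not code from the patent; the function names are hypothetical. The decoder has no stored dictionary and must rebuild it code by code, so decoding cannot start in the middle of the code stream.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Compress with a dictionary that grows while the data are read."""
    dictionary = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in dictionary:
            current = candidate
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = len(dictionary)  # register a new string
            current = bytes([byte])
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the same dictionary while decoding, strictly from the start."""
    dictionary = {i: bytes([i]) for i in range(256)}
    out = bytearray()
    previous = b""
    for code in codes:
        if code in dictionary:
            entry = dictionary[code]
        elif code == len(dictionary) and previous:
            entry = previous + previous[:1]  # string registered by this very step
        else:
            raise ValueError("code refers to a dictionary entry not yet rebuilt")
        out += entry
        if previous:
            dictionary[len(dictionary)] = previous + entry[:1]
        previous = entry
    return bytes(out)


codes = lzw_compress(b"ABABABABAB")
restored = lzw_decompress(codes)
```

Passing `codes[2:]` to `lzw_decompress` raises `ValueError`, because the entries those codes refer to exist only after the earlier codes have been processed.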

[0005] When font data are compressed for a printer or the like, the compressed font data are stored in a storage device on the printer or on the host computer. At print time, the required compressed font data are fetched from the storage device and restored to ordinary data to form the print data. When data compression is used under these conditions, what matters is how quickly the data stored in the storage device can be restored to form the print data.

[0006] Another important point when compressing font data for a printer or the like is the flexibility to output any print data (characters) in any order. In other words, a printer or the like must be able to access the stored data at random and form print data at random.

[0007]

SUMMARY OF THE INVENTION Data compression methods such as Lempel-Ziv and LZW, which are generally considered to achieve a higher compression ratio than methods such as Huffman coding, are expected to become the mainstream of compression technology. On the other hand, because these techniques compress data using a dynamic dictionary, the past data characteristics must also be examined and the dictionary updated when the data are restored, so restoration takes time.

[0008] In addition, data compression methods such as Lempel-Ziv and LZW must restore the data in order from the beginning of the compressed data, so it is not possible to extract only the required print data freely when it is needed.

[0009] Furthermore, in the data compression method called the incremental decomposition method, which is used in LZW and the like, the number of data strings that can be registered in the dictionary can be increased only one at a time. As a result, when, for example, the same data string occurs repeatedly, the data compression ratio cannot be raised sufficiently.
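One reading of the limitation in [0009] can be illustrated with a small sketch. It assumes that the incremental decomposition method corresponds to LZ78-style incremental parsing, in which every newly registered phrase extends an already registered phrase by exactly one symbol; the function name `incremental_parse` is hypothetical.

```python
def incremental_parse(data: str) -> list[str]:
    """Split data into phrases, each one symbol longer than a known phrase."""
    known = set()      # phrases registered so far (the dynamic dictionary)
    phrases = []
    current = ""
    for symbol in data:
        current += symbol
        if current not in known:
            known.add(current)        # one new registration per emitted phrase
            phrases.append(current)
            current = ""
    if current:
        phrases.append(current)       # trailing phrase that is already registered
    return phrases


phrases = incremental_parse("a" * 10)
```

For ten identical symbols the phrases are `a`, `aa`, `aaa`, `aaaa`: the registered strings grow by only one symbol per registration, so a long run of identical data still needs many dictionary entries and output codes.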

[0010] The present invention has been made to solve the above problems. Its object is to provide a data compression method that maintains a high compression ratio comparable to that of the Lempel-Ziv and LZW data compression methods, yet completes restoration in a relatively short time and allows only the required data strings to be restored freely, together with a data restoration method capable of restoring the compressed data strings and an information processing apparatus.

[0011] Another object of the present invention is to provide a data compression method that, by achieving a higher compression ratio than the incremental decomposition method, reduces the storage capacity required to store the compressed data strings and the dictionary generated in the process, together with a data restoration method capable of restoring the compressed data strings and an information processing apparatus.

[0012]

MEANS FOR SOLVING THE PROBLEMS To solve the above problems, the present invention provides a data compression method that compresses data by using a dictionary in which registered data strings can be registered in association with registration numbers and by replacing combinations of two or more data strings with the registration number, characterized in that: the dictionary is updated until a dictionary optimal for compressing the data strings to be compressed has been generated; once the optimal dictionary has been generated, the dictionary is output as the final static dictionary for restoration; the data strings to be compressed are compressed using the static dictionary; and the compressed data strings are output as the final compressed data for restoration.

[0013] According to the present invention, registered data strings are registered in the dictionary in association with registration numbers, and data compression is performed by replacing combinations of two or more data strings with the registration number. When data are compressed in this way, the original data string can be restored by reading out the registered data string identified by the registration number. The dictionary is updated until a dictionary optimal for compressing the data strings to be compressed has been generated; that is, the dictionary is updated until, for example, the amount of compressed data and the amount of dictionary data become optimal. Once the optimal dictionary has been generated, it is output as the final static dictionary for restoration. The data strings to be compressed are compressed using this static dictionary, and the compressed data strings are output as the final compressed data for restoration. The final static dictionary and compressed data thus output are stored in, for example, a storage device or storage medium, and are restored to the original data strings by an information processing apparatus such as a printer or a computer. In this way, the dictionary is updated until the optimal dictionary has been generated, and that dictionary is then used as a static dictionary to compress the data, so the data amount of the output static dictionary and compressed data can be optimized. Moreover, because the dictionary need not be static until it is output, the dictionary can be updated and the data compressed by, for example, a data compression algorithm that uses a dynamic dictionary and has a very high compression ratio. This makes it possible to raise the compression ratio of the final compressed data considerably. At the same time, because the output dictionary is static, the required data strings can be restored freely using it.
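The free restoration that a static dictionary allows can be sketched as follows. All table contents and names here are hypothetical, and treating codes below 256 as literal bytes is an assumption made for the illustration rather than something the text specifies.

```python
# Hypothetical static dictionary: registration number -> registered data string.
STATIC_DICTIONARY = {
    256: b"\x00\x00",
    257: b"\xff\xff",
    258: b"\x00\x00\xff\xff",
}

# Hypothetical compressed store: one independent code sequence per character.
COMPRESSED_GLYPHS = {
    "A": [258, 258, 0x18],
    "B": [257, 0x3C, 256],
}


def restore_glyph(name: str) -> bytes:
    """Restore one character without touching any other compressed record."""
    out = bytearray()
    for code in COMPRESSED_GLYPHS[name]:
        if code < 256:
            out.append(code)                 # literal byte (assumption)
        else:
            out += STATIC_DICTIONARY[code]   # simple lookup, no dictionary rebuild
    return bytes(out)


glyph_b = restore_glyph("B")
```

Because the dictionary never changes during restoration, `restore_glyph("B")` does not depend on whether "A" was restored first.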

[0014] The present invention is also characterized in that the updating of the dictionary until the optimal dictionary is generated is performed by deleting registrations of infrequently used registered data strings, until the number of registrations in the dictionary reaches a predetermined number, from a dictionary generated by preferentially registering combinations that join a large number of data strings.

[0015] According to the present invention, the dictionary is generated by preferentially registering combinations that join a large number of data strings. One technique that can be used to generate such a dictionary is the so-called sliding dictionary. In that case, the sliding dictionary technique is used to find the longest matching data string between the past data strings and the data string being processed, and the dictionary is generated by registering this longest match. A dictionary in which combinations joining many data strings are registered preferentially can thus be generated. Using such a dictionary means that these combinations are preferentially replaced with dictionary registration numbers, so the data compression ratio can be optimized. A dictionary generated in this way, however, may contain a very large number of registrations. The dictionary is therefore updated by deleting the registrations of infrequently used registered data strings, and the updating ends when the number of registrations reaches the predetermined number. This yields an optimal dictionary that compresses well, in other words one with a small amount of data.
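A minimal sketch of the two stages in [0015], under stated assumptions: the window size, minimum and maximum match lengths, and all function names are hypothetical, and "use frequency" is approximated by how often a string was found as a longest match.

```python
from collections import Counter


def longest_match(data: bytes, pos: int, window: int = 64,
                  min_len: int = 2, max_len: int = 16) -> bytes:
    """Longest string starting at pos that also occurs in the preceding window."""
    best = b""
    for cand in range(max(0, pos - window), pos):
        length = 0
        while (length < max_len and pos + length < len(data)
               and cand + length < pos
               and data[cand + length] == data[pos + length]):
            length += 1
        if length > len(best):
            best = data[pos:pos + length]
    return best if len(best) >= min_len else b""


def collect_matches(data: bytes) -> Counter:
    """Register every longest match and count how often each one is used."""
    usage = Counter()
    pos = 0
    while pos < len(data):
        match = longest_match(data, pos)
        if match:
            usage[match] += 1
            pos += len(match)
        else:
            pos += 1
    return usage


def prune(usage: Counter, limit: int) -> dict:
    """Keep only the `limit` most frequently used registrations."""
    return dict(usage.most_common(limit))


usage = collect_matches(b"abcabcabcxyzxyz")
dictionary = prune(usage, limit=1)
```

Here `limit` plays the role of the predetermined number of registrations at which updating ends.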

[0016] The present invention is also characterized in that the updating of the dictionary until the optimal dictionary is generated is performed by deleting registrations of infrequently used registered data strings, until the number of registrations in the dictionary reaches a predetermined number, from a dictionary generated by preferentially registering combinations of data strings that have a high probability of appearance.

[0017] According to the present invention, the dictionary is generated by preferentially registering combinations of data strings that have a high probability of appearance. The probability of appearance of a combination can be obtained, for example, by examining the probabilities of appearance of all the data strings to be compressed. Using a dictionary generated in this way means that combinations with a high probability of appearance are preferentially replaced with registration numbers, so the data compression ratio can be optimized. The dictionary is then updated by deleting the registrations of infrequently used registered data strings, and the updating ends when the number of registrations reaches the predetermined number, which yields an optimal dictionary with a small amount of data.
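As a rough illustration of registering high-probability combinations first, the sketch below estimates the probability of each adjacent pair from the whole input and assigns registration numbers in descending order of that estimate. Restricting combinations to pairs, starting registration numbers at 256, and the function name are all assumptions made for the example.

```python
from collections import Counter


def most_probable_pairs(units: list[int], how_many: int,
                        first_number: int = 256) -> dict:
    """Map registration number -> (pair, estimated probability of appearance)."""
    pairs = Counter(zip(units, units[1:]))   # every adjacent combination
    total = sum(pairs.values())
    ranked = pairs.most_common(how_many)     # highest probability first
    return {first_number + i: (pair, count / total)
            for i, (pair, count) in enumerate(ranked)}


registered = most_probable_pairs(list(b"\x00\x00\x00\xff\x00\x00"), how_many=1)
```

With this input the pair `(0, 0)` accounts for three of the five adjacent pairs, so it receives the first registration number.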

[0018] The present invention is also characterized in that the updating of the dictionary until the optimal dictionary is generated is performed by: compressing all the data strings to be compressed while updating the dictionary with a data compression algorithm in which the dictionary changes dynamically during compression; compressing all the data strings to be compressed again with the same algorithm, starting from the dictionary updated by that processing and continuing to update it; and repeating this processing until the data compression ratio becomes optimal.

[0019] According to the present invention, all the data strings to be compressed are compressed while the dictionary is updated by a data compression algorithm in which the dictionary changes dynamically during compression, for example an incremental decomposition algorithm or an additive decomposition algorithm. Next, all the data strings to be compressed are compressed again with the same algorithm, starting from the dictionary updated by this processing and continuing to update it. Repeating this processing until the data compression becomes optimal yields the optimal dictionary. Because the data are compressed by an algorithm that uses a dynamic dictionary with a high compression ratio, and the dictionary updating ends at the point where the compression ratio is optimal, the data compression ratio can be raised considerably. At the same time, because the output dictionary is static, the required data strings can be restored freely using it.

[0020] The present invention is also characterized in that whether the optimal dictionary has been generated is determined on the basis of a compressed data output count, which is incremented each time the data compression algorithm is to output compressed data.

[0021] The present invention also provides a data compression method that compresses data by using a dictionary in which registered data strings can be registered in association with registration numbers and by replacing combinations of two or more data strings with the registration number, the method comprising:

(A) a step of taking a predetermined number of data strings from the data strings to be compressed and storing them in a work area having a predetermined number of buffers;

(B) a step of analyzing whether a combination of data strings stored in adjacent buffers in the work area is registered in the dictionary and, if it is, replacing the combination with its registration number in the dictionary, shifting the data strings within the work area to fill the buffer emptied by the replacement, loading the next data string into the buffer thereby freed at the end of the work area, and analyzing again whether a combination of data strings stored in adjacent buffers in the work area is registered in the dictionary; and

(C) a step of, when the analysis in step (B) determines that none of the combinations of data strings stored in adjacent buffers in the work area is registered in the dictionary, registering in the dictionary the combination of the data strings stored in the first and second buffers from the head of the work area, erasing the first data string, shifting the data strings within the work area to fill the buffer emptied by the erasure, and loading the next data string into the buffer thereby freed at the end of the work area,

wherein steps (B) and (C) are repeated until all the data strings to be compressed have been stored in the work area.

[0022] According to the present invention, whether a combination of data strings stored in adjacent buffers in the work area is registered in the dictionary is analyzed, and if it is, the combination is replaced with its dictionary registration number. The next data string is loaded into the buffer thereby freed, and the combinations are analyzed again. When it is determined that no combination of data strings is registered in the dictionary, the combination of the first and second data strings from the head is registered, and the next data string is loaded into the buffer thereby freed. Data compression is performed by repeating this processing until all the data strings have been stored in the work area. In this way, a work area of predetermined capacity is provided, and when the same data string as the one under consideration is present in the work area, replacement using the dictionary is performed. Consequently, particularly when identical data strings are compressed, the number of dictionary registrations can be made far smaller than with the conventional incremental decomposition method, and the compressed data themselves, in which combinations have been replaced with dictionary registration numbers, can also be made far smaller than with the conventional incremental decomposition method.
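Steps (A) to (C) can be sketched as below. This is one possible reading, not the patented implementation. Assumptions: each buffer holds one symbol (a literal byte below 256 or a registration number), the leftmost registered pair is replaced first, the element erased in step (C) is emitted as compressed output, and steps (B) and (C) keep running on the tail after the input is exhausted so that nothing is lost. All names are hypothetical.

```python
from collections import deque

FIRST_NUMBER = 256  # assumption: symbols below 256 are literal bytes


def compress_pass(units, dictionary, work_size=4):
    """Run steps (A) to (C) over all units; dictionary is updated in place.

    dictionary maps (left, right) -> registration number.
    Returns the emitted symbols and the number of (B)/(C) repetitions.
    """
    if work_size < 2:
        raise ValueError("the work area needs at least two buffers")
    pending = deque(units)
    work, output, repetitions = [], [], 0

    def refill():
        # Load following data into the free buffers at the end of the work area.
        while len(work) < work_size and pending:
            work.append(pending.popleft())

    refill()                                         # step (A)
    while len(work) >= 2:
        repetitions += 1
        for i in range(len(work) - 1):               # step (B)
            number = dictionary.get((work[i], work[i + 1]))
            if number is not None:
                work[i:i + 2] = [number]             # replace pair, shift left
                break
        else:                                        # step (C)
            dictionary[(work[0], work[1])] = FIRST_NUMBER + len(dictionary)
            output.append(work.pop(0))               # assumption: erased unit is emitted
        refill()
    output.extend(work)                              # assumption: flush what is left
    return output, repetitions


def decode(symbols, dictionary):
    """Expand registration numbers back into the original units."""
    by_number = {number: pair for pair, number in dictionary.items()}

    def expand(symbol):
        if symbol not in by_number:
            return [symbol]
        left, right = by_number[symbol]
        return expand(left) + expand(right)

    return [unit for symbol in symbols for unit in expand(symbol)]


data = list(b"aaaaaaaa")
dictionary = {}
compressed, repetitions = compress_pass(data, dictionary)
```

In this sketch a replaced pair immediately becomes a candidate for a further pair, so a run of identical data can double in registered length with each new entry instead of growing by one symbol at a time.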

[0023] The present invention also provides a data compression method in which, where one pass is defined as the repetition of steps (B) and (C) until all the data strings to be compressed have been processed, compression in the current pass uses the dictionary updated in the previous pass, characterized in that: when the number of repetitions of steps (B) and (C) in the current pass is less than or equal to the number of repetitions in the previous pass, processing moves on to the next pass; and when the number of repetitions of steps (B) and (C) in the current pass is greater than the number of repetitions in the previous pass, the dictionary updated in the previous pass is output as the final static dictionary for restoration, one pass of data compression is performed using the static dictionary, and the compressed data strings are output as the final compressed data for restoration.

[0024] According to the present invention, whether the data compression ratio has become optimal is determined by comparing the number of repetitions of the processing in the current pass with that in the previous pass. Once the optimal compression ratio has been reached, the dictionary is taken as the final static dictionary, and the static dictionary and the data compressed with it are output. Data can therefore be compressed by the additive decomposition algorithm, which has a very high compression ratio, so the data compression ratio is raised considerably. At the same time, because the output dictionary is static, the required data strings can be restored freely using it.
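The pass-to-pass stopping rule can be sketched independently of the compression algorithm itself. In the sketch below, `run_pass` stands for any function that performs one full pass, updates the dictionary in place, and returns the repetition count; it is replaced here by a stub with scripted, purely hypothetical counts. The `max_passes` guard is an addition for safety and is not part of the described method.

```python
import copy


def find_static_dictionary(run_pass, dictionary, max_passes=100):
    """Repeat passes until the repetition count rises, then return the
    dictionary as it stood at the end of the previous pass."""
    previous_count = None
    for _ in range(max_passes):
        snapshot = copy.deepcopy(dictionary)   # dictionary left by the previous pass
        count = run_pass(dictionary)
        if previous_count is not None and count > previous_count:
            return snapshot, previous_count    # current pass got worse: stop
        previous_count = count                 # equal or better: go on to next pass
    return dictionary, previous_count


# Stub pass with hypothetical repetition counts, for illustration only.
scripted_counts = iter([10, 7, 6, 8])


def fake_pass(dictionary):
    dictionary[len(dictionary)] = "entry"      # pretend the pass registered something
    return next(scripted_counts)


static_dictionary, best_count = find_static_dictionary(fake_pass, {})
```

The returned dictionary would then be used unchanged for one final compression pass, whose output is the compressed data stored alongside it.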

[0025] The present invention is also characterized in that use frequency information is stored in the dictionary together with the registration numbers and registered data strings, and in that the method includes a step of successively deleting registered data strings with a low use frequency.

[0026] According to the present invention, a step of successively deleting the registrations of infrequently used registered data strings is included. Therefore, when, for example, the number of entries that can be registered in the dictionary is limited, the amount of dictionary data can be brought to an optimal size.

[0027] The present invention is also characterized in that the deletion of infrequently used registered data strings is performed by successively decrementing the use frequency counts of the registered data strings registered in the dictionary and deleting first those registered data strings whose use frequency count is the first to fall to or below a predetermined number.

[0028] According to the present invention, the use frequency counts of the registered data strings registered in the dictionary are successively decremented, and the registered data strings whose count is the first to fall to or below the predetermined number are deleted first. Thus, when, for example, the number of entries that can be registered in the dictionary is limited, the amount of dictionary data can be brought to an optimal size. Moreover, because the registered data strings whose use frequency is the first to fall to or below the predetermined number are deleted preferentially, frequently used registered data strings remain in the dictionary and an optimal dictionary can be generated.
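A minimal sketch of this decrement-and-delete rule, assuming that all counts are decremented by one per round and that deletion stops as soon as the dictionary has shrunk to the target size. The function name, the `threshold` default, and the sample counts are hypothetical.

```python
def evict_least_used(usage: dict, target_size: int, threshold: int = 0):
    """Decrement every use count round by round and delete the entries that
    reach the threshold first, until only target_size entries remain."""
    remaining = dict(usage)
    removed = []
    while len(remaining) > target_size:
        for number in remaining:
            remaining[number] -= 1                     # decrement all use counts
        expired = [n for n, count in remaining.items() if count <= threshold]
        for number in expired:
            if len(remaining) <= target_size:
                break
            del remaining[number]                      # first to expire is deleted first
            removed.append(number)
    return remaining, removed


# Hypothetical use counts keyed by registration number.
remaining, removed = evict_least_used({256: 5, 257: 1, 258: 3, 259: 2},
                                      target_size=2)
```

The counts left in `remaining` are the decremented values, so the entries that survive are those that were used most often.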

[0029] The present invention is also characterized in that the data strings to be compressed are font data required for printing characters.

[0030] According to the present invention, the font data required for printing characters are the compression target. Such font data may be bitmap font data, outline font data, or the like. When bitmap font data are compressed, dot data arranged vertically or horizontally in units of a predetermined size (for example, 1 byte or 1 word) can be the compression target. When outline font data are compressed, the compression target can be, for example, attribute information for each point making up the outline of a character, or information for controlling the vector coordinates of each point.

[0031] The present invention is also characterized in that only part of the font data forms the data strings to be compressed by the above method, and another part is compressed by another data compression method.

[0032] According to the present invention, depending on the characteristics of the data making up the font data, part of the data is compressed by a data compression method that uses a static dictionary, the incremental decomposition algorithm, the additive decomposition algorithm, or the like, and another part is compressed by another compression method, for example Huffman coding. Changing the compression method applied according to the characteristics of the data in this way makes it possible to raise the data compression ratio further.

[0033] The present invention is also characterized in that information representing the attributes of each point of an outline font and information representing the characteristics of a character form the data strings to be compressed by the above method, and information representing the vector coordinates of the launch direction at each point is compressed by another data compression method.

[0034] The present invention is also characterized in that characters sharing a common typeface are compressed using a common dictionary.

[0035] According to the present invention, characters sharing a common typeface are compressed using a common dictionary. For example, for all Mincho characters, the dictionary is updated and the data are compressed using a dictionary dedicated to Mincho, yielding the final static dictionary and compressed data. Likewise, for all Gothic character strings, the dictionary is updated and the data are compressed using a dictionary dedicated to Gothic, yielding the final static dictionary and compressed data. Sharing a dictionary within each typeface in this way allows the data to be compressed efficiently.

【００３６】また、本発明は、圧縮の対象となるデータ
列が文字列であることを特徴とする。Further, the present invention is characterized in that the data string to be compressed is a character string.

【００３７】本発明によれば、文字列データが圧縮対象
となる。これにより、例えば文字の記憶に必要な容量等
を節約することができる。According to the present invention, character string data is to be compressed. As a result, for example, the capacity required for storing characters can be saved.

【００３８】また、本発明は、最終的に生成された辞書
に含まれる登録番号と登録データ列の情報とから、前記
登録データ列を復元専用のデータ形式に変換した復元専
用登録データ列と該復元専用登録データ列のデータ長と
該復元専用登録データ列の開始アドレスの情報とを含む
復元専用の辞書が生成されることを特徴とする。Further, the present invention is characterized in that, from the registration numbers and the registration data string information contained in the finally generated dictionary, a restoration-dedicated dictionary is generated that includes restoration-dedicated registration data strings obtained by converting the registration data strings into a restoration-dedicated data format, the data length of each restoration-dedicated registration data string, and information on the start address of each restoration-dedicated registration data string.

【００３９】本発明によれば、復元専用登録データ列
と、この復元専用登録データ列のデータ長と、この復元
専用登録データ列の開始アドレスの情報とを含む復元専
用の辞書が生成される。そして、復元の際には、この復
元専用辞書を用いてデータの復元が行われる。即ち、前
記開始アドレスで指定される位置から前記データ長で指
定される長さの前記復元専用登録データ列を読み出すこ
とで辞書からデータを読み出し、復元処理が行われる。
この場合、復元専用登録データ列は、復元専用のデータ
形式に変換されている。従って、通常の辞書を用いる場
合よりも、非常に速く復元処理を行うことが可能とな
る。According to the present invention, a restoration-only dictionary including a restoration-dedicated registration data sequence, a data length of the restoration-dedicated registration data sequence, and information on a start address of the restoration-dedicated registration data sequence is generated. Then, at the time of restoration, data restoration is performed using this restoration-dedicated dictionary. That is, data is read from the dictionary by reading the restoration-dedicated registration data string having the length designated by the data length from the position designated by the start address, and restoration processing is performed.
In this case, the restoration-dedicated registration data string is converted into a restoration-only data format. Therefore, the restoration process can be performed much faster than when a normal dictionary is used.
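The lookup by start address and data length can be sketched in Python as follows. This is only an illustration: the flat byte area, the `(start_address, data_length)` table indexed by registration number, and the names `build_restore_dictionary` and `restore` are assumptions, not the restoration-dedicated format actually defined by the invention.

```python
def build_restore_dictionary(entries):
    """Lay the registered strings out back to back (hypothetical layout).

    entries: list of bytes objects, indexed by registration number.
    Returns the flat byte area and a table of (start_address, data_length).
    """
    blob = bytearray()
    table = []
    for data in entries:
        table.append((len(blob), len(data)))
        blob += data
    return bytes(blob), table


def restore(codes, blob, table):
    """Restore data by copying `data_length` bytes from `start_address` per code."""
    out = bytearray()
    for number in codes:
        start, length = table[number]
        out += blob[start:start + length]
    return bytes(out)


entries = [b"ab", b"cde", b"f"]
blob, table = build_restore_dictionary(entries)
restored = restore([1, 0, 2, 1], blob, table)
```

Each code costs one table lookup and one contiguous copy, which is the kind of access pattern the paragraph above credits for the faster restoration.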

【００４０】また、本発明は、上記のいずれかのデータ
圧縮方法により生成された圧縮データと最終的な辞書と
を用いて、該データ圧縮方法に応じた復元処理により圧
縮対象となったデータ列を復元することを特徴とする。Further, the present invention is characterized in that the data string that was the compression target is restored, by a restoration process corresponding to the data compression method, using the compressed data generated by any of the above data compression methods and the final dictionary.

【００４１】本発明によれば、上記データ圧縮方法によ
り生成された圧縮データと最終的な辞書とを用いて元の
データ列が復元される。これにより、この復元されたデ
ータ列を用いて所定の処理、例えば文字の印字等の処理
を行うことができる。According to the present invention, the original data string is restored using the compressed data generated by the data compression method and the final dictionary. As a result, predetermined processing, for example, processing such as printing of characters, can be performed using the restored data string.

【００４２】また、本発明は、上記のいずれかのデータ
圧縮方法により生成された圧縮データと最終的な辞書と
を用いて、該データ圧縮方法に応じた復元処理により圧
縮対象となったデータ列を復元する手段を含むことを特
徴とする。Further, the present invention is characterized by including means for restoring the data string that was the compression target, by a restoration process corresponding to the data compression method, using the compressed data generated by any of the above data compression methods and the final dictionary.

【００４３】本発明によれば、上記データ圧縮方法によ
り生成された圧縮データと最終的な辞書を用いて、復元
手段により元のデータ列が復元される。そして、この復
元手段は、例えば、コンピュータ、プリンタ等の情報処
理装置に内蔵させることができる。According to the present invention, the original data string is restored by the restoration means using the compressed data generated by the data compression method and the final dictionary. The restoring means can be incorporated in an information processing device such as a computer and a printer.

【００４４】[0044]

【発明の実施の形態】以下、本発明の最適な実施例につ
いて説明する。なお、以下の第１、第２の実施例では、
説明を簡単にするためにデータ列として主に文字列を圧
縮する場合を例にとり説明を行う。しかし、本発明にお
けるデータ列には、このような文字列のみならず例えば
フォントデータを構成するためのバイト列、ワード列等
のあらゆる種類のデータ列が含まれる。DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS The preferred embodiments of the present invention will be described below. In the following first and second embodiments, to simplify the description, the case where mainly a character string is compressed as the data string is taken as an example. However, the data string in the present invention includes not only such character strings but also every kind of data string, such as the byte strings and word strings that constitute font data.

【００４５】１．第１の実施例図１には、本実施例のデータ圧縮方法を説明するための
フローチャートが示される。本実施例のデータ圧縮方法
では、登録番号に関連づけて登録データ列を登録できる
辞書が使用される。そして、データ列の２以上の組み合
わせをこの登録番号に置き換えることでデータ圧縮が行
われることになる。まず、圧縮対象である全てのデータ
列（例えば文字列）から辞書が生成され（ステップＡ
１）、データ圧縮に最適な辞書が生成されるまで辞書の
更新が繰り返される（ステップＡ２、Ａ３）。そして、
最適な辞書が生成された段階でこの辞書を最終的な復元
用の静的辞書とし、この静的辞書により圧縮対象である
データ列のデータ圧縮が行われる（ステップＡ４）。そ
して、これにより圧縮された最終的な復元用の圧縮デー
タ及び最終的な復元用の静的辞書が出力され、マスクＲ
ＯＭ、ＥＥＰＲＯＭ等の記憶装置、記憶媒体に格納され
る。そして、この記憶装置等に格納された静的辞書及び
圧縮データを用いて、プリンタあるいはホストコンピュ
ータ等の情報処理装置内においてデータの復元処理が行
われることになる。1. First Embodiment FIG. 1 shows a flowchart for explaining the data compression method of this embodiment. The data compression method of this embodiment uses a dictionary in which registration data strings can be registered in association with registration numbers. Data compression is performed by replacing combinations of two or more data strings with these registration numbers. First, a dictionary is generated from all the data strings (for example, character strings) to be compressed (step A1), and the dictionary is updated repeatedly until a dictionary optimal for data compression is obtained (steps A2 and A3). Once the optimal dictionary has been generated, it is used as the final static dictionary for restoration, and the data strings to be compressed are compressed with this static dictionary (step A4). The resulting final compressed data for restoration and the final static dictionary for restoration are then output and stored in a storage device or storage medium such as a mask ROM or an EEPROM. Using the static dictionary and compressed data stored in the storage device or the like, data restoration is performed in an information processing apparatus such as a printer or a host computer.

【００４６】なお、辞書を生成・更新してゆき、どの段
階でデータ圧縮に最適な辞書とするかを決める手法とし
ては、後述するように種々の手法が考えられる。As a method of generating and updating the dictionary and determining at which stage the dictionary is optimal for data compression, various methods can be considered as described later.

【００４７】図２には、本実施例のデータ圧縮方法が使
用されるデータ圧縮装置１２の構成の一例が示される。
圧縮対象である全データ列１１は、まず辞書生成・更新
手段１３に入力され、これにより辞書の生成・更新が行
われる。そして、データ圧縮に最適な辞書が生成された
段階で、該辞書は静的辞書１４として静的辞書保持手段
１５に保持される。そして、この保持された最適の静的
辞書１４は、静的辞書出力手段１６により外部の記憶手
段（図示せず）へと出力される一方、データ圧縮手段１
７において全データ列１１をデータ圧縮するのに利用さ
れる。このデータ圧縮は、データ列の２以上の組み合わ
せを辞書の登録番号に置き換えることで行われる。そし
て、この結果得られた圧縮データ１９は、圧縮データ出
力手段１８により外部の記憶手段へと出力されることに
なる。FIG. 2 shows an example of the configuration of a data compression apparatus 12 in which the data compression method of this embodiment is used. All the data strings 11 to be compressed are first input to the dictionary generating/updating means 13, which generates and updates the dictionary. When a dictionary optimal for data compression has been generated, it is held as the static dictionary 14 in the static dictionary holding means 15. The held optimal static dictionary 14 is output to an external storage means (not shown) by the static dictionary output means 16, and is also used by the data compression means 17 to compress all the data strings 11. This compression is performed by replacing combinations of two or more data strings with registration numbers of the dictionary. The resulting compressed data 19 is output to the external storage means by the compressed data output means 18.

【００４８】次に、本実施例における辞書の構成につい
て説明する。例えば、 ”ｓｔａｔｉｃ＿ｓｔｒｉｎｇ＿ｄｉｃｔｉｏｎａｒ
ｙ” というデータ列（文字列）を圧縮する場合を考える。こ
の場合に辞書に、 ”ｓｔ”、”ａｔ”、”ｉｃ”、”＿”、”ｓｔ”、”
ｒｉｎｇ＿ｄ”、 ”ｉｃ”、”ｔｉｏｎａｒｙ” の様に８個のデータ列の組み合わせが登録されていたと
する。この場合には、これらのデータ列の組み合わせは
例えば下記に示すように登録番号に関連づけて登録され
ている。０：ｓｔ１：ａｔ２：ｉｃ３：＿４：ｒｉｎｇ＿ｄ５：ｔｉｏｎａｒｙすると、”ｓｔａｔｉｃ＿ｓｔｒｉｎｇ＿ｄｉｃｔｉｏ
ｎａｒｙ”というデータ列は、これらの登録番号によ
り、０、１、２、３、０、４、２、５というように置き換えることが可能となり、これにより
データが圧縮されることになる。Next, the configuration of the dictionary in this embodiment will be described. For example, consider compressing the data string (character string) "static_string_dictionary". Assume that the eight data-string combinations "st", "at", "ic", "_", "st", "ring_d", "ic", and "tionary", of which six are distinct, are registered in the dictionary. These combinations are registered in association with registration numbers, for example as follows: 0: st, 1: at, 2: ic, 3: _, 4: ring_d, 5: tionary. The data string "static_string_dictionary" can then be replaced by the registration numbers 0, 1, 2, 3, 0, 4, 2, 5, and the data is thereby compressed.
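The replacement by registration numbers can be reproduced with a short Python sketch. The greedy longest-match scan and the function names `compress` and `restore` are assumptions made for illustration; the text above only states which registration numbers result.

```python
def compress(text, dictionary):
    """Replace substrings by registration numbers (greedy longest match, assumed)."""
    by_string = {s: n for n, s in dictionary.items()}
    max_len = max(len(s) for s in by_string)
    codes, i = [], 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for k in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + k]
            if piece in by_string:
                codes.append(by_string[piece])
                i += k
                break
        else:
            raise ValueError(f"no dictionary entry matches at position {i}")
    return codes


def restore(codes, dictionary):
    """Restore the original string by concatenating the registered strings."""
    return "".join(dictionary[n] for n in codes)


dictionary = {0: "st", 1: "at", 2: "ic", 3: "_", 4: "ring_d", 5: "tionary"}
codes = compress("static_string_dictionary", dictionary)
print(codes)  # [0, 1, 2, 3, 0, 4, 2, 5]
```

Because the dictionary is fixed, `restore` needs only the codes and the dictionary, with no state carried over from earlier data.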

【００４９】次に、最適な辞書を生成するための種々の
手法について説明する。本実施例では、最終的には静的
辞書と圧縮データの２つが出力され、これらが記憶装置
に格納される。従って、記憶装置の使用容量を節約する
ためには、最終的な静的辞書、圧縮データのデータ量を
減らす必要があり、これらのデータ量を少なくできる辞
書を最適な辞書ということができる。即ち、最適な辞書
とするためには、辞書自体のデータ量を小さくできるこ
とが望ましく、あるいは、圧縮データのデータ量を小さ
くできることが望ましい。このために、本実施例では例
えば以下の第１〜第４の手法を用いている。なお、以下
ではデータ列として文字列を例に取り説明する。Next, various methods for generating an optimal dictionary will be described. In this embodiment, finally, two of the static dictionary and the compressed data are output, and these are stored in the storage device. Therefore, in order to save the used capacity of the storage device, it is necessary to reduce the data amount of the final static dictionary and the compressed data, and a dictionary capable of reducing the data amount can be called an optimal dictionary. That is, in order to obtain an optimal dictionary, it is desirable that the data amount of the dictionary itself can be reduced, or that the data amount of the compressed data can be reduced. For this purpose, this embodiment uses, for example, the following first to fourth methods. In the following, a character string will be described as an example of a data string.

【００５０】（Ａ）最適な辞書を得るための第１の手法この手法では、スライド辞書と呼ぶ手法を利用し、組み
合わせ個数の多いデータ列の組み合わせを優先的に登録
することにより辞書を生成する。そして、生成された辞
書の登録数を使用頻度情報に基づいて順次削除してゆく
ことで最適な辞書を得る。(A) First Method for Obtaining an Optimal Dictionary This method uses a technique called a slide dictionary and generates a dictionary by preferentially registering data-string combinations that contain many elements. An optimal dictionary is then obtained by successively reducing the number of entries in the generated dictionary based on usage-frequency information.

【００５１】まず、スライド辞書と呼ぶ手法について図
３（Ａ）〜（Ｅ）を用いて説明する。この手法では、圧
縮対象となる全文字列４１を最初から順に対象文字列４
３として作業領域であるメモリ空間上に記憶させてゆ
く。そして、過去の文字列４２（最初から対象文字列４
３の前まで）に該対象文字列４３と同じ文字列がなかっ
たかを調べる。そして、過去に同じ文字列がなかったな
ら、該対象文字列４３の最初の１文字を対象文字列４３
から過去の文字列４２へと移す（スライドする）。一
方、もし過去に同じ文字列があったなら、次の文字列も
一致するかを調べ、これを繰り返すことにより過去の文
字列４２と対象文字列４３との間の最長一致文字列（文
字列の組み合わせ個数の最も多いもの）を見つけ出す。First, the technique called a slide dictionary will be described with reference to FIGS. 3(A) to 3(E). In this technique, all the character strings 41 to be compressed are stored, in order from the beginning, as the target character string 43 in a memory space serving as a work area. The past character string 42 (from the beginning up to just before the target character string 43) is then searched for a character string identical to the target character string 43. If no identical character string is found in the past, the first character of the target character string 43 is moved (slid) from the target character string 43 to the past character string 42. If an identical character string is found in the past, the next character is also checked for a match, and by repeating this the longest matching character string between the past character string 42 and the target character string 43 (the one with the largest number of combined characters) is found.

【００５２】例えば、圧縮対象となる文字列がＡＢＣＡ
ＢＣＤＥＦであった場合を考える。この場合は、まずＡ
が対象文字列４３となるが、過去に同じ文字列はないた
め、Ａは過去の文字列４２であるスライド辞書に移され
る（図３（Ｂ）参照）。そして、Ｂ、Ｃについても過去
に同じ文字列がないためスライド辞書に移される（図３
（Ｃ）参照）。そして、次にＡが対象文字列４３となる
が、この場合にはＡはスライド辞書内にあるため、次の
Ｂが一致するか否かが調べられる（図３（Ｄ）参照）。
そして、この場合は一致するため、次にＣが一致するか
否かが調べられる（図３（Ｅ）参照）。そして、次にＤ
が一致するか否かが調べられるが、Ｄはスライド辞書内
にない。そこで、この場合はＡＢＣが最長一致文字列と
される。For example, consider the case where the character string to be compressed is ABCABCDEF. First, A becomes the target character string 43, but since the same character string does not exist in the past, A is moved to the slide dictionary, which is the past character string 42 (see FIG. 3(B)). B and C are likewise moved to the slide dictionary because no identical character string exists in the past (see FIG. 3(C)). Next, A becomes the target character string 43. This time A is in the slide dictionary, so it is checked whether the next character B also matches (see FIG. 3(D)). It does, so it is then checked whether C matches (see FIG. 3(E)). Next, it is checked whether D matches, but D is not in the slide dictionary. In this case, therefore, ABC is taken as the longest matching character string.

【００５３】さて、スライド辞書と呼ばれる手法では、
このように最長一致文字列ＡＢＣを見つけ出すことによ
り、例えばＡＢＣＡＢＣＤＥＦという文字列を「Ａ］、
「Ｂ」、「Ｃ］、「３つ前と３文字同じ」、「Ｄ」、
「Ｅ」、「Ｆ］に圧縮する。具体的には、一致が無い場
合には、その文字列を「一致無しフラッグ＝１及びその
文字コード」により表し、一致があった場合には、その
最長一致文字列を「一致無しフラッグ＝０及び一致場所
及び一致長」で表すことによりデータ圧縮を行う。In the technique called a slide dictionary, finding the longest matching character string ABC in this way allows, for example, the character string ABCABCDEF to be compressed to "A", "B", "C", "same as the 3 characters starting 3 positions back", "D", "E", "F". Specifically, when there is no match, the character is represented by "no-match flag = 1 and its character code", and when there is a match, the longest matching character string is represented by "no-match flag = 0, the match position, and the match length".

【００５４】しかし、本実施例においては、スライド辞
書の手法を、実際のデータ圧縮に利用するのではなく、
圧縮対象文字列の中の最長一致文字列を見つけ出すため
にのみ利用している。そして、本実施例では、最長一致
文字列が見つけ出されると、この最長一致文字列を登録
番号に関連づけて辞書に登録する。上記例では、ＡＢＣ
が辞書に登録されることになる。このようにして、全て
の圧縮対象文字列の中で２回以上現れる最長一致文字が
見つけ出され、これらが辞書に登録されることになる。
これにより、組む合わせ個数の多い文字列が優先的に辞
書に登録されることになる。In this embodiment, however, the slide dictionary technique is not used for the actual data compression; it is used only to find the longest matching character strings within the character strings to be compressed. When a longest matching character string is found, it is registered in the dictionary in association with a registration number. In the above example, ABC is registered in the dictionary. In this way, the longest matching character strings that appear two or more times among all the character strings to be compressed are found and registered in the dictionary. As a result, character strings containing many combined characters are preferentially registered in the dictionary.
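The slide dictionary search on ABCABCDEF can be sketched in Python as below. The names `longest_match` and `slide_scan`, the window size, and the `min_len=2` threshold (single characters are treated as unmatched) are assumptions for illustration. Matches are restricted to the past character string, as in the description.

```python
def longest_match(text, pos, window):
    """Return (distance, length) of the longest match for text[pos:] in the past."""
    best_dist, best_len = 0, 0
    for s in range(max(0, pos - window), pos):
        k = 0
        # Compare character by character; the source must stay inside the past.
        while pos + k < len(text) and s + k < pos and text[s + k] == text[pos + k]:
            k += 1
        if k > best_len:
            best_dist, best_len = pos - s, k
    return best_dist, best_len


def slide_scan(text, window=4096, min_len=2):
    """Return slide-dictionary tokens and the repeated strings to register."""
    tokens, repeats = [], []
    pos = 0
    while pos < len(text):
        dist, length = longest_match(text, pos, window)
        if length >= min_len:
            tokens.append((dist, length))            # "dist back, length characters"
            repeats.append(text[pos:pos + length])   # candidate dictionary entry
            pos += length
        else:
            tokens.append(text[pos])                 # no match: slide one character
            pos += 1
    return tokens, repeats


tokens, repeats = slide_scan("ABCABCDEF")
print(tokens)   # ['A', 'B', 'C', (3, 3), 'D', 'E', 'F']
print(repeats)  # ['ABC']
```

In the embodiment only `repeats` matters, since these strings are what gets registered in the dictionary; the token list corresponds to the conventional use of the technique.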

【００５５】なお、スライド辞書手法を用いる場合は過
去の文字列をメモリに記憶しておく必要がある。しか
し、図３（Ａ）に示すようにメモリの容量は有限であり
メモリに記憶できる範囲は有限である。従って、この場
合には過去の文字列４２の中においてメモリに記憶でき
る範囲内のものだけがスライド辞書になる。When using the slide dictionary method, it is necessary to store past character strings in a memory. However, as shown in FIG. 3A, the capacity of the memory is finite and the range that can be stored in the memory is finite. Therefore, in this case, only the past character strings 42 within the range that can be stored in the memory become the slide dictionary.

【００５６】図４には、スライド辞書手法を利用して最
適な辞書を得る手法を説明するためのフロチャートが示
される。図４に示すように、まず、スライド辞書手法を
用いて圧縮対象文字列の中で２回以上現れる最長一致文
字が見つけ出され、これらを辞書に登録することで辞書
が生成される（ステップＢ１）。次に、生成された辞書
を用いて、全ての圧縮対象文字列の中から辞書登録文字
列と最長に一致する文字列の組み合わせを見つけ出し、
見つけ出した時点でその登録文字列の使用頻度を１つず
つ増やす（ステップＢ２）。例えば圧縮対象文字列がＡ
ＢＣＤＥＦで、登録文字列がＡＢとＡＢＣであった場合
は、登録文字列ＡＢＣの使用頻度が１つ増やされる。FIG. 4 shows a flowchart for explaining a technique for obtaining an optimal dictionary using the slide dictionary technique. As shown in FIG. 4, first, the longest matching character strings that appear two or more times in the character strings to be compressed are found using the slide dictionary technique, and a dictionary is generated by registering them (step B1). Next, using the generated dictionary, all the character strings to be compressed are searched for the longest character string that matches a registered character string, and each time one is found the usage frequency of that registered character string is incremented by one (step B2). For example, if the character string to be compressed is ABCDEF and the registered character strings are AB and ABC, the usage frequency of the registered character string ABC is incremented by one.

【００５７】このようにして使用頻度を計算した後、次
に、使用頻度の少ない登録文字列から順に例えば１００
個程度の登録を削除する（ステップＢ３）。この場合、
例えば辞書への登録可能数が、４０９６−２５６＝３８
４０個であった場合には、全登録数が３８４０個未満に
ならないように登録の削除を行う。具体的には、例えば
削除前の辞書への全登録数が３９００個であった場合に
は６０個のみを削除する。After the usage frequencies have been calculated in this way, for example about 100 entries are deleted, starting with the least frequently used registered character strings (step B3). If, for example, the number of entries that can be registered in the dictionary is 4096 - 256 = 3840, the deletion is performed so that the total number of entries does not fall below 3840. Specifically, if the dictionary holds 3900 entries before deletion, only 60 are deleted.

【００５８】次に、全登録数が、３８４０個（所定数）
以下か否かが判断され（ステップＢ４）、３８４０個よ
り多い場合には、ステップＢ２に戻り再度使用頻度が計
算され、ステップＢ３で１００個程度の登録が削除され
る。このようにしてステップＢ２〜Ｂ４を繰り返し、辞
書の登録数を順次少しずつ減らしてゆく。そして、登録
数が３８４０個となった時点でステップＢ５に移行す
る。上記のように辞書には例えば３８４０個登録できる
ため、登録数が３８４０個となったところで、その辞書
は最適な辞書とされることになる。Next, it is determined whether the total number of entries is 3840 (the predetermined number) or less (step B4). If it is more than 3840, the process returns to step B2, the usage frequencies are calculated again, and about 100 entries are deleted in step B3. Steps B2 to B4 are repeated in this way, gradually reducing the number of entries in the dictionary. When the number of entries reaches 3840, the process proceeds to step B5. Since the dictionary can hold, for example, 3840 entries as described above, the dictionary is regarded as the optimal dictionary once the number of entries has reached 3840.
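Steps B2 to B4 can be sketched in Python as follows. The function names `count_usage` and `prune`, the greedy longest-match counting, and the tie-breaking order among equally used entries are assumptions; the batch size of 100 and the limit of 3840 are the example values from the text.

```python
def count_usage(text, entries):
    """Step B2: count how often each entry is chosen by a longest-match scan."""
    usage = {e: 0 for e in entries}
    max_len = max((len(e) for e in entries), default=0)
    i = 0
    while i < len(text):
        for k in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + k]
            if piece in usage:
                usage[piece] += 1
                i += k
                break
        else:
            i += 1  # no entry matches here; skip one character
    return usage


def prune(text, entries, limit=3840, batch=100):
    """Steps B2 to B4: delete least-used entries until `limit` entries remain."""
    entries = set(entries)
    while len(entries) > limit:                       # step B4
        usage = count_usage(text, entries)            # step B2
        n = min(batch, len(entries) - limit)          # never drop below the limit
        victims = sorted(entries, key=lambda e: (usage[e], e))[:n]
        entries.difference_update(victims)            # step B3
    return entries


usage = count_usage("ABCDEF", {"AB", "ABC"})
print(usage["ABC"], usage["AB"])  # 1 0
```

Recounting after every batch matters because deleting an entry changes which remaining entries the longest-match scan selects.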

【００５９】最後に、この最適な辞書を静的な辞書と
し、この静的辞書により全圧縮対象文字列を圧縮し、最
終的な静的辞書と圧縮データを出力することになる（ス
テップＢ５、Ｂ６）。Finally, the optimal dictionary is used as a static dictionary, all the character strings to be compressed are compressed by the static dictionary, and the final static dictionary and compressed data are output (step B5, B6).

【００６０】（Ｂ）最適な辞書を得るための第２の手法この手法では、出現確率の高い文字列の組み合わせを優
先的に登録することで辞書を生成する。そして、生成さ
れた辞書の登録数を使用頻度情報に基づいて順次削除し
てゆくことで最適な辞書を得る。(B) Second Method for Obtaining an Optimal Dictionary In this method, a dictionary is generated by preferentially registering combinations of character strings with a high appearance probability. An optimal dictionary is then obtained by successively reducing the number of entries in the generated dictionary based on usage-frequency information.

【００６１】まず、圧縮対象となる全文字列を例えば１
回解析することで各文字列の出現確率を計算する。その
後、この出現確率に基づいて、文字列の組み合わせの出
現確率を求める。First, the entire character string to be compressed is, for example, 1
The appearance probability of each character string is calculated by analyzing the number of times. Then, based on the appearance probability, the appearance probability of the combination of character strings is obtained.

【００６２】例えば、上記の解析・計算により、ａ、
ｂ、ｃ、ｄの出現確率が、ａ；５０％ｂ；２０％ｃ；１０％ｄ；５％と求まったとする。すると、文字列の組み合わせａａ等
の出現確率は、ａａ；２５％ａａａ；１２．５％ａｂ；１０％ｂａ；１０％ａｃ；５％ｃａ；５％ａａｂ；５％ａｂａ；５％ｂａａ；５％ａｄ；２．５％ ‥‥‥ と計算される。但し、これは文字列間の出現確率に相関
関係が無いと仮定した場合の予想値となるものである。For example, suppose that the above analysis and calculation give the appearance probabilities of a, b, c, and d as a: 50%, b: 20%, c: 10%, d: 5%. The appearance probabilities of character-string combinations such as aa are then calculated as aa: 25%, aaa: 12.5%, ab: 10%, ba: 10%, ac: 5%, ca: 5%, aab: 5%, aba: 5%, baa: 5%, ad: 2.5%, and so on. These, however, are expected values under the assumption that the appearance probabilities of the character strings are uncorrelated.
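Under the stated assumption of no correlation, the probability of a combination is simply the product of the individual probabilities. A small Python sketch using the values a: 50%, b: 20%, c: 10%, d: 5% follows. The name `combo_probability` and the restriction to combinations of two or three characters are illustrative assumptions.

```python
from itertools import product
from math import prod

p = {"a": 0.50, "b": 0.20, "c": 0.10, "d": 0.05}


def combo_probability(s, p):
    """Expected appearance probability of s, assuming independent characters."""
    return prod(p[ch] for ch in s)


# Enumerate every combination of 2 or 3 characters and rank by probability.
candidates = ["".join(t) for n in (2, 3) for t in product(p, repeat=n)]
ranked = sorted(candidates, key=lambda s: combo_probability(s, p), reverse=True)

print(ranked[:2])                    # ['aa', 'aaa']
print(combo_probability("aaa", p))   # 0.125
```

Registering entries from the top of `ranked` downward corresponds to the preferential registration of high-probability combinations described next.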

【００６３】このようにして、文字列の組み合わせの出
現確率を求めた後、この出現確率の高い文字列の組み合
わせから優先的に登録することで辞書を生成する。次
に、例えば図４のステップＢ２〜Ｂ４と同様の手法によ
り、使用頻度の少ない文字列の登録を、辞書の登録が所
定数例えば３８４０個になるまで削除し、最適な辞書を
生成する。そして、ステップＢ５、Ｂ６と同様の手法に
より、最終的な静的辞書とこれにより圧縮された圧縮デ
ータを出力することになる。以上の手法によれば、初め
の解析が１回の走査で完了できるという利点がある。After the appearance probability of a combination of character strings is obtained in this way, a dictionary is generated by preferentially registering the combination of character strings having a high appearance probability. Next, for example, by using the same method as in steps B2 to B4 in FIG. 4, the registration of a character string that is used less frequently is deleted until the number of dictionary registrations reaches a predetermined number, for example, 3840, and an optimal dictionary is generated. Then, by the same method as in steps B5 and B6, the final static dictionary and the compressed data compressed thereby are output. According to the above method, there is an advantage that the first analysis can be completed in one scan.

【００６４】（Ｃ）最適な辞書を得るための第３の手法この手法では、増分分解アルゴリズムを利用して最適な
辞書を得る。増分分解アルゴリズムの詳細については、
後述の第２の実施例において加増分解アルゴリズムの対
比において説明する。(C) Third Method for Obtaining an Optimal Dictionary In this method, an optimal dictionary is obtained using the incremental decomposition algorithm. The details of the incremental decomposition algorithm are described in the second embodiment below, in comparison with the additive decomposition algorithm.

【００６５】図５には、増分分解アルゴリズムを利用し
て、最適な辞書を得る手法を説明するためのフローチャ
ートが示される。この手法では、まず、辞書の初期化を
行う（ステップＣ１）。これにより辞書の登録番号０〜
２５５にだけ登録文字列が登録された状態になる。具体
的には０〜２５５にはアスキーコードの文字が記憶され
る。次に、圧縮データの出力回数を表すＮＵＭ０がＮＵ
Ｍ０＝０に設定される（ステップＣ２）。その後、増分
分解アルゴリズムを用いて辞書を更新しながら全圧縮対
象文字列に対するデータ圧縮を行い、圧縮データを出力
すべき時に上記のＮＵＭ０の値を１ずつ増やす（ステッ
プＣ３）。この際、圧縮データ自体は外部に出力しな
い。ステップＣ３の処理は辞書の更新を目的とする処理
だからである。FIG. 5 shows a flowchart for explaining a technique for obtaining an optimal dictionary using the incremental decomposition algorithm. In this technique, the dictionary is first initialized (step C1). After initialization, registered character strings exist only at dictionary registration numbers 0 to 255; specifically, the ASCII code characters are stored at 0 to 255. Next, NUM0, which represents the number of times compressed data is output, is set to NUM0 = 0 (step C2). Then all the character strings to be compressed are compressed while the dictionary is updated using the incremental decomposition algorithm, and NUM0 is incremented by one each time compressed data would be output (step C3). The compressed data itself is not output externally at this point, because the purpose of step C3 is to update the dictionary.

【００６６】次に、ＮＵＭ１がＮＵＭ１＝０に設定され
る（ステップＣ４）。その後、ステップＣ３で最終的に
得られた辞書を用いて、ステップＣ３と同様の処理を行
う。即ち、増分分解アルゴリズムを用いて辞書を更新し
ながら全圧縮対象文字列に対するデータ圧縮を行い、圧
縮データを出力すべき時に上記のＮＵＭ１の値を１ずつ
増やす（ステップＣ５）。そして、この際にも圧縮デー
タ自体は外部に出力しない。Next, NUM1 is set to NUM1 = 0 (step C4). Thereafter, the same processing as in step C3 is performed using the dictionary finally obtained in step C3. That is, data compression is performed on all character strings to be compressed while updating the dictionary using the incremental decomposition algorithm, and the value of NUM1 is incremented by one when compressed data is to be output (step C5). At this time, the compressed data itself is not output to the outside.

【００６７】次に、ＮＵＭ１＞ＮＵＭ０か否かが判断さ
れる（ステップＣ６）。これによりステップＣ５の処理
によりデータ圧縮率が最適になったか否か、即ち最適な
辞書になったか否かが判断される。そして、ＮＵＭ１≦
ＮＵＭ０ならば、まだ最適な辞書ではないとして、ＮＵ
Ｍ０＝ＮＵＭ１とされ（ステップＣ７）、ステップＣ
４、Ｃ５の処理が繰り返される。ＮＵＭ０、ＮＵＭ１は
圧縮データの出力回数を表し、これが少ないということ
は圧縮データのデータ量も少ないことを意味する。従っ
て、ＮＵＭ１≦ＮＵＭ０ということは、ステップＣ５の
処理でデータ圧縮率が向上したことを意味する。このた
め、この場合には、更にステップＣ４、Ｃ５の処理が繰
り返されることになるわけである。そして、ステップＣ
６で、ＮＵＭ１＞ＮＵＭ０となった場合に、データ圧縮
に最適な辞書になったと判断される。Next, it is determined whether NUM1 > NUM0 (step C6). This determines whether the processing of step C5 has brought the data compression ratio to its optimum, that is, whether the dictionary has become optimal. If NUM1 ≦ NUM0, the dictionary is regarded as not yet optimal, NUM0 is set to NUM1 (step C7), and steps C4 and C5 are repeated. NUM0 and NUM1 represent the number of times compressed data is output, and a smaller count means a smaller amount of compressed data. NUM1 ≦ NUM0 therefore means that the processing of step C5 improved the data compression ratio, which is why steps C4 and C5 are repeated in this case. When NUM1 > NUM0 is found in step C6, the dictionary is judged to have become optimal for data compression.

【００６８】次に、この最適な辞書を静的な辞書とし、
この静的辞書により全圧縮対象文字列が圧縮される（ス
テップＣ８）。但し、このデータ圧縮の際には、辞書の
更新は行われない。そして、最後に、最終的な静的辞書
と、これにより圧縮された圧縮データが出力されること
になる（ステップＣ９）。Next, this optimal dictionary is used as a static dictionary, and all the character strings to be compressed are compressed with it (step C8). The dictionary is not updated during this compression. Finally, the final static dictionary and the data compressed with it are output (step C9).
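The loop of steps C1 to C7 can be sketched in Python as follows, assuming that the incremental decomposition algorithm behaves like an LZW-style pass. The names `lzw_pass` and `build_static_dictionary`, the 4096-entry limit, and the `max_passes` safeguard are assumptions. The safeguard is there because a literal reading of the NUM1 > NUM0 test would never stop if the count stayed constant. The input is assumed to contain only characters with codes 0 to 255.

```python
def lzw_pass(data, dictionary, max_entries=4096):
    """One pass that updates `dictionary` in place (steps C3 and C5).

    Returns how many codes would be output; the codes themselves are discarded.
    """
    outputs = 0
    w = ""
    for ch in data:
        if w + ch in dictionary:
            w += ch
        else:
            outputs += 1                              # a code for w would be output
            if len(dictionary) < max_entries:
                dictionary[w + ch] = len(dictionary)  # register the extended string
            w = ch
    if w:
        outputs += 1
    return outputs


def build_static_dictionary(data, max_passes=20):
    dictionary = {chr(i): i for i in range(256)}      # step C1
    history = [lzw_pass(data, dictionary)]            # steps C2, C3: NUM0
    while len(history) < max_passes:
        history.append(lzw_pass(data, dictionary))    # steps C4, C5: NUM1
        if history[-1] > history[-2]:                 # step C6: NUM1 > NUM0
            break
        # step C7 (NUM0 = NUM1) is implicit: the next test compares with this pass
    return dictionary, history


dictionary, history = build_static_dictionary("AB" * 6)
```

The returned dictionary still contains the entries added during the last pass. Whether those should be kept or rolled back is not stated in the text, so the sketch leaves them in.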

【００６９】図６には、以上の処理を視覚的に表したも
のが示される。全圧縮対象文字列２１はまず１回目の解
析を受け、これにより暫定的な辞書２２が作成される
（この辞書は動的な辞書である）。次に再び全圧縮対象
文字列２１は２回目の解析を受け、更新版１の辞書２３
が作成される。さらに同様にして更新版２の辞書２４が
作成される。このようにして辞書の更新を繰り返し、デ
ータの圧縮率を判定し最適な辞書が得られた段階で、こ
れを決定版の静的辞書２５とする。そして、この静的辞
書２５により全圧縮対象文字列２１の圧縮を再び行う。
但し、この際には、静的辞書２５は静的なままであり、
辞書の更新は行わない。そして、この静的辞書２５と圧
縮されたデータが外部に出力され、記憶装置に格納され
ることになる。FIG. 6 shows the above processing visually. All the character strings 21 to be compressed first undergo a first analysis, which creates a provisional dictionary 22 (this is a dynamic dictionary). All the character strings 21 then undergo a second analysis, creating the dictionary 23 of updated version 1. The dictionary 24 of updated version 2 is created in the same way. The dictionary is updated repeatedly in this manner while the data compression ratio is evaluated, and when an optimal dictionary is obtained it becomes the definitive static dictionary 25. All the character strings 21 are then compressed again with this static dictionary 25. At this stage the static dictionary 25 remains static and is not updated. The static dictionary 25 and the compressed data are output externally and stored in the storage device.

【００７０】また、図５、図６に示す手法と、ＬＺＷと
の主な相違は以下の通りである。即ち、ＬＺＷでは、デ
ータ圧縮後に最終的には圧縮データだけが生成物として
出力され、辞書は動的な辞書であり出力されない。ＬＺ
Ｗは、電話回線を通じたデータ通信に利用されるデータ
圧縮手法であり、送信側はＬＺＷにより圧縮データを出
力し、受信側がこれを復元する。そして、この復元の際
には、圧縮データの特性を解析し、圧縮データから再び
動的辞書を作成し直しながらデータを復元してゆく必要
がある。従って、圧縮データの最初から順に処理を行っ
ていかなければならなく、復元速度が遅く、また必要な
データ列にランダムにアクセスし復元することはできな
い。従って、ＬＺＷにより圧縮されたデータをプリンタ
等のフォントデータに利用することは困難である。The main differences between the technique shown in FIGS. 5 and 6 and LZW are as follows. In LZW, only the compressed data is ultimately output as the product of compression; the dictionary is dynamic and is not output. LZW is a data compression technique used for data communication over telephone lines: the transmitting side outputs data compressed by LZW, and the receiving side restores it. During restoration, the characteristics of the compressed data must be analyzed and the dynamic dictionary rebuilt from the compressed data as the data is restored. Processing must therefore proceed in order from the beginning of the compressed data, so restoration is slow and a required data string cannot be accessed and restored at random. It is consequently difficult to use data compressed by LZW for the font data of a printer or the like.

【００７１】これに対して、図５、図６に示す手法で
は、ＬＺＷと異なり、最終的には圧縮データ、静的辞書
の２つが出力される。従って、データの復元をする場合
も再度辞書を作成し直す必要が無いため復元速度が速
い。また、静的辞書を用いているため、圧縮データの中
の所望のデータ列にランダムにアクセスすることもでき
る。なお、このように、復元速度が速い、データ列にラ
ンダムにアクセスできるという利点は、この第３の手法
のみならず前述の第１、第２の手法及び後述の第４の手
法においても得ることができる利点である。In contrast, unlike LZW, the technique shown in FIGS. 5 and 6 ultimately outputs two things: the compressed data and the static dictionary. Restoration is therefore fast, because the dictionary does not have to be rebuilt. In addition, because a static dictionary is used, a desired data string in the compressed data can be accessed at random. These advantages of fast restoration and random access to data strings are obtained not only with this third method but also with the first and second methods described above and the fourth method described below.
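Random access with a static dictionary can be illustrated by a short Python sketch. The per-record index of `(start, count)` pairs and the name `restore_record` are assumptions; the text states only that a desired data string can be accessed at random, not how its position is recorded.

```python
dictionary = {0: "st", 1: "at", 2: "ic", 3: "_", 4: "ring_d", 5: "tionary"}

# Three independently compressed records stored back to back (hypothetical data).
records = [[0, 1, 2], [0, 4], [2, 5]]
stream, index = [], []
for codes in records:
    index.append((len(stream), len(codes)))  # where this record starts, how long
    stream.extend(codes)


def restore_record(k):
    """Restore only record k, without touching any earlier record."""
    start, count = index[k]
    return "".join(dictionary[n] for n in stream[start:start + count])


print(restore_record(1))  # string_d
```

With a dynamic dictionary the meaning of a code depends on everything decoded before it, so the same lookup would require decoding from the beginning of the stream.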

【００７２】（Ｄ）最適な辞書を得るための第４の手法この手法では、加増分解アルゴリズムを利用して最適な
辞書を得る。加増分解アルゴリズムについては第２の実
施例において詳述する。しかし、処理の流れ自体は、ス
テップＣ３、Ｃ５でのデータ圧縮が加増分解アルゴリズ
ムにより行われる他は、図５とほぼ同様である。(D) Fourth Method for Obtaining an Optimal Dictionary In this method, an optimal dictionary is obtained using an additive decomposition algorithm. The additive decomposition algorithm will be described in detail in the second embodiment. However, the flow of the processing itself is substantially the same as that of FIG. 5 except that the data compression in steps C3 and C5 is performed by the additive decomposition algorithm.

【００７３】上記の第３の手法では、増分分解アルゴリ
ズムを用いてデータ圧縮が行われるため高圧縮率の圧縮
データを得ることができるが、１文字ずつ登録する工程
が存在するため、処理速度は決して早いとは言えず、ま
た、加増分解アルゴリズムと比べると圧縮率は高くな
い。従って、圧縮率を高くする場合には、この加増分解
アルゴリズムを利用する第４の手法を採用することが望
ましい。The third method described above yields compressed data with a high compression ratio because the data is compressed using the incremental decomposition algorithm. However, because it includes a step that registers one character at a time, its processing speed cannot be called fast, and its compression ratio is not high compared with the additive decomposition algorithm. When a higher compression ratio is required, it is therefore desirable to adopt the fourth method, which uses the additive decomposition algorithm.

【００７４】なお、以上の第１〜第４の手法において、
圧縮データは固定ビット長のコードとして出力される。
このコードは、例えば１２ビットであったり、１６ビッ
トの１ワードであったりする。そして、１６ビットの１
ワードとした場合には処理速度が改善されるというメリ
ットを持つ反面、圧縮率は１２ビットの場合よりも落ち
るというデメリットを持つ。In the first to fourth methods described above, the compressed data is output as codes of fixed bit length. A code is, for example, 12 bits, or one 16-bit word. Using one 16-bit word has the advantage of improving processing speed, but the disadvantage that the compression ratio is lower than with 12 bits.
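The size trade-off between 12-bit codes and 16-bit words can be made concrete with a Python sketch. The byte layout used for the 12-bit packing (two codes in three bytes, zero padding for an odd count), the little-endian 16-bit words, and the names `pack12`, `unpack12`, and `pack16` are assumptions for illustration.

```python
import struct


def pack16(codes):
    """Store each code as one 16-bit word (little-endian, assumed)."""
    return struct.pack(f"<{len(codes)}H", *codes)


def pack12(codes):
    """Store two 12-bit codes in three bytes; an odd count is padded with 0."""
    out = bytearray()
    for i in range(0, len(codes), 2):
        a = codes[i]
        b = codes[i + 1] if i + 1 < len(codes) else 0
        out += bytes([a >> 4, ((a & 0xF) << 4) | (b >> 8), b & 0xFF])
    return bytes(out)


def unpack12(data, count):
    """Inverse of pack12; `count` is the number of real codes."""
    codes = []
    for i in range(0, len(data), 3):
        x, y, z = data[i:i + 3]
        codes.append((x << 4) | (y >> 4))
        codes.append(((y & 0xF) << 8) | z)
    return codes[:count]


codes = [0, 1, 2, 3, 0, 4, 2, 4095]
print(len(pack12(codes)), len(pack16(codes)))  # 12 16
```

The 16-bit form can be read with a single aligned word access per code, whereas the 12-bit form needs shifts and masks; this is the speed-versus-size trade-off noted above.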

【００７５】以上のように、本実施例によれば、最適な
静的辞書を作成するために、使用頻度に基づいて辞書登
録を削除したり、圧縮時に複数回データを走査する必要
があり、処理時間は長くはなる。しかし、これはユーザ
にとっては全く問題とならない。つまり、圧縮処理を行
うのはデータを例えば書換不可能な記憶装置、記憶媒体
（ＲＯＭなど）に書き込む際に必要なものであり、これ
は圧縮後の復元時の処理時間には何ら影響するものでは
ないからである。具体的に言えば、ユーザが、該圧縮方
法により圧縮されたデータを格納した記憶媒体を搭載し
たプリンタを使って印字を行う場合、メーカ側が圧縮デ
ータを作成する時間は多くかかるが、ユーザが印字をさ
せたい場合に行われる復元処理の速度は特に遅くなるこ
とはないからである。As described above, in this embodiment, creating an optimal static dictionary requires deleting dictionary entries based on usage frequency or scanning the data several times during compression, so the processing time becomes longer. This is no problem at all for the user, however. Compression is needed only when the data is written to, for example, a non-rewritable storage device or storage medium (such as a ROM), and it has no effect on the processing time of restoration afterwards. Specifically, when a user prints with a printer equipped with a storage medium holding data compressed by this method, the manufacturer needs more time to create the compressed data, but the restoration performed when the user wants to print is not slowed down.

【００７６】２．第２の実施例（Ａ）加増分解法図７には、本第２の実施例で使用されるデータ圧縮方法
（以下、加増分解法あるいは加増分解アルゴリズムと呼
ぶ）を模式的に説明する図が示される。2. Second Embodiment (A) Additive Decomposition Method FIG. 7 shows a diagram schematically explaining the data compression method used in the second embodiment (hereinafter called the additive decomposition method or the additive decomposition algorithm).

【００７７】圧縮対象である文字列データ１は、まず作
業用記憶手段である作業用バッファ２にデータの最初の
部分から格納されていく。次に、該作業用バッファ２中
の文字列で辞書３に登録されている文字列が存在しない
かどうかを比較し、存在すれば該辞書３の情報を更新す
ると共に該作業用バッファ２中の文字列を登録番号で置
き換え、該作業用バッファ２に次の対象文字列を追加す
る。一方、登録されている文字列が存在しなければ該作
業用バッファ２中の先頭の２文字を該辞書３に登録し、
先頭の１文字を圧縮データ４として出力する。そして、
該作業用バッファ２に次の対象文字列を追加し、再び該
作業用バッファ２中の文字列で辞書３に登録されている
文字列が存在しないかどうかを比較する。以上の処理を
繰り返すことでデータの圧縮を行う。The character string data 1 to be compressed is first stored, from its beginning, in the work buffer 2, which serves as working storage. Next, the character strings in the work buffer 2 are compared against the dictionary 3 to see whether any of them is already registered. If one is, the information in the dictionary 3 is updated, the character string in the work buffer 2 is replaced with its registration number, and the next characters of the input are appended to the work buffer 2. If none is registered, the first two characters in the work buffer 2 are registered in the dictionary 3 and the first character is output as compressed data 4. The next character of the input is then appended to the work buffer 2, and the buffer is again compared against the dictionary 3. Data is compressed by repeating this process.
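The loop described in this paragraph can be sketched in Python as follows. This is a minimal reading of the prose, not the patented implementation: the function name `additive_compress`, the left-to-right replacement of every registered adjacent pair in one pass, and the handling of the end of the input are all assumptions, and a worked example in the source may differ from this sketch in individual steps.

```python
from itertools import islice


def additive_compress(data, capacity=16):
    """Buffer-based pair replacement, following the description above.

    Symbols 0..255 are byte values; registered pairs receive numbers from 256.
    Returns (output symbols, pair -> registration number, usage counts).
    """
    pairs = {}        # (symbol, symbol) -> registration number
    freq = {}         # registration number -> usage count
    next_code = 256
    src = iter(data)
    buf = list(islice(src, capacity))   # fill the work buffer
    out = []
    while buf:
        # Replace every adjacent pair that is already registered.
        new_buf, i, replaced = [], 0, False
        while i < len(buf):
            pair = tuple(buf[i:i + 2])
            if len(pair) == 2 and pair in pairs:
                code = pairs[pair]
                freq[code] += 1
                new_buf.append(code)
                i += 2
                replaced = True
            else:
                new_buf.append(buf[i])
                i += 1
        buf = new_buf
        if not replaced:
            # Nothing registered: register the first two symbols, emit the first.
            if len(buf) >= 2:
                pairs[(buf[0], buf[1])] = next_code
                freq[next_code] = 1
                next_code += 1
            out.append(buf.pop(0))
        # Refill the freed space at the end of the buffer from the input.
        buf.extend(islice(src, capacity - len(buf)))
    return out, pairs, freq


out, pairs, freq = additive_compress(b"ABCEBCHABBCCDBCHABCABEHDABBCGABK")
print(out[:6])
print(sorted(pairs.items(), key=lambda kv: kv[1])[:6])
```

Because each replacement and each output shortens the buffer before it is refilled, the loop always makes progress and stops once both the input and the buffer are empty.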

【００７８】次に、加増分解法を用いて次に示す文字列
データを圧縮対象として処理し、静的辞書と圧縮データ
を得る場合について詳しく説明する。ここで、圧縮対象
となる文字列データは、「ＡＢＣＥＢＣＨＡ、ＢＢＣＣＤＢＣＨ、ＡＢＣＡＢＥ
ＨＤ、ＡＢＢＣＧＡＢＫ‥‥」である。ただし、ここでは書換可能な記憶素子で構成さ
れる辞書の0 〜255 番までには、アスキーコードの文字
データが登録されているとする。また、作業用記憶手段
には、１６バイトの容量のバッファを用意したとする。Next, the case of processing the following character string data with the additive decomposition method to obtain a static dictionary and compressed data is described in detail. The character string data to be compressed is "ABCEBCHA, BBCCDBCH, ABCABEHD, ABBCGABK ‥‥". It is assumed here that the ASCII characters are already registered as entries 0 to 255 of the dictionary, which is held in rewritable storage, and that a 16-byte buffer is provided as the working storage.

【００７９】（１）最初に、圧縮対象である文字列デー
タの先頭から作業用バッファの容量分（１６バイト）の
データを取り込む。すると、作業用バッファ中の文字列
は「ＡＢＣＥＢＣＨＡ、ＢＢＣＣＤＢＣＨ」になる。こ
こで、各文字はそれぞれ１バイトで表される数値データ
であり、これは前述のようにアスキーコード表に従う。
例えば、" Ａ" は65、”Ｂ”は66、”Ｃ”は67である。(1) First, as much data as the work buffer holds (16 bytes) is read from the beginning of the character string data to be compressed. The work buffer then contains "ABCEBCHA, BBCCDBCH". Each character is a one-byte numerical value that, as stated above, follows the ASCII code table. For example, "A" is 65, "B" is 66, and "C" is 67.

【００８０】（２）次に、この作業用バッファ中に既に
辞書に登録されている２文字以上の文字列（文字列の組
み合わせ）が存在するかどうかを調べる。しかし、その
ような文字列は登録されていないので、辞書の例えば25
6 番目に先頭の２文字" ＡＢ" を、長さは２（文字
分）、データ欄には65+66 （" Ａ" と" Ｂ" のアスキー
コード）として登録し、この"256" 番の使用頻度を１回
とする。更に、作業用バッファ中の先頭の１文字" Ａ"
の登録番号”65”を圧縮データとして出力し、作業用バ
ッファ上では消去する。そして、これにより生じた作業
用バッファ上の空欄の１バイトを埋めるようにそれ以降
の文字列をすべて先頭方向にシフトさせ、その結果生じ
る作業用バッファの終端の空白の１バイトには対象の文
字列データの続きから１文字分" Ａ" を取り込む。する
と、作業用バッファの文字列は「ＢＣＥＢＣＨＡＢ、Ｂ
ＣＣＤＢＣＨＡ」になる。(2) Next, the work buffer is checked for any character string of two or more characters (a combination of character strings) that is already registered in the dictionary. No such string is registered yet, so the first two characters "AB" are registered, for example as entry 256, with a length of 2 (characters) and 65+66 (the ASCII codes of "A" and "B") in the data field, and the usage count of entry 256 is set to 1. The registration number 65 of the first character "A" in the work buffer is then output as compressed data, and the character is deleted from the work buffer. All subsequent characters are shifted toward the head to fill the resulting one-byte gap, and the byte freed at the end of the work buffer is filled with the next character, "A", of the input data. The work buffer then contains "BCEBCHAB, BCCDBCHA".

【００８１】（３）次に、再びこの作業用バッファ中に
既に辞書に登録されている２文字以上の文字列が存在す
るかどうかを調べる。すると、作業用バッファの７、８
文字目に" ＡＢ" の文字列が存在しているので、２バイ
ト分の該文字列" ＡＢ" を256 の文字に置き換え、辞書
の"256" 番の文字列の使用頻度を２回に更新する。この
置き換えにより作業用バッファには１バイトの空欄が生
じるので、これを埋めるようにそれ以降の文字列をすべ
て先頭方向にシフトさせ、その結果生じる作業用バッフ
ァの終端の空白の１バイトには対象の文字列データの続
きから１文字分" Ｂ" を取り込む。すると、作業用バッ
ファ中の文字列は「ＢＣＥＢＣＨ256 Ｂ、ＣＣＤＢＣＨ
ＡＢ」になる。(3) Next, the work buffer is checked again for any registered character string of two or more characters. The string "AB" is found at the 7th and 8th characters of the work buffer, so these two bytes are replaced with the code 256 and the usage count of entry 256 is updated to 2. The replacement leaves a one-byte gap in the work buffer, so all subsequent characters are shifted toward the head to fill it, and the byte freed at the end of the work buffer is filled with the next character, "B", of the input data. The work buffer then contains "BCEBCH 256 B, CCDBCHAB".

【００８２】（４）再びこの作業用バッファ中に既に辞
書に登録されている２文字以上の文字列が存在するかど
うかを調べる。すると、作業用バッファの15、16文字目
に"ＡＢ" の文字列が存在しているので、２バイト分の
該文字列" ＡＢ" を256 の文字に置き換え、辞書の"25
6" 番の文字列の使用頻度を３回に更新する。この置き
換えにより作業用バッファには１バイトの空欄が生じる
ので、これを埋めるようにそれ以降の文字列をすべて先
頭方向にシフトさせ、その結果生じる作業用バッファの
終端の空白の１バイトには対象の文字列データの続きか
ら１文字分" Ｃ"を取り込む。すると、作業用バッファ
中の文字列は「ＢＣＥＢＣＨ256 Ｂ、ＣＣＤＢＣＨ256
Ｃ」になる。(4) The work buffer is checked again for any registered character string of two or more characters. The string "AB" is found at the 15th and 16th characters of the work buffer, so these two bytes are replaced with the code 256 and the usage count of entry 256 is updated to 3. The replacement leaves a one-byte gap in the work buffer, so all subsequent characters are shifted toward the head to fill it, and the byte freed at the end of the work buffer is filled with the next character, "C", of the input data. The work buffer then contains "BCEBCH 256 B, CCDBCH 256 C".

【００８３】（５）再びこの作業用バッファ中に既に辞
書に登録されている２文字以上の文字列が存在するかど
うかを調べる。しかし、そのような文字列は登録されて
いないので、辞書の例えば257 番目に先頭の２文字" Ｂ
Ｃ" を、長さは２（文字分）、データ欄には66+67 とし
て登録し、この"257" 番の使用頻度を１回とする。さら
に、作業用バッファ中の先頭の１文字" Ｂ" の登録番
号"66"を圧縮データとして出力し、作業用バッファ上で
は消去し、これにより生じた作業用バッファ上の空欄の
１バイトを埋めるようにそれ以降の文字列をすべて先頭
方向にシフトさせ、その結果生じる作業用バッファの終
端の空白の１バイトには対象の文字列データの続きから
１文字分" Ａ" を取り込む。すると、作業用バッファの
文字列は「ＣＥＢＣＨ256 ＢＣ、ＣＤＢＣＨ256 ＣＡ」
になる。(5) The work buffer is checked again for any registered character string of two or more characters. No such string is registered, so the first two characters "BC" are registered, for example as entry 257, with a length of 2 (characters) and 66+67 in the data field, and the usage count of entry 257 is set to 1. The registration number 66 of the first character "B" in the work buffer is then output as compressed data, and the character is deleted from the work buffer. All subsequent characters are shifted toward the head to fill the resulting one-byte gap, and the byte freed at the end of the work buffer is filled with the next character, "A", of the input data. The work buffer then contains "CEBCH 256 BC, CDBCH 256 CA".

【００８４】（６）再びこの作業用バッファ中に既に辞
書に登録されている２文字以上の文字列が存在するかど
うかを調べる。すると、作業用バッファの3 、4 文字
目、7、8 文字目、および11、12文字目に" ＢＣ" の文
字列が存在しているので、合計６バイト分の該文字列"
ＢＣ" をそれぞれ257 の文字に置き換え、辞書の"257"
番の文字列の使用頻度を４回に更新する。この置き換え
により作業用バッファには３バイトの空欄が生じるの
で、これを埋めるようにそれ以降の文字列をすべて先頭
方向にシフトさせ、その結果生じる作業用バッファの終
端の空白の３バイトには対象の文字列データの続きから
３文字分" ＢＥＨ" を取り込む。すると、作業用バッフ
ァ中の文字列は「ＣＥ257 Ｈ256257ＣＤ、257 Ｈ256 Ｃ
ＡＢＥＨ」になる。(6) The work buffer is checked again for any registered character string of two or more characters. The string "BC" is found at the 3rd and 4th, the 7th and 8th, and the 11th and 12th characters of the work buffer, so these six bytes in total are replaced, each pair with the code 257, and the usage count of entry 257 is updated to 4. The replacement leaves three bytes of gaps in the work buffer, so all subsequent characters are shifted toward the head to fill them, and the three bytes freed at the end of the work buffer are filled with the next three characters, "BEH", of the input data. The work buffer then contains "CE 257 H 256 257 CD, 257 H 256 C ABEH".

【００８５】（７）再びこの作業用バッファ中に既に辞
書に登録されている２文字以上の文字列が存在するかど
うかを調べる。すると、作業用バッファの13、14文字目
に"ＡＢ" の文字列が存在しているので、２バイト分の
該文字列" ＡＢ" を256 の文字に置き換え、辞書の"25
6" 番の文字列の使用頻度を４回に更新する。この置き
換えにより作業用バッファには１バイトの空欄が生じる
ので、これを埋めるようにそれ以降の文字列をすべて先
頭方向にシフトさせ、その結果生じる作業用バッファの
終端の空白の１バイトには対象の文字列データの続きか
ら１文字分" Ｄ"を取り込む。すると、作業用バッファ
中の文字列は「ＣＥ257 Ｈ256257ＣＤ、257 Ｈ256 Ｃ25
6 ＥＨＤ」になる。(7) The work buffer is checked again for any registered character string of two or more characters. The string "AB" is found at the 13th and 14th characters of the work buffer, so these two bytes are replaced with the code 256 and the usage count of entry 256 is updated to 4. The replacement leaves a one-byte gap in the work buffer, so all subsequent characters are shifted toward the head to fill it, and the byte freed at the end of the work buffer is filled with the next character, "D", of the input data. The work buffer then contains "CE 257 H 256 257 CD, 257 H 256 C 256 EHD".

【００８６】（８）再びこの作業用バッファ中に既に辞
書に登録されている２文字以上の文字列が存在するかど
うかを調べる。しかし、そのような文字列は登録されて
いないので、辞書の例えば258 番目に先頭の２文字" Ｃ
Ｅ" を、長さは２（文字分）、データ欄には67+69 とし
て登録し、この"258" 番の使用頻度を１回とする。さら
に、作業用バッファ中の先頭の１文字" Ｃ" の登録番
号"67"を圧縮データとして出力し、作業用バッファ上で
は消去し、これにより生じた作業用バッファ上の空欄の
１バイトを埋めるようにそれ以降の文字列をすべて先頭
方向にシフトさせ、その結果生じる作業用バッファの終
端の空白の１バイトには対象の文字列データの続きから
１文字分" Ａ" を取り込む。すると、作業用バッファの
文字列は「Ｅ257 Ｈ256257ＣＤ257 、Ｈ256 Ｃ256 ＥＨ
ＤＡ」になる。(8) The work buffer is checked again for any registered character string of two or more characters. No such string is registered, so the first two characters "CE" are registered, for example as entry 258, with a length of 2 (characters) and 67+69 in the data field, and the usage count of entry 258 is set to 1. The registration number 67 of the first character "C" in the work buffer is then output as compressed data, and the character is deleted from the work buffer. All subsequent characters are shifted toward the head to fill the resulting one-byte gap, and the byte freed at the end of the work buffer is filled with the next character, "A", of the input data. The work buffer then contains "E 257 H 256 257 CD 257, H 256 C 256 EHDA".

【００８７】（９）再びこの作業用バッファ中に既に辞
書に登録されている２文字以上の文字列が存在するかど
うかを調べる。しかし、そのような文字列は登録されて
いないので、辞書の例えば259 番目に先頭の２文字" Ｅ
257"を、長さは２（文字分）、データ欄には69+257とし
て登録し、この"259" 番の使用頻度を１回とする。更
に、作業用バッファ中の先頭の１文字" Ｅ" の登録番
号"69"を圧縮データとして出力し、作業用バッファ上で
は消去し、これにより生じた作業用バッファ上の空欄の
１バイトを埋めるようにそれ以降の文字列をすべて先頭
方向にシフトさせ、その結果生じる作業用バッファの終
端の空白の１バイトには対象の文字列データの続きから
１文字分" Ｂ" を取り込む。すると、作業用バッファの
文字列は「257 Ｈ256257ＣＤ257 、Ｈ256 Ｃ256 ＥＨＤ
ＡＢ」になる。(9) The work buffer is checked again for any registered character string of two or more characters. No such string is registered, so the first two symbols "E 257" are registered, for example as entry 259, with a length of 2 (characters) and 69+257 in the data field, and the usage count of entry 259 is set to 1. The registration number 69 of the first character "E" in the work buffer is then output as compressed data, and the character is deleted from the work buffer. All subsequent characters are shifted toward the head to fill the resulting one-byte gap, and the byte freed at the end of the work buffer is filled with the next character, "B", of the input data. The work buffer then contains "257 H 256 257 CD 257, H 256 C 256 EHDAB".

【００８８】（１０）再びこの作業用バッファ中に既に
辞書に登録されている２文字以上の文字列が存在するか
どうかを調べる。すると、作業用バッファの15、16文字
目に" ＡＢ" の文字列が存在しているので、２バイト分
の該文字列" ＡＢ" を256 の文字に置き換え、辞書の"2
56" 番の文字列の使用頻度を４回に更新する。この置き
換えにより作業用バッファには１バイトの空欄が生じる
ので、これを埋めるようにそれ以降の文字列をすべて先
頭方向にシフトさせ、その結果生じる作業用バッファの
終端の空白の１バイトには対象の文字列データの続きか
ら１文字分" Ｂ" を取り込む。すると、作業用バッファ
中の文字列は「257 Ｈ256257ＣＤ257 、Ｈ256 Ｃ256 Ｅ
ＨＤ256 Ｂ」になる。(10) The work buffer is checked again for any registered character string of two or more characters. The string "AB" is found at the 15th and 16th characters of the work buffer, so these two bytes are replaced with the code 256 and the usage count of entry 256 is updated to 4. The replacement leaves a one-byte gap in the work buffer, so all subsequent characters are shifted toward the head to fill it, and the byte freed at the end of the work buffer is filled with the next character, "B", of the input data. The work buffer then contains "257 H 256 257 CD 257, H 256 C 256 EHD 256 B".

【００８９】（１１）再びこの作業用バッファ中に既に
辞書に登録されている２文字以上の文字列が存在するか
どうかを調べる。しかし、そのような文字列は登録され
ていないので、辞書の例えば260 番目に先頭の２文字"2
57Ｈ" を、長さは２（文字分）、データ欄には257+72と
して登録し、この"260" 番の使用頻度を１回とする。更
に、作業用バッファ中の先頭の１文字"257" を圧縮デー
タとして出力し、作業用バッファ上では消去し、これに
より生じた作業用バッファ上の空欄の１バイトを埋める
ようにそれ以降の文字列をすべて先頭方向にシフトさ
せ、その結果生じる作業用バッファの終端の空白の１バ
イトには対象の文字列データの続きから１文字分" Ｃ"
を取り込む。すると、作業用バッファの文字列は「Ｈ25
6257ＣＤ257 Ｈ、256 Ｃ256 ＥＨＤ256 ＢＣ」になる。(11) The work buffer is checked again for any registered character string of two or more characters. No such string is registered, so the first two symbols "257 H" are registered, for example as entry 260, with a length of 2 (characters) and 257+72 in the data field, and the usage count of entry 260 is set to 1. The first symbol in the work buffer, 257, is then output as compressed data and deleted from the work buffer. All subsequent characters are shifted toward the head to fill the resulting one-byte gap, and the byte freed at the end of the work buffer is filled with the next character, "C", of the input data. The work buffer then contains "H 256 257 CD 257 H, 256 C 256 EHD 256 BC".

【００９０】（１２）再びこの作業用バッファ中に既に
辞書に登録されている２文字以上の文字列が存在するか
どうかを調べる。すると、作業用バッファの7 、8 文字
目に"257Ｈ" の文字列が、15、16文字目に" ＢＣ" の文
字列が、それぞれ存在しているので、合計４バイト分の
該文字列"257Ｈ" を260 に、該" ＢＣ" を257 に置き換
え、辞書の"260" の使用頻度を2 回に"257" の使用頻度
を５回に更新する。この置き換えにより作業用バッファ
には２バイトの空欄が生じるので、これを埋めるように
それ以降の文字列をすべて先頭方向にシフトさせ、その
結果生じる作業用バッファの終端の空白の２バイトには
対象の文字列データの続きから２文字分" ＧＡ" を取り
込む。すると、作業用バッファ中の文字列は「Ｈ256257
ＣＤ260256Ｃ、256 ＥＨＤ256257ＧＡ」になる。(12) The work buffer is checked again for any registered character string of two or more characters. The string "257 H" is found at the 7th and 8th characters of the work buffer and the string "BC" at the 15th and 16th characters, so these four bytes in total are replaced, "257 H" with 260 and "BC" with 257. The usage count of entry 260 is updated to 2 and that of entry 257 to 5. The replacement leaves two bytes of gaps in the work buffer, so all subsequent characters are shifted toward the head to fill them, and the two bytes freed at the end of the work buffer are filled with the next two characters, "GA", of the input data. The work buffer then contains "H 256 257 CD 260 256 C, 256 EHD 256 257 GA".

【００９１】（１３）再びこの作業用バッファ中に既に
辞書に登録されている２文字以上の文字列が存在するか
どうかを調べる。しかし、そのような文字列は登録され
ていないので、辞書の例えば261 番目に先頭の２文字"
Ｈ256"を、長さは３（文字分）、データ欄には72+256と
して登録し、この"261" 番の使用頻度を１回とする。更
に、作業用バッファ中の先頭の１文字" Ｈ" の登録番
号"72"を圧縮データとして出力し、作業用バッファ上で
は消去し、これにより生じた作業用バッファ上の空欄の
１バイトを埋めるようにそれ以降の文字列をすべて先頭
方向にシフトさせ、その結果生じる作業用バッファの終
端の空白の１バイトには対象の文字列データの続きから
１文字分" Ｂ" を取り込む。すると、作業用バッファの
文字列は「256257ＣＤ260256Ｃ256 、ＥＨＤ256257ＧＡ
Ｂ」になる。(13) The work buffer is checked again for any registered character string of two or more characters. No such string is registered, so the first two symbols "H 256" are registered, for example as entry 261, with a length of 3 (characters) and 72+256 in the data field, and the usage count of entry 261 is set to 1. The registration number 72 of the first character "H" in the work buffer is then output as compressed data, and the character is deleted from the work buffer. All subsequent characters are shifted toward the head to fill the resulting one-byte gap, and the byte freed at the end of the work buffer is filled with the next character, "B", of the input data. The work buffer then contains "256 257 CD 260 256 C 256, EHD 256 257 GAB".

【００９２】（１４）再びこの作業用バッファ中に既に
辞書に登録されている２文字以上の文字列が存在するか
どうかを調べる。しかし、そのような文字列は登録され
ていないので、辞書の例えば262 番目に先頭の２文字"2
56257"を、長さは４（文字分）、データ欄には256+257
として登録し、この"262" 番の使用頻度を１回とする。
更に、作業用バッファ中の先頭の１文字"256" を圧縮デ
ータとして出力し、作業用バッファ上では消去し、これ
により生じた作業用バッファ上の空欄の１バイトを埋め
るようにそれ以降の文字列をすべて先頭方向にシフトさ
せ、その結果生じる作業用バッファの終端の空白の１バ
イトには対象の文字列データの続きから１文字分" Ｋ"
を取り込む。すると、作業用バッファの文字列は「257
ＣＤ260256Ｃ256 Ｅ、ＨＤ256257ＧＡＢＫ」になる。(14) The work buffer is checked again for any registered character string of two or more characters. No such string is registered, so the first two symbols "256 257" are registered, for example as entry 262, with a length of 4 (characters) and 256+257 in the data field, and the usage count of entry 262 is set to 1. The first symbol in the work buffer, 256, is then output as compressed data and deleted from the work buffer. All subsequent characters are shifted toward the head to fill the resulting one-byte gap, and the byte freed at the end of the work buffer is filled with the next character, "K", of the input data. The work buffer then contains "257 CD 260 256 C 256 E, HD 256 257 GABK".

【００９３】（１５）再びこの作業用バッファ中に既に
辞書に登録されている２文字以上の文字列が存在するか
どうかを調べる。すると、作業用バッファの11、12文字
目に"256257"の文字列が存在しているので、該"256257"
を262 に置き換え、辞書の"262" の使用頻度を2 回に更
新する。この置き換えにより作業用バッファには１バイ
トの空欄が生じるので、これを埋めるようにそれ以降の
文字列をすべて先頭方向にシフトさせ、その結果生じる
作業用バッファの終端の空白の１バイトには対象の文字
列データの続きから１文字を取り込む。すると、作業用
バッファ中の文字列は「257 ＣＤ260256Ｃ256 Ｅ、ＨＤ
262 ＧＡＢＫ・」になる。(15) The work buffer is checked again for any registered character string of two or more characters. The string "256 257" is found at the 11th and 12th characters of the work buffer, so it is replaced with the code 262 and the usage count of entry 262 is updated to 2. The replacement leaves a one-byte gap in the work buffer, so all subsequent characters are shifted toward the head to fill it, and the byte freed at the end of the work buffer is filled with the next character of the input data. The work buffer then contains "257 CD 260 256 C 256 E, HD 262 GABK ·".

【００９４】このように本実施例の加増分解法によれ
ば、作業用バッファの文字列の長さを常に一定に保つよ
うに圧縮対象の文字列データから順次データを取り込む
ようになっている。そして、該作業用バッファ内の先頭
からの相隣り合う文字列の組み合わせが辞書に登録され
ていればこれを辞書の登録番号に置き換えることで圧縮
を実行する。更に、この文字列の組み合わせが作業用バ
ッファの中に含まれていればこれをも辞書の登録番号に
置き換えることでさらに圧縮を実行している。従って、
従来の増分分解法の場合よりも圧縮データのデータ量が
削減され、結果的に圧縮データを少ない記憶容量で格納
することが可能となる。なお、圧縮データは、作業用バ
ッファ内の先頭からの相隣り合う文字列の組み合わせが
辞書に登録されていない場合に、該文字列の組み合わせ
を辞書登録した後に、出力されることになる。As described above, in the additive decomposition method of this embodiment, data is read in sequentially from the character string data to be compressed so that the length of the character string in the work buffer is always kept constant. If the combination of adjacent characters at the head of the work buffer is registered in the dictionary, it is replaced with its registration number, which performs compression. Furthermore, if the same combination appears elsewhere in the work buffer, it too is replaced with the registration number, compressing the data further. The amount of compressed data is therefore smaller than with the conventional incremental decomposition method, and the compressed data can be stored in less storage. Compressed data is output when the combination of adjacent characters at the head of the work buffer is not registered in the dictionary, after that combination has been registered.

【００９５】以上の処理をツリー構造を用いて模式的に
表すと図８（Ａ）のようになる。図８（Ａ）では、上記
の説明に用いた（１）〜（１５）に合わせて番号を付し
ている。但し、（１）、（１０）、（１２）、（１５）
については図８（Ａ）には示していない。FIG. 8(A) schematically represents the above processing using a tree structure. In FIG. 8(A), numbers are assigned to match steps (1) to (15) of the description above. Steps (1), (10), (12), and (15), however, are not shown in FIG. 8(A).

【００９６】例えば、”ＡＢ”を辞書登録し、”Ａ”を
出力する上記（２）の工程は、図８（Ａ）において以下
のように表現される。即ち、図８（Ａ）において、上矢
印の先に”ＡＢ”とすることで辞書登録することを表現
する。また、下矢印の先に”Ａ”とすることで該文字を
圧縮データとして出力することを表現する。更に、これ
らの上矢印、下矢印の近くに工程の番号”２”を付すこ
とで、当該処理が工程（２）の処理であることを表現す
る。For example, step (2) above, which registers "AB" in the dictionary and outputs "A", is represented in FIG. 8(A) as follows. Writing "AB" at the tip of an up arrow represents registration in the dictionary, and writing "A" at the tip of a down arrow represents output of that character as compressed data. The step number "2" placed near these arrows indicates that the processing belongs to step (2).

【００９７】また、上記（３）の工程のように辞書に登
録されている文字”ＡＢ”を作業用バッファ中に見つ
け、これを登録番号で置き換える場合は、該置き換える
べき文字列の上を括り、そこに該処理が行われる工程の
番号”３”を付している。When, as in step (3) above, the string "AB" registered in the dictionary is found in the work buffer and replaced with its registration number, a bracket is drawn over the character string to be replaced and labeled with the number of the step that performs the replacement, here "3".

【００９８】因みに本実施例の場合、圧縮済みの出力デ
ータは以下のようになる。「Ａ、Ｂ、Ｃ、Ｅ、257 、Ｈ、256 、257 、Ｃ、・・
・」また、辞書には結局次のような文字列が登録されること
になる。 0 〜255 アスキーコード 256 ＡＢ 257 ＢＣ 258 ＣＥ 259 Ｅ257 260 257 Ｈ 261 Ｈ256 262 256 257 ‥ ‥‥‥In this embodiment, the compressed output data is as follows:
"A, B, C, E, 257, H, 256, 257, C, ..."
The dictionary ends up with the following character strings registered:
0 to 255  ASCII codes
256  AB
257  BC
258  CE
259  E 257
260  257 H
261  H 256
262  256 257
‥  ‥‥‥
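Decoding with such a static dictionary only requires expanding each registration number back into its two symbols until plain ASCII codes remain. The sketch below hard-codes the entries 256 to 262 listed above; the names `STATIC_DICT`, `expand`, and `decode` are illustrative and do not come from the patent.

```python
# Static dictionary from the listing above: registration number -> two symbols.
# Symbols below 256 are ASCII codes; symbols from 256 up refer to other entries.
STATIC_DICT = {
    256: (ord("A"), ord("B")),
    257: (ord("B"), ord("C")),
    258: (ord("C"), ord("E")),
    259: (ord("E"), 257),
    260: (257, ord("H")),
    261: (ord("H"), 256),
    262: (256, 257),
}


def expand(code, table):
    """Recursively expand one registration number into its original bytes."""
    if code < 256:
        return bytes([code])
    left, right = table[code]
    return expand(left, table) + expand(right, table)


def decode(codes, table):
    """Decode a sequence of registration numbers with a static dictionary."""
    return b"".join(expand(code, table) for code in codes)


compressed = [65, 66, 67, 69, 257, 72, 256, 257, 67]  # A, B, C, E, 257, H, 256, 257, C
print(decode(compressed, STATIC_DICT))
```

Since every code expands independently of its neighbors, a decoder of this kind can start at any code in the stream, which fits the goal of decoding only the characters that are needed.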

【００９９】（Ｂ）増分分解法次に、前記の例と同じ以下の文字列データを従来の増分
分解法を用いた方法により処理し、辞書と圧縮データを
得る場合について説明する。ただし、ここでも辞書には
最初からアスキーコードの文字を0 〜255 番に登録して
いたとする。「ＡＢＣＥＢＣＨＡ、ＢＢＣＣＤＢＣＨ、
ＡＢＣＡＢＥＨＤ、ＡＢＢＣＧＡＢＫ‥‥」(B) Incremental Decomposition Method. Next, the case where the same character string data as in the preceding example is processed with the conventional incremental decomposition method to obtain a dictionary and compressed data is described. Here too, it is assumed that the ASCII characters are registered as entries 0 to 255 of the dictionary from the start. The data is "ABCEBCHA, BBCCDBCH, ABCABEHD, ABBCGABK ‥‥".

【０１００】（１）送られてくる文字列は上記に示した
加増分解法のように、一定のバイト分ずつ作業用バッフ
ァに一旦格納されることはなく、先頭部分の必要最低限
の文字列のみが作業用バッファに取り込まれる。そこ
で、この作業用バッファ中の１文字目" Ａ" が辞書に登
録されているかどうかを調べる。すると、該" Ａ" は辞
書の65番目に登録されているので、その使用頻度のカウ
ント数を１にする。(1) Unlike in the additive decomposition method described above, the incoming character string is not first stored in the work buffer a fixed number of bytes at a time; only the minimum necessary characters at the head are read into the work buffer. The first character "A" in the work buffer is checked against the dictionary. "A" is registered as entry 65, so its usage count is set to 1.

【０１０１】（２）次に、２文字目" Ｂ" を加え、この
作業用バッファ中の文字" ＡＢ" が辞書に登録されてい
るかどうかを調べる。しかし、２文字以上の文字列で辞
書にすでに登録されているものは見つからない。そこ
で、先頭の２文字" ＡＢ" を番号の256 番目に、長さは
２（文字分）、データ欄には65+66 として登録し、辞書
の"256" の使用頻度を１回と記す。次に、作業用バッフ
ァ中の先頭の該" Ａ" を圧縮データとして出力し、次は
対象となる文字列データの２文字目" Ｂ" から注目す
る。(2) Next, the second character "B" is added, and the string "AB" in the work buffer is checked against the dictionary. No string of two or more characters is registered yet, so the first two characters "AB" are registered as entry 256 with a length of 2 (characters) and 65+66 in the data field, and the usage count of entry 256 is recorded as 1. The leading "A" in the work buffer is then output as compressed data, and processing continues from the second character, "B", of the input data.

【０１０２】（３）新しい１文字目" Ｂ" が辞書に登録
されているかどうかを調べる。すると、該" Ｂ" は辞書
の66番目に登録されているので、その使用頻度のカウン
ト数を１にする。(3) The new first character "B" is checked against the dictionary. "B" is registered as entry 66, so its usage count is set to 1.

【０１０３】（４）次に、新しい２文字目" Ｃ" を加
え、この作業用バッファ中の文字" ＢＣ" が辞書に登録
されているかどうかを調べる。しかし、また２文字以上
の文字列で辞書にすでに登録されている文字列は見つか
らない。そこで、先頭の２文字" ＢＣ" を番号の257 番
目に、長さは２（文字分）、データ欄には66+67 として
登録し、辞書の"257" の使用頻度を１回と記す。次に、
作業用バッファ中の先頭の該" Ｂ" を圧縮データとし
て出力し、次は対象の文字列データの３文字目"Ｃ" か
ら注目する。(4) Next, the new second character "C" is added, and the string "BC" in the work buffer is checked against the dictionary. Again, no matching string of two or more characters is registered, so the first two characters "BC" are registered as entry 257 with a length of 2 (characters) and 66+67 in the data field, and the usage count of entry 257 is recorded as 1. The leading "B" in the work buffer is then output as compressed data, and processing continues from the third character, "C", of the input data.

【０１０４】（５）これと同様の操作が最初から４文字
目のＥまで続き、" ＥＢ" が登録されたあと、作業用バ
ッファ中の新しい１文字目" Ｂ" が辞書に登録されてい
るかどうかを調べる。しかし、該" Ｂ" は辞書の66番目
に登録されているので、その使用頻度のカウント数を今
度は２にする。(5) The same operation continues through the fourth character, "E". After "EB" has been registered, the new first character "B" in the work buffer is checked against the dictionary. "B" is registered as entry 66, so this time its usage count is set to 2.

【０１０５】（６）次に、新しい２文字目" Ｃ" を加
え、この作業用バッファ中の文字列"ＢＣ" が辞書に登
録されているかどうかを調べる。すると今度は、辞書
の"257"番目に" ＢＣ" の文字が登録されているので、
その使用頻度を２回と記し、さらに新しい３文字目を加
えた文字列" ＢＣＨ" が辞書に登録されているかどうか
を調べる。(6) Next, the new second character "C" is added, and the string "BC" in the work buffer is checked against the dictionary. This time "BC" is registered as entry 257, so its usage count is recorded as 2, and the string "BCH", formed by adding the new third character, is checked against the dictionary.

【０１０６】（７）すると、これは登録されていないの
で、この２文字列" ＢＣＨ" を番号の260 番目に、長さ
は３（文字分）、データ欄には257+72として登録し、辞
書の"260" の使用頻度を１回と記す。次に、先頭の" Ｂ
Ｃ" の登録番号"257" を圧縮データとして出力し、次は
対象の文字列データの７文字目" Ｈ" から注目する。(7) "BCH" is not registered, so this string is registered as entry 260 with a length of 3 (characters) and 257+72 in the data field, and the usage count of entry 260 is recorded as 1. The registration number 257 of the leading "BC" is then output as compressed data, and processing continues from the seventh character, "H", of the input data.

【０１０７】図９には、以上の増分分解法の動作を説明
するためのフロチャートを参考として示す。FIG. 9 shows, for reference, a flowchart explaining the operation of the incremental decomposition method described above.

【０１０８】このようにして増分分解法は行われてゆく
が、本実施例の加増分解法との大きな違いは処理対象の
文字列データを先頭から１文字ずつ取り込み、辞書との
一致を見て行くことである。The incremental decomposition method proceeds in this way. Its major difference from the additive decomposition method of this embodiment is that it reads the character string data to be processed one character at a time from the beginning and checks each for a match in the dictionary.
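For comparison, the conventional incremental decomposition steps can be sketched as below. The steps read much like LZW-style parsing (extend the current match one character at a time, register the match plus the next character, output the match); that identification and the function name `incremental_compress` are assumptions of this sketch, and usage counts are omitted for brevity.

```python
def incremental_compress(data):
    """Character-at-a-time dictionary parsing, following the steps above.

    Entries 0..255 are the byte values themselves; new entries start at 256.
    Returns (output registration numbers, (prefix, next byte) -> number).
    """
    table = {}          # (prefix registration number, next byte) -> number
    next_code = 256
    out = []
    it = iter(data)
    cur = next(it, None)            # registration number of the current match
    if cur is None:
        return out, table
    for ch in it:
        key = (cur, ch)
        if key in table:
            cur = table[key]        # the longer string is registered: keep extending
        else:
            table[key] = next_code  # register the match plus one new character
            next_code += 1
            out.append(cur)         # output the match found so far
            cur = ch                # restart from the new character
    out.append(cur)                 # flush the last match
    return out, table


out, table = incremental_compress(b"ABCEBCHABBCCDBCHABCABEHDABBCGABK")
print(out[:10])
print(sorted(table.items(), key=lambda kv: kv[1])[:11])
```

Each new entry here is exactly one character longer than an existing one, whereas the additive method can join two existing entries at once.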

【０１０９】以上の処理を図８（Ａ）の例と同様にツリ
ー構造を用いて模式的に表すと図８（Ｂ）のようにな
る。ただし、図中の１〜８の番号は上記の工程の説明で
用いられた番号には沿っていない。ここで、例えば、”
ＡＢ”を辞書登録し、”Ａ”を出力する１の工程では、
上矢印の先に”ＡＢ”とすることで辞書登録を表すこと
とし、下矢印の先に”Ａ”とすることで該文字を圧縮デ
ータとして出力することを表すこととする。Represented schematically with a tree structure in the same way as the example of FIG. 8(A), the above processing looks like FIG. 8(B). Note that the numbers 1 to 8 in the figure do not correspond to the step numbers used in the description above. For example, in step 1, which registers "AB" in the dictionary and outputs "A", writing "AB" at the tip of the up arrow represents registration in the dictionary, and writing "A" at the tip of the down arrow represents output of that character as compressed data.

【０１１０】さて、図８（Ｂ）から理解されるように、
従来の増分分解法では、図８（Ａ）の（３）の工程のよ
うに辞書に登録されている文字”ＡＢ”を作業用バッフ
ァ中の先頭以外の途中に見つけ、これを登録番号で置き
換える工程は有り得ない。有り得るのは、図８（Ｂ）の
５の工程のように、ただ辞書に登録してある文字列の右
側に新しい１文字を加え、その上を括ることで新しい文
字列を作り出すことのみである。As FIG. 8(B) shows, the conventional incremental decomposition method has no step like step (3) of FIG. 8(A), in which the registered string "AB" is found somewhere other than the head of the work buffer and replaced with its registration number. The only thing it can do, as in step 5 of FIG. 8(B), is append one new character to the right of a string already registered in the dictionary and bracket the two to create a new string.

【０１１１】因みに、この増分分解法による圧縮済みの
出力データは以下のようになる。「Ａ、Ｂ、Ｃ、Ｅ、257 、Ｈ、256 、257 、Ｃ、Ｄ、25
7 、Ｈ、256 、・・・」また、辞書には結局次のような文字列が登録されること
になる。 0 〜255 アスキーコード 256 ＡＢ 257 ＢＣ 258 ＣＥ 259 ＥＢ 260 257 Ｈ 261 ＨＡ 262 256 Ｂ 263 257 Ｃ 264 ＣＤ 265 ＤＢ 266 260 Ａ ‥ ‥‥‥The compressed output data produced by this incremental decomposition method is as follows:
"A, B, C, E, 257, H, 256, 257, C, D, 257, H, 256, ..."
The dictionary ends up with the following character strings registered:
0 to 255  ASCII codes
256  AB
257  BC
258  CE
259  EB
260  257 H
261  HA
262  256 B
263  257 C
264  CD
265  DB
266  260 A
‥  ‥‥‥

【０１１２】図８（Ａ）と図８（Ｂ）を比較した場合、
例に示す範囲においては出力される圧縮データは同じも
のとなるが、辞書に登録される文字列は確実に異なって
おり、注目する文字列が圧縮対象の文字列データの後に
行くほど違いがでてくる。Comparing FIG. 8(A) with FIG. 8(B), the compressed data output is the same within the range shown in the example, but the character strings registered in the dictionary clearly differ, and the difference grows the further the processing advances into the character string data to be compressed.

【０１１３】このように、従来の増分分解法では１文字
ずつしか増やしていけないために特に図１０（Ａ）、
（Ｂ）で示すような同じ文字列が連続する文字列データ
については、本実施例の加増分解法とは大きな違いが生
じ、加増分解法の方が少ない記憶容量で記憶することが
可能になる。Because the conventional incremental decomposition method can extend a string by only one character at a time, it differs greatly from the additive decomposition method of this embodiment, particularly for character string data in which the same string repeats, as shown in FIGS. 10(A) and 10(B); the additive decomposition method can store such data in less storage.

【０１１４】[0114]

FIG. 10(A) shows compression of a run of repeated characters by the additive decomposition method of this embodiment, and FIG. 10(B) shows compression of the same run by the incremental decomposition method. With the incremental decomposition method of FIG. 10(B), entries are registered in the dictionary in the order A, AA, AAA, AAAA, AAAAA, AAAAAA, ..., so compressing 16 bytes of the character A requires 16 outputs and 17 dictionary registrations. With the additive decomposition method of FIG. 10(A), by contrast, entries are registered in the order A, AA, AAAA, AAAAAAAA, AAAAAAAAAAAAAAAA, ..., so compressing the same 16 bytes takes only 5 outputs and 6 dictionary registrations.
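The difference in how fast the registered entries grow can be illustrated with a small sketch. It models only the entry lengths listed in the text (one extra character per registration versus doubling); it does not reproduce the bookkeeping of FIG. 10, and the names `registered_entries` and `grow` are hypothetical.

```python
def registered_entries(symbol, target_length, grow):
    """List the entries registered until one of them covers target_length symbols."""
    entries = [symbol]
    while len(entries[-1]) < target_length:
        entries.append(grow(entries[-1], symbol))
    return entries


# Incremental decomposition: each new entry is the previous one plus one character.
incremental = registered_entries("A", 16, lambda last, sym: last + sym)

# Additive decomposition: each new entry joins two registered strings,
# so on a run of identical characters the entry length doubles.
additive = registered_entries("A", 16, lambda last, sym: last + last)

print(len(incremental), len(additive))
```

Under these assumptions the incremental scheme needs 16 entries before one of them spans the whole 16-byte run, whereas the additive scheme needs 5.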

【０１１５】[0115]

Thus, for string data in which the same character repeats, the benefit of the additive decomposition method of this embodiment is evident, and its compression ratio is far higher than that of the conventional incremental decomposition method. In other words, the same amount of string data can be processed with fewer outputs. Because each string is always replaced on output by a fixed-length dictionary registration number, fewer outputs mean fewer bytes, and hence less compressed data.

【０１１６】[0116]

(C) Generation of the optimal dictionary

The additive decomposition processing described above is performed multiple times over the entire string to be compressed, as in FIG. 6 described earlier. This produces the optimal dictionary, the one that compresses the string data most efficiently. The dictionary optimization is described below with reference to FIGS. 11 and 12.

【０１１７】[0117]

The flowchart of FIG. 11 shows the processing of one pass performed by the additive decomposition method in order to optimize the dictionary. Here, one pass means the processing that continues until every string has been stored in the work buffer and processed. In FIG. 11, NUM is first set to 0 (step E1). Next, it is checked whether the combination of the strings stored in buf1 and buf2 (the first and second positions in the work buffer), i.e. AB, is registered in the dictionary (step E2). If it is, the combination stored in buf1 and buf2 is replaced with its dictionary registration number, and the next string (i.e. B) is stored in buf16, which the replacement has freed (step E3). Processing then returns to step E2, and it is checked again whether the combination of the strings in buf1 and buf2 is registered in the dictionary.

【０１１８】[0118]

If step E2 determines that the combination of the strings in buf1 and buf2 is not registered in the dictionary, it is next checked whether the combination of the strings in buf2 and buf3 is registered (step E4). If it is registered, processing moves to step E5 and then returns to step E2. If it is not, processing moves to the next step.

【０１１９】[0119]

In this way the combinations are checked up to that of buf15 and buf16 (step E8). If this combination is not registered either, the final-pass flag is examined to determine whether the current pass is the final pass (step E10). If it is not the final pass, processing moves to step E11, and the combination of the strings in buf1 and buf2 is newly registered in the dictionary. The string stored in buf1 is then erased, the strings in the work buffer are shifted left, and the next string is stored in buf16 to fill the resulting space. NUM is then incremented by one (NUM = NUM + 1), and processing returns to step E2. If step E10 determines that this is the final pass, the string in buf1 is output externally as compressed data, the string in buf1 is erased, the buffer is shifted left by one position, and the next string is placed in the freed buf16 (step E12).
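The pass of FIG. 11 can be sketched roughly as follows. This is a minimal reading of steps E1 to E12, not the patented implementation: the names `run_pass` and `expand` are hypothetical, the dictionary is modelled as a mapping from a pair of symbols to a registration number, the 4096-entry limit is ignored, and the handling of a buffer that holds fewer than two symbols at the end of the data is an assumption.

```python
from collections import deque

BUF_SIZE = 16      # work buffer slots buf1..buf16
FIRST_CODE = 256   # registration numbers start after the byte values 0..255


def run_pass(data, dictionary, final_pass=False):
    """Run one pass; dictionary maps (left, right) -> registration number."""
    source = deque(data)
    buf = [source.popleft() for _ in range(min(BUF_SIZE, len(source)))]
    num = 0        # step E1
    output = []
    while buf:
        # Steps E2..E9: replace the first registered adjacent pair, then rescan.
        for i in range(len(buf) - 1):
            code = dictionary.get((buf[i], buf[i + 1]))
            if code is not None:
                buf[i:i + 2] = [code]
                if source:
                    buf.append(source.popleft())   # refill the freed slot
                break
        else:
            if final_pass:
                output.append(buf[0])              # step E12: emit buf1
            else:
                if len(buf) >= 2:                  # step E11: register buf1+buf2
                    dictionary[(buf[0], buf[1])] = FIRST_CODE + len(dictionary)
                num += 1
            buf.pop(0)                             # shift the buffer left
            if source:
                buf.append(source.popleft())
    return num, output


def expand(code, dictionary):
    """Expand one output code back into byte values (helper for checking)."""
    inverse = {number: pair for pair, number in dictionary.items()}
    stack, out = [code], []
    while stack:
        current = stack.pop()
        if current < FIRST_CODE:
            out.append(current)
        else:
            left, right = inverse[current]
            stack.extend((right, left))            # left is popped first
    return out


data = list(b"ABABABAB")
dictionary = {}
first_num, _ = run_pass(data, dictionary)                  # learning pass
_, codes = run_pass(data, dictionary, final_pass=True)     # final pass
restored = [b for code in codes for b in expand(code, dictionary)]
```

In this sketch a learning pass only grows the dictionary and counts NUM, while the final pass leaves the dictionary untouched and collects the output codes, mirroring the split between steps E11 and E12.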

【０１２０】[0120]

In FIG. 11, processing returns to step E2 after steps E8 and E9, for example, but in the present invention it is not always necessary to return all the way to step E2. In this case the only strings replaced are those stored in buf15 and buf16, so it is sufficient to return to step E6, the step whose decision the replacement affects. The same applies to other steps such as step E6.

【０１２１】[0121]

When the above processing has been repeated until the entire string to be compressed has been stored in the work buffer and the first pass ends, NUM2 is set to NUM, as shown in FIG. 12 (step F2). Here NUM is incremented by one each time step E11 of FIG. 11 is passed, and corresponds to the number of times compressed data would be output (in practice, as steps E11 and E12 show, no compressed data is output externally until the final pass).

【０１２２】[0122]

Next, as shown in step F3, a second pass is performed. The second pass uses the dictionary updated in the first pass, so the dictionary matures further. Like the first pass, it is carried out over the entire string to be compressed by the processing of FIG. 11. Step F4 then determines whether NUM > NUM2. If NUM ≤ NUM2, the output count NUM from step F3 does not exceed the output count NUM2 from step F1, meaning that the compression ratio has improved (or, when the two are equal, stayed the same). In that case processing returns to step F2 and is repeated until the optimal dictionary is obtained. Once NUM > NUM2, the optimal dictionary is judged to have been generated, the final-pass flag is set to 1 (step F5), and the compression of the final pass is performed (step F6).
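The control loop of FIG. 12 might be sketched as follows. The function `optimise_dictionary` and the callable `run_pass(data, dictionary, final_pass)` are hypothetical names; the pass routine is assumed to return NUM for a learning pass and the compressed output for the final pass. The `max_passes` guard is not in the text and is added here only because the literal test NUM > NUM2 would never fire if NUM stopped changing.

```python
def optimise_dictionary(run_pass, data, dictionary, max_passes=50):
    """Repeat learning passes until NUM stops improving, then run the final pass."""
    num2 = run_pass(data, dictionary, False)       # first pass (F1), NUM2 = NUM (F2)
    for _ in range(max_passes):                    # safety guard (assumption)
        num = run_pass(data, dictionary, False)    # next pass (F3)
        if num > num2:                             # F4: output count got worse
            break
        num2 = num                                 # back to F2
    return run_pass(data, dictionary, True)        # F5, F6: final pass


# Demonstration with a stand-in pass routine that replays fixed NUM values.
counts = iter([10, 8, 7, 9])
calls = []


def fake_pass(data, dictionary, final_pass):
    calls.append(final_pass)
    return "compressed" if final_pass else next(counts)


result = optimise_dictionary(fake_pass, b"", {})
```

With the replayed counts 10, 8, 7, 9, the loop stops after the fourth learning pass (9 > 7) and then performs the final pass.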

【０１２３】[0123]

In the final pass the final-pass flag is 1, so the decision at step E10 of FIG. 11 is always YES and processing always moves to step E12. Each time step E12 is passed, the string stored in buf1 is output externally as compressed data. Unlike step E11, step E12 registers no new string in the dictionary, so the dictionary stays as it was in the previous pass and is not updated. This dictionary, which is no longer updated, is output externally as the static dictionary together with the compressed data and stored in a storage device or the like.

【０１２４】[0124]

(D) Usage frequency

In the course of creating the optimal dictionary as described above, the number of dictionary entries may reach its limit, so that a new, frequently used string cannot be registered. In that case a registered string with a low usage frequency is erased, and the string to be registered is written into the space freed by the erasure.

【０１２５】[0125]

FIG. 13(A) shows the structure of a dictionary that includes this usage-frequency information, and FIG. 13(B) shows a flowchart of registration deletion based on the usage frequency.

【０１２６】[0126]

Here, P denotes the registration number at which the next dictionary entry will be registered; its initial value is 256. As shown in FIG. 13(A), the registration locations of the dictionary are assigned registration numbers 256 to 4095. The location for each registration number has a field for the number of the first character, a field for the number of the second character, and a field for the usage frequency. The initial value of every usage frequency is "0", and a usage frequency of 0 indicates that no string is registered at that location.

【０１２７】[0127]

To register a new string consisting of a first character plus a second character, the first and second character strings are written at location P in the dictionary (location 258 in FIG. 13(A)), as shown in step G1 of FIG. 13(B). P is then set to P + 1 (step G2) to look at the next location (location 259). If P reaches 4096, P is reset to 256 (steps G3 and G4). The usage frequency at location P (location 259) is then checked (step G5); if it is greater than 1, the usage frequency at location P is reduced by 1 (step G7). Processing then returns to step G2, and P is again set to P + 1 to look at the next location (location 260).

【０１２８】[0128]

For example, suppose that locations 256 to 4095 of the dictionary are all occupied. In this case steps G2 to G5 are repeated until some location's usage frequency becomes 0. The registration at the first location whose usage frequency reaches 0 is deleted (step G6). The next new dictionary entry is then registered at this deleted location.

【０１２９】[0129]

This processing makes it possible to delete the least frequently used registered string among registration numbers 256 to 4095 and to register another string at the location freed by the deletion. As a result, a dictionary can be generated in which frequently used strings are registered preferentially.
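The replacement scheme of FIG. 13(B) resembles a circular sweep that ages entries until one can be evicted. The sketch below is one possible reading, not a definitive implementation: the class name `FrequencyDictionary`, the `initial_frequency` parameter, and the treatment of a frequency of exactly 1 as evictable are assumptions, since the text specifies only the "greater than 1" branch and the deletion of the first location that reaches 0.

```python
class FrequencyDictionary:
    """Dictionary slots with a usage frequency and a circular pointer P."""

    def __init__(self, first=256, limit=4096):
        self.first = first
        self.limit = limit      # one past the last registration number
        self.slots = {}         # number -> [first_char, second_char, frequency]
        self.p = first          # where the next entry will be written

    def register(self, first_char, second_char, initial_frequency=1):
        number = self.p
        self.slots[number] = [first_char, second_char, initial_frequency]  # G1
        while True:
            self.p += 1                         # G2
            if self.p == self.limit:            # G3
                self.p = self.first             # G4
            slot = self.slots.get(self.p)
            if slot is None:                    # frequency 0: location is free
                break
            if slot[2] > 1:                     # G5
                slot[2] -= 1                    # G7: age the entry, keep looking
            else:
                del self.slots[self.p]          # G6: evict, P stays here
                break
        return number


# Tiny three-slot dictionary (256..258) to make the sweep visible.
d = FrequencyDictionary(first=256, limit=259)
a = d.register(1, 2, initial_frequency=3)
b = d.register(3, 4)
c = d.register(5, 6)    # dictionary is now full; the sweep evicts location 257
e = d.register(7, 8)    # reuses 257, then evicts location 258
```

In this run the entry at 256 survives because its higher frequency is only decremented during the sweep, while the rarely used entries at 257 and 258 are evicted.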

【０１３０】[0130]

(E) Restoration processing

The dictionary generated by this embodiment has a tree structure, as follows.

(Dictionary)
256: 30 + 56
257: 80 + 16
258: 256 + 40
259: 256 + 257
260: 259 + 257

Restoring the strings registered under registration numbers 256 to 260 from this dictionary therefore gives the following.

(String restoration)
256 = 30 + 56
257 = 80 + 16
258 = 256 + 40
    = 30 + 56 + 40 (from 256 = 30 + 56)
259 = 256 + 257
    = 30 + 56 + 257 (from 256 = 30 + 56)
    = 30 + 56 + 80 + 16 (from 257 = 80 + 16)
260 = 259 + 257
    = 256 + 257 + 257 (from 259 = 256 + 257)
    = 30 + 56 + 257 + 257 (from 256 = 30 + 56)
    = 30 + 56 + 80 + 16 + 257 (from 257 = 80 + 16)
    = 30 + 56 + 80 + 16 + 80 + 16 (from 257 = 80 + 16)
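The recursive restoration worked through above can be written as a short sketch. The names `TREE` and `expand` are hypothetical, and entry 258 is taken as 256 + 40, as in the restoration lines.

```python
# Tree-structured dictionary: registration number -> (first part, second part).
TREE = {
    256: (30, 56),
    257: (80, 16),
    258: (256, 40),
    259: (256, 257),
    260: (259, 257),
}


def expand(code, tree=TREE):
    """Decompose a code until every component is smaller than 256."""
    if code < 256:
        return [code]
    left, right = tree[code]
    return expand(left, tree) + expand(right, tree)


restored_260 = expand(260)
```

Each lookup may trigger further lookups, which is the cost that the restoration-only dictionary is meant to avoid.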

【０１３１】[0131]

Restoring a string compressed by this embodiment thus requires decomposing each registered string until every component is smaller than 256. Performing this decomposition at restoration time, however, makes string restoration very slow. To solve this problem, it is desirable to generate from the finished dictionary a restoration-only dictionary, described below, store it in a storage medium such as a ROM, and use it for restoration.

【０１３２】[0132]

FIG. 14(A) shows the structure of this restoration-only dictionary. At its head, the string length (data length) and the string start address are stored for each registration number, in order from 256 to 4095. The string core (the strings laid end to end) is stored after them. The core is generated by decomposing each registered string, as described above, until every component is smaller than 256; these strings serve as the restoration-only registered strings.

【０１３３】[0133]

During restoration, as shown in FIG. 14(B), the string start address stored at the position of a registration number from 256 to 4095 designates the corresponding start address within the string core. The data is restored by extracting from that start address a string (the restoration-only registered string) of the length given by the string length. Because the string core has already been decomposed as described above, restoration can be made far faster than with the tree-structured dictionary.
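A restoration-only dictionary of this shape could be built from a tree-structured dictionary as sketched below. The function names and the in-memory representation (a header mapping plus a byte string for the core) are assumptions; the text specifies only that each registration number has a length and a start address pointing into the core. The sample tree is illustrative.

```python
def build_restore_dictionary(tree):
    """Flatten a tree dictionary into (header, core).

    header maps registration number -> (string length, start address in core).
    """
    def expand(code):
        if code < 256:
            return bytes([code])
        left, right = tree[code]
        return expand(left) + expand(right)

    header = {}
    core = bytearray()
    for number in sorted(tree):
        string = expand(number)
        header[number] = (len(string), len(core))
        core += string
    return header, bytes(core)


def lookup(number, header, core):
    """Restore one registered string with a single slice, no recursion."""
    length, start = header[number]
    return core[start:start + length]


tree = {256: (30, 56), 257: (80, 16), 258: (256, 40),
        259: (256, 257), 260: (259, 257)}
header, core = build_restore_dictionary(tree)
```

The decomposition cost is paid once when the dictionary is built, so each lookup at restoration time is a fixed-cost slice. The trade-off is that the core repeats shared substrings and therefore takes more space than the tree.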

【０１３４】[0134]

FIG. 15 shows a flowchart of restoration using this restoration-only dictionary. First, a 12-bit code is read at step H2. If the code is the end symbol, processing ends (step H4); otherwise processing moves to step H5. The string start address and string length of the string to be restored are then obtained from the dictionary according to the code number. A string of that length, starting at the obtained start address, is written to the restoration buffer (step H6). Repeating this processing until the end symbol is detected restores the data.
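The loop of FIG. 15 can be sketched as follows. Several details are assumptions because the text does not give them: the 12-bit codes are taken to be packed two per three bytes, most significant bits first; codes below 256 are treated as literal bytes; and the end symbol is passed in by the caller (the value 4095 used in the example is purely illustrative and would reserve one registration number). All function names are hypothetical.

```python
def pack_codes(codes):
    """Pack 12-bit codes two per three bytes (assumed layout, MSB first)."""
    codes = list(codes)
    if len(codes) % 2:
        codes.append(0)                       # pad to a whole 3-byte group
    out = bytearray()
    for a, b in zip(codes[0::2], codes[1::2]):
        out += bytes([a >> 4, ((a & 0x0F) << 4) | (b >> 8), b & 0xFF])
    return bytes(out)


def read_codes(packed):
    """Yield the 12-bit codes stored in packed, in order."""
    for i in range(0, len(packed) - 2, 3):
        b0, b1, b2 = packed[i:i + 3]
        yield (b0 << 4) | (b1 >> 4)
        yield ((b1 & 0x0F) << 8) | b2


def restore(packed, header, core, end_code):
    """Steps H2..H6: read codes until the end symbol and copy strings out."""
    out = bytearray()
    for code in read_codes(packed):           # H2
        if code == end_code:                  # H4
            break
        if code < 256:
            out.append(code)                  # literal byte (assumption)
        else:
            length, start = header[code]      # H5
            out += core[start:start + length] # H6
    return bytes(out)


header = {256: (2, 0)}          # registration 256 -> 2 bytes at address 0
core = bytes([30, 56])
END = 4095                      # illustrative end symbol
packed = pack_codes([65, 256, 256, END])
restored = restore(packed, header, core, END)
```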

【０１３５】[0135]

3. Third Embodiment

The third embodiment compresses so-called font data using the data compression technique described in the first and second embodiments.

【０１３６】[0136]

(A) Bitmap font data

First, compression of bitmap font data is described. Bitmap font data is normally arranged in raster order. When bitmap font data is compressed by this embodiment, however, compressing in the vertical direction gives a considerably better compression ratio. The data is therefore compressed in the vertical direction, as shown in FIG. 16(A). Compressing in the horizontal direction is of course also possible.

【０１３７】[0137]

As shown in FIG. 16(A), bitmap font data (40 dots × 40 dots) consists of five fixed-length data blocks of 40 bytes each. In this embodiment each data block is compressed by the technique shown in the first and second embodiments. Data is normally compressed in this form, but when restoration speed takes priority over compression ratio, compression can instead be performed on word strings, treating two horizontally adjacent bytes as one word as shown in FIG. 16(B), rather than on byte strings as in FIG. 16(A). This raises the restoration speed, but has the drawback that the dictionary must be able to hold about 32768 strings.
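Rearranging a glyph from raster order into the five vertical 40-byte blocks could look like the sketch below. It assumes the raster stores 40 rows of 5 bytes each, row by row; the function name `column_blocks` is hypothetical.

```python
WIDTH_BYTES = 5   # 40 dots per row = 5 bytes
HEIGHT = 40       # 40 rows


def column_blocks(raster):
    """Split a row-major 200-byte glyph into 5 vertical blocks of 40 bytes."""
    if len(raster) != WIDTH_BYTES * HEIGHT:
        raise ValueError("expected a 40 x 40 dot glyph (200 bytes)")
    # Block k collects byte k of every row, top to bottom.
    return [raster[col::WIDTH_BYTES] for col in range(WIDTH_BYTES)]


raster = bytes(range(200))      # stand-in data, not a real glyph
blocks = column_blocks(raster)
```

Each block can then be handed to the compressor separately, so that vertically adjacent bytes, which tend to repeat, sit next to each other in the input.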

【０１３８】[0138]

FIG. 17 shows, for example, the character "に" as a bitmap image. In FIG. 17 the portions where "1" is written are black; "0" is written elsewhere, and those portions are white. As FIG. 17 makes clear, bytes 0 to 7, 12, 13, and 30 to 39 of the byte string (data string) are 00h (hexadecimal). Bytes 10, 11, 14 to 16, and 27 to 29 are 01h. Byte 8 is 02h, and bytes 9 and 17 to 26 are 03h. Thus, when bitmap font data is compressed in the vertical direction, bytes of the same value occur in runs. It follows that the data compression techniques of the first and second embodiments compress such data with high efficiency.
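The byte values quoted above can be laid out and grouped into runs to see how repetitive the column is. The sketch rebuilds the 40-byte string from the listed byte numbers; the variable names are illustrative.

```python
from itertools import groupby

# Byte numbers and values as listed in the text; unlisted bytes are 00h.
column = bytearray(40)
values = {
    0x01: [(10, 11), (14, 16), (27, 29)],
    0x02: [(8, 8)],
    0x03: [(9, 9), (17, 26)],
}
for value, spans in values.items():
    for low, high in spans:
        for index in range(low, high + 1):
            column[index] = value

# Collapse the column into (value, run length) pairs.
runs = [(value, len(list(group))) for value, group in groupby(column)]
```

The 40 bytes collapse into 9 runs, the longest being ten bytes of 03h and ten bytes of 00h, which is the kind of repetition the pair-merging dictionary exploits.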

【０１３９】[0139]

(B) Outline font data

An outline font represents the contour of a character by a number of points and the straight lines and curves that connect them. Characters of ordinary size (for example, 10 point) are printed using the bitmap font described above, but some users want to print at a large size such as 32 point. Preparing bitmap font data for every size in advance is difficult in terms of data capacity. In such cases a basic outline font is therefore held for each character; the outline font represents the character's contour, which is enlarged by scaling and converted into bitmap image data like that shown in FIG. 17. Characters of the desired size can then be printed.

【０１４０】[0140]

For example, with an outline font the contour of the character "2" can be represented, as shown in FIG. 18, by points A, B, C, D, E, F, G, ... and the straight lines and curves connecting them. In this embodiment the outline font data is described in a data format called the NSI format.

【０１４１】[0141]

In this embodiment, the outline font data described in the NSI format is separated into three parts, namely the FLG part, the DAT part, and the VCT part, as shown in FIG. 19(A), and then compressed.

【０１４２】[0142]

The VCT part is information representing the vector coordinates of the outgoing direction at each point, and contains the X and Y vector coordinate information for each point. These X and Y vector coordinates are not absolute coordinate values but values relative to the preceding point.

【０１４３】[0143]

The FLG part is information representing the attributes of each point. It contains information such as "point A is a start point", "points A and B are joined by a straight line", "points B and D are joined by a curve (Bézier curve) with point C as the intermediate point", "points D and E are joined by a straight line", and "points E and G are joined by a curve (Bézier curve) with point F as the intermediate point". The DAT part is information representing the characteristics of the character, such as its size and line width.

In this embodiment the VCT part is compressed by Huffman coding. The FLG and DAT parts, by contrast, show a degree of regularity, such as runs of identical data, and are therefore compressed by the data compression method described in the first and second embodiments. Separating the components of the outline font data and compressing each one separately in this way greatly improves the compression ratio.

As shown in FIG. 19(B), the compressed data is structured so that the compressed FLG data comes first, followed by the compressed DAT data, followed by the compressed VCT data. A dedicated restoration dictionary is provided for each of the FLG, DAT, and VCT parts.

【０１４４】[0144]

To restore this compressed data, the FLG and DAT parts are restored first, and the VCT part is then restored on that basis. FIG. 19(C) shows a flowchart of this restoration. First, at step I2, the FLG part is restored with the static dictionary for FLG and written temporarily to a buffer provided for FLG. Next, at step I3, the DAT part is restored with the static dictionary for DAT and written temporarily to a buffer provided for DAT. At this point the data read pointer points to the head of the compressed VCT part, as shown in FIG. 19(B).

【０１４５】[0145]

Next, as shown in step I4, the DAT and FLG parts are read from these buffers according to the rules of the NSI format and output, after which the VCT part is restored by Huffman decoding according to bits 0 to 3 of the FLG part that was read (these bits are described in detail later). The original outline font data, returned to NSI format order, is thereby obtained.

【０１４６】[0146]

To improve the compression ratio further, this embodiment converts the FLG and VCT parts described above into a format expressed by the coordinate point control section and the coordinate point definition section described below, and then compresses them.

【０１４７】[0147]

Bits 4 to 7 of the coordinate point control section are the same as bits 4 to 7 of the original FLG part. They specify the attributes of each point: whether it is a start point, an intermediate point, or an end point, and whether move interpolation, linear interpolation, or curve interpolation applies. Bits 0 to 3 of the coordinate point control section indicate how the vector coordinate values in the coordinate point definition section are to be interpreted. Bit 0 indicates that the X vector coordinate is nonzero and present, and bit 1 indicates that the Y vector coordinate is nonzero and present. Bit 2 indicates that the X vector coordinate is negative, and bit 3 indicates that the Y vector coordinate is negative.

【０１４８】[0148]

The coordinate point definition section holds 10-bit codes (0 to 1023). Together with bits 0 to 3 of the coordinate point control section, these codes can represent vector coordinate values in the following ranges.

-1024 ≤ X vector coordinate ≤ 1024
-1024 ≤ Y vector coordinate ≤ 1024

【０１４９】[0149]

For example, bit 0 = 1 indicates that the coordinate point definition section contains the data X - 1, and bit 1 = 1 indicates that it contains the data Y - 1. Where a character has vertical or horizontal strokes, the X vector coordinate or the Y vector coordinate may be 0, in which case that coordinate need not be output as compressed data. If bits 0 and 1 of the coordinate point control section indicate whether the X and Y vector coordinates are present in the coordinate point definition section, the data compression ratio can therefore be improved further. Likewise, the signs of the X and Y vector coordinates are specified by bits 2 and 3 of the coordinate point control section as described above, which improves the compression ratio further still.
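The control bits and 10-bit codes described in paragraphs [0147] to [0149] can be sketched as an encoder and decoder for one point. The function names are hypothetical, and the sketch assumes that the stored code is the magnitude of the coordinate minus one, which is how the ranges of -1024 to 1024 and the "X - 1" wording are read here.

```python
def encode_point(dx, dy, attributes=0):
    """Return (control byte, list of 10-bit codes) for one relative point."""
    control = (attributes & 0x0F) << 4          # bits 4..7: point attributes
    codes = []
    for value, present_bit, negative_bit in ((dx, 0, 2), (dy, 1, 3)):
        if value == 0:
            continue                            # zero coordinate: nothing stored
        if not -1024 <= value <= 1024:
            raise ValueError("coordinate out of range")
        control |= 1 << present_bit             # bit 0 / bit 1: value present
        if value < 0:
            control |= 1 << negative_bit        # bit 2 / bit 3: value negative
        codes.append(abs(value) - 1)            # 0..1023 fits in 10 bits
    return control, codes


def decode_point(control, codes):
    """Inverse of encode_point; returns (dx, dy)."""
    remaining = iter(codes)
    result = []
    for present_bit, negative_bit in ((0, 2), (1, 3)):
        if control & (1 << present_bit):
            magnitude = next(remaining) + 1
            negative = control & (1 << negative_bit)
            result.append(-magnitude if negative else magnitude)
        else:
            result.append(0)
    return tuple(result)


control, codes = encode_point(0, -1024, attributes=0x3)
```

A vertical stroke such as (0, -1024) therefore costs one control byte and a single 10-bit code, since the zero X coordinate is signalled by bit 0 alone.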

【０１５０】[0150]

As described above, this embodiment compresses bitmap font data and outline font data effectively, using the static dictionary described in the first and second embodiments. It is desirable for characters of the same typeface (font) to share a static dictionary. For example, all Mincho characters are processed with a Mincho-only dictionary, which is updated and used for compression to obtain the final static dictionary and compressed data. All Gothic characters are likewise processed with a Gothic-only dictionary to obtain the final static dictionary and compressed data.

Sharing a dictionary within each typeface in this way allows the data to be compressed efficiently. This is because characters of the same typeface share features such as the way contours change (for example, how strokes flick upward) and the contour thickness, so the characteristic information of the characters and the attribute information of the contour points tend to coincide, and the shared dictionary compresses the data more readily. In this case the static dictionaries and compressed data that are output are classified by typeface and stored in the storage device.

【０１５１】[0151]

FIG. 20 shows how the data compression method of this embodiment described above is used. The data string to be compressed (font data) is input to the data compression device 31 and compressed by the data compression method of this embodiment. The resulting static dictionary and compressed data are stored in a storage device (ROM or the like) within the data restoration means 32. The data restoration means 32 is built into an information processing apparatus such as a printer 33 or a computer 34 (the restoration means may also be split and built into two devices).

[0152] In a conventional printer or the like, font data and similar data are stored in the storage device without being compressed, and the required font data is output as needed, and characters are printed, by managing the storage addresses of the individual data items. In contrast, according to this embodiment, the static dictionary and the compressed data are stored in the storage device, and the required characters can be output as needed by managing the storage addresses of the individual compressed data items together with the static character string dictionary. This reduces the amount of data held in the storage device and lowers the cost of the apparatus. This advantage is particularly valuable for kanji, Hangul, and other character sets that require a large storage capacity, but the benefit is of course also significant for the characters of other languages.
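The address-based retrieval described in this paragraph can be illustrated with a minimal Python sketch. It is not the layout used in the embodiment: the 16-bit code width, the `(offset, count)` address table, and the names `pack_glyphs` and `restore_glyph` are all assumptions made for illustration. The point it shows is that one character can be restored from the static dictionary without decoding any other character.

```python
import struct

def pack_glyphs(encoded_glyphs):
    """Concatenate per-character code lists into one blob.

    Returns the blob and an address table of (byte offset, code count)
    entries, one per character. The 16-bit little-endian code width is
    an assumption of this sketch.
    """
    blob = bytearray()
    address_table = []
    for codes in encoded_glyphs:
        address_table.append((len(blob), len(codes)))
        blob += struct.pack(f"<{len(codes)}H", *codes)
    return bytes(blob), address_table

def restore_glyph(blob, address_table, static_dict, index):
    """Restore a single character by looking up only its own codes."""
    offset, count = address_table[index]
    codes = struct.unpack_from(f"<{count}H", blob, offset)
    return b"".join(static_dict[c] for c in codes)

# Hypothetical static dictionary: registration number -> registered data string.
static_dict = {0: b"\x00", 1: b"\xff", 2: b"\x00\xff", 3: b"\xff\xff\x00"}
# Hypothetical compressed data for three characters.
glyphs = [[2, 3, 0], [1, 1], [3, 2, 2, 1]]
blob, table = pack_glyphs(glyphs)
second = restore_glyph(blob, table, static_dict, 1)
```

Because the dictionary is static, the cost of restoring one character depends only on the length of its own code list, which is what makes on-demand output from ROM practical.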

[0153] The present invention is not limited to the above embodiments, and various modifications can be made within the scope of the gist of the present invention.

[0154] For example, although these embodiments have mainly described the processing of character string data, the invention can also be applied to other symbol data and the like.

[0155] The data strings compressed according to the present invention are not limited to character strings, byte strings, and word strings, but include various other data strings, for example data strings that describe two-dimensional or three-dimensional figures.

[0156] The registration number in the present invention also includes any information that performs substantially the same function as the registration number.

[0157]

[Effects of the Invention]

According to the present invention, data is compressed using the optimal dictionary as a static dictionary, so the data amount of the output static dictionary and compressed data can be optimized. This reduces the capacity of the storage device or storage medium that holds the static dictionary and compressed data, and thereby reduces cost. Furthermore, it becomes possible to use, for example, a data compression algorithm based on a dynamic dictionary, which offers a high compression ratio, so the data compression ratio can be made very high. In addition, the required data strings can be restored freely using the static dictionary; for example, when an information processing apparatus or the like restores the compressed data strings, any desired data string can be extracted. Further, according to the present invention, although optimizing the dictionary and obtaining the final dictionary and compressed data may take longer, the restoration speed itself is not particularly reduced.

[0158] Further, according to the present invention, combinations of data strings with a large number of combined elements are preferentially replaced with dictionary registration numbers, so the data compression ratio can be optimized. Meanwhile, registrations of infrequently used registration data strings are deleted, and the update ends once the number of registrations reaches a predetermined number, so an optimal dictionary with a small data amount can be generated. There is also the advantage that the dictionary can be generated easily by the technique known as a slide dictionary. When the output data has, for example, 12 bits, the predetermined number in this case can be chosen so that the number of dictionary registrations fits within the range expressible in those 12 bits.
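As a rough illustration of the deletion step and the bit-width limit, the following Python sketch keeps only as many entries as a given code width can address. The flat string-to-count representation, the tie-breaking rule, and the `reserved` parameter are assumptions of this sketch, not details taken from the embodiment; with `code_bits=12` and nothing reserved, the limit works out to 2^12 = 4096 registrations.

```python
def prune_dictionary(entries, code_bits=12, reserved=0):
    """Keep the most frequently used entries that fit in `code_bits` bits.

    entries:  mapping of registered data string -> usage count
    reserved: number of code values set aside for other purposes
              (hypothetical parameter, 0 by default)
    Returns a mapping of data string -> new registration number.
    """
    limit = (1 << code_bits) - reserved
    # Highest usage first; ties go to the longer string, then lexical order,
    # purely so that the result is deterministic.
    ranked = sorted(entries.items(), key=lambda kv: (-kv[1], -len(kv[0]), kv[0]))
    return {string: number for number, (string, _) in enumerate(ranked[:limit])}

# Small demonstration with a 2-bit code space (at most 4 registrations).
usage = {"ab": 5, "abc": 1, "bc": 3, "ca": 3, "abca": 0, "b": 9}
pruned = prune_dictionary(usage, code_bits=2)
```

A dictionary whose entries refer to one another (for example, a tree-structured dictionary) would need extra care so that deleting an entry does not orphan its descendants; the flat form used here sidesteps that issue.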

[0159] Further, according to the present invention, combinations of data strings with a high probability of appearance are preferentially replaced with registration numbers, so the data compression ratio can be optimized. Meanwhile, registrations of infrequently used registration data strings are deleted, and the update ends once the number of registrations reaches a predetermined number, so an optimal dictionary with a small data amount can be generated. There is also the advantage that the initial dictionary can be generated in a single analysis, which greatly simplifies the processing.

[0160] Further, according to the present invention, data is compressed by a data compression algorithm that uses a dynamic dictionary, which offers a high compression ratio, and updating of the dictionary ends at the stage where the data compression ratio is optimal, so the data compression ratio can be made very high. Meanwhile, the output dictionary is a static dictionary, so the required data strings can be restored freely using this static dictionary.
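The idea of running a dynamic-dictionary algorithm repeatedly and then freezing the result can be sketched in Python as follows. This is an illustrative reading, not the procedure of the embodiment: it uses a simple LZW-style parse, stops when the number of emitted codes no longer decreases, and the names `lzw_pass`, `build_static_dictionary`, `compress_static`, and `restore` as well as the `max_passes` cap are assumptions.

```python
def lzw_pass(data, dictionary):
    """One LZW-style pass over `data`; extends `dictionary` in place.

    Returns the number of codes that would be emitted in this pass.
    """
    emitted = 0
    i = 0
    while i < len(data):
        j = i + 1
        # Extend the match while the longer string is already registered.
        while j < len(data) and data[i:j + 1] in dictionary:
            j += 1
        emitted += 1
        if j < len(data):
            # Register the matched string plus the next byte.
            dictionary.setdefault(data[i:j + 1], len(dictionary))
        i = j
    return emitted

def build_static_dictionary(data, max_passes=16):
    """Repeat passes, carrying the dictionary over, until no improvement."""
    dictionary = {bytes([b]): b for b in range(256)}
    best = None
    for _ in range(max_passes):
        emitted = lzw_pass(data, dictionary)
        if best is not None and emitted >= best:
            break
        best = emitted
    return dictionary

def compress_static(data, dictionary):
    """Greedy longest-match compression with the dictionary frozen."""
    codes = []
    i = 0
    while i < len(data):
        j = i + 1
        while j < len(data) and data[i:j + 1] in dictionary:
            j += 1
        codes.append(dictionary[data[i:j]])
        i = j
    return codes

def restore(codes, dictionary):
    """Restore by direct lookup; no dictionary rebuilding is needed."""
    by_number = {number: string for string, number in dictionary.items()}
    return b"".join(by_number[c] for c in codes)

data = b"abcabcabcabc" * 8
static_dictionary = build_static_dictionary(data)
codes = compress_static(data, static_dictionary)
```

In this sketch the dictionary grows without bound; limiting it to a fixed number of registrations would be a separate step.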

[0161] Further, according to the present invention, particularly when identical data strings are compressed, the number of dictionary registrations can be made far smaller than with the conventional incremental decomposition method, and the compressed data itself, in which data strings have been replaced with the registration numbers of the dictionary, can also be made far smaller in data amount than with the conventional incremental decomposition method.

[0162] Further, according to the present invention, the data compression ratio can be made very high, and the required data strings can be restored freely using the static dictionary. There is also the advantage that whether the optimal compression ratio has been reached can be determined from the number of processing repetitions alone.

[0163] Further, according to the present invention, when, for example, there is a limit on the number of entries that can be registered in the dictionary, the data amount of the dictionary can be brought to an optimal size by a very simple method.

[0164] Further, according to the present invention, the data amount of the dictionary can be brought to an optimal size while frequently used registration data strings are retained in the dictionary, so an optimal dictionary can be generated.

[0165] Further, according to the present invention, bitmap font data and outline font data can be compressed, and the capacity of the storage device or storage medium that holds the font data can be reduced. This makes it possible to reduce the cost of printers, computers, and other equipment that incorporate the storage device or storage medium.

[0166] Further, according to the present invention, the compression method applied can be changed according to the characteristics of the data, so the data compression ratio can be increased further.

[0167] Further, according to the present invention, a dictionary can be shared within each typeface, such as Mincho or Gothic. Because features such as the way the outline of a character changes and the thickness of the outline are often common within the same typeface, the present invention allows the data to be compressed more efficiently.

[0168] Further, according to the present invention, the capacity and other resources required to store characters can be saved.

[0169] Further, according to the present invention, the restoration-dedicated registration data strings are converted into a data format dedicated to restoration; for example, in the incremental decomposition method and the additive decomposition method, the registration data strings are decomposed until no tree structure remains. The restoration processing can therefore be performed much faster than when an ordinary dictionary is used.
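One way to picture the removal of the tree structure is the Python sketch below. It assumes a hypothetical tree-structured dictionary in which each registration number maps to a `(parent number, byte)` pair, and expands every entry into a flat byte string ahead of time. This is an illustration only and does not reproduce the restoration-dedicated dictionary layout of FIG. 14; the names `flatten` and `restore` are assumptions.

```python
def flatten(tree_dict):
    """Expand a tree-structured dictionary into flat byte strings.

    tree_dict: registration number -> (parent number or None, byte value)
    Returns:   registration number -> fully expanded byte string
    """
    flat = {}

    def expand(number):
        if number not in flat:
            parent, value = tree_dict[number]
            prefix = b"" if parent is None else expand(parent)
            flat[number] = prefix + bytes([value])
        return flat[number]

    for number in tree_dict:
        expand(number)
    return flat

def restore(codes, flat_dict):
    """Restoration is one lookup per code, with no parent chain to follow."""
    return b"".join(flat_dict[c] for c in codes)

# Hypothetical tree: 0 = "a", 1 = "b", 2 = "a"+"b", 3 = "ab"+"a".
tree = {0: (None, 97), 1: (None, 98), 2: (0, 98), 3: (2, 97)}
flat = flatten(tree)
restored = restore([3, 1, 2], flat)
```

The flattened form trades dictionary size for speed: the chain walk that a tree-structured dictionary needs on every restoration is paid once, when the restoration-dedicated dictionary is built.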

[0170] Further, according to the present invention, predetermined processing, such as the printing of characters, can be performed using the restored data strings.

[0171] Further, according to the present invention, the original data strings can be restored by the restoring means, and the restoring means can be built into an information processing apparatus such as a computer or a printer.

[Brief description of the drawings]

FIG. 1 is a flowchart for explaining the data compression method of the first embodiment.

FIG. 2 is a block diagram showing an example of the configuration of a data compression device 12 in which the data compression method of the embodiment is used.

FIGS. 3A to 3E are schematic explanatory diagrams for explaining the slide dictionary technique.

FIG. 4 is a flowchart for explaining a method of obtaining an optimal dictionary using the slide dictionary technique.

FIG. 5 is a flowchart for explaining a method of obtaining an optimal dictionary using the incremental decomposition algorithm.

FIG. 6 is a diagram visually representing the process of creating the static dictionary and the compressed data.

FIG. 7 is a schematic explanatory diagram for schematically explaining the data compression method used in the second embodiment.

FIGS. 8A and 8B are diagrams schematically showing the difference in processing between the incremental decomposition method and the additive decomposition method.

FIG. 9 is a flowchart for explaining the operation of the incremental decomposition method.

FIGS. 10A and 10B are diagrams schematically showing the difference in processing between the incremental decomposition method and the additive decomposition method.

FIG. 11 is a flowchart for explaining the processing of one pass by the additive decomposition method.

FIG. 12 is a flowchart for explaining the processing of multiple passes for optimizing the dictionary.

FIG. 13A is a diagram showing the structure of a dictionary that includes usage frequency information, and FIG. 13B is a flowchart showing the registration deletion processing based on usage frequency.

FIG. 14A is a diagram showing the structure of a restoration-dedicated dictionary, and FIG. 14B is a diagram explaining how the core portion of a character string is designated by a character start address and a character string length.

FIG. 15 is a flowchart of the restoration processing when a restoration-dedicated dictionary is used.

FIGS. 16A and 16B are schematic explanatory diagrams for explaining the compression of bitmap data.

FIG. 17 is a schematic explanatory diagram for explaining the bitmap image of a character.

FIG. 18 is a schematic explanatory diagram for explaining an outline font.

FIGS. 19A and 19B are schematic explanatory diagrams for explaining the compression of outline font data, and FIG. 19C is a flowchart of the processing for restoring this compressed data.

FIG. 20 shows a use mode of the data compression method of the embodiment.

[Explanation of symbols]

1 Character string data to be compressed
2 Work buffer
3 Dictionary
4 Compressed data
5 Completed dictionary
11 All data strings to be compressed
12 Data compression device
13 Dictionary generating means
14 Static dictionary
15 Static dictionary holding means
16 Static dictionary output means
17 Data compression means
18 Compressed data output means
19 Compressed data

Claims

[Claims]

1. A data compression method for compressing data by using a dictionary in which registration data strings can be registered in association with registration numbers and replacing combinations of two or more data strings with the registration numbers, wherein the dictionary is updated until a dictionary optimal for compressing the data strings is generated, and, at the stage when the optimal dictionary has been generated, the dictionary is output as a final static dictionary for restoration, the data strings to be compressed are compressed using the static dictionary, and the compressed data strings are output as final compressed data for restoration.

2. The data compression method according to claim 1, wherein the updating of the dictionary until the optimal dictionary is generated is performed by deleting registrations of infrequently used registration data strings from a dictionary generated by preferentially registering combinations of data strings having a large number of combined elements, until the number of registrations in the dictionary reaches a predetermined number.

3. The data compression method according to claim 1, wherein the updating of the dictionary until the optimal dictionary is generated is performed by deleting registrations of infrequently used registration data strings from a dictionary generated by preferentially registering combinations of data strings having a high probability of appearance, until the number of registrations in the dictionary reaches a predetermined number.

4. The data compression method according to claim 1, wherein the updating of the dictionary until the optimal dictionary is generated is performed by: performing data compression processing on all the data strings to be compressed while updating the dictionary with a data compression algorithm in which the dictionary changes dynamically during data compression; performing the data compression processing on all the data strings to be compressed again, using the dictionary updated by that processing and continuing to update it with the data compression algorithm; and repeating this processing until the data compression ratio becomes optimal.

5. The data compression method according to claim 4, wherein whether the optimal dictionary has been generated is determined on the basis of the number of compressed-data outputs, which increases each time compressed data is output by the data compression algorithm.

6. The data compression method according to claim 1, wherein usage frequency information is stored in the dictionary together with the registration number and the registration data string, and registration data strings having a low usage frequency are deleted sequentially.

7. The data compression method according to claim 6, wherein the deletion of registration data strings having a low usage frequency is performed by sequentially reducing the usage frequency of the registration data strings registered in the dictionary and preferentially deleting a registration data string that has been changed.

8. The data compression method according to claim 1, wherein the data string to be compressed is font data required for printing characters.

9. The data compression method according to claim 8, wherein only a part of the font data is the data string to be compressed, and the other part is compressed by another data compression method.

10. The data compression method according to claim 9, wherein, for an outline font, the information representing the attribute of each point of the outline font and the information representing the characteristics of the character are the data strings to be compressed, and the information representing the vector coordinates in the launch direction at each point is compressed by another data compression method.

11. The data compression method according to claim 8, wherein characters of a common typeface are compressed using a common dictionary.

12. The data compression method according to claim 1, wherein the data string to be compressed is a character string.

13. A data restoration method, wherein the data strings that were to be compressed are restored by restoration processing corresponding to the data compression method, using the compressed data and the final dictionary generated by the data compression method according to claim 1.

14. An information processing apparatus comprising means for restoring the data strings that were to be compressed by restoration processing corresponding to the data compression method, using the compressed data and the final dictionary generated by the data compression method according to claim 1.