JP3001519B1

JP3001519B1 - Data compression method and data compression system

Info

Publication number: JP3001519B1
Application number: JP23552098A
Authority: JP
Inventors: 久幸山中
Original assignee: 日本電気アイシーマイコンシステム株式会社
Priority date: 1998-08-21
Filing date: 1998-08-21
Publication date: 2000-01-24
Anticipated expiration: 2018-08-21
Also published as: JP2000068857A

Abstract

PROBLEM TO BE SOLVED: To provide a data compression system based on LZW coding that speeds up encoding by eliminating the conventional need to encode all of the data to be compressed with all of the dedicated dictionaries.

SOLUTION: In a data compression system based on LZW coding, a learning dictionary (12-1) and one or more specialized dictionaries (12-2 to 12-4) are provided. After a portion of the data to be compressed is compressed and learned using all of the dictionaries, the dedicated dictionary of the field closest to the content of that data is selected and fixed, and the remaining data is compressed with the selected, fixed dedicated dictionary.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

TECHNICAL FIELD OF THE INVENTION

The present invention relates to a data compression method and a data compression system based on LZW coding, which is one of the universal coding schemes.

[0002]

PRIOR ART

In data compression using incremental-parsing LZW coding, one of the universal coding schemes, and in particular in the encoding performed at compression time, the conventional encoding process provided a learning dictionary and several dedicated dictionaries, and performed encoding and dictionary learning on all of the data to be compressed with every learning dictionary and dedicated dictionary. Dictionary selection data, from which the decoder can determine which dedicated dictionary was used, was then attached to the compressed data encoded with the dedicated dictionary that yielded the highest compression ratio, and the result was sent to the decoder, thereby reducing the encoding and decoding workload (see FIG. 21).

[0003] However, because the conventional method encodes all of the data to be compressed with all of the dictionaries and computes the compression ratio of each dictionary only after encoding has finished, time is wasted before a dictionary is selected.

[0004]

PROBLEM TO BE SOLVED BY THE INVENTION

Accordingly, an object of the present invention is to provide a data compression method and a data compression system based on LZW coding that speed up encoding by eliminating the conventional need to encode all of the data to be compressed with all of the dedicated dictionaries.

[0005] As a result of intensive study aimed at achieving the above object, the present inventor arrived at the following idea for LZW coding, one of the universal coding schemes. One or more specialized dictionaries are provided in addition to the ordinary learning dictionary to raise coding efficiency, and a fixed amount of the data to be compressed is compressed and learned using all of the dictionaries, so that the tendency of the data is grasped from a portion of it. A specialized dictionary suited to that tendency is then selected and used for encoding, which reduces the encoding time. The present invention was completed on this basis.

[0006] Accordingly, the present invention provides a data compression method based on LZW coding, characterized in that a learning dictionary and one or more specialized dictionaries are provided; a portion of the data to be encoded is compressed and learned using all of the dictionaries; the dedicated dictionary of the field closest to the content of the data is then selected and fixed; and the remaining data is compressed with the selected, fixed dedicated dictionary.

[0007] In data compression using incremental-parsing LZW coding (referred to as Ziv-Lempel; hereinafter LZW), one of the universal coding schemes, and in particular in the encoding performed at compression time, the conventional encoding process, as described above, provided a learning dictionary and several dedicated dictionaries and performed encoding and dictionary learning on all of the data to be compressed with every learning dictionary and dedicated dictionary. Dictionary selection data indicating which dedicated dictionary was used was then attached to the compressed data encoded with the dedicated dictionary that yielded the highest compression ratio, and the result was sent to the decoder, thereby reducing the encoding and decoding workload.

[0008] The present invention uses the same dictionary arrangement, but whereas the conventional method encoded all of the data to be compressed with all of the dictionaries, the present invention compresses and learns only a fixed amount of the data with all of the dictionaries, then selects and fixes the dedicated dictionary of the field closest to the content of the data and compresses the remaining data with it. This eliminates the conventional encoding of all of the data with all of the dedicated dictionaries and speeds up the encoding process.
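The saving claimed here can be illustrated with a rough count of how much data passes through a dictionary encoder under each approach. The sketch below is only an estimate under stated assumptions: it borrows the 10-Kbyte block, 1-Kbyte learning portion, and four dictionaries used in the embodiment, ignores the cost of the hit-rate calculation and of switching dictionaries, and the function name `dictionary_passes_kb` is hypothetical.

```python
def dictionary_passes_kb(block_kb: int, probe_kb: int, n_dictionaries: int) -> tuple[int, int]:
    """Return (conventional, proposed) Kbytes pushed through a dictionary encoder.

    conventional: every dictionary encodes the whole block.
    proposed: every dictionary encodes only the learning portion, then the
    single selected dictionary encodes the remainder.
    """
    conventional = n_dictionaries * block_kb
    proposed = n_dictionaries * probe_kb + (block_kb - probe_kb)
    return conventional, proposed


# Figures taken from the embodiment: 10-Kbyte block, 1-Kbyte learning portion,
# one learning dictionary plus three dedicated dictionaries.
print(dictionary_passes_kb(10, 1, 4))  # (40, 13)
```

Under these assumptions the encoder handles 13 Kbytes of dictionary work per block instead of 40, and the gap widens as the block grows relative to the learning portion.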

[0009]

MODE FOR CARRYING OUT THE INVENTION

The configuration of the present invention is described with reference to FIG. 1 (the LZW encoder of the present invention) and FIG. 2 (the LZW decoder of the present invention).

[0010] The first configuration of the present invention is a data compression system comprising: a character string input buffer 10 that stores the data to be compressed 42 of the data frame of the first embodiment shown in FIG. 4; an encoding dictionary 12 comprising a learning dictionary 12-1, a communication-dedicated dictionary 12-2 suited to the communication field, a voice-dedicated dictionary 12-3 suited to the voice field, and an image-dedicated dictionary 12-4 suited to the image field; a learning dictionary code string output buffer 14, a communication-dedicated dictionary code string output buffer 15, a voice-dedicated dictionary code string output buffer 16, and an image-dedicated dictionary code string output buffer 17, one for each dictionary; a dictionary selector 13; and an LZW encoding unit 11. The LZW encoding unit 11 has a function of encoding the data to be compressed 42 from the input buffer 10 by the LZW method using each dictionary and outputting the encoded data to the corresponding output buffer. It also has a function of selecting the dictionary best suited to encoding the data from among the learning dictionary and the dedicated dictionaries, which accumulate the registered strings learned in each dedicated dictionary together with data indicating how often each registered string appeared in the data to be compressed, and outputting a selection signal 18 to the dictionary selector 13.

[0011] The second configuration of the present invention is the data compression system of the first configuration in which the LZW encoding unit 11 has a function of encoding data to be compressed 43 to which, as shown in the data frame of the second embodiment in FIG. 6, a dictionary selection flag indicating the field of the data has been attached in advance.

[0012] In universal coding schemes, dictionaries are generally built with the dictionary structure shown in FIG. 7. In the LZW method, the dictionary normally holds the basic alphanumeric characters (basic registered characters 92) and a few control words. Because the claims of the present invention are limited to the compression operation, exception handling in the LZW method is not described here, and the control words are therefore not described either.

[0013] First, the operation of the LZW encoder of the first embodiment is described with reference to FIG. 1, using the encoder's learning dictionary. The LZW encoding unit 11 encodes the data to be compressed 42 according to the encoding procedure of FIG. 15. The operation is described below following this procedure.

[0014] First, the LZW encoding unit 11 registers the 259 basic registered characters 92 (a, b, c, d, ..., 0, 1, 2, ...), numbered 0 to 258, as shown in the learning dictionary of FIG. 7 (step 100).

[0015] Assume that the character string input buffer 10 of the LZW encoder holds the data frame (abcabcabcaabbcc...) as the data to be compressed 42. The first character a of the frame is read (step 101), and the learning dictionary 12-1 is searched for it (step 102). Because a is registered as a basic registered character 92, the search succeeds and the pointer advances to the next character b (step 103). The character b is read, and the learning dictionary 12-1 is searched for the string ab (steps 101 to 102). Because ab is not registered, it is registered in the learning dictionary 12-1 as a newly registered string 93 and is assigned the next node number, 259 (step 104).

[0016] Next, the character a matched in the previous search is converted to the code word (000000000) shown in the communication-dedicated dictionary of the LZW decoder in FIG. 12 and output to the learning dictionary code string output buffer 14 (step 105). Because the end of the string to be compressed has not been reached (step 106), processing returns to reading the string (step 101), and a search is made for the previously read character b. Because b is registered as a basic registered character 92, nothing is output; the pointer advances to the next character c, and the learning dictionary 12-1 is searched for bc. Because bc is not registered, it is registered as a newly registered string 93 and assigned the next node number, 260. In this case too, the code word (000000001) of the character b is output to the code string output buffer 14.

[0017] The rest of the string is encoded by incremental parsing in the same manner, building the learning dictionary shown in FIG. 7, and the code word string (000000000, 000000001, 000000010, 100000011, ...) is output from the code string output buffer to the decoder at the far end (steps 101 to 106). The same encoding operation is also carried out with the communication-dedicated dictionary 12-2, the voice-dedicated dictionary 12-3 (FIG. 10), and the image-dedicated dictionary 12-4 (FIG. 11) until a dictionary is selected.
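The encoding walkthrough above can be sketched as a textbook incremental-parsing LZW encoder. This is a minimal illustration rather than the patented implementation: the base alphabet `BASE_SYMBOLS` is hypothetical (the text only states that 259 basic entries numbered 0 to 258 exist and that a, b, c map to 0, 1, 2), the control words are omitted, and the input is assumed to contain only base symbols.

```python
import string

# Hypothetical base alphabet; only the mapping a=0, b=1, c=2 and the total of
# 259 basic entries (node numbers 0 to 258) come from the description.
BASE_SYMBOLS = string.ascii_lowercase + string.digits
FIRST_FREE_NODE = 259  # first node number available for learned strings


def lzw_encode(text: str) -> list[int]:
    """Return the node numbers emitted by textbook incremental-parsing LZW."""
    dictionary = {ch: i for i, ch in enumerate(BASE_SYMBOLS)}
    next_node = FIRST_FREE_NODE
    codes = []
    current = ""
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                 # keep extending the match
        else:
            codes.append(dictionary[current])   # emit the longest match
            dictionary[candidate] = next_node   # learn the unmatched string
            next_node += 1
            current = ch                        # restart from the new character
    if current:
        codes.append(dictionary[current])       # flush the final match
    return codes


codes = lzw_encode("abcabcabcaabbcc")
print([format(code, "09b") for code in codes[:4]])
# ['000000000', '000000001', '000000010', '100000011']
```

The first four 9-bit code words agree with the code word string quoted in the walkthrough, and ab and bc receive node numbers 259 and 260 as described.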

[0018] Next, decoding is described with reference to the LZW decoder of the present invention in FIG. 2, using the decoder's learning dictionary 22-1 in the same way as for encoding. The LZW decoding unit 21 decodes the compressed data according to the decoding procedure of FIG. 16. The operation is described below following this procedure.

[0019] First, as in encoding, the LZW decoding unit 21 registers the 259 basic registered characters 83 (a, b, c, d, ..., 0, 1, 2, ...), numbered 0 to 258, as shown in the communication-dedicated dictionary of FIG. 12 (step 200). The difference from the encoding dictionary is that each entry is assigned a node number 81 corresponding to the code word 80 to be decoded, together with the character or character string 82 corresponding to that node number.

[0020] Assume that the code string (000000000, 000000001, 000000010, 100000011, ...) produced in the encoding walkthrough above is held in the code string input buffer 20.

[0021] First, the code word (000000000) is read from the code string input buffer 20 (step 201). The code word (000000000) is converted to node number (0) (step 202), and the learning dictionary 22-1 (FIG. 8) is searched for it (step 203). Because node number (0) is registered, the character a corresponding to the matched node number (0) is output to the voice dictionary character string output buffer 24 (step 204).
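The conversion between code words and node numbers in this step amounts to reading the code word as a binary number. The sketch below assumes a fixed width of 9 bits, inferred from the code words quoted in the description; the helper names are hypothetical.

```python
CODE_WORD_BITS = 9  # width inferred from the code words quoted in the text


def node_to_code_word(node: int) -> str:
    """Format a node number as a fixed-width binary code word."""
    return format(node, f"0{CODE_WORD_BITS}b")


def code_word_to_node(word: str) -> int:
    """Interpret a binary code word as a node number."""
    return int(word, 2)


print(code_word_to_node("000000000"))  # 0
print(node_to_code_word(259))          # 100000011
```

A 9-bit code word can address node numbers 0 to 511, which leaves room for the 259 basic entries plus the pre-registered and learned strings used in the examples.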

[0022] Next, a check is made as to whether the character a is a new string (step 205). Because a is a basic registered character 98, registration of a new string is skipped and the pointer advances to the next code word (206, 207). The next code word (000000001) is read and the dictionary is searched as before. The code word (000000001) corresponds to node number (1) and matches the basic registered character b, so b is output to the learning dictionary character string output buffer 24 (steps 201 to 204). Because the string ab decoded so far is not registered in the dictionary, the LZW decoding unit 21 assigns it the code word (100000011) and node number (259) and registers it in the dictionary as a new string 99 (steps 205, 206).

[0023] The next code word (000000010) corresponds to node number (2), the character c, and the following code word (100000011) corresponds to node number (259), the string ab, so each is output to the learning dictionary character string output buffer 24. Because the string ca, formed from c and the a of the following string ab, is not registered in the dictionary, it is assigned the code word (100000100) and node number 260 and registered in the dictionary as a new string 99 (steps 201 to 207). In this way, decoding of the code words (abcabcabcaabbcc...) and construction of the decoding learning dictionary 22-1 continue to the end of the compressed file.

[0024] As in the description of encoding, the same decoding operation is also carried out with the communication-dedicated dictionary 22-2, the voice-dedicated dictionary 22-3, and the image-dedicated dictionary 22-4 of the LZW decoder until a dictionary is selected. If a code word is not registered in the decoding dictionary, exception handling is performed (steps 203, 209); it is outside the scope of the claims and is not described here. Character strings are encoded and decoded by the LZW method in the manner described above.
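For comparison with the decoding walkthrough, here is a minimal sketch of a textbook LZW decoder. It is not taken from the patent: the base alphabet is hypothetical, new entries are numbered as the textbook algorithm numbers them (which need not coincide with the node numbers quoted in the walkthrough), and the branch for a code word that is not yet registered is the standard LZW special case, shown only because the description leaves that exception handling out.

```python
import string

# Hypothetical base alphabet; only a=0, b=1, c=2 and the 259 basic entries
# (node numbers 0 to 258) come from the description.
BASE_SYMBOLS = string.ascii_lowercase + string.digits
FIRST_FREE_NODE = 259


def lzw_decode(codes: list[int]) -> str:
    """Rebuild the text and the dictionary from a list of node numbers."""
    if not codes:
        return ""
    table = {i: ch for i, ch in enumerate(BASE_SYMBOLS)}
    next_node = FIRST_FREE_NODE
    previous = table[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_node:
            # Standard special case: the code refers to the entry that is
            # about to be created (previous string plus its own first char).
            entry = previous + previous[0]
        else:
            raise ValueError(f"unknown node number {code}")
        output.append(entry)
        table[next_node] = previous + entry[0]  # learn previous + first char
        next_node += 1
        previous = entry
    return "".join(output)


print(lzw_decode([0, 1, 2, 259, 261, 260, 0, 259, 260, 2]))
# abcabcabcaabbcc
```

The decoder rebuilds the same dictionary as the encoder without receiving it, which is why only the code words (and, in this invention, the dictionary selection data) need to be transmitted.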

[0025] Next, the encoding and decoding operations of the present invention when a dedicated dictionary is used are described. The encoding operation is described first, using the communication-dedicated dictionary and following the flowchart of FIG. 17 (encoding procedure with a dedicated dictionary). FIG. 9 shows the communication-dedicated dictionary 12-2 with communication-field terms registered in advance; it consists of node numbers 50, registered strings 51, and appearance counters 52.

[0026] First, before encoding, the LZW encoding unit 11 initializes the dictionary search counter, the appearance counters, and the character pointer (here, clears them to 0) (step 301). Next, the basic registered characters 53 and the dedicated registered strings 54 for each field are registered (step 302). The character string input buffer 10 of the LZW encoder 11 holds the series of data to be compressed shown in FIG. 3, divided into blocks from the first block to the (N+2)th block. The block size is arbitrary; here it is taken to be 10 Kbytes. As in the description of encoding with the learning dictionary 12-1, the first block is assumed to hold the data frame (abcabcabcaabbcc...) 42 shown in FIG. 4 as the data to be compressed.

[0027] Next, the dictionary search count value is checked (step 303). This value (here 1 Kbyte, T0) fixes the range of data that is encoded in order to obtain the dedicated-dictionary selection value, which decides which dedicated dictionary is used for encoding. Because initialization has just been performed, processing proceeds to read the first character a from the data frame 42 (step 304). The communication-dedicated dictionary 12-2 is then searched for the character a (step 305). Because a is registered as a basic registered character 53, its appearance counter is incremented by 1 (step 306) and the pointer advances to the next character (step 307). The next character b is read, and the communication-dedicated dictionary 22-2 is searched for the string ab. Because ab is registered in the communication-dedicated dictionary as a dedicated string of the communication field, the appearance counter of ab is incremented and the pointer advances to the next character (steps 304 to 307). The next character c is read, and the communication-dedicated dictionary 12-2 is searched for the string abc. Because abc is registered, its appearance counter is incremented and the pointer advances again (steps 304 to 307). The character c at the pointer is read, and the communication-dedicated dictionary 12-2 is searched for the string abca. Because abca is not registered, it is assigned node number (266) and registered in the communication-dedicated dictionary 12-2 (steps 304 to 308). The code word of the last matched string, abc, is then output to the communication-dedicated code string output buffer 15 (step 309).
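The procedure above can be sketched as LZW encoding with a pre-seeded dictionary and an appearance counter that is bumped on every successful search. This is an illustration under stated assumptions, not the patented implementation: the base alphabet and the seven pre-registered strings in `DEDICATED_STRINGS` are hypothetical (the contents of FIG. 9 are not reproduced in the text), the input is assumed to contain only base symbols, and counting the search that restarts after a mismatch is an assumption.

```python
import string
from collections import Counter

BASE_SYMBOLS = string.ascii_lowercase + string.digits  # hypothetical alphabet
FIRST_DEDICATED_NODE = 259
# Hypothetical pre-registered strings occupying node numbers 259 to 265.
DEDICATED_STRINGS = ["ab", "abc", "bc", "ca", "cab", "abcc", "cc"]


def probe_encode(text: str):
    """Encode text with a pre-seeded dictionary and count successful searches.

    Returns (codes, appearance counters, final dictionary).
    """
    dictionary = {ch: i for i, ch in enumerate(BASE_SYMBOLS)}
    for offset, entry in enumerate(DEDICATED_STRINGS):
        dictionary[entry] = FIRST_DEDICATED_NODE + offset
    next_node = FIRST_DEDICATED_NODE + len(DEDICATED_STRINGS)  # 266
    appearance = Counter()
    codes = []
    current = ""
    for ch in text:
        candidate = current + ch
        if candidate not in dictionary:
            codes.append(dictionary[current])   # emit the last matched string
            dictionary[candidate] = next_node   # learn the unmatched string
            next_node += 1
            candidate = ch                      # restart the search at ch
        appearance[candidate] += 1              # count the successful search
        current = candidate
    if current:
        codes.append(dictionary[current])
    return codes, appearance, dictionary


codes, appearance, dictionary = probe_encode("abca")
print(dictionary["abca"], appearance["ab"], appearance["abc"])  # 266 1 1
```

With seven pre-registered strings, the first learned string abca receives node number 266, as in the walkthrough, and the counters for ab and abc each record one appearance.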

[0028] The encoding of strings described above (steps 304 to 309) proceeds in the same way as encoding with the learning dictionary 12-1 and continues until the dictionary search count value reaches 1 Kbyte (step 303). Until the count reaches the 1-Kbyte value T0, the learning dictionary 12-1, the communication-dedicated dictionary 12-2, the voice-dedicated dictionary 12-3, and the image-dedicated dictionary 12-4 all register new strings in the same way, and the number of times each registered string appears in the data to be compressed is recorded in the appearance counter 52 of each dictionary. When the dictionary search count value reaches 1 Kbyte (step 303), the LZW encoding unit 11 computes the communication-dedicated dictionary hit rate (step 311). As shown in FIG. 14 (dictionary selection values and threshold in the embodiment of the present invention), the calculation is: communication-dedicated dictionary hit rate = (sum of the appearance counts of the dedicated registered strings) ÷ (number of dedicated dictionary nodes). For the encoding communication-dedicated dictionary of FIG. 9, the sum of the appearance counts is 2 + 3 + 6 + 8 + 7 + 5 + 4 = 35 and the number of dedicated dictionary nodes is 7, so the hit rate is 5.0. The values obtained in the same way for the voice-dedicated dictionary 12-3 and the image-dedicated dictionary 12-4 are shown in FIG. 14: 0.0 for the voice-dedicated dictionary 12-3 and 1.7 for the image-dedicated dictionary 12-4.
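The hit-rate formula can be checked with a few lines of code. The function name `hit_rate` is hypothetical; the appearance counts are the ones quoted for the communication-dedicated dictionary of FIG. 9.

```python
def hit_rate(dedicated_counts: list[int]) -> float:
    """Sum of appearance counts divided by the number of dedicated entries."""
    if not dedicated_counts:
        return 0.0  # assumption: an empty dedicated dictionary scores zero
    return sum(dedicated_counts) / len(dedicated_counts)


communication_counts = [2, 3, 6, 8, 7, 5, 4]  # counts quoted for FIG. 9
print(hit_rate(communication_counts))  # 5.0
```

Because the sum is divided by the number of dedicated entries, the score is an average number of appearances per pre-registered string, so dictionaries of different sizes can be compared on the same scale.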

[0029] Next, the field to which the data being encoded is best suited is determined from the hit rates. Here, the dictionary with the highest hit rate at or above the dictionary selection threshold of 3.0 is the communication-dedicated dictionary, so the LZW encoding unit 11 outputs to the dictionary selector 13 a selection signal 18 that selects the communication-dedicated dictionary 12-2 for encoding the remaining 9 Kbytes of data. It also outputs the code word (100000000) corresponding to the communication-dedicated dictionary 12-2, as shown in FIG. 13 (dictionary selection data and corresponding dictionary table), to the communication-dedicated code string output buffer 15. On receiving the selection signal 18, the dictionary selector 13 fixes the communication-dedicated dictionary 12-2 as the encoding dictionary until encoding of the current block is finished and does not switch to another dedicated dictionary (step 314). When the first block has been encoded, the appearance counters and the dictionary search pointer of each dedicated dictionary are initialized in preparation for computing the hit rates in the next block, the selection signal 18 to the dictionary selector 13 is reset, and processing ends (steps 315, 316).
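The selection rule can be sketched as picking the dedicated dictionary with the highest hit rate, provided that rate reaches the threshold. The hit rates and the threshold of 3.0 are the values quoted above; the function name, the dictionary labels used as keys, and the fallback to the learning dictionary when no rate reaches the threshold are assumptions made for illustration.

```python
LEARNING_DICTIONARY = "learning dictionary 12-1"


def select_dictionary(hit_rates: dict[str, float], threshold: float = 3.0) -> str:
    """Return the dedicated dictionary with the highest qualifying hit rate.

    Falls back to the learning dictionary when no dedicated dictionary
    reaches the threshold (assumed behaviour). Ties go to the first entry.
    """
    if not hit_rates:
        return LEARNING_DICTIONARY
    best = max(hit_rates, key=hit_rates.get)
    return best if hit_rates[best] >= threshold else LEARNING_DICTIONARY


rates = {
    "communication-dedicated dictionary 12-2": 5.0,
    "voice-dedicated dictionary 12-3": 0.0,
    "image-dedicated dictionary 12-4": 1.7,
}
print(select_dictionary(rates))  # communication-dedicated dictionary 12-2
```

In the scheme described here the choice is made once per block, after the 1-Kbyte learning portion, and stays fixed until the block ends.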

[0030] As described above, the data to be compressed in the first block is encoded, using the dedicated strings registered in advance in the dedicated dictionary, at a higher compression ratio than ordinary LZW encoding, assembled into the compressed data frame of FIG. 5, and output from the communication dictionary code string output buffer 15 to the decoder at the far end. If no hit rate reaches the threshold of 3.0, compression with any of the dedicated dictionaries is assumed to have no benefit, and the learning dictionary 12-1 ...

[0031] Next, the decoding operation of the present invention is described using the communication-dedicated dictionary and following the procedure of FIG. 18 (decoding procedure with a dedicated dictionary). FIG. 12 shows the contents of the communication-dedicated dictionary with communication-field terms registered in advance; it consists of code words 80, node numbers 81, and registered strings 82.

[0032] The LZW decoding unit 21 resets the dictionary selection signal 28 to the dictionary selector 23 and clears the node number counter (step 400). Next, the basic registered characters 83 and the dedicated registered strings 84 for the communication field are registered (step 401). Because the dictionary selection data 40 is not sent until 1 Kbyte of the data to be compressed has been compressed, no decoding dictionary is selected at this point (steps 402, 403). Until the dictionary selection data 40 arrives from the LZW encoder, decoding is performed with all of the dictionaries, namely the learning dictionary 22-1, the communication-dedicated dictionary 22-2, the voice-dedicated dictionary 22-3, and the image-dedicated dictionary 22-4, and the decoded strings are stored in the learning dictionary character string output buffer 24, the communication dictionary character string output buffer 25, the voice dictionary character string output buffer 26, and the image dictionary character string output buffer 28, respectively.

[0033] When the dictionary selection data 40 arrives, the LZW decoding unit 21 determines, from the table of dictionary selection data and corresponding dictionaries shown in FIG. 13, which encoding dictionary was used for the compressed data that follows the dictionary selection data 40 (step 402), and outputs a dictionary selection signal 28 to the dictionary selector 23 of the LZW decoder to set the decoding dictionary. The dictionary selector 23 selects the indicated dedicated dictionary from among the decoding dictionaries (step 403). Here the code word (100000000) has been received, so the communication-dedicated dictionary 22-2 is selected and remains fixed until the next dictionary selection data 40 arrives.

[0034] When the compressed data following the dictionary selection data 40 is read (step 404), it is converted into the corresponding contact number (step 405), and the character string corresponding to that contact number is looked up in the communication-dedicated dictionary 22-2 (step 406). Suppose the code words (..., 100001000, 000000100, ...) follow the dictionary selection data 40. First, the code word (100001000) is read, and the communication-dedicated dictionary 22-2 is searched for the contact number (264). The character string abcc corresponding to contact number 264 is registered in the communication-dedicated dictionary, so this string is output to the character string output buffer (steps 404 to 407). The decoder then checks whether fabcc, the combination with the character decoded immediately before the dictionary selection data (assumed here to be f), is registered in the communication-dedicated dictionary 22-2 (step 408). If it is not registered, it is registered in the communication-dedicated dictionary 22-2 as a newly registered character 85 with the code word (100101011) and the contact number (299) (steps 408 and 409). If it is already registered, nothing is done. Next, the code word now pointed to (000000100) is read, and the character e corresponding to the contact number (4) is looked up in the same way. Because e is registered as a basic registered character, it is output to the communication-dedicated dictionary character string output buffer 25, and the dictionary is searched for abcce, the combination with the previously decoded string. If it is not registered, it is likewise registered in the communication-dedicated dictionary 22-2 as a newly registered character 85. The subsequent code words are decoded in the same way (steps 402 to 409), and decoding ends when the end of the compressed data file is reached (step 411). If a code word is not registered in the decoding dictionary, exception processing (steps 406 and 412) is performed; because it is outside the scope of the claims, its description is omitted here.
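The decoding walk-through for the two code words 100001000 and 000000100 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `decode_step`, the use of a Python dict keyed by contact number, and the assumption that the entry after 299 receives the number 300 are all hypothetical. The dictionary holds only the two entries that the text names.

```python
def decode_step(code_word, previous, dictionary, output, next_number):
    """Decode one 9-bit code word and update the dictionary (steps 404-409)."""
    number = int(code_word, 2)      # step 405: code word -> contact number
    string = dictionary[number]     # step 406: a KeyError here would be the exception path
    output.append(string)           # step 407: emit the decoded string
    candidate = previous + string   # step 408: previous decoded string + current string
    if candidate not in dictionary.values():
        dictionary[next_number] = candidate  # step 409: register as a new entry
        next_number += 1            # assumption: new numbers are assigned sequentially
    return string, next_number


# Only the entries named in the text; a real dictionary would hold many more.
dictionary = {4: "e", 264: "abcc"}
output = []
previous, next_number = "f", 299    # "f" is the character decoded just before
for code_word in ("100001000", "000000100"):
    previous, next_number = decode_step(
        code_word, previous, dictionary, output, next_number
    )

print(output)           # ['abcc', 'e']
print(dictionary[299])  # fabcc
```

Note that, following the text, the new entry is the whole previous string joined with the whole current string, which differs from textbook LZW, where only the first character of the current string is appended.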

[0035] A second embodiment of the present invention will be described following FIG. 19, which shows the encoding procedure using a dedicated dictionary in the second embodiment. The encoder and decoder of the second embodiment have the same configuration as those of the first embodiment. FIG. 9 shows the contents of the communication-dedicated dictionary 12-2, in which terms from the communication field are registered in advance; each entry consists of a contact number 50, a registered character string 51, and an occurrence counter 52.

[0036] First, before encoding, the LZW encoding unit 11 initializes the dictionary search counter, the occurrence counters, and the character pointer (here, clearing them to 0) (step 500). Next, the basic registered characters 53 and the dedicated registered characters 54 for each field are registered (step 501). The character string input buffer 10 of the LZW encoder 11 holds the series of data to be compressed shown in FIG. 6. As in the description of encoding with the learning dictionary 12-1, the data frame to be compressed 42 shown in FIG. 4 (abcabcabcaabbcc...) is assumed to be held as the data to be compressed.

[0037] Next, the encoder checks whether a dictionary selection flag is attached to the data frame to be compressed, or whether the dictionary selection flag mode has already been set (step 502). In this case a dictionary selection flag is attached, so the dictionary selection flag mode is set and a dictionary is selected according to the table of flags and corresponding dictionaries shown in FIG. 20. Here it is assumed that the dictionary selection flag (00), which selects the communication-dedicated dictionary 12-2, has been specified. If neither the dictionary selection flag nor the dictionary selection flag mode is set, the encoder checks the dictionary search count value (here, 1 Kbyte, T0) (step 504). This value determines the encoding range over which the dedicated dictionary selection value is computed, and that value in turn decides which dedicated dictionary is used for encoding. The encoding process in that case is the same as in the first embodiment, so its description is omitted here.

[0038] Once the dictionary selection flag mode is entered, the first character a is read from the data frame to be compressed 42 (step 505). The communication-dedicated dictionary 12-2 is then searched for the character a (step 506). Because a is registered as a basic registered character 53, its occurrence counter is incremented by 1 (step 507) and the pointer is advanced to the next character (step 508). The next character b is read, and the communication-dedicated dictionary 12-2 is searched for the character string ab. In the communication-dedicated dictionary, ab is registered as a character string specific to the communication field, so the occurrence counter of ab is incremented and the pointer is advanced to the next character (steps 505 to 508). The next character c is read, and the communication-dedicated dictionary 12-2 is searched for the character string abc. Because abc is registered, the occurrence counter of abc is incremented and the pointer is advanced again (steps 505 to 508). The next character a is read, and the communication-dedicated dictionary 12-2 is searched for the character string abca. Because abca is not registered in the dictionary, it is assigned the contact number (266) and registered in the communication-dedicated dictionary 12-2 (steps 505 to 509). The code word of the last matched character string, abc, is then output to the communication-dedicated dictionary code string output buffer 15 (step 510). The decoding operation in the second embodiment is the same as the decoding operation using the dedicated dictionary in the first embodiment, so its description is omitted here.
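The encoding walk-through for the first phrase of the frame abcabcabcaabbcc... can be sketched as below. This is an illustrative sketch under stated assumptions, not the patented implementation: the function name `encode_first_phrase`, the dict-based dictionary, the 9-bit code word width (taken from the decoding example), and the contact numbers given to a, b, c, ab, and abc are hypothetical placeholders. Only the number 266 for abca comes from the text.

```python
def encode_first_phrase(data, numbers, counts, next_number):
    """Follow steps 505-510 until the first unregistered string is found."""
    current = ""
    for ch in data:
        candidate = current + ch               # step 505: read the pointed character
        if candidate in numbers:               # step 506: search the dictionary
            counts[candidate] = counts.get(candidate, 0) + 1  # step 507
            current = candidate                # step 508: advance to the next character
        else:
            numbers[candidate] = next_number   # step 509: register the new string
            # step 510: output the code word of the last matched string
            return format(numbers[current], "09b"), candidate
    return format(numbers[current], "09b"), None


# Placeholder contact numbers (assumptions), except 266, which the text gives.
numbers = {"a": 0, "b": 1, "c": 2, "ab": 260, "abc": 261}
counts = {}
code_word, registered = encode_first_phrase("abcabcabcaabbcc", numbers, counts, 266)

print(registered, numbers[registered])  # abca 266
print(counts)                           # {'a': 1, 'ab': 1, 'abc': 1}
```

The sketch assumes every single character is pre-registered as a basic registered character, so the first lookup always succeeds.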

[0039] As described above, attaching a dictionary selection flag, which specifies the dedicated dictionary to use for encoding, to the data to be compressed in advance removes the need to compute and select a dedicated dictionary suited to that data, and so reduces the encoding workload when a dedicated dictionary is used.

【００４０】[0040]

[Effects of the Invention] As described above, the LZW encoding method of the present invention uses the same dictionary structure as the conventional method, as shown in FIG. 3 (encoding range of the dictionary selection data). The conventional method, however, encodes all of the data to be compressed with all of the dictionaries. The present invention instead performs compression learning on a fixed amount of the data to be compressed using all of the dictionaries, then selects and fixes the dedicated dictionary of the field closest to the content of that data, and compresses the remaining data with it. This eliminates the conventional encoding of all of the data with all of the dedicated dictionaries and speeds up the encoding process. Let Ta be the encoding time of the conventional method in FIG. 3, and let Ti be the encoding time of block i in the present invention, where the block number i runs over 0, 1, 2, 3, ..., n+2. The reduction in encoding time is then given by the following expression.

【数１】 (Equation 1)
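As a rough illustration of where the saving comes from, the sketch below counts block encodings under a simplified model. It is not Equation 1: the function name `block_encodings`, the equal cost per block, and the single learning phase with no later re-selection are assumptions made for this example only. The dictionary count of four (one learning dictionary plus three dedicated dictionaries) follows the embodiment.

```python
def block_encodings(total_blocks, learning_blocks, dictionaries):
    """Count (block, dictionary) encodings under a simplified equal-cost model."""
    # Conventional method: every block is encoded with every dictionary.
    conventional = total_blocks * dictionaries
    # Proposed method: only the learning blocks use every dictionary;
    # the remaining blocks use the single selected dedicated dictionary.
    proposed = learning_blocks * dictionaries + (total_blocks - learning_blocks)
    return conventional, proposed


conventional, proposed = block_encodings(
    total_blocks=100, learning_blocks=1, dictionaries=4
)
print(conventional, proposed)  # 400 103
```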

[Brief description of the drawings]

FIG. 1 is a block diagram showing an example of the LZW encoder of the present invention.

FIG. 2 is a block diagram showing an example of the LZW decoder of the present invention.

FIG. 3 is a diagram showing the encoding range of the dictionary selection data.

FIG. 4 is a diagram showing the data frame to be compressed in the first embodiment.

FIG. 5 is a diagram showing a compressed data frame.

FIG. 6 is a diagram showing the data frame to be compressed in the second embodiment.

FIG. 7 is a diagram showing the learning dictionary used for encoding.

FIG. 8 is a diagram showing the learning dictionary used for decoding.

FIG. 9 is a diagram showing the communication-dedicated dictionary used for encoding.

FIG. 10 is a diagram showing the audio-dedicated dictionary used for encoding.

FIG. 11 is a diagram showing the image-dedicated dictionary used for encoding.

FIG. 12 is a diagram showing the communication-dedicated dictionary of the LZW decoder.

FIG. 13 is a table showing the dictionary selection data and the corresponding dictionaries.

FIG. 14 shows the expression and table for the dictionary selection values and thresholds in the embodiments of the present invention.

FIG. 15 is a flowchart showing the encoding procedure using the learning dictionary.

FIG. 16 is a flowchart showing the decoding procedure using the learning dictionary.

FIG. 17 is a flowchart showing the encoding procedure using a dedicated dictionary.

FIG. 18 is a flowchart showing the decoding procedure using a dedicated dictionary.

FIG. 19 is a flowchart showing the encoding procedure using a dedicated dictionary in the second embodiment.

FIG. 20 is a table showing the dictionary selection flags and the corresponding dictionaries in the second embodiment.

FIG. 21 is a block diagram of a conventional encoding system.

[Explanation of symbols]

10 Character string input buffer
11 LZW encoding unit
12 Encoding dictionary
12-1 Learning dictionary
12-2 Communication-dedicated dictionary
12-3 Audio-dedicated dictionary
12-4 Image-dedicated dictionary
13 Dictionary selector
14 Learning dictionary code string output buffer
15 Communication-dedicated dictionary code string output buffer
16 Audio-dedicated dictionary code string output buffer
17 Image-dedicated dictionary code string output buffer
18 Selection signal
20 Code string input buffer
21 LZW decoding unit
22 Decoding dictionary
22-1 Learning dictionary
22-2 Communication-dedicated dictionary
22-3 Audio-dedicated dictionary
22-4 Image-dedicated dictionary
23 Dictionary selector
24 Learning dictionary character string output buffer
25 Communication-dedicated dictionary character string output buffer
26 Audio-dedicated dictionary character string output buffer
27 Image-dedicated dictionary character string output buffer
28 Selection signal

Continuation of front page (58) Fields searched (Int. Cl.⁷, DB name): H03M 7/40 - 7/42

Claims

(57) [Claims]

1. A data compression method based on LZW coding, wherein a learning dictionary and one or more dedicated dictionaries are provided; a portion of the data to be compressed is subjected to compression learning using all of the dictionaries; the dedicated dictionary of the field closest to the content of the data to be compressed is then selected and fixed; and the remaining data to be compressed is compressed using the selected and fixed dedicated dictionary.

2. The data compression method according to claim 1, comprising: a character string input buffer that stores data to be compressed; an encoding dictionary including a learning dictionary and one or more dedicated dictionaries; a code string output buffer corresponding to each dictionary; a dictionary selector; and an LZW encoding unit having a function of encoding the data to be compressed from the input buffer using each of the dictionaries and outputting the encoded data to the output buffers, and a function of selecting, from the learning dictionary and the dedicated dictionaries, which store data indicating how many times registered characters have appeared in the data to be compressed, the dictionary best suited to encoding the data to be compressed and outputting a selection signal to the dictionary selector.

3. The data compression method according to claim 2, wherein the LZW encoding unit has a function of encoding data to be compressed to which a dictionary selection flag, indicating in advance which field the data belongs to, has been attached.

4. The data compression method according to claim 1, comprising: a code string input buffer that stores compressed data; a decoding dictionary including a learning dictionary and one or more dedicated dictionaries; a character string output buffer corresponding to each dictionary; a dictionary selector; and an LZW decoding unit having a function of decoding the compressed data from the input buffer using each of the dictionaries and outputting the decoded data to the output buffers, and a function of selecting, from the learning dictionary and the dedicated dictionaries, which store the registered characters learned in each dictionary and data indicating how many times those registered characters have appeared, the dictionary best suited to decoding the compressed data and outputting a selection signal to the dictionary selector.

5. A data compression method comprising: a character string input buffer that stores data to be compressed; an encoding dictionary comprising a learning dictionary, a communication-dedicated dictionary suited to the communication field, an audio-dedicated dictionary suited to the audio field, and an image-dedicated dictionary suited to the image field; a learning dictionary code string output buffer, a communication-dedicated dictionary code string output buffer, an audio-dedicated dictionary code string output buffer, and an image-dedicated dictionary code string output buffer; a dictionary selector; and an LZW encoding unit having a function of encoding the data to be compressed from the input buffer by the LZW method using each of the dictionaries and outputting the encoded data to the output buffers, and a function of selecting, from the learning dictionary and the dedicated dictionaries, which store the registered characters learned in each dictionary and data indicating how many times those registered characters have appeared, the dictionary best suited to encoding the data to be compressed and outputting a selection signal to the dictionary selector.

6. The data compression method according to claim 5, comprising an LZW encoding unit having a function of encoding data to be compressed to which a dictionary selection flag, indicating in advance which field the data belongs to, has been attached.