JP4093200B2

JP4093200B2 - Data compression method and program, and data restoration method and apparatus

Info

Publication number: JP4093200B2
Application number: JP2004092980A
Authority: JP
Inventors: 國明植木
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 2004-03-26
Filing date: 2004-03-26
Publication date: 2008-06-04
Anticipated expiration: 2024-03-26
Also published as: JP2005286371A

Description

The present invention relates to a data compression method and related techniques that compress data by encoding it based on matches with already-encoded symbol strings, and in particular to a data compression method and related techniques that improve the compression ratio while keeping compression and restoration fast.

In recent years, advances in information processing technology and the spread of networks such as the Internet have led to enormous volumes of data being processed and transmitted. For example, it is now common for a host computer to have a printer execute printing over a network. Although printer processing speeds continue to improve, when the print data sent from the host computer to the printer is large, the transfer takes time and the printer's processing speed cannot be fully exploited. Techniques for compressing data before transmission and restoring it after reception are therefore needed.

Several such compression and restoration techniques have been proposed, differing according to purpose and target data. One of them is a method using the LZ77 (Ziv and Lempel, 1977) code. In this method, the symbol strings before and after the symbol (data) currently being encoded are held in a buffer, the buffer is consulted as a dictionary, and the dictionary is searched for the longest symbol string that matches the symbol string about to be encoded. The target symbol string is then encoded using the length and position of the symbol string that was found.
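The longest-match search described above can be sketched as follows. This is a minimal brute-force illustration, not the method of any cited document; the function name `longest_match`, the convention that the position is a distance counted back from the end of the dictionary, and the restriction that a match may not run past the end of the dictionary are all assumptions made for the example.

```python
def longest_match(dictionary: bytes, lookahead: bytes) -> tuple[int, int]:
    """Return (distance, length) of the longest prefix of `lookahead`
    found in `dictionary`; (0, 0) when nothing matches.

    `distance` is counted back from the end of the dictionary
    (an assumed convention for this sketch).
    """
    best_distance, best_length = 0, 0
    for start in range(len(dictionary)):
        length = 0
        # Extend the match while symbols agree and both buffers have data.
        while (length < len(lookahead)
               and start + length < len(dictionary)
               and dictionary[start + length] == lookahead[length]):
            length += 1
        # Keep only strictly longer matches, so the earliest one wins ties.
        if length > best_length:
            best_distance, best_length = len(dictionary) - start, length
    return best_distance, best_length
```

A practical encoder would replace the quadratic scan with a hash chain or similar index; the scan is kept here only because it makes the definition of "longest match" explicit.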

Many variations of the LZ77 scheme have been developed. For example, the method using the LZSS code described in Non-Patent Document 1 below is now widely used. In this method, when the length of the longest matching symbol string is smaller than a predetermined value, the original symbol itself is used as the code, that is, the match is not encoded by length and position; when the length is larger than the predetermined value, the match is encoded by its length and position. A flag distinguishing the two cases is prepended to the code.
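As an illustration of the flag-prefixed LZSS code just described, the sketch below emits either a literal or a fixed-width (position, length) pair. The flag polarity, the threshold of 3, the 14-bit and 8-bit field widths, the storing of the length minus the threshold, and the treatment of a length exactly equal to the threshold as a match are assumptions chosen for the example, not values taken from Non-Patent Document 1.

```python
def lzss_token(symbol: int, distance: int, length: int,
               min_match: int = 3) -> str:
    """Return one LZSS-style code as a string of '0'/'1' characters.

    Flag '0' + 8-bit symbol for a literal; flag '1' + fixed-width
    position and length fields for a match (widths are assumed).
    """
    if length < min_match:
        # Match too short: send the original symbol unchanged.
        return "0" + format(symbol, "08b")
    # Match long enough: send fixed-length position and length fields.
    return "1" + format(distance, "014b") + format(length - min_match, "08b")
```

Every match costs 23 bits under these assumed widths regardless of how small the position or length is, which is the inefficiency of fixed-length fields.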

Several proposals have also been made for LZ77-type data compression methods, with aims such as improving the compression ratio (for example, Patent Documents 1 and 2 below).
Patent Document 1: Japanese Patent Laid-Open No. 5-11973
Patent Document 2: Japanese Patent Laid-Open No. 7-261977
Non-Patent Document 1: Tomohiko Uematsu, "Introduction to Document Data Compression Algorithms", 2nd edition, CQ Publishing, June 1995, pp. 145-148

However, the LZSS code described above expresses the position of the longest matching symbol string (hereinafter, the match position) and its length (hereinafter, the match length) as fixed-length codes (bit strings), so its compression ratio is not very good.

Patent Document 1 describes a method that attempts to improve the compression ratio by assigning an occurrence number to each matching symbol string in the dictionary and outputting that occurrence number as the match position. With this method, however, the occurrence position of the matching symbol string must also be found by searching when the compressed data is decoded, so restoration takes time.

Patent Document 2 discloses a technique that improves the compression ratio by expressing the match position and match length with codes that change dynamically by means of a splay tree. Because the codes are changed dynamically, however, both compression and restoration are complex and time-consuming.

An object of the present invention is therefore to provide a data compression method and related techniques, based on the LZ77 code, that improve the compression ratio and do not require much time for compression and restoration.

To achieve the above object, one aspect of the present invention is a data compression method that searches the already-encoded symbol strings for the longest matching sequence, that is, the sequence that matches the symbol string to be encoded over the greatest length, and compresses data by encoding based on the match position (the location of the longest matching sequence) and the match length (the length of that sequence). The method is characterized in that the code generated by the encoding consists of a first code containing information on whether the value of the match position and/or the match length is the same as the value at the previous encoding, and a second code representing the value of any match position and/or match length that is not the same as the value at the previous encoding. According to the present invention, match position and/or match length information that is the same as the previous value is not added to the code, so the compression ratio can be increased.

In a preferred embodiment of the above invention, the second code is a variable-length code and the first code contains information on the length of the second code. This makes it possible to increase the compression ratio further.

In a further preferred embodiment of the above invention, the second code is the binary representation of the value representing the match position and/or the match length with its most significant 1 removed. This increases the compression ratio still further.
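Dropping the most significant 1 is lossless because, once the bit count of a positive value is known, its top bit is always 1. The sketch below shows the idea; the function names `strip_leading_one` and `restore_leading_one` are hypothetical, and the bit count returned here stands in for the length information that the text says the first code carries.

```python
def strip_leading_one(value: int) -> tuple[int, str]:
    """Return (bit_count, payload) for a positive integer.

    `payload` is the binary representation without its leading 1,
    so it is one bit shorter than `bit_count`.
    """
    if value < 1:
        raise ValueError("value must be positive")
    bits = format(value, "b")      # e.g. 5 -> '101'
    return len(bits), bits[1:]     # e.g. (3, '01')


def restore_leading_one(payload: str) -> int:
    """Rebuild the value by putting the implicit leading 1 back."""
    return int("1" + payload, 2)
```

Each transmitted value is thus one bit shorter than its plain binary form, and the value 1 needs no payload bits at all.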

To achieve the above object, another aspect of the present invention is a data compression method that searches the already-encoded symbol strings for the longest matching sequence, that is, the sequence that matches the symbol string to be encoded over the greatest length, and compresses data by encoding based on the match position (the location of the longest matching sequence) and the match length (the length of that sequence). When the length of the longest matching sequence is smaller than a predetermined value, the encoding is performed with first identification information of a predetermined length, which indicates that the length is smaller than the predetermined value and represents the original symbol. When the length of the longest matching sequence is equal to or greater than the predetermined value, the encoding is performed with (i) second identification information of the same length as the first identification information, which indicates that the length is equal to or greater than the predetermined value, whether the value of the match position and/or the match length is the same as the value at the previous encoding, and the length of the match position information and/or match length information representing any match position and/or match length that is not the same as the previous value, and (ii) that match position information and/or match length information itself.

In a preferred embodiment of the above invention, the match position information and/or match length information is the binary representation of the value representing the match position and/or the match length with its most significant 1 removed.

In a further preferred embodiment of the above invention, when the value of the match position and/or the match length is the same as the value at the previous encoding, the length of the match position information and/or match length information contained in the second identification information is set to 0.

In a further preferred embodiment of the above invention, the first identification information and/or the second identification information is Huffman-coded.

To achieve the above object, another aspect of the present invention is a data compression program that causes a computer to search the already-encoded symbol strings for the longest matching sequence, that is, the sequence that matches the symbol string to be encoded over the greatest length, and to compress data by encoding based on the match position (the location of the longest matching sequence) and the match length (the length of that sequence). The program causes the computer to execute: a step of determining whether the value of the match position and/or the match length is the same as the value at the previous encoding; a step of outputting, based on the result of that determination, a code containing information on whether the value is the same as the value at the previous encoding; and a step of outputting, after that output and when the value of the match position and/or the match length is not the same as the value at the previous encoding, a code representing the value of that match position and/or match length.

To achieve the above object, yet another aspect of the present invention is a data restoration method for restoring data that was encoded by searching the already-encoded symbol strings for the longest matching sequence, that is, the sequence that matches the symbol string to be encoded over the greatest length, and encoding based on the match position (the location of the longest matching sequence) and the match length (the length of that sequence). When the data to be restored consists of a first code containing information on whether the value of the match position and/or the match length is the same as the value at the previous encoding, and a second code representing the value of any match position and/or match length that is not the same as the value at the previous encoding, the method: determines from the first code whether the value of the match position and/or the match length is the same as the value at the previous decoding; obtains any match position and/or match length value determined to be the same from the value at the previous decoding; obtains any match position and/or match length value determined not to be the same from the second code; and decodes using the already-decoded symbol string, based on the match position and match length values so obtained.
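The restoration logic in this aspect can be sketched at the token level as follows. The bit-level layout of the first and second codes is not reproduced here; instead, the sketch assumes a hypothetical pre-parsed token stream in which an `int` is a literal symbol and a `(position, length)` tuple is a match, with `None` standing for "same as the previous value". Treating the position as a distance back from the end of the decoded output is also an assumption.

```python
def decode(tokens) -> bytes:
    """Rebuild the symbol string from a hypothetical parsed token stream."""
    out = bytearray()
    prev_position, prev_length = 0, 0   # contents of the previous-value store
    for token in tokens:
        if isinstance(token, int):
            out.append(token)           # literal: copy the symbol through
            continue
        position, length = token
        # "Same as previous" fields are taken from the stored values;
        # no search of the dictionary is needed.
        position = prev_position if position is None else position
        length = prev_length if length is None else length
        start = len(out) - position
        for i in range(length):         # symbol-by-symbol copy
            out.append(out[start + i])
        prev_position, prev_length = position, length
    return bytes(out)
```

For example, `decode([65, 66, 67, (3, 3), (None, None)])` copies "ABC" twice more, the second time without any position or length being transmitted.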

To achieve the above object, another aspect of the present invention is a data restoration apparatus for restoring data that was encoded by searching the already-encoded symbol strings for the longest matching sequence, that is, the sequence that matches the symbol string to be encoded over the greatest length, and encoding based on the match position (the location of the longest matching sequence) and the match length (the length of that sequence). The apparatus comprises: symbol string storage means for storing the already-restored symbol string; previous value storage means for storing the most recent values of the match position and the match length obtained during restoration; and decoding means. When the data to be restored consists of a first code containing information on whether the value of the match position and/or the match length is the same as the value at the previous encoding, and a second code representing the value of any match position and/or match length that is not the same as the value at the previous encoding, the decoding means: determines from the first code whether the value of the match position and/or the match length is the same as the value at the previous encoding; obtains any match position and/or match length value determined to be the same from the value stored in the previous value storage means; obtains any match position and/or match length value determined not to be the same from the second code; and decodes using the symbol string stored in the symbol string storage means, based on the match position and match length values so obtained.

Further objects and features of the present invention will become apparent from the embodiments of the invention described below.

Embodiments of the present invention are described below with reference to the drawings. These embodiments do not, however, limit the technical scope of the present invention. In the drawings, identical or similar elements are given the same reference numerals or symbols.
FIG. 1 is a configuration diagram of an embodiment of a data compression apparatus to which the present invention is applied. FIG. 2 is a configuration diagram of an embodiment of a data restoration apparatus to which the present invention is applied. The data compression apparatus 1 and the data restoration apparatus 2 shown in FIGS. 1 and 2 use the data compression method and restoration method according to the present invention. The compression method of the data compression apparatus 1 is based on the LZ77 code, but it stores the match position and match length values used in the previous encoding. When the current values are the same as the stored ones, the code records that fact instead of the match position and match length information, which keeps the code as short as possible and improves the compression ratio. The restoration method of the data restoration apparatus 2 likewise stores the match position and match length values from its previous processing. Using the information added at compression time on whether each value is the same as the previous one, it obtains the match position and match length easily and decodes the original symbol string from them, so restoration does not take much time.

As shown in FIG. 1, the data compression apparatus 1 comprises an input buffer 11, an encoding unit 12, a dictionary buffer 13, and a previous value buffer 14, and encodes input data to reduce its size. The input data before encoding is referred to here as a symbol (or symbol string). In this embodiment each symbol, for example "A" or "B" in the input buffer 11 of FIG. 1, is 8-bit data. The code produced by the processing is expressed as a binary string of "0" and "1" bits.

The input buffer 11 is a data buffer that accepts and stores, in sequence, the symbol strings to be encoded, and hands symbol strings whose encoding is finished to the dictionary buffer 13 in sequence. The length of the input buffer 11 (L2 in FIG. 1), that is, the number of symbols it can hold, is 257 in this embodiment. This length L2 is the maximum value of the match length (for example, l in FIG. 1).

The dictionary buffer 13 is a data buffer that accepts and stores, in sequence, the already-encoded symbol strings from the input buffer 11; it is what the LZ77 scheme calls the dictionary. The symbol string stored here is compared with the symbol string awaiting encoding in the input buffer 11, and is searched for the longest symbol string that matches the symbol string in the input buffer 11 (hereinafter, the longest matching sequence). Each time the encoding unit 12 finishes an encoding, the dictionary buffer 13 accepts the symbols just encoded, and the same number of symbols at its head (on the left side in FIG. 1) are discarded as no longer needed. The length of the dictionary buffer 13 (L1 in FIG. 1), that is, the number of symbols it can hold, is 16383 (16K) in this embodiment. This length L1 is the maximum value of the distance representing the match position (for example, d in FIG. 1). In the example shown in FIG. 1, the longest matching sequence is "ABCD", its match length is l, and its match position is p.
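The behavior of the dictionary buffer, which accepts newly encoded symbols at one end and discards the same number of old symbols at the other, can be sketched with a bounded double-ended queue. This is only an illustration in software; the helper name `push_encoded` is hypothetical, and the embodiment may equally be a hardware circuit.

```python
from collections import deque

L1 = 16383  # dictionary length used in this embodiment


def push_encoded(dictionary: deque, symbols: bytes) -> None:
    """Append already-encoded symbols; a deque created with `maxlen`
    silently drops the oldest entries from the other end when full."""
    dictionary.extend(symbols)


dictionary_buffer = deque(maxlen=L1)
push_encoded(dictionary_buffer, b"ABCD")
```

Because the window never holds more than L1 symbols, any distance back into it fits the 1 to 16383 range given for the match position.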

The encoding unit 12 performs overall control of symbol string input, encoding, and code output in the data compression apparatus 1, but its main task is to encode the symbol string in the input buffer 11 while referring to the dictionary buffer 13. The encoding performed by the encoding unit 12 is the distinctive part of the apparatus, and its details are described later. The encoding unit 12 may be implemented as a program describing the processing procedure together with a CPU or the like that executes it, or as a hardware circuit.

The previous value buffer 14 holds the match position and match length values from the previous encoding by the encoding unit 12 that used a match position and match length. The held values are updated each time such an encoding is performed. One feature of the data compression apparatus 1 is that the values held in the previous value buffer 14 are used in the encoding performed by the encoding unit 12.

As shown in FIG. 2, the data restoration apparatus 2 has a configuration similar to that of the data compression apparatus 1 and comprises an input buffer 21, a decoding unit 22 (decoding means), a dictionary buffer 23 (symbol string storage means), and a previous value buffer 24 (previous value storage means). The data restoration apparatus 2 restores data (codes) compressed by the data compression apparatus 1 to the original symbol string data. The input buffer 21 is a data buffer that accepts and stores, in sequence, the codes to be processed.

The dictionary buffer 23 is a data buffer that accepts and stores, in sequence, the decoded symbol strings from the decoding unit 22. The symbol string stored here is referred to and used by the decoding unit 22 during decoding. Each time the decoding unit 22 finishes a processing step, the dictionary buffer 23 accepts the decoded symbols, and the same number of symbols at its head (on the right side in FIG. 2) are discarded as no longer needed. The length (size) of the dictionary buffer 23 is the same as the length L1 of the dictionary buffer 13 of the data compression apparatus 1.

The state of the dictionary buffer 23 is the same as the state the dictionary buffer 13 was in when the code currently being processed by the decoding unit 22 was encoded in the data compression apparatus 1. In the example of FIG. 1, where the symbol string "ABCD" is encoded based on the match position p (or d) and the match length l, the dictionary buffer 23 is in the state shown in FIG. 2 when the data restoration apparatus 2 decodes that code. That is, the symbols "ABCD" are stored in that order starting at position p, at distance d from the left end of the dictionary buffer 23 (they are drawn in the reverse direction in FIG. 2). The decoding unit 22 can therefore decode "ABCD" based on the match position p (or d) and match length l obtained from the code.

The decoding unit 22 performs overall control of code input, decoding, and symbol output in the data restoration apparatus 2, but its main task is to decode the codes in the input buffer 21 while referring to the dictionary buffer 23. The details of the decoding performed by the decoding unit 22 are described later. The decoding unit 22 may be implemented as a program describing the processing procedure together with a CPU or the like that executes it, or as a hardware circuit.

The previous value buffer 24 holds the match position and match length values obtained by the decoding unit 22 the last time it processed a code that had been encoded using a match position and match length. The held values are updated each time such processing is performed. One feature of the data restoration apparatus 2 is that the values held in the previous value buffer 24 are used in the decoding performed by the decoding unit 22.

FIG. 3 is a flowchart illustrating the processing performed by the encoding unit 12 of the data compression apparatus 1. The compression processing performed by the data compression apparatus 1 is described in detail below with reference to FIG. 3. First, the encoding unit 12 initializes the match position and match length values held in the previous value buffer 14 (the values of d and l in the example of FIG. 1) (step S1). Specifically, both values may be set to 0, or each may be set to a value that appears frequently during encoding. Next, the symbol strings to be processed are read into the input buffer 11 in order (step S2). The encoding unit 12 then searches the symbol string stored in the dictionary buffer 13 for a match with the symbol string that starts at the head position of the input buffer 11 (the left end in FIG. 1) (step S3). In other words, it searches for the longest matching sequence described above.

Next, the length of the longest matching sequence found (the match length) is checked against a predetermined value (step S4). For example, this predetermined number is 3. If the match length is 2 or less (No in step S4), the encoding unit 12 does not express the symbol at the head of the input buffer 11 by the match position and match length of the longest matching sequence; instead it encodes the symbol by expressing it directly in binary, and outputs that code (step S5). Specifically, because a symbol is 8-bit data in this embodiment, the unchanged 8-bit value of the original symbol is expressed in 9 bits, and that 9-bit string is used as the code for the symbol.

FIG. 4 is a diagram for explaining the codes and related items generated by the data compression apparatus 1. FIG. 4(a) schematically shows the codes generated by the data compression apparatus 1, and the upper diagram (labeled [mismatch]) is the code output in step S5. In other words, it is the code for the case where no string of the predetermined number of symbols starting at the head of the input buffer 11 has a match in the dictionary buffer 13. As described above, this code expresses what is originally 8-bit data in 9 bits, so its most significant bit is always "0". As described later, when the match length is equal to or greater than the predetermined number in step S4, this most significant bit is always "1", so the most significant bit indicates that the predetermined number of symbols was a "mismatch". The 8 bits that follow are the original symbol itself.

The most significant bit of this code therefore identifies it as a "mismatch", that is, it signals that the bits that follow represent the symbol itself. Put another way, a representation that can take the values 0 to 511 is used to express a value from 0 to 255, and this identifies the code as a direct expression of the original symbol. The entire code for the "mismatch" case, which conveys both the fact of the "mismatch" and the original symbol, is referred to here as the identification information for the "mismatch" case (first identification information).
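The 9-bit first identification information can be sketched as below. Writing an 8-bit symbol into a 9-bit field leaves the top bit at 0, which is what lets a decoder tell a literal from a match. The function names are hypothetical, and representing bits as a Python string is a simplification for readability.

```python
def encode_literal(symbol: int) -> str:
    """Express an 8-bit symbol as 9 bits; the top bit is therefore '0'."""
    if not 0 <= symbol <= 255:
        raise ValueError("symbol must fit in 8 bits")
    return format(symbol, "09b")


def is_literal(identification: str) -> bool:
    """A 9-bit identification value below 256 (top bit '0') is a literal."""
    return identification[0] == "0"


def literal_symbol(identification: str) -> int:
    """Recover the original symbol from the low 8 bits."""
    return int(identification[1:], 2)
```

The values 256 to 511 of the same 9-bit field remain available for the match case, whose layout is described later in the text.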

Returning to FIG. 3, when the match length is equal to or greater than the predetermined number in step S4 (Yes in step S4), the encoding unit 12 obtains the number of bits (bit-string length) required to represent the match position and the match length of the longest matching sequence found (step S6). In this embodiment, given the size of the dictionary buffer 13 described above, the match position takes a value of 1 to 16383, so the number of bits required to represent it is 1 to 14. Likewise, given the size of the input buffer 11 and the condition in step S4, the match length takes a value of 3 to 257; because the value actually encoded is the match length minus 2 (1 to 255), as explained for step S33 below, the number of bits required to represent it is 1 to 8.
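A minimal sketch of step S6, assuming the ranges stated above. The function names are hypothetical; `int.bit_length()` is the standard Python method that returns the number of binary digits of an integer.

```python
def position_bits(match_position: int) -> int:
    """Bits needed for a match position of 1 to 16383 (result: 1 to 14)."""
    if not 1 <= match_position <= 16383:
        raise ValueError("match position out of range")
    return match_position.bit_length()


def length_bits(match_length: int) -> int:
    """Bits needed for a match length of 3 to 257 (result: 1 to 8).

    The value that is encoded is (match_length - 2), i.e. 1 to 255.
    """
    if not 3 <= match_length <= 257:
        raise ValueError("match length out of range")
    return (match_length - 2).bit_length()
```

With these definitions a match position of 300 needs 9 bits and a match length of 21 (encoded as 19) needs 5 bits.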

Next, the encoding unit 12 checks whether the match position of the longest matching sequence found this time is the same as the match position held in the previous value buffer 14 (step S7). If they are the same (Yes in step S7), the number of bits required to represent the match position is set to 0 (step S8). If they are not the same, that is, if they differ (No in step S7), the number of bits required to represent the match position is left unchanged.

The encoding unit 12 then checks whether the match length of the longest matching sequence found this time is the same as the match length held in the previous value buffer 14 (step S9). If they are the same (Yes in step S9), the number of bits required to represent the match length is set to 0 (step S10). If they are not the same, that is, if they differ (No in step S9), the number of bits required to represent the match length is left unchanged.
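Steps S7 to S10 amount to two independent comparisons against the previous values. The sketch below is illustrative only; the function name and parameter names are hypothetical, and the previous value buffer is modelled simply as two integers.

```python
def suppress_repeats(bp: int, bl: int,
                     position: int, length: int,
                     prev_position: int, prev_length: int) -> tuple[int, int]:
    """Set a bit count to 0 when the value equals the previous one.

    bp / bl are the bit counts obtained in step S6 for the match
    position and the match length respectively.
    """
    if position == prev_position:   # steps S7 and S8
        bp = 0
    if length == prev_length:       # steps S9 and S10
        bl = 0
    return bp, bl
```

A bit count of 0 can never arise from step S6 itself (the minimum there is 1), which is why 0 is free to serve as the "same as last time" marker.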

Once the numbers of bits for the match position and the match length have been determined in this way, the encoding unit 12 generates and outputs a code that indicates these two bit counts and the fact that a "match" was found in the dictionary buffer 13 for the predetermined number of symbols starting at the head of the input buffer 11 (step S11). This code is referred to here as the identification information for the "match" case (first code, second identification information). Specifically, as one example, a value given by the following equation (1) is generated as the identification information.

Identification information (in the case of "match") = 256 + BL × (maximum BP + 1) + BP    (1)
where BL: number of bits required to represent the match length
BP: number of bits required to represent the match position
maximum BP: maximum number of bits required to represent the match position

In this embodiment, as described above, the maximum BP is 14, BP takes a value of 0 to 14, and BL takes a value of 0 to 8 (0 in each case meaning that the value is the same as the previous one), so by equation (1) the identification information for the "match" case takes a value of 256 to 390. This identification information is therefore also expressed in 9 bits as a binary code. The four codes shown in the lower part of FIG. 4(a) (labeled [match]) are the codes generated and output when a "match" of the predetermined number of symbols is found; the 9 bits on the left of each correspond to this identification information.
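Equation (1) can be checked with a few lines of Python. The names `MAX_BP` and `match_identification` are hypothetical labels for the quantities in the equation; the value 14 for the maximum BP is the one stated for this embodiment.

```python
MAX_BP = 14  # maximum number of bits for a match position in this embodiment


def match_identification(bp: int, bl: int, max_bp: int = MAX_BP) -> int:
    """Identification information for the 'match' case, per equation (1)."""
    if not 0 <= bp <= max_bp:
        raise ValueError("bp out of range")
    if not 0 <= bl <= 8:
        raise ValueError("bl out of range")
    return 256 + bl * (max_bp + 1) + bp


def as_9_bits(value: int) -> str:
    """Write an identification value as a 9-character bit string."""
    return format(value, "09b")
```

The smallest value (BP = 0, BL = 0) is 256 and the largest (BP = 14, BL = 8) is 390, both of which fit in 9 bits with the most significant bit set to 1.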

As described above, this identification information takes a value of 256 or more, which by itself identifies that a "match" occurred, and the value obtained by subtracting 256 from it indicates the numbers of bits required to represent the match position and the match length. In other words, the most significant bit of this 9-bit code is always "1", indicating the "match", and the lower 8 bits indicate the numbers of bits required to represent the match position and the match length. Furthermore, when the number of bits required to represent the match position or the match length is 0, this indicates that the corresponding value is the same as the value held in the previous value buffer 14, that is, the same as the match position or match length at the previous time the predetermined number of symbols or more matched.

Thus, in the code according to this embodiment, the identification information is expressed in 9 bits in both the "match" and the "mismatch" case, and whether encoding based on a match position and match length has been performed can be identified from whether the value is 256 or more, in other words from the most significant bit. After that identification, the lower 8 bits give, in the "mismatch" case, the original symbol itself, and in the "match" case, the lengths of the codes for the match position and the match length that follow the identification information, or the fact that the match position or match length is the same as the previous value.

When output of the identification information is complete, the encoding unit 12 checks whether the number of bits required to represent the match position is 0 (step S12). If it is not 0 (No in step S12), the unit outputs, as a code, match position information (second code) representing the match position of the longest matching sequence found (step S13). Specifically, it outputs as the match position information the lower-order bit string obtained by removing the most significant "1" bit from the binary representation of the match position value. The most significant bit is self-evidently "1", so omitting it reduces the amount of data as far as possible. For example, a match position of 9 is "1001" in binary, so "001" is output as the match position information.

Expressed in comparison with the LZSS code described earlier, in which the match position and the match length are represented by fixed-length bit strings: the match position is written in the maximum length of 14 bits, and the match position information is what remains after removing the "0" bits that run consecutively from the MSB (Most Significant Bit) side together with the "1" bit that follows them.

On the other hand, when the number of bits required to represent the match position is 0 (Yes in step S12), the match position information is not output.
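Steps S12 and S13 can be sketched as below. The helper name `strip_leading_one` and the string representation of bits are assumptions made for illustration; the behaviour (drop the implicit leading "1", emit nothing when the bit count is 0) follows the description above.

```python
def strip_leading_one(value: int, nbits: int) -> str:
    """Return the bits emitted for `value`, given its bit count `nbits`.

    nbits == 0 means 'same as the previous value': nothing is emitted.
    Otherwise the binary form of `value` has exactly `nbits` digits and
    its leading '1' is dropped, leaving nbits - 1 digits.
    """
    if nbits == 0:
        return ""
    bits = format(value, "b")
    if len(bits) != nbits:
        raise ValueError("nbits does not match the value")
    return bits[1:]
```

The example from the text, a match position of 9 ("1001"), yields "001"; a match position of 1 yields an empty string because the single "1" is implicit.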

Next, the encoding unit 12 checks whether the number of bits required to represent the match length is 0 (step S14). If it is not 0 (No in step S14), the unit outputs, as a code, match length information (second code) representing the match length of the longest matching sequence found (step S15). Like the match position information, the match length information is the lower-order bit string obtained by removing the most significant "1" bit from the binary representation of the value representing the match length (the match length minus 2, as explained for step S33 below). On the other hand, when the number of bits required to represent the match length is 0 (Yes in step S14), the match length information is not output.

In the [match] codes illustrated in the lower part of FIG. 4(a), the "match position information" and "match length information" shown on the right correspond to the information output in steps S13 and S15, respectively. Of these, the code labeled (different, different) is the code output when neither the match position nor the match length is the same as the value in the previous value buffer 14. Similarly, the codes labeled (different, same), (same, different), and (same, same) are the codes output, respectively, when only the match length is the same as the value in the previous value buffer 14, when only the match position is the same, and when both the match position and the match length are the same. In this way, information identical to the previous value is not output, which compresses the data further.

When output of the match length information is complete, the encoding unit 12 updates the match position and match length held in the previous value buffer 14 to the match position and match length of the longest matching sequence found in the current processing (step S16).

When code generation and output for the "mismatch" case or the "match" case is complete as described above, the symbol strings stored in the input buffer 11 and the dictionary buffer 13 are slid (step S17). Specifically, the symbol or symbols for which a code has been generated and output move from the head of the input buffer 11 to the tail of the dictionary buffer 13. The same number of new symbols is then input to the input buffer 11, and the same number of symbols is discarded from the dictionary buffer 13. In the "mismatch" case only one symbol is encoded, so one symbol moves; in the "match" case three or more symbols are encoded, so three or more symbols move.
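The sliding step S17 can be modelled with two queues. This is a rough sketch, not the apparatus itself: the function name `slide`, the use of `collections.deque`, and the buffer sizes chosen by the caller are all assumptions. A `deque` created with `maxlen` drops its oldest element automatically when a new one is appended to a full queue.

```python
from collections import deque
from typing import Iterator


def slide(dictionary: deque, lookahead: deque,
          source: Iterator[int], count: int) -> None:
    """Move `count` encoded symbols from the input buffer to the dictionary.

    dictionary: deque(maxlen=...) standing in for the dictionary buffer
    lookahead:  deque standing in for the input buffer
    source:     iterator supplying symbols not yet read
    count must not exceed the number of symbols in `lookahead`.
    """
    for _ in range(count):
        # Head of the input buffer goes to the tail of the dictionary;
        # a full dictionary discards its oldest symbol automatically.
        dictionary.append(lookahead.popleft())
        nxt = next(source, None)
        if nxt is not None:          # refill while input remains
            lookahead.append(nxt)
```

The caller passes `count=1` after a "mismatch" and the match length (3 or more) after a "match".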

It is then determined whether the current compression processing is finished (step S18). If it is not finished (No in step S18), the processing from step S3 described above is repeated. If it is determined that the compression processing is finished (Yes in step S18), the series of data compression processing ends.

FIG. 5 is a flowchart illustrating the processing performed by the decoding unit 22 of the data restoration apparatus 2. The restoration processing performed by the data restoration apparatus 2 is described below in detail with reference to FIG. 5. First, the decoding unit 22 initializes the match position and match length held in the previous value buffer 24 (step S21). Specifically, they are set to the same values that the previous value buffer 14 held at initialization when the data to be decoded was encoded. The codes to be processed are then read into the input buffer 21 in order (step S22). Next, the decoding unit 22 reads the first 9 bits of the code, that is, the identification information, from the input buffer 21 and interprets it (step S23).

The decoding unit 22 then determines whether the code currently being processed was produced in the "mismatch" case or the "match" case at encoding time (step S24). As described above, this determination can be made from the most significant bit of the identification information: if the most significant bit is "0", the code is judged to be a "mismatch" case, and if it is "1", a "match" case. In other words, if the value of the identification information read is 0 to 255, that value is judged to represent the original symbol itself; if the value is 256 or more, the identification information is judged to give the numbers of bits of the match position and match length information that follow it.

If the code is judged to be a "mismatch" case (No in step S24), the identification information read represents the original symbol itself, so the original symbol is restored and output (step S25). More specifically, the 9-bit identification information is output as 8 bits.

On the other hand, if the code is judged to be a "match" case (Yes in step S24), the number of bits required to represent the match position and the number of bits required to represent the match length are obtained from the identification information read (step S26). Specifically, because the identification information was generated by equation (1) above, 256 is subtracted from the value read and the result is divided by the value one greater than the maximum BP (14 + 1); the remainder is the number of bits required to represent the match position, and the quotient is the number of bits required to represent the match length.
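Steps S24 to S26 can be sketched with Python's built-in `divmod`, which returns the quotient and the remainder together. The function name `parse_identification` and the returned tuple layout are assumptions made for illustration.

```python
def parse_identification(ident: int, max_bp: int = 14):
    """Interpret a 9-bit identification value.

    Returns ("literal", symbol) for a 'mismatch' code (0 to 255), or
    ("match", bp, bl) for a 'match' code (256 or more), where bp and bl
    are the bit counts for the match position and the match length.
    """
    if not 0 <= ident <= 511:
        raise ValueError("identification information must fit in 9 bits")
    if ident < 256:                      # most significant bit is 0
        return ("literal", ident)
    bl, bp = divmod(ident - 256, max_bp + 1)   # quotient, remainder
    return ("match", bp, bl)
```

For the value 265, for example, 265 - 256 = 9, and dividing 9 by 15 gives quotient 0 and remainder 9, so BP = 9 and BL = 0.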

Next, the decoding unit 22 checks whether the number of bits required to represent the match position is 0 (step S27). If it is not 0 (No in step S27), the unit reads from the input buffer 21 the bits following the identification information, the number of bits read being one less than the number of bits required to represent the match position (step S28). The decoding unit 22 then obtains the match position from the match position information read (step S29). Specifically, the match position is the value obtained by prepending a "1" bit to the top of the bit string read. Put differently, the match position is 2 raised to the power (number of bits required to represent the match position − 1), plus the value of the match position information.

On the other hand, if the number of bits required to represent the match position is 0 (Yes in step S27), the match position is taken to be the match position held in the previous value buffer 24 (step S30).

The decoding unit 22 then checks whether the number of bits required to represent the match length is 0 (step S31). If it is not 0 (No in step S31), the unit reads from the input buffer 21 the bits that follow, the number of bits read being one less than the number of bits required to represent the match length (step S32). The decoding unit 22 then obtains the match length from the match length information read (step S33). Specifically, the match length is the value obtained by prepending a "1" bit to the top of the bit string read, plus 2. Put differently, the match length is 2 raised to the power (number of bits required to represent the match length − 1), plus the value of the match length information, plus 2. The 2 is added because, as described above, the match length takes a value of 3 to 257, and in order to express it in 8 bits the encoder encodes the match length minus 2 (1 to 255).

On the other hand, if the number of bits required to represent the match length is 0 (Yes in step S31), the match length is taken to be the match length held in the previous value buffer 24 (step S34).
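The reconstruction in steps S27 to S34 is the same operation for both fields, differing only in the offset of 2 applied to the match length. The sketch below is illustrative; the function name `restore_value` and its parameters are hypothetical.

```python
def restore_value(info_bits: str, nbits: int,
                  previous: int, offset: int = 0) -> int:
    """Rebuild a match position (offset=0) or match length (offset=2).

    info_bits: the nbits - 1 bits read after the identification
               information (empty when nbits is 0 or 1)
    nbits:     bit count taken from the identification information
    previous:  value held in the previous value buffer
    """
    if nbits == 0:
        return previous                      # same as last time
    if len(info_bits) != nbits - 1:
        raise ValueError("wrong number of information bits")
    tail = int(info_bits, 2) if info_bits else 0
    # Prepending the implicit '1' equals adding 2 ** (nbits - 1).
    return (1 << (nbits - 1)) + tail + offset
```

With `offset=2`, the information bits "0011" and a bit count of 5 give 16 + 3 + 2 = 21.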

Next, the decoding unit 22 restores the original symbol string from the match position and match length obtained above and outputs the restored symbol string (step S35). Specifically, it reads from the symbol string stored in the dictionary buffer 23, starting at the match position obtained, a symbol string of the match length obtained, and outputs that symbol string as the original symbol string.

The decoding unit 22 then updates the match position and match length held in the previous value buffer 24 to the match position and match length obtained in the current processing (step S36).

When the decoding processing for the "mismatch" case or the "match" case is complete as described above, the codes and symbol strings stored in the input buffer 21 and the dictionary buffer 23 are slid (step S37). Specifically, the code that has been decoded is deleted from the input buffer 21, and the same amount of new code is input to the input buffer 21. In addition, the symbol string restored by the above processing is appended to the tail of the dictionary buffer 23, and the same number of symbols at its head is discarded from the dictionary buffer 23 as no longer needed.

It is then determined whether the current restoration processing is finished (step S38). If it is not finished (No in step S38), the processing from step S23 described above is repeated. If it is determined that the restoration processing is finished (Yes in step S38), the series of data restoration processing ends.

FIG. 6 shows a specific example of the processing performed by the data compression apparatus 1 and the data restoration apparatus 2. The example is a case where a "match" of the predetermined number of symbols or more was found during compression, with a match position of 300 and a match length of 21. It is also a case where the match length is the same as the match length in the previous processing. FIG. 6(a) schematically shows the match position (300) and the match length (21) at their maximum bit lengths. FIG. 6(b) shows the match position (300) in binary notation at the maximum number of bits.

When the encoding unit 12 encodes this state as described above, the code shown in FIG. 6(c) is generated and output. The left part of the code is the identification information for the "match" case; the calculation shown in the figure is carried out according to equation (1), giving an identification value of 265. That is, because the match length is the same as the value held in the previous value buffer 14, the number of bits required to represent the match length is 0; this is multiplied by 15 (maximum BP + 1), and to the result are added 9, the number of bits required to express the match position 300 in binary, and 256, giving the identification information 265.

The 8 bits in the right part of the code shown in FIG. 6(c) are the match position information: the binary representation of 300 with its most significant "1" removed. Thus, from the 14-bit representation shown in FIG. 6(b), the 8 bits that remain after removing all the "0" bits on the MSB side and the most significant "1" are output as the code. As noted above, the match length is the same as the value held in the previous value buffer 14, so no match length information is output.

When this code is received by the data restoration apparatus 2 and decoded by the decoding unit 22, the identification information is interpreted and the numbers of bits required to represent the match position and the match length are determined to be 9 and 0, respectively. Then, following the processing described above and as shown in FIG. 6(d), the (9 − 1) bits after the identification information are first read and a "1" bit is prepended to them. The match position is thereby determined to be 300. Because the number of bits required to represent the match length is 0, the value in the previous value buffer 24 is taken as the match length. Here the previous value buffer 24 holds 21 as the match length, so the match length is also obtained correctly. The decoding unit 22 then extracts the corresponding symbol string from the dictionary buffer 23, outputs it, and ends the decoding processing for this code.
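The FIG. 6 example can be reproduced end to end with a small round-trip sketch. It handles a single "match" code only and represents bits as a text string; the function names are hypothetical, and the previous match position of 120 used in the usage note is an arbitrary assumed value, since the text only states that the previous match length was 21.

```python
MAX_BP = 14  # maximum bit count for a match position in this embodiment


def encode_match(position: int, length: int,
                 prev_position: int, prev_length: int) -> str:
    """Encode one match as identification bits plus optional fields."""
    bp = 0 if position == prev_position else position.bit_length()
    bl = 0 if length == prev_length else (length - 2).bit_length()
    code = format(256 + bl * (MAX_BP + 1) + bp, "09b")   # equation (1)
    if bp:
        code += format(position, "b")[1:]      # drop the leading '1'
    if bl:
        code += format(length - 2, "b")[1:]    # drop the leading '1'
    return code


def decode_match(code: str, prev_position: int, prev_length: int):
    """Decode one match code; returns (position, length, bits consumed)."""
    bl, bp = divmod(int(code[:9], 2) - 256, MAX_BP + 1)
    cursor = 9
    position, length = prev_position, prev_length
    if bp:
        position = int("1" + code[cursor:cursor + bp - 1], 2)
        cursor += bp - 1
    if bl:
        length = int("1" + code[cursor:cursor + bl - 1], 2) + 2
        cursor += bl - 1
    return position, length, cursor
```

Calling `encode_match(300, 21, 120, 21)` gives the 17-bit string 100001001 followed by 00101100, i.e. the identification value 265 and the 8 position bits, and `decode_match` applied to that string with the same previous values returns the position 300 and the length 21.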

As described above, in the data compression method and restoration method according to this embodiment, when no longest matching sequence of the predetermined length or more exists in the dictionary (buffer 13), fixed-length identification information containing that fact and the original symbol is output as the code. When a longest matching sequence of the predetermined length or more does exist in the dictionary, identification information of the same length, containing that fact together with the number of bits required to represent the match position and the number of bits required to represent the match length, is output first, followed as necessary by the match position information and the match length information. When the number of bits required to represent the match position or the match length is 0, this means that the value is the same as the previous value, and in that case the unchanged match position or match length information is not output. The match position information and match length information that are output are expressed as bit strings that omit the most significant bit of the binary representation of the values indicating the match position and the match length.

Consequently, when encoding based on a match position and match length, the information representing them need not always be expressed as bit strings of the maximum fixed length (14 and 8 in this embodiment). Even with the identification information added, the code as a whole becomes shorter on average, and the data compression ratio can be improved over the conventional method. Furthermore, in the compression method according to this embodiment, when the match position or match length is the same as in the previous processing, the information representing it is not output, and the code does not grow from adding information to signal that it is the same, so the compression ratio can be improved further.

FIG. 4(b) schematically shows example codes for the LZSS code when the maximum values of the match position and match length are the same as in this embodiment. In the upper part, when no longest matching sequence of the predetermined length or more exists in the dictionary ("mismatch"), a "0" indicating that fact and the 8 bits of the original symbol are output. In the lower part, when a longest matching sequence of the predetermined length or more exists in the dictionary ("match"), a "1" indicating that fact, fixed-length match position information (14 bits), and fixed-length match length information (8 bits) are output. Compared with this embodiment as shown in FIG. 4(a), the data length is the same in the [mismatch] case, while in the [match] case the evaluation obtained is that the data length is shorter on average in this embodiment.
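The code lengths being compared can be tabulated directly from the field widths given in the text. This is an arithmetic sketch only: the names are hypothetical, and it says nothing about how often each case occurs in real data, which is what determines the average.

```python
LZSS_MATCH_BITS = 1 + 14 + 8   # flag + fixed position + fixed length = 23
LITERAL_BITS = 9               # same in both schemes


def embodiment_match_bits(bp: int, bl: int) -> int:
    """Total bits of a 'match' code in this embodiment.

    9 bits of identification information, plus bp - 1 position bits
    and bl - 1 length bits (nothing for a field whose bit count is 0).
    """
    return 9 + max(bp - 1, 0) + max(bl - 1, 0)
```

The FIG. 6 case (BP = 9, BL = 0) takes 17 bits against 23 for the fixed-length form, and a fully repeated match takes only 9. Note that the worst case (BP = 14, BL = 8) comes to 29 bits, which is longer than 23, so the gain depends on short or repeated values being common; this is consistent with the text's claim of a shorter length on average rather than in every case.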

When restoring the code, the fixed-length identification information is interpreted first to determine whether encoding based on a match position and match length has been performed, and when it has, the match position information and match length information can easily be read on the basis of the identification information. The identification information also shows whether the match position or match length is the same as the value from the previous processing, and because the previous values are retained, the match position and match length can easily be obtained in that case as well. In that case the decoding processing does not need to read a match position or match length code from the input buffer, so it takes no additional time.

In equation (1), the multiplier applied to the number of bits of the match length information is 15, but it may instead be 16. The multiplication during encoding and the division during decoding then reduce to bit shifts, which further speeds up compression and restoration. In this case as well, the number of bits of the identification information need not be increased, so the amount of data does not grow.
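The effect of changing the multiplier can be sketched as follows. Equation (1) is defined earlier in the document and is not reproduced here; the sketch assumes only that it combines a length bit count and a position bit count as `len_bits * multiplier + pos_bits`, with any constant term omitted. The ranges 0 to 8 and 0 to 14 are likewise assumptions, taken from the 8-bit and 14-bit maxima mentioned for FIG. 4, and all function names are hypothetical.

```python
MAX_LEN_BITS = 8    # assumed range of the length bit count: 0..8
MAX_POS_BITS = 14   # assumed range of the position bit count: 0..14

def pack(len_bits: int, pos_bits: int, multiplier: int) -> int:
    """Assumed general form of the combination: multiply, then add."""
    return len_bits * multiplier + pos_bits

def unpack(value: int, multiplier: int) -> tuple[int, int]:
    """Inverse of pack(); in general this needs a division and a remainder."""
    return divmod(value, multiplier)

def pack16(len_bits: int, pos_bits: int) -> int:
    """Multiplier 16: the multiplication becomes a 4-bit left shift."""
    return (len_bits << 4) | pos_bits

def unpack16(value: int) -> tuple[int, int]:
    """Multiplier 16: the division becomes a shift and a mask."""
    return value >> 4, value & 0xF

for multiplier in (15, 16):
    largest = pack(MAX_LEN_BITS, MAX_POS_BITS, multiplier)
    print(multiplier, largest, largest.bit_length())
# 15 134 8
# 16 142 8
```

Under these assumptions the largest packed value needs the same number of bits for either multiplier, which is consistent with the statement that the identification information does not have to grow.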

The fixed-length identification information in this embodiment may also be Huffman encoded, which can further increase the data compression ratio.

Next, another embodiment of the present invention is described. The data compression device and data restoration device of this embodiment are configured in the same way as in the preceding embodiment, shown in FIGS. 1 and 2. In this embodiment, the match position information and match length information are each expressed at compression time as a fixed-length code (bit string). A code (the first code) indicating whether the match position and match length values are the same as the values from the previous processing step is placed at the head, and information for a match position or match length that is unchanged is not output. In other words, each code consists of the code indicating whether the match position and match length values are the same as the previous values, followed by whatever fixed-length match position information and match length information is required (the second code).

FIG. 7 illustrates the codes output after compression in this embodiment. Parts (a), (b), (c), and (d) of the figure show the codes output when, respectively, neither the match position nor the match length equals its previous value; only the match length equals its previous value; only the match position equals its previous value; and both the match position and the match length equal their previous values. In each code, the leading 2 bits are the code described above that indicates whether the match position and match length values are the same as the previous values, and they distinguish the four cases (a) to (d).
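The four cases can be sketched in Python as follows. This is a minimal illustration, not the encoder of the embodiment: the text does not give the actual 2-bit patterns of FIG. 7, so the flag layout here (first bit for the position, second bit for the length, "1" meaning unchanged) is hypothetical, as are the 14-bit and 8-bit field widths (borrowed from the LZSS comparison), the field order, and the function name. Literal (mismatch) symbols are outside the scope of the sketch.

```python
POS_BITS = 14   # assumed fixed width of the match position field
LEN_BITS = 8    # assumed fixed width of the match length field

def encode_match(pos, length, prev_pos, prev_len):
    """Encode one match as a bit string: 2-bit first code + needed fields.

    The flag layout is hypothetical: bit 0 = position unchanged,
    bit 1 = length unchanged.
    """
    same_pos = pos == prev_pos
    same_len = length == prev_len
    bits = f"{int(same_pos)}{int(same_len)}"       # first code
    if not same_pos:
        bits += format(pos, f"0{POS_BITS}b")       # second code: position
    if not same_len:
        bits += format(length, f"0{LEN_BITS}b")    # second code: length
    return bits

print(len(encode_match(5, 3, None, None)))  # case (a): 24
print(len(encode_match(6, 3, 5, 3)))        # case (b): 16
print(len(encode_match(5, 4, 5, 3)))        # case (c): 10
print(len(encode_match(5, 3, 5, 3)))        # case (d): 2
```

With these assumed widths, a fully repeated match costs 2 bits instead of 24, which is where the gain on repetitive data comes from.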

When data encoded in this way is decoded, the data restoration device of this embodiment determines from the first 2 bits of each code whether the match position and match length values are the same as the values from the previous processing step. A value that is the same is obtained from the retained previous value, and a value that is not the same is obtained from the fixed-length code that follows the 2 bits.
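The decoding step just described can be sketched as follows. As an illustration under stated assumptions, the sketch uses a hypothetical flag layout (first bit "1" means the position is unchanged, second bit "1" means the length is unchanged), assumed field widths of 14 and 8 bits, and an assumed field order of position before length; none of these are given in the text, and the function name is invented for the example.

```python
POS_BITS = 14   # assumed fixed width of the match position field
LEN_BITS = 8    # assumed fixed width of the match length field

def decode_match(bits, offset, prev_pos, prev_len):
    """Decode one match code starting at bits[offset].

    Returns (position, length, new_offset). Unchanged values come from
    the retained previous values; changed values are read from the
    fixed-length fields that follow the 2-bit first code.
    """
    same_pos = bits[offset] == "1"
    same_len = bits[offset + 1] == "1"
    offset += 2
    if same_pos:
        pos = prev_pos
    else:
        pos = int(bits[offset:offset + POS_BITS], 2)
        offset += POS_BITS
    if same_len:
        length = prev_len
    else:
        length = int(bits[offset:offset + LEN_BITS], 2)
        offset += LEN_BITS
    return pos, length, offset

print(decode_match("11", 0, 5, 3))  # (5, 3, 2)
```

When both flags are set, no further bits are read from the input, which is the case the text describes as needing no access to the input buffer for the match values.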

Thus, in this embodiment, match position and match length values that equal the previous values are not output, and the code added to signal this is short. The compression ratio can therefore be raised in a data compression method that uses the match position and match length of the longest matching sequence. Restoring codes compressed in this way is also easy, as described above, and takes little time.

Next, an application example of the data compression and restoration methods of the two embodiments described above is described. FIG. 8 is a schematic configuration diagram of a printing system that uses these methods. As shown in FIG. 8, the printing system consists of a host computer 3 and a printer 4. An application 31 on the host computer 3 issues a print request, and a printer driver 32 receives the request, generates print data, and sends the data to the printer 4. In the printer 4, a controller 41 receives the print data, applies predetermined processing, and passes the data to an engine 42. The engine 42 prints on a print medium based on that data.

The printing system is a so-called host-based system, in which processing up to and including halftone processing (screen processing) is performed on the host computer 3. As shown in FIG. 8, the printer driver 32 therefore includes an image generation unit 33 that generates image data, a color conversion unit 34 that performs color conversion, a halftone processing unit 35 that performs halftone processing, and a compression unit 36 that compresses the print data after halftone processing. These units can be implemented by a driver program describing each process together with a CPU (control device) or the like that executes the processing according to that program.

Because compressed print data is sent from the host computer 3, the controller 41 of the printer 4 includes a decompression unit 43 that restores it.

In the printing system configured in this way, the data compression device 1 and the data restoration device 2 of either of the two embodiments described above are used as the compression unit 36 and the decompression unit 43, respectively. The compression unit 36 therefore encodes the print data, which represents the binarized dot image, by the method described above. The encoded print data is sent to the printer 4 and decoded in the decompression unit 43 by the method described above.

Because the printing system uses the data compression and restoration methods of the embodiments described above, the compression ratio of the print data can be raised and the transmission time from the host computer 3 to the printer 4 can be shortened. Data transmission can therefore keep pace with the increasingly fast processing in the host computer 3 and the printer 4, and the throughput of the printing system as a whole can be improved. In particular, print data after halftone processing, as in this application example, tends to contain the same data patterns repeatedly, so this data compression method is well suited to improving its compression ratio. For data after halftone processing, when this compression method is used with the rule that the smaller match position is chosen whenever several longest matching sequences have the same length, evaluation has shown that the match position information can be expressed in 7 to 8 bits on average and the match length information in 2 to 3 bits on average. Even when the match position and match length values differ from the previous values, the whole code is therefore about 19 bits on average, which is shorter than the 23-bit code of the LZSS method (see FIG. 4).
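As a rough worked check of this comparison, the short Python calculation below uses only the two figures quoted in the text: about 19 bits on average for a match code whose values differ from the previous values, and 23 bits for the LZSS match code. The variable names are illustrative, and the result applies only to the halftone data the evaluation refers to.

```python
LZSS_MATCH_BITS = 1 + 14 + 8   # flag + position + length = 23 bits
AVG_CODE_BITS = 19             # approximate average reported in the text

saving = 1 - AVG_CODE_BITS / LZSS_MATCH_BITS
print(f"{saving:.1%}")         # 17.4%
```

Match codes that repeat the previous position or length are shorter still, so this figure is the saving in the least favorable match case.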

There is also a desire for the control device (CPU) of the printer 4 not to need to be as fast as that of the host computer 3, which is a personal computer or the like. With the compression and restoration methods described above, the restoration performed in the printer 4 is simple, as already noted, which meets this desire. In addition, print data after halftone processing is well suited to Huffman coding of the identification information, which can further improve the compression ratio.

The scope of protection of the present invention is not limited to the embodiments described above and extends to the invention described in the claims and its equivalents.

FIG. 1 is a block diagram of an embodiment of a data compression device to which the present invention is applied.
FIG. 2 is a block diagram of an embodiment of a data restoration device to which the present invention is applied.
FIG. 3 is a flowchart illustrating the processing performed by the encoding unit 12.
FIG. 4 is a diagram for explaining the codes and related data generated by the data compression device 1.
FIG. 5 is a flowchart illustrating the processing performed by the decoding unit 22.
FIG. 6 is a diagram showing a specific example of the processing performed by the data compression device 1 and the data restoration device 2.
FIG. 7 is a diagram illustrating the codes output after compression in the second embodiment.
FIG. 8 is a schematic configuration diagram of a printing system that uses the data compression method and restoration method.

Explanation of symbols

1 data compression device, 2 data restoration device, 3 host computer, 4 printer, 11 input buffer, 12 encoding unit, 13 dictionary buffer, 14 previous value buffer, 21 input buffer, 22 decoding unit (decoding means), 23 dictionary buffer (symbol string storage means), 24 previous value buffer (previous value storage means), 31 application, 32 printer driver, 33 image generation unit, 34 color conversion unit, 35 halftone processing unit, 36 compression unit, 41 controller, 42 engine, 43 decompression unit

Claims

A data compression method that searches an already-encoded symbol string for a longest matching sequence, which matches a symbol string to be encoded over the greatest length, and compresses data by encoding based on a match position, which is the position of the longest matching sequence, and a match length, which is the length of the longest matching sequence, wherein
the code generated by the encoding comprises:
a first code including information on whether the value of the match position and/or the match length is the same as the value at the previous encoding; and
a second code representing the value of the match position and/or the match length that is not the same as the value at the previous encoding.

The data compression method according to claim 1, wherein
the second code is a variable-length code, and
the first code includes information on the length of the second code.

The data compression method according to claim 2, wherein
the second code is obtained by removing the most significant 1 from the binary representation of the value representing the match position and/or the match length.

A data compression method that searches an already-encoded symbol string for a longest matching sequence, which matches a symbol string to be encoded over the greatest length, and compresses data by encoding based on a match position, which is the position of the longest matching sequence, and a match length, which is the length of the longest matching sequence, wherein
when the length of the longest matching sequence is smaller than a predetermined value, the encoding is performed using first identification information of a predetermined length that represents both that fact and the original symbol, and
when the length of the longest matching sequence is equal to or greater than the predetermined value,
the encoding is performed using second identification information, of the same length as the first identification information, that represents that fact, information on whether the value of the match position and/or the match length is the same as the value at the previous encoding, and the length of the match position information and/or match length information representing the match position and/or match length that is not the same as the value at the previous encoding, and
using the match position information and/or match length information representing the match position and/or the match length that is not the same as the value at the previous encoding.

The data compression method according to claim 4, wherein
the match position information and/or match length information is obtained by removing the most significant 1 from the binary representation of the value representing the match position and/or the match length.

The data compression method according to claim 4 or 5, wherein
when the value of the match position and/or the match length is the same as the value at the previous encoding, the length of the match position information and/or match length information included in the second identification information is set to 0.

The data compression method according to any one of claims 4 to 6, wherein
the first identification information and/or the second identification information is Huffman encoded.

A data compression program that causes a computer to execute processing that searches an already-encoded symbol string for a longest matching sequence, which matches a symbol string to be encoded over the greatest length, and compresses data by encoding based on a match position, which is the position of the longest matching sequence, and a match length, which is the length of the longest matching sequence, the program causing the computer to execute:
a step of determining whether the value of the match position and/or the match length is the same as the value at the previous encoding;
a step of outputting, based on the result of the determination, a code including information on whether the value is the same as the value at the previous encoding; and
a step of outputting, after that output, a code representing the value of the match position and/or the match length when that value is not the same as the value at the previous encoding.

A data restoration method for restoring data that has been encoded by searching an already-encoded symbol string for a longest matching sequence, which matches a symbol string to be encoded over the greatest length, and encoding based on a match position, which is the position of the longest matching sequence, and a match length, which is the length of the longest matching sequence, wherein
the data to be restored includes a first code including information on whether the value of the match position and/or the match length is the same as the value at the previous encoding, and a second code representing the value of the match position and/or the match length that is not the same as the value at the previous encoding, the method comprising:
determining, based on the first code, whether the value of the match position and/or the match length is the same as the value at the previous decoding;
obtaining, from the value at the previous decoding, a match position and/or match length value determined to be the same as the value at the previous decoding;
obtaining, from the second code, a match position and/or match length value determined not to be the same as the value at the previous decoding; and
performing decoding using the already-decoded symbol string, based on the obtained match position and match length values.

A data restoration device for restoring data that has been encoded by searching an already-encoded symbol string for a longest matching sequence, which matches a symbol string to be encoded over the greatest length, and encoding based on a match position, which is the position of the longest matching sequence, and a match length, which is the length of the longest matching sequence, the device comprising:
symbol string storage means for storing the restored symbol string;
previous value storage means for storing the most recent match position and match length values obtained during the restoration; and
decoding means, wherein the data to be restored includes a first code including information on whether the value of the match position and/or the match length is the same as the value at the previous encoding, and a second code representing the value of the match position and/or the match length that is not the same as the value at the previous encoding, and
the decoding means determines, based on the first code, whether the value of the match position and/or the match length is the same as the value at the previous encoding, obtains a match position and/or match length value determined to be the same as the value at the previous encoding from the value stored in the previous value storage means, obtains a match position and/or match length value determined not to be the same as the value at the previous encoding from the second code, and performs decoding using the symbol string stored in the symbol string storage means, based on the obtained match position and match length values.