JP2004343459A - Data compression system and data decompression system

Info

Publication number: JP2004343459A
Application number: JP2003137943A
Authority: JP
Inventors: Takahiro Usami; 貴弘宇佐美
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 2003-05-15
Filing date: 2003-05-15
Publication date: 2004-12-02

Abstract

PROBLEM TO BE SOLVED: To efficiently compress and decompress data that is correlated at every fixed data unit.

SOLUTION: A compression algorithm executing unit 2 compares an input data string with the data string at the position correlated with it, encodes the matching positions into a code having a fixed bit width, and combines that code with the data strings at the non-matching positions to form the compressed data. A decompression algorithm executing unit 12 decodes the code in the compressed data string and combines the data it designates with the data strings at the non-matching positions. As a result, data that is correlated at every fixed data unit can be compressed and decompressed efficiently.

COPYRIGHT: (C)2005, JPO&NCIPI

Description

[0001]
[Technical field of the invention]
The present invention relates to a data compression system for efficiently encoding and compressing data having a correlation for each predetermined data unit and a data restoration system for restoring data compressed by the data compression system.
[0002]
[Prior art]
High-efficiency coding techniques are known as a way to store or transfer large amounts of information in as little capacity as possible. Examples include FV (Fixed-to-Variable) coding, such as Huffman coding, which assigns short codes to frequently occurring values and long codes to rarely occurring ones, and methods such as LZ (Lempel-Ziv) coding (see Patent Document 1 below) and LZW (Lempel-Ziv-Welch) coding (see Patent Document 2 below), which compress by exploiting matches between data strings that appeared in the past and the data string about to be compressed.
[0003]
[Patent Document 1]
US Patent No. 4,464,650
[Patent Document 2]
US Patent No. 4,558,302
[0004]
[Problems to be solved by the invention]
However, conventional data compression techniques must first scan all of the data to be compressed, so compression cannot proceed while the data is being input, and a long time elapses before compression of all the data is complete.
[0005]
To solve this problem, so-called one-pass compression/decompression methods, such as zero compression and run-length compression, which compress data while it is being input, have been proposed. These one-pass methods achieve a high compression ratio when the data to be compressed has a structure in which a specific word repeats. They are not efficient, however, for data that is correlated at every fixed data unit, such as image data in the horizontal or vertical direction.
[0006]
Here, data that is correlated at every fixed data unit means data with a structure such as that illustrated in FIG. 7: data in units of, for example, one word (32 bits) is arranged in n columns per row (8 columns in the example of FIG. 7), and adjacent rows are correlated (similar) column by column. As a specific example, the word in row b, column 2 of FIG. 7 is correlated with the word in row a, column 2, and also with the word in row c, column 2.
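The layout described above can be pictured as a flat, row-major stream of words in which each word is compared with the word one row earlier in the same column. The following is only a minimal sketch under that assumption; the function name `reference_index` and the 0-based indexing are illustrative and do not come from the patent text.

```python
from typing import Optional

N_COLS = 8  # words per row, as in the FIG. 7 example


def reference_index(i: int, n_cols: int = N_COLS) -> Optional[int]:
    """Return the index of the word in the same column of the previous row.

    Words are assumed to arrive as a flat, row-major stream (an assumption
    of this sketch). The first row has no previous row, so None is returned.
    """
    return i - n_cols if i >= n_cols else None


# Row b (second row), column 2 is stream index 8 + 1 = 9 (0-based);
# its reference is row a, column 2, which is stream index 1.
print(reference_index(9))
```

With this view, the comparison target of every word depends only on its column, which is why a memory holding one word per column is sufficient.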
[0007]
The present invention has been made in view of the circumstances described above, and its object is to provide a data compression system and a data restoration system that can efficiently compress and restore data that is correlated at every fixed data unit.
[0008]
[Means for Solving the Problems]
To achieve the above object, the present invention compares an input data string with the data string at the position correlated with it, encodes the matching positions into a code having a fixed bit width, and combines that code with the data strings at the non-matching positions to form the compressed data.
[0009]
That is, according to the present invention, there is provided a data compression system comprising:
means for dividing the data to be compressed, among the input data, into input data strings of a predetermined compression-unit bit count;
storage means for storing, as a registered data string, a previously input data string with which the input data string is to be compared; and
compression means for comparing each divided data string, obtained by further dividing the input data string of the predetermined compression-unit bit count into arbitrary bit widths, with the divided data string stored in the storage means at the position correlated with it, encoding the matching positions into a code having a fixed bit width, combining that code with the divided data strings at the non-matching positions to form compressed data, and registering the input data string in the storage means as the next registered data string.
[0010]
Further, according to the present invention, there is provided a data restoration system for restoring data compressed by the data compression system according to claim 1, comprising:
storage means for storing the previous restored data string as a registered data string; and
restoration means for extracting, based on the encoded data in a compressed data string compressed by the data compression system, the divided restored data strings stored in the storage means, combining the extracted divided restored data strings with the divided data strings at the non-matching positions to generate a restored data string, and registering the generated restored data string in the storage means as the next registered data string.
[0011]
[Embodiments of the invention]
Hereinafter, embodiments of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing an embodiment of a data compression system according to the present invention, FIG. 2 is a flowchart explaining the compression algorithm of the compression algorithm execution unit in FIG. 1, FIG. 3 is an explanatory diagram showing the compression algorithm of the compression algorithm execution unit in FIG. 1, and FIG. 4 is an explanatory diagram showing an encoding example.
[0012]
<Data compression system>
As shown in FIG. 1, the data compression system according to the present embodiment includes a data input unit 1 that divides the input data to be compressed into strings of a predetermined compression bit width and inputs them, a registered data string memory 3 that stores the registered data to be compared with the input data, divided into strings of the same predetermined bit width, a compression algorithm execution unit 2 that performs compression processing on the input data, and a data output unit 4 that outputs the compressed data.
[0013]
The compression algorithm execution unit 2 compares the input data string input from the data input unit 1 with the registered data string stored at the column position correlated with that input data string. It determines, for each of a plurality of arbitrarily set data widths into which the input data string is further divided, whether the input data string and the registered data string match, acquires the fixed-width code corresponding to the result from the fixed-width code table 5, and outputs the acquired fixed-width code together with the bits of the input data string outside the matching data widths to the data output unit 4 as the compressed data for that input. The compression algorithm execution unit 2 also registers the uncompressed input data string at the corresponding column position in the registered data string memory 3 as the registered data for the next comparison.
[0014]
Next, the data compression process executed by the compression algorithm execution unit 2 will be described with reference to FIG. 2. In step S0, the compression algorithm execution unit 2 performs initialization. Specifically, it resets the registered data counter N, an internal counter (sets N to 1), and stores the registered data to be compared in the registered data string memory 3, divided into strings of the predetermined bit width. Here, the “registered data counter N” is an index indicating the column number of the input data string input to the data input unit 1: N = 1 when the input data of the first column is processed, N = 2 when the input data of the second column is processed, and so on up to N = n when the input data of the n-th column is processed.
[0015]
As the registered data registered in the registered data string memory 3 in step S0, the first row of the input data is divided into strings of the predetermined bit width and registered; compression then starts from the second row of the input data, and the first row is output to the data output unit 4 uncompressed. Alternatively, predetermined dummy data may be registered in the registered data string memory 3 so that processing can start from the first row of the input data to be compressed.
[0016]
In the next step S1, the compression algorithm execution unit 2 acquires the N-th column input data string to be compressed, input by the data input unit 1, and in the following step S2 it acquires the registered data string stored in the N-th column of the registered data string memory 3, which corresponds to the value of the registered data counter N. In step S3, it divides the acquired N-th column input data string and registered data string each into a plurality of pieces of preset arbitrary widths, and in step S4 it determines whether each corresponding pair of divided input data and divided registered data from step S3 matches.
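Steps S3 and S4 can be sketched as follows. This is a hedged illustration, not the patented implementation: the helper names `split_word` and `match_flags` are hypothetical, and the choice to cut the fields from the most significant bits downward is an assumption, since the text does not state a bit order.

```python
WIDTHS = (4, 8, 20)  # field widths used in the FIG. 3 example


def split_word(word, widths=WIDTHS):
    """Step S3: cut a word into fields, most significant field first (assumed)."""
    fields = []
    shift = sum(widths)
    for w in widths:
        shift -= w
        fields.append((word >> shift) & ((1 << w) - 1))
    return fields


def match_flags(input_word, registered_word, widths=WIDTHS):
    """Step S4: compare same-position fields and return one flag per field."""
    pairs = zip(split_word(input_word, widths), split_word(registered_word, widths))
    return tuple(a == b for a, b in pairs)


# The 4-bit and 8-bit fields agree, the 20-bit field differs.
print(match_flags(0xA1234567, 0xA12FFFFF))
```

The tuple of flags is all that the later table lookup needs; the field values themselves are only needed again for the fields that did not match.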
[0017]
In the following step S5, as illustrated in FIG. 3, the compression algorithm execution unit 2 acquires, from the fixed-width code table 5 shown in FIG. 4, the fixed-width code corresponding to the match/mismatch results between the corresponding pieces of the divided registered data (c, d, e) and the divided input data (f, g, h), which are obtained by dividing the registered data string a and the input data string b, respectively, into the arbitrary widths in step S3.
[0018]
The example shown in FIGS. 3 and 4 divides a 32-bit registered data string a and a 32-bit input data string b into 4 bits, 8 bits, and 20 bits each, giving divided registered data c, d, e and divided input data f, g, h, and shows how a 4-bit fixed-width code (bit values in hexadecimal) is determined from the match results for the same-position pairs c and f, d and g, and e and h. For example:
・if c and f match and no other pair matches, the 4-bit fixed-width code "4" is set;
・if c and f match, d and g match, and no other pair matches, the 4-bit fixed-width code "7" is set;
・if none of c, d, e matches f, g, h, the 4-bit fixed-width code "1" is set.
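The cases listed above suggest a small lookup table keyed by the three match results. The text states only some entries of the table in FIG. 4: codes 1, 4, and 7 here, and code 8 for a full match in paragraph [0020]. In the sketch below the remaining four entries are placeholders invented purely so that the table is complete; they are assumptions and should not be read as the values in FIG. 4.

```python
# Keys are (c == f, d == g, e == h).
KNOWN_CODES = {
    (False, False, False): 0x1,  # nothing matches (stated in the text)
    (True, False, False): 0x4,   # only the 4-bit field matches (stated)
    (True, True, False): 0x7,    # 4-bit and 8-bit fields match (stated)
    (True, True, True): 0x8,     # everything matches (stated in [0020])
}

# ASSUMPTION: placeholder values, not taken from the patent.
ASSUMED_CODES = {
    (False, False, True): 0x2,
    (False, True, False): 0x3,
    (True, False, True): 0x5,
    (False, True, True): 0x6,
}

CODE_TABLE = {**KNOWN_CODES, **ASSUMED_CODES}


def lookup_code(flags):
    """Step S5: map the match pattern to its 4-bit fixed-width code."""
    return CODE_TABLE[flags]


print(hex(lookup_code((True, True, False))))
```

Any assignment works as long as every pattern gets a distinct code and the compressor and restorer share the same table.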
[0019]
In step S6, the compression algorithm execution unit 2 combines the acquired fixed-width code with the divided input data that did not match in step S4, and outputs the result to the data output unit 4 as the compressed data for the N-th column input data.
[0020]
For example, in the example shown in FIG. 3:
・if none of the divided registered data c, d, e matches the divided input data f, g, h, the 4-bit fixed-width code "1" and the non-matching divided input data f, g, h are output;
・if c and f match and no other pair matches, the 4-bit fixed-width code "4" and the non-matching divided input data g, h are output;
・if c, d, e all match f, g, h, only the 4-bit fixed-width code "8" is output.
In other words, the more divided data that matches (the higher the correlation), the higher the compression efficiency.
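To make the output sizes concrete, the sketch below encodes a single 32-bit word as a string of '0'/'1' characters: the 4-bit code followed by the fields that did not match. It is an illustration under assumptions: the bit-string representation, the most-significant-first field order, the function name `encode_word`, and the four table entries marked as assumed are not from the patent.

```python
WIDTHS = (4, 8, 20)
SHIFTS = (28, 20, 0)

CODE_TABLE = {
    (False, False, False): 0x1,  # stated in the text
    (True, False, False): 0x4,   # stated
    (True, True, False): 0x7,    # stated
    (True, True, True): 0x8,     # stated
    (False, False, True): 0x2,   # ASSUMED placeholder
    (False, True, False): 0x3,   # ASSUMED placeholder
    (True, False, True): 0x5,    # ASSUMED placeholder
    (False, True, True): 0x6,    # ASSUMED placeholder
}


def fields(word):
    """Split a 32-bit word into its 4-, 8-, and 20-bit fields."""
    return [(word >> s) & ((1 << w) - 1) for s, w in zip(SHIFTS, WIDTHS)]


def encode_word(word, registered):
    """Steps S3 to S6 for one word: code first, then the unmatched fields."""
    f, r = fields(word), fields(registered)
    flags = tuple(a == b for a, b in zip(f, r))
    bits = format(CODE_TABLE[flags], "04b")
    for value, width, matched in zip(f, WIDTHS, flags):
        if not matched:
            bits += format(value, "0{}b".format(width))
    return bits


print(len(encode_word(0x12345678, 0x12345678)))  # full match
print(len(encode_word(0xA0000000, 0xAFFFFFFF)))  # only the 4-bit field matches
print(len(encode_word(0xFFFFFFFF, 0x00000000)))  # nothing matches
```

Under these assumptions a full match costs 4 bits, a match on the 4-bit field alone costs 4 + 8 + 20 = 32 bits (no gain over the raw word), and a complete mismatch costs 4 + 32 = 36 bits, so the scheme pays off only when the wider fields match often.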
[0021]
In step S7, the compression algorithm execution unit 2 registers the input data string that was just compressed in the N-th column of the registered data string memory 3, and in the following step S8 it determines whether the value of the registered data counter N equals the maximum address value n (the maximum number of columns n) of the registered data string memory 3. If the value of the registered data counter N is not the maximum address value n, the compression algorithm execution unit 2 adds 1 to the registered data counter N in step S9, returns to step S1, and proceeds to the next input data string.
[0022]
If, on the other hand, the determination in step S8 finds that the value of the registered data counter N is the maximum address value n (the maximum number of columns n), the compression algorithm execution unit 2 resets the registered data counter N to 1 in step Sa, returns to step S1, and proceeds to the input data string in the first column of the next row. The compression algorithm execution unit 2 performs the above processing on all input data input by the data input unit 1.
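The control flow of steps S1 to S9 and Sa amounts to a one-pass loop over the input with a per-column memory and a wrapping counter. The sketch below shows only that loop; the per-word encoder is passed in as a parameter so that the example stays independent of any particular code table. The names `compress_stream` and `encode_word`, the 0-based counter, and the dummy-data initialization are assumptions of this sketch.

```python
def compress_stream(words, n_cols, encode_word, dummy=0):
    """One-pass loop over the input words.

    memory plays the role of the registered data string memory 3, and n the
    role of the registered data counter N (0-based here, 1-based in the text).
    The memory is seeded with dummy data, the second option described for S0.
    """
    memory = [dummy] * n_cols
    n = 0
    out = []
    for word in words:                       # S1: next input data string
        registered = memory[n]               # S2: same column, previous row
        out.append(encode_word(word, registered))  # S3 to S6
        memory[n] = word                     # S7: register the raw input
        n = 0 if n == n_cols - 1 else n + 1  # S8, then S9 or Sa
    return out


# A stand-in encoder that just records which reference each word was given.
pairs = compress_stream([1, 2, 3, 4, 5], 2, lambda w, r: (w, r))
print(pairs)
```

Because each word is handled as soon as it arrives and only one row of state is kept, no second pass over the data is needed.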
[0023]
As is apparent from the above description, in the data compression system according to the present embodiment, the compression algorithm execution unit 2 compares the input data string with the registered data string at the correlated position, outputs as compressed data the fixed-width code corresponding to the individual match results in the plurality of arbitrarily set data widths together with the bit strings of the non-matching parts, and registers the uncompressed input data string at the corresponding column position as the registered data string for the next comparison. With this configuration, even data that is correlated at every fixed data unit can be compressed efficiently and at high speed while the data is being input.
[0024]
<Data restoration system>
Next, the configuration of the data restoration system according to the present embodiment will be described with reference to FIGS. 5 and 6. As shown in FIG. 5, the data restoration system includes a data input unit 11 that inputs data including a compressed data string compressed by the data compression system of the above embodiment, a registered data string memory 13 that stores the registered data to be compared, divided into strings of a predetermined width, a restoration algorithm execution unit 12 that performs restoration processing on the input compressed data, and a data output unit 14 that outputs the restored data.
[0025]
Based on the fixed-width code contained in the compressed data string input from the data input unit 11, the restoration algorithm execution unit 12 uses the fixed-width code table 15 to find which of the arbitrarily divided data widths of the registered data string, stored at the column position correlated with the compressed data string, match the data string before compression. It extracts those matching parts from the registered data string, combines them with the bits of the compressed data string that follow the fixed-width code, and outputs the result to the data output unit 14 as a restored data string of the predetermined bit width. The restoration algorithm execution unit 12 also registers the restored data string at the corresponding column position in the registered data string memory 13 as the registered data for the next comparison.
[0026]
Although the data restoration system shown in FIG. 5 is described as a system separate from the data compression system shown in FIG. 1, some or all of its components, such as the data input unit 11, the registered data string memory 13, the fixed-width code table 15, and the data output unit 14, may be shared with the corresponding components of the data compression system shown in FIG. 1, such as the data input unit 1, the registered data string memory 3, the fixed-width code table 5, and the data output unit 4.
[0027]
Next, the data restoration process executed by the restoration algorithm execution unit 12 will be described with reference to the flowchart shown in FIG. 6. In step S10, as initialization, the restoration algorithm execution unit 12 resets the registered data counter, an internal counter, setting N to 1, and stores the registered data to be compared in the registered data string memory 13, divided into strings of the predetermined bit width.
[0028]
Here, if the first row of the input data was output uncompressed in step S0 of the data compression system described with reference to FIG. 2, then in step S10 the uncompressed first row is likewise divided into strings of the predetermined bit width and registered in the registered data string memory 13, and restoration starts from the second row of the compressed data. If instead predetermined dummy data was registered in the registered data string memory 3 in step S0 and compression started from the first row of the input data, the same predetermined dummy data is registered in the registered data string memory 13 in step S10, and restoration starts from the first row of the compressed data.
[0029]
In step S11, the restoration algorithm execution unit 12 acquires the N-th column input data string of the compressed data to be restored, input by the data input unit 11, and in the following step S12 it extracts the fixed-width code from that compressed data string. In step S13, referring to the fixed-width code table 15, it extracts from the registered data string stored in the N-th column of the registered data string memory 13 the data, among the plurality of widths set at compression time, that corresponds to the extracted fixed-width code. In step S14, based on the data widths of the data extracted in step S13, it extracts the non-matching data, that is, the data that was left uncompressed, from the input data string.
[0030]
In the following step S15, the restoration algorithm execution unit 12 restores the original data by combining the data extracted from the N-th column registered data string with the uncompressed data extracted from the input data string, and in step S16 it registers the restored data in the N-th column of the registered data string memory 13.
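Steps S12 to S15 for a single word can be sketched as below. As before this is an illustration under assumptions: the compressed data is modeled as a string of '0'/'1' characters with the 4-bit code first, fields are ordered most significant first, the name `decode_word` is hypothetical, and the four table entries marked as assumed are placeholders rather than values from FIG. 4.

```python
WIDTHS = (4, 8, 20)

FLAGS_BY_CODE = {
    0x1: (False, False, False),  # stated in the text
    0x4: (True, False, False),   # stated
    0x7: (True, True, False),    # stated
    0x8: (True, True, True),     # stated
    0x2: (False, False, True),   # ASSUMED placeholder
    0x3: (False, True, False),   # ASSUMED placeholder
    0x5: (True, False, True),    # ASSUMED placeholder
    0x6: (False, True, True),    # ASSUMED placeholder
}


def decode_word(bits, registered):
    """Rebuild one 32-bit word; return (word, number of bits consumed)."""
    flags = FLAGS_BY_CODE[int(bits[:4], 2)]  # S12: read the fixed-width code
    pos, word, shift = 4, 0, 32
    for width, matched in zip(WIDTHS, flags):
        shift -= width
        if matched:
            # S13: take the field from the registered data string
            value = (registered >> shift) & ((1 << width) - 1)
        else:
            # S14: take the field from the bits that follow the code
            value = int(bits[pos:pos + width], 2)
            pos += width
        word |= value << shift  # S15: combine the fields
    return word, pos


print(decode_word("1000", 0x12345678) == (0x12345678, 4))
```

Returning the number of consumed bits matters because compressed words have variable length; the caller needs it to find where the next code starts.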
[0031]
In the following step S17, the restoration algorithm execution unit 12 determines whether the value of the registered data counter N equals the maximum address value n (the maximum number of columns n) of the registered data string memory 13. If the value of the registered data counter N is not the maximum address value n, the restoration algorithm execution unit 12 adds 1 to the registered data counter N in step S18, returns to step S11, and proceeds to the next input data string. If the value is the maximum address value n, it resets the registered data counter N to 1 in step S19, returns to step S11, and proceeds to the next first-column input data string.
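Because the restorer registers each restored word exactly as the compressor registered each input word, the two memories stay in step, which is what makes restoration possible without side information. The round-trip sketch below illustrates this under the same assumptions as elsewhere in these examples: bit strings of '0'/'1' characters, dummy data of zero in both memories, most-significant-first fields, and placeholder values for the table entries that the text does not state.

```python
WIDTHS = (4, 8, 20)
SHIFTS = (28, 20, 0)
CODES = {
    (False, False, False): 0x1, (True, False, False): 0x4,  # stated
    (True, True, False): 0x7, (True, True, True): 0x8,      # stated
    (False, False, True): 0x2, (False, True, False): 0x3,   # ASSUMED
    (True, False, True): 0x5, (False, True, True): 0x6,     # ASSUMED
}
FLAGS = {code: flags for flags, code in CODES.items()}


def fields(word):
    return [(word >> s) & ((1 << w) - 1) for s, w in zip(SHIFTS, WIDTHS)]


def compress(words, n_cols):
    memory, bits = [0] * n_cols, []
    for i, word in enumerate(words):
        n = i % n_cols  # wrapping column counter
        f = fields(word)
        flags = tuple(a == b for a, b in zip(f, fields(memory[n])))
        bits.append(format(CODES[flags], "04b"))
        bits.extend(format(v, "0{}b".format(w))
                    for v, w, m in zip(f, WIDTHS, flags) if not m)
        memory[n] = word
    return "".join(bits)


def decompress(bits, n_cols):
    memory, out, pos = [0] * n_cols, [], 0
    while pos < len(bits):
        n = len(out) % n_cols                 # S17 to S19: wrapping counter
        flags = FLAGS[int(bits[pos:pos + 4], 2)]
        pos += 4
        word = 0
        for s, w, m in zip(SHIFTS, WIDTHS, flags):
            if m:
                value = (memory[n] >> s) & ((1 << w) - 1)
            else:
                value = int(bits[pos:pos + w], 2)
                pos += w
            word |= value << s
        out.append(word)
        memory[n] = word                      # S16: register restored word
    return out


data = [0x12345678, 0xA0000001, 0x12345678, 0xA0000002, 0x12345679, 0xA0000002]
packed = compress(data, n_cols=2)
print(decompress(packed, n_cols=2) == data, len(packed), 32 * len(data))
```

The restorer needs to know only the column count, the field widths, the code table, and the initial memory contents, all of which are fixed in advance in this sketch.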
[0032]
As is apparent from the above description, in the data restoration system according to the present embodiment, the restoration algorithm execution unit 12 restores the encoded part of the compressed data by referring to the fixed-width code contained in the compressed data string, and combines the restored data with the non-encoded part of the input data, thereby restoring the target data sequentially. With this configuration, data compressed by the data compression system of the present embodiment can be restored efficiently and at high speed.
[0033]
Although the embodiments of the present invention have been described in detail, the present invention can be implemented in various other forms without departing from the spirit or main features. For example, in the present embodiment, data having a data structure in which 32-bit data is arranged in eight columns in one row and has a correlation (similarity) between adjacent rows for each column is used as a model. Although described, the data structure of the data to be compressed is not limited to this. The number of columns per row may be less than eight columns or eight or more columns, and the data length of one column may be less than 32 bits or more than 32 bits.
[0034]
In addition, although the example shown defines the fixed-width code patterns by the matches in data widths of 4 bits, 8 bits, and 20 bits, the data may instead be divided into widths of, for example, 8 bits, 8 bits, and 16 bits, with the patterns defined by the matches in each of those widths, and various values are conceivable for the fixed-width code values themselves.
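The variation described above only changes where the word is cut, so the splitting step can be written once with the widths as a parameter. The sketch below is illustrative: the names `split_word` and `pattern_count`, the most-significant-first order, and the check that the widths add up to the word size are assumptions of this example.

```python
def split_word(word, widths, total_bits=32):
    """Cut a word into fields of the given widths, most significant first."""
    if sum(widths) != total_bits:
        raise ValueError("field widths must add up to the word size")
    out = []
    shift = total_bits
    for w in widths:
        shift -= w
        out.append((word >> shift) & ((1 << w) - 1))
    return out


def pattern_count(widths):
    """Number of distinct match/mismatch patterns the code must distinguish."""
    return 2 ** len(widths)


print([hex(v) for v in split_word(0x12345678, (4, 8, 20))])
print([hex(v) for v in split_word(0x12345678, (8, 8, 16))])
print(pattern_count((8, 8, 16)))
```

With three fields there are eight patterns, which fit in a 4-bit code; a 4-bit code could in principle distinguish up to sixteen patterns, that is, up to four fields.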
[0035]
[Effects of the invention]
As described above, according to the present invention, an input data string is compared with the data string at the position correlated with it, the matching positions are encoded into a code having a fixed bit width, and that code is combined with the data strings at the non-matching positions to form the compressed data. Data that is correlated at every fixed data unit can therefore be compressed and restored efficiently.
[Brief description of the drawings]
FIG. 1 is a block diagram showing one embodiment of a data compression system according to the present invention.
FIG. 2 is a flowchart for explaining a compression algorithm of a compression algorithm execution unit in FIG. 1.
FIG. 3 is an explanatory diagram illustrating a compression algorithm of a compression algorithm execution unit in FIG. 1.
FIG. 4 is an explanatory diagram showing an encoding example.
FIG. 5 is a block diagram showing one embodiment of a data restoration system according to the present invention.
FIG. 6 is a flowchart for explaining a restoration algorithm of a restoration algorithm execution unit in FIG. 5.
FIG. 7 is a configuration diagram of data having a correlation every fixed period.
[Explanation of symbols]
1, 11 Data input unit
2 Compression algorithm execution unit
3, 13 Registered data string memory
4, 14 Data output unit
5, 15 Fixed-width code table
12 Restoration algorithm execution unit

Claims

1. A data compression system comprising:
means for dividing the data to be compressed, among the input data, into input data strings of a predetermined compression-unit bit count;
storage means for storing, as a registered data string, a previously input data string with which the input data string is to be compared; and
compression means for comparing each divided data string, obtained by further dividing the input data string of the predetermined compression-unit bit count into arbitrary bit widths, with the divided data string stored in the storage means at the position correlated with it, encoding the matching positions into a code having a fixed bit width, combining that code with the divided data strings at the non-matching positions to form compressed data, and registering the input data string in the storage means as the next registered data string.

2. A data restoration system for restoring data compressed by the data compression system according to claim 1, comprising:
storage means for storing the previous restored data string as a registered data string; and
restoration means for extracting, based on the encoded data in a compressed data string compressed by the data compression system, the divided restored data strings stored in the storage means, combining the extracted divided restored data strings with the divided data strings at the non-matching positions to generate a restored data string, and registering the generated restored data string in the storage means as the next registered data string.