JP2001092627A - Method for compressing data

Info

Publication number: JP2001092627A
Application number: JP26535599A
Authority: JP
Inventors: Akira Saito; 明齋藤
Original assignee: Toshiba TEC Corp
Current assignee: Toshiba TEC Corp
Priority date: 1999-09-20
Filing date: 1999-09-20
Publication date: 2001-04-06

Abstract

PROBLEM TO BE SOLVED: To provide a data compression method that shortens the processing time when the memory image of a personal computer or the like is compressed and stored in a storage device such as an HDD. SOLUTION: When a data stream stored in a main storage part is compressed and the resulting code stream is stored in an auxiliary storage part, and the word length of the arithmetic processing part is 4 bytes while the processing unit length of the compression processing is 1 byte, the shortest offset code is assigned to the offset at a distance of (word length of the arithmetic processing part) ÷ (processing unit length of the compression processing) = 4.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to a data compression method that uses compression based on a dictionary-based scheme typified by LZ77, and in particular compresses the entire contents of the main memory of a personal computer or the like (hereinafter, the memory image) and stores them in a storage device such as an HDD (hard disk drive).

[0002]

2. Description of the Related Art A dictionary-based method of compressing data is described in the paper "A Universal Algorithm for Sequential Data Compression", published by Abraham Lempel and Jacob Ziv in 1977 in IEEE Transactions on Information Theory. It is commonly called the sliding-dictionary method of Lempel-Ziv coding, or the LZ77 method. See also, for example, Seiji Munakata, "The Ziv-Lempel data compression method", Information Processing (Joho Shori), Vol. 26, No. 1 (1985).

[0003] The LZ77 algorithm divides the data to be encoded into the longest sequences that match a sequence starting at some position in the past data, and encodes each one as a copy of that past sequence. Specifically, as shown in FIG. 4, it uses a moving window that holds already-encoded input data and a look-ahead buffer that holds the data to be encoded next. The data sequence in the look-ahead buffer is compared against every subsequence of the data in the moving window to find the longest matching subsequence in the window. To identify that subsequence, the encoder emits the triple of "the start position of the longest subsequence", "the match length", and "the next symbol, which caused the mismatch". The encoded data is then moved from the look-ahead buffer into the moving window, and the same amount of new data is read into the look-ahead buffer. Repeating this process breaks the data into subsequences and encodes them one by one.
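The procedure in paragraph [0003] can be sketched in a few lines of Python. This is only an illustrative sketch, not the encoder of the cited paper or of this patent: the names `lz77_encode` and `lz77_decode`, the default window size of 2047, and the maximum match length of 256 are assumptions chosen to echo figures mentioned elsewhere in this description, and the triples are kept as Python tuples instead of being packed into bits.

```python
from typing import List, Tuple

# (offset back into the window, match length, next symbol that broke the match)
Token = Tuple[int, int, int]

def lz77_encode(data: bytes, window: int = 2047, max_len: int = 256) -> List[Token]:
    """Greedy LZ77 sketch: emit (offset, length, next symbol) triples."""
    tokens: List[Token] = []
    pos = 0
    while pos < len(data):
        best_off, best_len = 0, 0
        # Always keep one symbol in reserve to act as the "next symbol".
        limit = min(max_len, len(data) - pos - 1)
        # Compare the look-ahead data with every offset in the moving window.
        for off in range(1, min(window, pos) + 1):
            length = 0
            while length < limit and data[pos + length] == data[pos - off + length]:
                length += 1
            if length > best_len:
                best_off, best_len = off, length
        tokens.append((best_off, best_len, data[pos + best_len]))
        pos += best_len + 1  # slide the window past the encoded symbols
    return tokens

def lz77_decode(tokens: List[Token]) -> bytes:
    """Rebuild the data by copying from the already-decoded output."""
    out = bytearray()
    for off, length, sym in tokens:
        for _ in range(length):
            out.append(out[-off])  # byte-by-byte, so overlapping copies work
        out.append(sym)
    return bytes(out)
```

For example, `lz77_encode(b"aaaa")` yields one literal-only triple followed by one triple that copies two symbols from offset 1.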

[0004] Many improved variants of this coding have been proposed. For example, one approach adds a flag that identifies whether an item is an encoded code or raw data, and outputs the raw data whenever the encoded code would be longer than the raw data. This is known as the LZSS coding scheme (T. C. Bell, "Better OPM/L Text Compression", IEEE Transactions on Communications, Vol. COM-34, No. 12, Dec. 1986). Another reference is M. Nelson, Data Compression Handbook, revised 2nd edition, Toppan (1996), ISBN 4-8101-8605-9.

[0005]

Problems to Be Solved by the Invention: The operating systems (OS) and applications installed on recent personal computers grow more sophisticated every year and at the same time demand large amounts of memory. This growth in the size and scale of the OS and applications lengthens the waiting time at power-on and power-off. Users need to be able to interrupt their work, switch the power off, and later resume instantly in the state where they left off. One way to achieve this is hibernation processing: at suspension, the entire contents of the main memory (the memory image) are stored on an HDD or the like and the power is turned off; at resumption, the memory image stored on the HDD is loaded back into the main memory, which reproduces the state before suspension and shortens the waiting time at resumption.

[0006] However, it is not unusual for recent personal computers to use a large memory of 64 MB or 128 MB. If such a large memory image is stored on the HDD as is, the store takes several seconds, because the sustained transfer rate of an HDD is only about 16 MB/s or 32 MB/s. It is therefore conceivable to shorten the storage time by compressing the memory image before storing it. LZ77-family compression schemes, whose compression processing is relatively light and does not depend much on the characteristics of the input data, are suited to this use, but simply applying a standard LZ77 scheme has many drawbacks.

[0007] For example, in a memory image, which is the input data here, data that repeats with the CPU word length as its unit is likely to occur frequently, yet existing LZ77-family data compression schemes do not have a code structure that exploits this property of the memory image. They also have drawbacks such as comparing all the way out to distant offsets where a match is unlikely.

[0008] SUMMARY OF THE INVENTION The present invention has been made in view of the above circumstances, and its object is to provide a data compression method that can shorten the processing time when the memory image of a personal computer or the like is compressed and stored in a storage device such as an HDD.

[0009]

Means for Solving the Problems: To solve the above problems and achieve the object, the data compression method of the present invention is configured as follows.

[0010] (1) In the data compression method of the present invention, when an input data stream is compressed and a code stream is output, specifically when the entire contents of the main memory of a personal computer or the like (hereinafter, the memory image) are compressed and stored in a storage device such as an HDD, the shortest offset code is assigned to the offset at a distance of (CPU word length ÷ processing unit length of the compression processing (= symbol length)).

[0011] (2) In the data compression method of the present invention, when an input data stream is compressed and a code stream is output, specifically when the entire contents of the main memory of a personal computer or the like (hereinafter, the memory image) are compressed and stored in a storage device such as an HDD, if the matched offset is a multiple of (CPU word length ÷ processing unit length of the compression processing), a match-length code is generated by splitting it into three codes: a match code from the match start position to the first word boundary within the match, a match code from the first word boundary to the last word boundary within the match, and a match code from the last word boundary within the match to the match end position.

[0012] (3) In the data compression method of the present invention, when an input data stream is compressed and a code stream is output, specifically when the entire contents of the main memory of a personal computer or the like (hereinafter, the memory image) are compressed and stored in a storage device such as an HDD, for the part of the match-length code that is allocated to long match lengths, match-length codes are assigned only to match lengths that are multiples of (CPU word length ÷ processing unit length of the compression processing).

[0013] (4) In the data compression method of the present invention, when an input data stream is compressed and a code stream is output, specifically when the entire contents of the main memory of a personal computer or the like (hereinafter, the memory image) are compressed and stored in a storage device such as an HDD, for the part of the match-length code that is allocated to long match lengths, match-length codes are assigned only to match lengths that are powers of two.

[0014] (5) In the data compression method of the present invention, when an input data stream is compressed and a code stream is output, specifically when the entire contents of the main memory of a personal computer or the like (hereinafter, the memory image) are compressed and stored in a storage device such as an HDD, the number of reference points used during compression can be set variably by an external instruction, the decompression side can decompress data produced with the maximum number of reference points that can be set on the compression side, information on the processing time and HDD access performance of the previous compression is kept in a file, and the number of reference points is determined according to that information.

[0015] (6) In the data compression method of (5), if the HDD was the storage bottleneck during the previous store, the number of reference points is increased the next time to improve the compression ratio; conversely, if the HDD was idle and the CPU compression processing was the bottleneck during the previous store, the number of reference points is reduced the next time.

[0016] (7) In the data compression method of the present invention, when an input data stream is compressed and a code stream is output, specifically when the entire contents of the main memory of a personal computer or the like (hereinafter, the memory image) are compressed and stored in a storage device such as an HDD, the number of reference points used during compression can be set variably by an external instruction, the decompression side can decompress data produced with the maximum number of reference points that can be set on the compression side, the compression ratio information of the previous compression is kept in a file, and the number of reference points is determined according to that information.

[0017] (8) In the data compression method of (7), if the previous compression ratio is equal to or below a threshold, the number of reference points is increased the next time; if the previous compression ratio is equal to or above the threshold, the number of reference points is reduced the next time.

[0018] (9) In the data compression method of the present invention, when an input data stream is compressed and a code stream is output, specifically when the entire contents of the main memory of a personal computer or the like (hereinafter, the memory image) are compressed and stored in a storage device such as an HDD, the number of reference points used during compression can be set variably by an external instruction, the decompression side can decompress data produced with the maximum number of reference points that can be set on the compression side, and the compression ratio information of the previous compression is kept in a file. When no user job has run for a certain period, the compression ratio is estimated in the background and the previous compression ratio information is updated, and the number of reference points is determined according to either the previous information or the information updated in the background.

[0019] (10) In the data compression method of the present invention, when an input data stream is compressed and a code stream is output, specifically when the entire contents of the main memory of a personal computer or the like (hereinafter, the memory image) are compressed and stored in a storage device such as an HDD, if no user job has run for a certain period, the memory image is compressed in the background and stored on the HDD, and if hibernation processing is then performed with the system still in that state, the compression of the memory image is skipped.

[0020] The invention described above provides the following effects.

[0021] (1) When an existing LZ77-family data compression scheme such as LZS (described, for example, in M. Nelson, Data Compression Handbook, revised 2nd edition, Toppan (1996)) is applied to hibernation on a personal computer, data that repeats with the CPU word length as its unit is likely to occur frequently. Existing LZ77-family schemes, however, do not have a code structure that exploits this property of the memory image, and they compare all the way out to distant offsets, up to 2047, where a match is unlikely. The present invention focuses on the nature of the data area and the stack area of the memory image and assigns the shortest offset code to the offset at a distance of (word length ÷ symbol length), which improves the compression ratio and in turn shortens the compression processing time. For example, a 32-bit CPU has a word length of 4 bytes; if the LZ77 processing unit (symbol length) is 1 byte, then (word length ÷ symbol length) = 4. In this case, the shortest offset code is assigned to offset 4.
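To make effect (1) concrete, the following Python sketch shows one possible prefix code in which offset 4 receives the shortest codeword. The code layout is purely hypothetical and is not the offset code defined in the patent's figures: it assumes a 1-bit codeword `0` for the favoured offset and a `1` prefix followed by an 11-bit fixed-length field for every other offset from 1 to 2047. The names `encode_offset` and `decode_offset` are likewise illustrative.

```python
from typing import Tuple

WORD_LEN = 4      # bytes in a word on a 32-bit CPU
SYMBOL_LEN = 1    # bytes per LZ77 processing unit (symbol)
FAVOURED = WORD_LEN // SYMBOL_LEN   # offset that gets the shortest code (= 4)
OFFSET_BITS = 11  # wide enough for offsets 1..2047 (assumed window size)

def encode_offset(offset: int) -> str:
    """Return the codeword for `offset` as a string of '0'/'1' characters."""
    if not 1 <= offset < (1 << OFFSET_BITS):
        raise ValueError("offset out of range")
    if offset == FAVOURED:
        return "0"                                   # shortest code: 1 bit
    return "1" + format(offset, f"0{OFFSET_BITS}b")  # escape bit + 11-bit offset

def decode_offset(bits: str) -> Tuple[int, int]:
    """Decode one offset from the front of `bits`; return (offset, bits used)."""
    if bits[0] == "0":
        return FAVOURED, 1
    return int(bits[1:1 + OFFSET_BITS], 2), 1 + OFFSET_BITS
```

With this layout a match at offset 4 costs 1 bit for the offset and any other offset costs 12 bits, so the scheme only pays off if offset 4 really is the most frequent match distance, which is the property of memory images that the paragraph above relies on.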

[0022] (2) As described in (1) above, a memory image is characterized by a high likelihood of data that repeats with the CPU word length as its unit. Therefore, when the matched offset is a multiple of 4, a given match is encoded by splitting it into three codes: a match code from the match start position to the first word boundary within the match, a match code from the first word boundary to the last word boundary within the match, and a match code from the last word boundary within the match to the match end position. As a result, during decompression the portion covered by "the match code from the first word boundary to the last word boundary within the match" can be copied word by word, which speeds up decompression.
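The three-way split in effect (2) amounts to simple boundary arithmetic, sketched below in Python. The function name `split_match`, the use of absolute symbol positions, and the rule of leaving a match unsplit when it contains no complete word are assumptions made for this illustration; the patent text does not spell out those details.

```python
from typing import List, Tuple

def split_match(start: int, length: int, word: int = 4) -> List[Tuple[int, int]]:
    """Split a match into (position, length) segments at word boundaries.

    `start` is the absolute position of the first matched symbol. The result
    is head / word-aligned middle / tail, with empty segments dropped.
    """
    end = start + length
    first = -(-start // word) * word  # first word boundary at or after start
    last = (end // word) * word       # last word boundary at or before end
    if first >= last:
        return [(start, length)]      # no complete word inside: keep one code
    parts = [
        (start, first - start),       # match start -> first word boundary
        (first, last - first),        # first -> last word boundary (aligned)
        (last, end - last),           # last word boundary -> match end
    ]
    return [(pos, n) for pos, n in parts if n > 0]
```

For a match of 13 symbols starting at position 6, this gives segments of 2, 8, and 3 symbols. Because the offset is a multiple of the word length, the source of the 8-symbol middle segment is word-aligned whenever its destination is, which is what allows a word-by-word copy.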

[0023] (3) A memory image is likely to contain frequent data that repeats with the CPU word length as its unit. Therefore, for the part of the match-length code that is allocated to long match lengths, codes are assigned only to match lengths that are multiples of 4. This simplifies the match-length code table and speeds up processing without degrading the compression ratio much. For example, codes are assigned to every match length from 1 to 32, but beyond that only to match lengths that are multiples of 4, such as 36, 40, 44, ..., 256.

[0024] (4) A memory image is likely to contain frequent data that repeats with the CPU word length as its unit. Therefore, for the part of the match-length code that is allocated to long match lengths, codes are assigned only to match lengths that are powers of two. This simplifies the match-length code table and speeds up processing without degrading the compression ratio much. For example, codes are assigned to every match length from 1 to 32, but beyond that only to match lengths that are powers of two, such as 64, 128, 256, 512, and 1024.
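The two sparse match-length tables of effects (3) and (4) can be generated and compared with a short Python sketch. The table contents follow the examples in the text (every length up to 32, then multiples of 4 up to 256, or powers of two up to 1024). The helper `quantize`, which picks the largest representable length not exceeding the measured match, is an assumption about how an encoder might use such a table; the symbols left over would have to be encoded separately.

```python
import bisect
from typing import List

def table_multiples_of_4(dense_upto: int = 32, max_len: int = 256) -> List[int]:
    """Lengths 1..32, then 36, 40, 44, ..., 256 (effect (3))."""
    return list(range(1, dense_upto + 1)) + list(range(dense_upto + 4, max_len + 1, 4))

def table_powers_of_2(dense_upto: int = 32, max_len: int = 1024) -> List[int]:
    """Lengths 1..32, then 64, 128, 256, 512, 1024 (effect (4))."""
    lengths = list(range(1, dense_upto + 1))
    n = dense_upto * 2
    while n <= max_len:
        lengths.append(n)
        n *= 2
    return lengths

def quantize(length: int, table: List[int]) -> int:
    """Largest representable match length that does not exceed `length` (>= 1)."""
    return table[bisect.bisect_right(table, length) - 1]
```

Under these assumptions the multiples-of-4 table has 88 entries and the powers-of-two table has 37, compared with 256 entries for a table that codes every length up to 256. A measured match of 100 symbols is kept whole by the first table and shortened to 64 by the second.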

[0025] (5) In general, increasing the number of reference points during compression (the number of offsets searched) improves the compression ratio and so reduces the amount of data to be stored, but it also increases the load of the compression processing on the CPU. Because the apparent processing time is governed by the difference between the time needed to store the compressed data on the HDD and the CPU processing time, it is important to balance the two. The number of reference points is therefore made variable so that it can be raised or lowered by an external instruction, and the decompression side is designed so that it can decompress data produced with the maximum number of reference points. The optimum number of reference points is then predicted from the CPU processing performance and the HDD access performance. Information on the previous processing time and HDD access performance is kept in a file and used as reference information. If the HDD was the storage bottleneck during the previous store, the number of reference points is increased the next time to improve the compression ratio. Conversely, if the HDD was idle and the CPU compression processing was the bottleneck during the previous store, the number of reference points is reduced the next time. In this way, the HDD storage time and the CPU compression time are balanced, and the overall processing time is shortened.
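The feedback rule of effect (5) can be sketched as a small Python function. The doubling and halving step, the bounds of 1 and 2047, and the parameter names are all hypothetical; the text only says that the count is increased when the HDD was the bottleneck and decreased when the CPU was. Saving the previous timings to a file is omitted here.

```python
def next_reference_points(prev_points: int,
                          cpu_time: float,
                          hdd_time: float,
                          min_points: int = 1,
                          max_points: int = 2047,
                          step: int = 2) -> int:
    """Choose the number of offsets to search for the next hibernation.

    cpu_time: seconds the CPU spent compressing last time.
    hdd_time: seconds the HDD spent storing the compressed data last time.
    """
    if hdd_time > cpu_time:
        # HDD was the bottleneck: search more offsets to shrink the output.
        return min(max_points, prev_points * step)
    if cpu_time > hdd_time:
        # CPU was the bottleneck (HDD idle): search fewer offsets.
        return max(min_points, prev_points // step)
    return prev_points  # already balanced
```

Called once per hibernation with the timings recorded from the previous one, the rule moves the search effort toward the point where the compression time and the storage time are roughly equal.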

[0026] (6) The first time, the number of reference points is determined from a predicted compression ratio. From the second time onward, the previous compression ratio information is kept in a file, and this value is compared with a preset compression ratio threshold to determine the number of reference points. If the previous compression ratio is equal to or below the threshold, the number of reference points is increased the next time. Conversely, if the previous compression ratio is equal to or above the threshold, the number of reference points is reduced the next time. In this way, the HDD storage time and the CPU compression time are balanced, and the overall processing time is shortened.

[0027] (7) The schemes described in (5) and (6) above determine the number of reference points from the previous information, so they have the drawback that the compression ratio cannot be estimated accurately when a long time has passed since the previous hibernation processing. Therefore, when no user job has run for a certain period, the compression ratio is estimated in the background and the previous compression ratio information is updated. In this way, the number of reference points can be determined from information that is close to the actual compression ratio.

[0028] (8) When no user job has run for a certain period, the memory image is compressed in the background and stored on the HDD. If hibernation processing is then performed with the system still in that state, the compression of the memory image is skipped. This further shortens the hibernation processing time.

[0029]

DESCRIPTION OF THE PREFERRED EMBODIMENTS Embodiments of the present invention are described below with reference to the drawings.

[0030] As shown in FIG. 1, the personal computer includes a CPU 1, a main memory 2 composed of DRAM or the like, a host bridge 3 that connects the CPU, the bus, and the main memory, an HDD (IDE) controller 4, an HDD 5, and a PCI bus 6 serving as the main bus.

[0031] FIG. 1 shows hibernation processing performed without compressing the contents of the main memory. The case shown here has a 64 MB main memory. At suspension, the 64 MB of contents are stored on the HDD as they are. At resumption, the 64 MB of contents stored on the HDD are loaded into the main memory as they are.

[0032] As shown in FIG. 2, the personal computer includes a CPU 11, a main memory 12 composed of DRAM or the like, a host bridge 13 that connects the CPU, the bus, and the main memory, an HDD controller 14, an HDD 15, and a PCI bus 16 serving as the main bus.

[0033] FIG. 2 shows hibernation processing performed with the contents of the main memory compressed. Here the compression is performed by the CPU and not by dedicated compression/decompression hardware. The case shown is one in which the 64 MB main memory could be compressed to 20 MB. At suspension, the 64 MB of contents are compressed to 20 MB by software processing on the CPU, and the 20 MB of code data are stored on the HDD. At resumption, the code data stored on the HDD are read out, decompressed to the original 64 MB by software processing on the CPU, and loaded into the main memory.

[0034] FIG. 3 shows the benefit of compressing the contents of the main memory before storing them. When hibernation processing is performed without compression, the time to store 64 MB is required. When the contents are compressed before being stored, time for the compression processing is newly incurred, but the storage time becomes shorter, so the total processing time can be shorter than without compression, as shown in FIG. 3. Whether the time is actually shortened depends on the compression ratio, the compression processing time, the HDD transfer time, and so on.
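The trade-off described for FIG. 3 can be checked with a little arithmetic, sketched here in Python. The model is an assumption: it simply adds the compression time to the storage time, with no overlap between the two. The 64 MB image, the 20 MB compressed size, and the 16 MB/s HDD rate are the figures used in this description, while the compression throughputs of 32 MB/s and 16 MB/s are hypothetical values chosen to show both outcomes.

```python
from typing import Optional

def hibernate_time(image_mb: float,
                   hdd_mb_per_s: float,
                   compressed_mb: Optional[float] = None,
                   compress_mb_per_s: Optional[float] = None) -> float:
    """Seconds needed to save the memory image (sequential, no overlap)."""
    if compressed_mb is None or compress_mb_per_s is None:
        return image_mb / hdd_mb_per_s          # store the raw image
    compress_s = image_mb / compress_mb_per_s   # CPU compresses the whole image
    store_s = compressed_mb / hdd_mb_per_s      # HDD stores only the code data
    return compress_s + store_s

raw = hibernate_time(64, 16)                    # no compression
fast_cpu = hibernate_time(64, 16, 20, 32)       # hypothetical 32 MB/s compressor
slow_cpu = hibernate_time(64, 16, 20, 16)       # hypothetical 16 MB/s compressor
```

Under these assumptions the uncompressed store takes 4.0 s, the faster compressor brings the total down to 3.25 s, and the slower one raises it to 5.25 s, which illustrates why the gain depends on the compression ratio, the compression time, and the HDD transfer time together.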

[0035] FIG. 4 shows the principle of a general LZ77-based compression scheme. In LZ77-family compression, finding the longest matching partial data sequence in the moving window during encoding requires comparing the data sequence to be encoded against the data at every position in the moving window. That is, as shown in FIG. 4, the data sequence to be encoded (◇=====... in the figure) is compared with the data sequence starting at offset 1 in the moving window, the data sequence starting at offset 2, ..., and the data sequence starting at offset n (where n is the size of the moving window), to find the offset that gives the longest match. As for the size of the moving window, the LZS scheme, for example, searches offsets 1 to 2047.

[0036] LZ77-family compression schemes, whose compression processing is relatively light and does not depend much on the nature of the input data, are suited to compression for hibernation, but simply applying a standard LZ77 scheme has many drawbacks. For example, in a memory image, which is the input data here, data that repeats with the CPU word length as its unit is likely to occur frequently, yet existing LZ77-family data compression schemes do not have a code structure that exploits this property of the memory image. They also compare all the way out to distant offsets where a match is unlikely.

[0037] That is, applying a conventional LZ77 scheme to compression by software processing on the CPU gives the following. The LZS scheme is used as the example here.

[0038] Conventional method (a module that simply finds the longest match over 2047 comparison points)
{
  Find the match length between the data sequence at the encoding position and the data sequence at offset 1, and call the result len1.
  Find the match length between the data sequence at the encoding position and the data sequence at offset 2, and call the result len2.
  Find the match length between the data sequence at the encoding position and the data sequence at offset 3, and call the result len3.
  ...
  Find the match length between the data sequence at the encoding position and the data sequence at offset 2046, and call the result len2046.
  Find the match length between the data sequence at the encoding position and the data sequence at offset 2047, and call the result len2047.
  Return the maximum of len1, len2, ..., len2047 and the offset that gives this maximum match length.
}

FIG. 5 shows an example of the memory image of a personal computer equipped with a 32-bit CPU (i386 or later).

[0039] In a memory image, data that repeats with the CPU word length as its unit is likely to occur frequently in the data area, the stack area, and similar regions. The present invention focuses on the nature of the data area and the stack area of the memory image and assigns the shortest offset code to the offset at a distance of (word length ÷ symbol length), which improves the compression ratio and in turn shortens the compression processing time. For example, a 32-bit CPU has a word length of 4 bytes; if the LZ77 processing unit (symbol length) is 1 byte, then (word length ÷ symbol length) = 4. In this case, the shortest offset code is assigned to offset 4.

[0040] When the match length is obtained at each offset, the upper limit is fixed by the structure of the match length code, so even a longer match is truncated at that limit. For example, if the maximum length of the match length code is 256, the comparison is stopped as soon as 256 matching symbols have been detected, and the match length is taken to be 256.

[0041] This way of finding the maximum match length has the drawback that processing slows down when every offset yields a long match. For example, when a run of identical data is encoded, the comparison at every offset yields the longest possible match (for example, 256). Counting the comparison of one data item as one comparison, this takes 256 comparisons per offset, or 8192 comparisons in total (256 × 32 offsets). However, because 256 is the maximum match length, the search can be sped up by abandoning the comparison of the remaining offsets as soon as a match length of 256 has been obtained.
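The comparison counts in paragraph [0041] can be checked with a minimal sketch. It assumes 32 reference points (inferred from 8192 ÷ 256) and a match length cap of 256; the function names `match_length` and `search` are illustrative and do not come from the patent.

```python
MAX_MATCH = 256   # upper limit fixed by the match length code (value used in the text)
NUM_OFFSETS = 32  # assumed number of reference points (8192 / 256)


def match_length(data, pos, offset, counter):
    """Count matching symbols between data[pos:] and data[pos - offset:], capped at MAX_MATCH."""
    n = 0
    while n < MAX_MATCH and pos + n < len(data):
        counter[0] += 1  # one symbol comparison
        if data[pos + n] != data[pos + n - offset]:
            break
        n += 1
    return n


def search(data, pos, early_exit):
    """Return (best_offset, best_length, comparisons) over offsets 1..NUM_OFFSETS."""
    counter = [0]
    best_offset, best_length = 0, 0
    for offset in range(1, NUM_OFFSETS + 1):
        if offset > pos:
            break  # not enough history behind the encoding position
        length = match_length(data, pos, offset, counter)
        if length > best_length:
            best_offset, best_length = offset, length
        if early_exit and length == MAX_MATCH:
            break  # no other offset can beat the cap, so stop searching
    return best_offset, best_length, counter[0]


data = bytes(NUM_OFFSETS + MAX_MATCH)  # a run of identical (zero) bytes
pos = NUM_OFFSETS                      # encoding position with 32 bytes of history

print(search(data, pos, early_exit=False))  # exhaustive search
print(search(data, pos, early_exit=True))   # stop at the first capped match
```

On this run of identical bytes, the exhaustive search performs 8192 symbol comparisons, while the early-exit variant stops after the first offset reaches the cap.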

[0042] Normally, Huffman codes are used for both the offset code and the match length code, so short offset codes are assigned to the offsets at which long matches are likely to be obtained. For example, the offset code of the LZS scheme is configured as shown in FIG. 6. Another example of an offset code is shown in FIG. 7.

[0043] The present invention regards the offset equal to (CPU word length ÷ symbol length) as the most useful one. The shortest code is assigned to offset 4, and the next shortest codes are then assigned in the order 1, 2, 3, 5, 6, ..., 32. FIG. 8 shows a configuration example of the offset code of the present invention.

[0044] When the method of the present invention is applied to compression performed in software by the CPU, it works as follows.

[0045] Module for finding the longest match according to the present invention {
  Find the match length between the data string from the encoding position and the data string from offset 4; call the result len4.
  If len4 = 256, end the module with offset = 4 and match length = 256.
  Find the match length between the data string from the encoding position and the data string from offset 1; call the result len1.
  If len1 = 256, end the module with offset = 1 and match length = 256.
  Find the match length between the data string from the encoding position and the data string from offset 2; call the result len2.
  If len2 = 256, end the module with offset = 2 and match length = 256.
  …
  Find the match length between the data string from the encoding position and the data string from offset 32; call the result len32.
  If len32 = 256, end the module with offset = 32 and match length = 256.
  Return the maximum of len4, len1, len2, …, len32 and the offset that gives the maximum match length.
}
Next, the generation of codes that allow high-speed decompression is described with reference to FIG. 9.
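The module of paragraph [0045] could be sketched in Python as follows. This is only an illustration under stated assumptions: a 4-byte word, 1-byte symbols, 32 reference points and a cap of 256, with hypothetical names (`search_order`, `longest_match`) and a made-up sample buffer that repeats one 4-byte word, as a stack or data area might.

```python
WORD_LEN = 4      # bytes per CPU word (32-bit CPU, as in the text)
SYMBOL_LEN = 1    # bytes per LZ77 symbol (as in the text)
MAX_MATCH = 256   # cap imposed by the match length code
NUM_OFFSETS = 32  # number of reference points


def search_order():
    """Offsets in the order of [0045]: the word-sized offset first, then the rest ascending."""
    preferred = WORD_LEN // SYMBOL_LEN
    return [preferred] + [o for o in range(1, NUM_OFFSETS + 1) if o != preferred]


def match_length(data, pos, offset):
    """Length of the match between data[pos:] and data[pos - offset:], capped at MAX_MATCH."""
    n = 0
    while n < MAX_MATCH and pos + n < len(data) and data[pos + n] == data[pos + n - offset]:
        n += 1
    return n


def longest_match(data, pos):
    """Return (offset, length, offsets_tried), stopping at the first capped match."""
    best_offset, best_length, tried = 0, 0, 0
    for offset in search_order():
        if offset > pos:
            continue  # not enough history for this offset
        tried += 1
        length = match_length(data, pos, offset)
        if length == MAX_MATCH:
            return offset, length, tried  # module ends early
        if length > best_length:
            best_offset, best_length = offset, length
    return best_offset, best_length, tried


word_repeats = bytes([0x10, 0x20, 0x30, 0x40]) * 100  # hypothetical word-periodic data
print(longest_match(word_repeats, 32))
```

With word-periodic data the capped match is found at the very first offset tried, which is the case the ordering is designed for. Data with a different period still works, but costs a few extra offset trials.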

[0046] When the 11 bytes a to k shown in FIG. 9 match the data starting at ◇, a code of (offset 11, match length 11) would normally be generated.

[0047] On decompression, 11 bytes are copied from the position one offset (11 bytes in this case) earlier to produce the decompressed data. When one CPU word is 4 bytes, the portion c to j shown in FIG. 9 is aligned to word boundaries, so it can be copied word by word, which is faster than copying byte by byte.

[0048] The match is therefore split and encoded as
  (offset 11, match length 2) … the portion a, b
  (offset 11, match length 8) … the portion c to j
  (offset 11, match length 1) … the portion k
so that the portion c to j can be decompressed at high speed in word units.
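The split in paragraph [0048] can be expressed as a small helper. This is a sketch, not the patent's implementation: `split_match` and its parameters are hypothetical names, and the destination address 14 is an assumed value chosen so that byte a lies 2 bytes before a word boundary, as FIG. 9 implies.

```python
def split_match(dest_pos, offset, length, word_len=4):
    """Split one (offset, length) match into head / word-aligned body / tail codes.

    dest_pos is the byte address where the decompressed match will be written.
    """
    head = min(length, (-dest_pos) % word_len)      # bytes up to the first word boundary
    body = (length - head) // word_len * word_len   # whole words that can be copied word by word
    tail = length - head - body                     # leftover bytes after the last word boundary
    return [(offset, n) for n in (head, body, tail) if n > 0]


# Assumed layout: byte a sits at address 14, so c..j covers the two words at addresses 16..23.
print(split_match(dest_pos=14, offset=11, length=11))
```

Empty parts are dropped, so a match that already starts and ends on word boundaries stays a single code.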

[0049] Next, the match length code of the present invention is described.

[0050] FIG. 10 is a diagram showing an example of the configuration of an ordinary match length code. As shown in FIG. 10, short Huffman codes are assigned to short match lengths. In this example, a Huffman code consisting of 91 codewords is assigned, in order, to match lengths 1 to 91. With this arrangement, long matching data cannot be compressed efficiently. For example, an unused memory area may contain more than 1 KB of identical data, which produces long matches. If the upper limit of the match length is as small as in FIG. 10, the same match length code has to be generated over and over, which is inefficient.

[0051] In the present invention, as shown in FIG. 11, in the part of the match length code assigned to long match lengths, codes are assigned only to match lengths that are multiples of 4, so codes can be generated efficiently even for long matches. Multiples of 4 are given priority here because (CPU word length ÷ processing unit length of the compression) is 4.

[0052] In another example, shown in FIG. 12, in the part of the match length code assigned to long match lengths, codes are assigned only to match lengths that are powers of 2, so codes can again be generated efficiently even for long matches.
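To illustrate why a sparse set of long match lengths helps, the following sketch counts how many match length codes are needed to cover a 1024-symbol run. Only the conventional table (lengths 1 to 91) comes from the text. The other two tables are hypothetical stand-ins for FIG. 11 and FIG. 12, whose actual contents are not reproduced here: every length up to 16, then multiples of 4 up to 256, or powers of 2 up to 1024. The greedy splitting is likewise an assumption.

```python
def codable_lengths(short_max, long_max, step=None):
    """Match lengths that own a codeword: every length up to short_max, then only
    multiples of `step` (FIG. 11 style) or, if step is None, powers of two (FIG. 12 style)."""
    lengths = list(range(1, short_max + 1))
    if step is not None:
        first = (short_max // step + 1) * step
        lengths += list(range(first, long_max + 1, step))
    else:
        p = 1
        while p <= long_max:
            if p > short_max:
                lengths.append(p)
            p *= 2
    return lengths


def split_run(total, lengths):
    """Greedily cover a run of `total` matching symbols with codable match lengths."""
    out = []
    for n in sorted(lengths, reverse=True):
        while total >= n:
            out.append(n)
            total -= n
    return out


conventional = split_run(1024, range(1, 92))                  # lengths 1..91, as in FIG. 10
multiples_of_4 = split_run(1024, codable_lengths(16, 256, step=4))
powers_of_2 = split_run(1024, codable_lengths(16, 1024))

print(len(conventional), len(multiples_of_4), len(powers_of_2))
```

Under these assumed tables, the same run needs far fewer codes once long lengths are available, even though the table itself stays small.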

[0053] Next, control that makes the number of offsets compared during compression variable is described. In general, increasing the number of reference points during compression (the number of offsets searched) improves the compression ratio and so reduces the size of the data to be stored, but it also increases the load of the compression processing performed by the CPU. The apparent processing time is the slower of the time needed to store the compressed data on the HDD and the CPU processing time, so it is important to balance the two. In the following embodiment, therefore, the number of reference points used during compression can be set variably and can be increased or decreased by an external instruction. The decompression side is made able to decompress data compressed with the maximum number of reference points.

[0054] The first time, the number of reference points is determined from a predicted compression ratio. From the second time on, the compression ratio of the previous run is kept in a file and compared with a preset compression ratio threshold to determine the number of reference points. If the previous compression ratio was at or below the threshold, the number of reference points is increased the next time. If the previous compression ratio was at or above the threshold, the number of reference points is decreased the next time.

[0055] FIG. 13 is a flowchart showing the control that varies the number of offsets compared during compression. First, the HDD storage time and the CPU compression processing time saved at the previous hibernation are read (ST1). The HDD storage time is compared with a threshold (ST2), and if it exceeds the threshold (ST2, YES), the number of reference points N is increased by 1 (ST7). The CPU compression processing time is likewise compared with a threshold (ST3), and if it exceeds the threshold (ST3, YES), the number of reference points is decreased by 1 (ST6). The hibernation process is then performed (ST4), and the HDD transfer time and the CPU compression processing time are updated (ST5).
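The ST2/ST3 decisions of FIG. 13 amount to a small feedback rule, sketched below. Several details are assumptions rather than statements of the patent: the CPU check runs only when the HDD check fails, N is clamped to the range 1 to 32, times are in seconds, and the name `next_reference_points` is hypothetical.

```python
def next_reference_points(n, hdd_time, cpu_time, hdd_limit, cpu_limit, n_min=1, n_max=32):
    """Apply the ST2/ST3 decisions of FIG. 13 once and return N for the coming hibernation."""
    if hdd_time > hdd_limit:      # ST2 YES: storing to the HDD was the bottleneck
        n += 1                    # ST7: search more offsets to shrink the stored data
    elif cpu_time > cpu_limit:    # ST3 YES: compression on the CPU was the bottleneck
        n -= 1                    # ST6: search fewer offsets to lighten the CPU load
    return max(n_min, min(n_max, n))  # assumed clamp so N stays decodable


# Previous hibernation: 12 s of HDD storage, 3 s of CPU compression, both limits 10 s.
print(next_reference_points(8, hdd_time=12.0, cpu_time=3.0, hdd_limit=10.0, cpu_limit=10.0))
```

The upper clamp reflects the requirement that the decompression side must handle the maximum number of reference points the compression side can choose.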

[0056] In software compression, the number of offsets to be compared is controlled by the value of the number of reference points N. This balances the HDD storage time against the CPU compression processing time and shortens the overall processing time.

[0057] The method shown in the flowchart of FIG. 13 determines the number of reference points from the previous information, so it has the drawback that the compression ratio cannot be estimated accurately when a long time has passed since the previous hibernation process. Therefore, when no user job has been performed for a certain period, the compression ratio is estimated in the background and the previous compression ratio information is updated.

[0058] FIG. 14 is a flowchart showing the process of estimating the compression ratio in the background and updating the previous compression information. First, a timer function measures the time elapsed since the previous hibernation process (ST11). When a certain period has elapsed (ST11, YES), only the compression ratio estimation, without any HDD transfer, is performed in the background (ST12), and the compression ratio information is updated (ST13). In this way, the number of reference points can be determined from information close to the actual compression ratio.

[0059] As another example, when no user job has been performed for a certain period, the memory image is compressed in the background and stored on the HDD. If the hibernation process is then performed with no intervening activity, compression of the memory image is omitted.

[0060] FIG. 15 is a flowchart showing the process in which, when no user job has been performed for a certain period, the memory image is compressed in the background and stored on the HDD, and compression of the memory image is omitted if the hibernation process is then performed with no intervening activity.

[0061] First, a timer function measures the time elapsed since the user last operated the keyboard or mouse (ST21). When a certain period has elapsed (ST21, YES), the memory image is compressed in the background and stored on the HDD (ST22). If the hibernation process is requested immediately afterwards (ST23, request immediately after), compression of the memory image can be omitted (ST25). If the hibernation process is requested after further user operation (ST23, request after operation), the normal hibernation process is performed (ST24). This further shortens the time taken by the hibernation process.
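The flow of FIG. 15 can be modelled as a small state holder. This is a sketch only: the class `HibernationPlanner`, its method names and the returned strings are hypothetical, the actual compression and HDD write are stubbed out, and treating any user input as invalidating the stored image is an assumption about what "request after operation" implies.

```python
class HibernationPlanner:
    """Tracks idle time and whether a background-compressed memory image is still current."""

    def __init__(self, idle_limit):
        self.idle_limit = idle_limit  # seconds without keyboard or mouse input
        self.idle = 0.0
        self.image_ready = False

    def on_user_input(self):
        self.idle = 0.0
        self.image_ready = False      # memory may have changed, so the stored image is stale

    def on_tick(self, seconds):
        self.idle += seconds          # ST21: measure time since the last operation
        if self.idle >= self.idle_limit and not self.image_ready:
            self.image_ready = True   # ST22: compress and store in the background (stubbed)
            return "compress_in_background"
        return "wait"

    def on_hibernate_request(self):
        # ST23: choose between ST25 (skip compression) and ST24 (normal hibernation)
        return "skip_compression" if self.image_ready else "normal_hibernation"


planner = HibernationPlanner(idle_limit=60)
print(planner.on_tick(30), planner.on_tick(30), planner.on_hibernate_request())
```

A hibernation request that arrives while the image is still current skips the compression step. Any input in between falls back to the normal path.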

[0062]

[Effect of the Invention] According to the present invention, it is possible to provide a data compression method that improves the processing time when the memory image of a personal computer or the like is compressed and stored in a storage device such as an HDD.

[Brief description of the drawings]

FIG. 1 is a diagram showing the schematic configuration of a personal computer and how the hibernation process is performed without compressing the contents of the main memory.

FIG. 2 is a diagram showing the schematic configuration of a personal computer and how the hibernation process is performed with the contents of the main memory compressed.

FIG. 3 is a diagram for explaining the benefit of compressing the contents of the main memory before storing them.

FIG. 4 is a diagram for explaining the principle of a general LZ77-based compression scheme.

FIG. 5 is a diagram showing an example of the memory image of a personal computer equipped with a 32-bit CPU of the i386 family or later.

FIG. 6 is a diagram showing example 1 of an offset code (LZS scheme).

FIG. 7 is a diagram showing example 2 of an offset code.

FIG. 8 is a diagram showing an example of an offset code according to the present invention.

FIG. 9 is a diagram for explaining the generation of codes that allow high-speed decompression.

FIG. 10 is a diagram showing an example of the configuration of a match length code.

FIG. 11 is a diagram showing an example of the configuration of a match length code according to the present invention (the second half assigns codes only to multiples of 4).

FIG. 12 is a diagram showing an example of the configuration of a match length code according to the present invention (the second half assigns codes only to powers of 2).

FIG. 13 is a flowchart showing the control that varies the number of offsets compared during compression.

FIG. 14 is a flowchart showing the process of estimating the compression ratio in the background and updating the previous compression information.

FIG. 15 is a flowchart showing the process in which, when no user job has been performed for a certain period, the memory image is compressed in the background and stored on the HDD, and compression of the memory image is omitted if the hibernation process is then performed with no intervening activity.

[Explanation of symbols]

11 ... CPU; 12 ... main memory; 13 ... host bridge; 14 ... HDD controller; 15 ... HDD; 16 ... PCI bus

Claims

[Claims]

1. A data compression method for compressing a data stream stored in a main storage unit and storing the code stream obtained as a result of the compression in an auxiliary storage unit, wherein the shortest offset code is assigned to the offset separated by (word length of the arithmetic processing unit) ÷ (processing unit length of the compression processing).

2. A data compression method for compressing a data stream stored in a main storage unit and storing the code stream obtained as a result of the compression in an auxiliary storage unit, wherein, when the matched offset is a multiple of (word length of the arithmetic processing unit) ÷ (processing unit length of the compression processing) and a predetermined match length code is to be generated, the match is decomposed and encoded as a match code from the match start position to the first word boundary within the match length, a match code from the first word boundary within the match length to the last word boundary within the match length, and a match code from the last word boundary within the match length to the match end position.

3. A data compression method for compressing a data stream stored in a main storage unit and storing the code stream obtained as a result of the compression in an auxiliary storage unit, wherein, in the part of the match length code assigned to long match lengths exceeding a predetermined length, match length codes are assigned only to match lengths that are multiples of (word length of the arithmetic processing unit) ÷ (processing unit length of the compression processing).

4. A data compression method for compressing a data stream stored in a main storage unit and storing the code stream obtained as a result of the compression in an auxiliary storage unit, wherein, in the part of the match length code assigned to long match lengths exceeding a predetermined length, match length codes are assigned only to match lengths that are powers of two.

5. A data compression method for compressing a data stream stored in a main storage unit and storing the code stream obtained as a result of the compression in an auxiliary storage unit, wherein the number of reference points used during compression can be set variably by an external instruction, the decompression side can decompress data compressed with the maximum number of reference points that can be set on the compression side, information on the processing time of the previous compression and on the access performance of the auxiliary storage unit is retained in a file, and the number of reference points is determined according to this information.

6. The data compression method according to claim 5, wherein, when the storage processing of the auxiliary storage unit was the bottleneck at the previous storage, the reference points are increased at the next storage to improve the compression ratio, and when the compression processing was the bottleneck, the reference points are reduced at the next storage.

7. A data compression method for compressing a data stream stored in a main storage unit and storing the code stream obtained as a result of the compression in an auxiliary storage unit, wherein the number of reference points used during compression can be set variably by an external instruction, the decompression side can decompress data compressed with the maximum number of reference points that can be set on the compression side, the compression ratio information of the previous compression is retained in a file, and the number of reference points is determined according to this information.

8. The data compression method according to claim 7, wherein the reference points are increased the next time if the previous compression ratio is equal to or less than a threshold, and the reference points are decreased the next time if the previous compression ratio is higher than the threshold.

9. A data compression method for compressing a data stream stored in a main storage unit and storing the code stream obtained as a result of the compression in an auxiliary storage unit, wherein the number of reference points used during compression can be set variably by an external instruction, the decompression side can decompress data compressed with the maximum number of reference points that can be set on the compression side, the compression ratio information of the previous compression is retained in a file, the compression ratio is estimated in the background to update the previous compression ratio information, and the number of reference points is determined according to the compression ratio information of the previous compression or the compression ratio information updated in the background.

10. A data compression method for compressing a data stream stored in a main storage unit and storing the code stream obtained as a result of the compression in an auxiliary storage unit, wherein, when no user job is executed for a predetermined time, the memory image is compressed in the background and stored in the auxiliary storage unit, and when the hibernation process is performed directly afterwards, compression of the memory image is omitted.