JP5492103B2

JP5492103B2 - Backup apparatus, backup method, data compression method, backup program, and data compression program

Info

Publication number: JP5492103B2
Application number: JP2011012498A
Authority: JP
Inventors: 崇志熊谷; 和寛松下; 仁茂仲野谷; 典雄荒城
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2011-01-25
Filing date: 2011-01-25
Publication date: 2014-05-14
Anticipated expiration: 2031-01-25
Also published as: JP2012155428A

Description

The present invention relates to technologies such as a backup device, a backup method, and a data compression method that shorten the time required for data compression, decompression, and reliability checking.

In recent years, faster data recording and transmission has become indispensable because of the growing volume of programs that accompanies increasingly complex operating systems (OS) and systems installed on computers, and because of the growing capacity of hard disk drives (HDD). In addition to the increase in data volume, data itself has become more important, and the need for and effectiveness of backups are now widely recognized by individuals and companies alike.

When an enormous amount of data such as backup data is recorded on or transmitted to another storage medium, it is desirable to compress it into a smaller amount of data in view of the capacity of the storage medium and the efficiency of transmission. There are two kinds of data compression: lossless compression, in which the restored data is exactly identical to the original, and lossy compression, in which it is not. Lossless compression must be used except for data such as video and audio, whose meaning does not change completely even if some of the data is lost or altered.
Run-length encoding, Huffman coding, and the like are widely known as algorithms for lossless compression. In particular, Lempel-Ziv coding achieves a better compression ratio than run-length encoding or Huffman coding on ordinary data sets.

When data is compressed and decompressed with lossless compression, a reliability check is needed to confirm that the data before compression matches the data after decompression. One known reliability check is to calculate a checksum, which is one method of error detection. In this method, the data before compression is divided, each divided piece is regarded as a number, and the total is calculated. Checking whether the calculated value (the checksum) matches the checksum calculated from the data decompressed after compression makes it possible to detect errors introduced anywhere between compression and decompression.
Algorithms for calculating a checksum range from the simple method of adding all the data byte by byte to methods that use a hash function as a checksum in the broad sense.
A hash function generates fixed-length data from a given value, but it is a one-way function and cannot be inverted.
Therefore, the original cannot be reconstructed from the calculated value, and creating different data with the same hash value is also extremely difficult and practically impossible, which is why hash functions are used for error detection. MD5 (Message Digest 5) and SHA-1 (Secure Hash Algorithm-1) are known as methods for calculating a checksum with a hash function.
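As a minimal sketch of the two kinds of checksum described above, the following Python compares a byte-wise additive checksum with a hash-based one. The function names and the 32-bit wrap-around are illustrative assumptions and do not come from the patent; `hashlib` is the Python standard library module.

```python
import hashlib

def additive_checksum(data: bytes, bits: int = 32) -> int:
    """Sum every byte and keep the low `bits` bits (the width is an assumption)."""
    return sum(data) & ((1 << bits) - 1)

def hash_checksum(data: bytes, algorithm: str = "sha1") -> str:
    """Checksum in the broad sense, computed with a hash function."""
    return hashlib.new(algorithm, data).hexdigest()

original = b"backup me"
swapped = b"backup em"  # same bytes in a different order

# The additive checksum cannot see the reordering; the hash can.
same_sum = additive_checksum(original) == additive_checksum(swapped)
same_hash = hash_checksum(original) == hash_checksum(swapped)
```

The example shows one practical difference between the two approaches: a plain sum is insensitive to byte order, while a hash changes when any part of the input changes.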

In general, when the data on a storage medium is backed up with the compression techniques above, backing up all the data on the medium, including the OS, sector by sector rather than file by file requires reading sequentially from the beginning of the medium and compressing the data that is read. A checksum must also be created for the reliability check performed when the data is restored. The backup data therefore consists of two parts: the compressed data of the entire storage medium and the checksum.
At decompression time, the compressed data is decompressed and written from the beginning of the storage medium. The written data is then read back, a checksum is calculated, and it is compared with the checksum created at compression time to check whether the data was corrupted during compression, decompression, or writing (see, for example, Patent Document 1).

Patent Document 1: JP 2008-176420 A

As described above, when data compression, decompression, and reliability checking are performed, the increase in processing time that accompanies the growing capacity of storage media installed in computers cannot be avoided. In particular, when the data of the entire storage medium is processed, even storage areas that the OS is not using must be compressed. In general, a storage area that the OS is not using but that has been used once still holds an irregular sequence of data, so its compression ratio is no different from that of an area in use.

In addition, when the data of the entire storage medium is backed up, checksums must also be calculated for the unused storage areas. The compression ratio achieved on large-capacity storage media and the time taken by compression, decompression, and checksum calculation are therefore a problem.

The present invention has been made in view of the above problems, and its object is to provide a backup device, a backup method, a data compression method, a backup program, and a data compression program that shorten the time required for data compression, decompression, and reliability checking.

To solve the above problems, a backup device according to the present invention compresses and backs up the data in the storage area of a backup target storage device, and comprises: an unused storage area data clear unit that clears the data in the unused area of the storage area and treats the data in the storage area as cleared data; a first clear data determination unit that determines, for each piece of data obtained by subdividing the cleared data, whether the piece is cleared data; a first check code generation unit that, when the first clear data determination unit determines that the piece is cleared data, acquires a clear data check code, which is a check code of cleared data calculated in advance, and, when the first clear data determination unit determines that the piece is not cleared data, calculates a check code from the piece, thereby generating first check code information for the reliability check; a compressed data generation unit that, when the first clear data determination unit determines that the piece is cleared data, acquires clear data compressed data, which is a compression result of cleared data calculated in advance, and, when the first clear data determination unit determines that the piece is not cleared data, calculates a compression result from the piece, thereby generating compressed data; a compressed data decompression unit that decompresses the compressed data and generates decompressed data; a second clear data determination unit that determines, for each piece of data obtained by subdividing the decompressed data, whether the piece is cleared data; a second check code generation unit that, when the second clear data determination unit determines that the piece is cleared data, acquires the clear data check code calculated in advance, and, when the second clear data determination unit determines that the piece is not cleared data, calculates a check code from the piece, thereby generating second check code information for the reliability check; and a check code comparison unit that compares the first check code information with the second check code information.

According to the present invention, the time required for data compression, decompression, and reliability checking can be shortened.

FIG. 1 is a block diagram showing the configuration of the backup device in an embodiment of the present invention.
FIG. 2 is a diagram showing an example of the first checksum information in the embodiment of the present invention.
FIG. 3 is a flowchart showing the unused area data clear process performed by the unused area data clear unit of the backup device in the embodiment of the present invention.
FIG. 4 is a flowchart showing the first checksum information generation process performed by the first clear data determination unit and the first checksum generation unit of the backup device in the embodiment of the present invention.
FIG. 5 is a flowchart showing the compressed data generation process performed by the first clear data determination unit and the compressed data generation unit of the backup device in the embodiment of the present invention.
FIG. 6 is a flowchart showing the compressed data decompression process performed by the compressed data decompression unit, the second clear data determination unit, the second checksum generation unit, and the checksum comparison unit of the backup device in the embodiment of the present invention.

Embodiments of the present invention are described below with reference to FIGS. 1 to 6. In the figures, common parts are given the same reference numerals and duplicate descriptions are omitted.
As shown in FIG. 1, the backup device 1 backs up the data in the backup target storage device 2 by compressing that data and storing it in the backup storage device 3.
The backup device 1 also decompresses the compressed data stored in the backup storage device 3, records it again in the backup target storage device 2, and performs a reliability check.
The storage area of the backup target storage device 2 consists of a used area, which stores data that exists as files, and an unused area, which is so-called free space. The backup device 1 backs up the data of the entire storage area of the backup target storage device 2, including the unused area.
The backup target storage device 2 is a storage medium that stores the data to be backed up, for example an HDD (Hard Disk Drive) or an SSD (Solid State Drive). The backup target storage device 2 may be installed in a computer.
The backup storage device 3 is a storage medium that stores the backup data, and may be either outside or inside the backup device 1.

[Configuration of Backup Device 1]
As shown in FIG. 1, the backup device 1 includes an unused area data clear unit 11, a storage unit 12, a first clear data determination unit 13, a first checksum generation unit 14, a compressed data generation unit 15, a compressed data decompression unit 16, a second clear data determination unit 17, a second checksum generation unit 18, and a checksum comparison unit 19.

The unused area data clear unit 11 clears the data in the unused area of the backup target storage device 2.
Clearing the data in the unused area is explained here. Data compression generally tends to achieve a high compression ratio on regular sequences of data and a low compression ratio on random sequences of data. Even a storage area that the OS regards as unused still holds an irregular sequence of data if it has been used once. The unused area is therefore a storage area with a relatively low compression ratio even though it is unused. For this reason, a regular sequence of data is stored in the unused area to raise the compression ratio of the data in that area. Storing a regular sequence of data in the unused area in this way is referred to as clearing the data in the unused area.
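The difference in compressibility between regular and irregular data can be checked with a short sketch. It uses Python's standard `zlib` module (DEFLATE, a Lempel-Ziv-based format) as a stand-in for whatever compressor an implementation would use; the 64 KiB sample size and the use of `os.urandom` to imitate leftover data are assumptions made for illustration.

```python
import os
import zlib

SIZE = 64 * 1024  # sample size in bytes (arbitrary choice)

regular = bytes(SIZE)         # a run of zeros, like a cleared area
irregular = os.urandom(SIZE)  # stands in for leftover data in a once-used area

def compressed_size(data: bytes) -> int:
    """Size in bytes after compression at zlib's default level."""
    return len(zlib.compress(data))

regular_size = compressed_size(regular)
irregular_size = compressed_size(irregular)
```

Comparing `regular_size` and `irregular_size` with `SIZE` shows why clearing the unused area before compression pays off: the zero run shrinks to a small fraction of its size, while the random bytes barely shrink at all.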

The data in the unused area is cleared by, for example, creating in the unused area a binary file of a fixed size whose content is a sequence of zeros. Such files are created until they cover the entire unused area. All of these binary files are then deleted so that the area becomes unused again. The unused area is then left holding a sequence of zeros.
As a result, the unused area of the backup target storage device 2, which is the compression target, holds a regular sequence of data, so a better compression ratio is obtained than when the irregular sequence of data present before clearing is compressed.
The data of the entire storage area of the backup target storage device 2, comprising the unused area that has been cleared and the used area, is referred to as cleared data.

The storage unit 12 stores a clear data checksum 121 and clear data compressed data 122, both described below.
The clear data checksum 121 is the checksum of a regular sequence of data. For example, the clear data checksum 121 is the checksum of a fixed amount of zeros. The clear data checksum 121 can therefore be calculated in advance.
The clear data compressed data 122 is the compression result of a regular sequence of data. The clear data compressed data 122 can therefore also be calculated in advance.

The first clear data determination unit 13 acquires the data obtained by subdividing the cleared data and determines, for each piece, whether it is cleared data. The data obtained by subdividing the cleared data is the cleared data divided at intervals suitable for checksum calculation. Whether a piece is cleared data is determined by checking whether the piece is a regular sequence of data. For example, the first clear data determination unit 13 determines that a piece is cleared data when it consists solely of zeros. Consequently, even data within the cleared data that was originally stored in the used area is determined to be cleared data if, after subdivision, the piece consists solely of zeros.
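The determination described above can be sketched in a few lines of Python. The 512-byte piece size, the names `subdivide` and `is_cleared`, and the choice of zeros as the regular pattern are assumptions for illustration; the text only requires that the pieces be cut at a suitable interval and tested for a regular sequence.

```python
SECTOR_SIZE = 512  # bytes per subdivided piece (assumed)

def subdivide(data: bytes, size: int = SECTOR_SIZE):
    """Yield (start_address, piece) pairs covering the data in order."""
    for start in range(0, len(data), size):
        yield start, data[start:start + size]

def is_cleared(piece: bytes) -> bool:
    """Return True when the piece consists solely of zero bytes."""
    return not any(piece)

# One sector of real data, a used-area sector that happens to be all zeros,
# and one sector from the cleared unused area.
image = b"\x01" * SECTOR_SIZE + bytes(SECTOR_SIZE) + bytes(SECTOR_SIZE)
flags = [is_cleared(piece) for _, piece in subdivide(image)]
```

Note that the test looks only at the content of the piece, so the second sector is treated as cleared data even though it belongs to the used area, exactly as the paragraph above describes.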

When a piece of the subdivided cleared data is determined to be cleared data, the first checksum generation unit 14 acquires the clear data checksum 121 calculated in advance; when the piece is determined not to be cleared data, it calculates the checksum of that piece. Various conventional methods can be used to calculate the checksum; for example, the piece is regarded as a sequence of numbers and their total is calculated.

This increases the number of determinations that the first clear data determination unit 13 makes as to whether data is cleared data, but it greatly reduces the number of checksum calculations. Checksum calculation affects the total required time more than the determination does. The time required to calculate the checksums of the entire cleared data can therefore be shortened.

The first checksum generation unit 14 also appends the acquired clear data checksum 121 or the calculated checksum to the first checksum information 31, thereby generating the first checksum information 31.
The first checksum information 31 is described here with reference to FIG. 2.
As shown in FIG. 2, the first checksum information 31 records a checksum calculation start address 311 for each predetermined storage area section of the backup target storage device 2 for which a checksum was calculated (that is, for each piece of the subdivided cleared data). The predetermined storage area section is, for example, one sector. Checksums normally take different values, like the checksums indicated by reference numeral 312. In contrast, the checksums of pieces determined to be cleared data all take the same value (that is, the clear data checksum 121), like the checksums indicated by reference numeral 313.
In this specification and elsewhere, a checksum is included in the concept of a check code.
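A minimal sketch of how a table like the first checksum information 31 might be built is shown below. It assumes 512-byte pieces, zeros as the cleared pattern, an additive checksum wrapped to 32 bits, and a list of `(start_address, checksum)` tuples as the table layout; none of these details are fixed by the text, and all names are hypothetical.

```python
SECTOR_SIZE = 512  # assumed piece size

def checksum(piece: bytes) -> int:
    """Additive checksum: treat the bytes as numbers and total them (32-bit wrap assumed)."""
    return sum(piece) & 0xFFFFFFFF

# Calculated once in advance, playing the role of the clear data checksum 121.
CLEAR_DATA_CHECKSUM = checksum(bytes(SECTOR_SIZE))

def build_checksum_info(cleared_data: bytes):
    """Return [(start_address, checksum), ...], one entry per subdivided piece."""
    info = []
    for start in range(0, len(cleared_data), SECTOR_SIZE):
        piece = cleared_data[start:start + SECTOR_SIZE]
        if len(piece) == SECTOR_SIZE and not any(piece):
            value = CLEAR_DATA_CHECKSUM  # cleared data: reuse the stored value
        else:
            value = checksum(piece)      # not cleared data: calculate it
        info.append((start, value))
    return info

image = b"\x02" * SECTOR_SIZE + bytes(2 * SECTOR_SIZE) + b"\x03" * SECTOR_SIZE
first_checksum_info = build_checksum_info(image)
```

In the resulting table the two zero-filled sectors carry the same stored value, mirroring the repeated entries marked 313 in FIG. 2, while the other sectors carry individually calculated values like those marked 312.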

When a piece of the subdivided cleared data is determined to be cleared data, the compressed data generation unit 15 acquires the clear data compressed data 122 calculated in advance; when the piece is determined not to be cleared data, it compresses that piece to obtain compressed data. Various conventional methods can be used for the compression; for example, a compression algorithm such as Lempel-Ziv coding is used.

As a result, for data determined to be cleared data, the compression result is obtained without performing the time-consuming compression, so the time required for data compression can be shortened.

The compressed data generation unit 15 also appends the acquired clear data compressed data 122 or the calculated compressed data to the compressed data 32, thereby generating the compressed data 32.

The compressed data decompression unit 16 acquires the compressed data 32 and decompresses it. The decompressed data is referred to as decompressed data. The compressed data decompression unit 16 also records the decompressed data in the backup target storage device 2.
The second clear data determination unit 17 acquires the data obtained by subdividing the decompressed data and determines, for each piece, whether it is cleared data. The determination method is the same as that of the first clear data determination unit 13, so its description is omitted.

The second checksum generation unit 18 generates the second checksum information 21 in the same way as the first checksum generation unit 14. That is, when a piece of the subdivided decompressed data is determined to be cleared data, it acquires the clear data checksum 121 calculated in advance; when the piece is determined not to be cleared data, it calculates the checksum of that piece.
This shortens the time required to calculate the checksums of the entire decompressed data.
The calculation used for the checksum here is the same as the calculation used for the checksum at compression time. The decompressed data is subdivided into the same sections into which the cleared data was subdivided when the checksums were calculated at compression time.

The checksum comparison unit 19 determines whether the first checksum and the second checksum match. If they match, the data in the corresponding subdivided section has been restored without error. If they do not match, a problem is considered to have occurred during data compression or decompression, so the backup device 1 outputs an error to an output device (not shown).
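The comparison step can be sketched as follows. The table layout (a list of `(start_address, checksum)` tuples per piece) and the function name are assumptions for illustration; printing stands in for the output device mentioned in the text.

```python
def compare_checksum_info(first, second):
    """Return the start addresses whose checksums differ between the two tables.

    Both arguments are lists of (start_address, checksum) built over the
    same subdivision, as the text requires.
    """
    if len(first) != len(second):
        raise ValueError("checksum tables cover different numbers of pieces")
    mismatches = []
    for (addr_a, sum_a), (addr_b, sum_b) in zip(first, second):
        if addr_a != addr_b:
            raise ValueError("checksum tables use different subdivisions")
        if sum_a != sum_b:
            mismatches.append(addr_a)
    return mismatches

first = [(0, 1024), (512, 0), (1024, 77)]
second = [(0, 1024), (512, 0), (1024, 78)]  # last piece corrupted on restore
bad = compare_checksum_info(first, second)
if bad:
    print("error: checksum mismatch at addresses", bad)
```

Comparing section by section, rather than over the whole image at once, lets the report name the sections that failed to restore.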

The backup device 1 can be implemented on an ordinary computer equipped with a CPU (Central Processing Unit) and memory (not shown). In that case, the backup device 1 operates under a backup program and a data compression program that cause the computer to function as each of the functional units described above.
The first clear data determination unit 13 and the second clear data determination unit 17 can be implemented as the same module or the like. Likewise, the first checksum generation unit 14 and the second checksum generation unit 18 can be implemented as the same module or the like.

[Operation of Backup Device 1]
Next, the operation of the backup device 1 is described with reference to FIGS. 3 to 6 (and FIG. 1 for the configuration, as appropriate).
(Unused area data clear process)
This process clears the data in the unused storage area of the backup target storage device 2.

As shown in the flowchart of FIG. 3, in step S101 the unused area data clear unit 11 creates a file for clearing data in the unused storage area of the backup target storage device 2. As described above, this file is, for example, a binary file of a fixed size whose content is a sequence of zeros.
In step S102, the unused area data clear unit 11 determines whether the file was created successfully.

If the file was created successfully (step S102: Yes), the process returns to step S101 and the unused area data clear unit 11 continues creating files for clearing data.
If the file was not created successfully (step S102: No), the unused area data clear unit 11 judges that files have been created throughout the unused storage area, and in step S103 it deletes all the files created for clearing data.
Through the above steps, the unused area data clear unit 11 produces the cleared data in the backup target storage device 2.
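The loop of steps S101 to S103 can be sketched against a toy in-memory volume, since filling a real disk is not something to do casually. `ToyVolume`, its block accounting, and the file naming are all hypothetical stand-ins; on a real system the creation failure would come from the file system reporting that no space is left.

```python
class ToyVolume:
    """In-memory stand-in for a storage device with a fixed number of blocks."""

    def __init__(self, total_blocks: int, used_blocks: int):
        self.total_blocks = total_blocks
        self.used_blocks = used_blocks
        self.files = {}  # file name -> number of blocks it occupies

    def create_zero_file(self, name: str, blocks: int) -> None:
        """Create a zero-filled file, failing the way a full disk would."""
        if self.used_blocks + blocks > self.total_blocks:
            raise OSError("no space left on device")
        self.files[name] = blocks
        self.used_blocks += blocks

    def delete_file(self, name: str) -> None:
        self.used_blocks -= self.files.pop(name)


def clear_unused_area(volume: ToyVolume, blocks_per_file: int = 4) -> int:
    """Fill free space with zero files, then delete them. Returns files created."""
    created = []
    while True:
        name = f"clear_{len(created):06d}.bin"
        try:
            volume.create_zero_file(name, blocks_per_file)  # S101
        except OSError:                                     # S102: No
            break
        created.append(name)                                # S102: Yes, repeat
    for name in created:                                    # S103
        volume.delete_file(name)
    return len(created)


volume = ToyVolume(total_blocks=100, used_blocks=30)
files_created = clear_unused_area(volume)
```

With fixed-size files, free space smaller than one file is left untouched in this sketch; how an implementation treats that remainder is not specified in the text.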

(First checksum information generation process)
This process generates the first checksum information 31. The backup device 1 performs this process after clearing the data in the unused area of the backup target storage device 2.

As shown in the flowchart of FIG. 4, in step S201 the first clear data determination unit 13 acquires a piece of the subdivided cleared data.
In step S202, the first clear data determination unit 13 determines whether that piece is cleared data.

If the piece is determined to be cleared data (step S202: Yes), in step S203 the first checksum generation unit 14 acquires the clear data checksum 121 calculated in advance.
If the piece is determined not to be cleared data (step S202: No), in step S204 the first checksum generation unit 14 calculates a checksum from the piece.

In step S205, the first checksum generation unit 14 appends the clear data checksum 121 acquired in step S203 or the checksum calculated in step S204 to the first checksum information 31, thereby generating the first checksum information 31.
In step S206, the first checksum generation unit 14 determines whether the data in the entire storage area has been processed.

If the data in the entire storage area has not yet been processed (step S206: No), the process returns to step S201 and the backup device 1 continues processing.
If the data in the entire storage area has been processed (step S206: Yes), the backup device 1 ends the first checksum information generation process.

(Compressed data generation process)
This process generates the compressed data for backup. The backup device 1 can perform this process in parallel with the first checksum information generation process.

As shown in the flowchart of FIG. 5, in step S301 the first clear data determination unit 13 acquires a piece of the subdivided cleared data.
In step S302, the first clear data determination unit 13 determines whether that piece is cleared data.

If it is determined that the data is cleared data (Yes in step S302), in step S303, the compressed data generation unit 15 acquires the clear data compressed data 122 calculated in advance.
On the other hand, if it is determined that the data is not cleared data (No in step S302), in step S304, the compressed data generation unit 15 calculates compressed data by compressing the data obtained by subdividing the cleared data.

In step S305, the compressed data generation unit 15 adds the clear data compressed data 122 acquired in step S303, or the compressed data calculated in step S304, to the compressed data 32, thereby generating the compressed data 32.
In step S306, the compressed data generation unit 15 determines whether the processing has been performed on the data in all the storage areas.

If it is determined that the processing has not yet been performed on the data in all the storage areas (No in step S306), the process returns to step S301 and the backup device 1 continues processing.
On the other hand, if it is determined that the processing has been performed on the data in all the storage areas (Yes in step S306), the backup device 1 ends the compressed data generation process.
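Steps S301 to S306 can be sketched in the same way. The sketch below assumes zlib as the compression algorithm (the source states that any algorithm may be used), a 4096-byte block size, and all-zero cleared data; the function and constant names are hypothetical.

```python
import zlib

BLOCK_SIZE = 4096  # assumed subdivision size
CLEAR_BLOCK = bytes(BLOCK_SIZE)  # assumption: cleared data is all zero bytes
# Counterpart of the clear data compressed data 122, calculated once in advance.
CLEAR_DATA_COMPRESSED = zlib.compress(CLEAR_BLOCK)


def build_compressed_data(cleared_data: bytes) -> list[bytes]:
    """Return one compressed chunk per block, following steps S301 to S306."""
    chunks = []
    for offset in range(0, len(cleared_data), BLOCK_SIZE):
        block = cleared_data[offset:offset + BLOCK_SIZE]  # S301: get a block
        if block == CLEAR_BLOCK:                          # S302: cleared?
            chunk = CLEAR_DATA_COMPRESSED                 # S303: reuse result
        else:
            chunk = zlib.compress(block)                  # S304: compress
        chunks.append(chunk)                              # S305: add to output
    return chunks                                         # S306: all areas done
```

Keeping the chunks in a list is a simplification; a real backup image would need some framing so that chunk boundaries can be recovered on expansion.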

(Compressed data expansion process)
This process restores data to the backup target storage device 2 using the compressed data for backup, and performs a reliability check that confirms whether the data before compression matches the data after expansion.

As shown in the flowchart of FIG. 6, in step S401, the compressed data expansion unit 16 acquires the compressed data 32.
In step S402, the compressed data expansion unit 16 expands the compressed data 32 and generates expanded data.
In step S403, the compressed data expansion unit 16 records the expanded data in the backup target storage device 2.

In step S404, the second clear data determination unit 17 and the second checksum generation unit 18 execute the second checksum information generation process. The second checksum information 21 is thereby generated. The second checksum information generation process is the same as the first checksum information generation process (see FIG. 4), so its description is omitted.
In step S405, the checksum comparison unit 19 determines whether the first checksum and the second checksum match.

If it is determined that the first checksum and the second checksum match (Yes in step S405), in step S406, the compressed data expansion unit 16 determines whether all of the compressed data 32 has been expanded.
If it is determined that not all of the compressed data has been expanded (No in step S406), the process returns to step S401 and the backup device 1 continues processing.
On the other hand, if it is determined that all of the compressed data has been expanded (Yes in step S406), the backup device 1 ends the compressed data expansion process.

If it is determined in step S405 that the first checksum and the second checksum do not match (No in step S405), in step S407, the backup device 1 outputs an error to an output device (not shown).
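The expansion and reliability check of steps S401 to S407 can be sketched as follows. As before, zlib, the 4096-byte block size, the all-zero cleared data, the additive checksum, and all names (including the `ChecksumMismatch` exception standing in for the error output of step S407) are assumptions for illustration only.

```python
import zlib

BLOCK_SIZE = 4096  # assumed subdivision size
CLEAR_BLOCK = bytes(BLOCK_SIZE)  # assumption: cleared data is all zero bytes


def checksum(block: bytes) -> int:
    """Additive checksum: sum of byte values, truncated to 32 bits (assumed form)."""
    return sum(block) & 0xFFFFFFFF


CLEAR_DATA_CHECKSUM = checksum(CLEAR_BLOCK)  # calculated in advance


class ChecksumMismatch(Exception):
    """Hypothetical stand-in for the error output of step S407."""


def restore_and_verify(chunks: list[bytes], first_info: list[int],
                       target: bytearray) -> None:
    """Expand each chunk into target and compare checksums (S401 to S407)."""
    for index, chunk in enumerate(chunks):        # S401: get compressed data
        block = zlib.decompress(chunk)            # S402: expand
        target.extend(block)                      # S403: record to the target
        if block == CLEAR_BLOCK:                  # S404: second checksum
            second = CLEAR_DATA_CHECKSUM          #   cleared: lookup only
        else:
            second = checksum(block)              #   otherwise: calculate
        if second != first_info[index]:           # S405: compare
            raise ChecksumMismatch(f"block {index}")  # S407: report error
    # S406: the loop ends once all compressed data has been expanded
```

Here `target` is an in-memory `bytearray` standing in for the backup target storage device 2; a mismatch is reported after the affected block has already been recorded, which mirrors the order of steps S403 and S405.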

The operation described above improves the compression ratio. Moreover, no special algorithm is used, so the invention can be implemented regardless of the compression algorithm.
In addition, at compression time, the checksum of data determined to be cleared data can be identified without performing the checksum calculation, which shortens the time required for checksum calculation.
In addition, at compression time, the compression result of data determined to be cleared data can be identified without performing the compression processing, which shortens the time required for data compression.
In addition, at expansion time, the checksum of data determined to be cleared data can be identified without performing the checksum calculation, which shortens the time required for checksum calculation.

According to the present embodiment, improving the compression ratio of the storage area shortens the processing time required to record and transmit the compressed data and reduces the capacity required of the recording medium.
In addition, the processing time for compressing and expanding the data of a storage medium mounted in a computer, and for the reliability check, can be shortened, which mitigates the increase in processing time caused by the growing capacity of storage media.

<Modification>
Although one embodiment of the present invention has been described above, the present invention is not limited to it and can be modified without departing from the spirit of the invention.

In the embodiment of the present invention, the steps of the flowcharts are shown as an example of processing performed in time series in the described order. However, the steps need not be processed in time series and may also include processing executed in parallel or individually.
For example, step S403 and step S404 may be performed in parallel. That is, the backup device 1 may record the expanded data in the backup target storage device 2 while executing the second checksum information generation process.
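One possible way to run steps S403 and S404 side by side is sketched below with Python's `concurrent.futures`. The use of threads, the additive checksum, and the names are assumptions; the source says only that the two steps may be performed in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def checksum(block: bytes) -> int:
    """Additive checksum: sum of byte values, truncated to 32 bits (assumed form)."""
    return sum(block) & 0xFFFFFFFF


def write_and_checksum(block: bytes, target: bytearray) -> int:
    """Record the expanded block (S403) while its checksum is computed (S404)."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        write_future = pool.submit(target.extend, block)  # S403: record
        sum_future = pool.submit(checksum, block)         # S404: checksum
        write_future.result()  # re-raises any error from the write
        return sum_future.result()
```

In this sketch the checksum is taken from the in-memory expanded block rather than from data read back from the device, and the `bytearray` again stands in for the backup target storage device 2. With real storage, overlapping the two steps pays off mainly when the write is I/O-bound.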

In addition, various inventions can be formed by appropriately combining the plurality of components disclosed in the above embodiment. For example, some components may be removed from the full set of components shown in the embodiment. Furthermore, components from different embodiments may be combined as appropriate.

Description of Symbols
1 Backup device
2 Backup target storage device
3 Backup storage device
11 Unused area data clear unit
12 Storage unit
13 First clear data determination unit
14 First checksum generation unit (first check code generation unit)
15 Compressed data generation unit
16 Compressed data expansion unit
17 Second clear data determination unit
18 Second checksum generation unit (second check code generation unit)
19 Checksum comparison unit (check code comparison unit)
21 Second checksum information (second check code information)
31 First checksum information (first check code information)
32 Compressed data
121 Clear data checksum (clear data check code)
122 Clear data compressed data
311 Checksum calculation start address
312 Normal checksum
313 Checksum of data determined to be cleared

Claims

A backup device that compresses and backs up data in a storage area of a backup target storage device, the backup device comprising:
An unused area data clear unit that clears the data of unused areas in the storage area and sets the data of the storage area as cleared data;
A first clear data determination unit that determines whether data obtained by subdividing the cleared data is cleared data;
A first check code generation unit that, when the first clear data determination unit determines that the data is cleared data, acquires a clear data check code, which is a check code of cleared data calculated in advance, and generates first check code information for performing a reliability check using the clear data check code;
A compressed data generation unit that, when the first clear data determination unit determines that the data is cleared data, acquires clear data compressed data, which is a compression result of cleared data calculated in advance, and generates compressed data using the clear data compressed data;
A compressed data decompression unit that decompresses the compressed data and generates decompressed data;
A second clear data determination unit that determines whether data obtained by subdividing the decompressed data is cleared data;
A second check code generation unit that, when the second clear data determination unit determines that the data is cleared data, acquires the clear data check code calculated in advance and generates second check code information for performing a reliability check using the clear data check code; and
A check code comparison unit that compares the first check code information with the second check code information.

The backup device according to claim 1, wherein:
When the first clear data determination unit determines that the data is not cleared data, the first check code generation unit calculates a check code from the data obtained by subdividing the cleared data, and generates the first check code information for performing the reliability check using the clear data check code or the calculated check code;
When the first clear data determination unit determines that the data is not cleared data, the compressed data generation unit calculates a compression result from the data obtained by subdividing the cleared data, and generates the compressed data using the clear data compressed data or the calculated compression result; and
When the second clear data determination unit determines that the data is not cleared data, the second check code generation unit calculates a check code from the data obtained by subdividing the decompressed data, and generates the second check code information for performing the reliability check using the clear data check code or the calculated check code.

The backup device according to claim 1, wherein:
The compressed data decompression unit records the decompressed data in the backup target storage device; and
The second clear data determination unit acquires the decompressed data recorded in the backup target storage device and determines whether data obtained by subdividing the acquired data is cleared data.

The backup device according to claim 1, wherein the check code is a checksum obtained by converting data into numerical values and summing them.

A backup method in a backup device that compresses and backs up data in a storage area of a backup target storage device, the backup method comprising:
A step in which an unused area data clear unit of the backup device clears the data of unused areas in the storage area and sets the data of the storage area as cleared data;
A first clear data determination step in which a first clear data determination unit of the backup device determines whether data obtained by subdividing the cleared data is cleared data;
A first check code generation step in which, when the first clear data determination unit determines that the data is cleared data, a first check code generation unit of the backup device acquires a clear data check code, which is a check code of cleared data calculated in advance, and generates first check code information for performing a reliability check using the clear data check code;
A compressed data generation step in which, when the first clear data determination unit determines that the data is cleared data, a compressed data generation unit of the backup device acquires clear data compressed data, which is a compression result of cleared data calculated in advance, and generates compressed data using the clear data compressed data;
A compressed data decompression step in which a compressed data decompression unit of the backup device decompresses the compressed data and generates decompressed data;
A second clear data determination step in which a second clear data determination unit of the backup device determines whether data obtained by subdividing the decompressed data is cleared data;
A second check code generation step in which, when the second clear data determination unit determines that the data is cleared data, a second check code generation unit of the backup device acquires the clear data check code calculated in advance and generates second check code information for performing a reliability check using the clear data check code; and
A check code comparison step in which a check code comparison unit of the backup device compares the first check code information with the second check code information.

A data compression method in a backup device that compresses data in a storage area of a backup target storage device, the data compression method comprising:
A step in which an unused area data clear unit of the backup device clears the data of unused areas in the storage area and sets the data of the storage area as cleared data;
A first clear data determination step in which a first clear data determination unit of the backup device determines whether data obtained by subdividing the cleared data is cleared data;
A first check code generation step in which, when the first clear data determination unit determines that the data is cleared data, a first check code generation unit of the backup device acquires a clear data check code, which is a check code of cleared data calculated in advance, and generates first check code information for performing a reliability check using the clear data check code; and
A compressed data generation step in which, when the first clear data determination unit determines that the data is cleared data, a compressed data generation unit of the backup device acquires clear data compressed data, which is a compression result of cleared data calculated in advance, and generates compressed data using the clear data compressed data.

A backup program for causing a computer to execute the backup method according to claim 5.

A data compression program for causing a computer to execute the data compression method according to claim 6.