JP2010205100A

JP2010205100A - Management server, backup system, backup method, and program

Info

Publication number: JP2010205100A
Application number: JP2009051593A
Authority: JP
Inventors: Hiroshi Yamamoto; 浩山本
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2009-03-05
Filing date: 2009-03-05
Publication date: 2010-09-16
Anticipated expiration: 2029-03-05
Also published as: JP5287366B2

Abstract

PROBLEM TO BE SOLVED: To provide a backup system which optimizes the amount of difference data while reducing traffic between a management server and a management target machine.

SOLUTION: A management server 1000 includes: a full backup data storage means 1110 wherein full backup data are stored; a working area 1300 wherein hash data, including a block size by which the full backup data are divided into a plurality of blocks and a plurality of hash values corresponding to respective blocks, are stored; a boot image 1220 by which the management target machine extracts difference data between full backup data and present data by using hash data; a data transmission/reception means 1500 which transmits hash data to the management target machine after the management target machine is invoked using the boot image, and receives difference data from the management target machine; and a dividing/combining means 1400 which uses full backup data and difference data to calculate a block size for next backup.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a method for backing up data stored in a managed machine, and more particularly to a differential backup method that backs up only the difference data changed since the previous backup.

Servers that run business services, and business terminals (personal computers), are normally operated so that important data, or entire disks, are backed up in preparation for unexpected accidents. For servers running business services in particular, the recent trend toward non-stop services demands operation that keeps server stoppages to a minimum. Even when a stoppage is unavoidable, its duration must be as short as possible.
For example, when only important data is backed up, the application data is backed up after a quiescent point of the application (a semantically consistent state) has been secured, for instance by stopping the business application. When the system (OS: Operating System) is backed up, the system is stopped and then backed up.

Under these circumstances, various differential backup methods have been devised to minimize the downtime of business applications and systems. For example, Patent Document 1 describes a technique for copying only the difference data of a business volume to a copy volume.
In the method of Patent Document 1, an update flag is provided for each block of the business volume (copy source), and the flag is maintained whenever the block is updated, so the blocks updated since the previous backup was executed (since the volume was detached) can be identified. This makes it possible to determine which blocks should be copied when the next backup is executed.

Patent Document 1: JP 2000-330730 A

However, the method disclosed in Patent Document 1 requires a difference calculation mechanism (storage device, file system) on the backup target, which raises the cost of building the system. On the other hand, general-purpose disk backup methods that need no special mechanism on the backup target always require a full backup, which inflates both the backup time and the amount of backup data.

To realize differential backup for such targets, one had to either collect a full backup and then calculate the difference data on the management server, or send the previously collected full backup data to the managed machine and calculate the difference on the managed machine. Both approaches generate a large amount of communication data and were therefore not practical.
In addition, when the difference is calculated at the disk image level, the appropriate block size depends on the file system implemented on the disk, on the business applications running on that file system, and so on. The block size therefore could not be fixed in general, and the amount of difference data was not optimal.

An object of the present invention is to provide a backup technique that, for managed machines having no difference calculation mechanism (storage device, file system), optimizes the amount of difference data while suppressing the amount of communication between the management server and the managed machine.

One aspect of the management server according to the present invention is a management server that backs up data of a managed machine, comprising: full backup data storage means for storing full backup data of the managed machine; hash data storage means for storing hash data including a block size by which the full backup data is divided into a plurality of blocks and a plurality of hash values corresponding to the respective blocks; boot image storage means for storing a boot image that causes the managed machine to extract, using the hash data, difference data between the full backup data and the current data; transmission means for transmitting the boot image to the managed machine when the managed machine starts a backup; data transmission/reception means for transmitting the hash data to the managed machine after the managed machine has been started using the boot image, and for receiving the difference data extracted by the managed machine; and dividing/combining means for calculating, using the full backup data and the difference data, the block size to be used for the next backup, storing the latest full backup data reflecting the difference data in the full backup data storage means, calculating a hash value for each block obtained by dividing the latest full backup data by the calculated block size, and storing the calculated block size and the calculated hash values in the hash data storage means.

One aspect of the backup system according to the present invention is a backup system in which a management server backs up data of a managed machine. The management server includes the means described above, and the managed machine includes: data transmission/reception means for receiving the boot image and the hash data and for transmitting the difference data; and difference data creation means for calculating hash data of the backup target data and creating the difference data by comparing it with the received hash data.

One aspect of the backup method according to the present invention is a backup method for backing up data of a managed machine, comprising: backing up full backup data of the managed machine to full backup data storage means; storing, in hash data storage means, hash data including a block size by which the full backup data is divided into a plurality of blocks and a plurality of hash values corresponding to the respective blocks; storing, in boot image storage means, a boot image that causes the managed machine to extract, using the hash data, difference data between the full backup data and the current data; restarting the managed machine using the boot image when the managed machine starts a backup; notifying the managed machine of the hash data and causing it to extract the difference data; calculating, using the full backup data and the difference data, the block size to be used for the next backup; storing the latest full backup data reflecting the difference data in the full backup data storage means; calculating a hash value for each block obtained by dividing the latest full backup data by the calculated block size; and storing the calculated block size and the calculated hash values in the hash data storage means.

One aspect of the program according to the present invention is a program that realizes a backup of data of a managed machine, the program causing a computer to execute: a process of backing up full backup data of the managed machine to full backup data storage means; a process of storing, in hash data storage means, hash data including a block size by which the full backup data is divided into a plurality of blocks and a plurality of hash values corresponding to the respective blocks; a process of transmitting, when the managed machine starts a backup, a boot image that causes the managed machine to extract, using the hash data, difference data between the full backup data and the current data, and of restarting the managed machine; a process of notifying the managed machine of the hash data, causing it to extract the difference data, and calculating, using the full backup data and the difference data, the block size to be used for the next backup; a process of storing the latest full backup data reflecting the difference data in the full backup data storage means; a process of calculating a hash value for each block obtained by dividing the latest full backup data by the calculated block size; and a process of storing the calculated block size and the calculated hash values in the hash data storage means.

According to the present invention, it is possible to provide a backup technique that, for managed machines having no difference calculation mechanism (storage device, file system), optimizes the amount of difference data while suppressing the amount of communication between the management server and the managed machine.

FIG. 1 is a diagram schematically showing the system configuration according to the present invention.
FIG. 2 is a block diagram schematically showing the logical structure of the program that runs in the management server of Embodiment 1.
FIG. 3 is a diagram explaining the image of the full backup data of a disk area.
FIG. 4 is a diagram explaining difference data.
FIG. 5 is a diagram showing the data structure of the difference data stored in the difference data storage means.
FIG. 6 is a diagram showing a specific example of hash values.
FIG. 7 is a diagram showing the data structure of the hash data stored in the working area.
FIG. 8 is a diagram showing the logic that determines the block size used to divide the backup data.
FIG. 9 is a diagram showing an example of the change in the amount of difference data when the difference data collected this time is assumed to have been divided with half the block size.
FIG. 10 is a diagram showing an example of the change in the amount of difference data when the difference data collected this time is assumed to have been divided with double the block size.
FIG. 11 is a block diagram schematically showing the logical structure of the boot image that runs on the managed machine.
FIG. 12 is a flowchart showing an operation example of the overall processing of the management server and the managed machine.
FIG. 13 is a flowchart showing an operation example of the differential backup process of the managed machine.
FIG. 14 is a flowchart showing an operation example of the dividing/combining process of the management server.
FIG. 15 is a flowchart showing an operation example of the division efficiency calculation process of the management server.
FIG. 16 is a flowchart showing an operation example of the combination efficiency calculation process of the management server.
FIG. 17 is a flowchart showing an operation example of the hash data calculation process of the management server.
FIG. 18 is a flowchart showing an operation example of the hash data comparison process of the management server.
FIG. 19 is a diagram showing data for a specific example of Embodiment 1, in which (a) shows an example of the full backup data stored in the full backup data storage means and (b) shows an example of the hash data stored in the working area.
FIG. 20 shows example data used when the managed machine creates difference data, in which (a) is an example of the hash data calculated by the hash data calculation means and (b) is an example of the difference data created by the difference data creation means.
FIG. 21 shows example data used when the management server creates difference data, in which (a) is an example of the hash data of the previous area and (b) is an example of the hash data within the difference data.
FIG. 22 shows example data used when the management server performs the combination efficiency calculation, in which (a) shows the blocks of the full backup data that correspond to the difference data and (b) shows an example of the difference data.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. For clarity, the following description and the drawings are abbreviated and simplified where appropriate. In the drawings, components and corresponding parts having the same configuration or function are given the same reference numerals, and their description is not repeated.

The present invention provides a backup method that is applicable when the managed machine has no difference calculation mechanism (hardware or software). The method backs up the difference data, that is, the data that differs from the data backed up last time. To this end, when a differential backup is executed, the management server sends the hash data of the previously collected full backup data to the managed machine, and the difference data is calculated on the managed machine. Furthermore, after the differential backup has been executed, the management server calculates the appropriate block size for difference calculation and automatically changes the block size if the current value is not appropriate. Each embodiment is described below with reference to the drawings.

(Embodiment 1)
FIG. 1 schematically shows the system configuration of this embodiment, which includes a management server 1000, a management network 2000, and managed machines 3000 to 5000.
The management server 1000 is hardware equipped with software that manages and controls the managed machines 3000 to 5000 via the management network 2000.
The management network 2000 is the network infrastructure through which the management server 1000 manages and controls the managed machines 3000 to 5000.
The managed machines 3000 to 5000 are hardware managed and controlled by the management server 1000.

FIG. 1 is an example given to explain this embodiment schematically and does not limit the system configuration. That is, any number of managed machines can be configured, depending on the system to be managed and controlled.

FIG. 2 schematically shows the logical structure of the program of this embodiment that runs in the management server 1000. The management server 1000 includes a backup data storage area 1100, a boot image storage area 1200, a working area (hash data storage means) 1300, dividing/combining means 1400, data transmission/reception means 1500, and boot image transmission means 1600.
Here, the backup data storage area 1100 and the boot image storage area 1200 correspond to a nonvolatile storage device such as a magnetic disk and to a DBMS (DataBase Management System) that manages the data stored there.

The working area 1300 corresponds to a volatile storage area such as main memory.
The dividing/combining means 1400, the data transmission/reception means 1500, and the boot image transmission means 1600 correspond to program modules executed by a CPU (Central Processing Unit) or the like. Each program module is loaded into memory, and its instructions are executed under the control of the CPU to carry out the processing of the corresponding means.

The backup data storage area 1100 includes full backup data storage means 1110 and difference data storage means 1120.
The boot image storage area 1200 includes boot image storage means 1210, in which a boot image 1220 is stored.
The dividing/combining means 1400 includes division efficiency calculation means 1410, hash data calculation means 1420, hash data comparison means 1430, and combination efficiency calculation means 1440.

Each component is described in detail below.
The full backup data storage means 1110 stores the full backup data of the managed machine.
The full backup data corresponds to a copy of the disk area of the managed machine, for example a full copy of the data from address 0 to the maximum disk address N, as shown in FIG. 3.

The difference data storage means 1120 stores the difference data between the full backup data collected this time from the managed machine and the full backup data collected last time. The difference data is obtained by dividing both the current and the previous full backup data into blocks of a predetermined size, comparing the data stored in blocks at the same address, and collecting the blocks that differ.
For example, as shown in FIG. 4, given full backup data of size N, the previous data and the current data are divided by the same block size (n). A block whose data matches the previous data exactly is treated as "no difference", and a block in which even part of the data differs from the previous data is treated as "difference". The data collected from the "difference" blocks is defined as the "difference data".
FIG. 5 shows the data structure of the difference data. The difference data consists of a "size" corresponding to the block size used for division, an "address" corresponding to the start address of each block, and "data" corresponding to the actual block data.
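The definition of difference data can be illustrated with a minimal Python sketch. The names `DiffData` and `make_diff` are hypothetical and do not appear in the patent. The sketch compares two in-memory images directly, which corresponds to the definition of FIG. 4 rather than to the hash-based extraction performed on the managed machine.

```python
from dataclasses import dataclass, field


@dataclass
class DiffData:
    """Illustrative layout after FIG. 5: a block size plus (address, data) entries."""
    size: int
    blocks: list = field(default_factory=list)


def make_diff(previous: bytes, current: bytes, size: int) -> DiffData:
    """Collect every block of `current` that differs from the same-address block of `previous`."""
    diff = DiffData(size=size)
    for address in range(0, len(current), size):
        old = previous[address:address + size]
        new = current[address:address + size]
        if old != new:  # even a partial mismatch marks the whole block as "difference"
            diff.blocks.append((address, new))
    return diff


previous = b"AAAABBBBCCCCDDDD"
current = b"AAAABxBBCCCCDDDy"
diff = make_diff(previous, current, size=4)
print(diff.size, diff.blocks)
```

With a block size of 4, only the blocks starting at addresses 4 and 12 are collected, so half of the image is transferred even though only two bytes changed. This is the effect that the block size optimization described later is meant to reduce.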

The boot image 1220 stored in the boot image storage means 1210 is the boot image (differential backup means) that is sent to the managed machine in order to execute a differential backup. When the management server 1000 sends the boot image 1220 to the managed machine and restarts the managed machine, the programs in the boot image 1220 run and the differential backup is executed. The boot image 1220 can therefore be regarded as the program that realizes the differential backup. Details of the boot image 1220 are described later with reference to FIG. 11.

As advance preparation for a differential backup, the working area 1300 stores the hash data corresponding to the full backup data stored in the full backup data storage means 1110.
Here, "hash data" is defined as the data obtained by dividing the data to be hashed by a certain block size (n) and collecting the hash value of each block. FIG. 6 shows a specific example of hash values.
FIG. 7 shows the data structure of the hash data. The hash data consists of a "size" corresponding to the block size used for division and a table that stores, for each block, its start address and the corresponding hash value.
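As a rough illustration of this structure, the following Python sketch builds hash data for a byte string. The patent does not name a hash algorithm, so SHA-256 is an assumption here, and `HashData` and `compute_hash_data` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class HashData:
    """Illustrative layout after FIG. 7: a block size plus an address-to-hash table."""
    size: int
    table: dict = field(default_factory=dict)


def compute_hash_data(region: bytes, size: int) -> HashData:
    """Divide `region` into blocks of `size` bytes and record one hash value per block."""
    hash_data = HashData(size=size)
    for address in range(0, len(region), size):
        block = region[address:address + size]
        # SHA-256 is an assumption; the source text does not specify the hash function.
        hash_data.table[address] = hashlib.sha256(block).hexdigest()
    return hash_data


hash_data = compute_hash_data(b"AAAABBBBCCCC", size=4)
for address, value in hash_data.table.items():
    print(address, value[:16])
```

Because each block is represented by a fixed-length digest, the hash data is far smaller than the full backup data whenever the block size is much larger than the digest length.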

The dividing/combining means 1400 is a program module that, referring to the backup data storage area 1100, calculates the amount of difference data that would result from a smaller or a larger block size, and determines the optimal block size for the next differential backup.

FIG. 8 shows the logic for determining the optimal block size. Each row indicates whether the amount of difference data would decrease, stay the same, or increase if the current block size were divided. Each column indicates whether the amount of difference data would decrease, stay the same, or increase if blocks of the current size were combined. If the amount of difference data decreases with division and stays the same or increases with combination, the block size is divided. If the amount stays the same with both division and combination, the block size is combined. If the amount stays the same with division and increases with combination, the block size is left unchanged. N/A (not applicable) denotes a combination that cannot occur.

In FIG. 8, the block size for the next differential backup is determined from two calculated quantities: the amount of difference data that would result if the difference data collected this time had been divided with half the block size (division, FIG. 9), and the amount that would result with double the block size (combination, FIG. 10).
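The decision rules stated for FIG. 8 can be sketched as a small Python function. The function names are hypothetical, and the full table of FIG. 8 is not reproduced in the text, so any combination not explicitly described is treated here as N/A and rejected.

```python
def trend(candidate: int, current: int) -> str:
    """Classify how a candidate difference data amount relates to the current one."""
    if candidate < current:
        return "decrease"
    if candidate > current:
        return "increase"
    return "same"


def next_block_size(size: int, diff_now: int, diff_if_split: int, diff_if_merged: int) -> int:
    """Apply the rules described for FIG. 8 (halve, double, or keep the block size)."""
    split = trend(diff_if_split, diff_now)
    merged = trend(diff_if_merged, diff_now)
    if split == "decrease" and merged in ("same", "increase"):
        return size // 2   # divide: smaller blocks reduce the difference data
    if split == "same" and merged == "same":
        return size * 2    # combine: larger blocks cost nothing extra
    if split == "same" and merged == "increase":
        return size        # keep the current block size
    # Combinations not described in the text are assumed to be the N/A cells.
    raise ValueError(f"combination not applicable: split={split}, merged={merged}")


print(next_block_size(8, diff_now=16, diff_if_split=12, diff_if_merged=16))
print(next_block_size(8, diff_now=16, diff_if_split=16, diff_if_merged=16))
print(next_block_size(8, diff_now=16, diff_if_split=16, diff_if_merged=32))
```

The three calls exercise the divide, combine, and keep cases in that order.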

The division efficiency calculation means 1410 calculates whether the amount of difference data would decrease if the block size were halved.
The hash data calculation means 1420 creates hash data for a specified calculation target area using a specified block size.
The hash data comparison means 1430 compares two specified sets of hash data and produces a list of the entries whose hash values differ.
The combination efficiency calculation means 1440 calculates whether the amount of difference data would increase if the block size were doubled.
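The effect that the two efficiency calculations measure can be shown with a short Python sketch. It is a simplification: the patent's means work from hash data of the affected blocks, whereas this sketch compares the raw bytes of two whole images, and `diff_amount` is a hypothetical helper.

```python
def diff_amount(previous: bytes, current: bytes, size: int) -> int:
    """Total bytes that would be transferred as difference data at block size `size`."""
    total = 0
    for address in range(0, len(current), size):
        old = previous[address:address + size]
        new = current[address:address + size]
        if old != new:
            total += len(new)
    return total


previous = b"AAAABBBBCCCCDDDD"
current = b"AAAABxBBCCCCDDDy"
size = 4

amount_now = diff_amount(previous, current, size)
amount_if_split = diff_amount(previous, current, size // 2)   # division efficiency
amount_if_merged = diff_amount(previous, current, size * 2)   # combination efficiency
print(amount_now, amount_if_split, amount_if_merged)
```

In this example halving the block size halves the difference data and doubling it doubles the difference data, so the rules of FIG. 8 would select division for the next backup.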

The data transmission/reception means 1500 is a program module that sends hash data to the managed machine and receives difference data from the managed machine.

The boot image transmission means 1600 is a program module that sends the boot image 1220 to the managed machine and restarts the managed machine with that boot image. For example, it may be a server program conforming to the PXE (Preboot eXecution Environment) protocol specified in RFC (Request For Comments) 4578, or it may push the boot image to a so-called client program running on the managed machine.

Sending the boot image 1220 to the managed machine and restarting the managed machine with that boot image can be carried out adequately with known techniques, so this embodiment does not prescribe a particular way of doing so.

FIG. 11 schematically shows the logical structure of the boot image 1220 that runs on the managed machine. The boot image 1220 includes data transmission/reception means 1221, difference data creation means 1222, hash data calculation means 1223, and hash data comparison means 1224.

The data transmission/reception means 1221, the difference data creation means 1222, the hash data calculation means 1223, and the hash data comparison means 1224 correspond to program modules executed by a CPU or the like.

The data transmission/reception means 1221 receives from the management server 1000 the hash data corresponding to the previously collected full backup data, and sends the difference data created on the managed machine to the management server 1000.
The difference data creation means 1222 creates the difference data from the hash data calculated from the disk area of the managed machine and the hash data, received from the management server 1000, that corresponds to the previously collected full backup data.
The hash data calculation means 1223 creates hash data for the disk area of the managed machine using the block size contained in the hash data received from the management server 1000.
The hash data comparison means 1224 compares the hash data, received from the management server 1000, that corresponds to the previously collected full backup data with the hash data calculated from the disk area of the managed machine.

Next, the operation of this embodiment is described with reference to FIG. 2 and FIGS. 11 to 18. When a differential backup is instructed, for example by a user operation (step S11 in FIG. 12), the boot image transmission means 1600 in FIG. 2 transmits the boot image 1220 to the managed machine and restarts the managed machine (step S12 in FIG. 12). Once it is confirmed that the differential backup process has started on the managed machine, the data transmission/reception means 1500 in FIG. 2 transmits the hash data stored in the working area 1300 (hereinafter, "hash data 1") to the managed machine (step S13 in FIG. 12).

The managed machine receives the boot image 1220 (step S21 in FIG. 12) and is restarted. The restarted managed machine boots from the boot image 1220 of FIG. 11 (step S22 in FIG. 12) and starts the differential backup process of FIG. 13 (step S23 in FIG. 12).
When the differential backup process starts, the data transmission/reception means 1221 in FIG. 11 receives hash data 1, the hash data of the previously collected full backup data, from the management server 1000 (step A1 in FIG. 13).
After hash data 1 is received from the management server 1000, the difference data creation means 1222 in FIG. 11 calls the hash data calculation means 1223 with "the disk area of the managed machine" as the calculation target area and "the size of hash data 1" (the block size contained in hash data 1) as the size (step A2 in FIG. 13). The specific operation of the hash data calculation means 1223 is shown in FIG. 17. First, the hash data calculation means 1223 accepts the calculation target area (here, "the disk area of the managed machine") and the size (here, "the size of hash data 1"), and divides the calculation target area into blocks of that size (step E1 in FIG. 17). Next, the hash data calculation means 1223 calculates the hash value of each block (step E2 in FIG. 17) and outputs the size together with the set of hash values as "hash data" (step E3 in FIG. 17).
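The hash data calculation of FIG. 17 (steps E1 to E3) can be illustrated with a minimal sketch. The description does not name a hash function, so SHA-256 is an assumption here, and the names `HashData` and `compute_hash_data` are hypothetical and used only for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class HashData:
    """A block size plus one hash value per block (the "hash data" above)."""
    block_size: int
    hashes: list[str]


def compute_hash_data(region: bytes, block_size: int) -> HashData:
    """Split `region` into fixed-size blocks and hash each block."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # Step E1: divide the calculation target area into blocks of the given size.
    blocks = [region[i:i + block_size] for i in range(0, len(region), block_size)]
    # Step E2: calculate a hash value per block (SHA-256 is an assumption).
    hashes = [hashlib.sha256(block).hexdigest() for block in blocks]
    # Step E3: output the size together with the set of hash values.
    return HashData(block_size=block_size, hashes=hashes)
```

Because the block size travels inside the hash data, the managed machine can reproduce the same block boundaries as the management server without any additional negotiation.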

After the hash data of the disk area of the managed machine (hereinafter, "hash data 2") has been calculated, the difference data creation means 1222 uses the hash data comparison means 1224 to compare hash data 1 with hash data 2 (step A3 in FIG. 13), collects the blocks whose hash values differ to create the difference data, and uses the data transmission/reception means 1221 to transmit the difference data to the management server 1000 (step A4 in FIG. 13). The specific operation of the hash data comparison means 1224 is shown in FIG. 18. First, the hash data comparison means 1224 accepts two sets of hash data (here, hash data 1 and hash data 2) and, for each hash value in hash data 1, compares it with the corresponding hash value in hash data 2 (step F1 in FIG. 18). It then outputs a list of the blocks whose hash values differ (step F2 in FIG. 18).
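Steps A2 to A4 and the comparison of FIG. 18 might be sketched as follows. This is an illustration under stated assumptions, not the implementation of the embodiment: SHA-256 stands in for the unspecified hash function, the difference data is modeled as a dictionary holding the block size and (address, data) pairs, and all function names are hypothetical.

```python
import hashlib


def block_hashes(region: bytes, block_size: int) -> list[str]:
    """Hash every fixed-size block of `region` (SHA-256 is an assumption)."""
    return [hashlib.sha256(region[i:i + block_size]).hexdigest()
            for i in range(0, len(region), block_size)]


def differing_blocks(hashes1: list[str], hashes2: list[str]) -> list[int]:
    """Steps F1 and F2: indices of blocks whose hash values differ.

    Assumes both lists describe the same area with the same block size.
    """
    return [i for i, (h1, h2) in enumerate(zip(hashes1, hashes2)) if h1 != h2]


def make_difference_data(disk: bytes, hashes1: list[str], block_size: int) -> dict:
    """Build difference data from the current disk and the received hash data 1."""
    hashes2 = block_hashes(disk, block_size)        # step A2: hash data 2
    changed = differing_blocks(hashes1, hashes2)    # step A3: compare
    # Step A4 payload: the block size plus (address, data) per changed block.
    return {
        "block_size": block_size,
        "blocks": [(i * block_size, disk[i * block_size:(i + 1) * block_size])
                   for i in changed],
    }
```

Only the changed blocks and their addresses leave the managed machine, which is what keeps the traffic to the management server small.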

When the management server 1000 in FIG. 2 receives the difference data from the managed machine through the data transmission/reception means 1500, it stores the difference data in the difference data storage means 1120 (step S14 in FIG. 12) and starts the division/combination processing of FIG. 14 (step S15 in FIG. 12).

The division efficiency calculation means 1410 in FIG. 2 extracts, from the full backup data stored in the full backup data storage means 1110, the area corresponding to the difference data stored in the difference data storage means 1120 (the "previous area") (step B1 in FIG. 14, step C1 in FIG. 15).

After extracting the previous area, the division efficiency calculation means 1410 calls the hash data calculation means 1420 with "the previous area" as the calculation target area and "half the size in the difference data" as the size (step C2 in FIG. 15), and has it create hash data 3. The division efficiency calculation means 1410 then calls the hash data calculation means 1420 with "the data in the difference data" as the calculation target area and "half the size in the difference data" as the size (step C3 in FIG. 15), and has it create hash data 4. Specifically, the hash data calculation means 1420 accepts the calculation target area (here, "the previous area" or "the data in the difference data") and the size (here, "half the size in the difference data"), and divides the calculation target area into blocks of that size (step E1 in FIG. 17). Next, the hash data calculation means 1420 calculates the hash value of each block (step E2 in FIG. 17) and outputs the size together with the set of hash values as "hash data" (step E3 in FIG. 17).

After hash data 3 and hash data 4 have been created, the division efficiency calculation means 1410 calls the hash data comparison means 1430. The hash data comparison means 1430 in FIG. 2 compares hash data 3 with hash data 4 (step C4 in FIG. 15). Specifically, the hash data comparison means 1430 accepts two sets of hash data (here, hash data 3 and hash data 4) and, for each hash value in hash data 3, compares it with the corresponding hash value in hash data 4 (step F1 in FIG. 18). It then outputs a list of the blocks whose hash values differ (step F2 in FIG. 18).

Next, the division efficiency calculation means 1410 calculates the difference data amount that would result if the block size were halved (step C5 in FIG. 15).
If the calculated amount is smaller than the original difference data, the division efficiency calculation means 1410 determines that "the difference data size is reduced" (step C8 in FIG. 15). Otherwise, it determines that "the difference data size is not reduced" (step C7 in FIG. 15).
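The division efficiency calculation of FIG. 15 (steps C2 to C8) can be sketched as below. The sketch assumes that the previous area and the data in the difference data are each available as one contiguous byte string of equal length, and that SHA-256 is the hash function. Neither assumption comes from the description, and the function names are hypothetical.

```python
import hashlib


def split(data: bytes, size: int) -> list[bytes]:
    """Divide `data` into consecutive blocks of `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def halved_difference_size(previous: bytes, current: bytes, block_size: int) -> int:
    """Steps C2 to C5: difference data amount if the block size were halved."""
    half = block_size // 2
    if half <= 0:
        raise ValueError("block_size must be at least 2")
    total = 0
    # Hash data 3 (previous area) and hash data 4 (difference data), per half block.
    for old, new in zip(split(previous, half), split(current, half)):
        if hashlib.sha256(old).digest() != hashlib.sha256(new).digest():
            total += len(new)  # step C4: this half block still differs
    return total


def difference_shrinks(previous: bytes, current: bytes, block_size: int) -> bool:
    """True corresponds to step C8, False to step C7."""
    return halved_difference_size(previous, current, block_size) < len(current)
```

The check only rehashes the areas already known to have changed, so its cost scales with the size of the difference data rather than with the size of the full backup.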

If the division efficiency calculation shows that the difference data amount would decrease (Y in step B2 of FIG. 14), the dividing/combining means 1400 determines that half the current division size is the more appropriate value. It merges the difference data into the full backup data in the full backup data storage means 1110 (step B9 in FIG. 14) and then creates the hash data to be transmitted to the managed machine next time (step B10 in FIG. 14). Specifically, the dividing/combining means 1400 calls the hash data calculation means 1420 with "the full backup image" as the calculation target area and "half the size" as the size, and has it create the hash data to be transmitted to the managed machine next time (E1 to E3 in FIG. 17). The dividing/combining means 1400 stores the created hash data in the working area 1300 and ends the process.
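The merge and the regeneration of hash data (steps B9 and B10, and likewise B5 to B8) might look like the following sketch. The in-memory representation of the full backup as a byte string, the (address, data) form of the difference data, the use of SHA-256, and both function names are assumptions made for illustration.

```python
import hashlib


def merge_difference(full_backup: bytes, blocks: list[tuple[int, bytes]]) -> bytes:
    """Overwrite the changed areas of the full backup with the difference data."""
    merged = bytearray(full_backup)
    for address, data in blocks:
        merged[address:address + len(data)] = data
    return bytes(merged)


def next_hash_data(full_backup: bytes, new_block_size: int) -> dict:
    """Recompute the hash data for the next backup with the chosen block size."""
    hashes = [hashlib.sha256(full_backup[i:i + new_block_size]).hexdigest()
              for i in range(0, len(full_backup), new_block_size)]
    # This dictionary plays the role of the hash data kept in the working area.
    return {"block_size": new_block_size, "hashes": hashes}
```

Merging first and hashing afterwards means the stored hash data always describes the latest full backup, whichever block size was selected.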

If the division efficiency calculation shows that the difference data amount would not decrease (N in step B2 of FIG. 14), the coupling efficiency calculation means 1440 in FIG. 2 divides the full backup data stored in the full backup data storage means 1110 into blocks of twice the size in the difference data stored in the difference data storage means 1120, and extracts the blocks that contain the data in the difference data (the "combined area") (step B3 in FIG. 14, step D1 in FIG. 16).
After extracting the combined area, it compares the data amount of the combined area with that of the data in the difference data (step D2 in FIG. 16). If the sizes are the same, it determines that "the data size does not increase" (step D5 in FIG. 16). Otherwise, it determines that "the data size increases" (step D4 in FIG. 16).
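The coupling efficiency check of FIG. 16 (steps D1 to D5) needs only block addresses and sizes, not block contents. A minimal sketch follows, assuming the changed blocks are identified by their start addresses. The function names and the handling of a shorter final block are assumptions, not details taken from the description.

```python
def combined_area_size(changed_addresses: list[int], block_size: int,
                       total_size: int) -> int:
    """Step D1: total size of the doubled blocks that contain a changed block."""
    doubled = 2 * block_size
    # Start address of the doubled block that encloses each changed block.
    starts = {address // doubled * doubled for address in changed_addresses}
    # The final block may be shorter when total_size is not a multiple of `doubled`.
    return sum(min(doubled, total_size - start) for start in starts)


def difference_size(changed_addresses: list[int], block_size: int,
                    total_size: int) -> int:
    """Current difference data amount, in bytes."""
    return sum(min(block_size, total_size - address)
               for address in changed_addresses)


def doubling_keeps_size(changed_addresses: list[int], block_size: int,
                        total_size: int) -> bool:
    """Step D2: True corresponds to step D5, False to step D4."""
    return (combined_area_size(changed_addresses, block_size, total_size)
            == difference_size(changed_addresses, block_size, total_size))
```

For instance, with 512K blocks, changed blocks at addresses 0 and 512K fill exactly one 1024K block, so doubling costs nothing, whereas changed blocks at 0 and 1024K would each drag an unchanged neighbor into the difference data.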

If the coupling efficiency calculation shows that the difference data amount would increase (Y in step B4 of FIG. 14), the dividing/combining means 1400 determines that the current division size is the appropriate value. It merges the difference data into the full backup data in the full backup data storage means 1110 (step B7 in FIG. 14) and then creates the hash data to be transmitted to the managed machine next time (step B8 in FIG. 14). Specifically, the dividing/combining means 1400 calls the hash data calculation means 1420 with "the full backup image" as the calculation target area and "the same size as last time" as the size, and has it create the hash data to be transmitted to the managed machine next time (E1 to E3 in FIG. 17). The dividing/combining means 1400 stores the created hash data in the working area 1300 and ends the process.

If the coupling efficiency calculation shows that the difference data amount would not increase (N in step B4 of FIG. 14), the dividing/combining means 1400 determines that twice the current division size is the more appropriate value. It merges the difference data into the full backup data in the full backup data storage means 1110 (step B5 in FIG. 14) and then creates the hash data to be transmitted to the managed machine next time (step B6 in FIG. 14). Specifically, the dividing/combining means 1400 calls the hash data calculation means 1420 with "the full backup image" as the calculation target area and "twice the size" as the size, and has it create the hash data to be transmitted to the managed machine next time (E1 to E3 in FIG. 17). The dividing/combining means 1400 stores the created hash data in the working area 1300 and ends the process.
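The three branches of FIG. 14 reduce to a small decision function. In the sketch below, the two boolean inputs stand for the outcomes of the division efficiency and coupling efficiency calculations. The function name and parameter names are hypothetical.

```python
def next_block_size(current: int, shrinks_when_halved: bool,
                    grows_when_doubled: bool) -> int:
    """Choose the block size for the next differential backup."""
    if shrinks_when_halved:
        # Step B2, branch Y: halving reduces the difference data (steps B9, B10).
        return current // 2
    if grows_when_doubled:
        # Step B4, branch Y: the current size is already appropriate (steps B7, B8).
        return current
    # Step B4, branch N: doubling does not enlarge the difference data (steps B5, B6).
    return current * 2
```

Halving is tested first, so the size only grows when a finer division has already been shown to bring no benefit.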

As described above, the hash data calculation process shown in FIG. 17 is an example of the operation performed by the hash data calculation means 1420 or the hash data calculation means 1223. Likewise, the hash data comparison process shown in FIG. 18 is an example of the operation performed by the hash data comparison means 1430 or the hash data comparison means 1224.

Next, the operation of this embodiment (operation 1) is described in detail with reference to FIGS. 2, 11, and 13 to 18 and a specific example (FIGS. 19 to 21). The description here focuses on the specific example and omits some parts that are the same as the operation described above.

As a premise, assume that the full backup data shown in FIG. 19(a) is stored in the full backup data storage means 1110 of FIG. 2, and that the hash data corresponding to the full backup data of FIG. 19(a) is stored in the working area 1300 of FIG. 2. An example of the contents of the hash data is shown in FIG. 19(b).

When a differential backup is instructed, for example by a user operation, the boot image transmission means 1600 in FIG. 2 transmits the boot image 1220 to the managed machine and restarts the managed machine (step S12 in FIG. 12).

The restarted managed machine boots from the boot image 1220 of FIG. 11 (step S22 in FIG. 12) and starts the differential backup process of FIG. 13 (step S23 in FIG. 12).
When the differential backup process starts, the data transmission/reception means 1221 in FIG. 11 receives the hash data of FIG. 19(b) from the management server 1000 (step A1 in FIG. 13).
After the hash data of FIG. 19(b) is received, the difference data creation means 1222 in FIG. 11 calls the hash data calculation means 1223 with "the disk area of the managed machine" as the calculation target area and "512K" ("K" denotes kilobytes) as the size (step A2 in FIG. 13). An example of the hash data calculated by the hash data calculation means 1223 is shown in FIG. 20(a).

After the hash data of the disk area of the managed machine (the hash data of FIG. 20(a)) has been calculated, the difference data creation means 1222 uses the hash data comparison means 1224 to compare the hash data of FIG. 19(b) with the hash data of FIG. 20(a) (step A3 in FIG. 13). Because the hash values of the blocks starting at addresses 0 and 512K differ, the difference data creation means 1222 creates difference data from these blocks (the difference data of FIG. 20(b)) and uses the data transmission/reception means 1221 to transmit it to the management server 1000 (step A4 in FIG. 13). FIG. 20(b) shows an example of the created difference data. In this example, difference data is created for the 512K starting at address 0 and the 512K starting at address 512K.

When the management server 1000 in FIG. 2 receives the difference data from the managed machine through the data transmission/reception means 1500, it stores the difference data in the difference data storage means 1120 and starts the division/combination processing of FIG. 14.
The division efficiency calculation means 1410 in FIG. 2 extracts, from the full backup data of FIG. 19(a) stored in the full backup data storage means 1110, the area corresponding to the difference data of FIG. 20(b) stored in the difference data storage means 1120 (the previous area) (step C1 in FIG. 15).

After extracting the previous area, the division efficiency calculation means 1410 has the hash data calculation means 1420 calculate hash data for the previous area using a smaller block size. Specifically, the division efficiency calculation means 1410 calls the hash data calculation means 1420 with "the previous area" as the calculation target area and "256K" as the size (step C2 in FIG. 15), and has it create hash data (previous area). FIG. 21(a) shows an example of the created hash data (previous area). After the hash data calculation means 1420 has created this hash data, the division efficiency calculation means 1410 calls the hash data calculation means 1420 with "the data in the difference data of FIG. 20(b)" as the calculation target area and "256K" (half of 512K) as the size (step C3 in FIG. 15), and has it create hash data (data in the difference data). FIG. 21(b) shows an example of the hash data (data in the difference data).

After the hash data (previous area) of FIG. 21(a) and the hash data (data in the difference data) of FIG. 21(b) have been created, the division efficiency calculation means 1410 in FIG. 2 calls the hash data comparison means 1430 with these two sets of hash data as input (step C4 in FIG. 15). The hash data comparison means 1430 detects that the hash values differ only for the blocks starting at addresses 256K and 768K. The difference data amount would therefore be 512K, that is, two 256K blocks (step C5 in FIG. 15), which is less than the original 1024K of two 512K blocks, so it is determined that "the difference data size is reduced" (step C8 in FIG. 15).

Because the division efficiency calculation shows that the difference data amount decreases, a division size of 256K is determined to be the more appropriate value. The difference data of FIG. 20 is merged into the full backup data of FIG. 19(a) in the full backup data storage means 1110 (step B9 in FIG. 14). Hash data (the hash data to be transmitted to the managed machine next time) is then created with "the full backup data" as the calculation target area and "256K" as the size (step B10 in FIG. 14) and stored in the working area 1300, and the process ends.
As a result, the next differential backup is executed with a division block size of 256K.
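The numbers in operation 1 can be reproduced with synthetic data. The contents of FIGS. 19 to 21 are not available here, so the 2048K image and the positions of the changed bytes below are hypothetical values chosen only so that the changed blocks match the addresses quoted in the text. SHA-256 is again an assumption.

```python
import hashlib

K = 1024  # "K" denotes kilobytes, as in the text


def changed_addresses(old: bytes, new: bytes, block_size: int) -> list[int]:
    """Start addresses of the blocks whose SHA-256 digests differ."""
    return [a for a in range(0, len(new), block_size)
            if hashlib.sha256(old[a:a + block_size]).digest()
            != hashlib.sha256(new[a:a + block_size]).digest()]


# Hypothetical disk contents: one byte changed inside the 256K sub-block
# starting at 256K and one inside the sub-block starting at 768K.
old_disk = bytes(2048 * K)
scratch = bytearray(old_disk)
scratch[300 * K] = 0xFF
scratch[800 * K] = 0xFF
new_disk = bytes(scratch)

at_512k = changed_addresses(old_disk, new_disk, 512 * K)  # blocks at 0 and 512K
at_256k = changed_addresses(old_disk, new_disk, 256 * K)  # blocks at 256K and 768K

diff_512k = len(at_512k) * 512 * K  # 1024K of difference data
diff_256k = len(at_256k) * 256 * K  # 512K of difference data
```

With these inputs the difference data shrinks from 1024K to 512K when the block size is halved, which is the situation in which 256K is selected for the next backup.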

Next, the operation of this embodiment (operation 2) is described in detail with reference to FIGS. 2 and 11 to 18 and a specific example (FIGS. 19, 20, and 22).

As a premise, assume that the full backup data of FIG. 19(a) is stored in the full backup data storage means 1110 of FIG. 2, and that the hash data corresponding to the full backup data of FIG. 19(a) is stored in the working area 1300 of FIG. 2. An example of the contents of the hash data is shown in FIG. 19(b).
To simplify the description, also assume that it has been determined in step B2 of FIG. 14 that the difference data does not decrease.

Because the division efficiency calculation shows that the difference data amount does not decrease, the coupling efficiency calculation means 1440 in FIG. 2 divides the full backup data of FIG. 19(a), stored in the full backup data storage means 1110, into blocks of 1024K (twice 512K) and extracts the block that contains the data in the difference data (the portion from 0 to 1024K of the full backup data in FIG. 22(a)) (step D1 in FIG. 16).
The portion from 0 to 1024K of the full backup data in FIG. 22(a) is compared with the difference data amount of FIG. 22(b) (step D2 in FIG. 16). Because the sizes are the same, it can be determined that "the data size does not increase" (step D5 in FIG. 16). Specifically, in the case of FIG. 22(a), one block is 1024K, and the area extracted as difference data (FIG. 22(b)) corresponds exactly to the area of one block. The difference data amount therefore does not increase.

Because the coupling efficiency calculation shows that the difference data amount does not increase, a division size of 1024K is determined to be the more appropriate value. The difference data of FIG. 20 is merged into the full backup data of FIG. 19(a) in the full backup data storage means 1110 (step B5 in FIG. 14). Hash data (the hash data to be transmitted to the managed machine next time) is then created with "the full backup data" as the calculation target area and "1024K" as the size (step B6 in FIG. 14) and stored in the working area 1300, and the process ends.

As a result, the next differential backup is executed with a division block size of 1024K.

(Other embodiments)
Embodiment 1 describes the case in which the dividing/combining means 1400 divides the block size in half or combines blocks to double it, but the block size may be changed using other values. For example, the dividing/combining means 1400 may calculate the difference data amount both for the case in which a decrease value smaller than the current value is used as the block size and for the case in which a multiple of the current value is used as the block size, and may calculate the block size according to those difference data amounts. The decrease value is preferably a divisor of the current value. The multiple value is M times the current value (M being an integer greater than 0).

Specifically, the setting method is the same as the logic shown in FIG. 8. The dividing/combining means 1400 sets the block size to the decrease value if the difference data amount decreases when the decrease value is used. It sets the block size to the multiple value if the difference data amount is the same both when the decrease value is used and when the multiple value is used. It leaves the block size unchanged if the difference data amount is the same when the decrease value is used and increases when the multiple value is used.
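The generalized rule can be written as a small function over three measured difference data amounts. The sketch below is illustrative only. The function and parameter names are hypothetical, and the final fallback (keeping the current size in a case the text does not enumerate) is an assumption.

```python
def choose_block_size(current: int, decrease_value: int, multiple_value: int,
                      diff_now: int, diff_decreased: int,
                      diff_multiplied: int) -> int:
    """Pick the next block size from the measured difference data amounts.

    diff_now:        difference data amount with the current block size
    diff_decreased:  amount if `decrease_value` were the block size
    diff_multiplied: amount if `multiple_value` were the block size
    """
    if diff_decreased < diff_now:
        # The smaller block size reduces the difference data.
        return decrease_value
    if diff_decreased == diff_now and diff_multiplied == diff_now:
        # Neither candidate changes the amount, so prefer the larger block size.
        return multiple_value
    # Same amount with the decrease value but more with the multiple value,
    # or a case the text does not enumerate (assumption): keep the current size.
    return current
```

With a decrease value of half the current size and a multiple value of twice the current size, this reproduces the halving and doubling behavior of Embodiment 1.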

Embodiment 1 describes the case in which the full backup data is a full copy of the data from address 0 of the disk area of the managed machine to the maximum address N of the disk. The present invention can also be applied when only a predetermined contiguous range of the disk area is copied. In that case, the full backup data means a complete backup of the disk area that has been designated in advance for backup.

When the means described in Embodiment 1 are implemented by a program, the program can be recorded on a computer-readable recording medium. The storage areas described in Embodiment 1 can be allocated by the program within the storage area of the management server 1000 or of a managed machine.

As an example, the program causes a computer to execute the following processes: (1) backing up the full backup data of a managed machine to the full backup data storage means 1110; (2) storing, in the working area (hash data storage means) 1300, hash data that includes a block size by which the full backup data is divided into a plurality of blocks and a plurality of hash values corresponding to the respective blocks; (3) when the managed machine starts a backup, transmitting a boot image that causes the managed machine to extract, using the hash data, the difference data between the full backup data and the current data, and restarting the managed machine; (4) notifying the managed machine of the hash data and causing it to extract the difference data; (5) calculating the block size to be used for the next backup using the full backup data and the difference data; (6) storing the latest full backup data, which reflects the difference data, in the full backup data storage means; (7) calculating a hash value for each block obtained by dividing the latest full backup data with the calculated block size; and (8) storing the calculated block size and the calculated hash values in the hash data storage means.

As described above, each of the embodiments provides the following effects.
The first effect is that the cost of building the system can be reduced, because the managed machine does not need to have its own difference calculation mechanism. This is because the management server holds the difference calculation mechanism (the backup module) and transmits it to the managed machine when a differential backup is executed.

The second effect is that the network load is smaller than with a method that collects a full backup and then calculates the difference data on the management server, or a method that transmits the previously collected full backup data to the managed machine and calculates the difference there. This is because only the hash data calculated in advance is transmitted to the managed machine, and the difference is calculated on the basis of that hash data.

The third effect is that, for a managed machine that exhibits a particular access pattern, the difference data amount becomes more optimized (smaller) the more backups are collected. This is because, after each differential backup, an appropriate block size for the difference calculation is computed, and the block size is changed automatically if the current value is not appropriate.

According to the present invention, differential backup can be performed at reasonable cost without providing a difference calculation mechanism on the managed machine. The invention is applicable to backing up managed machines whose disk access shows a characteristic (non-random) pattern.

The present invention is not limited to the embodiments described above. Within the scope of the present invention, the elements of the above embodiments may be changed, added to, or converted in ways that a person skilled in the art could readily conceive.

1000 Management server
1100 Backup data storage area
1110 Full backup data storage means
1120 Difference data storage means
1200 Boot image storage area
1220 Boot image
1221 Data transmission/reception means
1222 Difference data creation means
1223, 1420 Hash data calculation means
1224, 1430 Hash data comparison means
1300 Working area
1400 Dividing/combining means
1410 Division efficiency calculation means
1440 Coupling efficiency calculation means
1500 Data transmission/reception means
1600 Boot image transmission means
2000 Management network
3000 to 5000 Managed machines

Claims

1. A management server that backs up data on a managed machine, the management server comprising:
full backup data storage means for storing full backup data of the managed machine;
hash data storage means for storing hash data including a block size by which the full backup data is divided into a plurality of blocks and a plurality of hash values corresponding to the respective blocks;
boot image storage means for storing a boot image that causes the managed machine to extract difference data between the full backup data and current data using the hash data;
transmitting means for transmitting the boot image to the managed machine when the managed machine starts a backup;
data transmission/reception means for transmitting the hash data to the managed machine after the managed machine has been started using the boot image, and for receiving the difference data extracted by the managed machine; and
dividing/combining means for calculating the block size to be used for the next backup using the full backup data and the difference data, storing the latest full backup data reflecting the difference data in the full backup data storage means, calculating a hash value corresponding to each block obtained by dividing the latest full backup data using the calculated block size, and storing the calculated block size and the calculated hash values in the hash data storage means.

The management server according to claim 1, wherein the dividing/combining means calculates the difference data amount both for a case where a decrease value smaller than the current value is used as the block size and for a case where a multiple value of the current value is used as the block size, and calculates the block size according to those difference data amounts.

The management server according to claim 2, wherein the dividing/combining means sets the block size to the decrease value when the difference data amount decreases when the decrease value is used, sets the block size to the multiple value when the difference data amount is the same both when the decrease value is used and when the multiple value is used, and does not change the block size when the difference data amount is the same when the decrease value is used and increases when the multiple value is used.

4. The management server wherein the dividing/combining means uses half of the current value as the decrease value and uses double the current value as the multiple value.

A backup system that backs up data on a managed machine using a management server, wherein
The management server comprises:
Full backup data storage means for storing full backup data of the managed machine;
Hash data storage means for storing hash data including a block size for dividing the full backup data into a plurality of blocks and a plurality of hash values corresponding to the respective blocks;
Boot image storage means for storing a boot image that causes the managed machine to extract difference data between the full backup data and current data using the hash data;
Transmitting means for transmitting the boot image to the managed machine when the managed machine starts backup;
Data transmission/reception means for transmitting the hash data to the managed machine after the managed machine is started using the boot image, and receiving differential data extracted by the managed machine; and
Dividing/combining means for calculating, using the full backup data and the differential data, the block size to be used for the next backup, storing the latest full backup data reflecting the differential data in the full backup data storage means, calculating a hash value corresponding to each block obtained by dividing the latest full backup data using the calculated block size, and storing the calculated block size and the calculated hash values in the hash data storage means, and
The managed machine comprises:
Data transmitting/receiving means for receiving the boot image and the hash data and transmitting the difference data; and
Difference data creating means for calculating hash data of backup target data and creating the difference data by comparing the calculated hash data with the received hash data.

A backup method for backing up data on a managed machine, the method comprising:
Backing up full backup data of the managed machine to full backup data storage means;
Storing, in hash data storage means, hash data including a block size for dividing the full backup data into a plurality of blocks and a plurality of hash values corresponding to the respective blocks;
Storing, in boot image storage means, a boot image that causes the managed machine to extract difference data between the full backup data and current data using the hash data;
Restarting the managed machine using the boot image when the managed machine starts backup;
Notifying the managed machine of the hash data and causing the managed machine to extract the difference data;
Calculating, using the full backup data and the difference data, the block size to be used for the next backup;
Storing the latest full backup data reflecting the difference data in the full backup data storage means;
Calculating a hash value corresponding to each block obtained by dividing the latest full backup data using the calculated block size; and
Storing the calculated block size and the calculated hash values in the hash data storage means.

The backup method according to claim 6, wherein the block size calculation comprises:
Calculating the difference data amount both for a case where a decrease value smaller than the current value is used as the block size and for a case where a multiple value of the current value is used as the block size; and
Calculating the block size according to those difference data amounts.

8. The backup method wherein, in the block size calculation:
When the difference data amount decreases when the decrease value is used, the block size is set to the decrease value;
When the difference data amount is the same both when the decrease value is used and when the multiple value is used, the block size is set to the multiple value; and
When the difference data amount is the same when the decrease value is used and increases when the multiple value is used, the block size is not changed.

9. The backup method according to claim 7, wherein the block size calculation uses half of the current value as the decrease value and uses double the current value as the multiple value.

A program for implementing backup of data on a managed machine, the program causing a computer to execute:
A process of backing up full backup data of the managed machine to full backup data storage means;
A process of storing, in hash data storage means, hash data including a block size for dividing the full backup data into a plurality of blocks and a plurality of hash values corresponding to the respective blocks;
A process of, when the managed machine starts backup, transmitting to the managed machine a boot image that causes the managed machine to extract difference data between the full backup data and current data using the hash data, and restarting the managed machine;
A process of notifying the managed machine of the hash data and causing the managed machine to extract the difference data;
A process of calculating, using the full backup data and the difference data, the block size to be used for the next backup;
A process of storing the latest full backup data reflecting the difference data in the full backup data storage means;
A process of calculating a hash value corresponding to each block obtained by dividing the latest full backup data using the calculated block size; and
A process of storing the calculated block size and the calculated hash values in the hash data storage means.