JP5494123B2

JP5494123B2 - Differential backup system, management server and client, differential backup method and program

Info

Publication number: JP5494123B2
Application number: JP2010078162A
Authority: JP
Inventors: 浩山本
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2010-03-30
Filing date: 2010-03-30
Publication date: 2014-05-14
Anticipated expiration: 2030-03-30
Also published as: JP2011210068A

Description

The present invention relates to a differential backup system, a management server, a client, a differential backup method, and a program, and more particularly to a differential backup system and the like that optimizes the block size used for differential backup so that the differential backup can be completed in a short time.

Today, computers and the networks that interconnect them form critical infrastructure for companies and for society, so the loss of data is in itself a serious loss to society. The importance of data backup is therefore steadily increasing. Servers and business terminals that run business services are normally operated so that important data, or the entire disk, is backed up in preparation for unexpected accidents.

In doing so, the computer must not be stopped, or, if it is stopped, it must resume operation after the shortest possible interruption, so the backup operation has to be completed in as short a time as possible. In addition, because the volume of data handled by each computer keeps growing, the volume of backup data to be stored has to be kept as small as possible. These requirements are especially strong when the computer is running some kind of business service.

To shorten the time required for backup and to reduce the volume of data to be stored, a full backup of all the backup target data is often followed by differential backups, which back up only the files or areas that have changed since the full backup was taken.

For example, by taking a full backup once a week and taking a differential backup against that full backup data at the close of business every day, the data as of the close of each day can be reproduced while the volume of stored data is kept much smaller than it would be if a full backup were taken every day.

The following technical documents relate to data backup, and to differential backup in particular. Patent Document 1 describes a technique in which differential data is extracted on the client that is the backup target and only that differential data is sent to the server to perform a differential backup. Patent Document 2 describes a technique that predicts the variation in the amount of data transferred by differential backups and uses the prediction to detect and warn of failed backup operations.

Patent Document 3 describes a technique for copying data over a communication path in which the data before and after replication is divided into blocks, the hash values of corresponding blocks are compared, and only the blocks that differ are copied. Patent Document 4 describes a technique that estimates the copy time when data is to be copied and selects the more efficient data copy method (differential data copy or full data copy) according to the estimate. Patent Document 5 describes a technique that selects the more efficient backup method (file backup or volume backup) based on the block size to be copied.

Patent Document 1: JP 2004-094617 A
Patent Document 2: JP 2004-206611 A
Patent Document 3: JP 2005-100007 A
Patent Document 4: JP 2006-268740 A
Patent Document 5: JP 2008-299441 A

In a differential backup, the full backup data and the backup target data are divided at the same addresses into blocks of a predetermined block size, the two are compared block by block, and only the blocks that differ are stored as differential data. Changing the block size can therefore change the time the differential backup takes. Since, as noted above, the backup should take as little time as possible, it is useful to adjust the block size to that end.
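The block-by-block comparison described here can be illustrated with a minimal Python sketch. The function name `changed_blocks` and the in-memory byte strings are assumptions made for illustration only; the document does not specify an implementation, and a real system would read blocks from disk.

```python
def changed_blocks(full_backup, target, block_size):
    """Return (offset, block) pairs for blocks of `target` that differ
    from the block at the same address in `full_backup`.

    Hypothetical helper: both inputs are assumed to be byte strings of
    equal length that are held entirely in memory.
    """
    diffs = []
    for offset in range(0, len(target), block_size):
        new_block = target[offset:offset + block_size]
        old_block = full_backup[offset:offset + block_size]
        if new_block != old_block:
            # Only the differing block is kept as differential data.
            diffs.append((offset, new_block))
    return diffs
```

In this sketch, a smaller block size stores less data for an isolated change but requires more comparisons, which is one reason the processing time can depend on the block size.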

However, Patent Documents 1 to 5 do not describe any technique that focuses on adjusting the block size. Naturally, these techniques cannot reduce the time or the data volume required for a differential backup by adjusting and optimizing the block size.

In particular, in an environment where a backup management server collectively backs up the backup target data of multiple client computers connected over a network, and all the more in an open network environment in which several kinds of hardware or software are mixed, it is difficult in the first place to estimate the time required for backup processing accurately and to shorten it. Patent Documents 1 to 5 cannot solve this problem.

An object of the present invention is to provide a differential backup system, a management server, a client, a differential backup method, and a program that can shorten the time required for differential backup processing by appropriately adjusting the block size.

To achieve the above object, a differential backup system according to the present invention is a differential backup system in which one or more clients, each being a computer device that holds backup target data, and a backup management server that holds full backup data obtained by fully backing up each set of backup target data are connected to each other. The backup management server has: first fixed storage means that stores the full backup data; a first data transmission/reception unit that performs data communication with the client via a network; a first hash data calculation unit that divides the full backup data into blocks at each of predetermined first to third block sizes and calculates first hash data by applying a predetermined function to the data of each block; and an optimization processing data creation unit that, for each of the first to third block sizes, divides the first hash data into hash data for a test area, which is a part of the full backup data, and hash data for the remaining area, and transmits the hash data for the test area to the client. The client has: a second data transmission/reception unit that performs data communication with the backup management server; second fixed storage means that stores the backup target data; a second hash data calculation unit that, for each of the first to third block sizes, divides the backup target data into blocks and calculates second hash data by applying the function to the data of each block; a hash data comparison unit that compares, block by block, the calculated second hash data with the first hash data received from the backup management server; a differential backup unit that performs a differential backup of a block of the backup target data when the first and second hash data for that block differ; an optimum size calculation unit that, for each of the first to third block sizes, causes the differential backup unit to perform a trial differential backup of the test area of the backup target data using the first hash data for the test area, and determines as the optimum size the block size for which the trial differential backup took the least time; and a differential backup optimization unit that requests from the backup management server the first hash data for the remaining area at the determined optimum size and, using the first hash data for the remaining area sent in response, causes the differential backup unit to perform a differential backup of the remaining area of the backup target data at the optimum size.

To achieve the above object, a backup management server according to the present invention is a backup management server that is connected to one or more clients, each being a computer device that holds backup target data, and that holds full backup data obtained by fully backing up each set of backup target data. The backup management server has: fixed storage means that stores the full backup data; a data transmission/reception unit that performs data communication with the client via a network; a hash data calculation unit that divides the full backup data into blocks at each of predetermined first to third block sizes and calculates hash data by applying a predetermined function to the data of each block; and an optimization processing data creation unit that, for each of the first to third block sizes, divides the hash data into hash data for a test area, which is a part of the full backup data, and hash data for the remaining area, and transmits the hash data for the test area to the client.
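The server-side preparation described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the document says the hash is obtained by "a predetermined function" without naming it, so SHA-256 from `hashlib` is used as a stand-in; the test area is assumed to be the leading `test_len` bytes of the full backup data; and `test_len` is assumed to be a multiple of every block size. The names `block_hashes` and `split_hash_data` are hypothetical.

```python
import hashlib


def block_hashes(data, block_size):
    """Hash each block of `data` (SHA-256 is an assumed stand-in for the
    'predetermined function' mentioned in the text)."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def split_hash_data(full_backup, block_sizes, test_len):
    """For each block size, split the hash list into the part covering the
    test area and the part covering the remaining area.

    Assumption: the test area is the first `test_len` bytes, and
    `test_len` is a multiple of every block size.
    """
    result = {}
    for size in block_sizes:
        hashes = block_hashes(full_backup, size)
        n_test = test_len // size  # number of blocks inside the test area
        result[size] = {"test": hashes[:n_test], "rest": hashes[n_test:]}
    return result
```

Under this sketch, only the `"test"` lists would be sent to the client up front; the `"rest"` list for a single block size would be sent later, once the client has reported which size it chose.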

To achieve the above object, a client according to the present invention is a client that is a computer device holding backup target data and connected to a backup management server that holds full backup data obtained by fully backing up the backup target data. The client has: a data transmission/reception unit that performs data communication with the backup management server; fixed storage means that stores the backup target data; a hash data calculation unit that, for each of first to third block sizes, divides the backup target data into blocks and calculates second hash data by applying a function to the data of each block; a hash data comparison unit that compares, block by block, the calculated second hash data with first hash data received from the backup management server; a differential backup unit that performs a differential backup of a block of the backup target data when the first and second hash data for that block differ; an optimum size calculation unit that, for each of the first to third block sizes, causes the differential backup unit to perform a trial differential backup of a test area, which is a part of the backup target data, using the first hash data for the test area, and determines as the optimum size the block size for which the trial differential backup took the least time; and a differential backup optimization unit that requests from the backup management server the first hash data, at the determined optimum size, for the remaining area other than the test area and, using the first hash data for the remaining area sent in response, causes the differential backup unit to perform a differential backup of the remaining area of the backup target data at the optimum size.

To achieve the above object, a differential backup method according to the present invention is used in a differential backup system in which one or more clients, each being a computer device that holds backup target data, and a backup management server that holds full backup data obtained by fully backing up each set of backup target data are connected to each other. In this method, the first hash data calculation unit of the backup management server divides the full backup data into blocks at each of predetermined first to third block sizes and calculates first hash data by applying a predetermined function to the data of each block. The optimization processing data creation unit of the backup management server divides the first hash data, for each of the first to third block sizes, into hash data for a test area, which is a part of the full backup data, and hash data for the remaining area, and transmits the hash data for the test area to the client. For each of the first to third block sizes, the second hash data calculation unit of the client divides the backup target data into blocks of that block size and calculates second hash data by applying the function to the data of each block. The hash data comparison unit of the client compares, block by block, the calculated second hash data with the first hash data received from the backup management server, and the differential backup unit of the client performs a differential backup of a block of the backup target data when the first and second hash data for that block differ. The optimum size calculation unit of the client determines as the optimum size the block size for which the differential backup of the test area, performed at the first to third block sizes, took the least time. The differential backup optimization unit of the client requests from the backup management server the first hash data for the remaining area at the determined optimum size, and the data transmission/reception unit of the backup management server transmits the requested first hash data for the remaining area to the client. Using this first hash data for the remaining area, the differential backup optimization unit of the client causes the differential backup unit to perform a differential backup of the remaining area of the backup target data.
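The overall flow of the method (hash at three block sizes, trial backup of the test area, choice of the fastest size, backup of the remaining area) can be simulated in a single process with the Python sketch below. Several things are assumptions rather than statements from the document: SHA-256 stands in for the unspecified hash function, server and client are ordinary function calls instead of networked machines, the test area is the leading `test_len` bytes, both data sets have equal length, the trial result for the chosen size is reused rather than recomputed, and `restore` is a hypothetical helper added only to check the result.

```python
import hashlib
import time


def hashes(data, size):
    """Per-block digests (SHA-256 is an assumed stand-in hash)."""
    return [hashlib.sha256(data[i:i + size]).digest()
            for i in range(0, len(data), size)]


def diff(target, ref_hashes, size, start):
    """Back up every block of `target` whose hash differs from the
    reference hash for the same block, beginning at byte `start`."""
    out = []
    for k, ref in enumerate(ref_hashes):
        off = start + k * size
        block = target[off:off + size]
        if hashlib.sha256(block).digest() != ref:
            out.append((off, block))
    return out


def optimized_differential_backup(full, target, sizes, test_len):
    # Server side: first hash data for each candidate block size.
    server = {s: hashes(full, s) for s in sizes}

    # Client side: timed trial differential backup of the test area.
    timings, trial = {}, {}
    for s in sizes:
        n_test = test_len // s  # assumes test_len is a multiple of s
        t0 = time.perf_counter()
        trial[s] = diff(target, server[s][:n_test], s, 0)
        timings[s] = time.perf_counter() - t0

    # The block size with the shortest trial time becomes the optimum.
    best = min(sizes, key=lambda s: timings[s])

    # Remaining area is backed up at the optimum size only.
    rest = diff(target, server[best][test_len // best:], best, test_len)
    return best, trial[best] + rest


def restore(full, diffs):
    """Hypothetical check: apply differential blocks onto the full backup."""
    buf = bytearray(full)
    for off, block in diffs:
        buf[off:off + len(block)] = block
    return bytes(buf)
```

Which size wins depends on measured wall-clock time, so the chosen size can vary between runs; what should hold regardless is that applying the returned blocks to the full backup reproduces the backup target data.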

To achieve the above object, a differential backup program according to the present invention is used in a differential backup system in which one or more clients, each being a computer device that holds backup target data, and a backup management server that holds full backup data obtained by fully backing up each set of backup target data are connected to each other. The program causes a computer of the backup management server to execute: a procedure of dividing the full backup data into blocks at each of predetermined first to third block sizes; a procedure of calculating hash data by applying a predetermined function to the data of each block; a procedure of dividing the hash data, for each of the first to third block sizes, into hash data for a test area, which is a part of the full backup data, and hash data for the remaining area; a procedure of transmitting the hash data for the test area to the client; and a procedure of transmitting to the client the first hash data for the remaining area requested by that client.

To achieve the above object, another differential backup program according to the present invention is used in a differential backup system in which one or more clients, each being a computer device that holds backup target data, and a backup management server that holds full backup data obtained by fully backing up each set of backup target data are connected to each other. The program causes a computer of the client to execute: a procedure of dividing the backup target data, for each of first to third block sizes, into blocks of that block size; a procedure of calculating second hash data by applying a function to the data of each block; a procedure of comparing, block by block, the calculated second hash data with first hash data received from the backup management server; a procedure of performing a differential backup of a block of the backup target data when the first and second hash data for that block differ; a procedure of determining as the optimum size the block size for which the differential backup of a test area, which is a part of the backup target data, performed at the first to third block sizes, took the least time; a procedure of requesting from the backup management server the first hash data, at the determined optimum size, for the remaining area other than the test area; and a procedure of performing a differential backup of the remaining area of the backup target data using the first hash data for the remaining area sent in response.

As described above, the present invention is configured to divide the test area into blocks at the first to third block sizes and perform a differential backup for each, and then to divide the remaining area into blocks at the block size that took the least time and perform the differential backup of that area. The present invention can therefore provide a differential backup system, a management server, a client, a differential backup method, and a program that adjust the block size appropriately so as to shorten the time required for differential backup.

FIG. 1 is an explanatory diagram showing a more detailed configuration of the backup management server shown in FIG. 2.
FIG. 2 is an explanatory diagram showing the configuration of the differential backup system according to the present embodiment.
FIG. 3 is an explanatory diagram showing the concept of the full backup data and the differential data shown in FIG. 1. FIG. 3(a) shows the full backup data, and FIG. 3(b) shows the differential data.
FIG. 4 is an explanatory diagram showing the data structure of the hash data of the differential data shown in FIG. 3.
FIG. 5 is an explanatory diagram showing the contents of the data stored in the working area of the backup management server shown in FIG. 1.
FIG. 6 is an explanatory diagram showing the configuration of the client and the boot image shown in FIG. 1.
FIG. 7 is an explanatory diagram showing the data stored in the working area of the client shown in FIG. 6.
FIG. 8 is an explanatory diagram showing an outline of the differential backup processing according to the present embodiment. FIG. 8(a) shows "Step 1", the first step, and FIG. 8(b) shows "Step 2", the second step.
FIG. 9 is a continuation of FIG. 8. FIG. 9(c) shows "Step 3", the third step, and FIG. 9(d) shows "Step 4", the fourth step.
FIG. 10 is an explanatory diagram showing the hash data created in the steps shown in FIGS. 8 and 9. FIG. 10(a) shows the hash data for the "entire area" at block size "current (512 kilobytes)", FIG. 10(b) the hash data for the "entire area" at block size "combined (1024 kilobytes)", FIG. 10(c) the hash data for the "entire area" at block size "divided (256 kilobytes)", FIG. 10(d) the hash data for the "test area" at block size "current", FIG. 10(e) the hash data for the "test area" at block size "combined", and FIG. 10(f) the hash data for the "test area" at block size "divided".
FIG. 11 is a continuation of FIG. 10. FIG. 11(g) shows the hash data for the "remaining area" at block size "current", FIG. 11(h) the hash data for the "remaining area" at block size "combined", and FIG. 11(i) the hash data for the "remaining area" at block size "divided".
FIG. 12 is a flowchart showing the optimization hash data calculation processing that the optimization processing data creation unit shown in FIG. 1 performs on the backup management server as preparation for differential backup.
FIG. 13 is a flowchart showing the hash data calculation processing performed by the hash data calculation unit, shown as steps S302, S304, and S306 in FIG. 12.
FIG. 14 is a flowchart showing the optimized differential backup processing that the differential backup optimization unit shown in FIG. 6 performs on the client.
FIG. 15 is a flowchart showing the optimum size calculation processing performed by the optimum size calculation unit, shown as step S401 in FIG. 14.
FIG. 16 is a continuation of FIG. 15.
FIG. 17 is a flowchart showing the differential backup processing performed by the differential backup unit and the hash data comparison unit, shown as steps S452, S454, and S457 in FIG. 15 and steps S404, S405, and S406 in FIG. 14.
FIG. 18 is a flowchart showing the processing on the backup management server that corresponds to the processing on the client shown in FIGS. 14 to 16.

(First embodiment)
The configuration of a first embodiment of the present invention is described below with reference to the accompanying FIGS. 1 to 7. The basic content of the embodiment is described first, followed by a more specific description.
The differential backup system 1 according to the present embodiment is a differential backup system in which one or more clients 21, each being a computer device that holds backup target data 221, and a backup management server 10 that holds full backup data 121 obtained by fully backing up each set of backup target data are connected to each other. The backup management server 10 has: first fixed storage means 103 that stores the full backup data 121; a first data transmission/reception unit 112 that performs data communication with the client 21 via a network 30; a first hash data calculation unit 114 that divides the full backup data 121 into blocks at each of predetermined first to third block sizes and calculates first hash data 160 by applying a predetermined function to the data of each block; and an optimization processing data creation unit 111 that, for each of the first to third block sizes, divides the first hash data into hash data for a test area, which is a part of the full backup data 121, and hash data for the remaining area, and transmits the hash data 164 to 166 for the test area to the client. The client 21 has: a second data transmission/reception unit 211 that performs data communication with the backup management server 10; second fixed storage means 203 that stores the backup target data 221; a second hash data calculation unit 215 that, for each of the first to third block sizes, divides the backup target data 221 into blocks and calculates second hash data by applying the function to the data of each block; a hash data comparison unit 216 that compares, block by block, the calculated second hash data with the first hash data 164 to 166 received from the backup management server; a differential backup unit 214 that performs a differential backup of a block of the backup target data when the first and second hash data for that block differ; an optimum size calculation unit 213 that, for each of the first to third block sizes, causes the differential backup unit to perform a trial differential backup of the test area of the backup target data using the first hash data 164 to 166 for the test area, and determines as the optimum size the block size for which the trial differential backup took the least time; and a differential backup optimization unit 212 that requests from the backup management server the first hash data for the remaining area at the determined optimum size and, using the first hash data for the remaining area sent in response, causes the differential backup unit to perform a differential backup of the remaining area of the backup target data at the optimum size.

Here, the second block size is larger than the first block size, and the third block size is smaller than the first block size. When returning the first hash data for the remaining area to the client, the optimization processing data creation unit 111 of the backup management server 10 stores the optimum size it received as the first block size to be used the next time the same processing is performed.
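The relationship among the three block sizes, and the carrying over of the optimum size to the next run, can be sketched as follows. The factor of two is an assumption: the text only requires the second size to be larger and the third to be smaller than the first, although the example values in the description of FIG. 10 (512, 1024, and 256 kilobytes) are consistent with doubling and halving. The names `candidate_block_sizes` and `BlockSizeStore`, and the idea of keeping one stored size per client identifier, are hypothetical.

```python
def candidate_block_sizes(current):
    """Return (first, second, third) block sizes: the current size, a
    larger size, and a smaller size. Doubling/halving is an assumption."""
    if current < 2:
        raise ValueError("current block size must be at least 2 bytes")
    return current, current * 2, current // 2


class BlockSizeStore:
    """Hypothetical server-side record of the size to start from next time."""

    def __init__(self, default_size):
        self._default = default_size
        self._by_client = {}

    def current_for(self, client_id):
        # Fall back to the default until an optimum has been recorded.
        return self._by_client.get(client_id, self._default)

    def record_optimum(self, client_id, optimum_size):
        # The reported optimum becomes the first block size of the next run.
        self._by_client[client_id] = optimum_size
```

Because each run starts from the previous optimum, repeated runs in this sketch can move the block size step by step toward whatever size currently performs best.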

In addition, if the trial differential backup performed at the second block size took less time than the trial differential backup performed at the first block size, the optimum size calculation unit 213 of the client 21 determines the second block size to be the optimum size at that point.
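The decision rule with this early exit can be sketched in Python as below. The function name `choose_optimum_size` and the `measure` callback (assumed to run a trial differential backup at the given block size and return its duration in seconds) are hypothetical, and the handling of ties, where the first block size is kept, is an assumption not stated in the text.

```python
def choose_optimum_size(measure, first, second, third):
    """Pick the block size whose trial differential backup was fastest.

    `measure(size)` is an assumed callback returning the trial duration.
    """
    t_first = measure(first)
    t_second = measure(second)
    if t_second < t_first:
        # Early exit: the larger size already beat the current size,
        # so the trial at the third (smaller) size is skipped.
        return second
    t_third = measure(third)
    # Here the second size was not faster than the first, so the choice
    # is between the first and third sizes (ties keep the first size).
    return third if t_third < t_first else first
```

In this sketch the early exit saves one trial run, at the cost that a third size that would have been faster still is not discovered in that run.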

The backup management server 10 further includes a boot image storage unit 117 that stores, in the first fixed storage unit 103, a boot image 141, which is a program that causes the client's processor to function as the second data transmission/reception unit, the second hash data calculation unit, the hash data comparison unit, the differential backup unit, the optimal size calculation unit, and the differential backup optimization unit, and a boot image transmission unit 113 that transmits this boot image to the client and causes the client to run it.

With this configuration, the differential backup system 1 can tune the block size appropriately and thereby shorten the time required for a differential backup.
This is described in more detail below.

FIG. 2 is an explanatory diagram showing the configuration of the differential backup system 1 according to the present embodiment. The differential backup system 1 consists of a backup management server 10 and a plurality of clients 21, 22, ... connected to one another via a network 30. The backup management server 10 is a computer device running software that backs up the data of the clients 21, 22, ... over the network 30.

The clients 21, 22, ... are computer devices running software that allows their data to be backed up by the backup management server 10. Hereinafter, an arbitrary one of these clients is referred to as the client 21 as a representative. Any number of clients 21, one or more, may be attached to a single backup management server 10. The network 30 is a LAN (Local Area Network), a WAN (Wide Area Network), or the like that connects the backup management server 10 and the clients 21, 22, ... to one another.

FIG. 1 is an explanatory diagram showing the configuration of the backup management server 10 of FIG. 2 in more detail. The backup management server 10 is a computer device comprising a processor 101, a temporary storage unit 102, a fixed storage unit 103, and a communication unit 104. The temporary storage unit 102 is a storage device such as a RAM (Random Access Memory) that holds the data the processor 101 is working on, and the fixed storage unit 103 is a storage device such as a hard disk that stores large volumes of data.

On the processor 101, an optimization processing data creation unit 111, a data transmission/reception unit 112, a boot image transmission unit 113, a hash data calculation unit 114, a full backup data storage unit 115, a difference data storage unit 116, and a boot image storage unit 117 run as computer programs. The optimization processing data creation unit 111 creates the hash data that the client 21 uses to optimize the differential backup in the processing described later. The data transmission/reception unit 112 controls the communication unit 104 to carry out data communication with the client 21. The boot image transmission unit 113 transmits the boot image to the client 21 and causes the client to run it. The hash data calculation unit 114 calculates a hash value for each of the blocks obtained by dividing the full backup data at an arbitrary size.

On the fixed storage unit 103, a full backup data storage area 120, a difference data storage area 130, and a boot image storage area 140 are reserved. The full backup data storage unit 115 stores the full backup data 121 in the full backup data storage area 120, the difference data storage unit 116 stores the difference data 131 in the difference data storage area 130, and the boot image storage unit 117 stores the boot image 141 in the boot image storage area 140. The full backup data storage unit 115, the difference data storage unit 116, and the boot image storage unit 117 may use a DBMS (Database Management System) that stores and manages large volumes of data.

The communication unit 104 performs data communication with the client 21 via the network 30. In the temporary storage unit 102, a working area 150 is reserved as an area in which the optimization processing data creation unit 111 temporarily stores the data it is working on. The working area 150 holds hash data 160 and a current size setting value 151, both of which are described later.

FIG. 3 is an explanatory diagram showing the concept of the full backup data 121 and the difference data 131 shown in FIG. 1. FIG. 3(a) shows the full backup data 121 and FIG. 3(b) shows the difference data 131. The full backup data 121 shown in FIG. 3(a) corresponds to the backup target data 221, that is, a complete copy of the disk area of the client 21: a full copy of the data from address 0 to the maximum disk address N.

In contrast, the difference data 131 shown in FIG. 3(b) is obtained by comparing, within the disk area of the client 21, the previously collected full backup data 121 with the current backup target data 221 and copying only the portions that differ.

Let N be the size of the entire full backup data 121. Both the previously collected full backup data 121 and the current backup target data 221 are divided into blocks of the same block size n. A block whose data matches exactly is marked "no difference", a block whose data differs is marked "difference", and the start address of each block marked "difference" is stored in a temporary list 260 described later. Only the blocks whose start addresses are stored in the temporary list 260 are then backed up as "difference data".

As shown in FIG. 3(b), the difference data 131 consists of a size 131a corresponding to the block size used for the division, an address 131b corresponding to the start address of a block, and data 131c corresponding to the actual block data. In FIG. 3(b), blocks with "no difference" are shown blank and blocks with a "difference" are shown hatched. The figure illustrates an example in which, counting the first block as block 0, two blocks, block 1 and block a (where a is a natural number with 2 ≤ a < k - 1 and k = N/n), are marked "difference" and backed up as difference data 131.

Here, the optimization processing data creation unit 111 calculates the hash data 160 of each block and stores it in the working area 150. FIG. 4 is an explanatory diagram showing the data structure of the hash data 160 for the difference data 131 shown in FIG. 3. The hash data 160 collects the hash values of the blocks obtained by dividing the backup target data at block size n. It is a table that stores a block size 160a, corresponding to the block size n used for the division, together with the start address 160b of each block and the hash value 160c of that block.

A hash value here is a fixed-length value obtained by summarizing arbitrary data. It is the value obtained by applying to the data a function that deterministically yields a digest of arbitrary data, such as the GetHashCode function in the C# language. Such a function is called a hash function. In the present embodiment, the block-by-block comparison shown in FIG. 3(b) does not compare the block data directly; instead, the presence or absence of a difference is determined by comparing the hash values of the block data.
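The hash-based comparison described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, and SHA-256 from Python's `hashlib` is an assumed hash function (the description names only C#'s GetHashCode as an example). Note that equal hash values indicate equal blocks only with high probability, not with certainty.

```python
import hashlib


def block_hashes(data: bytes, block_size: int) -> dict[int, str]:
    """Map each block's start address to a digest of that block."""
    return {
        addr: hashlib.sha256(data[addr:addr + block_size]).hexdigest()
        for addr in range(0, len(data), block_size)
    }


def changed_block_addresses(full_backup: bytes, target: bytes,
                            block_size: int) -> list[int]:
    """Return start addresses of blocks whose hash values differ."""
    old = block_hashes(full_backup, block_size)
    new = block_hashes(target, block_size)
    return [addr for addr in sorted(new) if old.get(addr) != new[addr]]


# Hypothetical 16-byte "disk" with two modified bytes.
full_backup = bytes(16)
target = bytearray(full_backup)
target[5] = 0xFF
target[13] = 0x01

# Plays the role of the temporary list of "difference" block addresses.
temp_list = changed_block_addresses(full_backup, bytes(target), block_size=4)
```

With a block size of 4, the modified bytes at offsets 5 and 13 fall in the blocks starting at addresses 4 and 12, so only those two blocks would need to be sent as difference data.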

As described later, in the present embodiment the data shown in FIG. 4 is created for each combination of the block sizes "current", "combined", and "divided" with the hash calculation targets "entire area", "test area", and "remaining area". Hash data 160 is therefore used as the generic term, and the hash data for each individual block size and calculation target is referred to as hash data 161 to 169.

The optimization processing data creation unit 111 and the hash data calculation unit 114 calculate the hash data 160 as preparation for the differential backup while referring to the full backup data storage area 120. The data transmission/reception unit 112 transmits the hash data 160 to the client 21 and receives difference data from the client 21. These processes are described in detail later.

FIG. 5 is an explanatory diagram showing the contents of the data stored in the working area 150 of the backup management server 10 shown in FIG. 1. The working area 150 stores the hash data 160 (161 to 169, with the structure shown in FIG. 4) created by the optimization processing data creation unit 111 and the hash data calculation unit 114, and the current size setting value 151.

FIG. 6 is an explanatory diagram showing the configuration of the client 21 and the boot image 141 shown in FIG. 1. To perform a differential backup of the client 21, the boot image transmission unit 113 of the backup management server 10 transmits the boot image 141 to the client 21. Once the boot image 141 has been transmitted and the client 21 restarted, the various programs run on the processor 201 of the client 21 and the differential backup is executed.

Like the backup management server 10, the client 21 is a computer device comprising a processor 201, a temporary storage unit 202, a fixed storage unit 203, and a communication unit 204. On the processor 201, a data transmission/reception unit 211, a differential backup optimization unit 212, an optimal size calculation unit 213, a differential backup unit 214, a hash data calculation unit 215, and a hash data comparison unit 216 operate. The boot image 141 is a computer program that causes the processor 201 to function as these units.

The boot image 141 may be served, for example, by a server program that follows the PXE (Preboot eXecution Environment) protocol specified in RFC (Request For Comments) 4578, or it may be pushed to a so-called client program running on the client 21. Transmitting the boot image 141 to the client 21 and restarting the client 21 from that boot image is a known technique and is not described in further detail.

In the fixed storage unit 203, a backup target data storage area 220 is reserved, and the backup target data 221 is stored in it. In the temporary storage unit 202, a working area 250 is reserved as an area in which the differential backup optimization unit 212 temporarily stores the data it is working on. The working area 250 holds the hash data, processing times, and optimal size needed for the differential backup processing. The details are described later.

The data transmission/reception unit 211 receives the hash data 164 to 166 from the backup management server 10 and transmits the difference data created on the client 21 to the backup management server 10. The differential backup optimization unit 212 determines the block size that minimizes the differential backup processing time and performs the differential backup at that block size. The optimal size calculation unit 213 runs trial differential backups on the test area of the backup target data 221. The differential backup unit 214 performs the actual differential backup processing. The hash data calculation unit 215 calculates hash data for the backup target data 221. The hash data comparison unit 216 compares the calculated hash data with the hash data received from the backup management server 10. Each of these processes is described in detail later.

FIG. 7 is an explanatory diagram showing the data stored in the working area 250 of the client 21 shown in FIG. 6. The working area 250 stores the hash data 164 to 166 for the "test area" received from the backup management server 10; the test backup times 251 to 253 measured on the "test area" at the block sizes "current", "combined", and "divided", respectively; the optimal block size 254 that the optimal size calculation unit 213 determines from those times; and the temporary list 260 used while the backup is in progress (see the description of FIG. 3(b)).

The working area 250 also stores, for the backup of the "remaining area", one of the hash data 167 to 169 for the "remaining area": the one whose block size, among "current", "combined", and "divided", is determined to be the "optimal size" in the processing described later.

FIGS. 8 and 9 are explanatory diagrams outlining the differential backup processing according to the present embodiment. The processing consists of four steps, Steps 1 to 4, and is split across two figures for reasons of space. FIG. 8(a), which shows the first step, "Step 1", depicts the full backup data 121 in the backup management server 10 and the backup target data 221 of the client 21. A specific area of a specific size within the full backup data 121 is designated the test area 121a, and the rest is designated the remaining area 121b. Correspondingly, the same areas of the backup target data 221 are designated the test area 221a and the remaining area 221b.

In this step, the entire full backup data 121 is divided at block size n and a hash value is calculated for each block. Three block sizes n are prepared: the "current size" (hereinafter simply "current"), "current × 2", and "current / 2". For each of them, the hash data calculation unit 114 calculates the hash values and creates hash data 161 to 163 for the "entire area".

The current size can be chosen arbitrarily; in this example, 512 kilobytes is used as the current size setting value 151, that is, the default value of current. Hereinafter, the three block sizes are referred to as "current", "combined" (current × 2), and "divided" (current / 2), and the hash calculation targets are referred to as "entire area", "test area", and "remaining area".

In FIG. 8(b), which shows the second step, "Step 2", a specific area of a specific size within the full backup data 121 and the backup target data 221 is taken as the test area and the rest as the remaining area. The optimization processing data creation unit 111 then splits the hash data 161 to 163 calculated in Step 1 into hash data 164 to 166 for the test area 121a and hash data 167 to 169 for the remaining area 121b.

The size of the test area can be chosen arbitrarily. In this example, the first 2048 kilobytes are the test area and everything after that is the remaining area. The hash data 164 to 166 for the test area is then transmitted to the client 21.
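Splitting one hash table into a test-area part and a remaining-area part might look like the sketch below. The function name, the `(start_address, hash_value)` tuple layout, and the 4096-kilobyte disk are assumptions for illustration, and the sketch assumes that block boundaries line up with the test-area boundary, which holds for 256, 512, and 1024 kilobyte blocks against a 2048 kilobyte test area.

```python
KB = 1024


def split_hash_entries(entries, test_area_size):
    """Split (start_address, hash_value) pairs at the test-area boundary."""
    test = [(addr, h) for addr, h in entries if addr < test_area_size]
    remaining = [(addr, h) for addr, h in entries if addr >= test_area_size]
    return test, remaining


# Hypothetical entire-area hash table: 4096 KB disk, 512 KB blocks,
# with placeholder strings standing in for real hash values.
current = 512 * KB
entries = [(addr, f"h{addr // current}")
           for addr in range(0, 4096 * KB, current)]

test_part, remaining_part = split_hash_entries(entries, 2048 * KB)
```

Only `test_part` would be sent to the client up front; `remaining_part` is kept on the server until the client reports which block size it has chosen.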

In FIG. 9(c), which shows the third step, "Step 3", a differential backup of the test area 221a is executed for each of the block sizes "current", "combined", and "divided": under the control of the optimal size calculation unit 213, the hash data comparison unit 216 compares the hash values of the blocks while the differential backup unit 214 performs the differential backup. The time actually taken is measured in each case, and the optimal size calculation unit 213 determines which block size gave the shortest time.

In FIG. 9(d), which shows the fourth step, "Step 4", the backup management server 10 transmits to the client 21 whichever of the hash data 167 to 169 for the remaining area 221b corresponds to the block size that gave the shortest time in Step 3. Then, under the control of the differential backup optimization unit 212, the hash data comparison unit 216 compares the hash values of the blocks while the differential backup unit 214 performs the differential backup of the remaining area, which completes the differential backup of the entire backup target data 221.

FIGS. 10 and 11 are explanatory diagrams showing the hash data 160 created in the steps shown in FIGS. 8 and 9. For each of nine combinations, namely the block sizes 160a "current", "combined", and "divided" crossed with the targets "entire area", "test area", and "remaining area", the hash data 160 stores the start address 160b of each block and the hash value 160c of that block.

FIG. 10(a) shows hash data 161 for the "entire area" at block size "current" (512 kilobytes); FIG. 10(b) shows hash data 162 for the "entire area" at block size "combined" (1024 kilobytes); FIG. 10(c) shows hash data 163 for the "entire area" at block size "divided" (256 kilobytes); FIG. 10(d) shows hash data 164 for the "test area" at block size "current"; FIG. 10(e) shows hash data 165 for the "test area" at block size "combined"; FIG. 10(f) shows hash data 166 for the "test area" at block size "divided"; FIG. 11(g) shows hash data 167 for the "remaining area" at block size "current"; FIG. 11(h) shows hash data 168 for the "remaining area" at block size "combined"; and FIG. 11(i) shows hash data 169 for the "remaining area" at block size "divided".
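To give a feel for how large each of the nine tables is, the following sketch counts the entries per table using the example sizes from the text (512 kilobyte current size, 2048 kilobyte test area). The 8192 kilobyte total disk size and all names are hypothetical values chosen only for the illustration.

```python
KB = 1024
CURRENT = 512 * KB      # current size setting value from the example
TEST_AREA = 2048 * KB   # test-area size from the example
TOTAL = 8192 * KB       # hypothetical disk size, not from the source

sizes = {
    "current": CURRENT,
    "combined": CURRENT * 2,
    "divided": CURRENT // 2,
}


def block_count(area_bytes: int, block_size: int) -> int:
    """Number of blocks needed to cover an area (ceiling division)."""
    return -(-area_bytes // block_size)


# One row per block size, one column per calculation target,
# matching the nine tables numbered 161 to 169.
counts = {
    name: {
        "entire": block_count(TOTAL, size),
        "test": block_count(TEST_AREA, size),
        "remaining": block_count(TOTAL - TEST_AREA, size),
    }
    for name, size in sizes.items()
}
```

Halving the block size doubles the number of hash values to compute, transmit, and compare, which is the overhead that the block-size choice trades off against the amount of block data transferred.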

(Optimization hash data calculation processing)
FIG. 12 is a flowchart of the optimization hash data calculation processing that the optimization processing data creation unit 111 shown in FIG. 1 performs on the backup management server 10 as preparation for a differential backup. The optimization processing data creation unit 111 first sets the system's initial value if the current size setting value 151 has not been set (step S301). This initial value is not specified by the present invention, because an appropriate value should be chosen for each system according to, for example, its field of application.

Once current is set in the working area 150, the optimization processing data creation unit 111 calls the hash data calculation unit 114 with "full backup data 121" as the calculation target data and "current" as the block size, has it execute the hash data calculation processing shown in FIG. 13 and described later (step S302), and stores the output in the working area 150 as hash data 161 for the "entire area" at block size "current" (step S303).

Next, it calls the hash data calculation unit 114 with "full backup data 121" as the calculation target data and "current × 2 (combined)" as the block size, has it execute the hash data calculation processing shown in FIG. 13 (step S304), and stores the output in the working area 150 as hash data 162 for the "entire area" at block size "combined" (step S305).

Similarly, it calls the hash data calculation unit 114 with "full backup data 121" as the calculation target data and "current / 2 (divided)" as the block size, has it execute the hash data calculation processing shown in FIG. 13 (step S306), and stores the output in the working area 150 as hash data 163 for the "entire area" at block size "divided" (step S307).

The optimization processing data creation unit 111 then splits each of the hash data 160 obtained above into a test-area part and a remaining-area part, as shown in FIGS. 10 and 11 (step S308), and the preparation processing ends. This preparation can be performed at any time: immediately before a differential backup is executed, or after one has been executed, in readiness for the next.

FIG. 13 is a flowchart of the hash data calculation processing performed by the hash data calculation unit 114, shown as steps S302, S304, and S306 in FIG. 12. The hash data calculation unit 114 is started by the optimization processing data creation unit 111 with the calculation target data and a block size specified. It first divides the calculation target data into blocks of that size (step S351), calculates a hash value for each block (step S352), and outputs hash data 160 combining the block size 160a, the start addresses 160b, and the hash values 160c (step S353).
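Steps S351 to S353 can be sketched as a single function that returns a table shaped like the hash data 160. The `HashData` class and `compute_hash_data` function are hypothetical names, and SHA-256 is an assumed stand-in for the unspecified hash function.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class HashData:
    """Table shaped like hash data 160 (names here are illustrative)."""
    block_size: int                                    # block size 160a
    entries: list[tuple[int, str]] = field(default_factory=list)
    # each entry: (start address 160b, hash value 160c)


def compute_hash_data(data: bytes, block_size: int) -> HashData:
    """Divide data into blocks, hash each block, and collect the results."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    result = HashData(block_size=block_size)
    for addr in range(0, len(data), block_size):       # S351: divide
        block = data[addr:addr + block_size]
        digest = hashlib.sha256(block).hexdigest()     # S352: hash
        result.entries.append((addr, digest))
    return result                                      # S353: output


# Tiny stand-in for the calculation target data; the last block is short.
sample = bytes(range(10))
hash_data = compute_hash_data(sample, block_size=4)
```

The preparation processing would call such a function three times on the full backup data, once each for the current size, twice that size, and half that size.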

When, with the preparation processing of FIG. 12 completed, the backup management server 10 is instructed to perform a differential backup, for example by a user operation, the boot image transmission unit 113 of FIG. 2 transmits the boot image 141 to the client 21 and then restarts that client 21. The restarted client 21 runs from the boot image 141 shown in FIG. 6 and starts the optimized differential backup processing.

(Optimized differential backup processing)
FIG. 14 is a flowchart of the optimized differential backup processing performed on the client 21 by the differential backup optimization unit 212 shown in FIG. 6. The differential backup optimization unit 212 calls the optimal size calculation unit 213 and has it execute the optimal size calculation processing (step S401).

FIGS. 15 and 16 (split across two figures for reasons of space) are flowcharts of the optimal size calculation processing performed by the optimal size calculation unit 213, shown as step S401 in FIG. 14. The optimal size calculation unit 213 first requests, via the data transmission/reception unit 211, the hash data 164 to 166 for the test area at each block size (current, combined, divided) from the backup management server 10, and receives it (step S451).

The optimal size calculation unit 213 then has the differential backup unit 214 execute differential backup processing with the block size "current" and the target "test area" as input (step S452), and stores the processing time in the working area 250 as the test backup time 251 for "current" on the "test area" (step S453).

Next, the optimal size calculation unit 213 has the differential backup unit 214 execute differential backup processing with the block size "combined" and the target "test area" as input (step S454), and stores the processing time in the working area 250 as the test backup time 252 for "combined" on the "test area" (step S455).

The test backup time 251 for "current" on the "test area" is then compared with the test backup time 252 for "combined" on the "test area" (step S456). If the latter is shorter, "combined" is judged to be the optimal block size and is stored in the working area 250 as the optimal block size 254 (step S462), and the optimal size calculation processing ends.

If, on the other hand, the test backup time 251 for "current" on the "test area" is shorter than the test backup time 252 for "combined" on the "test area" in step S456, the optimal size calculation unit 213 has the differential backup unit 214 execute differential backup processing with the block size "divided" and the target "test area" as input (step S457), and stores the processing time in the working area 250 as the test backup time 253 for "divided" on the "test area" (step S458).

The test backup time 251 for "current" on the "test area" is then compared with the test backup time 253 for "divided" on the "test area" (step S459). If the latter is shorter, "divided" is judged to be the optimal block size and is stored in the working area 250 as the optimal block size 254 (step S461). If the former is shorter, "current" is judged to be the optimal block size and is stored in the working area 250 as the optimal block size 254 (step S460). The optimal size calculation processing then ends.
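The decision flow of steps S452 to S462 can be sketched as below. The `run_trial` callback is a hypothetical stand-in for "run a trial differential backup of the test area at this block size and return the elapsed time"; in a real client it might wrap the backup call with `time.perf_counter()`. How ties are handled is not stated in the description, so this sketch assumes that a tie does not count as "shorter".

```python
from typing import Callable


def choose_optimal_size(current: int,
                        run_trial: Callable[[int], float]) -> int:
    """Pick a block size following the S452-S462 decision flow."""
    combined, divided = current * 2, current // 2
    t_current = run_trial(current)        # S452-S453
    t_combined = run_trial(combined)      # S454-S455
    if t_combined < t_current:            # S456
        return combined                   # S462: skip the "divided" trial
    t_divided = run_trial(divided)        # S457-S458
    if t_divided < t_current:             # S459
        return divided                    # S461
    return current                        # S460


# Fake timings (seconds) so the sketch runs without a real backup.
KB = 1024
fake_times = {512 * KB: 3.0, 1024 * KB: 2.0, 256 * KB: 1.0}
tried: list[int] = []


def fake_trial(size: int) -> float:
    tried.append(size)
    return fake_times[size]


best = choose_optimal_size(512 * KB, fake_trial)
```

In this run the larger block size wins as soon as it beats the current size, so the smaller size is never tried even though its fake timing is the lowest; this mirrors the early exit at step S456.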

本実施形態で示したような、バックアップ管理サーバがネットワークで接続されている複数台のクライアントコンピュータのバックアップ対象データを一括してバックアップするという環境では、ネットワーク上でのデータ転送速度が差分バックアップの所要時間にとって最も大きく影響する。 In an environment where the backup target data of multiple client computers connected via a network is backed up collectively as shown in this embodiment, the data transfer speed on the network requires differential backup. It has the greatest impact on time.

もしネットワーク上でのデータ転送速度が極めて早く、そのデータ転送速度による処理時間に対する影響が無視できるなら、バックアップ処理においてオーバーヘッドとなる差分データ計算（ハッシュデータの算出および比較など）の処理を可能な限り少なくする方が、差分バックアップの所要時間の短縮になる。従って、ブロックサイズは可能な限り大きくする方が望ましい。バックアップ対象データ全体を１ブロックとすることが最も理想的ではある。 If the data transfer speed on the network is extremely fast and the effect on the processing time due to the data transfer speed can be ignored, differential data calculation (such as calculation and comparison of hash data) that is an overhead in backup processing should be performed as much as possible. Decreasing the number will shorten the time required for differential backup. Therefore, it is desirable to increase the block size as much as possible. It is most ideal to make the entire backup target data one block.

しかしながら、現実的にはネットワーク上でのデータ転送速度がバックアップ作業の上で障害となることが多い。そのため本実施形態では、図１５〜１６に示したように、データ転送速度による影響が無視できる範囲で、可能な限りブロックサイズを大きくするというアルゴリズムの構成となっている。 However, in reality, the data transfer speed on the network often becomes an obstacle to backup work. Therefore, in this embodiment, as shown in FIGS. 15 to 16, the algorithm is configured to increase the block size as much as possible within a range where the influence of the data transfer rate can be ignored.
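
The trade-off described above can be made concrete with a small calculation. The sketch below is an illustrative model only, with hypothetical names and made-up change offsets: it counts how many blocks must be hashed and how many bytes must be sent for a given block size, assuming that a block is sent in full whenever any byte inside it has changed.

```python
import math

KB = 1024


def transfer_cost(changed_offsets, block_size, total_size):
    """Return (blocks_hashed, bytes_sent) for one differential pass."""
    # Every block is hashed once, so fewer (larger) blocks mean less overhead.
    blocks_hashed = math.ceil(total_size / block_size)
    # A block is dirty if at least one changed byte falls inside it.
    dirty_blocks = {offset // block_size for offset in changed_offsets}
    # Each dirty block is sent in full (the last block may be shorter).
    bytes_sent = sum(
        min(block_size, total_size - index * block_size) for index in dirty_blocks
    )
    return blocks_hashed, bytes_sent


# Hypothetical example: two changed bytes in a 2048-kilobyte area.
changed = [100 * KB, 700 * KB]
for size_kb in (256, 512, 1024):
    hashed, sent = transfer_cost(changed, size_kb * KB, 2048 * KB)
    print(f"{size_kb} KB blocks: {hashed} hashes, {sent // KB} KB sent")
```

In this made-up case, 256-kilobyte blocks need 8 hashes and send 512 kilobytes, while 1024-kilobyte blocks need only 2 hashes but send 1024 kilobytes. Which side wins depends on the actual network and data, which is why the embodiment measures trial backups rather than estimating.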

前述のステップＳ４５６で、「カレント」「試験用領域」の試験用バックアップ所要時間２５１より「結合」「試験用領域」の試験用バックアップ所要時間２５２が短ければ「結合」を最適ブロックサイズ２５４としたのは、この考え方に基づいている。この場合は、まだデータ転送速度が障害になる範囲ではないと判断できるから、ブロックサイズ「分割」に対する試験用バックアップは行わない。 If the required backup time 252 for “join” and “test area” is shorter than the required test backup time 251 for “current” and “test area” in step S456 described above, “join” is set to the optimum block size 254. Is based on this idea. In this case, since it can be determined that the data transfer rate is not in the range of an obstacle, the test backup for the block size “division” is not performed.

図１４に戻って、差分バックアップ最適化部２１２は最適サイズ計算部２１３による最適なブロックサイズの計算結果、即ち最適ブロックサイズ２５４について判断し（ステップＳ４０２〜４０３）、これが「カレント」であれば（ステップＳ４０２：ＹＥＳ）、データ送受信部２１１はバックアップ管理サーバ１０に残領域のブロックサイズ「カレント」についてのハッシュデータ１６７を要求してこれを受信し、これを入力として差分バックアップ部２１４を呼び出して残領域の差分バックアップ処理を実行する（ステップＳ４０６）。 Returning to FIG. 14, the differential backup optimization unit 212 examines the result calculated by the optimum size calculation unit 213, that is, the optimum block size 254 (steps S402 to S403). If it is “current” (step S402: YES), the data transmission/reception unit 211 requests the hash data 167 for the block size “current” of the remaining area from the backup management server 10, receives it, and calls the differential backup unit 214 with it as input to execute the differential backup process for the remaining area (step S406).

最適ブロックサイズ２５４が「結合」であれば（ステップＳ４０２：ＮＯ、ステップＳ４０３：ＹＥＳ）、データ送受信部２１１はバックアップ管理サーバ１０に残領域のブロックサイズ「結合」についてのハッシュデータ１６８を要求してこれを受信し、これを入力として差分バックアップ部２１４を呼び出して残領域の差分バックアップ処理を実行する（ステップＳ４０５）。 If the optimum block size 254 is “join” (step S402: NO, step S403: YES), the data transmitting / receiving unit 211 requests the backup management server 10 for the hash data 168 for the block size “join” of the remaining area. Receiving this, the differential backup unit 214 is called with this as an input, and the differential backup process of the remaining area is executed (step S405).

最適ブロックサイズ２５４が「カレント」でも「結合」でもなく「分割」である場合（ステップＳ４０２：ＮＯ、ステップＳ４０３：ＮＯ）、データ送受信部２１１はバックアップ管理サーバ１０に残領域のブロックサイズ「分割」についてのハッシュデータ１６９を要求してこれを受信し、これを入力として差分バックアップ部２１４を呼び出して残領域の差分バックアップ処理を実行する（ステップＳ４０４）。 If the optimum block size 254 is neither “current” nor “join” but “division” (step S402: NO, step S403: NO), the data transmission/reception unit 211 requests the hash data 169 for the block size “division” of the remaining area from the backup management server 10, receives it, and calls the differential backup unit 214 with it as input to execute the differential backup process for the remaining area (step S404).

差分バックアップ処理完了後、最適ブロックサイズ２５４を次回処理でカレントサイズとして使用するようバックアップ管理サーバ１０に送信して（ステップＳ４０７）、以上で最適化差分バックアップ処理を終了する。なお、この最適化差分バックアップ処理の後に差分データ１３１をフルバックアップデータ１２１にマージするという処理があるが、このマージ処理については既知の技術で実現可能であり、本発明の範囲ではないので、詳しくは説明しない。 After the differential backup process is completed, the optimum block size 254 is transmitted to the backup management server 10 so that it is used as the current size in the next run (step S407), and the optimized differential backup process ends. After the optimized differential backup process, the differential data 131 is merged into the full backup data 121. This merge can be realized with known techniques and is outside the scope of the present invention, so it is not described in detail.

図１７は、図１５のステップＳ４５２、４５４、４５７、および図１４のＳ４０４、４０５、４０６として示した、差分バックアップ部２１４およびハッシュデータ比較部２１６による差分バックアップ処理について示すフローチャートである。差分バックアップ部２１４は、差分バックアップ最適化部２１２もしくは最適サイズ計算部２１３からブロックサイズとバックアップ対象領域とを指定し、その領域のハッシュデータ１６０を入力されて起動される。 FIG. 17 is a flowchart showing the differential backup process performed by the differential backup unit 214 and the hash data comparison unit 216, shown as steps S452, S454, and S457 in FIG. 15 and steps S404, S405, and S406 in FIG. 14. The differential backup unit 214 is started when the differential backup optimization unit 212 or the optimum size calculation unit 213 designates a block size and a backup target area and supplies the hash data 160 for that area.

起動された差分バックアップ部２１４は、ｉ＝１を初期値として、ｉ番目の領域のバックアップ対象データ２２１を読み取り（ステップＳ５０２）、その領域のハッシュ値２５１ｃを計算し（ステップＳ５０３）、ハッシュデータ比較部２１６にこれとハッシュデータ１６０で入力されたｉ番目の領域のハッシュ値１６０ｃとを比較させ（ステップＳ５０４）、異なっていればハッシュ値１６０ｃおよび２５１ｃが異なるブロックの先頭アドレス１６０ｂを図３（ｂ）に示した一時リスト２６０として出力する（ステップＳ５０５）。 The activated differential backup unit 214 reads i-th area backup target data 221 with i = 1 as an initial value (step S502), calculates a hash value 251c of the area (step S503), and compares hash data The unit 216 compares this with the hash value 160c of the i-th area input by the hash data 160 (step S504). If they are different, the head address 160b of the block having different hash values 160c and 251c is shown in FIG. ) Is output as a temporary list 260 (step S505).

以上のステップＳ５０２〜５０５の処理を、ｉの値を１ずつ増加させて、ｉがＮ／ｎに到達するまで（Ｎは全ブロックサイズ、ｎはブロックサイズ）、即ちバックアップ対象データ２２１の全てのブロックに対して行う（ステップＳ５０６〜５０７）。そして、全ブロックに対してハッシュ値１６０ｃおよび２５１ｃの比較が完了したら、一時リスト２６０として出力された先頭アドレス１６０ｂに対応するブロックのデータを、差分データとしてバックアップ管理サーバ１０に送信する（ステップＳ５０８）。 Steps S502 to S505 are repeated, incrementing i by 1 each time, until i reaches N/n (where N is the total size of all blocks and n is the block size), that is, for every block of the backup target data 221 (steps S506 to S507). When the hash values 160c and 251c have been compared for all blocks, the data of the blocks whose head addresses 160b were output to the temporary list 260 is transmitted to the backup management server 10 as differential data (step S508).
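
The loop of steps S502 to S508 can be sketched as below. This is an illustration under stated assumptions: the description only speaks of a function given in advance, so the use of SHA-256 is an assumption, and the names `changed_block_addresses` and `collect_differential_data` are hypothetical. The reference hashes are assumed to be supplied as a list ordered by block index.

```python
import hashlib


def changed_block_addresses(data, block_size, reference_hashes):
    """Return the head addresses of blocks whose hash differs (S502 to S507)."""
    temporary_list = []
    block_count = (len(data) + block_size - 1) // block_size  # N / n, rounded up
    for i in range(block_count):
        head = i * block_size
        block = data[head:head + block_size]                  # S502: read block i
        digest = hashlib.sha256(block).digest()               # S503: hash it (SHA-256 is an assumption)
        # S504: compare with the hash received from the server.
        # A block with no reference hash is treated as changed (assumption).
        if i >= len(reference_hashes) or digest != reference_hashes[i]:
            temporary_list.append(head)                       # S505: record head address
    return temporary_list


def collect_differential_data(data, block_size, reference_hashes):
    """Map each changed head address to its block data, ready to send (S508)."""
    return {
        head: data[head:head + block_size]
        for head in changed_block_addresses(data, block_size, reference_hashes)
    }
```

Only the blocks listed in the temporary list are read again and sent, so unchanged blocks cost one hash computation and one comparison each.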

図１８は、図１４〜１６に示したクライアント２１での処理に対応するバックアップ管理サーバ１０での処理について示すフローチャートである。まずは図１５のステップＳ４５１の処理で試験用領域のハッシュデータ１６４〜１６６を要求されたバックアップ管理サーバ１０は、データ送受信部１１２がこれを受けて、それらのハッシュデータ１６４〜１６６をクライアント２１に送信する（ステップＳ６０１）。 FIG. 18 is a flowchart showing the processing in the backup management server 10 that corresponds to the processing in the client 21 shown in FIGS. 14 to 16. First, when the hash data 164 to 166 for the test area is requested in step S451 of FIG. 15, the data transmission/reception unit 112 of the backup management server 10 receives the request and transmits the hash data 164 to 166 to the client 21 (step S601).

そして、それらを利用して行われた差分バックアップデータを、クライアント２１からデータ送受信部１１２が受信し、これを差分データ格納部１１６が差分データ１３１として保存する（ステップＳ６０２）。 Then, the data transmission/reception unit 112 receives from the client 21 the differential backup data produced using that hash data, and the differential data storage unit 116 stores it as the differential data 131 (step S602).

クライアント２１では、これに引き続いて最適サイズが決定され、図１４のステップＳ４０４〜４０６の処理でその最適サイズで残領域のハッシュデータ１６７〜１６９を要求される（ステップＳ６０３）ので、これを受信したバックアップ管理サーバ１０は、データ送受信部１１２がこれを受けて、その要求に対応するハッシュデータ１６７〜１６９をクライアント２１に送信する（ステップＳ６０４）。 Subsequently, the client 21 determines the optimum size, and requests hash data 167 to 169 of the remaining area with the optimum size in the processing of steps S404 to S406 in FIG. 14 (step S603). In the backup management server 10, the data transmitting / receiving unit 112 receives this, and transmits the hash data 167 to 169 corresponding to the request to the client 21 (step S604).

そして、ステップＳ６０４で送信した残領域のハッシュデータ１６７〜１６９で行われた差分バックアップデータを、クライアント２１からデータ送受信部１１２が受信し、これを差分データ格納部１１６が差分データ１３１として保存する（ステップＳ６０５）。 Then, the data transmission/reception unit 112 receives from the client 21 the differential backup data produced using the remaining-area hash data 167 to 169 transmitted in step S604, and the differential data storage unit 116 stores it as the differential data 131 (step S605).

そしてその後、ステップＳ６０３の処理で受信した最適サイズ以外の試験用差分データを差分データ格納部１１６が破棄し（ステップＳ６０６）、受信した最適サイズを最適化処理用データ作成部１１１がカレントサイズ設定値１５１としてワーキングエリア１５０に記憶する（ステップＳ６０７）。 Thereafter, the differential data storage unit 116 discards the test differential data other than the optimum size received in the process of step S603 (step S606), and the optimization process data creation unit 111 sets the received optimum size to the current size setting value. 151 is stored in the working area 150 (step S607).

（具体的な処理の例） (Example of specific processing)
以上の動作について、より具体的な動作例を挙げて説明する。バックアップ対象データ２２１は、０〜１６３８４キロバイトをその対象領域のアドレス範囲とする。そして、カレントサイズ設定値１５１として「５１２キロバイト」、試験用領域サイズとして「２０４８キロバイト」の各々のブロックサイズが設定されているものとする。 The above operation will be described with a more specific example. The backup target data 221 has 0 to 16384 kilobytes as the address range of its target area. It is assumed that “512 kilobytes” is set as the current size setting value 151 and “2048 kilobytes” is set as the test area size.

バックアップ対象データ２２１全体がフルバックアップデータ１２１としてバックアップ管理サーバ１０の固定記憶手段１０３に保存されている。これに対してクライアント２１の固定記憶手段２０３にあるバックアップ対象データ２２１の差分バックアップを取る動作を、これから具体的に説明する。 The entire backup target data 221 is stored in the fixed storage means 103 of the backup management server 10 as full backup data 121. On the other hand, the operation of taking a differential backup of the backup target data 221 in the fixed storage means 203 of the client 21 will be specifically described below.

まず差分バックアップの準備処理として、バックアップ管理サーバ１０の最適化処理用データ作成部１１１は、図１２に示した最適化用ハッシュデータ計算処理を行う。その計算対象データは図１６に示した「フルバックアップデータ１２１」である。カレントサイズ設定値１５１として「５１２キロバイト」が設定されているので、ハッシュデータ計算部１１４がステップＳ３０２および図１３に示したハッシュデータ計算処理を行い、その結果として得られる図１０（ａ）に示したブロックサイズ「カレント（５１２キロバイト）」の「全領域」に対するハッシュデータ１６１をワーキングエリア１５０に記憶する（図１２・ステップＳ３０３）。 First, as a differential backup preparation process, the optimization processing data creation unit 111 of the backup management server 10 performs the optimization hash data calculation process shown in FIG. The calculation target data is “full backup data 121” shown in FIG. Since “512 kilobytes” is set as the current size setting value 151, the hash data calculation unit 114 performs the hash data calculation processing shown in step S302 and FIG. 13, and the result shown in FIG. The hash data 161 for “all areas” of the block size “current (512 kilobytes)” is stored in the working area 150 (FIG. 12, step S303).

続いてハッシュデータ計算部１１４は、ブロックサイズ「結合（１０２４キロバイト）」およびブロックサイズ「分割（２５６キロバイト）」についても同じように処理を行い、各々のブロックサイズの「全領域」に対するハッシュデータ１６２および１６３をワーキングエリア１５０に記憶する（図１２・ステップＳ３０５、３０７）。 Subsequently, the hash data calculation unit 114 performs the same processing for the block size “combined (1024 kilobytes)” and the block size “divided (256 kilobytes)”, and the hash data 162 for each block size “all areas” is obtained. And 163 are stored in the working area 150 (FIG. 12, steps S305 and 307).

図１０（ａ）〜（ｃ）の各ハッシュデータ１６０を計算後、最適化処理用データ作成部１１１はあらかじめシステム内に設定されている試験用領域のサイズ（２０４８キロバイト）をもとに、ハッシュデータを分割し（ステップＳ３０８）、その結果として図１０（ｄ）〜図１１（ｉ）に示した各々のハッシュデータ１６４〜１６９がワーキングエリア１５０に記憶される。 After calculating the hash data 160 of FIGS. 10A to 10C, the optimization processing data creation unit 111 performs hashing based on the size of the test area (2048 kilobytes) set in the system in advance. The data is divided (step S308), and as a result, the hash data 164 to 169 shown in FIGS. 10 (d) to 11 (i) are stored in the working area 150.
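
The server-side preparation in steps S302 to S308 can be sketched as follows. Several points are assumptions made for illustration: SHA-256 stands in for the function given in advance, the “join” and “division” sizes are taken as double and half the current size (which matches the 512, 1024, and 256 kilobyte values of this example), and the test area is assumed to be the leading part of the data. The names `block_hashes` and `build_optimization_hashes` are hypothetical.

```python
import hashlib

KB = 1024


def block_hashes(data, block_size):
    """Return (head_address, hash) pairs for every block of the data."""
    return [
        (head, hashlib.sha256(data[head:head + block_size]).digest())
        for head in range(0, len(data), block_size)
    ]


def build_optimization_hashes(full_backup, current_size, test_area_size):
    """Hash the full backup at three block sizes and split each result (S302 to S308)."""
    sizes = {
        "current": current_size,
        "join": current_size * 2,       # assumption: double the current size
        "division": current_size // 2,  # assumption: half the current size
    }
    result = {}
    for label, size in sizes.items():
        entries = block_hashes(full_backup, size)
        result[label] = {
            # Assumption: the test area is the leading test_area_size bytes.
            "test": [e for e in entries if e[0] < test_area_size],
            "remaining": [e for e in entries if e[0] >= test_area_size],
        }
    return result


# Values from the example: 16384 KB of data, 512 KB current size, 2048 KB test area.
hashes = build_optimization_hashes(bytes(16384 * KB), 512 * KB, 2048 * KB)
for label, parts in hashes.items():
    print(label, len(parts["test"]), len(parts["remaining"]))
```

With the values of this example, the test area holds 4, 2, and 8 blocks for “current”, “join”, and “division”, and the remaining area holds 28, 14, and 56 blocks respectively.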

ユーザからの操作、もしくは設定時刻の到来などによって差分バックアップが開始されると、ブートイメージ送信部１１３はクライアント２１にブートイメージ１４１を送信し、そしてクライアント２１を再起動させる。再起動されたクライアント２１はブートイメージ１４１を起動し、プロセッサ２０１が図６に示した各動作部として機能する。そして差分バックアップ最適化部２１２が、図１４に示した最適化差分バックアップ処理を開始する。 When differential backup is started by an operation from the user or the arrival of a set time, the boot image transmission unit 113 transmits the boot image 141 to the client 21 and restarts the client 21. The restarted client 21 starts the boot image 141, and the processor 201 functions as each operation unit illustrated in FIG. Then, the differential backup optimization unit 212 starts the optimized differential backup process shown in FIG.

最適化差分バックアップ処理が開始されると、差分バックアップ最適化部２１２は、最適サイズ計算部２１３を呼び出して図１５〜１６に示した最適サイズ計算処理を実行させる（ステップＳ４０１）。最適サイズ計算処理が開始されると、最適サイズ計算部２１３は、バックアップ管理サーバ１０から試験用領域の各ブロックサイズ（カレント、結合、分割）についてのハッシュデータ１６４〜１６６を受信し（図１５・ステップＳ４５１）、まずはブロックサイズ「カレント」、対象「試験用領域」を入力として差分バックアップ部２１４を呼び出して差分バックアップ処理を実行させる（図１５・ステップＳ４５２）。 When the optimized differential backup process is started, the differential backup optimization unit 212 calls the optimal size calculation unit 213 to execute the optimal size calculation process shown in FIGS. 15 to 16 (step S401). When the optimum size calculation process is started, the optimum size calculator 213 receives the hash data 164 to 166 for each block size (current, combined, divided) of the test area from the backup management server 10 (FIG. 15). Step S451) First, the differential backup unit 214 is called with the block size “current” and the target “test area” as inputs to execute the differential backup process (step S452 in FIG. 15).

この差分バックアップの処理時間として５００ｍｓが得られた。最適サイズ計算部２１３は、その処理時間を「カレント」「試験用領域」の試験用バックアップ所要時間２５１としてワーキングエリア２５０に記憶する（図１５・ステップＳ４５３）。 The differential backup processing time was 500 ms. The optimum size calculation unit 213 stores the processing time in the working area 250 as the required test backup time 251 for “current” and “test area” (step S453 in FIG. 15).

続いて最適サイズ計算部２１３は、ブロックサイズとして「結合」、対象として「試験用領域」を入力として差分バックアップ部２１４に差分バックアップ処理を実行させる（図１５・ステップＳ４５４）。この差分バックアップの処理時間として２５０ｍｓが得られた。その処理時間を「結合」「試験用領域」の試験用バックアップ所要時間２５２としてワーキングエリア２５０に記憶する（図１５・ステップＳ４５５）。 Subsequently, the optimum size calculation unit 213 causes the differential backup unit 214 to execute the differential backup process with “join” as the block size and “test area” as the target (FIG. 15, step S454). The processing time for this differential backup was 250 ms. The processing time is stored in the working area 250 as the required test backup time 252 for “join” and “test area” (FIG. 15, step S455).

ここで「カレント」「試験用領域」の試験用バックアップ所要時間２５１と「結合」「試験用領域」の試験用バックアップ所要時間２５２とを比較すると（図１５・ステップＳ４５６）、前者が５００ｍｓ、後者が２５０ｍｓで、後者の方が短かった。従って「結合」が最適なブロックサイズであると判断して、これをワーキングエリア２５０内に最適ブロックサイズ２５４として格納する（図１５・ステップＳ４６２）。 Here, when the required backup time 251 for “current” and “test area” is compared with the required backup time 252 for “join” and “test area” (step S456 in FIG. 15), the former is 500 ms and the latter. Was 250 ms, and the latter was shorter. Therefore, it is determined that “join” is the optimum block size, and this is stored as the optimum block size 254 in the working area 250 (step S462 in FIG. 15).

これに続いてデータ送受信部２１１は、バックアップ管理サーバ１０から残領域のブロックサイズ「結合」についてのハッシュデータ１６８を受信し、これを入力として差分バックアップ部２１４を呼び出して残領域の差分バックアップ処理を実行する（図１４・ステップＳ４０５、図１８・ステップＳ６０３〜６０４）。ここで取得されてクライアント２１からバックアップ管理サーバ１０に送信された差分バックアップデータを、データ送受信部１１２が受信し、これを差分データ格納部１１６が差分データ１３１として保存する（図１８・ステップＳ６０５）。 Subsequently, the data transmission / reception unit 211 receives the hash data 168 for the block size “combined” of the remaining area from the backup management server 10 and calls the differential backup unit 214 as an input to perform the differential backup process for the remaining area. It performs (FIG. 14, step S405, FIG. 18, step S603-604). The differential backup data acquired here and transmitted from the client 21 to the backup management server 10 is received by the data transmission / reception unit 112, and the differential data storage unit 116 stores it as the differential data 131 (FIG. 18, step S605). .

そしてこの差分バックアップ処理が完了すると、データ送受信部２１１が最適サイズ「１０２４キロバイト」を次回処理のカレントサイズとしてバックアップ管理サーバ１０に送信する（図１４・ステップＳ４０７）。バックアップ管理サーバ１０では、最適化処理用データ作成部１１１がこれをカレントサイズ設定値１５１としてワーキングエリア１５０に記憶する（図１８・ステップＳ６０７）。 When this differential backup process is completed, the data transmission/reception unit 211 transmits the optimum size “1024 kilobytes” to the backup management server 10 as the current size for the next run (FIG. 14, step S407). In the backup management server 10, the optimization processing data creation unit 111 stores it in the working area 150 as the current size setting value 151 (FIG. 18, step S607).

（第１の実施形態の全体的な動作）
次に、上記の実施形態の全体的な動作について説明する。本実施形態に係る差分バックアップ方法は、バックアップ対象データを保持する単数もしくは複数のコンピュータ装置であるクライアント２１と、各々のバックアップ対象データを完全バックアップしたフルバックアップデータを保持するバックアップ管理サーバ１０とが相互に接続された差分バックアップシステム１にあって、予め与えられた第１ないし第３のブロックサイズの各々でバックアップ管理サーバの第１のハッシュデータ計算部がフルバックアップデータをブロックに区切り、各ブロックのデータに予め与えられた関数を適用してバックアップ管理サーバの第１のハッシュデータ計算部が第１のハッシュデータを算出し（図１２・ステップＳ３０２〜３０７）、第１ないし第３のブロックサイズの各々の場合について第１のハッシュデータをバックアップ管理サーバの最適化処理用データ作成部がフルバックアップデータの一部である試験用領域についてのものとそれ以外の領域のものとに分割し（図１２・ステップＳ３０８）、試験用領域についてのハッシュデータをバックアップ管理サーバの最適化処理用データ作成部がクライアントに送信し（図１８・ステップＳ６０１）、第１〜第３のブロックサイズの各々の場合についてクライアントの第２のハッシュデータ計算部がバックアップ対象データを当該ブロックサイズのブロックに区切り、各ブロックのデータに関数を適用してクライアントの第２のハッシュデータ計算部が第２のハッシュデータを算出し（図１７・ステップＳ５０３）、ブロックごとに算出された第２のハッシュデータとバックアップ管理サーバから受信した第１のハッシュデータとをクライアントのハッシュデータ比較部が比較し（図１７・ステップＳ５０４）、ブロックごとに第１および第２のハッシュデータの比較結果が異なっている場合にバックアップ対象データの該ブロックをクライアントの差分バックアップ部が差分バックアップし（図１７・ステップＳ５０８）、第１〜第３のブロックサイズで行った試験用領域についての差分バックアップで最も所要時間の少なかったブロックサイズをクライアントの最適サイズ計算部が最適サイズとして決定し（図１５・ステップＳ４５６、図１６・ステップＳ４５９）、決定された最適サイズの場合の残領域についての第１のハッシュデータをクライアントの差分バックアップ最適化部がバックアップ管理サーバに要求し（図１８・ステップＳ６０３）、要求された残領域についての第１のハッシュデータをバックアップ管理サーバのデータ送受信部がクライアントに送信し（図１８・ステップＳ６０４）、最適サイズでバックアップ対象データの残領域をブロックに区切ってクライアントの差分バックアップ最適化部が差分バックアップ部に差分バックアップを行わせる（図１４・ステップＳ４０２〜４０６）。 (Overall operation of the first embodiment)
Next, the overall operation of the above embodiment will be described. The differential backup method according to the present embodiment is used in a differential backup system 1 in which a client 21, which is one or more computer devices holding backup target data, and a backup management server 10, which holds full backup data obtained by completely backing up each set of backup target data, are connected to each other. The first hash data calculation unit of the backup management server divides the full backup data into blocks at each of first to third block sizes given in advance and calculates first hash data by applying a function given in advance to the data of each block (FIG. 12, steps S302 to S307). For each of the first to third block sizes, the optimization processing data creation unit of the backup management server divides the first hash data into hash data for a test area, which is a part of the full backup data, and hash data for the remaining area (FIG. 12, step S308), and transmits the hash data for the test area to the client (FIG. 18, step S601). For each of the first to third block sizes, the second hash data calculation unit of the client divides the backup target data into blocks of that block size and calculates second hash data by applying the function to the data of each block (FIG. 17, step S503). The hash data comparison unit of the client compares, for each block, the calculated second hash data with the first hash data received from the backup management server (FIG. 17, step S504), and the differential backup unit of the client differentially backs up each block of the backup target data for which the first and second hash data differ (FIG. 17, step S508). The optimum size calculation unit of the client determines, as the optimum size, the block size that required the least time in the differential backups of the test area performed at the first to third block sizes (FIG. 15, step S456; FIG. 16, step S459). The differential backup optimization unit of the client requests the first hash data for the remaining area at the determined optimum size from the backup management server (FIG. 18, step S603), and the data transmission/reception unit of the backup management server transmits the requested first hash data to the client (FIG. 18, step S604). The differential backup optimization unit of the client then causes the differential backup unit to divide the remaining area of the backup target data into blocks of the optimum size and perform the differential backup (FIG. 14, steps S402 to S406).

またここで、第２のブロックサイズが第１のブロックサイズより大きく、かつ第３のブロックサイズが第１のブロックサイズより小さい。そして残領域についての第１のハッシュデータをクライアントに返信する際に受信した最適サイズを次回同一処理時の第１のブロックサイズとしてバックアップ管理サーバの最適化処理用データ作成部が記憶する（図１８・ステップＳ６０７）。 Here, the second block size is larger than the first block size, and the third block size is smaller than the first block size. Then, the optimization processing data creation unit of the backup management server stores the optimum size received when the first hash data for the remaining area is returned to the client as the first block size at the next same processing (FIG. 18). Step S607).

ここで、上記各動作ステップについては、これをコンピュータで実行可能にプログラム化し、これらを前記各ステップを直接実行するコンピュータであるバックアップ管理サーバ１０およびクライアント２１に実行させるようにしてもよい。 Here, each of the above-described operation steps may be programmed to be executable by a computer, and may be executed by the backup management server 10 and the client 21, which are the computers that directly execute the respective steps.
この構成および動作により、本実施形態は以下のような効果を奏する。 With this configuration and operation, the present embodiment has the following effects.

本実施形態は、３種類のブロックサイズで試験用領域についての差分バックアップを行い、その結果として所要時間が最短となったブロックサイズによって残領域について差分バックアップを行うので、単なる「見積もり」ではなく、実際のデータおよび処理の内容に即した形で最適なブロックサイズを得て、これによって差分バックアップにかかる時間を短縮することができる。 In this embodiment, the differential backup for the test area is performed with three types of block sizes, and as a result, the differential backup is performed for the remaining area with the block size having the shortest required time. An optimum block size can be obtained in conformity with actual data and processing contents, thereby reducing the time required for differential backup.

しかも、その処理で得られた最適なブロックサイズは、次回処理時に「カレントサイズ」として記憶されるので、同一の差分バックアップ処理を何度も繰り返すうちにますます最適化された数値となる。もちろんこの処理は、クライアント２１、２２、…の各々で行われるので、各々のクライアントにとって最適なブロックサイズによって差分バックアップを行うことができる。 In addition, since the optimum block size obtained by the processing is stored as the “current size” at the next processing, the numerical value becomes more and more optimized as the same differential backup processing is repeated many times. Of course, since this processing is performed in each of the clients 21, 22,..., Differential backup can be performed with an optimum block size for each client.
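
How the stored “current size” moves toward a better value over repeated runs can be illustrated with a toy simulation. Everything here is hypothetical: the function `simulate_runs`, the doubling and halving of candidate sizes, and in particular the `trial_time` function, which is an invented stand-in for measured trial backup times with its minimum at 2048 kilobytes.

```python
def simulate_runs(start_kb, trial_time, runs):
    """Return the current size (in KB) before each run and after the last one."""
    current = start_kb
    history = [current]
    for _ in range(runs):
        t_current = trial_time(current)
        if trial_time(current * 2) < t_current:
            current *= 2            # "join" was faster
        elif current // 2 >= 1 and trial_time(current // 2) < t_current:
            current //= 2           # "division" was faster
        # Otherwise the current size is kept for the next run.
        history.append(current)
    return history


def trial_time(size_kb):
    # Invented timing model for illustration only: fastest at 2048 KB.
    return abs(size_kb - 2048) + 100


print(simulate_runs(512, trial_time, 3))
```

Under this invented model the size moves from 512 to 1024 to 2048 kilobytes and then stays there, because each run can change the size by only one step.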

これまで本発明について図面に示した特定の実施形態をもって説明してきたが、本発明は図面に示した実施形態に限定されるものではなく、本発明の効果を奏する限り、これまで知られたいかなる構成であっても採用することができる。 The present invention has been described with reference to the specific embodiments shown in the drawings. However, the present invention is not limited to the embodiments shown in the drawings, and any known hitherto provided that the effects of the present invention are achieved. Even if it is a structure, it is employable.

上述した各々の実施形態について、その新規な技術内容の要点をまとめると、以下のようになる。なお、上記実施形態の一部または全部は、新規な技術として以下のようにまとめられるが、本発明は必ずしもこれに限定されるものではない。 About each embodiment mentioned above, it is as follows when the summary of the novel technical content is put together. In addition, although part or all of the said embodiment is summarized as follows as a novel technique, this invention is not necessarily limited to this.

（付記１）バックアップ対象データを保持する単数もしくは複数のコンピュータ装置であるクライアントと、各々のバックアップ対象データを完全バックアップしたフルバックアップデータを保持するバックアップ管理サーバとが相互に接続された差分バックアップシステムであって、
前記バックアップ管理サーバが、
前記フルバックアップデータを記憶する第１の固定記憶手段と、
ネットワークを経由して前記クライアントとデータ通信を行う第１のデータ送受信部と、
予め与えられた第１ないし第３のブロックサイズの各々で前記フルバックアップデータをブロックに区切ってこれら各ブロックのデータに予め与えられた関数を適用して第１のハッシュデータを算出する第１のハッシュデータ計算部と、
前記第１ないし第３のブロックサイズの各々の場合の前記第１のハッシュデータを前記フルバックアップデータの一部である試験用領域についてのものとそれ以外の残領域のものとに分割して前記試験用領域についてのハッシュデータを前記クライアントに送信する最適化処理用データ作成部と
を有し、
前記クライアントが、
前記バックアップ管理サーバとデータ通信を行う第２のデータ送受信部と、
前記バックアップ対象データを記憶する第２の固定記憶手段と、
前記第１〜第３のブロックサイズの各々の場合について前記バックアップ対象データをブロックに区切ってこれら各ブロックのデータに前記関数を適用して第２のハッシュデータを算出する第２のハッシュデータ計算部と、
算出された前記第２のハッシュデータを前記ブロックごとに前記バックアップ管理サーバから受信した前記第１のハッシュデータと比較するハッシュデータ比較部と、
前記ブロックごとの前記第１および第２のハッシュデータの比較結果が異なっている場合に前記バックアップ対象データの該ブロックを差分バックアップする差分バックアップ部と、
前記第１ないし第３のブロックサイズの各々の場合について、前記試験用領域についての第１のハッシュデータを利用して前記バックアップ対象データの前記試験用領域について前記差分バックアップ部に試験的差分バックアップを行わせると共に、前記第１〜第３のブロックサイズで行った前記試験的差分バックアップで最も所要時間の少なかったブロックサイズを最適サイズとして決定する最適サイズ計算部と、
決定された前記最適サイズの場合の前記残領域についての第１のハッシュデータを前記バックアップ管理サーバに要求すると共に、送られてきたこの残領域についての第１のハッシュデータを利用して前記差分バックアップ部に前記最適サイズで前記バックアップ対象データの残領域の差分バックアップを行わせる差分バックアップ最適化部と
を有することを特徴とする差分バックアップシステム。 (Supplementary note 1) A differential backup system in which a client, which is one or more computer devices holding backup target data, and a backup management server, which holds full backup data obtained by completely backing up each set of backup target data, are connected to each other, wherein
the backup management server comprises:
first fixed storage means for storing the full backup data;
a first data transmission/reception unit that performs data communication with the client via a network;
a first hash data calculation unit that divides the full backup data into blocks at each of first to third block sizes given in advance and calculates first hash data by applying a function given in advance to the data of each block; and
an optimization processing data creation unit that divides the first hash data for each of the first to third block sizes into hash data for a test area, which is a part of the full backup data, and hash data for the remaining area, and transmits the hash data for the test area to the client,
and the client comprises:
a second data transmission/reception unit that performs data communication with the backup management server;
second fixed storage means for storing the backup target data;
a second hash data calculation unit that, for each of the first to third block sizes, divides the backup target data into blocks and calculates second hash data by applying the function to the data of each block;
a hash data comparison unit that compares, for each block, the calculated second hash data with the first hash data received from the backup management server;
a differential backup unit that differentially backs up a block of the backup target data when the first and second hash data for that block differ;
an optimum size calculation unit that, for each of the first to third block sizes, causes the differential backup unit to perform a trial differential backup of the test area of the backup target data using the first hash data for the test area, and determines, as an optimum size, the block size that required the least time among the trial differential backups performed at the first to third block sizes; and
a differential backup optimization unit that requests the first hash data for the remaining area at the determined optimum size from the backup management server and, using the first hash data received for the remaining area, causes the differential backup unit to perform a differential backup of the remaining area of the backup target data at the optimum size.

（付記２）前記第２のブロックサイズが前記第１のブロックサイズより大きく、かつ前記第３のブロックサイズが前記第１のブロックサイズより小さく、
前記バックアップ管理サーバの前記最適化処理用データ作成部が、前記残領域についての第１のハッシュデータを前記クライアントに返信する際に受信した前記最適サイズを次回同一処理時の前記第１のブロックサイズとして記憶する機能を有することを特徴とする、付記１に記載の差分バックアップシステム。 (Supplementary note 2) The differential backup system according to supplementary note 1, wherein the second block size is larger than the first block size and the third block size is smaller than the first block size, and
the optimization processing data creation unit of the backup management server has a function of storing the optimum size, received when returning the first hash data for the remaining area to the client, as the first block size for the next execution of the same process.

（付記３）前記クライアントの前記最適サイズ計算部が、前記第２のブロックサイズで行った前記試験的差分バックアップの所要時間が前記第１のブロックサイズで行った前記試験的差分バックアップの所要時間より短ければ、その時点で前記第２のブロックサイズを前記最適サイズとして決定する機能を有することを特徴とする、付記２に記載の差分バックアップシステム。 (Supplementary note 3) The differential backup system according to supplementary note 2, wherein the optimum size calculation unit of the client has a function of determining the second block size as the optimum size at that point if the time required for the trial differential backup performed at the second block size is shorter than the time required for the trial differential backup performed at the first block size.

（付記４）前記バックアップ管理サーバが、前記クライアントの備えるプロセッサを前記第２のデータ送受信部、前記第２のハッシュデータ計算部、前記ハッシュデータ比較部、前記差分バックアップ部、前記最適サイズ計算部、および前記差分バックアップ最適化部として機能させるプログラムであるブートイメージを前記第１の固定記憶手段に記憶するブートイメージ格納部と、
前記ブートイメージを前記クライアントに送信して動作させるブートイメージ送信部とを有することを特徴とする、付記１に記載の差分バックアップシステム。 (Supplementary note 4) The differential backup system according to supplementary note 1, wherein the backup management server comprises: a boot image storage unit that stores, in the first fixed storage means, a boot image that is a program causing a processor of the client to function as the second data transmission/reception unit, the second hash data calculation unit, the hash data comparison unit, the differential backup unit, the optimum size calculation unit, and the differential backup optimization unit; and
a boot image transmission unit that transmits the boot image to the client and causes the client to run it.

（付記５）バックアップ対象データを保持する単数もしくは複数のコンピュータ装置であるクライアントと相互に接続され、各々のバックアップ対象データを完全バックアップしたフルバックアップデータを保持するバックアップ管理サーバであって、
前記フルバックアップデータを記憶する固定記憶手段と、
ネットワークを経由して前記クライアントとデータ通信を行うデータ送受信部と、
予め与えられた第１ないし第３のブロックサイズの各々で前記フルバックアップデータをブロックに区切ってこれら各ブロックのデータに予め与えられた関数を適用してハッシュデータを算出するハッシュデータ計算部と、
前記第１ないし第３のブロックサイズの各々の場合の前記ハッシュデータを前記フルバックアップデータの一部である試験用領域についてのものとそれ以外の残領域のものとに分割して前記試験用領域についてのハッシュデータを前記クライアントに送信する最適化処理用データ作成部と
を有することを特徴とするバックアップ管理サーバ。 (Supplementary Note 5) A backup management server that is connected to a client, which is one or a plurality of computer devices that hold backup target data, and that holds full backup data that is a complete backup of each backup target data,
Fixed storage means for storing the full backup data;
A data transmitting / receiving unit for performing data communication with the client via a network;
a hash data calculation unit that divides the full backup data into blocks at each of first to third block sizes given in advance and calculates hash data by applying a function given in advance to the data of each block; and
an optimization processing data creation unit that divides the hash data for each of the first to third block sizes into hash data for a test area, which is a part of the full backup data, and hash data for the remaining area, and transmits the hash data for the test area to the client.

(Supplementary Note 6) A client that is a computer device holding backup target data and interconnected with a backup management server that holds full backup data that is a complete backup of the backup target data, the client comprising:
a data transmission/reception unit that performs data communication with the backup management server;
fixed storage means for storing the backup target data;
a hash data calculation unit that, for each of the first to third block sizes, divides the backup target data into blocks and calculates second hash data by applying the function to the data of each block;
a hash data comparison unit that compares, block by block, the calculated second hash data with the first hash data received from the backup management server;
a differential backup unit that performs a differential backup of a block of the backup target data when the first and second hash data for that block differ;
an optimal size calculation unit that, for each of the first to third block sizes, causes the differential backup unit to perform a trial differential backup of the test area, which is a part of the backup target data, using the first hash data for the test area, and determines as the optimal size the block size whose trial differential backup took the least time; and
a differential backup optimization unit that requests from the backup management server the first hash data for the remaining area, other than the test area, at the determined optimal size, and, using the first hash data received for the remaining area, causes the differential backup unit to perform a differential backup of the remaining area of the backup target data at the optimal size.
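The client-side comparison and timing described in Supplementary Note 6 might look like the sketch below. It assumes SHA-256 as the shared function, a test area that starts at offset 0, and a caller-supplied `send` callback standing in for whatever actually transfers a changed block; `changed_blocks` and `choose_optimal_size` are hypothetical names, not taken from the source.

```python
import hashlib
import time
from typing import Callable


def changed_blocks(data: bytes, server_hashes: list[str],
                   block_size: int) -> list[tuple[int, bytes]]:
    """Return (offset, block) for every block whose hash differs from
    the hash the server computed for the same position."""
    changed = []
    for index, expected in enumerate(server_hashes):
        start = index * block_size
        block = data[start:start + block_size]
        if hashlib.sha256(block).hexdigest() != expected:
            changed.append((start, block))
    return changed


def choose_optimal_size(data: bytes,
                        test_hashes_by_size: dict[int, list[str]],
                        send: Callable[[int, bytes], None]) -> int:
    """Run a trial differential backup of the test area once per block
    size and return the size whose trial took the least time."""
    timings = {}
    for size, server_hashes in test_hashes_by_size.items():
        started = time.perf_counter()
        for offset, block in changed_blocks(data, server_hashes, size):
            send(offset, block)
        timings[size] = time.perf_counter() - started
    return min(timings, key=timings.get)


old = bytes(32)
new = bytearray(old)
new[9] = 1  # one byte changed inside the second 8-byte block
hashes_8 = [hashlib.sha256(old[i:i + 8]).hexdigest() for i in range(0, 32, 8)]
hashes_4 = [hashlib.sha256(old[i:i + 4]).hexdigest() for i in range(0, 32, 4)]
diff = changed_blocks(bytes(new), hashes_8, 8)
best = choose_optimal_size(bytes(new), {8: hashes_8, 4: hashes_4},
                           send=lambda offset, block: None)
```

The measured time covers hashing, comparison, and transfer together, so the chosen size reflects the whole trial rather than any single stage.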

(Supplementary Note 7) A differential backup method for a differential backup system in which one or more clients, each a computer device holding backup target data, and a backup management server that holds full backup data that is a complete backup of each client's backup target data are interconnected, wherein:
a first hash data calculation unit of the backup management server divides the full backup data into blocks at each of predetermined first to third block sizes;
the first hash data calculation unit of the backup management server calculates first hash data by applying a predetermined function to the data of each block;
for each of the first to third block sizes, an optimization processing data creation unit of the backup management server divides the first hash data into hash data for a test area, which is a part of the full backup data, and hash data for the remaining area;
the optimization processing data creation unit of the backup management server transmits the hash data for the test area to the client;
for each of the first to third block sizes, a second hash data calculation unit of the client divides the backup target data into blocks of that block size;
the second hash data calculation unit of the client calculates second hash data by applying the function to the data of each block;
a hash data comparison unit of the client compares, block by block, the calculated second hash data with the first hash data received from the backup management server;
a differential backup unit of the client performs a differential backup of a block of the backup target data when the first and second hash data for that block differ;
an optimal size calculation unit of the client determines as the optimal size the block size that required the least time in the differential backup of the test area performed at the first to third block sizes;
a differential backup optimization unit of the client requests from the backup management server the first hash data for the remaining area at the determined optimal size;
a data transmission/reception unit of the backup management server transmits the requested first hash data for the remaining area to the client; and
the differential backup optimization unit of the client, using the first hash data for the remaining area, causes the differential backup unit to perform a differential backup of the remaining area of the backup target data.
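The final steps of the method, in which the client requests remaining-area hashes for the optimal size only and then backs up the remaining area, can be sketched with an in-process stand-in for the server. Everything named here (`FakeServer`, `rest_hashes`, `store_block`, `backup_remaining_area`) is hypothetical, SHA-256 is an assumed hash function, and no real network transport is modelled.

```python
import hashlib


def hashes(data: bytes, size: int) -> list[str]:
    """Per-block SHA-256 digests (SHA-256 is an assumption)."""
    return [hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]


class FakeServer:
    """In-process stand-in for the backup management server."""

    def __init__(self, full_backup: bytes, test_len: int):
        self.full = full_backup
        self.test_len = test_len   # test area assumed to be a prefix
        self.diff = {}             # offset -> changed block (differential data)

    def rest_hashes(self, size: int) -> list[str]:
        # Hashes for the remaining area only, at the requested size.
        return hashes(self.full[self.test_len:], size)

    def store_block(self, offset: int, block: bytes) -> None:
        self.diff[offset] = block


def backup_remaining_area(server: FakeServer, target: bytes,
                          optimal_size: int) -> None:
    """Compare the remaining area block by block against the server's
    hashes and store only the blocks that differ."""
    base = server.test_len
    for index, expected in enumerate(server.rest_hashes(optimal_size)):
        start = base + index * optimal_size
        block = target[start:start + optimal_size]
        if hashlib.sha256(block).hexdigest() != expected:
            server.store_block(start, block)


full = bytes(64)
target = bytearray(full)
target[40] = 7  # a single change in the remaining area
server = FakeServer(full, test_len=16)
backup_remaining_area(server, bytes(target), optimal_size=8)
```

Only one set of remaining-area hashes crosses the wire, which is the saving the trial phase is meant to buy.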

(Supplementary Note 8) The differential backup method according to Supplementary Note 7, wherein
the second block size is larger than the first block size and the third block size is smaller than the first block size, and
the optimization processing data creation unit of the backup management server stores the optimal size, received when it returns the first hash data for the remaining area to the client, as the first block size for the next execution of the same processing.
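One way to derive the three candidate sizes and carry the result over to the next run is sketched below. The text only requires that the second size be larger and the third smaller than the first; doubling and halving is an assumption made here for illustration, and `candidate_sizes` and `CurrentSizeStore` are hypothetical names.

```python
def candidate_sizes(current: int, factor: int = 2) -> tuple[int, int, int]:
    """Return (first, second, third) block sizes.

    first  = the current size setting
    second = a larger size (current * factor, an assumed rule)
    third  = a smaller size (current // factor, an assumed rule)
    """
    if factor < 2 or current < factor:
        raise ValueError("current size too small to derive a smaller candidate")
    return current, current * factor, current // factor


class CurrentSizeStore:
    """Holds the size used as the first block size on the next run."""

    def __init__(self, initial: int):
        self.current = initial

    def remember(self, optimal: int) -> None:
        # Called when the client requests remaining-area hashes
        # for the optimal size it has just determined.
        self.current = optimal


store = CurrentSizeStore(4096)
first_run = candidate_sizes(store.current)
store.remember(8192)           # suppose the larger size won the trial
second_run = candidate_sizes(store.current)
```

Because each run starts from the previous winner, the candidates can drift toward a size that suits the data over successive backups.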

(Supplementary Note 9) A differential backup program for a differential backup system in which one or more clients, each a computer device holding backup target data, and a backup management server that holds full backup data that is a complete backup of each client's backup target data are interconnected, the program causing a computer provided in the backup management server to execute:
a procedure of dividing the full backup data into blocks at each of predetermined first to third block sizes;
a procedure of calculating hash data by applying a predetermined function to the data of each block;
a procedure of dividing, for each of the first to third block sizes, the hash data into hash data for a test area, which is a part of the full backup data, and hash data for the remaining area;
a procedure of transmitting the hash data for the test area to the client; and
a procedure of transmitting to the client the first hash data for the remaining area requested by that client.

(Supplementary Note 10) A differential backup program for a differential backup system in which one or more clients, each a computer device holding backup target data, and a backup management server that holds full backup data that is a complete backup of each client's backup target data are interconnected, the program causing a computer provided in the client to execute:
a procedure of dividing, for each of the first to third block sizes, the backup target data into blocks of that block size;
a procedure of calculating second hash data by applying the function to the data of each block;
a procedure of comparing, block by block, the calculated second hash data with the first hash data received from the backup management server;
a procedure of performing a differential backup of a block of the backup target data when the first and second hash data for that block differ;
a procedure of determining as the optimal size the block size that required the least time in the differential backup, performed at the first to third block sizes, of a test area that is a part of the backup target data;
a procedure of requesting from the backup management server the first hash data for the remaining area, other than the test area, at the determined optimal size; and
a procedure of performing a differential backup of the remaining area of the backup target data using the first hash data received for the remaining area.

The present invention can be applied to backup systems. It is particularly suited to backup systems in which a backup management server collectively performs backups for a plurality of client computers.

DESCRIPTION OF SYMBOLS
1 Differential backup system
10 Backup management server
21 Client
30 Network
101, 201 Processor
102, 202 Temporary storage means
103, 203 Fixed storage means
104, 204 Communication means
111 Optimization processing data creation unit
112 Data transmission/reception unit
113 Boot image transmission unit
114 Hash data calculation unit
115 Full backup data storage unit
116 Differential data storage unit
117 Boot image storage unit
120 Backup data storage area
121 Full backup data
121a, 221a Test area
121b, 221b Remaining area
130 Differential data storage area
131 Differential data
140 Boot image storage area
141 Boot image
150, 250 Working area
151 Current size setting value
160 to 169 Hash data
211 Data transmission/reception unit
212 Differential backup optimization unit
213 Optimal size calculation unit
214 Differential backup unit
215 Hash data calculation unit
216 Hash data comparison unit
220 Backup target data storage area
221 Backup target data

Claims

1. A differential backup system in which one or more clients, each a computer device holding backup target data, and a backup management server that holds full backup data that is a complete backup of each client's backup target data are interconnected, wherein
the backup management server comprises:
first fixed storage means for storing the full backup data;
a first data transmission/reception unit that performs data communication with the client via a network;
a first hash data calculation unit that divides the full backup data into blocks at each of predetermined first to third block sizes and calculates first hash data by applying a predetermined function to the data of each block; and
an optimization processing data creation unit that, for each of the first to third block sizes, divides the first hash data into hash data for a test area, which is a part of the full backup data, and hash data for the remaining area, and transmits the hash data for the test area to the client; and
the client comprises:
a second data transmission/reception unit that performs data communication with the backup management server;
second fixed storage means for storing the backup target data;
a second hash data calculation unit that, for each of the first to third block sizes, divides the backup target data into blocks and calculates second hash data by applying the function to the data of each block;
a hash data comparison unit that compares, block by block, the calculated second hash data with the first hash data received from the backup management server;
a differential backup unit that performs a differential backup of a block of the backup target data when the first and second hash data for that block differ;
an optimal size calculation unit that, for each of the first to third block sizes, causes the differential backup unit to perform a trial differential backup of the test area of the backup target data using the first hash data for the test area, and determines as the optimal size the block size whose trial differential backup took the least time; and
a differential backup optimization unit that requests from the backup management server the first hash data for the remaining area at the determined optimal size, and, using the first hash data received for the remaining area, causes the differential backup unit to perform a differential backup of the remaining area of the backup target data at the optimal size.

2. The differential backup system according to claim 1, wherein
the second block size is larger than the first block size and the third block size is smaller than the first block size, and
the optimization processing data creation unit of the backup management server has a function of storing the optimal size, received when it returns the first hash data for the remaining area to the client, as the first block size for the next execution of the same processing.

3. The differential backup system according to claim 2, wherein the optimal size calculation unit of the client has a function of determining the second block size as the optimal size, at that point, when the time required for the trial differential backup performed at the second block size is shorter than the time required for the trial differential backup performed at the first block size.
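The early decision in this claim can be sketched as follows: if the larger (second) size already beats the first, the smaller (third) size is never tried. `pick_size` and the `run_trial` callback are hypothetical names; `run_trial` stands in for running one trial differential backup and returning its elapsed time. The fallback when the second size does not win, comparing the first and third, is an assumption consistent with choosing the fastest size overall.

```python
from typing import Callable


def pick_size(first: int, second: int, third: int,
              run_trial: Callable[[int], float]) -> int:
    """Choose a block size, skipping the third trial when the second
    size is already faster than the first."""
    t_first = run_trial(first)
    t_second = run_trial(second)
    if t_second < t_first:
        return second            # decided at this point; third is not tried
    t_third = run_trial(third)
    # Second was not faster than first, so the fastest is first or third.
    return third if t_third < t_first else first


calls = []


def fake_trial_a(size: int) -> float:
    calls.append(size)
    return {4096: 3.0, 8192: 2.0, 2048: 1.0}[size]


def fake_trial_b(size: int) -> float:
    return {4096: 2.0, 8192: 3.0, 2048: 1.5}[size]


chosen_a = pick_size(4096, 8192, 2048, fake_trial_a)
chosen_b = pick_size(4096, 8192, 2048, fake_trial_b)
```

Skipping the third trial saves one full pass over the test area, at the cost of possibly missing a smaller size that would have been faster still, as the first set of fake timings shows.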

4. The differential backup system according to claim 1, wherein the backup management server comprises:
a boot image storage unit that stores, in the first fixed storage means, a boot image, which is a program that causes a processor of the client to function as the second data transmission/reception unit, the second hash data calculation unit, the hash data comparison unit, the differential backup unit, the optimal size calculation unit, and the differential backup optimization unit; and
a boot image transmission unit that transmits the boot image to the client and causes the client to run it.

5. A backup management server that is connected to one or more clients, each a computer device holding backup target data, and that holds full backup data that is a complete backup of each client's backup target data, the backup management server comprising:
fixed storage means for storing the full backup data;
a data transmission/reception unit that performs data communication with the client via a network;
a hash data calculation unit that divides the full backup data into blocks at each of predetermined first to third block sizes and calculates hash data by applying a predetermined function to the data of each block; and
an optimization processing data creation unit that, for each of the first to third block sizes, divides the hash data into hash data for a test area, which is a part of the full backup data, and hash data for the remaining area, and transmits the hash data for the test area to the client.

6. A client that is a computer device holding backup target data and interconnected with a backup management server that holds full backup data that is a complete backup of the backup target data, the client comprising:
a data transmission/reception unit that performs data communication with the backup management server;
fixed storage means for storing the backup target data;
a hash data calculation unit that, for each of the first to third block sizes, divides the backup target data into blocks and calculates second hash data by applying the function to the data of each block;
a hash data comparison unit that compares, block by block, the calculated second hash data with the first hash data received from the backup management server;
a differential backup unit that performs a differential backup of a block of the backup target data when the first and second hash data for that block differ;
an optimal size calculation unit that, for each of the first to third block sizes, causes the differential backup unit to perform a trial differential backup of the test area, which is a part of the backup target data, using the first hash data for the test area, and determines as the optimal size the block size whose trial differential backup took the least time; and
a differential backup optimization unit that requests from the backup management server the first hash data for the remaining area, other than the test area, at the determined optimal size, and, using the first hash data received for the remaining area, causes the differential backup unit to perform a differential backup of the remaining area of the backup target data at the optimal size.

7. A differential backup method for a differential backup system in which one or more clients, each a computer device holding backup target data, and a backup management server that holds full backup data that is a complete backup of each client's backup target data are interconnected, wherein:
a first hash data calculation unit of the backup management server divides the full backup data into blocks at each of predetermined first to third block sizes;
the first hash data calculation unit of the backup management server calculates first hash data by applying a predetermined function to the data of each block;
for each of the first to third block sizes, an optimization processing data creation unit of the backup management server divides the first hash data into hash data for a test area, which is a part of the full backup data, and hash data for the remaining area;
the optimization processing data creation unit of the backup management server transmits the hash data for the test area to the client;
for each of the first to third block sizes, a second hash data calculation unit of the client divides the backup target data into blocks of that block size;
the second hash data calculation unit of the client calculates second hash data by applying the function to the data of each block;
a hash data comparison unit of the client compares, block by block, the calculated second hash data with the first hash data received from the backup management server;
a differential backup unit of the client performs a differential backup of a block of the backup target data when the first and second hash data for that block differ;
an optimal size calculation unit of the client determines as the optimal size the block size that required the least time in the differential backup of the test area performed at the first to third block sizes;
a differential backup optimization unit of the client requests from the backup management server the first hash data for the remaining area at the determined optimal size;
a data transmission/reception unit of the backup management server transmits the requested first hash data for the remaining area to the client; and
the differential backup optimization unit of the client, using the first hash data for the remaining area, causes the differential backup unit to perform a differential backup of the remaining area of the backup target data.

8. The differential backup method according to claim 7, wherein
the second block size is larger than the first block size and the third block size is smaller than the first block size, and
the optimization processing data creation unit of the backup management server stores the optimal size, received when it returns the first hash data for the remaining area to the client, as the first block size for the next execution of the same processing.

9. A differential backup program for a differential backup system in which one or more clients, each a computer device holding backup target data, and a backup management server that holds full backup data that is a complete backup of each client's backup target data are interconnected, the program causing a computer provided in the backup management server to execute:
a procedure of dividing the full backup data into blocks at each of predetermined first to third block sizes;
a procedure of calculating hash data by applying a predetermined function to the data of each block;
a procedure of dividing, for each of the first to third block sizes, the hash data into hash data for a test area, which is a part of the full backup data, and hash data for the remaining area;
a procedure of transmitting the hash data for the test area to the client; and
a procedure of transmitting to the client the first hash data for the remaining area requested by that client.

10. A differential backup program for a differential backup system in which one or more clients, each a computer device holding backup target data, and a backup management server that holds full backup data that is a complete backup of each client's backup target data are interconnected, the program causing a computer provided in the client to execute:
a procedure of dividing, for each of the first to third block sizes, the backup target data into blocks of that block size;
a procedure of calculating second hash data by applying the function to the data of each block;
a procedure of comparing, block by block, the calculated second hash data with the first hash data received from the backup management server;
a procedure of performing a differential backup of a block of the backup target data when the first and second hash data for that block differ;
a procedure of determining as the optimal size the block size that required the least time in the differential backup, performed at the first to third block sizes, of a test area that is a part of the backup target data;
a procedure of requesting from the backup management server the first hash data for the remaining area, other than the test area, at the determined optimal size; and
a procedure of performing a differential backup of the remaining area of the backup target data using the first hash data received for the remaining area.