JP4127461B2

JP4127461B2 - Backup system and method in disk shared file system

Info

Publication number: JP4127461B2
Application number: JP2001025748A
Authority: JP
Inventors: 芳浩土屋; 慶武新開
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2000-02-04
Filing date: 2001-02-01
Publication date: 2008-07-30
Anticipated expiration: 2021-02-01
Also published as: JP2001290686A

Description

【０００１】
【発明の属する技術分野】
本発明は、計算機（コンピュータ）システムのディスク等の記録媒体に格納されたデータをバックアップし、必要なときにリストアするシステムおよびその方法に関する。
【０００２】
【従来の技術】
従来の計算機システムでは、ファイルシステムがバックアップを行う際に、まず、ファイル単位で、使用されているブロックのアドレス等のブロック情報を調べる。そして、該当するブロックのデータをディスクから読み出すことにより、ファイルを読み、読んだデータをテープにコピーする。このような動作をファイル毎に繰り返すことで、ファイルのバックアップが行われていた。
【０００３】
しかし、この方法では、多数のファイルのバックアップを行う場合、ディスクへのアクセスがランダムアクセスに近くなるため、システムの性能を損ねる原因となっていた。
【０００４】
そこで、バックアップを効率化するために、ファイルが占めている複数のブロックを直接コピーするイメージバックアップが用いられるようになった。この方法では、計算機システムは、ファイルを選択的にコピーする代わりに、ディスク内でファイルが占めているブロックの領域を一括してコピーする。このため、１回のディスクアクセスでバックアップが行われ、処理が効率化される。
【０００５】
【発明が解決しようとする課題】
しかしながら、上述した従来のバックアップ方法には、以下のような問題がある。
【０００６】
従来のイメージバックアップでは、ディスクを単位としてデータをコピーすることはできたが、ファイルやディレクトリを単位としてコピーすることはできなかった。このため、必要のないデータまでコピーしなければならないという問題があった。また、バックアップされたデータをリストアするには、すべてのデータをディスク上にコピーして展開する必要があった。
【０００７】
さらに、複数の計算機がディスクを共有して処理を行うクラスタシステムにおいてバックアップを行おうとすると、次のような問題が発生する。
クラスタシステムは、複数の計算機が共有ディスクに同時にアクセスするためのファイルシステム（ディスク共用ファイルシステム）を備えており、各計算機は、書き込みデータをキャッシュする領域を備えている。このため、単に共有ディスクをコピーするだけでは、キャッシュされた書き込みデータ（ライトキャッシュ）の内容がコピーに反映されないので、通常のイメージバックアップを行うことは不可能である。
【０００８】
また、従来の計算機システムでは、イメージバックアップを業務の稼動中に（オンラインで）行うために、ファイルシステムが、データの変更を契機として、変更前の元データを別の領域に退避させ、ディスクをコピーした後に、退避されていた元データをコピーに上書きしていた。これにより、バックアップデータ上で、バックアップ開始時点の内容を確定することができた。
【０００９】
しかし、クラスタシステムにおいては、同一のファイル領域に対して、複数の計算機による変更がほとんど同時に発生する場合があり、元データを用いてバックアップ開始時点の内容を確定する方法を、直接適用できないという問題がある。
【００１０】
このように、従来のバックアップ方法では、クラスタシステムにおける大量のデータを効率的にバックアップすることができない。このため、クラスタシステムにおける有効なバックアップ方法は開発されておらず、バックアップされたデータを効率的に閲覧する方法もない。
【００１１】
本発明の課題は、ディスク共用ファイルシステムを有する計算機システムにおいて、データを効率的にバックアップするシステムおよびその方法を提供することである。
【００１２】
【課題を解決するための手段】
図１は、本発明のバックアップシステムの原理図である。
本発明の第１の局面において、バックアップシステムは、コピー手段１と制御手段２を備え、複数の計算機３により共有される共有媒体４のバックアップを行う。
【００１３】
コピー手段１は、共有媒体４の複数の単位領域を一括してバックアップ媒体５にコピーする。制御手段２は、各計算機３が共有媒体４に書き込む書き込みデータを管理し、バックアップ時に、各計算機３の書き込みデータを共有媒体４に反映させる。
【００１４】
各計算機３は、共有媒体４のデータを変更するとき、書き込みデータをライトキャッシュとして保持し、共有媒体４へのアクセスが可能になったとき、その内容を共有媒体４に書き込む。制御手段２は、各計算機３が保持する書き込みデータの有無を管理し、バックアップ時に、計算機３が保持する書き込みデータを共有媒体４に書き込む制御を行う。
【００１５】
共有媒体４の格納領域は、例えば、ブロックのような単位領域毎に分割されている。コピー手段１は、すべての書き込みデータが書き込まれた後に、例えば、イメージバックアップのような方法により、共有媒体４の複数の単位領域を一括してバックアップ媒体５にコピーする。
【００１６】
このようなバックアップシステムによれば、ディスク共用ファイルシステムにおいて、各計算機３が保持する書き込みデータを含めて、共有媒体４のバックアップを効率よく行うことができる。
【００１７】
また、本発明の第２の局面において、バックアップシステムは、ログ管理手段６と生成手段７を備え、複数の計算機３により共有される共有媒体４のイメージバックアップを行う。
【００１８】
ログ管理手段６は、複数の計算機３のうちのいずれかの計算機３が共有媒体４のある領域に書き込む際に、その領域の書き込み前のイメージデータをログとして書き込みを行った計算機３で管理し、複数の計算機３が管理するログをまとめて全体のログを生成する。そして、同一領域に対するログが複数ある場合は、イメージバックアップ開始時点以降、最も古いログを用いる。生成手段７は、リストア時に、全体のログを用いて、イメージバックアップ開始時点のデータを生成する。
【００１９】
計算機３が共有媒体４のデータを変更するとき、変更前の元データがログとして保存される。ログ管理手段６は、各計算機３のログを管理し、２つ以上の計算機３のログをまとめて、システム全体のログを生成する。そして、生成手段７は、例えば、共有媒体４のバックアップデータに全体のログを上書きすることで、バックアップ開始時点の内容を確定する。
【００２０】
このようなバックアップシステムによれば、複数の計算機３によるデータの変更を契機として保存された元データが編集され、全体のログが生成される。したがって、ディスク共用ファイルシステムにおいても、システムの稼動中にバックアップを効率よく行うことができる。
【００２１】
また、本発明の第３の局面において、バックアップシステムは、コピー手段１とグループ管理手段８を備え、複数の計算機３により共有される共有媒体４のバックアップを行う。
【００２２】
グループ管理手段８は、共有媒体４に格納されたファイルのグループを設定し、そのグループに含まれるファイルが占める単位領域をリストアップする。コピー手段１は、リストアップされた複数の単位領域を一括してバックアップ媒体５にコピーする。
【００２３】
グループ管理手段８は、１つ以上のファイルを含むグループを設定し、そのグループに含まれる各ファイルが占める単位領域をリストアップする。そして、コピー手段１は、例えば、イメージバックアップのような方法により、各ファイルを区別することなく、リストアップされた複数の単位領域を一括してバックアップ媒体５にコピーする。
【００２４】
このようなバックアップシステムによれば、ディスク共用ファイルシステムにおいて、バックアップするファイルを指定することが可能になり、不要なファイルのコピーを行う必要がなくなる。したがって、バックアップが効率化される。
【００２５】
また、本発明の第４の局面において、バックアップシステムは、コピー手段１と領域管理手段９を備え、計算機３からアクセスされるファイルを格納する格納媒体４のバックアップを行う。
【００２６】
領域管理手段９は、格納媒体４の単位領域毎に使用されているか否かを判定し、使用されている単位領域をリストアップする。コピー手段１は、リストアップされた複数の単位領域を一括してバックアップ媒体５にコピーする。
【００２７】
領域管理手段９は、格納媒体４の各単位領域を管理し、各単位領域がファイルとして使用されているか否かを判定して、ファイルが占める単位領域をリストアップする。そして、コピー手段１は、例えば、イメージバックアップのような方法により、各ファイルを区別することなく、リストアップされた複数の単位領域を一括してバックアップ媒体５にコピーする。
【００２８】
このようなバックアップシステムによれば、ファイルとして使用されていない単位領域のコピーを行う必要がなくなる。したがって、ファイルシステムにおけるバックアップが効率化される。
【００２９】
また、本発明の第５の局面において、バックアップシステムは、コピー手段１と領域管理手段９を備え、計算機３からアクセスされるファイルを格納する格納媒体４のバックアップを行う。
【００３０】
領域管理手段９は、格納媒体４の単位領域のうち、前回のバックアップの後で変更があった単位領域を差分としてリストアップする。コピー手段１は、リストアップされた複数の単位領域を一括して、差分バックアップデータとしてバックアップ媒体５にコピーする。
【００３１】
バックアップシステムは、適当なタイミングで格納媒体４のバックアップを時系列に行う。領域管理手段９は、格納媒体４の各単位領域を管理し、前回のバックアップの後で変更された単位領域や、ファイルとして新たに使用された単位領域をリストアップする。そして、コピー手段１は、例えば、イメージバックアップのような方法により、各ファイルを区別することなく、リストアップされた複数の単位領域を一括してバックアップ媒体５にコピーする。これにより、変更のあった単位領域のみが、差分として保存される。
【００３２】
このようなバックアップシステムによれば、前回のバックアップの後でデータに変更がなかった単位領域のコピーを行う必要がなくなる。したがって、ファイルシステムにおけるバックアップが効率化される。
【００３３】
例えば、図１の共有媒体４は、後述する図２の共有ディスク１３に対応し、図１のバックアップ媒体５は、図２のバックアップ媒体１５またはテープ１６に対応する。また、例えば、図１のコピー手段１は、図２のコピー管理部２５に対応し、図１の制御手段２は、図２のキャッシュ制御部２１に対応し、図１のログ管理手段６は、図２のログ管理部２６に対応し、図１の生成手段７および領域管理手段９は、図２のブロック管理部２２に対応し、図１のグループ管理手段８は、図２のグループ管理部２３に対応する。
【００３４】
【発明の実施の形態】
以下、図面を参照しながら、本発明の実施の形態を詳細に説明する。
本実施形態の計算機システムは、複数の計算機と、それらの計算機が共有する共有ディスクと、複数の計算機が共有ディスクに同時にアクセスするためのファイルシステムとを備える。
【００３５】
この計算機システムは、データのバックアップの際に、改良されたイメージバックアップにより、ディスクのすべての内容をバックアップ用の媒体に直接コピーする。また、計算機のディスクに対するアクセスを検出し、アクセスの発生する以前の元データをログ媒体に保存する。そして、ログ媒体に保存されたデータを用いて、バックアップ開始時点のイメージ（データの内容）を確定する。以下では、この元データを、Before Image Log（ＢＩログ）、または、単にログと呼ぶことにする。
【００３６】
本実施形態のイメージバックアップにおいて、バックアップ動作に関する主な特徴は以下の通りである。
（ａ．１）各計算機のメモリ上のライトキャッシュの内容を管理し、バックアップ時に、各計算機のライトキャッシュの内容をディスクに反映させる。これにより、クラスタ内のデータの矛盾が生じることがなくなる。
（ａ．２）ディスクに書き込みを行おうとする各計算機がＢＩログを残し、バックアップ時に複数の計算機のＢＩログをマージする。これにより、バックアップ終了後にすべてのログをまとめたシステム全体のログが編集され、このログを用いてデータを確定することで、バックアップ（コピー）中のライトによるデータの破壊が防止される。
（ａ．３）ディスクに書き込みを行おうとする各計算機が、ＢＩログの責任を持つ特定の計算機に書き込みを通知し、その計算機がＢＩログを管理する。これにより、複数の計算機のログが特定の計算機に送られ、マージされてログ媒体に保存される。このログを用いてデータを確定することにより、バックアップ中のライトによるデータの破壊が防止される。
（ａ．４）ＢＩログを保存する媒体として、バックアップデータと同一の媒体を選択する。これにより、バックアップと同時にログを保存することができる。
（ａ．５）ＢＩログを保存する媒体として、バックアップデータと別の媒体を選択する。これにより、バックアップデータを保存する媒体が上書き不可能な場合でも、ログを残すことができる。
（ａ．６）ＢＩログをバックアップデータに上書きしてから、バックアップ媒体に保存する。これにより、リストア時に複数の媒体を参照する必要がなくなる。
（ａ．７）ＢＩログの中に、上書きの対象となるバックアップデータのアドレス情報を書き込んでおく。これにより、ログの管理情報にアクセスしなくても、ログを読むだけで、ログをバックアップデータに上書きすることが可能になる。
（ａ．８）ディスク上のブロックのうち、使用済みのものをリストアップして、必要な部分のみをコピーする。これにより、コピーするデータの量を削減されるため、コピー時間の短縮と必要な媒体容量の削減が可能になる。
（ａ．９）ディスク上の使用済みのブロックのうち、前回のバックアップ以後に変更があったもの（差分）をリストアップして、変更部分のみをコピーする。このような差分バックアップにより、コピーするデータの量を削減されるため、コピー時間の短縮と必要な媒体容量の削減が可能になる。
（ａ．１０）バックアップが完了した後、リストアの前に、差分バックアップデータ同士、または差分バックアップデータと全体バックアップデータの内容を、ブロック単位でマージする。差分バックアップデータをまとめておくことで、リストアが効率化される。
（ａ．１１）差分バックアップデータの記録開始時点を選択できるようにする。リストア時に、選択された時点以後の差分バックアップデータのみを用いることで、その時点より前に行われた変更を無視することができ、柔軟なリストアが可能になる。
（ａ．１２）ディスクのコピー処理をクラスタ内の複数の計算機に分散する。これにより、負荷が分散され、コピー時間が短縮される。
【００３７】
また、本実施形態のイメージバックアップにおいて、ファイルのグループ化に関する主な特徴は以下の通りである。
（ｂ．１）ファイルをグループ化し、グループに含まれているファイルの占めるブロックを管理して、バックアップ時には、それらのファイルが使用するブロックのみをコピーする。これにより、ファイルのグループの設定と、グループ単位のファイルのバックアップが可能になる。
（ｂ．２）ディレクトリを単位としてファイルをグループ化し、ディレクトリに含まれるすべてのファイルをグループとして設定する。これにより、ファイルのグループの設定と、グループ単位のファイルのバックアップが可能になる。
（ｂ．３）グループとして設定されたディレクトリの下の特定のファイルまたはディレクトリを、グループから除外する。これにより、あるグループとして設定されたディレクトリに含まれる特定のファイルを、グループから除外することができ、柔軟なグループ設定が可能になる。
（ｂ．４）複数のグループを設定し、それぞれ別のスケジュールでバックアップを行う。これにより、柔軟なグループ設定とバックアップが可能になる。
（ｂ．５）１つのファイルが複数のグループに属することを認める。これにより、柔軟なグループ設定とバックアップが可能になる。
【００３８】
また、本実施形態のイメージバックアップにおいて、リストア動作に関する主な特徴は以下の通りである。
（ｃ．１）ファイルシステムが、バックアップデータを保存する媒体を、ディスクの代わりにそのままマウントする。これにより、バックアップデータを保存する媒体をディスクの代わりにアクセスすることができ、リストアのための特別な操作が不要になる。
（ｃ．２）上述した差分バックアップが行われた場合、必要に応じて、全体バックアップデータまで、各世代のバックアップデータを探索して遡る。これにより、ファイルのブロックが最新の差分バックアップデータに含まれていないときに、それより前のバックアップにより保存されたブロックを参照して利用することができる。したがって、リストア時に、ユーザにはすべてのデータが存在しているように見せることが可能になる。
（ｃ．３）バックアップテープから必要なブロックのみを、バッファとして用いるディスクにロードし、それらのブロックをキャッシュとして利用する。これにより、必要なブロックのみをバッファ上に配置することが可能になり、頻繁にアクセスされるブロックへのアクセス効率が向上する。
（ｃ．４）バックアップテープから必要なブロックのみをディスクにロードし、テープに繋がっていない計算機に対してデータを見せる。これにより、クラスタ内のテープを持たない計算機でも、テープに保存されたバックアップデータを読むことが可能になる。
（ｃ．５）ＢＩログがバックアップデータに上書きされていない場合、ＢＩログを先に参照し、必要に応じて、バックアップデータを後で参照する。ログとバックアップデータが別々に媒体に保存されている場合、ログの存在と内容を確認して、ログがあればログを参照し、ログがなければバックアップデータを参照する。これにより、リストア後のデータに矛盾が生じることがなくなる。
【００３９】
図２は、上述したようなイメージバックアップを行うクラスタシステムの構成図である。図２のクラスタシステムは、複数の計算機１１、１２、共有ディスク１３、ログ媒体１４、バックアップ媒体１５、およびテープ１６を備える。
【００４０】
複数の計算機１１は、共有ディスク１３を共有し、ディスク１３に格納されたファイルにアクセスしながらデータ処理を行う。計算機１１、１２および共有ディスク１３はクラスタを構成し、クラスタ内には、一般に、１つ以上の共有ディスク１３が設けられる。ログ媒体１４は、計算機１１のＢＩログを格納し、バックアップ媒体１５およびテープ１６は、ディスク１３内のファイルのバックアップデータを格納する。
【００４１】
計算機１２は、クラスタを管理する計算機であり、キャッシュ制御部２１、ブロック管理部２２、グループ管理部２３、媒体制御部２４、コピー管理部２５、ログ管理部２６、およびテープ制御部２７を含む。これらの管理部および制御部は、例えば、プログラムにより記述されたソフトウェアに対応し、ブロック管理部２２は、ファイルシステムの主要部に対応する。
【００４２】
キャッシュ制御部２１は、各計算機１１のメモリ上に設けられたキャッシュ２８を制御する。キャッシュ２８には、ディスク１３に格納されたクラスタのファイルに対して計算機１１が書き込むデータが一時的に保存される。
【００４３】
また、ブロック管理部２２は、ファイルへのブロック割当てを行い、ファイルの各ブロックがどのディスク１３のどのアドレスに割り当てられているかを管理する。グループ管理部２３は、ユーザが定義したグループと、グループに含まれているファイルを管理する。
【００４４】
また、媒体制御部２４は、バックアップ媒体１５へのアクセスを制御し、コピー管理部２５は、ディスク１３からバックアップ媒体１５へのデータのコピー動作を管理する。ログ管理部２６は、各計算機１１のＢＩログとログ媒体１４を管理し、テープ制御部２７は、テープ１６へのアクセスを制御する。
【００４５】
このようなクラスタシステムによれば、ブロック管理部２２がディスク１３上のバックアップ対象ファイルのブロックを管理し、コピー管理部２５がそれらのブロックをバックアップ媒体１５にコピーすることにより、バックアップが行われる。コピー中に発生したクラスタ内の各計算機１１の書き込みに基づくＢＩログは、ログ管理部２６によりログ媒体にコピーされる。ログ媒体に格納されたＢＩログは、後でバックアップ媒体１５に反映されるか、または、そのままログの形式で保持される。
【００４６】
ログ媒体１４およびバックアップ媒体１５は、ディスクやテープのような不揮発な媒体である。バックアップ媒体１５がディスクである場合、それはテープ１６に対するバッファとしても利用され、テープ制御部２７がバックアップ媒体１５からテープ１６へバックアップデータをコピーする。
【００４７】
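First, the operations relating to features (a.1) to (a.7) described above are explained in detail with reference to FIGS. 3 to 16.
The cache control unit 21 performs the cache management of (a.1) and, at backup time, reflects the write data (write cache) in each cache 28 to the disk 13. The cache control unit 21 manages a cache table in which the following information is registered for every cache 28 in the cluster.
・Computer name
・File name
・Area within the file (offset, size)
・Whether the cache is dirty (whether write data remains that has not yet been reflected to the disk 13)
When a computer 11 creates a dirty cache, the cache control unit 21 instructs each computer 11 to discard any write cache that other computers 11 hold for the corresponding area of the corresponding file. At backup time, it instructs each computer 11 to write out all dirty caches to the disk 13. A computer 11 that receives this instruction writes the write cache in its cache 28 to the disk 13.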
【００４８】
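When all write caches in the cluster have thus been reflected to the disk 13, the copy management unit 25 executes the image backup, and the data in the disk 13 is copied to the backup medium 15. The backup is thereby performed without any inconsistency arising in the data within the cluster.
【００４９】
FIG. 3 is a flowchart of the process by which the cache control unit 21 reflects the write cache to the disk 13. When a backup starts, the cache control unit 21 first searches the cache table and checks whether any dirty cache remains in the cluster (step S1).
【００５０】
If a dirty cache remains, the unit obtains the name of the computer holding it and the file name and area of its write destination (step S2). It then instructs that computer to reflect the dirty cache to the corresponding area of the corresponding file on the disk 13 (step S3), and repeats the processing from step S1. When step S1 finds that no dirty cache remains, the process ends.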
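The loop of FIG. 3 (steps S1 to S3) can be illustrated by the following minimal sketch. It is not the implementation of the embodiment: the names `CacheEntry`, `flush_dirty_caches`, and the `write_back` callback are hypothetical, and the sketch assumes that a write-back instruction always succeeds, so that the entry can be marked clean immediately.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    """One row of the cache table (fields follow the list in the description)."""
    computer: str   # computer name
    file: str       # file name
    offset: int     # start of the area within the file
    size: int       # length of the area
    dirty: bool     # True while the write data is not yet on the disk


def flush_dirty_caches(cache_table, write_back):
    """Repeat S1-S3 until no dirty entry remains; return what was flushed.

    write_back(computer, file, offset, size) is a hypothetical callback that
    stands in for "instruct that computer to reflect the dirty cache".
    """
    flushed = []
    while True:
        # S1: search the cache table for a remaining dirty cache.
        entry = next((e for e in cache_table if e.dirty), None)
        if entry is None:
            return flushed
        # S2: obtain the computer name, file name, and area.
        target = (entry.computer, entry.file, entry.offset, entry.size)
        # S3: instruct the computer to write the area to the disk.
        write_back(*target)
        entry.dirty = False  # assumption: the write-back succeeded
        flushed.append(target)
```

Only after this function returns would the image copy of the disk be started, which is what keeps the copy consistent with the write caches.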
【００５１】
また、ログ管理部２６は、ＢＩログに関して、（ａ．２）または（ａ．３）のログ管理を行う。
図４は、（ａ．２）のログ管理を示している。図４において、各計算機１１は、各自の一時ログ媒体３１とログ管理ファイル３２を持ち、ログ管理部２６は、バックアップ終了後に、すべての計算機１１のログをまとめてシステム全体のログを編集し、ログ媒体１４に格納する。ログ管理ファイル３２には、各ログについて以下のような情報を記録したログリストが含まれる。
・ディスク１３のデバイス名
・領域（オフセット、サイズ）
・時刻
このうち、時刻は、ログが生成された時刻を表し、他のログとの前後関係を決定するために用いられる。ここでは、実際の時刻の代わりに、計算機１２に備えられたクロック部３３が生成する論理時刻（logical time）を用いている。クロック部３３は、例えば、最初のログが生成されたときに論理時刻“１”を生成し、以後、ログが生成される度に論理時刻を１ずつインクリメントする。
【００５２】
各計算機１１は、一時ログ媒体３１のログとログ管理ファイル３２のログリストをログ管理部２６に送り、ログ管理部２６は、受け取ったログの中に同一領域のログが複数あれば、最も古いものを優先的に残してログを編集する。こうして編集されたログをバックアップ媒体１５に上書きすることで、バックアップ開始時点のイメージが確定され、バックアップ中のディスク１３への書き込みによる変更をキャンセルすることができる。
【００５３】
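FIG. 5 is a flowchart of the log editing process performed by the log management unit 26. The log management unit 26 first receives the logs and log lists from all the computers 11 in the cluster (step S11), sorts the received logs from oldest to newest, and prepares a working log list (step S12).
【００５４】
Next, it selects the oldest log not yet selected (step S13) and checks whether a log for the same area already exists in the working log list (step S14). If there is no log for the same area, it adds the selected log to the working log list (step S15); if there is, it discards the selected log (step S16).
【００５５】
Next, it checks whether any unselected log remains (step S17) and, if so, repeats the processing from step S13. When all logs have been selected, it records the logs contained in the working log list on the log medium 14 (step S18) and ends the process.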
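The log editing of FIG. 5 amounts to keeping, for each area, only the oldest before-image. A minimal sketch follows. The type `BILog` and the function `merge_bi_logs` are hypothetical names, and the sketch assumes that two logs refer to the "same area" only when device, offset, and size are all identical; partially overlapping areas are not handled.

```python
from typing import NamedTuple


class BILog(NamedTuple):
    device: str   # device name of the disk
    offset: int   # start of the logged area
    size: int     # length of the logged area
    time: int     # logical time at which the log was generated
    data: bytes   # image of the area before the write


def merge_bi_logs(per_computer_logs):
    """Merge the logs of all computers, keeping the oldest log per area."""
    # S11, S12: gather every log and sort from oldest to newest.
    all_logs = sorted(
        (log for logs in per_computer_logs for log in logs),
        key=lambda log: log.time,
    )
    working = {}  # working log list, keyed by area
    for log in all_logs:                       # S13, S17
        area = (log.device, log.offset, log.size)
        if area not in working:                # S14
            working[area] = log                # S15
        # otherwise the newer log for the same area is discarded (S16)
    return list(working.values())              # S18 would record these
```

The oldest log wins because it holds the contents of the area as of the start of the backup; any later log for the same area holds data that was itself written during the copy.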
【００５６】
図６は、（ａ．３）のログ管理を示している。図６において、各計算機１１は、ログをログ管理部２６に送り、ログ管理部２６は、受け取ったログをログ媒体１４に記録し、対応するデバイス名および領域（オフセット、サイズ）をログ管理ファイル３４に記録する。このように、図６のログ管理では、ログが初めからログ媒体１４に置かれることになる。こうして記録されたログをバックアップ媒体１５に上書きすることで、バックアップ開始時点のイメージが確定される。
【００５７】
図７は、ログ管理部２６によるログ記録処理のフローチャートである。ログ管理部２６は、まず、クラスタ内の計算機１１からログを受け取り（ステップＳ２１）、それと同一領域のログが既にログ媒体１４に保存されているか否かをチェックする（ステップＳ２２）。同一領域のログが保存されていなければ、受け取ったログをログ媒体１４に記録し（ステップＳ２３）、同一領域のログが保存されていれば、受け取ったログを記録しない。
【００５８】
次に、クラスタ内のすべての計算機１１からログを受け取ったか否かをチェックし（ステップＳ２４）、ログを送っていない計算機１１があれば、ステップＳ２１以降の処理を繰り返す。そして、すべての計算機１１からログを受け取ると、処理を終了する。
【００５９】
図４のログ管理では、バックアップ終了後に各計算機１１のログがまとめてログ管理部２６に送られるので、通信コストが小さいという利点がある。しかし、一時ログ媒体３１を必要とするため、ハードウェアコストが増大し、バックアップ終了後にログを編集しなければならないため、後処理が必要となる。
【００６０】
これに対して、図６のログ管理では、一時ログ媒体３１と後処理が不要であるという利点がある。しかし、各計算機１１でログが発生する度にログ管理部２６に送られるので、図４の場合より通信コストが増大する。
【００６１】
図６のログ管理は、さらに（ａ．４）または（ａ．５）のログ管理に分類することができる。（ａ．４）のログ管理では、ログ媒体１４の代わりにバックアップ媒体１５にログを保存し、（ａ．５）のログ管理では、ログ媒体１４にログを保存する。
【００６２】
図８は、（ａ．４）のログ管理を示している。ディスク１３のコピー先であるバックアップ媒体１５がテープではなく、テープ１６に対するバッファとして用いられるディスクである場合、バックアップ媒体１５への部分的上書きが可能である。そこで、コピー管理部２５とログ管理部２６が連携することにより、バックアップ対象となるディスク１３のコピーが行われているとき、コピーと同時にログをバックアップ媒体１５に記録することができる。
【００６３】
このとき、まず、ログ管理部２６がログをバックアップ媒体１５に記録した後に、ログが存在しない領域に関して、コピー管理部２５がディスク１３のデータをバックアップ媒体１５にコピーする。
【００６４】
図９は、ログ管理部２６によるログ記録処理のフローチャートである。ログ管理部２６は、まず、保存すべきログの管理情報を、図１０に示すようなログ管理ファイル３４に記録し（ステップＳ３１）、ログをバックアップ媒体１５にコピーする（ステップＳ３２）。次に、コピーしていない他のログがあるか否かをチェックし（ステップＳ３３）、そのようなログがあれば、ステップＳ３１以降の処理を繰り返す。そして、すべてのログをコピーし終えると、処理を終了する。
【００６５】
図１０のログ管理ファイルには、ログ毎に、デバイス名、元アドレス、および長さが記録されている。デバイス名は、対応するディスク１３の識別情報を表し、元アドレスと長さは、それぞれ、対応する領域のオフセットとサイズを表す。
【００６６】
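FIG. 11 is a flowchart of the copy process performed by the copy management unit 25. The copy management unit 25 first starts copying with the start address of the backup target disk 13 as the current address (step S41), and checks whether the current address is the end address (step S42).
【００６７】
If the current address is not the end address, it checks whether that address exists in the log management file 34 (step S43). If the address is not in the log management file 34, it copies the block at that address to the backup medium 15 (step S44); if the address is in the file, the block at that address is not copied.
【００６８】
Next, it takes the next address as the current address (step S45) and repeats the processing from step S42. When the current address matches the end address in step S42, the process ends.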
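The copy loop of FIG. 11 can be sketched as follows. The sketch makes several assumptions that the description does not state: addresses are block numbers, the end address is exclusive, the disk is an indexable sequence of blocks, and the backup medium is a dictionary that already holds the before-images written by the log management unit. The name `copy_unlogged_blocks` is hypothetical.

```python
def copy_unlogged_blocks(disk, backup, logged_ranges, start, end):
    """Copy every block in [start, end) that has no log entry.

    logged_ranges is a list of (original_address, length) pairs, as recorded
    in the log management file; blocks inside these ranges are skipped so
    that the before-image already on the backup medium is not overwritten.
    """
    logged = set()
    for addr, length in logged_ranges:
        logged.update(range(addr, addr + length))

    copied = []
    addr = start                      # S41
    while addr != end:                # S42
        if addr not in logged:        # S43
            backup[addr] = disk[addr]  # S44
            copied.append(addr)
        addr += 1                     # S45
    return copied
```

Skipping the logged addresses is what allows the log and the copy to share one medium: a block that changed during the backup keeps its logged contents rather than the newer contents on the disk.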
【００６９】
図１２は、（ａ．５）のログ管理を示している。バックアップ媒体１５が、テープのように、部分的上書きが不可能な媒体の場合、バックアップ媒体１５とは別のログ媒体１４を用意して、ログだけをその媒体上に置く。これにより、バックアップ媒体１５とは別の媒体にログを残すことができる。このとき、ログ管理部２６とコピー管理部２５は、それぞれ、ログの記録とディスク１３のコピーを独立に行う。
【００７０】
図１３は、ログ管理部２６によるログ記録処理のフローチャートである。図１３のステップＳ５１およびＳ５３の処理は、それぞれ、図９のステップＳ３１およびＳ３３の処理と同様である。ステップＳ５１の処理の後、ログ管理部２６は、ログをログ媒体１４にコピーし（ステップＳ５２）、ステップＳ５３の処理を行う。
【００７１】
図１４は、コピー管理部２５によるコピー処理のフローチャートである。図１４のステップＳ６１、Ｓ６２、Ｓ６３、およびＳ６４の処理は、それぞれ、図１１のステップＳ４１、Ｓ４２、Ｓ４４、およびＳ４５の処理と同様である。この場合、バックアップ対象のディスク１３のすべてのブロックがバックアップ媒体１５にコピーされる。
【００７２】
また、（ａ．６）のログ管理では、ログ管理部２６は、バックアップ時に、ＢＩログをバックアップ媒体１５に上書きしてから保存する。このようにログとバックアップデータをあらかじめマージして保存しておけば、リストア時に、バックアップ媒体１５のみを参照すればよく、複数の媒体を参照する必要がなくなる。したがって、リストアが効率化される。
【００７３】
図１５は、（ａ．６）のログ管理を示している。図１５において、ログ管理部２６は、ログ管理ファイル３４を参照しながら、ログ媒体１４の複数のログを、それぞれ、バックアップ媒体１５の対応する領域に上書きして、ログとバックアップデータをマージする。その後、テープ制御部２７により、バックアップ媒体１５のデータがテープ１６に保存される。
【００７４】
また、（ａ．７）のログ管理では、ログ管理部２６は、バックアップ時に、ＢＩログのデータとともに、ログの上書き先のバックアップデータのアドレス情報を記録しておく。本来、ログの管理情報であるログ管理ファイルを参照しないとログにはアクセスできないが、ログの中に管理情報を書いておけば、ログを読むだけでログをバックアップデータに上書きすることができる。したがって、ログの管理情報を参照しなくても、ログを解消することが可能になり、ログ管理が効率化される。
【００７５】
図１６は、このようなログ媒体のデータ形式を示している。図１６において、元アドレスと長さは、それぞれ、対応する上書き先の領域のオフセットとサイズを表し、これらはログの管理情報に相当する。
【００７６】
次に、図１７から図２７までを参照しながら、上述した特徴（ａ．８）〜（ａ．１２）および（ｂ．１）〜（ｂ．５）に関する動作を詳細に説明する。
図１７は、（ａ．８）および（ａ．９）のブロック管理と、（ｂ．１）〜（ｂ．５）のグループ管理を示している。使用済みブロックリスト４１は、（ａ．８）のブロック管理で用いられ、変更ブロックリスト４２は、（ａ．９）のブロック管理で用いられる。また、グループブロックリスト４３およびグループ変更ブロックリスト４４は、（ｂ．１）〜（ｂ．５）のグループ管理で用いられる。
【００７７】
（ａ．８）のブロック管理では、ブロック管理部２２は、ディスク１３上のファイルに割当て済みのブロックを、使用済みブロックリスト４１に記録して管理する。そして、ブロック管理部２２は、使用済みブロックリスト４１に記録されたブロックをコピー管理部２５に通知し、コピー管理部２５は、通知されたブロックのみをコピーする。このように、バックアップデータとして必要なブロックのみをコピーすることで、コピー時間が短縮され、必要な媒体容量が削減される。
【００７８】
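The used block list 41 is generated, for example, from a free area management table such as that shown in FIG. 18. The free area management table of FIG. 18 is managed by the block management unit 22 and holds the block identification information (block number) of every block on the disk 13 together with flag information indicating whether each block is in use. Here, the flag “○” denotes a free block and the flag “×” denotes a block in use.
【００７９】
At backup time, the block management unit 22 lists the block numbers of the blocks in use from the free area management table and generates the used block list 41. For example, a used block list such as that shown in FIG. 19 is generated from the free area management table of FIG. 18.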
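Deriving the used block list from the free area management table is a simple filter, sketched below. The table contents in the usage example are made up for illustration and are not those of FIG. 18. The helper `to_extents` is an additional assumption: it shows one way to record runs of consecutive blocks as (start, length) pairs instead of individual block numbers.

```python
FREE, IN_USE = "○", "×"   # flags as used in the free area management table


def used_block_list(free_area_table):
    """Return the block numbers whose flag marks them as in use."""
    return [block_no for block_no, flag in free_area_table if flag == IN_USE]


def to_extents(block_numbers):
    """Collapse consecutive block numbers into (start, length) pairs."""
    extents = []
    for n in sorted(block_numbers):
        if extents and extents[-1][0] + extents[-1][1] == n:
            start, length = extents[-1]
            extents[-1] = (start, length + 1)   # extend the current run
        else:
            extents.append((n, 1))              # start a new run
    return extents


if __name__ == "__main__":
    table = [(0, IN_USE), (1, IN_USE), (2, FREE), (3, IN_USE)]
    print(used_block_list(table))               # block numbers in use
    print(to_extents(used_block_list(table)))   # the same blocks as runs
```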
【００８０】
また、（ａ．９）のブロック管理では、ブロック管理部２２は、ディスク１３上のブロックのうち前回のバックアップの後で変更のあったブロックを、変更ブロックリスト４２に記録して管理する。そして、ブロック管理部２２は、変更ブロックリスト４２に記録されたブロックを差分としてコピー管理部２５に通知し、コピー管理部２５は、通知されたブロックのみをコピーする。これにより、差分バックアップ（インクリメンタルバックアップ）が行われる。
【００８１】
例えば、ファイルｆがブロックｘ、ｙ、およびｚを使用しているとき、前回のバックアップの後で、ブロックｘに上書きが行われ、新たにブロックｕが追加されたとする。この場合、ブロックｘおよびｕが変更ブロックリスト４２に記録され、バックアップ時には、これらのブロックの内容がコピーされる。
【００８２】
ディスク１３上のすべてのブロックのバックアップ（全体バックアップ）を行う代わりに、このような差分バックアップを行うことで、コピー時間が短縮され、必要な媒体容量が削減される。
【００８３】
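FIG. 20 is a flowchart of the changed block list update process performed by the block management unit 22. In this process, the blocks that have changed are determined from a write request to a file, and those blocks are added to the changed block list 42.
【００８４】
First, the block management unit 22 receives a write request to a file from a computer 11 (step S71). The write request contains the file name and the offset a and size S of the write area. Next, it obtains the block numbers of the blocks allocated to the range a to a+S of the corresponding file (step S72) and adds those block numbers to the changed block list 42 (step S73). It then accesses the requested blocks, performs the processing for the write (step S74), and ends the process.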
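Steps S72 and S73 can be sketched as follows. The block size of 4096 bytes, the per-file block map (a list whose i-th element is the disk block holding the i-th block of the file), and the function names are all assumptions made for illustration. Allocation of new blocks when a write extends the file is not modelled.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes


def blocks_for_range(block_map, offset, size):
    """S72: disk blocks that back bytes [offset, offset + size) of a file."""
    if size <= 0:
        return []
    first = offset // BLOCK_SIZE
    last = (offset + size - 1) // BLOCK_SIZE
    return [block_map[i] for i in range(first, last + 1)]


def record_write(changed_blocks, file_block_maps, file_name, offset, size):
    """S72-S73: add the blocks touched by a write request to the changed set."""
    blocks = blocks_for_range(file_block_maps[file_name], offset, size)
    changed_blocks.update(blocks)
    return blocks
```

Because the changed block list is kept as a set in this sketch, repeated writes to the same block are recorded only once, which matches the purpose of the list: it identifies what to copy at the next differential backup, not how often it changed.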
【００８５】
また、（ｂ．１）のグループ管理では、グループ管理部２３は、ブロック管理部２２と連携しながらファイルをグループ毎に管理し、各グループに属するファイルが使用しているブロックを、グループブロックリスト４３に記録して管理する。そして、グループ管理部２３は、特定のグループのグループブロックリスト４３に記録されたブロックをコピー管理部２５に通知し、コピー管理部２５は、通知されたブロックのみをコピーする。これにより、特定のグループに関するバックアップが行われる。
【００８６】
また、（ｂ．２）のグループ管理では、グループ管理部２３は、ディレクトリを単位としてファイルをグループ化し、ディレクトリに含まれるすべてのファイルをグループとして設定する。
【００８７】
また、（ｂ．３）のグループ管理では、グループ管理部２３は、グループとして設定されたディレクトリの下の特定のファイルまたはディレクトリを、グループから除外する。これにより、あるグループとして設定されたディレクトリに含まれる特定のファイルを、グループから除外することができる。
【００８８】
以上説明した（ｂ．１）〜（ｂ．３）のグループ管理により、ユーザが任意のファイルをグループ化し、グループ単位でファイルのバックアップを行うことが可能になる。
【００８９】
このようなグループ管理の例として、ファイルシステム上に、図２１に示すようなディレクトリツリーが存在する場合を考える。図２１において、Ａ、Ｂ、Ｃ、およびＤはディレクトリ名を表し、ａ、ｂ、ｃ、ｄ、ｅ、およびｆはファイル名を表す。ユーザは、任意のディレクトリ名およびファイル名を用いて、ファイルのグループを設定することができる。
【００９０】
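Suppose that the user enters a group list such as that shown in FIG. 22 and instructs that a group be set. In FIG. 22, “dir_A/*” and “dir_C/*” indicate that all files in directories A and C are to be included in the group, and “Xdir_D/file_d” indicates that file d in directory D is to be excluded from the group.
【００９１】
In this case, the group management unit 23 selects as the group the files a, b, c, e, and f, that is, all files belonging to directories A and C except file d. It then obtains from the block management unit 22 the block numbers of the blocks allocated to each file and records them in the group block list 43.
【００９２】
FIG. 23 shows an example of a group list for a different directory tree. This group list indicates that file a in directory X, file b in directory Y, and all files in directory Z are included in the group, and that file c in directory Z is excluded from the group.
【００９３】
From this group list, a group block list such as that shown in FIG. 24 is generated, for example. In FIG. 24, “blockno” denotes a block number, and consecutive block numbers are recorded together as a single entry.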
【００９４】
バックアップ時には、ファイルに関するメタ情報と、グループブロックリスト４３に記録されたブロック番号と、対応するブロックのデータがバックアップ媒体にコピーされる。メタ情報としては、ディレクトリツリーに含まれるすべてのファイルのファイル名と属性か、または、グループに属するファイルのファイル名と属性が用いられる。
【００９５】
また、リストア時にファイルが参照されると、ブロック管理部２２は、そのファイル名から対応するブロック番号を求め、そのブロックのバックアップデータにアクセスする。
【００９６】
このとき、メタ情報としてすべてのファイルの情報が記録されていると、計算機１１には、図２１のファイルｄのように、グループに属さないファイルのファイル名も見えることになる。しかし、ファイルｄのブロックのバックアップデータは存在しないため、このファイルを参照するとエラーが返される。これに対して、メタ情報としてグループに属するファイルの情報のみを記録しておけば、計算機１１には、グループに属さないファイルのファイル名は見えないので、見えているすべてのファイルの参照が可能になる。
【００９７】
さらに、グループ管理部２３は、前回のバックアップの後で変更のあったブロックを、グループ単位でグループ変更ブロックリスト４４に記録して管理する。そして、グループ管理部２３は、グループ変更ブロックリスト４４に記録されたブロックを差分としてコピー管理部２５に通知し、コピー管理部２５は、通知されたブロックのみをコピーする。これにより、グループ単位で差分バックアップが行われる。
【００９８】
図１９および図２４のブロックリストでは、ブロック番号が明示的に記録されているが、代わりに元アドレスと長さを用いて連続する複数のブロックの集合を記録してもよい。他のブロックリストについても同様である。
【００９９】
また、（ｂ．４）のグループ管理では、グループ管理部２３は、複数のグループを設定し、それぞれ別のスケジュールでバックアップを行う。また、（ｂ．５）のグループ管理では、グループ管理部２３は、１つのファイルが複数のグループに属することを認めるようなグループ化を行う。これにより、柔軟なグループ設定とバックアップが可能になる。
【０１００】
図１７のディスク１３の斜線部分は、上述した様々なブロックリストのうちの１つに記録されたブロックの集合を表しており、これらのブロックのデータは、コピー管理部２５により、バックアップ媒体１５の斜線部分にコピーされる。
【０１０１】
このような方法によれば、あらかじめ生成されたブロックリストに基づいてバックアップが行われるため、異なるファイルのブロックを含む複数のブロックを一括してコピーすることができる。したがって、ファイル単位でコピーする場合に比べて、ディスク１３へのアクセス回数が大幅に削減され、ランダムアクセスに近い状況は発生しにくくなる。
【０１０２】
また、（ａ．１０）のブロック管理では、差分バックアップを行った後、リストアの前に、差分バックアップデータ同士、または差分バックアップデータと全体バックアップデータの内容を、ブロック単位でマージする。さらに、全体バックアップデータと差分バックアップデータを含む２つ以上のバックアップデータをマージしておいてもよい。差分バックアップデータをあらかじめまとめておくことで、リストアが効率化される。
【０１０３】
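FIG. 25 shows an example of such merge processing. In FIG. 25, the full backup data 51 represents the backup data of the first generation G3, and the differential backup data 52 and 53 represent the differences at generations G2 and G1, respectively. Here, generation G3 is the oldest and generation G1 is the newest. The hatched portions represent the areas in which backup data exists.
【０１０４】
When the full backup data 51 and the differential backup data 52 are merged, backup data 54 is generated. In any area present in both, the newer data is retained in preference. In this case, at restore time, data is referenced using only the backup data 54 and the differential backup data 53.
【０１０５】
When the differential backup data 52 and the differential backup data 53 are merged, backup data 55 is generated. In this case, at restore time, data is referenced using only the backup data 55 and the full backup data 51.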
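The block-wise merge of FIG. 25 can be sketched with dictionaries keyed by block number, where the newer generation takes precedence in any block present in both. Representing a backup as a dictionary and the name `merge_backups` are assumptions made for illustration; the block contents in the example are made up.

```python
def merge_backups(older, newer):
    """Merge two backups block by block; the newer data wins on overlap."""
    merged = dict(older)   # start from the older generation
    merged.update(newer)   # blocks in the newer generation replace it
    return merged


if __name__ == "__main__":
    full_g3 = {1: b"a1", 2: b"a2", 3: b"a3"}   # corresponds to data 51
    diff_g2 = {2: b"b2", 4: b"b4"}             # corresponds to data 52
    diff_g1 = {3: b"c3"}                       # corresponds to data 53
    data_54 = merge_backups(full_g3, diff_g2)  # full + G2
    data_55 = merge_backups(diff_g2, diff_g1)  # G2 + G1
    print(data_54)
    print(data_55)
```

Under this representation the order of merging does not affect the final result: merging 54 with 53 and merging 51 with 55 give the same set of blocks, which is why either intermediate merge can be prepared ahead of a restore.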
【０１０６】
また、（ａ．１１）のブロック管理では、差分バックアップを行う際に、ユーザが差分の基準となる時点を選択し、ブロック管理部２２は、指定された時点以後に変更のあったブロックのみを、変更ブロックリスト４２に記録する。そして、コピー管理部２５は、それらのブロックのみをコピーする。
【０１０７】
これにより、必要に応じて差分バックアップの開始時点を変更することができ、前回のバックアップが行われた時点と選択された時点の間に発生した変更は、バックアップ媒体１５には保存されない。したがって、リストアに反映させる変更を取捨選択することが可能になる。
【０１０８】
図２６は、差分バックアップの開始時点を変更する例を示している。ここでは、時刻ｔ０において前回の差分バックアップが行われ、時刻ｔ０とｔ１の間にブロックｘおよびｙが変更ブロックリスト４２に追加され、時刻ｔ１とｔ２の間にブロックｚが変更ブロックリスト４２に追加されるものとする。ユーザが差分バックアップの開始時点を変更しなければ、時刻ｔ２において、変更ブロックリスト４２には、ブロックｘ、ｙ、およびｚが時刻ｔ０からの差分として記録されている。
【０１０９】
しかし、ユーザが時刻ｔ１を差分バックアップの開始時点として指定すると、ブロック管理部２２は、時刻ｔ１において、変更ブロックリスト４２を一旦クリアして、ブロックｘおよびｙのブロック番号を消去する。その後、ブロックｚが変更ブロックリスト４２に追加され、時刻ｔ２においては、ブロックｚのみが時刻ｔ１からの差分として記録される。そして、次のバックアップ時には、時刻ｔ１からの差分に基づいて、差分バックアップが行われる。
【０１１０】
また、（ａ．１２）のコピー管理では、クラスタ内に複数のディスク１３が存在する場合、コピー管理部２５は、各計算機１１にいずれかのディスク１３のコピーを指示し、クラスタ内でコピーを行う計算機１１を分散させる。そして、各計算機１１は、コピー管理部２５から指示されたディスク１３のコピーを行う。このように、複数の計算機１１がコピーを行うことで、バックアップの負荷が分散され、コピー時間が短縮される。
【０１１１】
図２７は、このようなコピー管理の例を示している。図２７のクラスタ内には、複数のバックアップ媒体１５が設けられている。コピー管理部２５は、コピー対象のディスク１３とコピー先のバックアップ媒体１５のデバイス名を各計算機１１に通知し、コピー作業を依頼する。コピー作業を依頼された計算機１１は、通知されたデバイス名のディスク１３のデータを、通知されたデバイス名のバックアップ媒体１５にコピーする。このとき、複数の計算機１１により、コピー作業が並列に行われる。
【０１１２】
次に、図２８から図３５までを参照しながら、上述した特徴（ｃ．１）〜（ｃ．５）に関するリストア時の動作を詳細に説明する。
（ｃ．１）の動作では、ファイルシステムが、ディスク１３の代わりにバックアップ媒体１５をそのままマウントすることで、データをリストアする。これにより、各計算機１１は、バックアップ媒体１５に保存されたバックアップデータに直接アクセスできるようになり、リストアのための特別な操作が不要になる。
【０１１３】
図２８は、ブロック管理部２２がバックアップ媒体１５をファイルシステム上にマウントする処理を示している。図２８において、ディスク１３のデータ（斜線部分）がバックアップ媒体１５にコピーされた後、計算機１１からディスク１３の参照要求を受け取ると、ブロック管理部２２は、バックアップ媒体１５上の対応するデータを計算機１１に返す。
【０１１４】
図２９は、このような参照処理のフローチャートである。ブロック管理部２２は、まず、計算機１１からファイルの読み込み要求を受け取る（ステップＳ８１）。読み込み要求には、ファイル名と読み込み領域のオフセットａおよびサイズＳが含まれている。次に、バックアップ媒体１５のメタ情報を参照し（ステップＳ８２）、対応するファイルのａ〜ａ＋Ｓの範囲に割当てられているブロックの番号＃ｘを求める（ステップＳ８３）。
【０１１５】
次に、そのブロック番号＃ｘのデータが保存されているバックアップ媒体１５上のブロックの番号＃ｙを求め（ステップＳ８４）、そのブロックのデータを読んで（ステップＳ８５）、読み込み要求に応答し（ステップＳ８６）、処理を終了する。
【０１１６】
また、（ｃ．２）の動作では、差分バックアップのリストア時に、必要に応じて、複数の差分バックアップデータを最新のものから順に探索しながら遡る。差分バックアップの場合、複数の世代のバックアップデータのうちのいずれかに必要なデータが保存されているため、それらのバックアップデータを探索することで、すべてのデータを計算機１１に提示することができる。
【０１１７】
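FIG. 30 shows such generation management. In FIG. 30, the backup data of each generation is stored on a different backup medium 15. Block management information 61 maps file names to device names and block numbers of the disk 13. Block management information 62 is provided for each generation of backup data and maps device names and block numbers of the disk 13 to the identification information and block numbers of the backup medium 15.
【０１１８】
On receiving a file access request from a computer 11, the block management unit 22 refers to the block management information 61, obtains the device name and block number corresponding to the file name, and passes them to the medium control unit 24.
【０１１９】
The medium control unit 24 manages the backup data of the generations in order of creation, and refers to the block management information 62 of the newest generation G1 to check whether it contains the given device name and block number (block information). If it does, the unit obtains the corresponding block number of the backup medium 15 and refers to the backup medium 15 of generation G1. If it does not, the unit refers to the block management information 62 of the preceding generation G2 and checks whether the block information is present there.
【０１２０】
By repeating this processing and going back one generation at a time, the data corresponding to the given block information can be referenced on the backup medium 15 of one of the generations. Thus, even when differential backups have been performed, past backup data can be used to present all of the data to the user.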
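The search through the generations can be sketched as a lookup over a list ordered from newest to oldest. The data layout (a list of pairs of a medium identifier and a dictionary standing in for block management information 62) and the name `locate_block` are assumptions made for illustration.

```python
def locate_block(generations, device, block_no):
    """Find the newest generation that holds the given disk block.

    generations is ordered newest first; each element is a pair
    (medium_id, block_info), where block_info maps (device, block_no) on the
    disk to a block number on that generation's backup medium.
    """
    for medium_id, block_info in generations:
        key = (device, block_no)
        if key in block_info:
            return medium_id, block_info[key]
    # With a full backup as the oldest generation this is not expected.
    raise KeyError(f"block {block_no} of {device} is in no generation")
```

The worst case visits every generation, which is the cost that merging differential backups in advance is meant to reduce.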
【０１２１】
図３１は、差分バックアップのリストアの例を示している。図３１において、全体バックアップデータ７１は、最初の世代Ｇ３のバックアップデータを表し、差分バックアップデータ７２、７３は、それぞれ、世代Ｇ２、Ｇ１における差分を表す。斜線部分は、バックアップデータが存在するブロックを表している。例えば、差分バックアップデータ７２においては、ブロック８１、８２が変更されたデータに対応し、ブロック８３が新たに追加されたデータに対応する。
【０１２２】
リストア時に、ブロック９３、９４のデータが要求されると、世代Ｇ１の差分バックアップデータ７３の対応するブロックが参照され、ブロック９２、９７のデータが要求されると、世代Ｇ２まで遡って、差分バックアップデータ７２の対応するブロックが参照される。また、ブロック９１、９５、９６のデータが要求されると、世代Ｇ３まで遡って、全体バックアップデータ７１の対応するブロックが参照される。
【０１２３】
さらに、図２５に示したようなマージ処理が行われた場合は、マージされた２つのバックアップデータの代わりに、マージにより生成されたバックアップデータを用いて、同様のリストア動作が行われる。
【０１２４】
また、（ｃ．３）および（ｃ．４）の動作では、バックアップデータがテープ１６に保存されている場合、バックアップ媒体１５を、クラスタ内のすべての計算機１１からアクセス可能なバッファとして用いる。そして、計算機１１がバックアップデータを参照したとき、必要なブロックのみをバックアップ媒体１５にロードし、それらのブロックをキャッシュとして利用する。これにより、必要なデータのみをバックアップ媒体１５上に配置することができ、頻繁にアクセスされるデータへのアクセス効率が向上する。
【０１２５】
図３２は、バックアップ媒体１５をバッファとして利用する処理を示している。ここでは、図２の構成では計算機１２のみがテープ１６に接続されているので、この計算機１２のテープ制御部２７が、テープ１６から必要なブロックのデータを読み出して、バックアップ媒体１５上に配置する。バックアップ媒体１５としては、例えば、ディスクが用いられる。こうしてバックアップ媒体１５にロードされたデータを参照することで、テープ１６に接続されていない計算機１１でも、テープ１６に保存されたバックアップデータを読むことが可能になる。
【０１２６】
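FIG. 33 is a flowchart of the backup data reference process. The processing of steps S91 to S93 in FIG. 33 is the same as that of steps S81 to S83 in FIG. 29. Next, the block management unit 22 obtains the number #y of the block on the tape 16 in which the data of block number #x obtained in step S93 is stored (step S94), and checks whether a cache of that block exists on the backup medium 15 (step S95).
【０１２７】
If the cache is not on the backup medium 15, the tape control unit 27 reads the data of that block from the tape 16 and writes it to a free block #z on the backup medium 15 (step S96). The block management unit 22 then responds to the read request using the written data (step S97) and ends the process. If the cache is on the backup medium 15, the block management unit 22 responds to the read request using that data (step S97) and ends the process.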
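Steps S95 to S97 describe a read-through cache in front of the tape. A minimal sketch follows, in which the class `TapeBlockCache` and the `read_tape_block` callback are hypothetical. The sketch assumes the buffer never fills up, so it always takes the next unused block as #z and performs no eviction; the description does not say how a full buffer is handled.

```python
class TapeBlockCache:
    """Backup medium used as a buffer for blocks loaded from tape."""

    def __init__(self, read_tape_block):
        # read_tape_block(y) is a hypothetical callback returning the data
        # of tape block #y; it stands in for the tape control unit.
        self._read_tape_block = read_tape_block
        self._buffer = {}   # buffer block #z -> data
        self._index = {}    # tape block #y -> buffer block #z

    def read(self, tape_block):
        z = self._index.get(tape_block)           # S95: already cached?
        if z is None:
            z = len(self._buffer)                 # next free block #z
            self._buffer[z] = self._read_tape_block(tape_block)  # S96
            self._index[tape_block] = z
        return self._buffer[z]                    # S97: answer the request
```

A second read of the same block is served from the buffer without touching the tape, which is the source of the improved access efficiency for frequently read blocks.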
【０１２８】
また、（ｃ．５）の動作では、ログ媒体１４のＢＩログがバックアップ媒体１５のバックアップデータに上書きされておらず、ログとバックアップデータが別々に保存されている場合、ログを先に参照し、必要に応じて、バックアップデータを後で参照する。リストア時にログを参照することで、バックアップ開始時点のイメージが再現され、データの矛盾が生じることがなくなる。
【０１２９】
図３４は、ログ媒体１４のログを参照する処理を示している。計算機１１からアクセス要求を受け取ると、ブロック管理部２２は、ログ管理部２６にログの存在と内容を確認して、要求されたデータのログがあれば、ログ媒体１４を参照し、ログがなければ、バックアップ媒体１５を参照する。
【０１３０】
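FIG. 35 is a flowchart of such a reference process. The processing of steps S101 to S103 in FIG. 35 is the same as that of steps S81 to S83 in FIG. 29. Next, the block management unit 22 asks the log management unit 26 whether a log exists for block number #x obtained in step S103 (step S104), and checks the answer (step S105).
【０１３１】
If a log exists for the queried block, it reads that log on the log medium 14; if there is no such log, it obtains the number #y of the corresponding block on the backup medium 15 and reads the backup data of that block (step S107). It then responds to the read request (step S108) and ends the process.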
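The log-first lookup of FIG. 35 can be sketched as follows. The data structures are assumptions: the log medium is modelled as a dictionary from disk block number #x to its before-image, `backup_index` maps #x to the backup block number #y, and the backup medium is an indexable sequence. The name `read_block` is hypothetical.

```python
def read_block(block_no, log_medium, backup_index, backup_medium):
    """Return the contents of disk block #x as of the start of the backup."""
    # S104, S105: ask whether a log exists for the block.
    if block_no in log_medium:
        # A log exists: the before-image takes precedence.
        return log_medium[block_no]
    # S107: no log, so read the corresponding block #y of the backup data.
    return backup_medium[backup_index[block_no]]
```

The log must be consulted first because a logged block was modified while the copy was in progress, so the copy on the backup medium may hold contents written after the backup started.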
【０１３２】
ところで、図２のクラスタシステムでは、キャッシュ制御部２１、ブロック管理部２２、グループ管理部２３、媒体制御部２４、コピー管理部２５、ログ管理部２６、およびテープ制御部２７を、管理用の計算機１２に設けているが、これらの制御部および管理部の一部または全部を、複数の計算機１１に分散して設けてもよい。
【０１３３】
図３６は、このようなクラスタシステムにおけるバックアップ動作を示している。図３６においては、コピー管理部２５、ログ管理部２６、およびテープ制御部２７が計算機１１に分散して設けられ、コピー管理部２５を有する計算機１１が、ディスク１３の内容をバックアップ媒体１５にコピーする。
【０１３４】
各計算機１１は、このコピーの最中に発生するＢＩログを、ログ管理部２６を有する計算機１１に転送し、この計算機１１がログを編集してログ媒体１４に書く。そして、バックアップデータとログは、それぞれ、テープ制御部２７を有する計算機１１により、テープ１６に書かれる。
【０１３５】
図３７は、図３６のクラスタシステムにおけるリストア動作を示している。図３７においては、テープ制御部２７を有する計算機１１が、他の計算機１１からの読み込み要求を受け取り、必要なバックアップデータとログを別々にテープ１６から読んで、それぞれ、バックアップ媒体１５とログ媒体１４上に展開する。
【０１３６】
展開が済むと、読み込みを要求した計算機１１は、ログがあれば、ログ媒体１４からログを読む。ログがなければ、ブロック管理部２２からファイル名に対応するブロック番号を受け取り、バックアップ媒体１５からバックアップデータを読む。
【０１３７】
ただし、テープ制御部２７を有する計算機１１自身が読み込みを要求する場合は、バックアップデータとログをバックアップ媒体１５とログ媒体１４上に展開する必要はない。
【０１３８】
図３８は、図３６のクラスタシステムにおいて、バックアップデータをテープ１６に書く前にログの上書きを行う場合を示している。図３８においては、バックアップデータとログが確定した段階で、ログ管理部２６は、ログ媒体１４のログを、バックアップ媒体１５のバックアップデータに上書きする。上書きが済むと、テープ制御部２７は、上書きされたバックアップデータをテープ１６に書く。
【０１３９】
図３９は、図３８のクラスタシステムにおけるリストア動作を示している。図３９においては、テープ制御部２７は、ログが上書きされたバックアップデータをテープ１６から読んで、バックアップ媒体１５上に展開する。そして、読み込みを要求した計算機１１は、バックアップ媒体１５から必要なデータを読む。ただし、テープ制御部２７を有する計算機１１自身が読み込みを要求する場合は、データをバックアップ媒体１５上に展開する必要はない。
【０１４０】
図２の計算機１１、１２は、例えば、図４０に示すような情報処理装置を用いて構成することができる。図４０の情報処理装置は、ＣＰＵ（中央処理装置）１１１、メモリ１１２、入力装置１１３、出力装置１１４、外部記憶装置１１５、媒体駆動装置１１６、およびネットワーク接続装置１１７を備え、それらはバス１１８により互いに接続されている。
【０１４１】
メモリ１１２は、例えば、ＲＯＭ（read only memory）、ＲＡＭ（random access memory）等を含み、処理に用いられるプログラムとデータを格納する。ＣＰＵ１１１は、メモリ１１２を利用してプログラムを実行することにより、必要な処理を行う。
【０１４２】
図２のキャッシュ制御部２１、ブロック管理部２２、グループ管理部２３、媒体制御部２４、コピー管理部２５、ログ管理部２６、およびテープ制御部２７は、例えば、プログラムにより記述されたソフトウェアコンポーネントとしてメモリ１１２に格納される。
【０１４３】
入力装置１１３は、例えば、キーボード、ポインティングデバイス、タッチパネル等であり、ユーザからの指示や情報の入力に用いられる。出力装置１１４は、例えば、ディスプレイ、プリンタ、スピーカ等であり、ユーザへの問い合わせや処理結果の出力に用いられる。
【０１４４】
外部記憶装置１１５は、例えば、磁気ディスク装置、光ディスク装置、光磁気ディスク（magneto-optical disk）装置、テープ装置等である。情報処理装置は、この外部記憶装置１１５に、上述のプログラムとデータを保存しておき、必要に応じて、それらをメモリ１１２にロードして使用する。また、外部記憶装置１１５は、共有ディスク１３、ログ媒体１４、バックアップ媒体１５、テープ１６等として用いられる。
【０１４５】
媒体駆動装置１１６は、可搬記録媒体１１９を駆動し、その記録内容にアクセスする。可搬記録媒体１１９としては、メモリカード、フロッピーディスク、ＣＤ−ＲＯＭ（compact disk read only memory ）、光ディスク、光磁気ディスク等、任意のコンピュータ読み取り可能な記録媒体が用いられる。ユーザは、この可搬記録媒体１１９に上述のプログラムとデータを格納しておき、必要に応じて、それらをメモリ１１２にロードして使用する。
【０１４６】
ネットワーク接続装置１１７は、計算機間を結ぶ通信ネットワークへの接続に用いられ、通信に伴うデータ変換を行う。情報処理装置は、上述のプログラムとデータをネットワーク接続装置１１７を介して他の装置から受け取り、必要に応じて、それらをメモリ１１２にロードして使用する。
【０１４７】
図４１は、図４０の情報処理装置にプログラムとデータを供給することのできるコンピュータ読み取り可能な記録媒体を示している。可搬記録媒体１１９や外部のデータベース１２０に保存されたプログラムとデータは、メモリ１１２にロードされる。そして、ＣＰＵ１１１は、そのデータを用いてそのプログラムを実行し、必要な処理を行う。
【０１４８】
【発明の効果】
本発明によれば、クラスタシステムのような、ディスク共用ファイルシステムを有する計算機システムにおいて、システムの稼働中にデータを効率的にバックアップすることができる。また、リストア時に、バックアップデータを効率的に参照することができる。
【図面の簡単な説明】
【図１】本発明のバックアップシステムの原理図である。
【図２】クラスタシステムの構成図である。
【図３】キャッシュ制御処理のフローチャートである。
【図４】第１のログ管理を示す図である。
【図５】ログ編集処理のフローチャートである。
【図６】第２のログ管理を示す図である。
【図７】第１のログ記録処理のフローチャートである。
【図８】第３のログ管理を示す図である。
【図９】第２のログ記録処理のフローチャートである。
【図１０】ログ管理ファイルを示す図である。
【図１１】第１のコピー処理のフローチャートである。
【図１２】第４のログ管理を示す図である。
【図１３】第３のログ記録処理のフローチャートである。
【図１４】第２のコピー処理のフローチャートである。
【図１５】第５のログ管理を示す図である。
【図１６】ログ媒体のデータ形式を示す図である。
【図１７】ブロック管理とグループ管理を示す図である。
【図１８】空き領域管理表を示す図である。
【図１９】使用済みブロックリストを示す図である。
【図２０】変更ブロックリスト更新処理のフローチャートである。
【図２１】ディレクトリツリーを示す図である。
【図２２】第１のグループリストを示す図である。
【図２３】第２のグループリストを示す図である。
【図２４】グループブロックリストを示す図である。
【図２５】差分バックアップデータのマージを示す図である。
【図２６】差分バックアップ開始時点の変更を示す図である。
【図２７】コピー管理を示す図である。
【図２８】バックアップ媒体のマウントを示す図である。
【図２９】第１の参照処理のフローチャートである。
【図３０】世代管理を示す図である。
【図３１】差分バックアップのリストアを示す図である。
【図３２】バッファとしてのバックアップ媒体を示す図である。
【図３３】第２の参照処理のフローチャートである。
【図３４】ログの参照を示す図である。
【図３５】第３の参照処理のフローチャートである。
【図３６】第１のバックアップを示す図である。
【図３７】第１のリストアを示す図である。
【図３８】第２のバックアップを示す図である。
【図３９】第２のリストアを示す図である。
【図４０】情報処理装置の構成図である。
【図４１】記録媒体を示す図である。
【符号の説明】
１コピー手段
２制御手段
３、１１、１２計算機
４共有媒体
５、１５バックアップ媒体
６ログ管理手段
７生成手段
８グループ管理手段
９領域管理手段
１３共有ディスク
１４ログ媒体
１６テープ
２１キャッシュ制御部
２２ブロック管理部
２３グループ管理部
２４媒体制御部
２５コピー管理部
２６ログ管理部
２７テープ制御部
２８キャッシュ
３１一時ログ媒体
３２、３４ログ管理ファイル
３３クロック部
４１使用済みブロックリスト
４２変更ブロックリスト
４３グループブロックリスト
４４グループ変更ブロックリスト
５１、７１全体バックアップデータ
５２、５３、７２、７３差分バックアップデータ
５４、５５バックアップデータ
６１、６２ブロック管理情報
８１、８２、８３、９１、９２、９３、９４、９５、９６、９７ブロック
１０１アクセス要求
１１１ＣＰＵ
１１２メモリ
１１３入力装置
１１４出力装置
１１５外部記憶装置
１１６媒体駆動装置
１１７ネットワーク接続装置
１１８バス
１１９可搬記録媒体
１２０データベース
[0001]
[Technical field to which the invention belongs]
The present invention relates to a system and method for backing up data stored on a recording medium, such as a disk, of a computer system and restoring it when necessary.
[0002]
[Prior art]
In a conventional computer system, when the file system performs a backup, it first checks, for each file, block information such as the addresses of the blocks in use. It then reads the file by reading the data of the corresponding blocks from the disk, and copies the data read to tape. Files were backed up by repeating this operation for each file.
[0003]
However, with this method, when a large number of files are backed up, access to the disk becomes close to random access, which degrades system performance.
[0004]
Therefore, to make backup more efficient, image backup, which directly copies the blocks occupied by files, came into use. In this method, instead of copying files selectively, the computer system copies the block areas occupied by files on the disk all at once. The backup is therefore performed with a single disk access, and processing becomes more efficient.
[0005]
[Problems to be solved by the invention]
However, the conventional backup method described above has the following problems.
[0006]
In conventional image backup, data could be copied in units of disks, but not in units of files or directories. As a result, even unnecessary data had to be copied. In addition, to restore backed-up data, all of the data had to be copied to a disk and expanded there.
[0007]
Further, when a backup is performed in a cluster system in which a plurality of computers share a disk and perform processing, the following problem occurs.
The cluster system includes a file system (disk shared file system) that allows a plurality of computers to access a shared disk simultaneously, and each computer has an area for caching write data. For this reason, simply copying the shared disk does not reflect the contents of the cached write data (write cache) in the copy, so an ordinary image backup cannot be performed.
[0008]
In addition, in a conventional computer system, in order to perform image backup while business operations are running (online), the file system, triggered by a change to data, saved the original data from before the change to another area and, after the disk had been copied, overwrote the copy with the saved original data. This made it possible to fix the contents of the backup data as of the start of the backup.
[0009]
However, in a cluster system, changes by multiple computers may occur almost simultaneously in the same file area, so the method of fixing the contents as of the start of the backup using the original data cannot be applied directly.
[0010]
Thus, the conventional backup method cannot efficiently back up a large amount of data in the cluster system. For this reason, an effective backup method in the cluster system has not been developed, and there is no method for efficiently browsing the backed up data.
[0011]
An object of the present invention is to provide a system and method for efficiently backing up data in a computer system having a disk shared file system.
[0012]
[Means for Solving the Problems]
FIG. 1 is a principle diagram of a backup system according to the present invention.
In the first aspect of the present invention, the backup system includes a copy unit 1 and a control unit 2 and backs up a shared medium 4 shared by a plurality of computers 3.
[0013]
The copy unit 1 copies a plurality of unit areas of the shared medium 4 to the backup medium 5 at once. The control means 2 manages the write data that each computer 3 writes to the shared medium 4 and reflects the write data of each computer 3 to the shared medium 4 at the time of backup.
[0014]
Each computer 3 holds the write data as a write cache when the data in the shared medium 4 is changed, and writes the contents to the shared medium 4 when the shared medium 4 becomes accessible. The control unit 2 manages the presence or absence of write data held by each computer 3 and performs control to write the write data held by the computer 3 into the shared medium 4 at the time of backup.
[0015]
The storage area of the shared medium 4 is divided into unit areas such as blocks, for example. After all the write data is written, the copy unit 1 copies a plurality of unit areas of the shared medium 4 to the backup medium 5 by a method such as an image backup, for example.
[0016]
According to such a backup system, it is possible to efficiently back up the shared medium 4 including the write data held by each computer 3 in the disk shared file system.
[0017]
In the second aspect of the present invention, the backup system includes a log management unit 6 and a generation unit 7, and performs an image backup of a shared medium 4 shared by a plurality of computers 3.
[0018]
When one of the plurality of computers 3 writes to a certain area of the shared medium 4, the log management unit 6 manages the image of that area before the write as log data on the computer 3 that performed the write, collects the logs managed by the plurality of computers 3, and generates an overall log. When there are a plurality of logs for the same area, the oldest log after the image backup start time is used. At the time of restoration, the generation unit 7 generates the data as of the image backup start time using the overall log.
[0019]
When a computer 3 changes data on the shared medium 4, the original data before the change is saved as a log. The log management unit 6 manages the logs of each computer 3 and collects the logs of two or more computers 3 to generate a log of the entire system. The generation unit 7 then determines the contents as of the backup start time by, for example, overwriting the backup data of the shared medium 4 with the overall log.
[0020]
According to such a backup system, the original data stored in response to the data change by the plurality of computers 3 is edited, and the entire log is generated. Therefore, even in the disk shared file system, backup can be performed efficiently while the system is operating.
[0021]
In the third aspect of the present invention, the backup system includes a copy unit 1 and a group management unit 8 and backs up a shared medium 4 shared by a plurality of computers 3.
[0022]
The group management unit 8 sets a group of files stored in the shared medium 4 and lists unit areas occupied by files included in the group. The copy unit 1 copies a plurality of listed unit areas to the backup medium 5 at once.
[0023]
The group management unit 8 sets a group including one or more files, and lists unit areas occupied by each file included in the group. Then, the copy unit 1 copies a plurality of listed unit areas to the backup medium 5 by a method such as image backup without distinguishing each file.
[0024]
According to such a backup system, it is possible to specify a file to be backed up in the disk shared file system, and there is no need to copy an unnecessary file. Therefore, the backup is made efficient.
[0025]
In the fourth aspect of the present invention, the backup system includes a copy unit 1 and an area management unit 9 and backs up a storage medium 4 that stores a file accessed from the computer 3.
[0026]
The area management unit 9 determines whether each unit area of the storage medium 4 is used and lists the used unit areas. The copy unit 1 copies a plurality of listed unit areas to the backup medium 5 at once.
[0027]
The area management unit 9 manages each unit area of the storage medium 4, determines whether each unit area is used as a file, and lists the unit areas occupied by the file. Then, the copy unit 1 copies a plurality of listed unit areas to the backup medium 5 by a method such as image backup without distinguishing each file.
[0028]
According to such a backup system, it is not necessary to copy a unit area that is not used as a file. Therefore, the backup in the file system is made efficient.
[0029]
In the fifth aspect of the present invention, the backup system includes a copy unit 1 and an area management unit 9 and backs up a storage medium 4 that stores a file accessed from the computer 3.
[0030]
The area management unit 9 lists unit areas of the storage medium 4 that have been changed after the previous backup as differences. The copying unit 1 copies a plurality of listed unit areas collectively to the backup medium 5 as differential backup data.
[0031]
The backup system backs up the storage medium 4 in time series at an appropriate timing. The area management unit 9 manages each unit area of the storage medium 4 and lists a unit area changed after the previous backup or a unit area newly used as a file. Then, the copy unit 1 copies a plurality of listed unit areas to the backup medium 5 by a method such as image backup without distinguishing each file. Thereby, only the changed unit area is stored as a difference.
[0032]
According to such a backup system, it is not necessary to copy a unit area whose data has not changed since the previous backup. Therefore, the backup in the file system is made efficient.
[0033]
For example, the shared medium 4 in FIG. 1 corresponds to the shared disk 13 in FIG. 2 described later, and the backup medium 5 in FIG. 1 corresponds to the backup medium 15 or the tape 16 in FIG. 2. The copy unit 1 in FIG. 1 corresponds to the copy management unit 25 in FIG. 2, the control unit 2 in FIG. 1 corresponds to the cache control unit 21 in FIG. 2, and the log management unit 6 in FIG. 1 corresponds to the log management unit 26 in FIG. 2. The generation unit 7 and the area management unit 9 in FIG. 1 correspond to the block management unit 22 in FIG. 2, and the group management unit 8 in FIG. 1 corresponds to the group management unit 23 in FIG. 2.
[0034]
[Detailed Description of the Invention]
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
The computer system of this embodiment includes a plurality of computers, a shared disk shared by these computers, and a file system for allowing the plurality of computers to simultaneously access the shared disk.
[0035]
This computer system directly copies the entire contents of a disk to a backup medium by means of an improved image backup at the time of data backup. Further, the computer access to the disk is detected, and the original data before the access occurs is stored in the log medium. Then, using the data stored in the log medium, an image (data content) at the time of starting backup is determined. Hereinafter, this original data will be referred to as a Before Image Log (BI log) or simply a log.
[0036]
In the image backup of the present embodiment, the main features regarding the backup operation are as follows.
(A.1) The contents of the write cache on the memory of each computer are managed, and the contents of the write cache of each computer are reflected on the disk at the time of backup. This prevents data inconsistencies in the cluster from occurring.
(A.2) Each computer that attempts to write to the disk keeps a BI log, and the BI logs of the multiple computers are merged at the time of backup. As a result, a log of the entire system, in which all the logs are collected, is fixed after the backup is completed, and the data is finalized using this log, so that corruption of the data by writes during the backup (copying) is prevented.
(A.3) Each computer that intends to write to the disk notifies the specific computer responsible for the BI log of the write, and the computer manages the BI log. As a result, logs of a plurality of computers are sent to a specific computer, merged, and stored in a log medium. By confirming data using this log, destruction of data due to writing during backup is prevented.
(A.4) The same medium as the backup data is selected as the medium for storing the BI log. Thereby, the log can be saved at the same time as the backup.
(A.5) A medium different from backup data is selected as a medium for storing the BI log. As a result, even when the medium for storing backup data cannot be overwritten, a log can be left.
(A.6) The BI log is overwritten on the backup data and then stored in the backup medium. This eliminates the need to refer to a plurality of media during restoration.
(A.7) The address information of the backup data to be overwritten is written in the BI log. As a result, it is possible to overwrite the backup data by simply reading the log without accessing the log management information.
(A.8) List used blocks among the blocks on the disk, and copy only necessary parts. Thereby, since the amount of data to be copied is reduced, it is possible to shorten the copy time and the required medium capacity.
(A.9) Of the used blocks on the disk, those that have changed since the previous backup (difference) are listed, and only the changed part is copied. Since such differential backup reduces the amount of data to be copied, it is possible to shorten the copy time and the required medium capacity.
(A.10) After backup is completed and before restoration, the differential backup data sets, or the differential backup data and the entire backup data, are merged in units of blocks. Restoration is made more efficient by consolidating the differential backup data.
(A.11) The recording start time of differential backup data can be selected. By using only differential backup data after the selected point in time of restoration, changes made before that point can be ignored, and flexible restoration becomes possible.
(A.12) The disk copy process is distributed to a plurality of computers in the cluster. This distributes the load and shortens the copy time.
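As an illustration of the block-by-block merge in (A.10), the following is a minimal sketch and not the implementation of the embodiment. It assumes that each backup can be modeled as a mapping from block number to block contents and that the differential backups are given oldest first; the name `merge_backups` is hypothetical.

```python
from typing import Dict, List


def merge_backups(full_backup: Dict[int, bytes],
                  differentials: List[Dict[int, bytes]]) -> Dict[int, bytes]:
    """Merge an entire backup with differential backups, block by block.

    Assumption: `differentials` is ordered oldest first, so a newer
    generation's copy of a block replaces any older copy of that block.
    """
    merged = dict(full_backup)         # start from the entire backup
    for generation in differentials:   # oldest to newest
        merged.update(generation)      # newer blocks take precedence
    return merged
```

After such a merge, restoration only has to consult a single block map instead of every generation.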
[0037]
In the image backup according to the present embodiment, main characteristics regarding file grouping are as follows.
(B.1) Files are grouped, the blocks occupied by the files included in the group are managed, and only the blocks used by those files are copied during backup. This makes it possible to set file groups and back up files in groups.
(B.2) Files are grouped in units of directories, and all files included in the directory are set as a group. This makes it possible to set file groups and back up files in groups.
(B.3) A specific file or directory under a directory set as a group is excluded from the group. Thereby, a specific file included in a directory set as a certain group can be excluded from the group, and a flexible group setting becomes possible.
(B.4) Set up a plurality of groups and perform backups according to different schedules. This enables flexible group setting and backup.
(B.5) One file is allowed to belong to a plurality of groups. This enables flexible group setting and backup.
[0038]
In the image backup according to the present embodiment, main features regarding the restore operation are as follows.
(C.1) The file system mounts the medium for storing the backup data as it is instead of the disk. As a result, a medium for storing backup data can be accessed instead of a disk, and a special operation for restoration is not required.
(C.2) When the above-described differential backup is performed, the backup data of each generation is searched, tracing back as far as the entire backup data as necessary. Thereby, when a block of a file is not included in the latest differential backup data, the block stored by a previous backup can be referred to and used. Therefore, at the time of restoration, it appears to the user as if all the data exists.
(C.3) Load only the necessary blocks from the backup tape into the disk used as a buffer, and use those blocks as a cache. As a result, only necessary blocks can be arranged on the buffer, and the access efficiency to frequently accessed blocks is improved.
(C.4) Load only the necessary blocks from the backup tape onto the disk and show the data to a computer not connected to the tape. This makes it possible for a computer that does not have a tape in the cluster to read backup data stored on the tape.
(C.5) When the BI log is not overwritten on the backup data, the BI log is referred to first, and the backup data is referred to later if necessary. If the log and backup data are stored separately on the medium, the existence and contents of the log are confirmed, and if there is a log, the log is referenced, and if there is no log, the backup data is referenced. As a result, no contradiction occurs in the restored data.
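The lookup order described in (C.2) and (C.5) can be sketched as follows. This is only an illustrative model under stated assumptions: each backup generation is represented by a hypothetical `Generation` object holding a block map and an optional BI log keyed by block number, and the generations are listed newest first with the entire backup last.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Generation:
    """One backup generation (hypothetical model, not from the specification)."""
    blocks: Dict[int, bytes]                               # copied block data
    bi_log: Dict[int, bytes] = field(default_factory=dict)  # BI log not yet merged


def read_block(block_no: int, generations: List[Generation]) -> Optional[bytes]:
    """Look up one block; `generations` is ordered newest first."""
    for generation in generations:
        # (C.5) If a BI log entry exists for the block, refer to it first.
        if block_no in generation.bi_log:
            return generation.bi_log[block_no]
        # Otherwise refer to the backup data of this generation.
        if block_no in generation.blocks:
            return generation.blocks[block_no]
        # (C.2) Not found here: trace back to the previous generation.
    return None  # no generation holds the block
```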
[0039]
FIG. 2 is a configuration diagram of a cluster system that performs image backup as described above. The cluster system in FIG. 2 includes a plurality of computers 11 and 12, a shared disk 13, a log medium 14, a backup medium 15, and a tape 16.
[0040]
The plurality of computers 11 share the shared disk 13 and perform data processing while accessing files stored on the disk 13. The computers 11 and 12 and the shared disk 13 constitute a cluster, and generally one or more shared disks 13 are provided in the cluster. The log medium 14 stores the BI log of the computer 11, and the backup medium 15 and the tape 16 store backup data of files in the disk 13.
[0041]
The computer 12 is a computer that manages the cluster, and includes a cache control unit 21, a block management unit 22, a group management unit 23, a medium control unit 24, a copy management unit 25, a log management unit 26, and a tape control unit 27. These management units and control units correspond to, for example, software described by programs, and the block management unit 22 corresponds to the main part of the file system.
[0042]
The cache control unit 21 controls a cache 28 provided on the memory of each computer 11. In the cache 28, data to be written by the computer 11 with respect to the cluster file stored on the disk 13 is temporarily stored.
[0043]
In addition, the block management unit 22 performs block allocation to the file, and manages to which address of which disk 13 each block of the file is allocated. The group management unit 23 manages a user-defined group and files included in the group.
[0044]
The medium control unit 24 controls access to the backup medium 15, and the copy management unit 25 manages the data copy operation from the disk 13 to the backup medium 15. The log management unit 26 manages the BI log and the log medium 14 of each computer 11, and the tape control unit 27 controls access to the tape 16.
[0045]
According to such a cluster system, the block management unit 22 manages the blocks of the backup target file on the disk 13, and the copy management unit 25 copies these blocks to the backup medium 15, thereby performing backup. The BI log based on the writing of each computer 11 in the cluster generated during copying is copied to the log medium by the log management unit 26. The BI log stored in the log medium is later reflected in the backup medium 15 or is retained in the log format as it is.
[0046]
The log medium 14 and the backup medium 15 are non-volatile media such as disks and tapes. When the backup medium 15 is a disk, it is also used as a buffer for the tape 16, and the tape control unit 27 copies backup data from the backup medium 15 to the tape 16.
[0047]
First, the operations relating to the above-described features (a.1) to (a.7) will be described in detail with reference to FIGS.
The cache control unit 21 performs cache management of (a.1), and reflects the write data (write cache) in the cache 28 on the disk 13 at the time of backup. The cache control unit 21 manages a cache table in which the following information is registered for all the caches 28 in the cluster.
・ Computer name
・ File name
・ Area in the file (offset, size)
・ Whether the cache is dirty (whether write data remains without being reflected on the disk 13)
When a certain computer 11 generates a dirty cache, the cache control unit 21 instructs the other computers 11 to discard the write caches they hold for the corresponding area of the corresponding file. At the time of backup, each computer 11 is instructed to write all dirty caches to the disk 13. The computer 11 instructed to write writes the write cache in the cache 28 to the disk 13.
[0048]
In this way, when all the write caches in the cluster are reflected on the disk 13, an image backup is executed by the copy management unit 25, and the data in the disk 13 is copied to the backup medium 15. As a result, backup is performed without inconsistency of data in the cluster.
[0049]
FIG. 3 is a flowchart of processing in which the cache control unit 21 reflects the write cache on the disk 13. When backup is started, the cache control unit 21 first searches the cache table and checks whether or not there is a dirty cache remaining in the cluster (step S1).
[0050]
If a dirty cache remains, the computer name, the write destination file name, and the area information are acquired (step S2). Next, the computer is instructed to reflect the dirty cache in the corresponding area of the corresponding file on the disk 13 (step S3), and the processing from step S1 is repeated. When no dirty cache remains in step S1, the process is terminated.
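The loop of steps S1 to S3 can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the `CacheEntry` fields mirror the cache table items listed above, while the callback `instruct_write` and the assumption that the instructed computer finishes the write before the next search are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CacheEntry:
    """One row of the cache table."""
    computer: str
    file_name: str
    offset: int
    size: int
    dirty: bool


def flush_dirty_caches(cache_table: List[CacheEntry],
                       instruct_write: Callable[[CacheEntry], None]) -> int:
    """Repeat steps S1 to S3 until no dirty cache remains; return the count."""
    flushed = 0
    while True:
        # S1: search the cache table for a remaining dirty cache.
        entry = next((e for e in cache_table if e.dirty), None)
        if entry is None:
            return flushed
        # S2: the entry carries the computer name, file name, and area.
        # S3: instruct that computer to reflect the cache on the disk.
        instruct_write(entry)
        # Assumption: the write has completed when the call returns.
        entry.dirty = False
        flushed += 1
```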
[0051]
Further, the log management unit 26 performs log management of (a.2) or (a.3) regarding the BI log.
FIG. 4 shows the log management of (a.2). In FIG. 4, each computer 11 has its own temporary log medium 31 and log management file 32. After the backup is completed, the log management unit 26 collects the logs of all the computers 11, edits them into a log of the entire system, and stores it in the log medium 14. The log management file 32 includes a log list in which the following information is recorded for each log.
・ Device name of disk 13
・ Area (offset, size)
・ Time
Among these, the time represents the time when the log was generated and is used to determine its order relative to other logs. Here, instead of the actual time, the logical time generated by the clock unit 33 provided in the computer 12 is used. For example, the clock unit 33 generates a logical time “1” when the first log is generated, and thereafter increments the logical time by one each time a log is generated.
[0052]
Each computer 11 sends the logs of the temporary log medium 31 and the log list of the log management file 32 to the log management unit 26. If the received logs include multiple logs for the same area, the log management unit 26 edits the logs so that the oldest one is preferentially kept. By overwriting the backup medium 15 with the edited log, the image at the start of the backup is fixed, and changes caused by writes to the disk 13 during the backup can be canceled.
[0053]
FIG. 5 is a flowchart of log editing processing by the log management unit 26. First, the log management unit 26 receives logs and a log list from all the computers 11 in the cluster (step S11), sorts the received logs in chronological order, and prepares a work log list (step S12).
[0054]
Next, the oldest unselected log is selected (step S13), and it is checked whether or not a log for the same area is already in the work log list (step S14). If there is no log for the same area, the selected log is added to the work log list (step S15), and if there is a log for the same area, the selected log is discarded (step S16).
[0055]
Next, it is checked whether or not there is an unselected log (step S17). If such a log remains, the processes in and after step S13 are repeated. When all the logs are selected, the logs included in the work log list are recorded on the log medium 14 (step S18), and the process ends.
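The editing procedure of steps S11 to S18 can be sketched as follows. This is a minimal illustration under stated assumptions: the `LogEntry` fields follow the log list items (device name, area, logical time), two logs are treated as covering the "same area" only when device, offset, and size all match (partially overlapping areas are not handled), and the function returns the work log list instead of recording it on the log medium 14.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class LogEntry:
    device: str   # device name of the disk
    offset: int   # start of the area
    size: int     # length of the area
    time: int     # logical time at which the log was generated
    data: bytes   # image of the area before the write


def merge_logs(per_computer_logs: Iterable[Iterable[LogEntry]]) -> List[LogEntry]:
    """Merge the logs of all computers, keeping the oldest log per area."""
    # S11: receive the logs from all computers.
    received = [entry for logs in per_computer_logs for entry in logs]
    # S12: sort chronologically and prepare an empty work log list.
    received.sort(key=lambda entry: entry.time)
    work_log_list: List[LogEntry] = []
    seen_areas = set()
    # S13 and S17: select the logs one by one, oldest first.
    for entry in received:
        area = (entry.device, entry.offset, entry.size)
        if area in seen_areas:         # S14: same area already listed
            continue                   # S16: discard the newer log
        seen_areas.add(area)
        work_log_list.append(entry)    # S15: keep the oldest log
    # S18 would record work_log_list on the log medium.
    return work_log_list
```

Because only the oldest before-image of each area is kept, overwriting the backup data with the merged log restores every area to its contents at the start of the backup.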
[0056]
FIG. 6 shows the log management of (a.3). In FIG. 6, each computer 11 sends a log to the log management unit 26, and the log management unit 26 records the received log in the log medium 14 and stores the corresponding device name and area (offset, size) in the log management file 34. Thus, in the log management of FIG. 6, the log is placed on the log medium 14 from the beginning. By overwriting the backup medium 15 with the log thus recorded, the image at the start of the backup is fixed.
[0057]
FIG. 7 is a flowchart of log recording processing by the log management unit 26. The log management unit 26 first receives a log from the computers 11 in the cluster (step S21), and checks whether a log in the same area is already stored in the log medium 14 (step S22). If the log of the same area is not saved, the received log is recorded in the log medium 14 (step S23), and if the log of the same area is saved, the received log is not recorded.
[0058]
Next, it is checked whether or not logs have been received from all the computers 11 in the cluster (step S24). If there is a computer 11 that has not sent logs, the processing from step S21 is repeated. Then, when logs are received from all the computers 11, the processing is terminated.
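The recording rule of steps S21 to S23 can be sketched as follows. This is only an illustration: the class `LogRecorder` and its in-memory dictionaries are hypothetical stand-ins for the log medium 14 and the log management file 34, and "same area" is assumed to mean an identical device name, offset, and size.

```python
from typing import Dict, List, Tuple

Area = Tuple[str, int, int]  # (device name, offset, size)


class LogRecorder:
    """Keeps only the first log received for each area."""

    def __init__(self) -> None:
        self.log_medium: Dict[Area, bytes] = {}   # stand-in for log medium 14
        self.log_management_file: List[Area] = []  # stand-in for file 34

    def receive(self, device: str, offset: int, size: int, data: bytes) -> bool:
        """S21: receive one log; return True if it was recorded."""
        area = (device, offset, size)
        if area in self.log_medium:    # S22: same area already saved
            return False               # the received log is not recorded
        self.log_medium[area] = data   # S23: record the log
        self.log_management_file.append(area)
        return True
```

Since logs arrive as writes occur, the first log for an area is also its oldest before-image, so no sorting step is needed in this variant.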
[0059]
The log management of FIG. 4 has an advantage that the communication cost is low because the logs of the computers 11 are collectively sent to the log management unit 26 after the backup is completed. However, since the temporary log medium 31 is required, the hardware cost increases, and the log must be edited after the backup is completed, so that post-processing is required.
[0060]
On the other hand, the log management of FIG. 6 has an advantage that the temporary log medium 31 and post-processing are unnecessary. However, each time a log is generated in each computer 11, the log is sent to the log management unit 26, so the communication cost increases compared to the case of FIG.
[0061]
The log management in FIG. 6 can be further classified into (a.4) or (a.5) log management. In the log management (a.4), a log is stored in the backup medium 15 instead of the log medium 14, and in the log management (a.5), the log is stored in the log medium 14.
[0062]
FIG. 8 shows the log management of (a.4). When the backup medium 15 that is the copy destination of the disk 13 is not a tape but a disk used as a buffer for the tape 16, partial overwriting of the backup medium 15 is possible. Therefore, the copy management unit 25 and the log management unit 26 cooperate so that, while the disk 13 to be backed up is being copied, logs are recorded on the backup medium 15 concurrently with the copy.
[0063]
At this time, first, after the log management unit 26 records the log on the backup medium 15, the copy management unit 25 copies the data on the disk 13 to the backup medium 15 for an area where no log exists.
[0064]
FIG. 9 is a flowchart of log recording processing by the log management unit 26. First, the log management unit 26 records the management information of the log to be stored in the log management file 34 as shown in FIG. 10 (step S31), and copies the log to the backup medium 15 (step S32). Next, it is checked whether there is another log that has not been copied (step S33). If there is such a log, the processing from step S31 onward is repeated. Then, when all the logs have been copied, the process is terminated.
[0065]
In the log management file of FIG. 10, the device name, the original address, and the length are recorded for each log. The device name represents identification information of the corresponding disk 13, and the original address and length represent the offset and size of the corresponding area, respectively.
[0066]
FIG. 11 is a flowchart of copy processing by the copy management unit 25. First, the copy management unit 25 starts copying using the start address of the disk 13 to be backed up as the current address (step S41), and checks whether the current address is the end address (step S42).
[0067]
If the current address is not the end address, it is next checked whether or not the address exists in the log management file 34 (step S43). If the current address does not exist in the log management file 34, the block at that address is copied to the backup medium 15 (step S44), and if it exists in the log management file 34, the block at that address is not copied.
[0068]
Next, the next address is set as the current address (step S45), and the processes after step S42 are repeated. Then, when the current address matches the end address in step S42, the process ends.
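The copy loop of steps S41 to S45 can be sketched as follows. This is a simplified illustration, assuming that addresses are block indexes, that the disk and the backup medium are modeled as Python lists of equal length, and that the set `logged_addresses` (a hypothetical name) represents the addresses registered in the log management file 34.

```python
from typing import List, Optional, Set


def copy_disk(disk: List[bytes],
              backup: List[Optional[bytes]],
              logged_addresses: Set[int]) -> int:
    """Copy every block that has no log entry; return the number copied."""
    copied = 0
    address = 0                  # S41: start from the start address
    end_address = len(disk)
    while address != end_address:            # S42: stop at the end address
        if address not in logged_addresses:  # S43: is there a log entry?
            backup[address] = disk[address]  # S44: copy the block
            copied += 1
        # Blocks with a log entry are skipped, because the backup medium
        # already holds their before-image written by the log management.
        address += 1                         # S45: advance to the next address
    return copied
```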
[0069]
FIG. 12 shows the log management of (a.5). When the backup medium 15 is a medium that cannot be partially overwritten, such as a tape, a log medium 14 different from the backup medium 15 is prepared, and only the log is placed on the medium. Thereby, a log can be left on a medium different from the backup medium 15. At this time, the log management unit 26 and the copy management unit 25 respectively perform log recording and disk copy independently.
[0070]
FIG. 13 is a flowchart of log recording processing by the log management unit 26. The processes in steps S51 and S53 in FIG. 13 are the same as the processes in steps S31 and S33 in FIG. 9, respectively. After the process of step S51, the log management unit 26 copies the log to the log medium 14 (step S52), and performs the process of step S53.
[0071]
FIG. 14 is a flowchart of copy processing by the copy management unit 25. The processes in steps S61, S62, S63, and S64 in FIG. 14 are the same as the processes in steps S41, S42, S44, and S45 in FIG. 11, respectively. In this case, all blocks of the disk 13 to be backed up are copied to the backup medium 15.
[0072]
In the log management of (a.6), the log management unit 26 overwrites the BI log on the backup medium 15 and saves it at the time of backup. If the log and the backup data are merged and stored in advance as described above, only the backup medium 15 needs to be referred to at the time of restoration, and there is no need to refer to a plurality of media. Therefore, restoration is made efficient.
[0073]
FIG. 15 shows the log management of (a.6). In FIG. 15, the log management unit 26 merges the log and the backup data by overwriting the corresponding areas of the backup medium 15 with the plurality of logs of the log medium 14 while referring to the log management file 34. Thereafter, the tape control unit 27 stores the data of the backup medium 15 on the tape 16.
[0074]
In the log management of (a.7), the log management unit 26 records the address information of the backup data to be overwritten together with the BI log data at the time of backup. Ordinarily, a log cannot be accessed without referring to the log management file, which holds the management information of the log; however, if the management information is written in the log itself, the backup data can be overwritten with the log simply by reading the log. Therefore, the log can be resolved without referring to the log management information, and log management is made efficient.
[0075]
FIG. 16 shows the data format of such a log medium. In FIG. 16, the original address and length represent the offset and size of the corresponding overwrite destination area, respectively, and these correspond to log management information.
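A self-describing log of this kind can be sketched as follows. The specification names only the original address, the length, and the log data; the concrete encoding used here (a little-endian 8-byte address and 4-byte length in front of each data area) and the function names are assumptions made purely for illustration.

```python
import struct

# Assumed header layout: 8-byte original address, 4-byte length.
HEADER = struct.Struct("<QI")


def pack_record(original_address: int, data: bytes) -> bytes:
    """Build one log record: header followed by the before-image data."""
    return HEADER.pack(original_address, len(data)) + data


def apply_log(log: bytes, backup: bytearray) -> int:
    """Overwrite the backup image with every record in the log.

    Only the log itself is read; no separate management file is consulted.
    Returns the number of records applied.
    """
    position = 0
    applied = 0
    while position < len(log):
        address, length = HEADER.unpack_from(log, position)
        position += HEADER.size
        backup[address:address + length] = log[position:position + length]
        position += length
        applied += 1
    return applied
```

For example, applying a log built from `pack_record(4, b"xxxx")` to a 12-byte backup image replaces bytes 4 to 7 of the image and leaves the rest unchanged.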
[0076]
Next, the operations relating to the features (a.8) to (a.12) and (b.1) to (b.5) described above will be described in detail with reference to FIGS.
FIG. 17 illustrates block management (a.8) and (a.9) and group management (b.1) to (b.5). The used block list 41 is used in block management (a.8), and the changed block list 42 is used in block management (a.9). The group block list 43 and the group change block list 44 are used in group management (b.1) to (b.5).
[0077]
In the block management (a.8), the block management unit 22 records and manages the blocks allocated to the files on the disk 13 in the used block list 41. The block management unit 22 notifies the copy management unit 25 of the blocks recorded in the used block list 41, and the copy management unit 25 copies only the notified block. Thus, by copying only the necessary blocks as backup data, the copy time is shortened and the required medium capacity is reduced.
[0078]
The used block list 41 is generated from, for example, a free space management table as shown in FIG. The free space management table in FIG. 18 is managed by the block management unit 22 and includes block identification information (block numbers) of all blocks on the disk 13 and flag information indicating whether each block is in use. Here, the flag “◯” represents an empty block, and the flag “×” represents a block in use.
[0079]
The block management unit 22 creates a used block list 41 by listing the block numbers of blocks in use from the free space management table at the time of backup. For example, a used block list as shown in FIG. 19 is generated from the free space management table of FIG.
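Deriving the used block list from the free space management table can be sketched as follows. This is a minimal illustration that assumes the table is modeled as a mapping from block number to a boolean, where `True` stands for the in-use flag “×” and `False` for the empty flag “◯”; the function name is hypothetical.

```python
from typing import Dict, List


def build_used_block_list(free_space_table: Dict[int, bool]) -> List[int]:
    """List the block numbers whose flag marks them as in use."""
    return sorted(block_no
                  for block_no, in_use in free_space_table.items()
                  if in_use)
```

Only the blocks in the returned list would then be passed to the copy management for copying.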
[0080]
In block management (a.9), the block management unit 22 manages the blocks on the disk 13 that have been changed after the previous backup by recording them in the changed block list 42. The block management unit 22 notifies the copy management unit 25 of the blocks recorded in the changed block list 42 as differences, and the copy management unit 25 copies only the notified block. Thereby, differential backup (incremental backup) is performed.
[0081]
For example, suppose that the file f uses blocks x, y, and z, and that after the previous backup, the block x is overwritten and a new block u is added. In this case, the blocks x and u are recorded in the changed block list 42, and the contents of these blocks are copied at the time of backup.
[0082]
By performing such differential backup instead of performing backup (entire backup) of all blocks on the disk 13, the copy time is shortened and the required medium capacity is reduced.
[0083]
FIG. 20 is a flowchart of changed block list update processing by the block management unit 22. In this process, the changed block is determined from the write request to the file, and the block is added to the changed block list 42.
[0084]
First, the block management unit 22 receives a file write request from the computer 11 (step S71). The write request includes the file name, the write area offset a, and the size S. Next, the block number of the block assigned to the range of a to a + S of the corresponding file is obtained (step S72), and the block number is added to the changed block list 42 (step S73). Then, the requested block is accessed, a process for writing is performed (step S74), and the process ends.
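Steps S72 and S73 can be sketched as follows. This is an illustration under stated assumptions: a fixed block size of 4096 bytes, a file modeled as an ordered list of the block numbers allocated to it, and hypothetical function names. The actual write of step S74 is left to the caller.

```python
from typing import List, Set

BLOCK_SIZE = 4096  # assumed block size in bytes


def blocks_for_range(file_blocks: List[int], offset: int, size: int,
                     block_size: int = BLOCK_SIZE) -> List[int]:
    """S72: block numbers allocated to the byte range a to a + S of a file."""
    if size <= 0:
        return []
    first = offset // block_size
    last = (offset + size - 1) // block_size
    return file_blocks[first:last + 1]


def handle_write_request(changed_block_list: Set[int], file_blocks: List[int],
                         offset: int, size: int) -> List[int]:
    """S71 to S73: register the blocks touched by a write as changed."""
    blocks = blocks_for_range(file_blocks, offset, size)
    changed_block_list.update(blocks)   # S73: add to the changed block list
    return blocks                       # S74: the caller writes these blocks
```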
[0085]
Further, in the group management of (b.1), the group management unit 23 manages files for each group in cooperation with the block management unit 22, and records and manages the blocks used by the files belonging to each group in the group block list 43. The group management unit 23 notifies the copy management unit 25 of the blocks recorded in the group block list 43 of the specific group, and the copy management unit 25 copies only the notified blocks. Thereby, the backup regarding a specific group is performed.
[0086]
In the group management of (b.2), the group management unit 23 groups files in units of directories, and sets all files included in the directories as groups.
[0087]
In the group management (b.3), the group management unit 23 excludes a specific file or directory under the directory set as a group from the group. Thereby, a specific file included in a directory set as a certain group can be excluded from the group.
[0088]
By the group management of (b.1) to (b.3) described above, it becomes possible for the user to group arbitrary files and perform file backup in units of groups.
[0089]
As an example of such group management, consider a case where a directory tree as shown in FIG. 21 exists on the file system. In FIG. 21, A, B, C, and D represent directory names, and a, b, c, d, e, and f represent file names. The user can set a group of files using an arbitrary directory name and file name.
[0090]
Here, it is assumed that the user inputs a group list as shown in FIG. 22 and instructs group setting. In FIG. 22, “dir_A / *” and “dir_C / *” indicate that all files in directories A and C are included in the group, and “X dir_D / file_d” indicates that file d in directory D is excluded from the group.
[0091]
At this time, the group management unit 23 selects the remaining files a, b, c, e, and f excluding the file d among all the files belonging to the directories A and C as a group. Then, the block number of the block allocated to each file is acquired from the block management unit 22 and recorded in the group block list 43.
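The selection of group members and the collection of their blocks can be sketched as follows. The directory layout, the path syntax, the block numbers, and the function names are assumptions made for this example; the actual tree of FIG. 21 and the format of the group list of FIG. 22 are not reproduced here.

```python
def under(path, prefix):
    """True if path is the entry itself or lies below the directory prefix."""
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")

def select_group(all_files, includes, excludes):
    """Files under any included directory, minus excluded files or directories."""
    return sorted(
        f for f in all_files
        if any(under(f, d) for d in includes)
        and not any(under(f, x) for x in excludes)
    )

def group_block_list(group_files, block_table):
    """Collect the block numbers of all files in the group (group block list 43)."""
    blocks = set()
    for f in group_files:
        blocks.update(block_table[f])
    return sorted(blocks)

# Hypothetical layout: directory D is assumed to sit below directory A.
block_table = {
    "A/a": [1, 2], "A/b": [3], "A/D/c": [7], "A/D/d": [8, 9],
    "C/e": [12], "C/f": [13, 14],
}
group = select_group(block_table, includes=["A", "C"], excludes=["A/D/d"])
blocks = group_block_list(group, block_table)
```

Here file d is dropped from the group, so its blocks never reach the group block list and are not copied at backup time.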
[0092]
FIG. 23 shows an example of a group list relating to another directory tree. This group list indicates that the file a in the directory X, the file b in the directory Y, and all the files in the directory Z are included in the group, and the file c in the directory Z is excluded from the group.
[0093]
From this group list, for example, a group block list as shown in FIG. 24 is generated. In FIG. 24, “blockno” represents a block number, and a plurality of consecutive block numbers are recorded as one unit.
[0094]
At the time of backup, the meta information about the file, the block number recorded in the group block list 43, and the data of the corresponding block are copied to the backup medium. As the meta information, the file names and attributes of all files included in the directory tree or the file names and attributes of files belonging to the group are used.
[0095]
When a file is referred to at the time of restoration, the block management unit 22 obtains a corresponding block number from the file name and accesses the backup data of the block.
[0096]
At this time, if the information of all the files is recorded as meta information, the computer 11 can also see the file names of files that do not belong to the group, such as the file d in the example above. However, since there is no backup data for the blocks of file d, an error is returned when this file is referenced. On the other hand, if only the information of the files belonging to the group is recorded as meta information, the computer 11 cannot see the file names of files that do not belong to the group, so every visible file can be referenced.
[0097]
Further, the group management unit 23 records and manages the blocks changed after the previous backup in the group change block list 44 in units of groups. The group management unit 23 notifies the copy management unit 25 of the blocks recorded in the group change block list 44 as differences, and the copy management unit 25 copies only the notified blocks. Thereby, differential backup is performed in units of groups.
[0098]
In the block lists of FIGS. 19 and 24, block numbers are explicitly recorded, but instead, a set of a plurality of consecutive blocks may be recorded using its starting address and length. The same applies to the other block lists.
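As an illustration of recording runs of consecutive blocks by address and length, the following sketch converts explicit block numbers into (start, length) pairs. The function name `to_extents` and the tuple representation are assumptions for the example only.

```python
def to_extents(block_numbers):
    """Collapse block numbers into (start, length) runs of consecutive blocks."""
    extents = []
    for b in sorted(set(block_numbers)):
        if extents and b == extents[-1][0] + extents[-1][1]:
            # b directly follows the last run, so extend that run by one block
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            extents.append((b, 1))
    return extents

extents = to_extents([7, 3, 4, 5, 9, 10])
```

A list with long consecutive runs shrinks considerably in this form, and each run can be copied with a single sequential access.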
[0099]
In the group management of (b.4), the group management unit 23 sets a plurality of groups and backs up each group according to a different schedule. In the group management of (b.5), the group management unit 23 allows one file to belong to a plurality of groups. This enables flexible group setting and backup.
[0100]
The hatched portion of the disk 13 in FIG. 17 represents a set of blocks recorded in one of the various block lists described above, and the data of these blocks is copied by the copy management unit 25 to the hatched portion of the backup medium 15.
[0101]
According to such a method, since backup is performed based on a block list generated in advance, a plurality of blocks including blocks of different files can be copied together. Therefore, the number of accesses to the disk 13 is greatly reduced as compared with the case of copying in units of files, and a situation close to random access hardly occurs.
[0102]
In the block management of (a.10), after differential backup is performed and before restoration, the contents of the differential backup data are merged with each other, or with the entire backup data, in units of blocks. Two or more sets of backup data, including the entire backup data and the differential backup data, may also be merged. Consolidating the differential backup data in advance makes restoration more efficient.
[0103]
FIG. 25 shows an example of such a merge process. In FIG. 25, the entire backup data 51 represents the first generation G3 backup data, and the differential backup data 52 and 53 represent the differences in the generations G2 and G1, respectively. In this case, the generation G3 is the oldest and the generation G1 is the newest. The hatched portion represents an area where backup data exists.
[0104]
Here, when the entire backup data 51 and the differential backup data 52 are merged, backup data 54 is generated. Where both contain data for the same area, the newer data is stored preferentially. In this case, at the time of restoration, data is referred to using only the backup data 54 and the differential backup data 53.
[0105]
Further, when the differential backup data 52 and the differential backup data 53 are merged, backup data 55 is generated. In this case, at the time of restoration, data is referred to using only the backup data 55 and the entire backup data 51.
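The two merges described for FIG. 25 can be sketched as follows. Each set of backup data is modelled as a mapping from block number to data, which is an assumption made for illustration; the block numbers and contents are invented, and the function names are hypothetical.

```python
def merge(older, newer):
    """Merge two sets of backup data block by block; newer data wins on overlap."""
    merged = dict(older)
    merged.update(newer)
    return merged

def restore_view(generations_oldest_first):
    """Data visible at restoration when the given generations are layered."""
    view = {}
    for generation in generations_oldest_first:
        view.update(generation)
    return view

full_g3 = {1: "a3", 2: "b3", 3: "c3"}   # entire backup data 51 (oldest)
diff_g2 = {2: "b2"}                     # differential backup data 52
diff_g1 = {2: "b1", 3: "c1"}            # differential backup data 53 (newest)

backup_54 = merge(full_g3, diff_g2)     # entire backup merged with G2
backup_55 = merge(diff_g2, diff_g1)     # the two differentials merged

view_a = restore_view([backup_54, diff_g1])
view_b = restore_view([full_g3, backup_55])
```

Either merge reduces the number of generations that must be consulted at restoration, while the data presented to the computer 11 stays the same as with the three unmerged generations.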
[0106]
In the block management of (a.11), when performing differential backup, the user selects a time point that serves as the reference for the difference, and the block management unit 22 records in the changed block list 42 only the blocks that have changed since the specified time point. Then, the copy management unit 25 copies only those blocks.
[0107]
As a result, the starting point of the differential backup can be changed as necessary, and changes that occur between the time when the previous backup was performed and the selected time are not stored in the backup medium 15. Therefore, it is possible to select changes to be reflected in the restoration.
[0108]
FIG. 26 shows an example of changing the starting point of differential backup. Here, it is assumed that the previous differential backup is performed at time t0, blocks x and y are added to the changed block list 42 between times t0 and t1, and block z is added to the changed block list 42 between times t1 and t2. If the user does not change the starting point of the differential backup, the blocks x, y, and z are recorded in the changed block list 42 at time t2 as differences from time t0.
[0109]
However, when the user designates the time t1 as the starting point of the differential backup, the block management unit 22 once clears the changed block list 42 and erases the block numbers of the blocks x and y at the time t1. Thereafter, the block z is added to the changed block list 42, and at time t2, only the block z is recorded as a difference from the time t1. Then, at the time of the next backup, differential backup is performed based on the difference from time t1.
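The effect of moving the starting point can be sketched with a toy model of the changed block list 42. The class and method names are assumptions for this example, and the time points t0 to t2 are represented only by the order of the calls.

```python
class ChangedBlockList:
    """Toy model of the changed block list 42."""

    def __init__(self):
        self.blocks = set()

    def record(self, block):
        """Called whenever a block is written."""
        self.blocks.add(block)

    def set_starting_point(self):
        """Called at the time chosen as the new reference; forgets earlier changes."""
        self.blocks.clear()

def run(reset_at_t1):
    changed = ChangedBlockList()      # t0: previous differential backup
    changed.record("x")               # between t0 and t1
    changed.record("y")
    if reset_at_t1:
        changed.set_starting_point()  # t1 designated as the starting point
    changed.record("z")               # between t1 and t2
    return changed.blocks             # contents of the list at t2

default_diff = run(reset_at_t1=False)
shifted_diff = run(reset_at_t1=True)
```

With the starting point moved to t1, only block z remains in the list, so the changes to blocks x and y are not carried into the next differential backup.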
[0110]
Further, in the copy management of (a.12), when a plurality of disks 13 exist in the cluster, the copy management unit 25 instructs each computer 11 to copy one of the disks 13, thereby distributing the copy work among the computers 11 in the cluster. Then, each computer 11 copies the disk 13 designated by the copy management unit 25. When a plurality of computers 11 perform copying in this way, the backup load is distributed and the copying time is shortened.
[0111]
FIG. 27 shows an example of such copy management. In the cluster of FIG. 27, a plurality of backup media 15 are provided. The copy management unit 25 notifies each computer 11 of the device names of the copy target disk 13 and the copy destination backup medium 15 and requests a copy operation. The computer 11 requested to copy the data copies the data of the notified disk 13 having the device name to the backup medium 15 having the notified device name. At this time, the copy work is performed in parallel by the plurality of computers 11.
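One way to picture this distribution is sketched below. The round-robin assignment, the device names, and the use of threads to stand in for separate computers are all assumptions made for the example; the description does not specify how disks are assigned to computers.

```python
from concurrent.futures import ThreadPoolExecutor

def plan_copies(computers, disks, media):
    """Pair each disk with a backup medium and assign the job round-robin."""
    if len(media) < len(disks):
        raise ValueError("one backup medium is needed per disk")
    return [
        (computers[i % len(computers)], disk, media[i])
        for i, disk in enumerate(disks)
    ]

def copy_disk(job, storage):
    """Stand-in for the copy a computer performs after being notified."""
    computer, disk, medium = job
    storage[medium] = bytes(storage[disk])
    return job

# In-memory stand-ins for the disks 13; all names here are hypothetical.
storage = {"disk0": b"aaa", "disk1": b"bbb", "disk2": b"ccc"}
plan = plan_copies(["node1", "node2"], ["disk0", "disk1", "disk2"],
                   ["bk0", "bk1", "bk2"])

# Threads play the role of the computers 11 working in parallel.
with ThreadPoolExecutor(max_workers=2) as pool:
    done = list(pool.map(lambda job: copy_disk(job, storage), plan))
```

Each job touches a different source and destination, so the copies do not interfere with one another.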
[0112]
Next, with reference to FIG. 28 to FIG. 35, the operation at the time of restoration related to the above-described features (c.1) to (c.5) will be described in detail.
In the operation of (c.1), the file system restores data by mounting the backup medium 15 as it is instead of the disk 13. As a result, each computer 11 can directly access the backup data stored in the backup medium 15, and a special operation for restoration is not required.
[0113]
FIG. 28 shows a process in which the block management unit 22 mounts the backup medium 15 on the file system. In FIG. 28, after data on the disk 13 (hatched portion) has been copied to the backup medium 15, when a reference request for the disk 13 is received from the computer 11, the block management unit 22 returns the corresponding data on the backup medium 15 to the computer 11.
[0114]
FIG. 29 is a flowchart of such a reference process. The block management unit 22 first receives a file read request from the computer 11 (step S81). The read request includes the file name, the read area offset a, and the size S. Next, the meta information of the backup medium 15 is referred to (step S82), and the block number #x assigned to the range a to a + S of the corresponding file is obtained (step S83).
[0115]
Next, the block number #y on the backup medium 15 in which the data of the block number #x is stored is obtained (step S84), the block data is read (step S85), the read request is answered (step S86), and the process ends.
[0116]
In the operation of (c.2), when restoring from differential backup, the plurality of sets of differential backup data are searched in order from the latest one as necessary. In the case of differential backup, the necessary data is stored in one of a plurality of generations of backup data, so all data can be presented to the computer 11 by searching those generations.
[0117]
FIG. 30 shows such generation management. In FIG. 30, backup data for each generation is stored in different backup media 15. The block management information 61 maps the file name to the device name and block number of the disk 13. The block management information 62 is provided for each generation of backup data, and maps the device name and block number of the disk 13 to the identification information and block number of the backup medium 15.
[0118]
When receiving a file access request from the computer 11, the block management unit 22 refers to the block management information 61 to acquire a device name and a block number corresponding to the file name, and passes them to the medium control unit 24.
[0119]
The medium control unit 24 manages the backup data of each generation in generation order. It refers to the block management information 62 of the newest generation G1 and checks whether the given device name and block number (block information) are present. If the given block information is present, the block number on the backup medium 15 corresponding to the block information is obtained, and the backup medium 15 of the generation G1 is referred to. If the given block information is not present, the block management information 62 of the previous generation G2 is referred to, and it is checked whether the block information is present there.
[0120]
If the generations are traced back one by one while repeating such processing, the data corresponding to the given block information can be referred to in the backup medium 15 of any generation. Thus, even when differential backup is performed, it is possible to show all data to the user using past backup data.
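The generation-by-generation search can be sketched as follows. The block management information 62 of each generation is modelled as a mapping from (device name, block number) to a block number on that generation's backup medium; this representation, the device name, and the numbers are assumptions for the example.

```python
def find_backup_block(generations, device, block):
    """Search the generations from newest to oldest for the given block information.

    generations is a list of (generation name, mapping) pairs, newest first.
    Returns the generation that holds the block and its block number there.
    """
    for name, mapping in generations:
        location = mapping.get((device, block))
        if location is not None:
            return name, location
    raise KeyError((device, block))

generations = [
    ("G1", {("disk0", 3): 0, ("disk0", 4): 1}),                   # newest differential
    ("G2", {("disk0", 2): 0, ("disk0", 3): 1, ("disk0", 7): 2}),  # older differential
    ("G3", {("disk0", n): n - 1 for n in range(1, 8)}),           # entire backup
]

hit_new = find_backup_block(generations, "disk0", 3)
hit_mid = find_backup_block(generations, "disk0", 2)
hit_old = find_backup_block(generations, "disk0", 5)
```

Block 3 exists in all three generations in this example, and the search stops at the newest one, so the most recent backed-up contents are returned.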
[0121]
FIG. 31 shows an example of differential backup restoration. In FIG. 31, overall backup data 71 represents the first generation G3 backup data, and differential backup data 72 and 73 represent the differences in generations G2 and G1, respectively. The hatched portion represents a block in which backup data exists. For example, in the differential backup data 72, the blocks 81 and 82 correspond to the changed data, and the block 83 corresponds to the newly added data.
[0122]
When data of blocks 93 and 94 is requested at the time of restoration, the corresponding blocks of the differential backup data 73 of generation G1 are referred to. When data of blocks 92 and 97 is requested, the search goes back to generation G2 and the corresponding blocks of the differential backup data 72 are referred to. Further, when data of blocks 91, 95, and 96 is requested, the search goes back to generation G3 and the corresponding blocks of the entire backup data 71 are referred to.
[0123]
Further, when the merge process as shown in FIG. 25 is performed, a similar restore operation is performed using the backup data generated by the merge instead of the merged two backup data.
[0124]
In the operations (c.3) and (c.4), when backup data is stored on the tape 16, the backup medium 15 is used as a buffer accessible from all the computers 11 in the cluster. When the computer 11 refers to the backup data, only necessary blocks are loaded onto the backup medium 15 and those blocks are used as a cache. As a result, only necessary data can be arranged on the backup medium 15, and access efficiency to frequently accessed data is improved.
[0125]
FIG. 32 shows processing for using the backup medium 15 as a buffer. Here, since only the computer 12 is connected to the tape 16 in the configuration of FIG. 2, the tape control unit 27 of the computer 12 reads out the necessary block data from the tape 16 and places it on the backup medium 15. For example, a disk is used as the backup medium 15. By referring to the data loaded on the backup medium 15 in this way, a computer 11 that is not connected to the tape 16 can read the backup data stored on the tape 16.
[0126]
FIG. 33 is a flowchart of backup data reference processing. The processes in steps S91 to S93 in FIG. 33 are the same as the processes in steps S81 to S83 in FIG. 29. Next, the block management unit 22 obtains the block number #y on the tape 16 where the data of the block number #x obtained in step S93 is stored (step S94), and checks whether a cache of that block is on the backup medium 15 (step S95).
[0127]
If the cache is not on the backup medium 15, the tape control unit 27 reads the data of the block from the tape 16 and writes it in an empty block #z on the backup medium 15 (step S96). Then, the block management unit 22 responds to the read request using the written data (step S97), and ends the process. If the cache is on the backup medium 15, the block management unit 22 responds to the read request using the data (step S97) and ends the process.
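Steps S94 to S97 can be sketched as follows. The class name, the dictionaries standing in for the tape 16 and the backup medium 15, and the read counter are assumptions for this example; in particular, the choice of an empty block #z on the backup medium is simplified to a dictionary entry keyed by the tape block number.

```python
class TapeBackedReader:
    """Toy model of reading backup data from tape through a disk cache."""

    def __init__(self, tape, tape_index):
        self.tape = tape              # tape block #y -> data
        self.tape_index = tape_index  # disk block #x -> tape block #y
        self.cache = {}               # blocks already loaded on the backup medium 15
        self.tape_reads = 0           # counts accesses to the tape

    def read(self, block_x):
        block_y = self.tape_index[block_x]            # step S94
        if block_y not in self.cache:                 # step S95
            self.cache[block_y] = self.tape[block_y]  # step S96: load from tape
            self.tape_reads += 1
        return self.cache[block_y]                    # step S97: answer the request

reader = TapeBackedReader(tape={100: b"old data"}, tape_index={7: 100})
first = reader.read(7)
second = reader.read(7)
```

The second read of the same block is served from the cache, so the tape is accessed only once.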
[0128]
In the operation of (c.5), when the BI log of the log medium 14 has not been written over the backup data of the backup medium 15 and the log and the backup data are stored separately, the log is referred to first, and the backup data is referred to afterwards if necessary. By referring to the log at the time of restoration, the image as of the start of the backup is reproduced, and data inconsistency does not occur.
[0129]
FIG. 34 shows processing for referring to the log of the log medium 14. When receiving an access request from the computer 11, the block management unit 22 confirms the existence and contents of the log with the log management unit 26. If no log exists, the backup medium 15 is referred to.
[0130]
FIG. 35 is a flowchart of such a reference process. The processing in steps S101 to S103 in FIG. 35 is the same as the processing in steps S81 to S83 in FIG. Next, the block management unit 22 inquires of the log management unit 26 whether there is a log of the block number #x obtained in step S103 (step S104), and checks the answer (step S105).
[0131]
If there is a log of the inquired block, the log on the log medium 14 is read. If there is no such log, the block number #y on the corresponding backup medium 15 is obtained and the backup data of the block is read (step S107). Then, the read request is answered (step S108), and the process ends.
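The log-first lookup can be sketched as follows. The dictionaries standing in for the log medium 14 and the backup medium 15, the function name, and the sample data are assumptions for the example.

```python
def read_block(block_x, bi_log, backup_index, backup_medium):
    """Return the contents of block #x as of the start of the backup.

    bi_log holds the pre-write image of blocks changed while the copy was
    running; backup_index maps block #x to block #y on the backup medium.
    """
    if block_x in bi_log:               # a log exists for this block
        return bi_log[block_x]          # use the image saved before the write
    block_y = backup_index[block_x]     # no log: locate the backup copy
    return backup_medium[block_y]

# Block 5 was overwritten during the copy, so the backup holds the new data
# and the BI log holds the data from before the write.
bi_log = {5: b"before"}
backup_index = {4: 0, 5: 1}
backup_medium = {0: b"unchanged", 1: b"after"}

logged = read_block(5, bi_log, backup_index, backup_medium)
plain = read_block(4, bi_log, backup_index, backup_medium)
```

Because the log takes precedence, a block modified during the online copy is still presented with its contents from the start of the backup.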
[0132]
In the cluster system of FIG. 2, the cache control unit 21, the block management unit 22, the group management unit 23, the medium control unit 24, the copy management unit 25, the log management unit 26, and the tape control unit 27 are managed by a management computer. However, some or all of these control units and management units may be distributed among a plurality of computers 11.
[0133]
FIG. 36 shows a backup operation in such a cluster system. In FIG. 36, the copy management unit 25, the log management unit 26, and the tape control unit 27 are distributed among the computers 11, and the computer 11 having the copy management unit 25 copies the contents of the disk 13 to the backup medium 15.
[0134]
Each computer 11 transfers the BI log generated during the copying to the computer 11 having the log management unit 26, and the computer 11 edits the log and writes it in the log medium 14. Then, the backup data and the log are written on the tape 16 by the computer 11 having the tape control unit 27, respectively.
[0135]
FIG. 37 shows the restore operation in the cluster system of FIG. 36. In FIG. 37, the computer 11 having the tape control unit 27 receives a read request from another computer 11, separately reads the necessary backup data and log from the tape 16, and expands them on the backup medium 15 and the log medium 14, respectively.
[0136]
When the expansion is completed, the computer 11 that has requested reading reads the log from the log medium 14 if there is a log. If there is no log, the block number corresponding to the file name is received from the block management unit 22 and the backup data is read from the backup medium 15.
[0137]
However, when the computer 11 having the tape control unit 27 requests reading, it is not necessary to expand the backup data and log on the backup medium 15 and the log medium 14.
[0138]
FIG. 38 shows a case where the log is written over the backup data before the backup data is written on the tape 16 in the cluster system of FIG. 36. In FIG. 38, the log management unit 26 overwrites the backup data of the backup medium 15 with the log of the log medium 14 when the backup data and the log are finalized. When overwriting is completed, the tape control unit 27 writes the overwritten backup data on the tape 16.
[0139]
FIG. 39 shows a restore operation in the cluster system of FIG. In FIG. 39, the tape control unit 27 reads the backup data with the log overwritten from the tape 16 and develops it on the backup medium 15. Then, the computer 11 that has requested reading reads the necessary data from the backup medium 15. However, when the computer 11 itself having the tape control unit 27 requests reading, it is not necessary to expand the data on the backup medium 15.
[0140]
The computers 11 and 12 in FIG. 2 can be configured using, for example, an information processing apparatus as shown in FIG. 40. The information processing apparatus of FIG. 40 includes a CPU (central processing unit) 111, a memory 112, an input device 113, an output device 114, an external storage device 115, a medium driving device 116, and a network connection device 117, which are connected to each other via a bus 118.
[0141]
The memory 112 includes, for example, a ROM (read only memory), a RAM (random access memory), and the like, and stores programs and data used for processing. The CPU 111 performs necessary processing by executing a program using the memory 112.
[0142]
The cache control unit 21, block management unit 22, group management unit 23, medium control unit 24, copy management unit 25, log management unit 26, and tape control unit 27 in FIG. 2 are, for example, software components described by a program and are stored in the memory 112.
[0143]
The input device 113 is, for example, a keyboard, a pointing device, a touch panel, and the like, and is used for inputting instructions and information from the user. The output device 114 is, for example, a display, a printer, a speaker, or the like, and is used for outputting an inquiry to a user and a processing result.
[0144]
The external storage device 115 is, for example, a magnetic disk device, an optical disk device, a magneto-optical disk device, a tape device, or the like. The information processing apparatus stores the above-described program and data in the external storage device 115, and loads them into the memory 112 for use as necessary. The external storage device 115 is used as the shared disk 13, the log medium 14, the backup medium 15, the tape 16, and the like.
[0145]
The medium driving device 116 drives the portable recording medium 119 and accesses the recorded contents. As the portable recording medium 119, an arbitrary computer-readable recording medium such as a memory card, a floppy disk, a CD-ROM (compact disk read only memory), an optical disk, or a magneto-optical disk is used. The user stores the above-described program and data in the portable recording medium 119 and loads them into the memory 112 and uses them as necessary.
[0146]
The network connection device 117 is used for connection to a communication network that connects computers, and performs data conversion accompanying communication. The information processing apparatus receives the above-described program and data from another apparatus via the network connection apparatus 117, and loads them into the memory 112 and uses them as necessary.
[0147]
FIG. 41 shows a computer-readable recording medium that can supply a program and data to the information processing apparatus of FIG. 40. Programs and data stored in the portable recording medium 119 or the external database 120 are loaded into the memory 112. Then, the CPU 111 executes the program using the data and performs necessary processing.
[0148]
[Effects of the invention]
According to the present invention, in a computer system having a disk shared file system such as a cluster system, data can be efficiently backed up while the system is operating. In addition, backup data can be referred to efficiently at the time of restoration.
[Brief description of the drawings]
FIG. 1 is a principle diagram of a backup system according to the present invention.
FIG. 2 is a configuration diagram of a cluster system.
FIG. 3 is a flowchart of cache control processing.
FIG. 4 is a diagram illustrating first log management.
FIG. 5 is a flowchart of log editing processing.
FIG. 6 is a diagram illustrating second log management.
FIG. 7 is a flowchart of first log recording processing.
FIG. 8 is a diagram illustrating third log management.
FIG. 9 is a flowchart of second log recording processing.
FIG. 10 is a diagram showing a log management file.
FIG. 11 is a flowchart of first copy processing.
FIG. 12 is a diagram illustrating fourth log management.
FIG. 13 is a flowchart of third log recording processing.
FIG. 14 is a flowchart of second copy processing.
FIG. 15 is a diagram illustrating fifth log management.
FIG. 16 is a diagram illustrating a data format of a log medium.
FIG. 17 is a diagram illustrating block management and group management.
FIG. 18 is a diagram showing a free space management table.
FIG. 19 is a diagram illustrating a used block list.
FIG. 20 is a flowchart of changed block list update processing.
FIG. 21 is a diagram showing a directory tree.
FIG. 22 is a diagram showing a first group list.
FIG. 23 is a diagram illustrating a second group list.
FIG. 24 is a diagram showing a group block list.
FIG. 25 is a diagram illustrating merging of differential backup data.
FIG. 26 is a diagram showing a change of the starting point of differential backup.
FIG. 27 is a diagram showing copy management.
FIG. 28 is a diagram illustrating mounting of a backup medium.
FIG. 29 is a flowchart of first reference processing.
FIG. 30 is a diagram illustrating generation management.
FIG. 31 is a diagram showing differential backup restoration.
FIG. 32 is a diagram showing a backup medium as a buffer.
FIG. 33 is a flowchart of second reference processing.
FIG. 34 is a diagram illustrating log reference.
FIG. 35 is a flowchart of third reference processing.
FIG. 36 is a diagram showing a first backup.
FIG. 37 is a diagram illustrating first restoration.
FIG. 38 is a diagram showing a second backup.
FIG. 39 is a diagram showing a second restoration.
FIG. 40 is a configuration diagram of an information processing apparatus.
FIG. 41 is a diagram illustrating a recording medium.
[Explanation of symbols]
1 Copying means
2 Control means
3, 11, 12 Computer
4 Shared medium
5, 15 Backup medium
6 Log management means
7 Generation means
8 Group management means
9 Area management means
13 Shared disk
14 Log medium
16 Tape
21 Cache control unit
22 Block management unit
23 Group management unit
24 Medium control unit
25 Copy management unit
26 Log management unit
27 Tape control unit
28 Cache
31 Temporary log medium
32, 34 Log management file
33 Clock unit
41 Used block list
42 Changed block list
43 Group block list
44 Group change block list
51, 71 Whole backup data
52, 53, 72, 73 Differential backup data
54, 55 Backup data
61, 62 Block management information
81, 82, 83, 91, 92, 93, 94, 95, 96, 97 Block
101 Access request
111 CPU
112 Memory
113 Input device
114 Output device
115 External storage device
116 Medium driving device
117 Network connection device
118 Bus
119 Portable recording medium
120 Database

Claims

A backup system for performing image backup of a shared medium shared by a plurality of computers, wherein,
when any one of the plurality of computers writes to an area of the shared medium, the image data of the area before the writing is managed as a log by the computer that performed the writing,
the backup system comprising:
log management means for generating an entire log by collecting the logs managed by the plurality of computers and, when a plurality of logs exist for the same area, using the oldest log after the image backup start time; and
generation means for generating, at the time of restoration, the data as of the image backup start time using the entire log.

The backup system according to claim 1, further comprising temporary log storage means for temporarily saving the logs of the plurality of computers, wherein the log management means edits the logs saved in the temporary log storage means to generate the entire log.

The backup system according to claim 1, wherein, when one of the plurality of computers accesses the shared medium, the log management means receives an access notification from the accessing computer and stores the log of that computer, thereby generating the entire log.

The backup system according to claim 1, further comprising backup storage means for storing backup data of the shared medium, wherein the log management means stores the entire log in the backup storage means.

The backup system according to claim 1, further comprising backup storage means for storing backup data of the shared medium, and log storage means for storing the entire log.

The backup system according to claim 1, further comprising backup storage means for storing backup data of the shared medium, wherein the log management means writes the entire log over the backup data.

The backup system according to claim 1, further comprising log storage means for storing the log managed by the log management means and address information of the backup data to be overwritten by the log, wherein the generation means overwrites the corresponding backup data with the log based on the address information.

The backup system according to claim 1, wherein, when the entire log has not been written over the backup data of the shared medium, the generation means refers to the entire log first and refers to the backup data afterwards if necessary.

A backup method for performing image backup of a shared medium shared by a plurality of computers, wherein,
after the start of image backup,
when any one of the plurality of computers writes to an area of the shared medium, the computer that performed the writing manages the image data of the area before the writing as a log,
the computer that performs the image backup generates an entire log by collecting the logs managed by the plurality of computers and, when a plurality of logs exist for the same area at the time the entire log is generated, uses the oldest log after the image backup start time, and,
at the time of restoration, the data as of the image backup start time is generated using the entire log.