JP2004094681A

JP2004094681A - Control device, control method, and control program for distributed database

Info

Publication number: JP2004094681A
Application number: JP2002256005A
Authority: JP
Inventors: Makoto Tsunoda; 角田　誠; Yasuhiro Taga; 多賀　康博
Original assignee: NTT Comware Corp
Current assignee: NTT Comware Corp
Priority date: 2002-08-30
Filing date: 2002-08-30
Publication date: 2004-03-25

Abstract

PROBLEM TO BE SOLVED: To provide a control device, a control method, and a control program for a distributed database that achieve high fault tolerance at low cost by integrating a large number of computer systems via a network.

SOLUTION: In a distributed database system, data is processed using a database. The system is formed by connecting first to n-th computers, each equipped with a control part 22 and a data storage part 23, via the network. The database is divided into first to n-th slots, the first to n-th slots are stored as original slots in the data storage parts 23 of the first to n-th computers, and duplicates of the original slots are stored in data storage parts 23 other than those holding the corresponding original slots.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明はネットワークで接続された多数のコンピュータを用いて分散型データベースを構築するデータ制御装置および制御方法並びに制御プログラムに関する。
【０００２】
【従来の技術】
従来、分散型データベースシステムにおいて、耐障害性能の向上を図るために、ネットワーク上の複数のサーバを連携させて一つの大きなシステムとして機能させる方法が提案されている。複数のサーバは並列的に処理を行うことによって全体の処理時間を短縮し、また、複製データを複数のサーバに分散させることによって、あるサーバで障害が発生しても他のサーバからデータを復元することができる。
【０００３】
上述した分散型データベースは、利用者においては、単一のデータベースのように扱うことができる必要がある。この要求を実現するためには、データベースが統合されて見える仮想領域を構築することが考えられる。この場合、従来は、センターサーバに仮想領域を配置するか、各サーバそれぞれが仮想領域のデータをすべての複製するといったことが行われていた。
また１つの情報は１システムのみが保有している大規模な分散データベースにおいては、システムごとに仮想領域のデータをミラーリングすることが行われていた。
【０００４】
しかし、データの複製やミラーリング等によるサーバシステムの容量拡大は、指数的なコストの増大を必要とする。このような従来システムにおいては、高速性とフォールトトラレント性を同時に実現しようとすると、耐障害性を確保するために高価なシステムに頼らざるを得ず、コストが高くなるという問題点があった。
【０００５】
【発明が解決しようとする課題】
この発明はこのような事情を考慮してなされたもので、その目的は、ネットワークを介して多数のコンピュータシステムを統合し、安価で高い耐障害性を持つ、分散型データベース制御装置および制御方法並びに制御プログラムを提供することにある。
【０００６】
【課題を解決するための手段】
この発明は上記の課題を解決すべくなされたもので、請求項１に記載の発明は、データベースを使用してデータ処理を行う分散型データベースシステムであって、制御部とデータ格納部とを具備する第１〜第ｎのコンピュータをネットワークを介して接続して構成した分散型データベースシステムにおいて、前記データベースを第１〜第ｎのスロットに分割し、前記第１〜第ｎのスロットを各々原本スロットとして前記第１〜第ｎのコンピュータのデータ格納部に記憶させ、
前記各原本スロットの複製を各々原本スロットが記憶されているデータ格納部と異なるデータ格納部に記憶させたことを特徴とする。
【０００７】
請求項２に記載の発明は、請求項１に記載の発明において、第ｎ＋１のコンピュータが新たに使用可能となった場合に、第１〜第ｎのコンピュータの格納するスロットに関する情報に基づいて、第ｋ（ｋ＝１、２、・・・ｎ）のスロットの複製を第ｎ＋１のコンピュータのデータ格納部内に記憶させることを特徴とする。
【０００８】
請求項３に記載の発明は、請求項１に記載の発明において、第ｍ（ｍ＝１、２、・・・ｎ）のコンピュータが使用不能となった場合に、第ｍのスロットの複製いずれか１つを原本に変更し、該複製を有するコンピュータ以外のコンピュータのデータ格納部内に第ｍのスロットの複製を記憶させることを特徴とする。
【０００９】
請求項４に記載の発明は、請求項３に記載の発明において、前記第ｍのコンピュータが使用不能となった場合に、第ｍのコンピュータが記憶していた複製と同一の複製を有するコンピュータ以外のコンピュータのデータ格納部内に第ｍのコンピュータが記憶していた複製を記憶させることを特徴とする。
【００１０】
請求項５に記載の発明は、請求項３または請求項４に記載の発明において、前記第ｍのコンピュータが復帰し使用可能となった場合に、原本スロットが２つ以上格納されているコンピュータから前記第ｍのコンピュータのデータ格納部内に第ｍのスロットの原本を記憶させることを特徴とする。
【００１１】
請求項６に記載の発明は、データベースを使用してデータ処理を行う分散型データベースシステムであって、制御部とデータ格納部とを具備する第１〜第ｎのコンピュータをネットワークを介して接続して構成した分散型データベースシステムにおいて、前記データベースを第１〜第ｎのスロットに分割し、前記第１〜第ｎのスロットを各々原本スロットとして前記第１〜第ｎのコンピュータのデータ格納部に記憶させ、前記各原本スロットの複製を各々原本スロットが記憶されているデータ格納部と異なるデータ格納部に記憶させておき、第ｋ（ｋ＝１、２・・・ｎ）のコンピュータの制御部がデータ検索を行う場合、前記第１〜第ｎのコンピュータの制御部へ各々検索要求を送信し、前記第１〜第ｎのコンピュータの制御部がそれぞれ自データ格納部の検索を行い、検索結果を前記第ｋのコンピュータの制御部へ送信し、前記第ｋのコンピュータの制御部がデータ更新を行う場合、更新すべきデータを含む原本スロットを有するコンピュータの制御部へ更新要求を送信し、前記送信を受けたコンピュータの制御部がデータ格納部内の原本スロットのデータ更新を行うと共に、該原本スロットの複製スロットを有するコンピュータへ更新要求を送信し、前記送信を受けたコンピュータの制御部がデータ格納部内の複製スロットのデータ更新を行い、前記第ｋのコンピュータの制御部がデータ挿入を行う場合、前記第１〜第ｎの原本スロットの中でデータ量が少ない原本スロットにデータを挿入することを特徴とする。
【００１２】
請求項７に記載の発明は、請求項６に記載の発明において、前記第ｋのコンピュータの制御部がデータ更新を行う場合において、更新すべきデータを含む原本スロットを有するコンピュータがわからない場合、前記第１〜第ｎのコンピュータの制御部へ各々検索要求を送信することを特徴とする。
【００１３】
請求項８に記載の発明は、請求項６または請求項７に記載の発明において、第ｎ＋１のコンピュータが新たに使用可能となった場合に、第１〜第ｎのコンピュータの格納するスロットに関する情報に基づいて、第ｋ（ｋ＝１、２、・・・ｎ）のスロットの複製を第ｎ＋１のコンピュータのデータ格納部内に記憶させることを特徴とする。
【００１４】
請求項９に記載の発明は、請求項７または請求項８に記載の発明において、第ｍ（ｍ＝１、２、・・・ｎ）のコンピュータが使用不能となった場合に、第ｍのスロットの複製いずれか１つを原本に変更し、該複製を有するコンピュータ以外のコンピュータのデータ格納部内に第ｍのスロットの複製を記憶させることを特徴とする。
【００１５】
請求項１０に記載の発明は、請求項９に記載の発明において、前記第ｍのコンピュータが使用不能となった場合に、前記第ｍのコンピュータが記憶していた複製と同一の複製を有するコンピュータ以外のコンピュータのデータ格納部内に第ｍのコンピュータが記憶していた複製を記憶させることを特徴とする。
【００１６】
請求項１１に記載の発明は、請求項９または請求項１０に記載の発明において、前記第ｍのコンピュータが復帰し使用可能となった場合に、原本スロットが２つ以上格納されているコンピュータから第ｍのコンピュータのデータ格納部内に第ｍのスロットの原本を記憶させることを特徴とする。
【００１７】
請求項１２に記載の発明は、データベースを使用してデータ処理を行う分散型データベースシステムであって、制御部とデータ格納部とを具備する第１〜第ｎのコンピュータをネットワークを介して接続して構成した分散型データベースシステムにおいて用いられるプログラムであって、前記データベースを第１〜第ｎのスロットに分割し、前記第１〜第ｎのスロットを各々原本スロットとして前記第１〜第ｎのコンピュータのデータ格納部に記憶させ、前記各原本スロットの複製を各々原本スロットが記憶されているデータ格納部と異なるデータ格納部に記憶させておき、第ｋ（ｋ＝１、２、・・・ｎ）のコンピュータの制御部がデータ検索を行う場合、前記第１〜第ｎのコンピュータの制御部へ各々検索要求を送信し、前記第ｋのコンピュータの制御部がデータ更新を行う場合、更新すべきデータを含む原本スロットを有するコンピュータまたは前記第１〜第ｎのコンピュータすべての制御部へ更新要求を送信し、前記第ｋのコンピュータの制御部がデータ挿入を行う場合、前記第１〜第ｎの原本スロットの中でデータ量が少ない原本スロットにデータを挿入することを特徴とする分散型データベース制御プログラムである。
【００１８】
請求項１３に記載の発明は、データベースを使用してデータ処理を行う分散型データベースシステムであって、制御部とデータ格納部とを具備する第１〜第ｎのコンピュータをネットワークを介して接続して構成した分散型データベースシステムにおいて用いられるプログラムであって、前記データベースを第１〜第ｎのスロットに分割し、前記第１〜第ｎのスロットを各々原本スロットとして前記第１〜第ｎのコンピュータのデータ格納部に記憶させ、前記各原本スロットの複製を各々原本スロットが記憶されているデータ格納部と異なるデータ格納部に記憶させておき、第ｋ（ｋ＝１、２、・・・ｎ）のコンピュータの制御部がデータ検索を行う場合、前記第ｋのコンピュータの制御部より検索要求を受信し、前記第１〜第ｎのコンピュータの制御部がそれぞれ自データ格納部の検索を行い、検索結果を前記第ｋのコンピュータの制御部へ送信し、前記第ｋのコンピュータの制御部がデータ更新を行う場合、第ｋのコンピュータの制御部より更新要求を受信し、更新すべきデータを含む原本スロットを有する場合、データ更新要求のあった原本データの更新を行うと共に、該原本スロットの複製スロットを有するコンピュータへ更新要求を送信し、前記送信を受けたコンピュータの制御部がデータ格納部内の複製スロットのデータ更新を行い、前記第ｋのコンピュータの制御部がデータ挿入を行う場合、前記第１〜第ｎの原本スロットの中でデータ量が少ない原本スロットにデータを挿入することを特徴とする分散型データベース制御プログラムである。
【００１９】
【発明の実施の形態】
以下、図面を参照しこの発明の一実施形態について説明する。図１はこの発明の一実施形態による分散データベース制御方法を適用した分散データベースシステムの全体構成を示すブロック図であり、このシステムは、ネットワーク１０を介して相互接続された６個のコンピュータシステム（以下、Ｐｅｅｒという）Ａ〜Ｆから構成されている。
【００２０】
図２はＰｅｅｒＡ〜Ｆの構成を示すブロック図である。この図において、２１はネットワーク１０を介して他のＰｅｅｒと通信を行う通信部、２２は各種のデータ処理を行うデータ制御部、２３はスロットデータ格納部である。ここで、スロットとはデータの束をいう。２４は状態格納部であり、スロットデータ格納部２３内のスロットデータの数や種類に関する情報を記憶している。
【００２１】
次に、スロットデータ格納部２３内に格納されるスロットデータについて説明する。いま、図１のシステム全体のデータ記憶領域について仮想領域を構築し、仮想領域Ｄ１とすると、図３に示すように、この仮想領域Ｄ１をＳｌｏｔＡ〜ＳｌｏｔＦに６等分し、各ＳｌｏｔＡ〜ＳｌｏｔＦを各々ＰｅｅｒＡ〜Ｆのスロットデータ格納部２３に記憶させる。このデータを原本という。次に、図３に示すように、ＳｌｏｔＡの複製であるＳｌｏｔＡ’をＰｅｅｒＢおよびＰｅｅｒＣのスロットデータ格納部２３に記憶させ、ＳｌｏｔＢの複製であるＳｌｏｔＢ’をＰｅｅｒＣおよびＰｅｅｒＤのスロットデータ格納部２３に記憶させ、・・・、ＳｌｏｔＦの複製であるＳｌｏｔＦ’をＰｅｅｒＡおよびＰｅｅｒＢのスロットデータ格納部２３に記憶させる。このように、原本１スロット毎に、２スロットの複製を異なるＰｅｅｒに記憶させておく。この原本と複製のスロット数の合計値３をデータの多重度とする。
【００２２】
また、ＰｅｅｒＡの状態格納部２４には、スロットデータ格納部２３にＳｌｏｔＡ、ＳｌｏｔＥ’、ＳｌｏｔＦ’が格納され、ＳｌｏｔＡ’がＰｅｅｒＢおよびＰｅｅｒＣに格納されていること、ＳｌｏｔＥがＰｅｅｒＥに、ＳｌｏｔＦがＰｅｅｒＦに、ＳｌｏｔＥ’がＰｅｅｒＦに、ＳｌｏｔＦ’がＰｅｅｒＥに格納されていることを示す情報が記憶されている。ＰｅｅｒＢ〜Ｆの状態格納部２４も同様である。
【００２３】
次に、上述したシステムの動作を説明する。
（１）データ検索
ＰｅｅｒＡ〜Ｆのいずれかのデータ制御部２２がデータ検索を行う場合、図４に示すように、全ＰｅｅｒＡ〜Ｆへ検索要求を送信する。各ＰｅｅｒＡ〜Ｆは各々自らのスロットデータ格納部２３内の原本（ＳｌｏｔＡ〜ＳｌｏｔＦ）を検索し、検索結果を検索要求元へ送信する。
【００２４】
（２）データ更新
例えば、ＰｅｅｒＡのデータ制御部２２がデータ更新を行う場合において、もし、更新すべきデータを含む原本Ｓｌｏｔの所在が状態格納部２４に記憶されている場合は、その原本Ｓｌｏｔを持つＰｅｅｒへ直接更新要求メッセージを送信する。いま、そのＰｅｅｒが、例えばＰｅｅｒＣであった場合は、更新要求メッセージがＰｅｅｒＣへ送信される。ＰｅｅｒＣはそのメッセージを受け、まず、スロットデータ格納部２３内の原本ＳｌｏｔＣをメッセージに基づいて更新し、次いで、ＳｌｏｔＣ’を保持するＰｅｅｒＤおよびＰｅｅｒＥへ更新要求メッセージを送信する。ＰｅｅｒＤおよびＰｅｅｒＥはその更新要求メッセージを受け、ＳｌｏｔＣ’の更新を行う。
また、上述したデータ更新を行う場合において、更新すべきデータを含む原本Ｓｌｏｔの所在が分からない場合は、図５に示すように、全ＰｅｅｒＡ〜Ｆへ更新要求メッセージを送信する。更新対象データを含む原本Ｓｌｏｔを持つＰｅｅｒは、その原本Ｓｌｏｔを更新し、次いで、複製Ｓｌｏｔを持つＰｅｅｒへ更新要求メッセージを送信する。
【００２５】
（３）データ挿入
ＰｅｅｒＡ〜Ｆのいずれかのデータ制御部２２がデータ挿入を行う場合、Ｐｅｅｒは、仮想領域に対してデータ挿入を行う。仮想領域に挿入されたデータは、挿入するデータと既に記憶されているデータの主キーの値が重複しないことを確認した後、ＳｌｏｔＡ〜Ｆのうちデータ量の少ないＳｌｏｔに対して挿入される。
【００２６】
（４）Ｐｅｅｒの状態監視
ＰｅｅｒＡ〜Ｆのデータ制御部２２は一定時間間隔ごとに他のＰｅｅｒすべての状態を互いに監視しあう。
【００２７】
次にＰｅｅｒＡ〜Ｆのいずれかが障害発生により、停止した場合におけるシステムの動作について説明する。
（５）データ修復
例えばＰｅｅｒＡが障害発生により停止した場合、ＰｅｅｒＢ〜ＦはＰｅｅｒＡが停止したことを認識し、図６に示すようにデータ修復を行う。まずＰｅｅｒＡの停止により損失した原本ＳｌｏｔＡの複製ＳｌｏｔＡ’を持つＰｅｅｒＢは、複製ＳｌｏｔＡ’を原本ＳｌｏｔＡに変更する。次にＰｅｅｒＢは、原本ＳｌｏｔＡの複製を２スロット異なるＰｅｅｒに記憶させるために、ＰｅｅｒＦが複製ＳｌｏｔＡ’を持たないことを確認した後、原本ＳｌｏｔＡの複製ＳｌｏｔＡ’を作成する。またＰｅｅｒＥは、ＰｅｅｒＤが複製ＳｌｏｔＥ’を持たないことを確認した後、複製ＳｌｏｔＥ’を作成する。またＰｅｅｒＦは、ＰｅｅｒＥが複製ＳｌｏｔＦ’を持たないことを確認した後、複製ＳｌｏｔＦ’を作成する。
【００２８】
次に障害発生により停止したＰｅｅｒが復帰した場合におけるシステムの動作について説明する。
（６）データ分散
Ｐｅｅｒの復帰時に、他に原本Ｓｌｏｔを２つ以上持つ原本スロット過剰Ｐｅｅｒが存在する場合、過剰Ｐｅｅｒは復帰したＰｅｅｒに対して自身が持つ原本Ｓｌｏｔを挿入する。またＰｅｅｒ全体において複製Ｓｌｏｔの数が多重度に足りない場合、足りない複製Ｓｌｏｔの原本を持つＰｅｅｒは、復帰したＰｅｅｒに対して自身が持つ原本Ｓｌｏｔの複製を行う。
【００２９】
次に上記システムにおけるスロットの複製過程について説明する。
（７）段階的データ複製
Ｐｅｅｒの停止が起こる平均時間間隔が１データレコードのライフサイクルに比べ十分に長い場合、複製処理は段階的に行われる。図７において、ＰｅｅｒＤはＳｌｏｔＢ’の複製元であり、複製先であるＰｅｅｒＥにＳｌｏｔＢ’をレコード単位で複製処理中である。ＳｌｏｔＢ’はレコードｂ１、ｂ２、ｂ３、ｂ４で構成され、今レコードｂ１の複製が完了し、さらにレコードｂ２、ｂ３、ｂ４をレコード単位で複製していく。図８は図７において進行中である複製が完了した状態であり、ＰｅｅｒＤはＰｅｅｒＥにレコードｂ１、ｂ２、ｂ３、ｂ４すべてが複製されたことを確認した後、ＳｌｏｔＢ’を削除する。
【００３０】
（８）段階的データ複製時におけるデータ検索
データ検索要求は、障害により原本スロットが検索できない場合、複製スロットに対して行われる。図９において、原本ＳｌｏｔＢを持つＰｅｅｒＢが停止している間にＰｅｅｒＦよりＳｌｏｔＢのレコードｂ１に対してデータ検索要求があった場合、複製処理中である複製元ＰｅｅｒＤ、複製先ＰｅｅｒＥは共に検索を行う。複製先ＰｅｅｒＥで検索し該当データが見つかればそれを値として、ＰｅｅｒＦに送信し、ＰｅｅｒＥで見つからなかった場合は、複製元ＰｅｅｒＤの検索結果をＰｅｅｒＦに送信する。
【００３１】
（９）段階的データ複製時におけるデータ挿入
複製スロットに対してデータ挿入要求があった場合、複製元は新たなデータを持つ必要がないため、挿入は複製先にのみ行う。図１０において、複製元ＰｅｅｒＤが複製先であるＰｅｅｒＥにＳｌｏｔＢ’を構成するレコードｂ２の複製処理を行っている間に、ＰｅｅｒＢがデータ挿入要求を行うと、複製先であるＰｅｅｒＥのＳｌｏｔＢにレコードｂ２がＰｅｅｒＢより挿入される。
【００３２】
（１０）段階的データ複製時におけるデータ更新
複製スロットに対してデータ更新要求があった場合、複製元は新たなデータを管理する必要がないため、更新は複製先にのみ挿入処理を行い、複製元は未更新レコードを削除する。図１１において、複製元ＰｅｅｒＤが複製先であるＰｅｅｒＥにＳｌｏｔＢ’を構成するレコードｂ２の複製処理を行っている間に、ＰｅｅｒＢがデータ更新要求を行うと、複製先であるＰｅｅｒＥのＳｌｏｔＢ’にＰｅｅｒＢの更新レコードｂ２が挿入される。
【００３３】
（１１）段階的データ複製時におけるデータ修復
複製処理中であるＰｅｅｒ２つのいずれかが障害により停止した場合、停止したＰｅｅｒが複製先であれば、進行中の複製処理を中止し、複製元のＰｅｅｒを復旧する。　複製元のＰｅｅｒは、復旧後、新たな複製先を決定し最初から複製処理を行う。停止したＰｅｅｒが複製元であった場合は、進行中の複製処理を一旦中断し未処理のレコード複製を行うことができるＰｅｅｒを決定した後、レコード複製を再開する。図１２においては、停止したのは複製元であるため、進行中の複製処理を一旦中断し、ＳｌｏｔＢを持つＰｅｅｒＢがＳｌｏｔＢ’を構成するレコードｂ２より、複製処理を再開する。
【００３４】
以上のＰｅｅｒの動作におけるスロットの状態を以下に定め、スロットの状態がどのように遷移するかについて説明する。図１３における各スロットの状態は次の意味を示す。
符号０　ＰｒｉｍａｒｙＳｅｃｏｎｄａｒｙＩｎｓｕｆｆｉｃｉｅｎｔ
符号１　ＰｒｉｍａｒｙＭａｋｉｎｇＤｕｐｌｉｃａｔｉｏｎ
符号２　ＰｒｉｍａｒｙＳｅｃｏｎｄａｒｙＳｕｆｆｉｃｉｅｎｔ
符号３　ＰｒｉｍａｒｙＴｒａｎｓｆｅｒｉｎｇＰｒｉｍａｒｙＲｉｇｈｔｓ
符号４　ＳｅｃｏｎｄａｒｙＰｒｅｐａｒｉｎｇ
符号５　ＳｅｃｏｎｄａｒｙＥｘｅｃｕｔｉｎｇ
符号６　ＳｅｃｏｎｄａｒｙＯｖｅｒｌｏａｄ
符号７　ＳｅｃｏｎｄａｒｙＭａｋｉｎｇＤｕｐｌｉｃａｔｉｏｎ
符号８　ＳｅｃｏｎｄａｒｙＰｒｉｍａｒｙＦａｉｌ
符号９　ＳｅｃｏｎｄａｒｙＴｒａｎｓｆｅｒｉｎｇＰｒｉｍａｒｙＲｉｇｈｔｓ
符号１０　ＳｅｃｏｎｄａｒｙＣｌｏｓｅ
【００３５】
符号０は、該当スロットが原本で、そのスロットの複製の数が規定値に足りない状態を示す。符号１は、当該スロットが原本で、他のＰｅｅｒに複製を作成している状態を示す。符号２は、当該スロットが原本で、複製スロットが規定数ある定常状態を示す。符号３は、当該スロットが原本で、複製スロットのうちの１つを当該スロットの代わりに原本スロットに変更している状態を示す。
【００３６】
符号４は、スロットが符号１または符号７の状態にある他のＰｅｅｒからの要求により複製スロット作成している状態を示す。符号５は複製が完了し、複製スロットとして定常動作している状態を示す。符号６は、当該スロットが複製で、Ｐｅｅｒ内のスロットの数が規定数以上で過負荷状態となっている状態を示す。符号７は、当該スロットが複製で、他のＰｅｅｒに複製を作成している状態を示す。符号８は、当該スロットが複製で、対応する原本スロットを持つＰｅｅｒが停止したために、他の対応する複製スロットを持つＰｅｅｒと、どのＰｅｅｒが持つ複製スロットを原本スロットにするか、を決定している状態を示す。符号９は、当該スロットが複製で、対応する原本スロットが状態３にあり原本と複製の関係を交換している状態を示す。符号１０は過負荷状態だったＰｅｅｒが複製スロットの複製を他のＰｅｅｒに作成した結果、複製元スロットが消滅する状態を示す。符号Ｐ１は原本スロット生成前の状態、　符号Ｐ２は複製スロット生成前の状態、　Ｅ１はスロット消滅後の状態を示す。
【００３７】
ネットワーク上に過剰に原本スロットがない状態で新しくＰｅｅｒが起動すると、原本スロットを作成する条件が整うため、状態Ｐ１から状態０ＰｒｉｍａｒｙＳｅｃｏｎｄａｒｙＩｎｓｕｆｆｉｃｉｅｎｔへ遷移する（図１３のステップａを参照）。
【００３８】
当該スロットの複製を保持しないＰｅｅｒのうち１番負荷の低いＰｅｅｒが認識され，そのＰｅｅｒに複製スロットを作成する準備が整うと、複製が開始されるため、状態０ＰｒｉｍａｒｙＳｅｃｏｎｄａｒｙＩｎｓｕｆｆｉｃｉｅｎｔから状態１ＰｒｉｍａｒｙＭａｋｉｎｇＤｕｐｌｉｃａｔｉｏｎへ遷移する（ステップｂを参照）。
【００３９】
一方、複製が完了してもなお複製スロットが足りない、または複製が途中で失敗すると、状態１ＰｒｉｍａｒｙＭａｋｉｎｇＤｕｐｌｉｃａｔｉｏｎから状態０ＰｒｉｍａｒｙＳｅｃｏｎｄａｒｙＩｎｓｕｆｆｉｃｉｅｎｔへ遷移する（ステップｃを参照）。
【００４０】
複製スロットの作成が終了し複製が規定数に達すると定常状態になるため、状態１ＰｒｉｍａｒｙＭａｋｉｎｇＤｕｐｌｉｃａｔｉｏｎから状態２ＰｒｉｍａｒｙＳｅｃｏｎｄａｒｙｓｕｆｆｉｃｉｅｎｔに遷移する（ステップｄを参照）。
【００４１】
複製スロットを持つＰｅｅｒが停止した場合、複製が規定数に達しないため、状態２ＰｒｉｍａｒｙＳｅｃｏｎｄａｒｙＳｕｆｆｉｃｉｅｎｔから状態０ＰｒｉｍａｒｙＳｅｃｏｎｄａｒｙＩｎｓｕｆｆｉｃｉｅｎｔに遷移する（ステップｅを参照）。
【００４２】
２つ以上の原本スロットを持つＰｅｅｒが存在すると、原本スロットと、それに対応する複製スロット間で、原本と、複製の関係を入れ替えるため、状態２ＰｒｉｍａｒｙＳｅｃｏｎｄａｒｙｓｕｆｆｉｃｉｅｎｔから状態３ＰｒｉｍａｒｙＴｒａｎｓｆｅｒｉｎｇＰｒｉｍａｒｙＲｉｇｈｔｓに遷移する（ステップｆを参照）。
【００４３】
複製スロットから原本スロットへの変更が障害により失敗した場合、複製は削除されるので規定数に達しないため、状態３ＰｒｉｍａｒｙＴｒａｎｓｆｅｒｉｎｇＰｒｉｍａｒｙＲｉｇｈｔｓから状態０ＰｒｉｍａｒｙＳｅｃｏｎｄａｒｙＩｎｓｕｆｆｉｃｉｅｎｔに遷移する（ステップｇを参照）。
【００４４】
一方、　複製スロットから原本スロットへの変更が成功すると当該スロットは原本スロットから複製スロットの定常状態になるため、状態３ＰｒｉｍａｒｙＴｒａｎｓｆｅｒｉｎｇＰｒｉｍａｒｙＲｉｇｈｔｓから状態５ＳｅｃｏｎｄａｒｙＥｘｅｃｕｔｉｎｇに遷移する（ステップｈを参照）。
【００４５】
複製元のスロットが、状態１にあって、当該スロットの複製を保持しないＰｅｅｒのうち１番負荷の低いＰｅｅｒが認識され，そのＰｅｅｒに複製スロットを作成する準備が整う、或いは，状態７にあって、過剰にスロットを持つＰｅｅｒから他のＰｅｅｒに複製スロットを作成する準備が開始されるため、Ｐ２から状態４ＳｅｃｏｎｄａｒｙＰｒｅｐａｒｉｎｇに遷移する（ステップｉを参照）。
【００４６】
複製スロットの生成が完了し、複製スロットとして動作可能な状態になると定常状態となり、状態４ＳｅｃｏｎｄａｒｙＰｒｅｐａｒｉｎｇから状態５ＳｅｃｏｎｄａｒｙＥｘｅｃｕｔｉｎｇに遷移する（ステップｊを参照）。
【００４７】
複製により、Ｐｅｅｒが保持するスロットの数が過大となる場合、状態５ＳｅｃｏｎｄａｒｙＥｘｅｃｕｔｉｎｇから状態符号６ＳｅｃｏｎｄａｒｙＯｖｅｒｌｏａｄに遷移する（ステップｋを参照）。
【００４８】
当該スロットを保持するＰｅｅｒの，他の複製スロットの複製が他のＰｅｅｒに作成され，その複製スロットが削除された事で，当該スロットを保持するＰｅｅｒが保持するスロットの数が適正数になると定常状態になるため再び状態６ＳｅｃｏｎｄａｒｙＯｖｅｒｌｏａｄから状態５ＳｅｃｏｎｄａｒｙＥｘｅｃｕｔｉｎｇに遷移する（ステップｌを参照）。
【００４９】
当該スロットの複製を保持しないＰｅｅｒのうち１番負荷の低いＰｅｅｒで，　そのＰｅｅｒに当該スロットを移動した結果，　そのＰｅｅｒの負荷が当該スロットを保持していたＰｅｅｒと逆転しないようなＰｅｅｒが見つかった場合，　状態６ＳｅｃｏｎｄａｒｙＯｖｅｒｌｏａｄから状態７ＳｅｃｏｎｄａｒｙＭａｋｉｎｇＤｕｐｌｉｃａｔｉｏｎに遷移し，　当該Ｐｅｅｒにスロットを複製する（ステップｍを参照）。
【００５０】
一方、複製スロットの作成に失敗すると過剰にスロットを持つＰｅｅｒが依然存在することになるため、再び状態７ＳｅｃｏｎｄａｒｙＭａｋｉｎｇＤｕｐｌｉｃａｔｉｏｎから状態６ＳｅｃｏｎｄａｒｙＯｖｅｒｌｏａｄに遷移する（ステップｎを参照）。
【００５１】
スロットの数が過大であるＰｅｅｒにおいて、複製スロットの複製が完了すると、状態７ＳｅｃｏｎｄａｒｙＭａｋｉｎｇＤｕｐｌｉｃａｔｉｏｎから状態１０ＳｅｃｏｎｄａｒｙＣｌｏｓｅに遷移する（ステップｏを参照）。
【００５２】
複製スロットの消滅が完了すると最終状態Ｅ１に遷移する（ステップｐを参照）。
【００５３】
原本スロットを持つＰｅｅｒから複製スロットを原本スロットへ変更する要求が行われると、変更処理が行われるため、状態５ＳｅｃｏｎｄａｒｙＥｘｅｃｕｔｉｎｇから状態９ＳｅｃｏｎｄａｒｙＴｒａｎｓｆｅｒｉｎｇＰｒｉｍａｒｙＲｉｇｈｔｓに遷移する（ステップｑを参照）。
【００５４】
複製スロットから原本スロットへの変更処理が終了すると、定常状態となるため状態９ＳｅｃｏｎｄａｒｙＴｒａｎｓｆｅｒｉｎｇＰｒｉｍａｒｙＲｉｇｈｔｓから状態２ＰｒｉｍａｒｙＳｅｃｏｎｄａｒｙｓｕｆｆｉｃｉｅｎｔに遷移する（ステップｒを参照）。
【００５５】
Ｐｅｅｒが障害発生により、原本を保持したまま停止すると、保持されていた原本が足りなくなるため、状態９ＳｅｃｏｎｄａｒｙＴｒａｎｓｆｅｒｉｎｇＰｒｉｍａｒｙＲｉｇｈｔｓ、状態５ＳｅｃｏｎｄａｒｙＥｘｅｃｕｔｉｎｇ、状態６ＳｅｃｏｎｄａｒｙＯｖｅｒｌｏａｄ、状態７ＳｅｃｏｎｄａｒｙＭａｋｉｎｇＤｕｐｌｉｃａｔｉｏｎから状態８ＳｅｃｏｎｄａｒｙＰｒｉｍａｒｙＦａｉｌに遷移する（ステップｓ、ｔ、ｕ、ｖを参照）。
【００５６】
原本を持ったまま停止したＰｅｅｒが保持していた原本スロットの別のＰｅｅｒの複製スロットを原本スロットに変更し、かつ、状態８ＳｅｃｏｎｄａｒｙＰｒｉｍａｒｙＦａｉｌに遷移する前は、状態６ＳｅｃｏｎｄａｒｙＯｖｅｒｌｏａｄまたは状態７ＳｅｃｏｎｄａｒｙＭａｋｉｎｇＤｕｐｌｉｃａｔｉｏｎであった場合、Ｐｅｅｒ内のスロットの数が規定数以上であった過負荷状態に戻るため、状態８ＳｅｃｏｎｄａｒｙＰｒｉｍａｒｙＦａｉｌから状態６ＳｅｃｏｎｄａｒｙＯｖｅｒｌｏａｄに遷移する（ステップｗを参照）。
【００５７】
また、原本を持ったまま停止したＰｅｅｒが保持していた原本スロットの別のＰｅｅｒの複製スロットを原本スロットに変更し、定常状態となると、状態８ＳｅｃｏｎｄａｒｙＰｒｉｍａｒｙＦａｉｌに遷移する前は状態５ＳｅｃｏｎｄａｒｙＥｘｅｃｕｔｉｎｇまたは状態９ＳｅｃｏｎｄａｒｙＴｒａｎｓｆｅｒｉｎｇＰｒｉｍａｒｙＲｉｇｈｔｓであった場合、状態８ＳｅｃｏｎｄａｒｙＰｒｉｍａｒｙＦａｉｌから状態５ＳｅｃｏｎｄａｒｙＥｘｅｃｕｔｉｎｇに遷移する（ステップｘを参照）。
【００５８】
当該複製スロットが原本に変更され、かつ、複製スロットが規定数以下であると、複製を作成するため、状態８ＳｅｃｏｎｄａｒｙＰｒｉｍａｒｙＦａｉｌから状態０ＰｒｉｍａｒｙＳｅｃｏｎｄａｒｙＩｎｓｕｆｆｉｃｉｅｎｔに遷移する（ステップｙを参照）。
【００５９】
また、当該複製スロットが原本に変更され、複製スロットが規定数を満たすので定常状態になるため、状態８ＳｅｃｏｎｄａｒｙＰｒｉｍａｒｙＦａｉｌから状態２ＰｒｉｍａｒｙＳｅｃｏｎｄａｒｙＳｕｆｆｉｃｉｅｎｔに遷移する（ステップｚを参照）。
【００６０】
なお、上記実施例においては、Ｐｅｅｒの数を６とし、多重度を３としたが、Ｐｅｅｒの数をｎ（ｎは自然数）、多重度をｍ（ｍは２以上の整数）としてもよい。このとき、各Ｐｅｅｒは原本スロット１個と、複製スロットｍ−１個を持つ。
【００６１】
【発明の効果】
以上説明したように、請求項１並びに請求項６並びに請求項１２および請求項１３の発明によれば、データベースを使用してデータ処理を行う分散型データベースシステムであって、制御部とデータ格納部とを具備する第１〜第ｎのコンピュータをネットワークを介して接続して構成した分散型データベースシステムにおいて、データベースを第１〜第ｎのスロットに分割し、第１〜第ｎのスロットを各々原本スロットとして第１〜第ｎのコンピュータのデータ格納部に記憶させ、各原本スロットの複製を各々原本スロットが記憶されているデータ格納部と異なるデータ格納部に記憶させるので、サーバを介せずに多数のコンピュータシステムを統合しデータの分散化と多重化することができるので、安価で高い耐障害性を持つ分散型データベース制御システムおよび制御方法並びに制御プログラムを提供することができる。
【００６２】
請求項２および請求項８の発明によれば、第ｎ＋１のコンピュータが新たに使用可能となった場合に、第１〜第ｎのコンピュータの格納するスロットに関する情報に基づいて、第ｋ（ｋ＝１、２、・・・ｎ）のスロットの複製を第ｎ＋１のコンピュータのデータ格納部内に記憶させるので、格納するデータが特定のコンピュータに偏重することを防ぎ、他の新たに起動したコンピュータに分散されるので耐障害性にさらに優れた分散型データベース制御システム並びに制御方法を提供することができる。
【００６３】
請求項３および請求項９の発明によれば、第ｍのコンピュータが使用不能となった場合に、第ｍのスロットの複製いずれか１つを原本に変更し、該複製を有するコンピュータ以外のコンピュータのデータ格納部内に第ｍのスロットの複製を記憶させるので、原本を持ったコンピュータが停止した場合でも、他のコンピュータによって原本を復元することができ、耐障害性にさらに優れた分散型データベース制御システムおよび制御方法を提供することができる。
【００６４】
請求項４および請求項１０の発明によれば、第ｍのコンピュータが使用不能となった場合に、第ｍのコンピュータが記憶していた複製を有するコンピュータ以外のコンピュータのデータ格納部内に第ｍのコンピュータが記憶していた複製と同一の複製を記憶させるのでコンピュータの停止によって失われた複製データを復元することができ、耐障害性にさらに優れた分散型データベース制御システムおよび制御方法を提供することができる。
【００６５】
請求項５の発明および請求項１１の発明によれば、第ｍのコンピュータが復帰し使用可能となった場合に、原本スロットが２つ以上格納されているコンピュータから第ｍのコンピュータのデータ格納部内に第ｍのスロットの原本を記憶させるので、原本スロットが特定のコンピュータに偏重することを防ぎ、他の新たに起動したコンピュータに分散されるので耐障害性にさらに優れた分散型データベース制御システム並びに制御方法を提供することができる。
【００６６】
請求項７の発明によれば、前記第ｋのコンピュータの制御部がデータ更新を行う場合において、更新すべきデータを含む原本スロットを有するコンピュータがわからない場合、前記第１〜第ｎのコンピュータの制御部へ各々検索要求を送信するので、検索要求が当該コンピュータにおいて確実に受信され、検索性に優れた分散型データベース制御方法を提供することができる。
【図面の簡単な説明】
【図１】本発明の一実施例である分散データベース制御アルゴリズムを用いて実施された分散データベースシステムの構成を示すブロック図である。
【図２】Ｐｅｅｒの構成を示すブロック図である。
【図３】スロットの初期状態を示した図である。
【図４】各Ｐｅｅｒにデータ検索要求があった状態を示した図である。
【図５】各Ｐｅｅｒにデータ更新要求があった状態を示した図である。
【図６】Ｐｅｅｒが障害発生により停止した場合に他のＰｅｅｒが停止したＰｅｅｒのスロットを復元する過程を示した図である。
【図７】段階的に複製スロットを管理する権限を複製する過程を示した図である。
【図８】図７において進行中である複製が完了した状態を示した図である。
【図９】複製処理中にデータ検索要求があった状態を示した図である。
【図１０】複製処理中にデータ挿入要求があった状態を示した図である。
【図１１】複製処理中にデータ更新要求があった状態を示した図である。
【図１２】複製処理中に障害が発生した状態を示した図である。
【図１３】スロットの状態遷移を示した図である。
【符号の説明】
Ａ〜Ｆ…Ｐｅｅｒ
１０…ネットワーク
２１…通信部
２２…データ制御部
２３…スロットデータ格納部
２４…状態格納部

[0001]
[Technical field of the invention]
The present invention relates to a data control device, a control method, and a control program for constructing a distributed database using a large number of computers connected via a network.
[0002]
[Prior art]
Conventionally, to improve the fault tolerance of a distributed database system, methods have been proposed in which a plurality of servers on a network are linked so that they function as one large system. The servers process requests in parallel, which shortens the overall processing time, and replicated data is distributed across the servers so that, if one server fails, its data can be restored from the other servers.
[0003]
Users need to be able to handle the distributed database described above as if it were a single database. One way to meet this requirement is to construct a virtual area in which the database appears to be integrated. Conventionally, the virtual area has either been placed on a center server, or each server has held a copy of all the data in the virtual area.
In large-scale distributed databases in which each piece of information is held by only one system, the data in the virtual area has been mirrored for each system.
[0004]
However, expanding the capacity of a server system through data replication, mirroring, and the like requires an exponential increase in cost. In such conventional systems, achieving high speed and fault tolerance at the same time means relying on expensive systems to ensure fault tolerance, which raises the cost.
[0005]
[Problems to be solved by the invention]
The present invention has been made in view of these circumstances. Its object is to provide an inexpensive, highly fault-tolerant distributed database control device, control method, and control program that integrate a large number of computer systems via a network.
[0006]
[Means for Solving the Problems]
The present invention has been made to solve the above problems. The invention according to claim 1 is a distributed database system that performs data processing using a database and is configured by connecting first to n-th computers, each having a control unit and a data storage unit, via a network, characterized in that the database is divided into first to n-th slots, the first to n-th slots are stored as original slots in the data storage units of the first to n-th computers, respectively, and a copy of each original slot is stored in a data storage unit different from the one in which that original slot is stored.
[0007]
The invention according to claim 2 is the invention according to claim 1, characterized in that, when an (n+1)-th computer newly becomes usable, a copy of the k-th (k = 1, 2, ..., n) slot is stored in the data storage unit of the (n+1)-th computer, based on information on the slots stored in the first to n-th computers.
[0008]
The invention according to claim 3 is the invention according to claim 1, characterized in that, when the m-th (m = 1, 2, ..., n) computer becomes unusable, one of the copies of the m-th slot is changed to the original, and a copy of the m-th slot is stored in the data storage unit of a computer other than the computers having such a copy.
[0009]
The invention according to claim 4 is the invention according to claim 3, characterized in that, when the m-th computer becomes unusable, the copy that the m-th computer had stored is stored in the data storage unit of a computer other than the computers having the same copy.
[0010]
The invention according to claim 5 is the invention according to claim 3 or claim 4, characterized in that, when the m-th computer is restored and becomes usable, the original of the m-th slot is stored in the data storage unit of the m-th computer from a computer in which two or more original slots are stored.
[0011]
The invention according to claim 6 relates to a distributed database system that performs data processing using a database and is configured by connecting first to n-th computers, each having a control unit and a data storage unit, via a network. The database is divided into first to n-th slots, the first to n-th slots are stored as original slots in the data storage units of the first to n-th computers, respectively, and a copy of each original slot is stored in a data storage unit different from the one in which that original slot is stored. When the control unit of the k-th (k = 1, 2, ..., n) computer performs a data search, it transmits a search request to each of the control units of the first to n-th computers; each of those control units searches its own data storage unit and transmits the search result to the control unit of the k-th computer. When the control unit of the k-th computer performs a data update, it transmits an update request to the control unit of the computer having the original slot that contains the data to be updated; the control unit that receives the request updates the data of the original slot in its data storage unit and transmits an update request to the computers having duplicate slots of that original slot, and the control units of those computers update the data of the duplicate slots in their data storage units. When the control unit of the k-th computer performs a data insertion, the data is inserted into an original slot holding a small amount of data among the first to n-th original slots.
[0012]
The invention according to claim 7 is the invention according to claim 6, characterized in that, when the control unit of the k-th computer performs a data update and the computer having the original slot that contains the data to be updated is not known, a search request is transmitted to each of the control units of the first to n-th computers.
[0013]
The invention according to claim 8 is the invention according to claim 6 or claim 7, characterized in that, when an (n+1)-th computer newly becomes usable, a copy of the k-th (k = 1, 2, ..., n) slot is stored in the data storage unit of the (n+1)-th computer, based on information on the slots stored in the first to n-th computers.
[0014]
The invention according to claim 9 is the invention according to claim 7 or claim 8, characterized in that, when the m-th (m = 1, 2, ..., n) computer becomes unusable, one of the copies of the m-th slot is changed to the original, and a copy of the m-th slot is stored in the data storage unit of a computer other than the computers having such a copy.
[0015]
The invention according to claim 10 is the invention according to claim 9, characterized in that, when the m-th computer becomes unusable, the copy that the m-th computer had stored is stored in the data storage unit of a computer other than the computers having the same copy.
[0016]
The invention according to claim 11 is the invention according to claim 9 or claim 10, characterized in that, when the m-th computer is restored and becomes usable, the original of the m-th slot is stored in the data storage unit of the m-th computer from a computer in which two or more original slots are stored.
[0017]
The invention according to claim 12 is a distributed database control program used in a distributed database system that performs data processing using a database and is configured by connecting first to n-th computers, each having a control unit and a data storage unit, via a network. The database is divided into first to n-th slots, the first to n-th slots are stored as original slots in the data storage units of the first to n-th computers, respectively, and a copy of each original slot is stored in a data storage unit different from the one in which that original slot is stored. When the control unit of the k-th (k = 1, 2, ..., n) computer performs a data search, the program transmits a search request to each of the control units of the first to n-th computers. When the control unit of the k-th computer performs a data update, the program transmits an update request either to the control unit of the computer having the original slot that contains the data to be updated or to the control units of all of the first to n-th computers. When the control unit of the k-th computer performs a data insertion, the program inserts the data into an original slot holding a small amount of data among the first to n-th original slots.
[0018]
The invention according to claim 13 is a distributed database control program used in a distributed database system that performs data processing using a database and is configured by connecting first to n-th computers, each having a control unit and a data storage unit, via a network. The database is divided into first to n-th slots, the first to n-th slots are stored as original slots in the data storage units of the first to n-th computers, respectively, and a copy of each original slot is stored in a data storage unit different from the one in which that original slot is stored. When the control unit of the k-th (k = 1, 2, ..., n) computer performs a data search, the search request is received from the control unit of the k-th computer, each of the control units of the first to n-th computers searches its own data storage unit, and the search result is transmitted to the control unit of the k-th computer. When the control unit of the k-th computer performs a data update, the update request is received from the control unit of the k-th computer; a computer that has the original slot containing the data to be updated updates the requested original data and transmits an update request to the computers having duplicate slots of that original slot, and the control units of those computers update the data of the duplicate slots in their data storage units. When the control unit of the k-th computer performs a data insertion, the data is inserted into an original slot holding a small amount of data among the first to n-th original slots.
[0019]
[Embodiments of the invention]
Hereinafter, an embodiment of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing the overall configuration of a distributed database system to which a distributed database control method according to an embodiment of the present invention is applied. The system consists of six computer systems (hereinafter referred to as Peers) A to F interconnected via a network 10.
[0020]
FIG. 2 is a block diagram showing the configuration of Peers A to F. In this figure, reference numeral 21 denotes a communication unit for communicating with other peers via the network 10, reference numeral 22 denotes a data control unit for performing various data processing, and reference numeral 23 denotes a slot data storage unit. Here, a slot refers to a bundle of data. Reference numeral 24 denotes a state storage unit, which stores information on the number and type of slot data in the slot data storage unit 23.
[0021]
Next, the slot data stored in the slot data storage unit 23 will be described. Suppose that a virtual area is constructed over the data storage area of the entire system of FIG. 1 and is called virtual area D1. As shown in FIG. 3, virtual area D1 is divided into six equal parts, SlotA to SlotF, and SlotA to SlotF are stored in the slot data storage units 23 of PeerA to PeerF, respectively. This data is called the original. Next, as shown in FIG. 3, SlotA', a copy of SlotA, is stored in the slot data storage units 23 of PeerB and PeerC; SlotB', a copy of SlotB, is stored in the slot data storage units 23 of PeerC and PeerD; ...; and SlotF', a copy of SlotF, is stored in the slot data storage units 23 of PeerA and PeerB. In this way, for each original slot, two copies are stored on different peers. The total number of original and copy slots, 3, is defined as the multiplicity of the data.
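The ring-style layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `initial_placement`, the dictionary shape, and the rule "copies go to the next peers in ring order" are inferred from the FIG. 3 description and are not taken from the patent's implementation.

```python
def initial_placement(peers, multiplicity=3):
    """Return {peer: {"original": slot, "copies": [slots]}} for a ring layout."""
    n = len(peers)
    if multiplicity > n:
        raise ValueError("multiplicity cannot exceed the number of peers")
    layout = {p: {"original": f"Slot{p}", "copies": []} for p in peers}
    for i, p in enumerate(peers):
        # Copies of peer i's original go to the next (multiplicity - 1) peers.
        for step in range(1, multiplicity):
            holder = peers[(i + step) % n]
            layout[holder]["copies"].append(f"Slot{p}'")
    return layout


layout = initial_placement(list("ABCDEF"))
print(layout["A"])  # PeerA holds SlotA plus the copies SlotE' and SlotF'
```

With six peers and multiplicity 3, each peer ends up with one original and two copies, which matches the contents of PeerA's slot data storage unit given in paragraph [0022].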
[0022]
The state storage unit 24 of PeerA stores information indicating that SlotA, SlotE', and SlotF' are stored in its own slot data storage unit 23, that SlotA' is stored in PeerB and PeerC, and that SlotE is stored in PeerE, SlotF in PeerF, SlotE' in PeerF, and SlotF' in PeerE. The same applies to the state storage units 24 of PeerB to PeerF.
[0023]
Next, the operation of the above-described system will be described.
(1) Data search
When the data control unit 22 of any of Peers A to F performs a data search, it transmits a search request to all Peers A to F, as shown in FIG. 4. Each of Peers A to F searches the original (SlotA to SlotF) in its own slot data storage unit 23 and transmits the search result to the requester.
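A minimal sketch of this broadcast search follows. The in-memory dictionaries, the `predicate` callback, and the name `search_all` are assumptions made for illustration; the source does not specify a query format or a message protocol.

```python
def search_all(peers, predicate):
    """Broadcast a search: every peer scans only its own original slot."""
    results = []
    for name, peer in peers.items():
        for key, value in peer["original"].items():
            if predicate(key, value):
                # Each peer returns its hits to the requester.
                results.append((name, key, value))
    return results


peers = {
    "A": {"original": {"k1": "apple"}},
    "B": {"original": {"k2": "banana"}},
    "C": {"original": {"k3": "avocado"}},
}
hits = search_all(peers, lambda key, value: value.startswith("a"))
print(hits)
```

Because only originals are scanned, a record held in three places (one original, two copies) is reported once.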
[0024]
(2) Data update
For example, when the data control unit 22 of PeerA performs a data update and the location of the original slot containing the data to be updated is stored in the state storage unit 24, PeerA sends an update request message directly to the peer holding that original slot. If that peer is, for example, PeerC, the update request message is transmitted to PeerC. On receiving the message, PeerC first updates the original SlotC in its slot data storage unit 23 according to the message, and then transmits an update request message to PeerD and PeerE, which hold SlotC'. PeerD and PeerE receive the update request message and update SlotC'.
If the location of the original slot containing the data to be updated is not known, the update request message is transmitted to all Peers A to F, as shown in FIG. 5. The peer holding the original slot that contains the data to be updated updates that original slot and then transmits an update request message to the peers holding the duplicate slots.
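The two update paths (direct when the location is known, broadcast otherwise) can be sketched as follows. The data model is an assumption for illustration: each peer is a dictionary with an `original` and a `copies` record map, and `directory` stands in for the location information in the state storage unit 24.

```python
def update(peers, directory, key, value):
    """Update the original record, then forward the update to duplicate holders.

    peers: {name: {"original": {key: value}, "copies": {key: value}}}
    directory: {key: peer name} for records whose original location is known.
    """
    owner = directory.get(key)
    # Known location: contact that peer only. Unknown: send to every peer.
    candidates = [owner] if owner is not None else list(peers)
    touched = []
    for name in candidates:
        original = peers[name]["original"]
        if key not in original:
            continue  # this peer does not hold the original record
        original[key] = value
        touched.append(name)
        # The original's holder forwards the update to peers holding duplicates.
        for other, peer in peers.items():
            if other != name and key in peer["copies"]:
                peer["copies"][key] = value
                touched.append(other)
    return touched


peers = {
    "A": {"original": {"a1": "x"}, "copies": {}},
    "C": {"original": {"c1": "old"}, "copies": {}},
    "D": {"original": {}, "copies": {"c1": "old"}},
    "E": {"original": {}, "copies": {"c1": "old"}},
}
direct = update(peers, {"c1": "C"}, "c1", "new")  # location known
broadcast = update(peers, {}, "c1", "newer")      # location unknown
```

In both calls the original on PeerC is written first and the duplicates on PeerD and PeerE follow, mirroring the order given in the text.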
[0025]
(3) Data insertion
When the data control unit 22 of any of Peers A to F performs a data insertion, the peer inserts the data into the virtual area. After it is confirmed that the primary key value of the data to be inserted does not duplicate that of any data already stored, the data inserted into the virtual area is placed in whichever of SlotA to SlotF holds a small amount of data.
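The insertion rule can be illustrated with the sketch below. It assumes that "amount of data" can be approximated by the record count and that ties are broken by slot name; neither detail is specified in the source, and the function name `insert` is hypothetical.

```python
def insert(slots, key, value):
    """Insert a record into the least-filled original slot after a key check."""
    # Reject the insert if the primary key already exists in any slot.
    if any(key in records for records in slots.values()):
        raise KeyError(f"duplicate primary key: {key}")
    # Assumption: data amount is measured by record count; ties go to the
    # first slot in name order.
    target = min(sorted(slots), key=lambda name: len(slots[name]))
    slots[target][key] = value
    return target


slots = {
    "SlotA": {"a1": 1, "a2": 2},
    "SlotB": {"b1": 1},
    "SlotC": {"c1": 1},
}
chosen = insert(slots, "x1", 9)
print(chosen)
```

Steering new records toward the emptiest slot keeps the originals roughly balanced across peers without a central coordinator.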
[0026]
(4) Peer status monitoring
The data control units 22 of the peers A to F mutually monitor the status of all other peers at regular time intervals.
[0027]
Next, the operation of the system when any of Peers A to F is stopped due to a failure will be described.
(5) Data restoration
For example, when PeerA stops due to a failure, PeerB to PeerF recognize that PeerA has stopped and perform data restoration as shown in FIG. 6. First, PeerB, which holds SlotA', a copy of the original SlotA lost when PeerA stopped, changes its copy SlotA' into the original SlotA. Next, so that two copies of the original SlotA are again stored on different peers, PeerB confirms that PeerF does not hold a copy SlotA' and then creates a copy SlotA' of the original SlotA. Likewise, PeerE creates a copy SlotE' after confirming that PeerD does not hold a copy SlotE', and PeerF creates a copy SlotF' after confirming that PeerE does not hold a copy SlotF'.
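The restoration steps (promote a surviving copy, then re-create missing copies on peers that do not yet hold one) can be sketched as below. The maps `originals` and `copies`, the function name `repair`, and the least-loaded placement rule are assumptions for illustration, so the destinations chosen here need not match those shown in FIG. 6.

```python
def repair(peers, originals, copies, failed, multiplicity=3):
    """Restore originals and duplicates after `failed` stops (mutates the maps).

    originals: {slot: peer holding the original}
    copies:    {slot: [peers holding a duplicate]}
    """
    alive = [p for p in peers if p != failed]
    for slot in copies:
        copies[slot] = [p for p in copies[slot] if p != failed]
    for slot, owner in originals.items():
        if owner == failed:
            # Promote one surviving duplicate to be the new original.
            originals[slot] = copies[slot].pop(0)

    def load(peer):
        held_originals = sum(owner == peer for owner in originals.values())
        held_copies = sum(peer in holders for holders in copies.values())
        return held_originals + held_copies

    for slot in copies:
        while len(copies[slot]) < multiplicity - 1:
            holders = set(copies[slot]) | {originals[slot]}
            candidates = [p for p in alive if p not in holders]
            if not candidates:
                break  # not enough peers left to reach the multiplicity
            # Assumption: place the new duplicate on the least-loaded candidate.
            copies[slot].append(min(candidates, key=load))
    return originals, copies


peers = list("ABCDEF")
originals = {p: p for p in peers}  # slot "A" stands for SlotA, and so on
copies = {p: [peers[(i + 1) % 6], peers[(i + 2) % 6]] for i, p in enumerate(peers)}
repair(peers, originals, copies, failed="A")
print(originals["A"], copies["A"])
```

After the call, PeerB holds the original of SlotA and every slot is back to one original plus two duplicates on distinct surviving peers.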
[0028]
Next, the operation of the system when the peer that has been stopped due to the occurrence of a failure returns will be described.
(6) Data distribution
When a peer is restored, if there is a peer holding two or more original slots (an original-slot-excess peer), that excess peer inserts an original slot that it holds into the restored peer. In addition, if the number of duplicate slots across all peers falls short of the multiplicity, the peer holding the original of the missing duplicate slot copies that original slot to the restored peer.
[0029]
Next, the slot duplication process in the above system will be described.
(7) Step-by-step data replication
If the average interval between Peer stoppages is sufficiently longer than the life cycle of a single data record, the duplication process is performed stepwise. In FIG. 7, PeerD is the copy source of SlotB', and SlotB' is being copied record by record to PeerE, the copy destination. SlotB' consists of records b1, b2, b3, and b4. Copying of record b1 has been completed, and records b2, b3, and b4 will be copied next, one record at a time. FIG. 8 shows the state after the copy in progress in FIG. 7 has completed: PeerD deletes SlotB' after confirming that records b1, b2, b3, and b4 have all been copied to PeerE.
[0030]
(8) Data search during stepwise data replication
A data search request is directed to a duplicate slot when the original slot cannot be searched because of a failure. In FIG. 9, when PeerF issues a data search request for record b1 of SlotB while PeerB, which holds the original SlotB, is stopped, both the copy source PeerD and the copy destination PeerE are searched while the copy is in progress. If the data is found at the copy destination PeerE, that value is sent to PeerF; if it is not found at PeerE, the search result from the copy source PeerD is sent to PeerF.
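A minimal sketch of record-by-record copying and of a search issued while the copy is in progress might look like this. The dictionaries standing in for SlotB' on PeerD and PeerE, the record values, and the function names are hypothetical.

```python
from typing import Dict, Optional

Records = Dict[str, str]


def copy_next_record(source: Records, destination: Records) -> bool:
    """Copy one not-yet-copied record; return False when nothing is left to copy."""
    for key in sorted(source):
        if key not in destination:
            destination[key] = source[key]
            return True
    return False


def search_during_copy(key: str, source: Records, destination: Records) -> Optional[str]:
    """Search both sides; prefer the copy destination, fall back to the copy source."""
    if key in destination:
        return destination[key]
    return source.get(key)


source: Records = {"b1": "v1", "b2": "v2", "b3": "v3", "b4": "v4"}  # SlotB' on PeerD
destination: Records = {}                                          # SlotB' on PeerE

copy_next_record(source, destination)                     # b1 has been copied
found_b1 = search_during_copy("b1", source, destination)  # answered by the destination
found_b3 = search_during_copy("b3", source, destination)  # answered by the source

while copy_next_record(source, destination):
    pass
source.clear()  # the copy source deletes its slot once every record has been copied
```

Searching both sides means a record is found regardless of how far the copy has progressed.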
[0031]
(9) Data insertion during stepwise data replication
When a data insertion request is made for a duplicate slot, the copy source does not need to hold the new data, so the insertion is performed only at the copy destination. In FIG. 10, PeerB issues a data insertion request while the copy source PeerD is copying record b2 of SlotB' to the copy destination PeerE; the record is inserted from PeerB into SlotB' on PeerE, the copy destination.
[0032]
(10) Data update during stepwise data replication
When a data update request is made for a duplicate slot, the copy source does not need to manage the new data, so the update is carried out as an insertion at the copy destination only, and the copy source deletes the pre-update record. In FIG. 11, PeerB issues a data update request while the copy source PeerD is copying record b2 of SlotB' to the copy destination PeerE; the updated record b2 from PeerB is inserted into SlotB' on PeerE, the copy destination.
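The insertion and update rules during stepwise replication can be sketched as below. The function names, the record values, and the extra record b5 are hypothetical; the point illustrated is that new and updated data go only to the copy destination, and that the remaining copy must not overwrite them.

```python
from typing import Dict

Records = Dict[str, str]


def insert_during_copy(key: str, value: str, destination: Records) -> None:
    """New data is written only at the copy destination."""
    destination[key] = value


def update_during_copy(key: str, value: str, source: Records, destination: Records) -> None:
    """The updated record is inserted at the destination; the source drops the old one."""
    destination[key] = value
    source.pop(key, None)


def copy_remaining(source: Records, destination: Records) -> None:
    """Copy what is left without overwriting records the destination already holds."""
    for key in sorted(source):
        destination.setdefault(key, source[key])


source: Records = {"b1": "old1", "b2": "old2", "b3": "old3"}
destination: Records = {"b1": "old1"}  # b1 was copied before the requests arrived

update_during_copy("b2", "new2", source, destination)
insert_during_copy("b5", "new5", destination)
copy_remaining(source, destination)
```

Because the source no longer holds the stale b2, the later copy steps cannot reintroduce it.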
[0033]
(11) Data restoration during stepwise data replication
If one of the two peers involved in a copy stops due to a failure, the handling depends on which peer stopped. If the stopped Peer is the copy destination, the ongoing copy process is stopped and the copy source Peer performs restoration. After restoration, the copy source Peer determines a new copy destination and performs the copy from the beginning. If the stopped Peer is the copy source, the ongoing copy is temporarily interrupted, a Peer capable of copying the unprocessed records is determined, and record copying then resumes. In FIG. 12, the copy source is the Peer that has stopped, so the ongoing copy is temporarily interrupted, and PeerB, which holds SlotB, resumes the copy from record b2 of SlotB'.
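The two recovery cases can be sketched as follows, assuming a dictionary per slot; the function names and record values are hypothetical and the selection of the new destination or substitute source is left out.

```python
from typing import Dict, List

Records = Dict[str, str]


def restart_on_new_destination(source: Records) -> Records:
    """Copy destination stopped: the copy starts over on a fresh destination."""
    new_destination: Records = {}
    for key in sorted(source):
        new_destination[key] = source[key]
    return new_destination


def resume_from_other_holder(other_holder: Records, destination: Records) -> List[str]:
    """Copy source stopped: another holder of the slot copies only unprocessed records."""
    resumed = [key for key in sorted(other_holder) if key not in destination]
    for key in resumed:
        destination[key] = other_holder[key]
    return resumed


slot_b: Records = {"b1": "v1", "b2": "v2", "b3": "v3", "b4": "v4"}  # SlotB on PeerB
destination: Records = {"b1": "v1"}  # b1 was copied before the copy source stopped

resumed = resume_from_other_holder(slot_b, destination)
fresh = restart_on_new_destination(slot_b)
```

In the resumed case only b2, b3, and b4 are transferred, which mirrors PeerB picking up the copy from record b2.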
[0034]
The states of a slot in the Peer operations described above are defined below, followed by a description of how the slot state changes. The slot states shown in FIG. 13 are as follows.
Code 0 PrimarySecondaryInsufficient
Code 1 PrimaryMakingDuplication
Code 2 PrimarySecondarySufficient
Code 3 PrimaryTransferringPrimaryRights
Code 4 SecondaryPreparing
Code 5 SecondaryExecuting
Code 6 SecondaryOverload
Code 7 SecondaryMakingDuplication
Code 8 SecondaryPrimaryFail
Code 9 SecondaryTransferringPrimaryRights
Code 10 SecondaryClose
[0035]
Reference numeral 0 indicates a state in which the slot is an original and the number of copies of the slot is less than the specified value. Reference numeral 1 indicates a state in which the slot is an original and a copy is being created on another peer. Reference numeral 2 indicates a steady state in which the slot is an original and the number of duplicate slots equals the specified number. Reference numeral 3 indicates a state in which the slot is an original and one of its duplicate slots is being changed into the original slot in place of this slot.
[0036]
Reference numeral 4 indicates a state in which a duplicate slot is being created in response to a request from another peer whose slot is in state 1 or state 7. Reference numeral 5 indicates a state in which duplication has completed and the slot operates normally as a duplicate slot. Reference numeral 6 indicates a state in which the slot is a duplicate and the peer is overloaded because the number of slots it holds is equal to or greater than the specified number. Reference numeral 7 indicates a state in which the slot is a duplicate and a copy of it is being created on another peer. Reference numeral 8 indicates a state in which the slot is a duplicate, the peer holding the corresponding original slot has stopped, and the peers holding the corresponding duplicate slots are determining which of them will make its duplicate slot the original slot. Reference numeral 9 indicates a state in which the slot is a duplicate, the corresponding original slot is in state 3, and the roles of original and copy are being exchanged. Reference numeral 10 indicates a state in which an overloaded Peer has created a copy of the duplicate slot on another Peer and, as a result, the copy source slot disappears. Reference numeral P1 indicates the state before an original slot is generated, reference numeral P2 indicates the state before a duplicate slot is generated, and E1 indicates the state after a slot has disappeared.
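For reference, the eleven slot states could be encoded as an enumeration along these lines. The Python names and the two helper functions are illustrative assumptions; only the numeric codes and their meanings come from the description above.

```python
from enum import IntEnum


class SlotState(IntEnum):
    """Slot states 0 to 10; P1, P2, and E1 are pseudo-states and are not modeled here."""
    PRIMARY_SECONDARY_INSUFFICIENT = 0
    PRIMARY_MAKING_DUPLICATION = 1
    PRIMARY_SECONDARY_SUFFICIENT = 2
    PRIMARY_TRANSFERRING_PRIMARY_RIGHTS = 3
    SECONDARY_PREPARING = 4
    SECONDARY_EXECUTING = 5
    SECONDARY_OVERLOAD = 6
    SECONDARY_MAKING_DUPLICATION = 7
    SECONDARY_PRIMARY_FAIL = 8
    SECONDARY_TRANSFERRING_PRIMARY_RIGHTS = 9
    SECONDARY_CLOSE = 10


def is_original(state: SlotState) -> bool:
    """States 0 to 3 describe an original slot; states 4 to 10 describe a duplicate."""
    return state <= SlotState.PRIMARY_TRANSFERRING_PRIMARY_RIGHTS


def is_steady(state: SlotState) -> bool:
    """States 2 and 5 are the steady states for originals and duplicates respectively."""
    return state in (SlotState.PRIMARY_SECONDARY_SUFFICIENT, SlotState.SECONDARY_EXECUTING)
```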
[0037]
When a new Peer starts up in a state where there is no excessive original slot on the network, the condition for creating an original slot is satisfied, so that a transition is made from state P1 to state 0 PrimarySecondaryInsufficient (see step a in FIG. 13).
[0038]
When the peer with the lowest load among the peers that do not hold a duplicate of the slot has been identified and that peer is ready to create a duplicate slot, duplication starts. The state therefore transitions from state 0 PrimarySecondaryInsufficient to state 1 PrimaryMakingDuplication (see step b).
[0039]
On the other hand, if duplicate slots are still insufficient after the duplication completes, or if the duplication fails partway, the state transitions from state 1 PrimaryMakingDuplication to state 0 PrimarySecondaryInsufficient (see step c).
[0040]
When creation of the duplicate slot completes and the number of duplicates reaches the specified number, the slot enters a steady state, so the state transitions from state 1 PrimaryMakingDuplication to state 2 PrimarySecondarySufficient (see step d).
[0041]
When the peer having the replication slot stops, the number of replications does not reach the specified number, so that the state transitions from the state 2 PrimarySecondarySufficient to the state 0 PrimarySecondaryInsufficient (see step e).
[0042]
If there is a Peer holding two or more original slots, the state transitions from state 2 PrimarySecondarySufficient to state 3 PrimaryTransferringPrimaryRights in order to exchange the roles of the original slot on that Peer and the corresponding duplicate slot (see step f).
[0043]
If the change from duplicate slot to original slot fails because of a failure, the duplicate is deleted and the number of duplicates no longer reaches the specified number, so the state transitions from state 3 PrimaryTransferringPrimaryRights to state 0 PrimarySecondaryInsufficient (see step g).
[0044]
On the other hand, if the change from duplicate slot to original slot succeeds, the slot changes from an original slot into a duplicate slot in a steady state, so the state transitions from state 3 PrimaryTransferringPrimaryRights to state 5 SecondaryExecuting (see step h).
[0045]
When the copy source slot is in state 1, that is, the peer with the lowest load among the peers that do not hold a duplicate of the slot has been identified and is ready to create a duplicate slot, or when the copy source slot is in state 7, that is, a peer with an excessive number of slots has begun preparing to create a duplicate slot on another peer, the state transitions from P2 to state 4 SecondaryPreparing (see step i).
[0046]
When generation of the duplicate slot completes and the duplicate slot becomes operable, the slot enters a steady state, and the state transitions from state 4 SecondaryPreparing to state 5 SecondaryExecuting (see step j).
[0047]
When the number of slots held by the peer becomes excessive as a result of the duplication, the state transitions from state 5 SecondaryExecuting to state 6 SecondaryOverload (see step k).
[0048]
When a copy of another duplicate slot held by the same Peer has been created on another Peer and that duplicate slot has been deleted, the number of slots held by the Peer returns to an appropriate number, so the state transitions back from state 6 SecondaryOverload to state 5 SecondaryExecuting (see step l).
[0049]
When, among the peers that do not hold a copy of the slot, the peer with the lowest load is one whose load would not overtake that of the peer currently holding the slot even after the slot is moved to it, the state transitions from state 6 SecondaryOverload to state 7 SecondaryMakingDuplication, and the slot is duplicated on that peer (see step m).
[0050]
On the other hand, if creation of the duplicate slot fails, the peer still holds an excessive number of slots, so the state transitions back from state 7 SecondaryMakingDuplication to state 6 SecondaryOverload (see step n).
[0051]
When duplication of the duplicate slot from the Peer with an excessive number of slots completes, the state transitions from state 7 SecondaryMakingDuplication to state 10 SecondaryClose (see step o).
[0052]
When the duplication slot disappears, the state transits to the final state E1 (see step p).
[0053]
When a peer holding an original slot requests that a duplicate slot be changed into the original slot, the change process is performed, so the state transitions from state 5 SecondaryExecuting to state 9 SecondaryTransferringPrimaryRights (see step q).
[0054]
When the change from duplicate slot to original slot completes, the slot enters a steady state, so the state transitions from state 9 SecondaryTransferringPrimaryRights to state 2 PrimarySecondarySufficient (see step r).
[0055]
If a Peer stops due to a failure while holding original data, the original data it held becomes unavailable (see ...).
[0056]
If the slot was in state 6 SecondaryOverload or state 7 SecondaryMakingDuplication before transitioning to state 8 SecondaryPrimaryFail, and a duplicate slot on another Peer is changed into the original slot in place of the original held by the stopped Peer, the slot returns to the overload state in which the number of slots in the Peer is equal to or greater than the specified number. The state therefore transitions from state 8 SecondaryPrimaryFail to state 6 SecondaryOverload (see step w).
[0057]
Similarly, if the slot was in state 5 SecondaryExecuting or state 9 SecondaryTransferringPrimaryRights before transitioning to state 8 SecondaryPrimaryFail, and a duplicate slot on another Peer is changed into the original slot in place of the original held by the stopped Peer, the slot returns to a steady state. The state therefore transitions from state 8 SecondaryPrimaryFail to state 5 SecondaryExecuting (see step x).
[0058]
When this duplicate slot is changed into the original and the number of duplicate slots falls short of the specified number, the state transitions from state 8 SecondaryPrimaryFail to state 0 PrimarySecondaryInsufficient so that a copy is created (see step y).
[0059]
When this duplicate slot is changed into the original and the number of duplicate slots satisfies the specified number, the slot enters a steady state, so the state transitions from state 8 SecondaryPrimaryFail to state 2 PrimarySecondarySufficient (see step z).
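The transitions described in steps a to r and w to z can be collected into a small lookup table. The sketch below is an illustration, not the patented implementation: the `TRANSITIONS` dictionary and helper names are hypothetical, states are written as the strings used in the text ("P1", "0" to "10", "E1"), and the steps leading into state 8 are omitted because the text above does not spell them out.

```python
from typing import Dict, List, Set, Tuple

# step letter -> (state before, state after), as described for FIG. 13
TRANSITIONS: Dict[str, Tuple[str, str]] = {
    "a": ("P1", "0"), "b": ("0", "1"), "c": ("1", "0"), "d": ("1", "2"),
    "e": ("2", "0"), "f": ("2", "3"), "g": ("3", "0"), "h": ("3", "5"),
    "i": ("P2", "4"), "j": ("4", "5"), "k": ("5", "6"), "l": ("6", "5"),
    "m": ("6", "7"), "n": ("7", "6"), "o": ("7", "10"), "p": ("10", "E1"),
    "q": ("5", "9"), "r": ("9", "2"),
    # Transitions into state 8 are not listed in the text and are left out.
    "w": ("8", "6"), "x": ("8", "5"), "y": ("8", "0"), "z": ("8", "2"),
}


def next_states(state: str) -> Set[str]:
    """Return every state reachable from `state` in one listed step."""
    return {dst for src, dst in TRANSITIONS.values() if src == state}


def is_valid_path(path: List[str]) -> bool:
    """Check that each consecutive pair of states is a listed transition."""
    return all(b in next_states(a) for a, b in zip(path, path[1:]))


original_lifecycle = ["P1", "0", "1", "2"]
overload_handoff = ["P2", "4", "5", "6", "7", "10", "E1"]
```

A table like this makes it easy to check that a sequence of slot states follows the described steps, for example that an original slot reaches the steady state 2 only through states 0 and 1.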
[0060]
In the above embodiment, the number of peers is set to 6 and the multiplicity is set to 3. However, the number of peers may be set to n (n is a natural number) and the multiplicity may be set to m (m is an integer of 2 or more). At this time, each Peer has one original slot and m-1 duplicate slots.
[0061]
[Effects of the invention]
As described above, according to the first, sixth, twelfth, and thirteenth aspects of the present invention, in a distributed database system that performs data processing using a database and is constructed by connecting first to n-th computers, each having a control unit and a data storage unit, via a network, the database is divided into first to n-th slots, the first to n-th slots are stored as original slots in the data storage units of the first to n-th computers, respectively, and a copy of each original slot is stored in a data storage unit different from the one in which that original slot is stored. Because many computer systems can thus be integrated to distribute and multiplex the data, an inexpensive and highly fault-tolerant distributed database control system, control method, and control program can be provided.
[0062]
According to the second and eighth aspects of the present invention, when an (n+1)th computer newly becomes usable, a copy of the k-th (k = 1, 2, ..., n) slot is stored in the data storage unit of the (n+1)th computer based on the information on the slots stored in the first to n-th computers. This prevents stored data from being concentrated on a specific computer and distributes it to newly started computers, so a distributed database control system and control method with better fault tolerance can be provided.
[0063]
According to the third and ninth aspects of the present invention, when the m-th computer becomes unusable, one of the copies of the m-th slot is changed into the original, and a copy of the m-th slot is stored in the data storage unit of a computer other than the computers holding a copy. Therefore, even if the computer holding the original stops, the original can be restored on another computer, and a distributed database control system and control method with better fault tolerance can be provided.
[0064]
According to the fourth and tenth aspects of the present invention, when the m-th computer becomes unusable, a copy identical to each copy that the m-th computer stored is stored in the data storage unit of a computer other than the computers already holding that copy. The duplicated data lost when the computer stopped can therefore be restored, and a distributed database control system and control method with better fault tolerance can be provided.
[0065]
According to the invention of claim 5 and the invention of claim 11, when the m-th computer is restored and becomes usable, the original of the m-th slot is stored in the data storage unit of the m-th computer from a computer in which two or more original slots are stored. This prevents original slots from being concentrated on a specific computer and distributes them to newly activated computers, so a distributed database control system and control method with better fault tolerance can be provided.
[0066]
According to the invention of claim 7, when the control unit of the k-th computer updates data and the computer holding the original slot that contains the data to be updated is not known, a search request is transmitted to the control unit of each of the first to n-th computers. The search request is therefore reliably received by the relevant computer, and a distributed database control method with excellent searchability can be provided.
[Brief description of the drawings]
FIG. 1 is a block diagram showing the configuration of a distributed database system implemented using a distributed database control algorithm according to one embodiment of the present invention.
FIG. 2 is a block diagram showing a configuration of Peer.
FIG. 3 is a diagram showing an initial state of a slot.
FIG. 4 is a diagram showing a state where each peer has received a data search request.
FIG. 5 is a diagram showing a state in which each peer has received a data update request.
FIG. 6 is a diagram illustrating a process in which the other peers restore the slots of a peer that has stopped due to a failure.
FIG. 7 is a diagram illustrating a process of copying a duplicate slot in stages.
FIG. 8 is a diagram showing a state in which the copying in progress in FIG. 7 has been completed.
FIG. 9 is a diagram illustrating a state in which a data search request has been issued during a copying process.
FIG. 10 is a diagram showing a state in which a data insertion request has been made during a copying process.
FIG. 11 is a diagram showing a state in which a data update request has been issued during a copying process.
FIG. 12 is a diagram illustrating a state in which a failure has occurred during a copying process.
FIG. 13 is a diagram showing a state transition of a slot.
[Explanation of symbols]
A to F ... Peer
10 ... Network
21 ... Communication unit
22 ... Data control unit
23 ... Slot data storage
24 ... Status storage unit

Claims

1. A distributed database system that performs data processing using a database, the distributed database system being configured by connecting first to n-th computers, each having a control unit and a data storage unit, via a network, wherein
the database is divided into first to n-th slots,
the first to n-th slots are respectively stored as original slots in the data storage units of the first to n-th computers, and
a copy of each original slot is stored in a data storage unit different from the data storage unit in which that original slot is stored.

2. When an (n+1)th computer newly becomes available, a copy of the k-th (k = 1, 2, ..., n) slot is stored in the data storage unit of the (n+1)th computer based on the information on the slots stored in the first to n-th computers.

3. The distributed database system according to claim 1, wherein, when the m-th (m = 1, 2, ..., n) computer becomes unusable, one of the copies of the m-th slot is changed into an original, and a copy of the m-th slot is stored in the data storage unit of a computer other than the computers having a copy.

4. The distributed database system according to claim 3, wherein, when the m-th computer becomes unavailable, a copy identical to a copy stored by the m-th computer is stored in the data storage unit of a computer other than the computers having that same copy.

5. The distributed database system according to claim 3 or 4, wherein, when the m-th computer is restored and becomes usable, a computer in which two or more original slots are stored stores the original of the m-th slot in the data storage unit of the m-th computer.

6. A distributed database control method for a distributed database system that performs data processing using a database, the distributed database system being configured by connecting first to n-th computers, each having a control unit and a data storage unit, via a network,
The database is divided into first to n-th slots, the first to n-th slots are respectively stored as original slots in a data storage unit of the first to n-th computers, and a copy of each of the original slots is copied. Each is stored in a data storage unit different from the data storage unit in which the original slot is stored,
When the control unit of the k-th (k = 1, 2,..., N) computer performs a data search, it transmits a search request to the control units of the first to n-th computers, respectively.
The control units of the first to n-th computers respectively search their own data storage units, and transmit search results to the control unit of the k-th computer,
When the control unit of the k-th computer updates the data, the control unit transmits an update request to the control unit of the computer having the original slot including the data to be updated,
The control unit of the computer that has received the transmission updates the data of the original slot in the data storage unit and transmits an update request to a computer having a duplicate slot of the original slot,
The control unit of the computer receiving the transmission updates the data of the duplication slot in the data storage unit,
When the control unit of the k-th computer inserts data, the data is inserted into an original slot having a smaller data amount among the first to n-th original slots.

7. The distributed database control method according to claim 6, wherein, when the control unit of the k-th computer updates data and the computer having the original slot containing the data to be updated is not known, a search request is transmitted to the control unit of each of the first to n-th computers.

8. When an (n+1)th computer newly becomes available, a copy of the k-th (k = 1, 2, ..., n) slot is stored in the data storage unit of the (n+1)th computer based on the information on the slots stored in the first to n-th computers.

9. The distributed database control method according to claim 7, wherein, when the m-th (m = 1, 2, ..., n) computer becomes unusable, one of the copies of the m-th slot is changed into an original, and a copy of the m-th slot is stored in the data storage unit of a computer other than the computers having a copy.

10. The distributed database control method according to claim 9, wherein, when the m-th computer becomes unavailable, a copy identical to a copy stored by the m-th computer is stored in the data storage unit of a computer other than the computers having that same copy.

11. The distributed database control method according to claim 9 or 10, wherein, when the m-th computer is restored and becomes usable, the original of the m-th slot is stored in the data storage unit of the m-th computer from a computer in which two or more original slots are stored.

12. A program for a distributed database system that performs data processing using a database, the distributed database system being configured by connecting first to n-th computers, each having a control unit and a data storage unit, via a network,
The database is divided into first to n-th slots, the first to n-th slots are respectively stored as original slots in a data storage unit of the first to n-th computers, and a copy of each of the original slots is copied. Each is stored in a data storage unit different from the data storage unit in which the original slot is stored,
When the control unit of the k-th (k = 1, 2,..., N) computer performs a data search, it transmits a search request to the control units of the first to n-th computers, respectively.
When the control unit of the k-th computer performs a data update, it transmits an update request to the control unit of the computer having the original slot that includes the data to be updated, or to the control units of all of the first to n-th computers,
When the control unit of the k-th computer inserts data, the data is inserted into an original slot having a smaller data amount among the first to n-th original slots.

13. A program for a distributed database system that performs data processing using a database, the distributed database system being configured by connecting first to n-th computers, each having a control unit and a data storage unit, via a network,
The database is divided into first to n-th slots, the first to n-th slots are respectively stored as original slots in a data storage unit of the first to n-th computers, and a copy of each of the original slots is copied. Each is stored in a data storage unit different from the data storage unit in which the original slot is stored,
When the control unit of the k-th (k = 1, 2,..., N) computer performs a data search, it receives a search request from the control unit of the k-th computer,
The control units of the first to n-th computers respectively search their own data storage units, and transmit search results to the control unit of the k-th computer,
When the control unit of the k-th computer updates data, an update request is received from the control unit of the k-th computer,
When having an original slot containing data to be updated, update the original data for which a data update request has been made, and transmit an update request to a computer having a duplicate slot of the original slot,
The control unit of the computer receiving the transmission updates the data of the duplication slot in the data storage unit,
When the control unit of the k-th computer inserts data, the data is inserted into an original slot having a smaller data amount among the first to n-th original slots.