JP2004264911A

JP2004264911A - Computer node, cluster system, cluster control method, and cluster control program

Info

Publication number: JP2004264911A
Application number: JP2003042104A
Authority: JP
Inventors: Kazuhiro Suzuki; 和宏鈴木
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2003-02-20
Filing date: 2003-02-20
Publication date: 2004-09-24
Anticipated expiration: 2023-02-20
Also published as: JP4021780B2; WO2004075070A1

Abstract

PROBLEM TO BE SOLVED: To provide computer nodes, a cluster system, a cluster control method, and a cluster control program capable of flexibly dealing with configuration changes, such as computer node failures, by concealing and virtualizing the computer nodes within the cluster system.

SOLUTION: The cluster system 1, which provides a plurality of virtual computer nodes to a client 5, comprises actual nodes 3a, 3b, 3c for running applications, and a coordinator node 2 for allocating a virtual IP address, which is the IP address of a virtual computer node, to an actual IP address, which is the IP address of any of the actual nodes. The actual nodes 3a, 3b, 3c and the coordinator node 2 each store an IP control table as a lookup table for the virtual and actual IP addresses and carry out communications using virtual IP addresses on the basis of the IP control table.

COPYRIGHT: (C)2004, JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、ネットワーク上の計算機をノードとし、複数のノードを１つのシステムとして動作させるクラスタシステムにおいて、ノード故障などの構成の変化に柔軟に対応できる計算機ノード、クラスタシステム、クラスタ管理方法、クラスタ管理プログラムに関するものである。
【０００２】
【従来の技術】
クラスタシステムは、複数のノードを動作させて同一作業目的の処理を実行させることにより、単体のノードでは限界であった処理能力や信頼性を向上させるシステムである。一般的にクラスタシステムには、フェイルオーバー型とロードバランシング型とＨＰＣ（ＨｉｇｈＰｅｒｆｏｒｍａｎｃｅＣｏｍｐｕｔｉｎｇ）型の３通りが存在する。
【０００３】
まず、フェイルオーバー型について説明する。フェイルオーバー型では、２台またはそれ以上のノードを動作させ、何らかの原因で動作不能になった場合に、バックアップとして待機させていた他のノードがその処理を引き継ぐことによってＨＡ（ＨｉｇｈＡｖａｉｌａｂｉｌｉｔｙ）を向上させる。
【０００４】
次に、ロードバランシング型について説明する。ロードバランシング型では、ＷＷＷ（ＷｏｒｌｄＷｉｄｅＷｅｂ）やＦＴＰ（ＦｉｌｅＴｒａｎｓｆｅｒＰｒｏｔｏｃｏｌ）サーバなどのサーバを多重化して、スケーラビリティを実現する。具体的には、１つのロードバランサに対するＩＰレベルのセッションを、背後に控える複数のサービスノードに割り振ることにより、１つのノードの負荷分散を行う。割り振る方法にはいくつかあるが、順番に処理を割り振るラウンドロビン型やサービスノードやネットワークトラフィックの負荷を監視しながら負荷の少ないサービスノードに処理を割り振るダイナミックなロードバランサなどの構成を取ることが多い。
【０００５】
次に、ＨＰＣ型について説明する。ＨＰＣ型では、複数のノードが協調動作することによって並列処理アプリケーションを高速に実行する。ノード間のインターコネクトのデータ転送帯域が狭いとそこがボトルネックとなって全体の処理能力が低下するために、ギガビットＥｔｈｅｒｎｅｔやＭｙｒｉｎｅｔなどの高速なインターフェースで接続されることがある。並列処理アプリケーション作成にはＭＰＩ（Ｍｅｓｓａｇｅ　Ｐａｓｓｉｎｇ　Ｉｎｔｅｒｆａｃｅ）やＰＶＭ（Ｐａｒａｌｌｅｌ　Ｖｉｒｔｕａｌ　Ｍａｃｈｉｎｅ）等のライブラリがあり、これらは数値演算ライブラリと合わせて学術研究分野で利用されている。
【０００６】
【発明が解決しようとする課題】
しかしながら、上述のクラスタシステムにおいて、ユーザアプリケーションはノード上で直接実行される。そのため、ノードの故障や構成の変化に柔軟に対応することができないという問題があり、例えば、ユーザアプリケーション毎にノード故障などの対応を行う必要があった。
【０００７】
本発明は上述した課題に鑑みてなされたものであり、ユーザアプリケーションにクラスタシステム内のノードを直接操作させないように、ノードを隠蔽し仮想化することで、ユーザアプリケーションを変更すること無く、ノード故障などの構成の変化に柔軟に対応できる計算機ノード、クラスタシステム、クラスタ管理方法、クラスタ管理プログラムを提供することを目的とする。
【０００８】
【課題を解決するための手段】
上述した課題を解決するために、本発明は、クライアントに対して少なくとも１つの仮想的な計算機ノードを提供するクラスタシステムにおける物理的な計算機ノードであって、前記仮想的な計算機ノードのＩＰアドレスである仮想ＩＰアドレスと前記物理的な計算機ノードのＩＰアドレスである実ＩＰアドレスとの対応表であるＩＰ管理テーブルを記憶し、該ＩＰ管理テーブルに基づいて仮想ＩＰアドレスを用いた通信を行うＩＰ層と、他の計算機ノードと前記クライアントにネットワークを介して接続するネットワークデバイスとを備えてなるものである。
【０００９】
このような構成によれば、クラスタシステムの計算機ノードが共通のＩＰ管理テーブルを備えることにより、クラスタシステムにおいて仮想ＩＰアドレスを用いた通信を行うことができる。
【００１０】
また、本発明に係る計算機ノードにおいて、さらに、前記クライアントから指示されたアプリケーションを実行するアプリケーション実行部を備えたことを特徴とするものである。
【００１１】
このような構成によれば、アプリケーションを実行する計算機ノードをクライアントから隠蔽し、仮想的な計算機ノードとして動的に割り当てることにより、高可用性のクラスタシステムを提供することができる。
【００１２】
また、本発明に係る計算機ノードにおいて、前記ＩＰ層は、前記アプリケーション実行部から仮想ＩＰアドレスを宛先とする第１のＩＰヘッダを付けた第１のパケットが入力された場合、宛先の仮想ＩＰアドレスに対応する実ＩＰアドレスを前記ＩＰ管理テーブルを用いて検索し、検索した実ＩＰアドレスを宛先とする第２のＩＰヘッダを前記第１のパケットにさらに付けるカプセル化を施し、得られた第２のパケットを前記ネットワークデバイスへ出力するカプセル化部と、前記ネットワークデバイスから実ＩＰアドレスを宛先とする第３のＩＰヘッダを付けた第３のパケットが入力された場合、前記第３のパケットから前記第３のＩＰヘッダを外して第４のパケットを生成し、さらに仮想ＩＰアドレスを宛先とする第４のＩＰヘッダの仮想ＩＰアドレスが自らの計算機ノードの仮想ＩＰアドレスであれば、得られた前記第４のパケットを前記アプリケーション実行部へ出力するカプセル解除部とを備えたことを特徴とするものである。
【００１３】
このような構成によれば、計算機ノードは、共通のＩＰ管理テーブルに従って仮想ＩＰアドレスと実ＩＰアドレスを変換することにより、仮想的な計算機ノード間の通信を実現することができる。
【００１４】
また、本発明に係る計算機ノードにおいて、さらに、前記ネットワークデバイスと同様に扱われ、自らの計算機ノードの仮想ＩＰアドレスを宛先とするパケットを前記ＩＰ層へ出力するトンネルデバイスを備えたことを特徴とするものである。
【００１５】
このような構成によれば、ＩＰ層は実ＩＰアドレスを宛先とするパケットと同様に、仮想ＩＰアドレスを宛先とするパケットの処理を行うことができる。
【００１６】
また、本発明に係る計算機ノードにおいて、さらに、他の計算機ノードの負荷状態を検出する負荷状態検出部と、前記負荷状態に基づいて前記仮想ＩＰアドレスに前記実ＩＰアドレスを割り当て、前記ＩＰ管理テーブルを作成するノード割り当て部と、前記ＩＰ管理テーブルを他の計算機ノードへブロードキャストするブロードキャスト部とを備え、前記ＩＰ層は、前記クライアントに対して前記仮想ＩＰアドレスを提供する仮想ノード提供部と、前記ネットワークデバイスから前記仮想ＩＰアドレスを宛先とするパケットが入力された場合、前記ＩＰ管理テーブルを用いて前記仮想ＩＰアドレスから前記実ＩＰアドレスを検索し、検索した前記実ＩＰアドレスを宛先とするパケットを前記ネットワークデバイスへ出力するパケット割り振り部とを備えたことを特徴とするものである。
【００１７】
このような構成によれば、クラスタシステムにおいて、他の計算機ノードの負荷状態や故障等に応じて仮想ＩＰアドレスを実ＩＰアドレスへ割り振ることにより、クライアントが計算機ノードのＩＰアドレスを変更することなく、計算機ノードの負荷の変化や計算機ノードの故障に柔軟に対応でき、高可用性を実現することができる。
【００１８】
また、本発明に係る計算機ノードにおいて、前記パケット割り振り部は、前記ネットワークデバイスから前記仮想ＩＰアドレスを宛先とする第１のＩＰヘッダを付けた第１のパケットが入力された場合、検索した実ＩＰアドレスを宛先とする第２のＩＰヘッダを前記第１のパケットにさらに付けるカプセル化を施し、得られた第２のパケットを前記ネットワークデバイスへ出力することを特徴とするものである。
【００１９】
このような構成によれば、計算機ノードは、共通のＩＰ管理テーブルに従って仮想ＩＰアドレスと実ＩＰアドレスを変換することにより、仮想的な計算機ノードに対する通信を実現することができる。
【００２０】
また、本発明に係る計算機ノードにおいて、前記仮想ノード提供部は、前記ネットワークデバイスに対して実ＩＰアドレスと異なる少なくとも１つのＩＰアドレスを割り当てることを特徴とするものである。
【００２１】
このような構成によれば、複数の仮想的なノードに対する要求を受け付け、物理的な計算機ノードに割り振ることができる。
【００２２】
また、本発明に係る計算機ノードにおいて、前記ノード割り当て部は、１つの仮想ＩＰアドレスに対して複数の実ＩＰアドレスを割り当てることを特徴とするものである。
【００２３】
このような構成によれば、仮想的な計算機ノードを用いてロードバランシング型のクラスタシステムを構成することが可能となる。
【００２４】
また、本発明に係る計算機ノードにおいて、前記ノード割り振り部は、クライアントから前記１つの仮想ＩＰアドレスに対して要求がある度に、要求を転送する実ＩＰアドレスを変化させることを特徴とするものである。
【００２５】
このような構成によれば、仮想的な計算機ノードに対してクライアントから大量のリクエストを受けた場合でも、物理的な計算機ノードの負荷を分散させることができる。
【００２６】
また、本発明に係る計算機ノードにおいて、前記ブロードキャスト部は、前記ＩＰ管理テーブルのうち更新されたエントリだけを他の計算機ノードへブロードキャストすることを特徴とするものである。
【００２７】
このような構成によれば、ＩＰ管理テーブルに関するデータの転送量を低減することができる。
【００２８】
また、本発明に係る計算機ノードにおいて、前記ブロードキャスト部は、前記ＩＰ管理テーブルのうち他の計算機ノードから要求されたエントリだけを要求した計算機ノードだけへ送信することを特徴とするものである。
【００２９】
このような構成によれば、ＩＰ管理テーブルに関するデータの転送量を低減することができる。
【００３０】
また、本発明は、クライアントに対して複数の仮想的な計算機ノードのサービスを提供するクラスタシステムであって、アプリケーションを実行する計算機ノードである少なくとも１つの実ノードと、前記仮想ＩＰアドレスを前記実ノードの実ＩＰアドレスに割り当てる計算機ノードである少なくとも１つのコーディネータノードとを備えたことを特徴とするものである。
【００３１】
このような構成によれば、コーディネータノードが実ノードの負荷状態や故障等に応じて仮想ＩＰアドレスを実ＩＰアドレスへ割り振ることにより、クライアントが実ノードのＩＰアドレスを変更することなく、ノードの負荷の変化やノードの故障に柔軟に対応でき、高可用性を実現することができる。
【００３２】
また、本発明に係るクラスタシステムにおいて、前記コーディネータノードは、前記ＩＰ管理テーブルを前記実ノードへ送信し、該実ノードは、前記ＩＰ管理テーブルを受信したことを前記コーディネータノードへ送信することを特徴とするものである。
【００３３】
このような構成によれば、コーディネータノードは、実ノードへ確実にＩＰ管理テーブルを配信することができる。
【００３４】
また、本発明は、クライアントに対して少なくとも１つの仮想的な計算機ノードを提供し、前記クライアントから指示されたアプリケーションを実際に実行する計算機ノードである少なくとも１つの実ノードの管理を行うクラスタ管理方法であって、前記実ノードと前記クライアントにネットワークを介して接続するステップと、前記クライアントに対して前記仮想的な計算機ノードのＩＰアドレスである仮想ＩＰアドレスを提供するステップと、前記実ノードの負荷状態を検出するステップと、前記負荷状態に基づいて前記仮想ＩＰアドレスに対して前記実ノードのＩＰアドレスである実ＩＰアドレスを割り当て、前記ＩＰ管理テーブルを作成するステップと、前記ＩＰ管理テーブルを前記実ノードへブロードキャストするステップと、ネットワークを介して前記クライアントから前記仮想ＩＰアドレスを宛先とするパケットが入力された場合、前記ＩＰ管理テーブルを用いて前記仮想ＩＰアドレスから前記実ＩＰアドレスを検索し、検索した前記実ＩＰアドレスを宛先とするパケットを、ネットワークを介して宛先の実ノードへ出力するステップとを備えてなるものである。
【００３５】
このような構成によれば、コーディネータノードが実ノードの負荷状態や故障等に応じて仮想ＩＰアドレスを実ＩＰアドレスへ割り振ることにより、クライアントが実ノードのＩＰアドレスを変更することなく、ノードの負荷の変化やノードの故障に柔軟に対応でき、高可用性を実現することができる。
【００３６】
また、本発明は、クライアントに対して少なくとも１つの仮想的な計算機ノードを提供し、前記クライアントから指示されたアプリケーションを実際に実行する計算機ノードである少なくとも１つの実ノードの管理を行うために、コンピュータにより読取可能な媒体に記憶されたクラスタ管理プログラムであって、前記実ノードと前記クライアントにネットワークを介して接続するステップと、前記クライアントに対して前記仮想的な計算機ノードのＩＰアドレスである仮想ＩＰアドレスを提供するステップと、前記実ノードの負荷状態を検出するステップと、前記負荷状態に基づいて前記仮想ＩＰアドレスに対して前記実ノードのＩＰアドレスである実ＩＰアドレスを割り当て、前記ＩＰ管理テーブルを作成するステップと、前記ＩＰ管理テーブルを前記実ノードへブロードキャストするステップと、ネットワークを介して前記クライアントから前記仮想ＩＰアドレスを宛先とするパケットが入力された場合、前記ＩＰ管理テーブルを用いて前記仮想ＩＰアドレスから前記実ＩＰアドレスを検索し、検索した前記実ＩＰアドレスを宛先とするパケットを、ネットワークを介して宛先の実ノードへ出力するステップとをコンピュータに実行させることを特徴とするものである。
【００３７】
このような構成によれば、コーディネータノードが実ノードの負荷状態や故障等に応じて仮想ＩＰアドレスを実ＩＰアドレスへ割り振ることにより、クライアントが実ノードのＩＰアドレスを変更することなく、ノードの負荷の変化やノードの故障に柔軟に対応でき、高可用性を実現することができる。
なお、本発明において、上記コンピュータにより読取り可能な媒体は、ＣＤ−ＲＯＭやフレキシブルディスク、ＤＶＤディスク、光磁気ディスク、ＩＣカード等の可搬型記憶媒体や、コンピュータプログラムを保持するデータベース、或いは、他のコンピュータ並びにそのデータベースや、更に回線上の伝送媒体をも含むものである。
【００３８】
【発明の実施の形態】
以下、本発明の実施の形態について図面を用いて説明する。
実施の形態１．
本実施の形態では、本発明のクラスタシステムをｒｓｈ（ｒｅｍｏｔｅｓｈｅｌｌ）モードで使う場合について説明する。ｒｓｈモードでは、１つのアプリケーションの実行を１つの実ノードで行う。
【００３９】
まず、本実施の形態に係るクラスタシステムの構成について、図１を用いて説明する。図１は、本実施の形態に係るクラスタシステムの構成の一例を示すブロック図である。図１に示すように、本実施の形態に係るクラスタシステム１は、コーディネータノード２と実ノード３ａ，３ｂ，３ｃから構成される。各ノードは、ネットワーク４を介して互いに接続されると共に、ネットワーク４を介してクライアント５と接続される。
【００４０】
本実施の形態に係るクラスタシステムでは、ユーザアプリケーションにクラスタ内のノードを直接操作することを禁止する。これは仮想的な計算機ノードである仮想ノードだけをクライアント５に提供し、実際にアプリケーションを実行する物理的な計算機ノードである実ノードを隠蔽することにより実現される。コーディネータノードとは、クライアントからの要求や実ノードの負荷状態等に応じて、実ノードに対して仮想ノードやアプリケーション実行の割り当てを行う計算機ノードである。実ノードは、コーディネータノードの割り当てに従って、実際にアプリケーションを実行する。アプリケーションが終了すると、実ノードは解放されて、仮想ノードとアプリケーション実行の割り当てが解除される。
【００４１】
次に、本実施の形態に係るコーディネータノードの機能について、図２を用いて説明する。図２は、本実施の形態に係るコーディネータノードの機能の一例を示すブロック図である。図２に示すように、本実施の形態に係るコーディネータノード２は、階層モデルにおけるＩＰ（ＩｎｔｅｒｎｅｔＰｒｏｔｏｃｏｌ）層に注目して見た場合に大きく分けて、下位層に属するＮＩＣ２１（ＮｅｔｗｏｒｋＩｎｔｅｒｆａｃｅＣａｒｄ）と、ＩＰ層２２と、上位層に属するコーディネータ２３から構成される。ＮＩＣ２１は、ネットワークデバイス２１１から構成される。ＩＰ層２２は、ＩＰ処理部２２１と仮想ノード提供部２２２とパケット割り振り部２２３から構成される。コーディネータ２３は、負荷状態検出部２３１とノード割り当て部２３２とブロードキャスト部２３３から構成される。
【００４２】
次に、本実施の形態に係るコーディネータノードの動作について説明する。負荷状態検出部２３１は、各実ノード３ａ，３ｂ，３ｃの負荷状態を検出する。負荷状態とは、今現在、どんなプロセスがどれだけのＣＰＵの使用率で動いているかを示すものである。
【００４３】
ノード割り当て部２３２は、ＩＰ管理テーブルにおいて仮想的なＩＰアドレスである仮想ＩＰアドレス（ＶＩＰ：ＶｉｒｔｕａｌＩＰＡｄｄｒｅｓｓ）を設定し、外部からノード割り当て要求を受けた場合に、各実ノード３ａ，３ｂ，３ｃの負荷状態に基づいて、ＶＩＰを実ノードの実際のＩＰアドレスである実ＩＰアドレス（ＲＩＰ：ＲｅａｌＩＰＡｄｄｒｅｓｓ）に割り当て、ＩＰ管理テーブルを生成する。また、必要に応じてＩＰ管理テーブルを更新する。ＩＰ管理テーブルとは、ＶＩＰからＲＩＰを検索するための対応表のことである。
【００４４】
ブロードキャスト部２３３は、ＩＰ管理テーブルが更新された場合に、ＩＰ管理テーブルを全ての実ノード３ａ，３ｂ，３ｃへブロードキャストする。ここでは、一例としてＩＰ管理テーブルの更新時にのみＩＰ管理テーブルをブロードキャストするとしたが、ＩＰ管理テーブルにおいて変更されたエントリのみをブロードキャストするようにしても良い。また、ＩＰ管理テーブルのうち、実ノードから要求されたエントリを送信するようにしても良い。さらに、コーディネータノード２が、ＩＰ管理テーブルが確実に届いているかを確認するために、実ノードはＩＰ管理テーブルを受け取ったことを示す情報をコーディネータノード２へ返すようにしても良い。
【００４５】
仮想ノード提供部２２２は、ＩＰ管理テーブルに設定されたＶＩＰを仮想ノードとしてクライアント５へ提供する。クライアント５に対してＶＩＰを見せるために、１つのネットワークデバイス２１１に複数のＶＩＰを割り当てる。
【００４６】
ＩＰ処理部２２１は、従来のＩＰ層と同様、上位層または下位層から入力されたＩＰパケットのフィルタリングやルーティングを行う。ネットワークデバイス２１１からＩＰパケットが入力された場合、ＩＰパケットが自ノード宛ならば上位層に渡し、そうでなければ適切なネットワークへ転送するために再びネットワークデバイス２１１へ出力する。反対に、上位層からＩＰパケットが入力された場合、同じく適切なネットワークへ送信するためにネットワークデバイス２１１へ出力する。また、本発明のＩＰ処理部２２１は、さらにＶＩＰヘッダを持つＩＰパケットがネットワークデバイス２１１から入力された場合、パケット割り振り部２２３へ出力する。ここで、ＲＩＰを宛先とするＩＰヘッダをＲＩＰヘッダ、ＶＩＰを宛先とするＩＰヘッダをＶＩＰヘッダと呼ぶ。
【００４７】
パケット割り振り部２２３は、仮想ノード提供部２２２により提供されたＶＩＰ宛に到達したＩＰパケットの処理を行う。まず、パケット割り振り部２２３は、ＩＰ処理部２２１からＶＩＰヘッダを持つＩＰパケットが入力された場合、ＩＰ管理テーブルを用いてＶＩＰからＲＩＰを検索し、実際にパケットを送信する宛先の実ノードのＲＩＰを得る。次に、パケット割り振り部２２３は、図３（ａ）に示すように、ＶＩＰヘッダを持つＩＰパケットにさらにＲＩＰヘッダを付けることによりカプセル化する。カプセル化したＩＰパケットは、ＩＰ処理部２２１へ出力され、ネットワークデバイス２１１を介して外部のＲＩＰ宛に送信される。
【００４８】
ネットワークデバイス２１１は、外部のクライアント５と各ノードへネットワーク４を介して接続されている。外部から受信したパケットをＩＰ処理部２２１へ出力するとともに、ＩＰ処理部２２１から入力されたパケットを外部へ送信する。
【００４９】
次に、ＩＰ層であるＩＰ処理部２２１とパケット割り振り部２２３の実装例について説明する。まず、仮想ノード提供部２２２の実装例について説明する。ここではＩＰエイリアスと呼ばれる機構を使う。ＩＰエイリアスはＬｉｎｕｘカーネルで標準サポートされており、コマンド「＃　ｉｆｃｏｎｆｉｇ　ｅｔｈ０：０　１９２．１６８．１．１００」によって他のＩＰアドレスを割り当てることができる。“：”以降の数字を変更すれば複数のＩＰアドレスを割り当てることも可能である。この機能を用いてＶＩＰを割り当てる。
【００５０】
次に、ＩＰ処理部２２１とパケット割り振り部２２３の実装例について説明する。図４は、ＩＰ層の実装の一例を示すブロック図である。Ｌｉｎｕｘカーネル２．４にはパケットフィルタリングの機構が組み込まれている。これはｎｅｔｆｉｌｔｅｒと呼ばれ、カーネル内でのＩＰパケット処理を行うコードの拡張性を提供するためのフレームワークである。ここではＬｉｎｕｘのＩＰ層におけるｎｅｔｆｉｌｔｅｒの機能を用いてＩＰ処理部２２１を実現する。ｎｅｔｆｉｌｔｅｒ７は、下位層８と上位層９に接続されている。
【００５１】
ここで、従来のｎｅｔｆｉｌｔｅｒの具体的な動作について説明する。下位層８が受信したパケットは、ＮＦ＿ＩＰ＿ＰＲＥ＿ＲＯＵＴＩＮＧ７１を介してｒｏｕｔｉｎｇ７２に送られ、他のノードへ転送するパケットであれば、ＮＦ＿ＩＰ＿ＦＯＲＷＡＲＤ７４へ送られ、そうでなければ、ＮＦ＿ＩＰ＿ＬＯＣＡＬ＿ＩＮ７３へ送られる。ＮＦ＿ＩＰ＿ＦＯＲＷＡＲＤ７４へ送られたＩＰパケットは、ＮＦ＿ＩＰ＿ＰＯＳＴ＿ＲＯＵＴＩＮＧ７７を介して下位層８へ送られ、他のノードへ送信される。一方、ＮＦ＿ＩＰ＿ＬＯＣＡＬ＿ＩＮ７３へ送られたパケットは上位層９へ送られる。
【００５２】
上位層９からのＩＰパケットは、ＮＦ＿ＩＰ＿ＬＯＣＡＬ＿ＯＵＴ７５を介してｒｏｕｔｉｎｇ７６に送られ、さらにＮＦ＿ＩＰ＿ＰＯＳＴ＿ＲＯＵＴＩＮＧ７７を介して下位層８へ送られ、他のノードへ送信される。以上のｎｅｔｆｉｌｔｅｒ７により、ＩＰ処理部２２１は実現される。
【００５３】
また、パケット割り振り部２２３は、図４で説明したｎｅｔｆｉｌｔｅｒの機能を拡張することにより実現することができる。ｎｅｔｆｉｌｔｅｒ７は、ＮＦ＿ＩＰ＿ＰＲＥ＿ＲＯＵＴＩＮＧ７１，ＮＦ＿ＩＰ＿ＬＯＣＡＬ＿ＩＮ７３，ＮＦ＿ＩＰ＿ＦＯＲＷＡＲＤ７４，ＮＦ＿ＩＰ＿ＬＯＣＡＬ＿ＯＵＴ７５，ＮＦ＿ＩＰ＿ＰＯＳＴ＿ＲＯＵＴＩＮＧ７７の部分でそれぞれフック関数を呼び出す仕組みを提供する。これらには関数を登録するためのリストが用意され、「ｉｎｔ　ｎｆ＿ｒｅｇｉｓｔｅｒ＿ｈｏｏｋ（ｓｔｒｕｃｔ　ｎｆ＿ｈｏｏｋ＿ｏｐｓ　＊ｒｅｇ）」のインターフェースによって登録することができ、「ｉｎｔ　ｎｆ＿ｕｎｒｅｇｉｓｔｅｒ＿ｈｏｏｋ（ｓｔｒｕｃｔ　ｎｆ＿ｈｏｏｋ＿ｏｐｓ　＊ｒｅｇ）」のインターフェースによって削除することができる。ここで、ｓｔｒｕｃｔ　ｎｆ＿ｈｏｏｋ＿ｏｐｓ型の構造体はフック関数を登録するためのものである。本実施の形態では、ＮＦ＿ＩＰ＿ＬＯＣＡＬ＿ＩＮ７３にパケット割り振り部２２３の動作をフック関数として登録することにより、パケット割り振り部２２３を実現することができる。
【００５４】
また、パケット割り振り部２２３におけるカプセル化はＩＰトンネリング機能を応用して実装することができる。ＲＩＰ宛のＩＰパケットのプロトコルはＩＰＰＲＯＴＯ＿ＩＰＩＰに設定する。これはＩＰトンネリングプロトコルと呼ばれ、パケットがカプセル化されていることを示すためのものである。
【００５５】
次に、本実施の形態に係る実ノードの機能について、図５を用いて説明する。図５は、本実施の形態に係る実ノードの機能の一例を示すブロック図である。図５に示すように、本実施の形態に係る実ノード３ａ，３ｂ，３ｃは、階層モデルにおけるＩＰ層に注目して見た場合に大きく分けて、下位層に属するＮＩＣ３１と、ＩＰ層３２と、上位層に属するアプリケーション実行部３３から構成される。ＮＩＣ３１は、ネットワークデバイス３１１とトンネルデバイス３１２から構成される。ＩＰ層３２は、ＩＰ処理部３２１とカプセル解除部３２２とカプセル化部３２３から構成される。
【００５６】
次に、本実施の形態に係る実ノードの動作について説明する。アプリケーション実行部３３はアプリケーションの実行ファイルを備えており、コーディネータノード２を介してクライアント５から受信したパケットの内容に従ってアプリケーションの実行を行い、実行結果をパケットとしてＩＰ層へ渡す。この時、クライアント５に対して実行結果を送信する場合は通常通りパケットにクライアント宛のヘッダを付加し、他の実ノードと通信を行う場合はパケットにＶＩＰヘッダを付加する。
【００５７】
ＩＰ処理部３２１は、従来のＩＰ層やＩＰ処理部２２１と同様、上位層または下位層から入力されたＩＰパケットのフィルタリングやルーティングを行う。本発明のＩＰ処理部３２１は、さらに、下位層からＲＩＰヘッダを持つＩＰパケットが入力された場合、カプセル解除部３２２へ出力する。また、上位層からＶＩＰヘッダを持つＩＰパケットが入力された場合、カプセル化部３２３へ出力する。
【００５８】
カプセル解除部３２２は、ＩＰ処理部３２１からＲＩＰヘッダを持つＩＰパケットが入力されると、図３（ｂ）に示すように、ＩＰパケットからＲＩＰヘッダを外す。この時、カプセル化されたＩＰパケットがＩＰトンネリングプロトコルであるために、ＶＩＰヘッダを持つＩＰパケットを、ＩＰ処理部３２１を介してトンネルデバイス３１２へ出力する。
【００５９】
カプセル化部３２３は、ＩＰ処理部３２１からＶＩＰヘッダを持つＩＰパケットが入力されると、コーディネータノード２からブロードキャストされたＩＰ管理テーブルを用いてＶＩＰからＲＩＰを検索し、実際にパケットを送信する宛先の実ノードのＲＩＰを得る。次に、カプセル化部３２３は、図３（ａ）に示すように、ＶＩＰヘッダを持つＩＰパケットにさらにＲＩＰヘッダを付けることによりカプセル化する。カプセル化したＩＰパケットは、ＩＰ処理部３２１とネットワークデバイス３１１を介してＲＩＰ宛に送信される。
【００６０】
ネットワークデバイス３１１は、ネットワークデバイス２１１と同様であり、外部のクライアント５と各ノードへネットワーク４を介して接続されている。外部から受信したパケットをＩＰ処理部３２１へ出力するとともに、ＩＰ処理部３２１から入力されたパケットを外部へ送信する。
【００６１】
トンネルデバイス３１２は、ＩＰ処理部３２１からＶＩＰヘッダを持つＩＰパケットが入力されると、そのままＩＰ処理部３２１へ出力する。カプセル解除されたＩＰパケットの宛先はＶＩＰであるため、自ノードのトンネルデバイス３１２が受け取って、再びＩＰ処理部３２１へ入力される。ＩＰ処理部３２１は、トンネルデバイス３１２からのＶＩＰヘッダを持つＩＰパケットをアプリケーション実行部３３へ出力する。
【００６２】
ここで、ＩＰ層であるＩＰ処理部３２１とカプセル解除部３２２とカプセル化部３２３の実装例について説明する。ＩＰ処理部２２１と同様、ＬｉｎｕｘのＩＰ層におけるｎｅｔｆｉｌｔｅｒの機能を用いてＩＰ処理部３２１を実現し、さらに、ＮＦ＿ＩＰ＿ＬＯＣＡＬ＿ＩＮ７３にカプセル解除部３２２の動作をフック関数として登録し、ＮＦ＿ＩＰ＿ＬＯＣＡＬ＿ＯＵＴ７５にカプセル化部３２３の動作をフック関数として登録することにより、カプセル解除部３２２とカプセル化部３２３を実現することができる。
【００６３】
また、カプセル解除部３２２とカプセル化部３２３はＩＰトンネリング機能を応用して実装することができる。ＩＰトンネリングによってＩＰパケットをＩＰヘッダでカプセル化することによって、カプセル化されたパケットに関わらず正しい転送先に送出することができる。ここでＲＩＰ宛のＩＰパケットのプロトコルはＩＰＰＲＯＴＯ＿ＩＰＩＰに設定する。
【００６４】
ＩＰトンネリングではＮＡＴ（ＮｅｔｗｏｒｋＡｄｄｒｅｓｓＴｒａｎｓｌａｔｉｏｎ）等のアドレス変換と異なり、クライアント５からのリクエストに対して直接応答を返すことができる。以上のように、トンネルデバイス３１２を全ての実ノード３ａ，３ｂ，３ｃに実装し、コーディネータノード２と全ての実ノード３ａ，３ｂ，３ｃが同じＩＰ管理テーブルを備えることで、ＶＩＰレベルでの通信を実現することができ、実ノード間通信が可能となる。
【００６５】
次に、ｒｓｈモードの動作について、図６を用いて説明する。図６は、本実施の形態に係るクラスタシステムのｒｓｈモードにおける動作の一例を示すシーケンス図である。ｒｓｈモードでは一般的なＵＮＩＸコマンドであるｒｓｈと同様に、コマンド「％　ｒｓｈ　ｖｎｏｄｅ　ａｐｐｌｉｃａｔｉｏｎ　［ａｒｇｓ．．．］」によってクラスタシステム１上でアプリケーションを実行することができる。ここでｖｎｏｄｅは仮想ノードを示している。ここでは、説明のため、実ノード３ａが持つＲＩＰをＲＩＰ＃ａ、実ノード３ｂが持つＲＩＰをＲＩＰ＃ｂ、実ノード３ｃが持つＲＩＰをＲＩＰ＃ｃとする。
【００６６】
まず、ユーザは、クライアント５を用いてｒｓｈコマンドを入力し、宛先を指定する。ここでは例えばＶＩＰ＃１を宛先として指定する。これにより、ノード割り当て要求が行われる（Ｓ１０１）。ノード割り当て要求を受けたコーディネータノード２は、各実ノード３ａ，３ｂ，３ｃの負荷状態に応じて、ＶＩＰ＃１に例えばＲＩＰ＃ａを割り当てる（Ｓ１０２）。実ノード３ａは、アプリケーションの実行が可能な状態であれば、そのことを示す情報をコーディネータノード２へ返す（Ｓ１０３）。次に、コーディネータノード２は、図７に示すＩＰ管理テーブルを生成し、各実ノード３ａ，３ｂ，３ｃへブロードキャストすると共に、ノード割り当てが成功したことを示すノード割り当て完了通知をクライアント５へ返す（Ｓ１０４）。
【００６７】
ノード割り当て完了通知を受信したクライアント５は、ジョブをＶＩＰ＃１宛に投入する（Ｓ１０５）。ジョブはコーディネータノード２を介して実ノード３ａへ渡される（Ｓ１０６）。実ノード３ａは、ジョブであるアプリケーションの実行を行い、その実行結果をクライアント５へ返す（Ｓ１０７）。また、実ノード３ａは、ジョブが終了したことを示す情報をコーディネータノード２へ返す（Ｓ１０８）。ジョブの終了を検出したコーディネータノード２は、ＩＰ管理テーブルのＶＩＰ＃１とＲＩＰ＃ａのエントリを削除し、その結果をブロードキャストすることによりノードの解放を行う（Ｓ１０９）。実ノード３ａは、ノードの解放を確認したことを示す情報をコーディネータノード２へ返す（Ｓ１１０）。
【００６８】
以上のように、コーディネータノード２が実ノードの負荷状態や故障等に応じてＶＩＰをＲＩＰへ割り振ることにより、ノードの負荷の変化やノードの故障に柔軟に対応でき、高可用性を実現することができる。例えば、ノード故障が発生した場合でも、ＶＩＰとＲＩＰの割り当てを自動的に変更するだけで復帰することができる。
【００６９】
実施の形態２．
本実施の形態では、図１で説明した本発明のクラスタシステムをＨＰＣモードで使う場合について説明する。ＨＰＣモードでは、複数のアプリケーションの実行を複数の実ノードで分担して行う。一般にＨＰＣ型のアプリケーションではノード間通信が行われる。本実施の形態では、コーディネータノード２がクライアント５に対して複数の仮想ノードを提供し、実ノード間で通信を行うことにより、ＨＰＣ型のアプリケーションが動作可能となる。
【００７０】
以下、ＨＰＣモードの動作について、図８を用いて説明する。図８は、本実施の形態に係るクラスタシステムのＨＰＣモードにおける動作の一例を示すシーケンス図である。ここでは、説明のため、実ノード３ａが持つＲＩＰをＲＩＰ＃ａ、実ノード３ｂが持つＲＩＰをＲＩＰ＃ｂ、実ノード３ｃが持つＲＩＰをＲＩＰ＃ｃとする。
【００７１】
まず、ユーザは、クライアント５を用いて宛先を指定する。ここでは例えばＶＩＰ＃１とＶＩＰ＃２を宛先として指定する。これにより、ノード割り当て要求が行われる（Ｓ２０１）。ノード割り当て要求を受けたコーディネータノード２は、各実ノード３ａ，３ｂ，３ｃの負荷状態に応じて、ＶＩＰ＃１に例えばＲＩＰ＃ａを割り当てる（Ｓ２０２）。実ノード３ａは、アプリケーションの実行が可能な状態であれば、そのことを示す情報をコーディネータノード２へ返す（Ｓ２０３）。同様に、コーディネータノード２は、ＶＩＰ＃２に例えばＲＩＰ＃ｃを割り当てる（Ｓ２０２）。実ノード３ｃは、アプリケーションの実行が可能な状態であれば、そのことを示す情報をコーディネータノード２へ返す（Ｓ２０３）。処理Ｓ２０２と処理Ｓ２０３は、要求されたノード数分繰り返される。次に、コーディネータノード２は、図９に示すＩＰ管理テーブルを生成し、各実ノード３ａ，３ｂ，３ｃへブロードキャストすると共に、ノード割り当てが成功したことを示すノード割り当て完了通知をクライアント５へ返す（Ｓ２０４）。
【００７２】
ノード割り当て完了通知を受信したクライアント５は、ジョブをＶＩＰ＃１宛とＶＩＰ＃２宛に投入する（Ｓ２０５）。ここで、例えばＶＩＰ＃１で得られた実行結果をＶＩＰ＃２で用いて実行するというジョブをＶＩＰ＃１宛とＶＩＰ＃２宛に与えたとする。ＶＩＰ＃１宛のジョブはコーディネータノード２を介して実ノード３ａへ渡され、ＶＩＰ＃２宛のジョブはコーディネータノード２を介して実ノード３ｃへ渡される（Ｓ２０６）。実ノード３ａは、ジョブであるアプリケーションの実行を行い、その実行結果を実ノード３ｃへ渡す。実ノード３ｃは、実ノード３ａの実行結果を用いてアプリケーションの実行を行い、その実行結果をクライアント５へ返す（Ｓ２０７）。また、実ノード３ａと実ノード３ｃは、ジョブが終了したことを示す情報をコーディネータノード２へ返す（Ｓ２０８）。ジョブの終了を検出したコーディネータノード２は、ＩＰ管理テーブルのＶＩＰ＃１とＲＩＰ＃ａのエントリ、ＶＩＰ＃２とＲＩＰ＃ｃのエントリを削除し、その結果をブロードキャストすることによりノードの解放を行う（Ｓ２０９）。実ノード３ａと実ノード３ｃは、ノードの解放を確認したことを示す情報をコーディネータノード２へ返す（Ｓ２１０）。
【００７３】
以上のように、ＨＰＣ型のアプリケーションを動作させる場合において、実ノードが故障した場合でも、故障した実ノードが行っていた処理をコーディネータノード２が自動的に適切な実ノードへ割り振ることにより、処理を継続させることができるため、ユーザ側で対応する必要がない。
【００７４】
実施の形態３．
本実施の形態では、本発明のクラスタシステムをＷＷＷモードで使う場合について説明する。例えばＷＷＷサーバのような複数のサーバに対してクライアントから大量のリクエストが来るような場合に、１つのＶＩＰに複数のＲＩＰを割り当てることでＩＰレベルの負荷分散を行うことができる。このような動作モードをＷＷＷモードと呼び、このＷＷＷモードでは１つのサービスの実行を複数の実ノードで分担して行う。本実施の形態では、コーディネータノード２が１つの仮想ノードを複数の実ノードに割り当て、クライアント５からの仮想ノードに対するリクエストを複数の実ノードへ分散することにより、ロードバランシング型のクラスタシステムを構成することが可能となる。
【００７５】
以下、ＷＷＷモードの動作について、図１０〜図１４を用いて説明する。まず、本実施の形態に係るクラスタシステムの構成について、図１０を用いて説明する。図１０は、本実施の形態に係るクラスタシステムの構成の他の一例を示すブロック図である。図１０において、図１と同一符号は図１に示された対象と同様のものであり、ここでの説明を省略する。ＷＷＷモードでは、オペレータがＷＷＷサーバの起動と停止を行う必要があるため、図１０では図１の構成にオペレータ６を加える。オペレータ６はコーディネータノード２を操作する。
【００７６】
次に、ＷＷＷモードにおけるサーバ起動の動作について、図１１を用いて説明する。図１１は、本実施の形態に係るクラスタシステムのＷＷＷモードにおけるサーバ起動の動作の一例を示すシーケンス図である。ここでは、説明のため、実ノード３ａが持つＲＩＰをＲＩＰ＃ａ、実ノード３ｂが持つＲＩＰをＲＩＰ＃ｂ、実ノード３ｃが持つＲＩＰをＲＩＰ＃ｃとする。
【００７７】
まず事前に、オペレータ６は、立ち上げるべきＷＷＷサーバの数を指定する。ここでは例えば３台と指定する。これにより、ノード割り当て要求が行われる（Ｓ３０１）。ノード割り当て要求を受けたコーディネータノード２は、各実ノード３ａ，３ｂ，３ｃの負荷状態に応じて、ＶＩＰ＃１に例えばＲＩＰ＃ａとＲＩＰ＃ｂとＲＩＰ＃ｃを割り当てる（Ｓ３０２）。実ノード３ａ，３ｂ，３ｃは、それぞれサービスの実行が可能な状態であれば、そのことを示す情報をコーディネータノード２へ返す（Ｓ３０３）。次に、コーディネータノード２は、図１２に示すＩＰ管理テーブルを生成し、各実ノード３ａ，３ｂ，３ｃへブロードキャストすると共に、ノード割り当てが成功したことを示すノード割り当て完了通知をオペレータ６へ返す（Ｓ３０４）。
【００７８】
ノード割り当て完了通知を受信したオペレータ６は、サーバ起動要求を行う（Ｓ３０５）。サーバ起動要求を受けたコーディネータノード２は、各実ノード３ａ，３ｂ，３ｃに対してサーバ起動を指示する（Ｓ３０６）。各実ノード３ａ，３ｂ，３ｃは、自ノードを起動させると共に、起動したことを示す情報をコーディネータノード２へ返す（Ｓ３０７）。コーディネータノード２は、サーバ起動が完了したことを示すサーバ起動完了通知をオペレータ６へ返す（Ｓ３０８）。以上のように、ＷＷＷモードにおけるサーバ起動の動作を行うことで、事前にＶＩＰ＃１にはＲＩＰ＃ａとＲＩＰ＃ｂとＲＩＰ＃ｃが割り当てられる。
【００７９】
次に、ＷＷＷモードにおけるサービス提供時の動作について、図１３を用いて説明する。図１３は、本実施の形態に係るクラスタシステムのＷＷＷモードにおけるサービス提供時の動作の一例を示すシーケンス図である。まず、クライアント５は、ＨＴＴＰ（ＨｙｐｅｒｔｅｘｔＴｒａｎｓｆｅｒＰｒｏｔｏｃｏｌ）リクエストをＶＩＰ＃１宛に投入する（Ｓ４０１）。ＶＩＰ＃１宛のリクエストはコーディネータノード２を介して実ノード３ａへ割り振られる（Ｓ４０２）。実ノード３ａは、リクエストに対するＨＴＴＰレスポンスをクライアント５へ返す（Ｓ４０３）。以上がＷＷＷモードにおけるサービス提供時の動作である。ここではＶＩＰ＃１宛のリクエストを実ノード３ａへ割り振った例について説明したが、コーディネータノード２のパケット割り振り部２２３は、クライアント５からのリクエストを受ける度に、リクエストを転送する実ノードを変化させる。
【００８０】
次に、ＷＷＷモードにおけるサーバ停止の動作について、図１４を用いて説明する。図１４は、本実施の形態に係るクラスタシステムのＷＷＷモードにおけるサーバ停止の動作の一例を示すシーケンス図である。
【００８１】
まず、オペレータ６は、サーバ停止要求を行う（Ｓ５０１）。サーバ停止要求を受けたコーディネータノード２は、各実ノード３ａ，３ｂ，３ｃに対してサーバ停止を指示する（Ｓ５０２）。各実ノード３ａ，３ｂ，３ｃは、自ノードを停止させると共に、停止したことを示す情報をコーディネータノード２へ返す（Ｓ５０３）。コーディネータノード２は、サーバ停止が完了したことを示すサーバ停止完了通知をオペレータ６へ返す（Ｓ５０４）。
【００８２】
サーバ停止完了通知を受信したオペレータ６は、ノード解放要求を行う（Ｓ５０５）。ノード解放要求を受けたコーディネータノード２は、ＩＰ管理テーブルのＶＩＰ＃１とＲＩＰ＃ａのエントリ、ＶＩＰ＃１とＲＩＰ＃ｂのエントリ、ＶＩＰ＃１とＲＩＰ＃ｃのエントリを削除し、その結果をブロードキャストすることによりノードの解放を行う（Ｓ５０６）。各実ノード３ａ，３ｂ，３ｃは、ノードの解放を確認したことを示す情報をコーディネータノード２へ返す（Ｓ５０７）。コーディネータノード２は、ノード解放が終了したことを示すノード解放終了通知をオペレータ６へ返す（Ｓ５０８）。以上により、ＷＷＷモードにおけるサーバ停止の動作は終了する。
【００８３】
以上のように、ＷＷＷモードのサービス提供時において、クライアント５から大量のリクエストを受けた場合でも、コーディネータノード２が実ノードの負荷状態や故障等に応じて、リクエストを適切な実ノードへ割り振ることができる。
【００８４】
以上、実施の形態１から実施の形態３において本発明のクラスタシステムについて説明したが、図３で説明した本発明のクラスタシステムの構成において、コーディネータノード２が状況に応じたＩＰ管理テーブルを作成することにより、実施の形態１で説明したｒｓｈモード、実施の形態２で説明したＨＰＣモード、実施の形態３で説明したＷＷＷモードからなる３つのモードをいずれかに切り替えて動作させたり、３つのモードを組み合わせて動作させるようにすることも可能である。また、実ノードのいずれかに仮想ノード提供部２２２とパケット割り振り部２２３とコーディネータ２３の機能を備えることにより、コーディネータノード２として機能することが可能となるため、コーディネータノード２が故障した場合にも対応することができる。これによりさらに高信頼性を実現することができる。
【００８５】
（付記１）クライアントに対して少なくとも１つの仮想的な計算機ノードを提供するクラスタシステムにおける物理的な計算機ノードであって、
前記仮想的な計算機ノードのＩＰアドレスである仮想ＩＰアドレスと前記物理的な計算機ノードのＩＰアドレスである実ＩＰアドレスとの対応表であるＩＰ管理テーブルを記憶し、該ＩＰ管理テーブルに基づいて仮想ＩＰアドレスを用いた通信を行うＩＰ層と、
他の計算機ノードと前記クライアントにネットワークを介して接続するネットワークデバイスと、
を備えてなる計算機ノード。
（付記２）付記１に記載の計算機ノードにおいて、
さらに、
前記クライアントから指示されたアプリケーションを実行するアプリケーション実行部を備えたことを特徴とする計算機ノード。
（付記３）付記２に記載の計算機ノードにおいて、
前記ＩＰ層は、
前記アプリケーション実行部から仮想ＩＰアドレスを宛先とする第１のＩＰヘッダを付けた第１のパケットが入力された場合、宛先の仮想ＩＰアドレスに対応する実ＩＰアドレスを前記ＩＰ管理テーブルを用いて検索し、検索した実ＩＰアドレスを宛先とする第２のＩＰヘッダを前記第１のパケットにさらに付けるカプセル化を施し、得られた第２のパケットを前記ネットワークデバイスへ出力するカプセル化部と、
前記ネットワークデバイスから実ＩＰアドレスを宛先とする第３のＩＰヘッダを付けた第３のパケットが入力された場合、前記第３のパケットから前記第３のＩＰヘッダを外して第４のパケットを生成し、さらに仮想ＩＰアドレスを宛先とする第４のＩＰヘッダの仮想ＩＰアドレスが自らの計算機ノードの仮想ＩＰアドレスであれば、得られた前記第４のパケットを前記アプリケーション実行部へ出力するカプセル解除部と、
を備えたことを特徴とする計算機ノード。
（付記４）付記３に記載の計算機ノードにおいて、
さらに、
前記ネットワークデバイスと同様に扱われ、自らの計算機ノードの仮想ＩＰアドレスを宛先とするパケットを前記ＩＰ層へ出力するトンネルデバイスを備えたことを特徴とする計算機ノード。
（付記５）付記１に記載の計算機ノードにおいて、
さらに、
他の計算機ノードの負荷状態を検出する負荷状態検出部と、
前記負荷状態に基づいて前記仮想ＩＰアドレスに前記実ＩＰアドレスを割り当て、前記ＩＰ管理テーブルを作成するノード割り当て部と、
前記ＩＰ管理テーブルを他の計算機ノードへブロードキャストするブロードキャスト部とを備え、
前記ＩＰ層は、
前記クライアントに対して前記仮想ＩＰアドレスを提供する仮想ノード提供部と、
前記ネットワークデバイスから前記仮想ＩＰアドレスを宛先とするパケットが入力された場合、前記ＩＰ管理テーブルを用いて前記仮想ＩＰアドレスから前記実ＩＰアドレスを検索し、検索した前記実ＩＰアドレスを宛先とするパケットを前記ネットワークデバイスへ出力するパケット割り振り部と、
を備えたことを特徴とする計算機ノード。
（付記６）付記５に記載の計算機ノードにおいて、
前記パケット割り振り部は、前記ネットワークデバイスから前記仮想ＩＰアドレスを宛先とする第１のＩＰヘッダを付けた第１のパケットが入力された場合、検索した実ＩＰアドレスを宛先とする第２のＩＰヘッダを前記第１のパケットにさらに付けるカプセル化を施し、得られた第２のパケットを前記ネットワークデバイスへ出力することを特徴とする計算機ノード。
（付記７）付記５または付記６に記載の計算機ノードにおいて、
前記仮想ノード提供部は、前記ネットワークデバイスに対して実ＩＰアドレスと異なる少なくとも１つのＩＰアドレスを割り当てることを特徴とする計算機ノード。
（付記８）付記５乃至付記７のいずれかに記載の計算機ノードにおいて、
前記ノード割り当て部は、１つの仮想ＩＰアドレスに対して複数の実ＩＰアドレスを割り当てることを特徴とする計算機ノード。
（付記９）付記８に記載の計算機ノードにおいて、
前記ノード割り振り部は、クライアントから前記１つの仮想ＩＰアドレスに対して要求がある度に、要求を転送する実ＩＰアドレスを変化させることを特徴とする計算機ノード。
（付記１０）付記５乃至付記９のいずれかに記載の計算機ノードにおいて、
前記ブロードキャスト部は、前記ＩＰ管理テーブルのうち更新されたエントリだけを他の計算機ノードへブロードキャストすることを特徴とする計算機ノード。
（付記１１）付記５乃至付記１０のいずれかに記載の計算機ノードにおいて、
前記ブロードキャスト部は、前記ＩＰ管理テーブルのうち他の計算機ノードから要求されたエントリだけを要求した計算機ノードだけへ送信することを特徴とする計算機ノード。
（付記１２）クライアントに対して複数の仮想的な計算機ノードを提供するクラスタシステムであって、
付記２乃至付記４のいずれかに記載の計算機ノードであり、アプリケーションを実行する計算機ノードである少なくとも１つの実ノードと、
付記５乃至付記１１のいずれかに記載の計算機ノードであり、前記仮想ＩＰアドレスを前記実ノードの実ＩＰアドレスに割り当てる計算機ノードである少なくとも１つのコーディネータノードと、
を備えたことを特徴とするクラスタシステム。
（付記１３）付記１２に記載のクラスタシステムにおいて、
前記コーディネータノードは、前記ＩＰ管理テーブルを前記実ノードへ送信し、
該実ノードは、前記ＩＰ管理テーブルを受信したことを前記コーディネータノードへ送信することを特徴とするクラスタシステム。
（付記１４）クライアントに対して少なくとも１つの仮想的な計算機ノードを提供し、前記クライアントから指示されたアプリケーションを実際に実行する計算機ノードである少なくとも１つの実ノードの管理を行うクラスタ管理方法であって、
前記実ノードと前記クライアントにネットワークを介して接続するステップと、
前記クライアントに対して前記仮想的な計算機ノードのＩＰアドレスである仮想ＩＰアドレスを提供するステップと、
前記実ノードの負荷状態を検出するステップと、
前記負荷状態に基づいて前記仮想ＩＰアドレスに対して前記実ノードのＩＰアドレスである実ＩＰアドレスを割り当て、前記ＩＰ管理テーブルを作成するステップと、
前記ＩＰ管理テーブルを前記実ノードへブロードキャストするステップと、
ネットワークを介して前記クライアントから前記仮想ＩＰアドレスを宛先とするパケットが入力された場合、前記ＩＰ管理テーブルを用いて前記仮想ＩＰアドレスから前記実ＩＰアドレスを検索し、検索した前記実ＩＰアドレスを宛先とするパケットを、ネットワークを介して宛先の実ノードへ出力するステップと、
を備えてなるクラスタ管理方法。
（付記１５）クライアントに対して少なくとも１つの仮想的な計算機ノードを提供し、前記クライアントから指示されたアプリケーションを実際に実行する計算機ノードである少なくとも１つの実ノードの管理を行うために、コンピュータにより読取可能な媒体に記憶されたクラスタ管理プログラムであって、
前記実ノードと前記クライアントにネットワークを介して接続するステップと、
前記クライアントに対して前記仮想的な計算機ノードのＩＰアドレスである仮想ＩＰアドレスを提供するステップと、
前記実ノードの負荷状態を検出するステップと、
前記負荷状態に基づいて前記仮想ＩＰアドレスに対して前記実ノードのＩＰアドレスである実ＩＰアドレスを割り当て、前記ＩＰ管理テーブルを作成するステップと、
前記ＩＰ管理テーブルを前記実ノードへブロードキャストするステップと、
ネットワークを介して前記クライアントから前記仮想ＩＰアドレスを宛先とするパケットが入力された場合、前記ＩＰ管理テーブルを用いて前記仮想ＩＰアドレスから前記実ＩＰアドレスを検索し、検索した前記実ＩＰアドレスを宛先とするパケットを、ネットワークを介して宛先の実ノードへ出力するステップと、
をコンピュータに実行させることを特徴とするクラスタ管理プログラム。
【００８６】
【発明の効果】
以上に詳述したように本発明によれば、コーディネータノードが実ノードの負荷状態や故障等に応じてＶＩＰをＲＩＰへ割り振ることにより、クライアント側でノードの変更の処理を行うことなく、ノードの負荷の変化やノードの故障に柔軟に対応でき、高可用性を実現することができる。
【図面の簡単な説明】
【図１】本実施の形態に係るクラスタシステムの構成の一例を示すブロック図である。
【図２】本実施の形態に係るコーディネータノードの機能の一例を示すブロック図である。
【図３】カプセル化とカプセル化解除の動作の一例を示す図である。
【図４】ＩＰ層の実装の一例を示すブロック図である。
【図５】本実施の形態に係る実ノードの機能の一例を示すブロック図である。
【図６】本実施の形態に係るクラスタシステムのｒｓｈモードにおける動作の一例を示すシーケンス図である。
【図７】ｒｓｈモードにおけるＩＰ管理テーブルの一例を示す図である。
【図８】本実施の形態に係るクラスタシステムのＨＰＣモードにおける動作の一例を示すシーケンス図である。
【図９】ＨＰＣモードにおけるＩＰ管理テーブルの一例を示す図である。
【図１０】本実施の形態に係るクラスタシステムの構成の他の一例を示すブロック図である。
【図１１】本実施の形態に係るクラスタシステムのＷＷＷモードにおけるサーバ起動の動作の一例を示すシーケンス図である。
【図１２】ＷＷＷモードにおけるＩＰ管理テーブルの一例を示す図である。
【図１３】本実施の形態に係るクラスタシステムのＷＷＷモードにおけるサービス提供時の動作の一例を示すシーケンス図である。
【図１４】本実施の形態に係るクラスタシステムのＷＷＷモードにおけるサーバ停止の動作の一例を示すシーケンス図である。
【符号の説明】
１クラスタシステム、２コーディネータノード、２１ＮＩＣ、２１１ネットワークデバイス、２２ＩＰ層、２２１ＩＰ処理部、２２２仮想ノード提供部、２２３パケット割り振り部、２３コーディネータ、２３１負荷状態検出部、２３２ノード割り当て部、２３３ブロードキャスト部、３ａ，３ｂ，３ｃ実ノード、３１ＮＩＣ、３１１ネットワークデバイス、３１２トンネルデバイス、３２ＩＰ層、３２１ＩＰ処理部、３２２カプセル解除部、３２３カプセル化部、３３アプリケーション実行部、４ネットワーク、５クライアント、６オペレータ、７ｎｅｔｆｉｌｔｅｒ、７１ＮＦ＿ＩＰ＿ＰＲＥ＿ＲＯＵＴＩＮＧ、７２ｒｏｕｔｉｎｇ、７３ＮＦ＿ＩＰ＿ＬＯＣＡＬ＿ＩＮ、７４ＮＦ＿ＩＰ＿ＦＯＲＷＡＲＤ、７５ＮＦ＿ＩＰ＿ＬＯＣＡＬ＿ＯＵＴ、７６ｒｏｕｔｉｎｇ、７７ＮＦ＿ＩＰ＿ＰＯＳＴ＿ＲＯＵＴＩＮＧ、８下位層、９上位層。
[0001]
[Technical field of the invention]
The present invention relates to a computer node, a cluster system, a cluster management method, and a cluster management program that can flexibly cope with configuration changes, such as node failures, in a cluster system in which computers on a network serve as nodes and a plurality of nodes operate as one system.
[0002]
[Prior art]
A cluster system is a system in which a plurality of nodes are operated to execute processing for the same purpose, thereby raising processing capacity and reliability beyond the limits of a single node. In general, there are three types of cluster systems: a failover type, a load balancing type, and an HPC (High Performance Computing) type.
[0003]
First, the failover type will be described. In the failover type, two or more nodes are operated, and if one of them becomes inoperable for some reason, another node that has been standing by as a backup takes over its processing, thereby improving HA (High Availability).
[0004]
Next, the load balancing type will be described. In the load balancing type, scalability is realized by multiplexing servers such as WWW (World Wide Web) and FTP (File Transfer Protocol) servers. More specifically, IP-level sessions addressed to a single load balancer are allocated to a plurality of service nodes located behind it, so that the load that would otherwise fall on one node is distributed. There are several allocation methods; common configurations are the round-robin type, which allocates processing in order, and the dynamic load balancer, which monitors the load of the service nodes and the network traffic and allocates processing to lightly loaded service nodes.
[0005]
Next, the HPC type will be described. In the HPC type, a parallel processing application is executed at high speed by having a plurality of nodes operate cooperatively. If the data transfer bandwidth of the interconnect between nodes is narrow, it becomes a bottleneck and the overall processing performance drops, so the nodes may be connected by a high-speed interface such as Gigabit Ethernet or Myrinet. Libraries such as MPI (Message Passing Interface) and PVM (Parallel Virtual Machine) exist for writing parallel processing applications, and these are used in academic research together with numerical computation libraries.
[0006]
[Problems to be solved by the invention]
However, in the cluster systems described above, the user application is executed directly on the nodes. As a result, such systems cannot respond flexibly to node failures or configuration changes; for example, node failures and the like had to be handled separately for each user application.
[0007]
The present invention has been made in view of the problem described above. Its object is to provide a computer node, a cluster system, a cluster management method, and a cluster management program that, by concealing and virtualizing nodes so that user applications do not directly operate the nodes in the cluster system, can flexibly respond to configuration changes such as node failures without any change to the user application.
[0008]
[Means for Solving the Problems]
In order to solve the above-described problem, the present invention is a physical computer node in a cluster system that provides at least one virtual computer node to a client, the physical computer node comprising: an IP layer that stores an IP management table, which is a correspondence table between a virtual IP address (the IP address of the virtual computer node) and a real IP address (the IP address of the physical computer node), and that performs communication using virtual IP addresses based on the IP management table; and a network device that connects to other computer nodes and to the client via a network.
[0009]
According to such a configuration, since the computer nodes of the cluster system have a common IP management table, communication using a virtual IP address can be performed in the cluster system.
[0010]
Further, in the computer node according to the present invention, the computer node further comprises an application execution unit that executes an application specified by the client.
[0011]
According to such a configuration, a high-availability cluster system can be provided by hiding a computer node that executes an application from a client and dynamically allocating it as a virtual computer node.
[0012]
Also, in the computer node according to the present invention, the IP layer comprises an encapsulation unit and a decapsulation unit. When a first packet with a first IP header addressed to a virtual IP address is input from the application execution unit, the encapsulation unit looks up the real IP address corresponding to the destination virtual IP address in the IP management table, encapsulates the first packet by adding a second IP header addressed to the found real IP address, and outputs the resulting second packet to the network device. When a third packet with a third IP header addressed to a real IP address is input from the network device, the decapsulation unit removes the third IP header from the third packet to generate a fourth packet and, if the virtual IP address in the fourth IP header (the header addressed to a virtual IP address) is the virtual IP address of its own computer node, outputs the fourth packet to the application execution unit.
[0013]
According to such a configuration, the computer node can realize communication between virtual computer nodes by converting the virtual IP address and the real IP address according to the common IP management table.
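The encapsulation and decapsulation steps described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel-level implementation: the `Packet` class, the `IP_TABLE` contents, and the addresses are hypothetical, and a nested object stands in for a real IP header. The constant 4 is the standard protocol number for IP-in-IP encapsulation.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union

IPPROTO_IPIP = 4  # standard protocol number for IP-in-IP encapsulation
IPPROTO_TCP = 6


@dataclass(frozen=True)
class Packet:
    """Hypothetical stand-in for an IP packet: destination, protocol, payload."""
    dst: str
    protocol: int
    payload: Union[bytes, "Packet"]


# Hypothetical IP management table: virtual IP address -> real IP address.
IP_TABLE: Dict[str, str] = {
    "10.0.0.1": "192.168.1.11",
    "10.0.0.2": "192.168.1.13",
}


def encapsulate(inner: Packet, table: Dict[str, str]) -> Packet:
    """Add an outer header addressed to the real IP found for the inner VIP."""
    rip = table[inner.dst]  # raises KeyError if the VIP has no assignment
    return Packet(dst=rip, protocol=IPPROTO_IPIP, payload=inner)


def decapsulate(outer: Packet, own_vip: str) -> Optional[Packet]:
    """Remove the outer header; return the inner packet only if it is ours."""
    if outer.protocol != IPPROTO_IPIP or not isinstance(outer.payload, Packet):
        return None  # not an encapsulated packet
    inner = outer.payload
    return inner if inner.dst == own_vip else None


if __name__ == "__main__":
    first = Packet(dst="10.0.0.2", protocol=IPPROTO_TCP, payload=b"result")
    second = encapsulate(first, IP_TABLE)
    print(second.dst)                              # real IP of the target node
    print(decapsulate(second, "10.0.0.2") == first)
```

Because the inner packet is carried unchanged, the receiving application still sees the virtual IP address as the destination, which is what lets nodes address each other by virtual address only.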
[0014]
Further, the computer node according to the present invention further comprises a tunnel device that is treated in the same manner as the network device and that outputs packets addressed to the virtual IP address of its own computer node to the IP layer.
[0015]
According to such a configuration, the IP layer can process a packet addressed to a virtual IP address as well as a packet addressed to a real IP address.
[0016]
Further, the computer node according to the present invention further comprises: a load state detection unit that detects the load state of other computer nodes; a node assignment unit that assigns the real IP address to the virtual IP address based on the load state and creates the IP management table; and a broadcast unit that broadcasts the IP management table to the other computer nodes. The IP layer comprises: a virtual node providing unit that provides the virtual IP address to the client; and a packet allocating unit that, when a packet addressed to the virtual IP address is input from the network device, looks up the real IP address for the virtual IP address in the IP management table and outputs a packet addressed to the found real IP address to the network device.
[0017]
According to such a configuration, virtual IP addresses are assigned to real IP addresses in the cluster system according to the load state, failures, and the like of the other computer nodes, so that changes in the load of a computer node and failures of a computer node can be handled flexibly without the client changing the IP address of the computer node, and high availability can be realized.
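As an illustration of how a node assignment unit might build the IP management table from load information, the following sketch assigns each requested virtual IP address to the least-loaded real node and remaps an entry when a node fails. The function names, the one-node-per-VIP policy, and the load figures are assumptions made for this example; the text above specifies only that assignment is based on the load state.

```python
from typing import Dict, List


def assign_nodes(vips: List[str], load_by_rip: Dict[str, float]) -> Dict[str, str]:
    """Build a VIP -> RIP table, giving each VIP a distinct least-loaded node."""
    if len(vips) > len(load_by_rip):
        raise ValueError("not enough real nodes for the requested virtual nodes")
    # Order real IPs by load; ties are broken by address for determinism.
    candidates = sorted(load_by_rip, key=lambda rip: (load_by_rip[rip], rip))
    return dict(zip(vips, candidates))


def reassign_failed(table: Dict[str, str], failed_rip: str,
                    load_by_rip: Dict[str, float]) -> Dict[str, str]:
    """Return a new table in which VIPs on the failed node use a spare node."""
    in_use = set(table.values()) - {failed_rip}
    spares = sorted(
        (rip for rip in load_by_rip if rip != failed_rip and rip not in in_use),
        key=lambda rip: (load_by_rip[rip], rip),
    )
    new_table = dict(table)
    for vip, rip in table.items():
        if rip == failed_rip:
            if not spares:
                raise RuntimeError("no spare real node available")
            new_table[vip] = spares.pop(0)
    return new_table


if __name__ == "__main__":
    # Hypothetical CPU utilization reported for three real nodes.
    loads = {"RIP#a": 0.7, "RIP#b": 0.1, "RIP#c": 0.4}
    table = assign_nodes(["VIP#1", "VIP#2"], loads)
    print(table)
    print(reassign_failed(table, "RIP#b", loads))
```

The client keeps using the same virtual IP address in both cases; only the table entry behind it changes.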
[0018]
Further, in the computer node according to the present invention, when a first packet with a first IP header addressed to the virtual IP address is input from the network device, the packet allocating unit encapsulates the first packet by adding a second IP header addressed to the found real IP address, and outputs the resulting second packet to the network device.
[0019]
According to such a configuration, the computer node can realize communication with the virtual computer node by converting the virtual IP address and the real IP address according to the common IP management table.
[0020]
Further, in the computer node according to the present invention, the virtual node providing unit allocates at least one IP address different from a real IP address to the network device.
[0021]
According to such a configuration, requests for a plurality of virtual nodes can be received and allocated to physical computer nodes.
[0022]
Further, in the computer node according to the present invention, the node assignment unit assigns a plurality of real IP addresses to one virtual IP address.
[0023]
According to such a configuration, it is possible to configure a load-balancing type cluster system using virtual computer nodes.
[0024]
Further, in the computer node according to the present invention, each time the client issues a request to the one virtual IP address, the node allocating unit changes the real IP address to which the request is transferred.
[0025]
According to such a configuration, even when a large number of requests are received from a client to a virtual computer node, the load on the physical computer node can be distributed.
[0026]
Further, in the computer node according to the present invention, the broadcast unit broadcasts only an updated entry in the IP management table to another computer node.
[0027]
According to such a configuration, it is possible to reduce a transfer amount of data relating to the IP management table.
[0028]
Further, in the computer node according to the present invention, the broadcast unit transmits only the entry requested from another computer node in the IP management table to only the requesting computer node.
[0029]
According to such a configuration, it is possible to reduce a transfer amount of data relating to the IP management table.
[0030]
Further, the present invention is a cluster system that provides the services of a plurality of virtual computer nodes to a client, the cluster system comprising: at least one real node, which is a computer node that executes an application; and at least one coordinator node, which is a computer node that assigns the virtual IP address to the real IP address of the real node.
[0031]
According to such a configuration, the coordinator node assigns the virtual IP address to a real IP address according to the load state or a failure of the real node, so the client does not need to change the IP address it uses for the node. The system can therefore respond flexibly to changes in node load and to node failures, and can realize high availability.
[0032]
In the cluster system according to the present invention, the coordinator node transmits the IP management table to the real node, and the real node notifies the coordinator node that the IP management table has been received.
[0033]
According to such a configuration, the coordinator node can reliably distribute the IP management table to the real nodes.
[0034]
The present invention also provides a cluster management method for providing at least one virtual computer node to a client and managing at least one real node, which is a computer node that actually executes an application specified by the client, the method comprising the steps of: connecting the real node and the client via a network; providing the client with a virtual IP address that is the IP address of the virtual computer node; detecting the load state of the real node; assigning a real IP address that is the IP address of the real node to the virtual IP address based on the load state, and creating an IP management table; broadcasting the IP management table to the real nodes; and, when a packet addressed to the virtual IP address is input from the client via the network, searching the IP management table for the real IP address corresponding to the virtual IP address and outputting a packet addressed to the retrieved real IP address to the destination real node via the network.
[0035]
According to such a configuration, the coordinator node assigns the virtual IP address to a real IP address according to the load state or a failure of the real node, so the client does not need to change the IP address it uses for the node. The system can therefore respond flexibly to changes in node load and to node failures, and can realize high availability.
[0036]
Further, the present invention is a cluster management program, stored on a computer-readable medium, for providing at least one virtual computer node to a client and managing at least one real node, which is a computer node that actually executes an application designated by the client, the program causing a computer to execute the steps of: connecting the real node and the client via a network; providing the client with a virtual IP address that is the IP address of the virtual computer node; detecting the load state of the real node; assigning a real IP address that is the IP address of the real node to the virtual IP address based on the load state, and creating an IP management table; broadcasting the IP management table to the real nodes; and, when a packet addressed to the virtual IP address is input from the client via the network, searching the IP management table for the real IP address corresponding to the virtual IP address and outputting a packet addressed to the retrieved real IP address to the destination real node via the network.
[0037]
According to such a configuration, the coordinator node assigns the virtual IP address to a real IP address according to the load state or a failure of the real node, so the client does not need to change the IP address it uses for the node. The system can therefore respond flexibly to changes in node load and to node failures, and can realize high availability.
In the present invention, the computer-readable medium includes portable storage media such as a CD-ROM, a flexible disk, a DVD, a magneto-optical disk, and an IC card; a database that holds computer programs; other computers and their databases; and transmission media on a line.
[0038]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
Embodiment 1.
In this embodiment, a case where the cluster system of the present invention is used in a remote shell (rsh) mode will be described. In the rsh mode, one application is executed by one real node.
[0039]
First, the configuration of the cluster system according to the present embodiment will be described with reference to FIG. FIG. 1 is a block diagram illustrating an example of a configuration of a cluster system according to the present embodiment. As shown in FIG. 1, a cluster system 1 according to the present embodiment includes a coordinator node 2 and real nodes 3a, 3b, 3c. Each node is connected to each other via a network 4 and to a client 5 via the network 4.
[0040]
The cluster system according to the present embodiment prohibits a user application from directly operating a node in the cluster. This is realized by providing only the virtual node that is a virtual computer node to the client 5 and hiding the real node that is a physical computer node that actually executes the application. The coordinator node is a computer node that allocates a virtual node or application execution to a real node according to a request from a client, a load state of the real node, or the like. The real node actually executes the application according to the assignment of the coordinator node. When the application terminates, the real node is released, and the virtual node and application execution are deallocated.
[0041]
Next, the function of the coordinator node according to the present embodiment will be described using FIG. FIG. 2 is a block diagram illustrating an example of a function of the coordinator node according to the present embodiment. As shown in FIG. 2, the coordinator node 2 according to the present embodiment, viewed with a focus on the IP (Internet Protocol) layer in the hierarchical model, is roughly divided into an NIC (Network Interface Card) 21 belonging to a lower layer, an IP layer 22, and a coordinator 23 belonging to an upper layer. The NIC 21 includes a network device 211. The IP layer 22 includes an IP processing unit 221, a virtual node providing unit 222, and a packet allocating unit 223. The coordinator 23 includes a load state detection unit 231, a node assignment unit 232, and a broadcast unit 233.
[0042]
Next, the operation of the coordinator node according to the present embodiment will be described. The load state detector 231 detects the load state of each of the real nodes 3a, 3b, 3c. The load state indicates what process is currently running at what CPU usage rate.
[0043]
The node allocating unit 232 sets a virtual IP address (VIP: Virtual IP Address), which is the IP address of a virtual node, in the IP management table. When it receives a node allocation request from outside, it assigns the VIP to a real IP address (RIP: Real IP Address), which is the actual IP address of a real node, based on the load state of each of the real nodes 3a, 3b, and 3c, and generates the IP management table. The IP management table is also updated as needed. The IP management table is a correspondence table for looking up a RIP from a VIP.
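The assignment step can be illustrated with a small Python sketch. It is a toy model rather than the patented implementation: the class name `NodeAssigner`, the load values, and the "least-loaded free node" policy are assumptions made for illustration, since the text only says that the assignment is based on the load state.

```python
from dataclasses import dataclass, field


@dataclass
class NodeAssigner:
    """Toy model of the node assignment unit 232 (all names are illustrative)."""
    load: dict                                   # RIP -> CPU usage from load detection
    table: dict = field(default_factory=dict)    # IP management table: VIP -> [RIP, ...]

    def assign(self, vip):
        """Bind vip to the least-loaded RIP that is not yet assigned (assumed policy)."""
        in_use = {rip for rips in self.table.values() for rip in rips}
        free = [rip for rip in self.load if rip not in in_use]
        if not free:
            raise RuntimeError("no free real node")
        rip = min(free, key=lambda r: self.load[r])
        self.table.setdefault(vip, []).append(rip)
        return rip

    def release(self, vip):
        """Remove the entry for vip once its application has finished."""
        self.table.pop(vip, None)


# Hypothetical load figures for real nodes 3a, 3b and 3c.
assigner = NodeAssigner(load={"RIP#a": 0.10, "RIP#b": 0.80, "RIP#c": 0.35})
first = assigner.assign("VIP#1")
second = assigner.assign("VIP#2")
```

With these example loads the heavily loaded node 3b is left unassigned, and the table maps each VIP to a list of RIPs so that one VIP can later hold several RIPs.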
[0044]
The broadcast unit 233 broadcasts the IP management table to all the real nodes 3a, 3b, and 3c whenever the IP management table is updated. Here, as an example, the whole IP management table is broadcast on each update. Alternatively, only the changed entries of the IP management table may be broadcast, or only the entries requested by a real node may be transmitted. Further, so that the coordinator node 2 can confirm that the IP management table has arrived, each real node may return information indicating that the IP management table has been received to the coordinator node 2.
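The option of sending only changed entries, together with the acknowledgement from each real node, can be sketched in Python as follows. The function names, the use of `None` to mark a deleted entry, and the `"ACK"` string are assumptions for illustration; the text does not specify a message format.

```python
def diff_entries(old, new):
    """Return only the entries that differ: changed or added VIPs map to their
    new RIP list, deleted VIPs map to None (an assumed encoding)."""
    changed = {vip: rips for vip, rips in new.items() if old.get(vip) != rips}
    changed.update({vip: None for vip in old if vip not in new})
    return changed


def apply_update(table, update):
    """Real-node side: merge the received entries and acknowledge receipt."""
    for vip, rips in update.items():
        if rips is None:
            table.pop(vip, None)
        else:
            table[vip] = list(rips)
    return "ACK"


old = {"VIP#1": ["RIP#a"]}
new = {"VIP#1": ["RIP#a"], "VIP#2": ["RIP#c"]}
update = diff_entries(old, new)

# Each real node holds its own copy of the table and applies the same update.
replicas = {name: {vip: list(rips) for vip, rips in old.items()}
            for name in ("3a", "3b", "3c")}
acks = {name: apply_update(table, update) for name, table in replicas.items()}
```

Only the new VIP#2 entry travels over the network here, and the coordinator can compare `acks` against the list of real nodes to confirm delivery.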
[0045]
The virtual node providing unit 222 provides the client 5 with the VIP set in the IP management table as a virtual node. In order to show the VIP to the client 5, a plurality of VIPs are assigned to one network device 211.
[0046]
The IP processing unit 221 performs filtering and routing of an IP packet input from an upper layer or a lower layer, in the same way as a conventional IP layer. When an IP packet is input from the network device 211, the IP packet is passed to an upper layer if it is addressed to the own node; otherwise it is output to the network device 211 again for transfer to an appropriate network. Conversely, when an IP packet is input from an upper layer, the IP packet is output to the network device 211 for transmission to an appropriate network. In addition, when an IP packet having a VIP header is input from the network device 211, the IP processing unit 221 of the present invention outputs the IP packet to the packet allocation unit 223. Here, an IP header whose destination is a RIP is called a RIP header, and an IP header whose destination is a VIP is called a VIP header.
[0047]
The packet allocating unit 223 processes an IP packet that has arrived at a VIP provided by the virtual node providing unit 222. First, when an IP packet having a VIP header is input from the IP processing unit 221, the packet allocating unit 223 searches the IP management table for the RIP corresponding to the VIP, and obtains the RIP of the real node to which the packet is actually to be sent. Next, as shown in FIG. 3A, the packet allocating unit 223 encapsulates the IP packet having the VIP header by further adding a RIP header. The encapsulated IP packet is output to the IP processing unit 221 and transmitted to the external RIP via the network device 211.
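As a byte-level illustration of this encapsulation, the following Python sketch builds a minimal IPv4 header, looks up the RIP for the inner packet's destination VIP, and prepends an outer RIP header. It is a sketch under stated assumptions: the addresses other than 192.168.1.10, the helper names, and the single-RIP table are hypothetical, and header options and fragmentation are ignored.

```python
import socket
import struct

IPPROTO_IPIP = 4  # IANA protocol number for IPv4-in-IPv4 encapsulation


def checksum(data):
    """Standard Internet checksum (one's-complement sum of 16-bit words)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src, dst, proto, payload_len):
    """Build a 20-byte IPv4 header without options."""
    head = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + payload_len, 0, 0, 64,
                       proto, 0, socket.inet_aton(src), socket.inet_aton(dst))
    return head[:10] + struct.pack("!H", checksum(head)) + head[12:]


def encapsulate(inner, table, own_rip):
    """Look up the RIP for the inner packet's VIP and add an outer RIP header."""
    vip = socket.inet_ntoa(inner[16:20])          # destination of the VIP header
    rip = table[vip]
    return ipv4_header(own_rip, rip, IPPROTO_IPIP, len(inner)) + inner


table = {"192.168.1.10": "10.0.1.1"}              # VIP -> RIP (hypothetical RIP)
inner = ipv4_header("10.0.0.5", "192.168.1.10", 6, 4) + b"job!"
outer = encapsulate(inner, table, own_rip="10.0.1.254")
```

The original packet is carried unchanged as the payload of the outer packet, which is why the receiving real node can later recover the VIP header intact.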
[0048]
The network device 211 is connected to an external client 5 and each node via the network 4. It outputs the packet received from the outside to the IP processing unit 221 and transmits the packet input from the IP processing unit 221 to the outside.
[0049]
Next, an implementation example of the units that make up the IP layer 22 will be described. First, an implementation example of the virtual node providing unit 222 will be described. Here, a mechanism called an IP alias is used. IP aliases are supported as standard in the Linux kernel, and an additional IP address can be assigned with the command "# ifconfig eth0:0 192.168.1.10". By changing the number after ":", a plurality of IP addresses can be assigned. VIPs are assigned using this function.
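To show how several VIPs map onto alias numbers, the following Python sketch only generates the corresponding command strings and does not execute them. The function name and the second address are hypothetical; the command form follows the `ifconfig` example in the text.

```python
def alias_commands(device, vips):
    """Return one ifconfig IP-alias command per VIP, numbering aliases from 0."""
    return [f"ifconfig {device}:{index} {vip}" for index, vip in enumerate(vips)]


commands = alias_commands("eth0", ["192.168.1.10", "192.168.1.11"])
for command in commands:
    print(command)
```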
[0050]
Next, an implementation example of the IP processing unit 221 and the packet allocation unit 223 will be described. FIG. 4 is a block diagram illustrating an example of the implementation of the IP layer. The Linux kernel 2.4 incorporates a packet filtering mechanism. This is called a netfilter and is a framework for providing extensibility of a code for performing IP packet processing in a kernel. Here, the IP processing unit 221 is realized by using the function of the netfilter in the Linux IP layer. The netfilter 7 is connected to the lower layer 8 and the upper layer 9.
[0051]
Here, a specific operation of the conventional netfilter will be described. The packet received by the lower layer 8 is sent to the routing 72 via the NF_IP_PRE_ROUTING 71. If the packet is to be transferred to another node, the packet is sent to the NF_IP_FORWARD 74. Otherwise, the packet is sent to the NF_IP_LOCAL_IN 73. The IP packet sent to NF_IP_FORWARD 74 is sent to the lower layer 8 via NF_IP_POST_ROUTING 77 and sent to another node. On the other hand, the packet sent to NF_IP_LOCAL_IN 73 is sent to upper layer 9.
[0052]
The IP packet from the upper layer 9 is transmitted to the routing 76 via the NF_IP_LOCAL_OUT 75, further transmitted to the lower layer 8 via the NF_IP_POST_ROUTING 77, and transmitted to another node. The IP processing unit 221 is realized by the netfilter 7 described above.
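The traversal order described above can be summarized in a short Python sketch. It models only the sequence of hook points, not the Linux kernel code itself; the function name, the `"lower"`/`"upper"` labels, and the sample addresses are assumptions for illustration.

```python
def hook_path(origin, dst, local_addrs):
    """Return the netfilter stages a packet passes through, in order.

    origin is "lower" for a packet received from the lower layer 8 and
    "upper" for a packet generated by the upper layer 9.
    """
    if origin == "upper":
        return ["NF_IP_LOCAL_OUT", "routing", "NF_IP_POST_ROUTING"]
    path = ["NF_IP_PRE_ROUTING", "routing"]
    if dst in local_addrs:
        return path + ["NF_IP_LOCAL_IN"]                  # delivered to the upper layer
    return path + ["NF_IP_FORWARD", "NF_IP_POST_ROUTING"]  # forwarded to another node


local = {"192.168.1.10"}
incoming = hook_path("lower", "192.168.1.10", local)
forwarded = hook_path("lower", "10.0.0.9", local)
outgoing = hook_path("upper", "10.0.0.9", local)
```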
[0053]
Further, the packet allocating unit 223 can be realized by extending the function of the netfilter described in FIG. The netfilter 7 provides a mechanism for calling hook functions at each of NF_IP_PRE_ROUTING 71, NF_IP_LOCAL_IN 73, NF_IP_FORWARD 74, NF_IP_LOCAL_OUT 75, and NF_IP_POST_ROUTING 77. Each of these has a list in which functions are registered; a function can be registered through the interface "int nf_register_hook(struct nf_hook_ops *reg)" and removed through the interface "int nf_unregister_hook(struct nf_hook_ops *reg)". Here, a structure of type struct nf_hook_ops is used to register a hook function. In the present embodiment, the packet allocating unit 223 can be realized by registering the operation of the packet allocating unit 223 as a hook function at NF_IP_LOCAL_IN 73.
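The registration mechanism can be modelled in Python as a list of callbacks per hook point. This is a simplified model and not kernel code: `register_hook`, `unregister_hook`, and `run_hooks` are hypothetical stand-ins for the kernel interfaces, packets are plain dictionaries, and the verdict names merely borrow the netfilter constants NF_ACCEPT and NF_STOLEN.

```python
from collections import defaultdict

ACCEPT, STOLEN = "NF_ACCEPT", "NF_STOLEN"
hooks = defaultdict(list)          # hook point name -> registered functions


def register_hook(point, fn):
    hooks[point].append(fn)


def unregister_hook(point, fn):
    hooks[point].remove(fn)


def run_hooks(point, packet):
    """Call each registered function; stop at the first non-ACCEPT verdict."""
    for fn in hooks[point]:
        verdict = fn(packet)
        if verdict != ACCEPT:
            return verdict
    return ACCEPT


table = {"VIP#1": "RIP#a"}         # IP management table (single RIP per VIP here)
sent = []                          # packets handed back for transmission


def packet_allocator(packet):
    """Model of the packet allocating unit 223 as a LOCAL_IN hook."""
    rip = table.get(packet["dst"])
    if rip is None:
        return ACCEPT              # not a VIP: continue normal local delivery
    sent.append({"dst": rip, "proto": "IPIP", "payload": packet})
    return STOLEN                  # the hook takes over the packet


register_hook("NF_IP_LOCAL_IN", packet_allocator)
verdict = run_hooks("NF_IP_LOCAL_IN", {"dst": "VIP#1", "data": "job"})
```

Packets that are not addressed to a VIP fall through with an ACCEPT verdict, so ordinary local traffic on the coordinator node is unaffected in this model.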
[0054]
Further, the encapsulation in the packet allocation unit 223 can be implemented by applying the IP tunneling function. The protocol of the IP packet addressed to the RIP is set to IPPROTO_IPIP. This is called the IP tunneling protocol and indicates that the packet is encapsulated.
[0055]
Next, the function of the real node according to the present embodiment will be described using FIG. FIG. 5 is a block diagram illustrating an example of a function of the real node according to the present embodiment. As shown in FIG. 5, the real nodes 3a, 3b, and 3c according to the present embodiment, viewed with a focus on the IP layer in the hierarchical model, are each roughly divided into a NIC 31 belonging to a lower layer, an IP layer 32, and an application execution unit 33 belonging to an upper layer. The NIC 31 includes a network device 311 and a tunnel device 312. The IP layer 32 includes an IP processing unit 321, a decapsulation unit 322, and an encapsulation unit 323.
[0056]
Next, the operation of the real node according to the present embodiment will be described. The application execution unit 33 holds the application executable file, executes the application according to the contents of the packet received from the client 5 via the coordinator node 2, and passes the execution result to the IP layer as a packet. When the execution result is transmitted to the client 5, a header addressed to the client is added to the packet as usual; when communication is performed with another real node, a VIP header is added to the packet.
[0057]
The IP processing unit 321 performs filtering and routing of an IP packet input from an upper layer or a lower layer, similarly to the conventional IP layer or the IP processing unit 221. Further, when an IP packet having a RIP header is input from a lower layer, the IP processing unit 321 of the present invention outputs the packet to the decapsulation unit 322. When an IP packet having a VIP header is input from an upper layer, the IP packet is output to the encapsulation unit 323.
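The two extra dispatch rules of the IP processing unit 321 can be written as a small Python decision function. The function name, the `"lower"`/`"upper"` labels, and the address sets are assumptions for illustration; everything else is treated as conventional IP processing.

```python
def dispatch_in_real_node(origin, dst, rips, vips):
    """Decide which unit of the real node handles a packet.

    origin: "lower" (from the NIC) or "upper" (from the application).
    rips / vips: sets of known real and virtual IP addresses.
    """
    if origin == "lower" and dst in rips:
        return "decapsulation unit 322"      # RIP header arriving from the network
    if origin == "upper" and dst in vips:
        return "encapsulation unit 323"      # VIP header produced by the application
    return "conventional IP processing"      # e.g. a reply addressed to the client


rips, vips = {"RIP#a"}, {"VIP#1", "VIP#2"}
from_network = dispatch_in_real_node("lower", "RIP#a", rips, vips)
to_peer = dispatch_in_real_node("upper", "VIP#2", rips, vips)
to_client = dispatch_in_real_node("upper", "client", rips, vips)
```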
[0058]
When an IP packet having a RIP header is input from the IP processing unit 321, the decapsulation unit 322 removes the RIP header from the IP packet as shown in FIG. At this time, since the protocol of the encapsulated IP packet is the IP tunneling protocol, the resulting IP packet having the VIP header is output to the tunnel device 312 via the IP processing unit 321.
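The reverse operation can be sketched in Python as follows. It is a minimal illustration, assuming option-free IPv4 headers with the checksum left at zero for brevity; the helper names and all addresses other than 192.168.1.10 are hypothetical. The check against the node's own VIPs mirrors the condition stated for the decapsulation unit in Supplementary Note 3.

```python
import socket
import struct

IPPROTO_IPIP = 4  # IANA protocol number for IPv4-in-IPv4 encapsulation


def header(src, dst, proto, payload_len):
    """Minimal 20-byte IPv4 header (checksum left at 0 to keep the sketch short)."""
    return struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + payload_len, 0, 0, 64,
                       proto, 0, socket.inet_aton(src), socket.inet_aton(dst))


def decapsulate(outer, own_vips):
    """Strip the outer RIP header; return the inner packet if it targets our VIP."""
    if outer[9] != IPPROTO_IPIP:
        return None                         # not a tunnelled packet
    header_len = (outer[0] & 0x0F) * 4      # IHL field, in bytes
    inner = outer[header_len:]
    vip = socket.inet_ntoa(inner[16:20])    # destination of the inner VIP header
    return inner if vip in own_vips else None


inner = header("10.0.0.5", "192.168.1.10", 6, 0)
outer = header("10.0.1.254", "10.0.1.1", IPPROTO_IPIP, len(inner)) + inner
delivered = decapsulate(outer, own_vips={"192.168.1.10"})
```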
[0059]
When an IP packet having a VIP header is input from the IP processing unit 321, the encapsulation unit 323 searches the IP management table broadcast from the coordinator node 2 for the RIP corresponding to the VIP, and obtains the RIP of the real node to which the packet is actually to be sent. Next, as shown in FIG. 3A, the encapsulation unit 323 encapsulates the IP packet having the VIP header by further adding a RIP header. The encapsulated IP packet is transmitted to the RIP via the IP processing unit 321 and the network device 311.
[0060]
The network device 311 is similar to the network device 211, and is connected to an external client 5 and each node via the network 4. It outputs the packet received from the outside to the IP processing unit 321 and transmits the packet input from the IP processing unit 321 to the outside.
[0061]
When the IP packet having the VIP header is input from the IP processing unit 321, the tunnel device 312 outputs the IP packet to the IP processing unit 321 as it is. Since the destination of the decapsulated IP packet is VIP, it is received by the tunnel device 312 of the own node and input to the IP processing unit 321 again. The IP processing unit 321 outputs an IP packet having a VIP header from the tunnel device 312 to the application execution unit 33.
[0062]
Here, an implementation example of the IP processing unit 321, the decapsulation unit 322, and the encapsulation unit 323, which are IP layers, will be described. Similarly to the IP processing unit 221, the IP processing unit 321 is realized using the function of the netfilter in the Linux IP layer, and the operation of the decapsulation unit 322 is registered as a hook function in NF_IP_LOCAL_IN73, and the encapsulation unit 323 is registered in NF_IP_LOCAL_OUT75. By registering the operation as a hook function, the decapsulation unit 322 and the encapsulation unit 323 can be realized.
[0063]
In addition, the decapsulation unit 322 and the encapsulation unit 323 can be implemented by applying an IP tunneling function. By encapsulating an IP packet with an IP header by IP tunneling, the packet can be transmitted to a correct destination regardless of the encapsulated packet. Here, the protocol of the IP packet addressed to the RIP is set to IPPROTO_IPIP.
[0064]
In IP tunneling, unlike address translation such as NAT (Network Address Translation), a response can be returned directly to a request from the client 5. As described above, since the tunnel device 312 is provided in all the real nodes 3a, 3b, and 3c, and the coordinator node 2 and all the real nodes 3a, 3b, and 3c hold the same IP management table, communication at the VIP level can be realized, including communication between real nodes.
[0065]
Next, the operation in the rsh mode will be described with reference to FIG. FIG. 6 is a sequence diagram illustrating an example of an operation in the rsh mode of the cluster system according to the present embodiment. In the rsh mode, an application can be executed on the cluster system 1 by the command “% rsh vnode application [args...]”, similar to rsh, which is a general UNIX command. Here, vnode indicates a virtual node. Here, for the sake of explanation, the RIP of the real node 3a is RIP # a, the RIP of the real node 3b is RIP # b, and the RIP of the real node 3c is RIP # c.
[0066]
First, the user inputs an rsh command using the client 5 and specifies a destination. Here, for example, VIP # 1 is specified as the destination. Thereby, a node assignment request is made (S101). The coordinator node 2 that has received the node allocation request allocates, for example, RIP # a to VIP # 1 according to the load state of each of the real nodes 3a, 3b, 3c (S102). If the application can be executed, the real node 3a returns information indicating that to the coordinator node 2 (S103). Next, the coordinator node 2 generates the IP management table shown in FIG. 7, broadcasts it to each of the real nodes 3a, 3b, and 3c, and returns a node allocation completion notification indicating that the node allocation has been successful to the client 5 ( S104).
[0067]
Upon receiving the node assignment completion notification, the client 5 submits the job to VIP # 1 (S105). The job is passed to the real node 3a via the coordinator node 2 (S106). The real node 3a executes the application, which is a job, and returns the execution result to the client 5 (S107). The real node 3a returns information indicating that the job has been completed to the coordinator node 2 (S108). The coordinator node 2 that has detected the end of the job deletes the entries of VIP # 1 and RIP # a in the IP management table, and releases the node by broadcasting the result (S109). The real node 3a returns information indicating that the release of the node has been confirmed to the coordinator node 2 (S110).
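The rsh-mode sequence S101 to S110 can be traced with a toy Python model of the coordinator. The class and method names, the load values, and the log format are assumptions for illustration, and network messages and acknowledgements from the real nodes are reduced to in-process calls.

```python
class Coordinator:
    """Toy coordinator for the rsh-mode sequence; comments name the steps."""

    def __init__(self, loads):
        self.loads = loads       # RIP -> load (hypothetical values)
        self.table = {}          # IP management table: VIP -> RIP
        self.log = []            # record of broadcasts, for inspection

    def broadcast(self, step):
        self.log.append(f"{step}: broadcast {dict(self.table)}")

    def request_node(self, vip):                       # S101: request from the client
        rip = min(self.loads, key=self.loads.get)      # S102: pick by load state
        self.table[vip] = rip                          # S103: node reports it can run
        self.broadcast("S104")                         # S104: distribute the table
        return "allocation complete"

    def submit(self, vip, job):                        # S105: job sent to the VIP
        rip = self.table[vip]                          # S106: forwarded to the real node
        return f"{job} executed on {rip}"              # S107: result returned to client

    def job_finished(self, vip):                       # S108: real node reports the end
        del self.table[vip]                            # S109: delete entry, broadcast
        self.broadcast("S109")


coordinator = Coordinator({"RIP#a": 0.10, "RIP#b": 0.80, "RIP#c": 0.35})
coordinator.request_node("VIP#1")
result = coordinator.submit("VIP#1", "application")
coordinator.job_finished("VIP#1")
```

After the final step the table is empty again, which corresponds to the node being released for later requests.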
[0068]
As described above, the coordinator node 2 assigns VIPs to RIPs according to the load state and failures of the real nodes, so it can respond flexibly to changes in node load and to node failures, thereby realizing high availability. For example, even if a node failure occurs, recovery requires only an automatic change to the assignment between VIP and RIP.
[0069]
Embodiment 2.
In the present embodiment, a case where the cluster system of the present invention described in FIG. 1 is used in the HPC mode will be described. In the HPC mode, execution of a plurality of applications is shared between a plurality of real nodes. Generally, in an HPC type application, communication between nodes is performed. In the present embodiment, the coordinator node 2 provides a plurality of virtual nodes to the client 5 and performs communication between real nodes, so that an HPC-type application can operate.
[0070]
Hereinafter, the operation in the HPC mode will be described with reference to FIG. FIG. 8 is a sequence diagram showing an example of an operation in the HPC mode of the cluster system according to the present embodiment. Here, for the sake of explanation, the RIP of the real node 3a is RIP # a, the RIP of the real node 3b is RIP # b, and the RIP of the real node 3c is RIP # c.
[0071]
First, the user specifies a destination using the client 5. Here, for example, VIP # 1 and VIP # 2 are designated as destinations. As a result, a node assignment request is made (S201). The coordinator node 2 that has received the node allocation request allocates, for example, RIP # a to VIP # 1 according to the load state of each of the real nodes 3a, 3b, 3c (S202). If the application can be executed, the real node 3a returns information indicating that to the coordinator node 2 (S203). Similarly, the coordinator node 2 assigns, for example, RIP # c to VIP # 2 (S202). If the application can be executed, the real node 3c returns information indicating this to the coordinator node 2 (S203). Steps S202 and S203 are repeated for the requested number of nodes. Next, the coordinator node 2 generates the IP management table shown in FIG. 9 and broadcasts it to each of the real nodes 3a, 3b, 3c, and returns a node allocation completion notification indicating that the node allocation has been successful to the client 5 ( S204).
[0072]
Upon receiving the node assignment completion notification, the client 5 submits the job to VIP # 1 and VIP # 2 (S205). Here, for example, it is assumed that a job to be executed using the execution result obtained in VIP # 1 in VIP # 2 is given to VIP # 1 and VIP # 2. The job addressed to VIP # 1 is passed to the real node 3a via the coordinator node 2, and the job addressed to VIP # 2 is passed to the real node 3c via the coordinator node 2 (S206). The real node 3a executes an application, which is a job, and passes the execution result to the real node 3c. The real node 3c executes the application using the execution result of the real node 3a, and returns the execution result to the client 5 (S207). Further, the real nodes 3a and 3c return information indicating that the job has been completed to the coordinator node 2 (S208). The coordinator node 2 that has detected the end of the job deletes the entries of VIP # 1 and RIP # a and the entries of VIP # 2 and RIP # c in the IP management table, and releases the node by broadcasting the result. (S209). The real nodes 3a and 3c return information indicating that the release of the nodes has been confirmed to the coordinator node 2 (S210).
[0073]
As described above, when an HPC-type application is run, even if a real node fails, the coordinator node 2 automatically reassigns the processing that the failed real node was performing to an appropriate real node, so processing can continue without any action by the user.
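Reassignment after a failure can be sketched in Python as a rewrite of the IP management table. The function name, the load values, and the policy of preferring an unused, least-loaded node are assumptions for illustration; the text states only that the coordinator node chooses an appropriate real node.

```python
def reassign_failed(table, loads, failed_rip):
    """Return a new VIP -> RIP table with the failed node's VIPs moved elsewhere."""
    healthy = {rip: load for rip, load in loads.items() if rip != failed_rip}
    if not healthy:
        raise RuntimeError("no healthy real node left")
    in_use = {rip for rip in table.values() if rip != failed_rip}
    new_table = dict(table)
    for vip, rip in table.items():
        if rip == failed_rip:
            # Prefer a node that is not yet assigned; otherwise share a healthy one.
            candidates = [r for r in healthy if r not in in_use] or list(healthy)
            replacement = min(candidates, key=healthy.get)
            new_table[vip] = replacement
            in_use.add(replacement)
    return new_table


table = {"VIP#1": "RIP#a", "VIP#2": "RIP#c"}
loads = {"RIP#a": 0.10, "RIP#b": 0.80, "RIP#c": 0.35}
recovered = reassign_failed(table, loads, failed_rip="RIP#a")
```

Because clients and peer nodes address only VIP#1, they keep using the same address after the table is rebroadcast.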
[0074]
Embodiment 3.
In the present embodiment, a case where the cluster system of the present invention is used in the WWW mode will be described. For example, when a large number of requests are sent from clients to a plurality of servers such as WWW servers, IP-level load distribution can be performed by assigning a plurality of RIPs to one VIP. Such an operation mode is called the WWW mode; in the WWW mode, execution of one service is shared by a plurality of real nodes. In the present embodiment, the coordinator node 2 assigns one virtual node to a plurality of real nodes and distributes requests addressed to the virtual node from the client 5 among those real nodes, which makes it possible to configure a load-balancing cluster system.
[0075]
Hereinafter, the operation in the WWW mode will be described with reference to FIGS. First, the configuration of the cluster system according to the present embodiment will be described with reference to FIG. FIG. 10 is a block diagram showing another example of the configuration of the cluster system according to the present embodiment. In FIG. 10, the same reference numerals as those in FIG. 1 denote the same components as those shown in FIG. 1, and their description is omitted. In the WWW mode, an operator needs to start and stop the WWW servers, so an operator 6 is added to the configuration of FIG. 1. The operator 6 operates the coordinator node 2.
[0076]
Next, the operation of starting the server in the WWW mode will be described with reference to FIG. FIG. 11 is a sequence diagram showing an example of a server startup operation in the WWW mode of the cluster system according to the present embodiment. Here, for the sake of explanation, the RIP of the real node 3a is RIP # a, the RIP of the real node 3b is RIP # b, and the RIP of the real node 3c is RIP # c.
[0077]
First, the operator 6 specifies the number of WWW servers to be started up beforehand. Here, for example, three are specified. As a result, a node assignment request is made (S301). The coordinator node 2 that has received the node assignment request assigns, for example, RIP # a, RIP # b, and RIP # c to VIP # 1 according to the load status of each of the real nodes 3a, 3b, and 3c (S302). If each of the real nodes 3a, 3b, and 3c can execute the service, the real nodes 3a, 3b, and 3c return information indicating the execution to the coordinator node 2 (S303). Next, the coordinator node 2 generates the IP management table shown in FIG. 12 and broadcasts it to each of the real nodes 3a, 3b, 3c, and returns a node assignment completion notification indicating that the node assignment has been successful to the operator 6 ( S304).
[0078]
The operator 6 having received the node assignment completion notification issues a server start request (S305). The coordinator node 2 receiving the server start request instructs each of the real nodes 3a, 3b, 3c to start the server (S306). Each of the real nodes 3a, 3b, and 3c activates its own node and returns information indicating the activation to the coordinator node 2 (S307). The coordinator node 2 returns a server startup completion notification indicating that the server startup has been completed to the operator 6 (S308). As described above, by performing the operation of starting the server in the WWW mode, VIP # 1 is assigned RIP # a, RIP # b, and RIP # c in advance.
[0079]
Next, an operation at the time of providing a service in the WWW mode will be described with reference to FIG. FIG. 13 is a sequence diagram showing an example of an operation at the time of providing a service in the WWW mode of the cluster system according to the present embodiment. First, the client 5 sends an HTTP (Hypertext Transfer Protocol) request to VIP # 1 (S401). The request addressed to VIP # 1 is allocated to the real node 3a via the coordinator node 2 (S402). The real node 3a returns an HTTP response to the request to the client 5 (S403). The above is the operation when the service is provided in the WWW mode. Here, an example has been described in which a request addressed to VIP # 1 is allocated to the real node 3a, but the packet allocating unit 223 of the coordinator node 2 changes the real node to which the request is transferred every time a request from the client 5 is received.
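One simple way to change the destination on every request is round-robin selection, sketched below in Python. Round robin is an assumption chosen for illustration; the text says only that the real node changes with each request, and the class name is hypothetical.

```python
from itertools import cycle


class RoundRobinAllocator:
    """Pick the RIPs assigned to a VIP in turn, one per incoming request."""

    def __init__(self, table):
        # table: VIP -> list of RIPs, as in the WWW-mode IP management table
        self._cycles = {vip: cycle(rips) for vip, rips in table.items()}

    def pick(self, vip):
        return next(self._cycles[vip])


allocator = RoundRobinAllocator({"VIP#1": ["RIP#a", "RIP#b", "RIP#c"]})
picks = [allocator.pick("VIP#1") for _ in range(4)]
print(picks)
```

A load-aware policy could replace `cycle` with a choice based on the load state reported for each real node.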
[0080]
Next, the operation of stopping the server in the WWW mode will be described with reference to FIG. FIG. 14 is a sequence diagram illustrating an example of an operation of stopping the server in the WWW mode of the cluster system according to the present embodiment.
[0081]
First, the operator 6 issues a server stop request (S501). The coordinator node 2 receiving the server stop request instructs each of the real nodes 3a, 3b, 3c to stop the server (S502). Each of the real nodes 3a, 3b, 3c stops its own node and returns information indicating the stop to the coordinator node 2 (S503). The coordinator node 2 returns a server stop completion notification to the operator 6 indicating that the server stop has been completed (S504).
[0082]
The operator 6 having received the server stop completion notification issues a node release request (S505). Upon receiving the node release request, the coordinator node 2 deletes the entries of VIP # 1 and RIP # a, the entries of VIP # 1 and RIP # b, and the entries of VIP # 1 and RIP # c in the IP management table, and releases the nodes by broadcasting the result (S506). Each of the real nodes 3a, 3b, 3c returns information indicating that the release of the node has been confirmed to the coordinator node 2 (S507). The coordinator node 2 returns a node release end notification indicating that the node release has ended to the operator 6 (S508). Thus, the operation of stopping the server in the WWW mode ends.
[0083]
As described above, even when a large number of requests are received from the client 5 while the service is provided in the WWW mode, the coordinator node 2 can allocate each request to an appropriate real node according to the load state or a failure of the real nodes.
[0084]
The cluster system according to the present invention has been described in the first to third embodiments. In the configuration of the cluster system according to the present invention described with reference to FIG. 3, the coordinator node 2 creates an IP management table according to the situation. Accordingly, the system can be operated by switching among the three modes, namely the rsh mode described in the first embodiment, the HPC mode described in the second embodiment, and the WWW mode described in the third embodiment, or by combining the three modes. In addition, by providing the functions of the virtual node providing unit 222, the packet allocating unit 223, and the coordinator 23 in one of the real nodes, that real node can function as the coordinator node 2, so the system can cope even if the coordinator node 2 fails. Higher reliability can thereby be realized.
[0085]
(Supplementary Note 1) A physical computer node in a cluster system that provides at least one virtual computer node to a client, the computer node comprising:
An IP layer that stores an IP management table, which is a correspondence table between a virtual IP address, which is the IP address of the virtual computer node, and a real IP address, which is the IP address of the physical computer node, and that performs communication using the virtual IP address based on the IP management table; and
A network device that connects to another computer node and the client via a network.
(Supplementary Note 2) The computer node according to Supplementary Note 1, further comprising an application execution unit that executes an application specified by the client.
(Supplementary Note 3) The computer node according to Supplementary Note 2, wherein the IP layer comprises:
An encapsulation unit that, when a first packet with a first IP header addressed to a virtual IP address is input from the application execution unit, searches the IP management table for the real IP address corresponding to the destination virtual IP address, performs encapsulation that further attaches a second IP header addressed to the found real IP address to the first packet, and outputs the resulting second packet to the network device; and
A decapsulation unit that, when a third packet with a third IP header addressed to a real IP address is input from the network device, generates a fourth packet by removing the third IP header from the third packet and, if the virtual IP address to which the fourth IP header of the fourth packet is addressed is the virtual IP address of its own computer node, outputs the resulting fourth packet to the application execution unit.
(Supplementary Note 4) The computer node according to Supplementary Note 3, further comprising a tunnel device that is handled in the same manner as the network device and outputs packets addressed to the virtual IP address of its own computer node to the IP layer.
(Supplementary Note 5) The computer node according to Supplementary Note 1, further comprising:
A load state detection unit that detects the load state of another computer node;
A node allocating unit that assigns the real IP address to the virtual IP address based on the load state and creates the IP management table; and
A broadcast unit that broadcasts the IP management table to another computer node,
wherein the IP layer comprises:
A virtual node providing unit that provides the client with the virtual IP address; and
A packet allocating unit that, when a packet addressed to the virtual IP address is input from the network device, searches the IP management table for the real IP address corresponding to the virtual IP address and outputs a packet addressed to the found real IP address to the network device.
(Supplementary Note 6) The computer node according to Supplementary Note 5, wherein, when a first packet with a first IP header addressed to the virtual IP address is input from the network device, the packet allocating unit performs encapsulation that further attaches a second IP header addressed to the found real IP address to the first packet, and outputs the resulting second packet to the network device.
(Supplementary Note 7) The computer node according to Supplementary Note 5 or 6, wherein the virtual node providing unit assigns at least one IP address different from the real IP address to the network device.
(Supplementary Note 8) The computer node according to any one of Supplementary Notes 5 to 7, wherein the node allocating unit allocates a plurality of real IP addresses to one virtual IP address.
(Supplementary Note 9) The computer node according to Supplementary Note 8, wherein the node allocating unit changes the real IP address to which a request is transferred each time a request for the one virtual IP address arrives from the client.
(Supplementary Note 10) The computer node according to any one of Supplementary Notes 5 to 9, wherein the broadcast unit broadcasts only the updated entries of the IP management table to other computer nodes.
(Supplementary Note 11) The computer node according to any one of Supplementary Notes 5 to 10, wherein the broadcast unit transmits only the entry of the IP management table requested by another computer node, and only to the requesting computer node.
(Supplementary Note 12) A cluster system that provides a plurality of virtual computer nodes to a client, the cluster system comprising:
At least one real node, which is the computer node according to any one of Supplementary Notes 2 to 4 and executes an application; and
At least one coordinator node, which is the computer node according to any one of Supplementary Notes 5 to 11 and assigns the virtual IP address to the real IP address of the real node.
(Supplementary Note 13) The cluster system according to Supplementary Note 12, wherein the coordinator node sends the IP management table to the real node, and the real node notifies the coordinator node that it has received the IP management table.
(Supplementary Note 14) A cluster management method that provides at least one virtual computer node to a client and manages at least one real node, which is a computer node that actually executes an application specified by the client, the method comprising:
Connecting the real node and the client via a network;
Providing the client with a virtual IP address that is the IP address of the virtual computer node;
Detecting the load state of the real node;
Assigning a real IP address, which is the IP address of the real node, to the virtual IP address based on the load state, and creating an IP management table;
Broadcasting the IP management table to the real node; and
When a packet addressed to the virtual IP address is input from the client via the network, searching the IP management table for the real IP address corresponding to the virtual IP address, and outputting a packet addressed to the found real IP address to the destination real node via the network.
(Supplementary Note 15) A cluster management program stored on a computer-readable medium for providing at least one virtual computer node to a client and managing at least one real node, which is a computer node that actually executes an application specified by the client, the program causing a computer to execute:
Connecting the real node and the client via a network;
Providing the client with a virtual IP address that is the IP address of the virtual computer node;
Detecting the load state of the real node;
Assigning a real IP address, which is the IP address of the real node, to the virtual IP address based on the load state, and creating an IP management table;
Broadcasting the IP management table to the real node; and
When a packet addressed to the virtual IP address is input from the client via the network, searching the IP management table for the real IP address corresponding to the virtual IP address, and outputting a packet addressed to the found real IP address to the destination real node via the network.
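The steps shared by the method and program above, together with the encapsulation and decapsulation units, can be sketched roughly as follows. This is only an illustrative model under stated assumptions: packets are plain Python objects rather than real IP datagrams, the assignment policy (least-loaded real node first) is assumed, and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set, Union


@dataclass(frozen=True)
class Packet:
    dst: str                         # destination address of this IP header
    payload: Union["Packet", bytes]  # inner packet or application data


def create_table(vips: Iterable[str], loads: Dict[str, float]) -> Dict[str, str]:
    """Assign a RIP to each VIP based on load (assumed: least loaded first)."""
    pending = dict(loads)
    table: Dict[str, str] = {}
    for vip in vips:
        rip = min(pending, key=pending.get)
        table[vip] = rip
        pending[rip] += 1.0  # assumed cost of hosting one more virtual node
    return table


def broadcast(table: Dict[str, str], node_tables: Dict[str, Dict[str, str]]) -> None:
    """Give every real node its own copy of the IP management table."""
    for rip in node_tables:
        node_tables[rip] = dict(table)


def forward(first: Packet, table: Dict[str, str]) -> Packet:
    """Look up the RIP for the packet's VIP and attach a second (outer) header."""
    return Packet(dst=table[first.dst], payload=first)


def receive(outer: Packet, own_vips: Set[str]) -> Optional[Packet]:
    """Remove the outer header; deliver the inner packet only if its VIP is ours."""
    inner = outer.payload
    if isinstance(inner, Packet) and inner.dst in own_vips:
        return inner
    return None


# Hypothetical load readings for three real nodes.
loads = {"RIP#a": 0.7, "RIP#b": 0.2, "RIP#c": 0.5}
table = create_table(["VIP#1", "VIP#2"], loads)
node_tables: Dict[str, Dict[str, str]] = {rip: {} for rip in loads}
broadcast(table, node_tables)

request = Packet(dst="VIP#1", payload=b"GET /")
outer = forward(request, table)         # coordinator side
delivered = receive(outer, {"VIP#1"})   # real node that owns VIP#1
```

The client addresses only the VIP; the RIP appears solely in the outer header added by the coordinator, which is why the table can be recreated after a load change or failure without the client noticing.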
[0086]
[Effects of the Invention]
As described in detail above, according to the present invention, the coordinator node allocates a VIP to a RIP according to the load state or failure of the real nodes, so that the system can respond flexibly to a change in load or a node failure without the client performing any node change processing, and high availability can be realized.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating an example of a configuration of a cluster system according to an embodiment.
FIG. 2 is a block diagram illustrating an example of a function of a coordinator node according to the present embodiment.
FIG. 3 is a diagram illustrating an example of an operation of encapsulation and decapsulation.
FIG. 4 is a block diagram illustrating an example of implementation of an IP layer.
FIG. 5 is a block diagram illustrating an example of a function of a real node according to the present embodiment.
FIG. 6 is a sequence diagram showing an example of an operation in an rsh mode of the cluster system according to the present embodiment.
FIG. 7 is a diagram illustrating an example of an IP management table in rsh mode.
FIG. 8 is a sequence diagram showing an example of an operation in the HPC mode of the cluster system according to the present embodiment.
FIG. 9 is a diagram illustrating an example of an IP management table in the HPC mode.
FIG. 10 is a block diagram showing another example of the configuration of the cluster system according to the present embodiment.
FIG. 11 is a sequence diagram showing an example of a server startup operation in the WWW mode of the cluster system according to the present embodiment.
FIG. 12 is a diagram illustrating an example of an IP management table in a WWW mode.
FIG. 13 is a sequence diagram showing an example of an operation at the time of providing a service in the WWW mode of the cluster system according to the present embodiment.
FIG. 14 is a sequence diagram showing an example of an operation of stopping a server in the WWW mode of the cluster system according to the present embodiment.
[Explanation of symbols]
1 cluster system, 2 coordinator node, 21 NIC, 211 network device, 22 IP layer, 221 IP processing unit, 222 virtual node providing unit, 223 packet allocating unit, 23 coordinator, 231 load state detection unit, 232 node allocating unit, 233 broadcast unit, 3a, 3b, 3c real node, 31 NIC, 311 network device, 312 tunnel device, 32 IP layer, 321 IP processing unit, 322 decapsulation unit, 323 encapsulation unit, 33 application execution unit, 4 network, 5 client, 6 operator, 7 netfilter, 71 NF_IP_PRE_ROUTING, 72 routing, 73 NF_IP_LOCAL_IN, 74 NF_IP_FORWARD, 75 NF_IP_LOCAL_OUT, 76 routing, 77 NF_IP_POST_ROUTING, 8 lower layer, 9 upper layer.

Claims

A physical computer node in a cluster system that provides at least one virtual computer node to a client, the computer node comprising:
An IP layer that stores an IP management table, which is a correspondence table between a virtual IP address, which is the IP address of the virtual computer node, and a real IP address, which is the IP address of the physical computer node, and that performs communication using the virtual IP address based on the IP management table; and
A network device that connects to another computer node and the client via a network.

The computer node according to claim 1, further comprising an application execution unit that executes an application specified by the client.

The computer node according to claim 2, wherein the IP layer comprises:
An encapsulation unit that, when a first packet with a first IP header addressed to a virtual IP address is input from the application execution unit, searches the IP management table for the real IP address corresponding to the destination virtual IP address, performs encapsulation that further attaches a second IP header addressed to the found real IP address to the first packet, and outputs the resulting second packet to the network device; and
A decapsulation unit that, when a third packet with a third IP header addressed to a real IP address is input from the network device, generates a fourth packet by removing the third IP header from the third packet and, if the virtual IP address to which the fourth IP header of the fourth packet is addressed is the virtual IP address of its own computer node, outputs the resulting fourth packet to the application execution unit.

The computer node according to claim 3, further comprising a tunnel device that is handled in the same manner as the network device and outputs packets addressed to the virtual IP address of its own computer node to the IP layer.

The computer node according to claim 1, further comprising:
A load state detection unit that detects the load state of another computer node;
A node allocating unit that assigns the real IP address to the virtual IP address based on the load state and creates the IP management table; and
A broadcast unit that broadcasts the IP management table to another computer node,
wherein the IP layer comprises:
A virtual node providing unit that provides the client with the virtual IP address; and
A packet allocating unit that, when a packet addressed to the virtual IP address is input from the network device, searches the IP management table for the real IP address corresponding to the virtual IP address and outputs a packet addressed to the found real IP address to the network device.

The computer node according to claim 5, wherein, when a first packet with a first IP header addressed to the virtual IP address is input from the network device, the packet allocating unit performs encapsulation that further attaches a second IP header addressed to the found real IP address to the first packet, and outputs the resulting second packet to the network device.

The computer node according to claim 5 or 6, wherein the virtual node providing unit assigns at least one IP address different from the real IP address to the network device.

A cluster system that provides a plurality of virtual computer nodes to a client, the cluster system comprising:
At least one real node, which is the computer node according to any one of claims 2 to 4 and executes an application; and
At least one coordinator node, which is the computer node according to any one of claims 5 to 7 and assigns the virtual IP address to the real IP address of the real node.

A cluster management method that provides at least one virtual computer node to a client and manages at least one real node, which is a computer node that actually executes an application specified by the client, the method comprising:
Connecting the real node and the client via a network;
Providing the client with a virtual IP address that is the IP address of the virtual computer node;
Detecting the load state of the real node;
Assigning a real IP address, which is the IP address of the real node, to the virtual IP address based on the load state, and creating an IP management table;
Broadcasting the IP management table to the real node; and
When a packet addressed to the virtual IP address is input from the client via the network, searching the IP management table for the real IP address corresponding to the virtual IP address, and outputting a packet addressed to the found real IP address to the destination real node via the network.

A cluster management program stored on a computer-readable medium for providing at least one virtual computer node to a client and managing at least one real node, which is a computer node that actually executes an application specified by the client, the program causing a computer to execute:
Connecting the real node and the client via a network;
Providing the client with a virtual IP address that is the IP address of the virtual computer node;
Detecting the load state of the real node;
Assigning a real IP address, which is the IP address of the real node, to the virtual IP address based on the load state, and creating an IP management table;
Broadcasting the IP management table to the real node; and
When a packet addressed to the virtual IP address is input from the client via the network, searching the IP management table for the real IP address corresponding to the virtual IP address, and outputting a packet addressed to the found real IP address to the destination real node via the network.