JP3803233B2 - Network error handling method and node device

Info

Publication number: JP3803233B2
Application number: JP2000224041A
Authority: JP
Inventors: 靖小林
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2000-07-25
Filing date: 2000-07-25
Publication date: 2006-08-02
Anticipated expiration: 2020-07-25
Also published as: JP2002044127A

Description

【０００１】
【発明の属する技術分野】
本発明は、ダイナミックルーティング方式が採用されているネットワークにおいて、ネットワーク全体が異常状態に陥ったときにそれに対処する方法およびそのためのノード装置に関する。
【０００２】
【従来の技術】
ＩＰ（Internet Protocol ）ネットワークにおけるＯＳＰＦ（Open Shortest Path First) やＡＴＭ（Asynchronous Transfer Mode) ネットワークにおけるＰＮＮＩ（Private Network - Network Interface)などではルーティングテーブル（経路表）作成のためのネットワークトポロジーデータベースを作成し更新するためにリンクステートアルゴリズムを用いたダイナミックルーチング方式が採用されている。
【０００３】
【発明が解決しようとする課題】
この方式は各ノードが定期的および構成に変更があったときに自分のリンクステートをネットワークに流す事によって各ノードが保有するネットワークのトポロジーデータベースを自動的に更新するもので、“コンバージェンス時間が短い”とか“ルーティングループがおきにくい”などの利点がある。しかし動作の原理上、ネットワーク内のあるノードが異常動作となり、誤ったリンクステートがネットワークに流れてしまった場合に、それがネットワーク全体のトポロジーデータベースにまで影響を与え、ネットワークのルーティング動作が麻痺してしまうことが起き得る。
【０００４】
したがって本発明の目的は、ダイナミックルーティング方式が採用されたネットワークにおいて、ネットワーク全体が異常状態に陥ったときの対処方法およびそのためのノード装置を提供することにある。
【０００５】
【課題を解決するための手段】
本発明によれば、ネットワークに含まれる複数のノードの間で各ノードのリンクステートを互いに交換することより、各ノードが互いに同一のトポロジーデータを保持してルーティング制御に用いるネットワークにおけるネットワーク異常への対処方法であって、ネットワークが安定状態にあるときのトポロジーデータのスナップショットを各ノードにおいて互いに同一の時刻に記録し、ネットワークが異常状態にある間、各ノードにおいて、該スナップショットをトポロジーデータとしてルーティング制御に使用するステップを具備する方法が提供される。
【０００６】
本発明によれば、ネットワークに含まれる複数のノードの間で各ノードのリンクステートを互いに交換することより、各ノードが互いに同一のトポロジーデータを保持してルーティング制御に用いるネットワークにおけるネットワーク異常に対処するためのノード装置であって、ネットワークが安定状態にあるときのトポロジーデータのスナップショットを他のノードと同一の時刻に記録する手段と、ネットワークが異常状態にある間、該スナップショットをトポロジーデータとしてルーティング制御に使用する手段とを具備するノード装置もまた提供される。
【０００７】
【発明の実施の形態】
本発明の説明の前に、ＩＰネットワークにおけるＯＳＰＦまたはＡＴＭネットワークにおけるＰＮＮＩなどにおけるダイナミックルーティング方式を説明する。
図１に示したネットワーク構成において、ネットワークＡはノード２を介してネットワークＣに接続されている。ここでＩＰネットワークにおいてはルータがノードに相当し、ＡＴＭネットワークにおいてはＡＴＭ交換機がノードに相当する。ネットワークＣはさらにノード１を介してネットワークＤに接続されている。さらに詳しく言えば、ネットワークＣはノード１のポート１に直接接続され、ネットワークＤはノード１のポート０に直接接続されている。
【０００８】
各ノードはそれにどのネットワークが直接接続されているかを示すリンクステートをリンクステートアルゴリズムに従って互いに交換することにより、図２に示すようなトポロジーデータを格納したトポロジーデータベースを構築する。このトポロジーデータに基づき各ノードはルーティングテーブル（経路表）を作成し、これに従ってＩＰパケットまたはＡＴＭセルのルーティングが行なわれる。図３に図１の例におけるノード１のルーティングテーブルを示す。図３のルーティングテーブルによれば、例えばノード１に到達した宛先ネットワークがＡであるＩＰパケットまたはＡＴＭセルは、ポート１からノード２へ向けてルーティングされる。
【０００９】
なお、パケットまたはセルには一般のユーザデータ用パケットまたはセルとルーティングプロトコル用パケットまたはセルとがある。ユーザデータ用パケットまたはセルは上記のようにルーティングテーブルに従ってルーティングされる。ルーティングプロトコル用パケットまたはセルの場合、その種類によって動作は様々だが、前述のリンクステートを運ぶパケットまたはセルの場合、他のノードへ転送されるとともに、図２に示すトポロジーデータとしてトポロジーデータベースに格納され、それに基いて新たなルーティングテーブルが作成される。
【００１０】
図４は本発明が適用されるネットワーク構成の一例を示す。図４において、各ノード１０はネットワーク１２を介して相互に接続されるとともに、ネットワーク全体の管理を行なうＮＭＳ（Network Management System ）１４にも接続されている。
図５は本発明の一実施形態に係るノード装置１０の構成を示す。ノード装置とは、例えばＩＰネットワークにおいてはルータに相当し、ＡＴＭネットワークにおいてはＡＴＭ交換機に相当する。図５において、ＮＭＳ制御部２０は図４のＮＭＳとのデータのやり取りを行うものであり、ポート制御部２２はポート２４から入出力されるパケットまたはセルをルーティングテーブル２５に従ってルーティングするものである。ポート制御部２２はＡＴＭ交換機の場合のＡＴＭスイッチに相当する。到着したパケットまたはセルがルーティングプロトコル用パケットまたはセルであるとき、そのパケットまたはセルはメインプロセッサ２６にも送られる。ルーティングプロトコル用パケットまたはセルにリンクステートが含まれていれば、それに従ってトポロジーデータ２８が更新され、更新されたトポロジーデータ２８に基いて新たなルーティングテーブル２５が作成される。スナップショット３０については後述する。
【００１１】
以下に説明する実施形態では、ＮＭＳを含め、ＮＭＳの配下の全ノードの時刻が完全に一致していることが、前提であるので、もし一致していないときは（ここでは示さないが）適切な方法で一致させておく。この時刻同期確認は適当な周期で行い、常に各ノード間で時刻が一致しているようにしておく。
図６はノード装置１０のメインプロセッサ２６におけるスナップショット収集の処理のフローチャートである。ＮＭＳ１４にて保守者がネットワークのスナップショットを撮るように指示すると、ＮＭＳ１４は配下の各ノードに対して時刻を指定し、この時刻になったら各ノードが保持しているトポロジーデータのコピーを取るように促す。各ノード１０はＮＭＳ１４が指示した時刻になったら（ステップ１０００）トポロジーデータのコピーを取る（ステップ１００２）と同時に直前までのトポロジーデータのやり取りから十分安定性のあるデータかどうかを以下の方法で検証し（ステップ１００４）、その結果をＮＭＳに通知する（ステップ１００８）。
【００１２】
一般にリンクステートアルゴリズムで使用されるパケットには、ネットワークの構成情報を運ぶデータベース記述パケットと自ノードが利用可能であることを示すＨｅｌｌｏパケットがある。後者のＨｅｌｌｏパケットは自ノードが稼動しているときに、ある一定間隔で常にネットワークのリンク上を流れているが、前者のデータベース記述パケットはネットワーク構成に変更があり、それぞれのノード間で同期をあわせているときにしか使用されない。このことからネットワークのリンク上に、ある一定時間、Ｈｅｌｌｏパケットしか流れていないようなら、その時このネットワークは安定している（同期がとれている）とみなす事が出来る。
【００１３】
図７は安定状態判定の処理のフローチャートである。図７において、パケットが受信されるとき（ステップ１１００）、それがＨｅｌｌｏパケットか否かを判定し（ステップ１１０２）、Ｈｅｌｌｏパケットでなければ安定タイマを初期化して（ステップ１１０４）、状態を「不安定」とする（ステップ１１０６）。受信されたパケットがＨｅｌｌｏパケットであるとき、タイマが満了していれば（ステップ１１０８）、状態を「安定」とする（ステップ１１１０）。
【００１４】
なおこの安定とみなせるまでの時間（猶予時間）は、各ノード固有で保有している時間、または、ＮＭＳからの指示による時間であり、各ノードがこれら２つの手法のいずれかを選択するようにしても良い。
ＮＭＳ１４は全てのノードからの通知を待ち、全てのノードにて十分安定性のあるデータが取れたかどうかを確認する。これら一連の動作により、ネットワーク内全てのノードにおいてある決まった時刻でのトポロジーデータのコピー（スナップショット）を保持できた事になる。
【００１５】
なお、スナップショット作成の際にＮＭＳ１４の指示を受けずに全ノードが決まった時刻に自律的にスナップショットを収集するようにしても良い。この場合、他の部分の処理は前述と同様である。
これ以降、各ノードは図８に示す手順によりネットワークの正常性の確認を定常的に行う。各ノードはコネクション型接続なら呼損率、コネクションレス型接続ならパケット廃棄率などの値を算出する（ステップ１２０２）。ただしこの計算の際に自分が搭載しているハードウェア起因による障害分は差し引いて、純粋にソフトウェアによるルーチングの結果として発生した分のみを計算する。ハードウェア障害により特定の方路が使用不能になった場合は、ルーチングプロトコル自身によって、自動的にルーチング対象から外されて継続して呼損やパケット廃棄が起こることはない。このように算出された値がある基準値を超えた場合（ステップ１２０４）ネットワークは異常状態になったと判定し、その旨ＮＭＳに通知をする（ステップ１２０６）。
【００１６】
この基準値は、各ノード固有で保有してある値でも、ＮＭＳからの指示による値でも良く、各ノードがこれら２つの方法のいずれかを選択するようにしても良い。
図９は異常状態の通知をいずれかのノードから受け取ったときのＮＭＳの処理を示す。異常状態を受け取ったＮＭＳは各ノードに対し、指示した時刻になったら現在使用しているトポロジーデータを廃棄し、スナップショットとして保存してあるトポロジーデータを使用するように指示を出す（ステップ１３０２）。ＮＭＳより指示を受けた各ノードは指定された時刻になったら、異常データとなっているトポロジーデータを破棄し、保持してあったトポロジーデータからルーチングテーブルを再生成し、このデータで呼処理動作を継続する。この間に異常となった原因を突き止め、復旧させリンクステートアルゴリズムにより正常と思われるトポロジーデータが出来上がった後、ＮＭＳに通知を行う（ステップ１３０４）。ＮＭＳはこれを受けて再度各ノードに通常モードへの復帰時刻を通知し（ステップ１３０６）各ノードは指示に従い通常モードに移行する。図１０に異常検出時のＮＭＳおよびノードの動作シーケンスを示す。
【００１７】
（付記１）ネットワークに含まれる複数のノードの間で各ノードのリンクステートを互いに交換することより、各ノードが互いに同一のトポロジーデータを保持してルーティング制御に用いるネットワークにおけるネットワーク異常への対処方法であって、
（ａ）ネットワークが安定状態にあるときのトポロジーデータのスナップショットを各ノードにおいて互いに同一の時刻に記録し、
（ｂ）ネットワークが異常状態にある間、各ノードにおいて、該スナップショットをトポロジーデータとしてルーティング制御に使用するステップを具備する方法。（１）
（付記２）ステップ（ａ）は、
（ｉ）ネットワークの安定性を各ノードにおいて判定し、
（ii）各ノードにおいて互いに同一の時刻にスナップショットを作成し、
（iii ）ネットワークが安定していると判定されるときのスナップショットの作成に成功したとき、その旨を各ノードからネットワークマネジメントシステム（ＮＭＳ）に通知するサブステップを含む付記１記載の方法。（２）
（付記３）サブステップ（ａ）（ii）は、
ＮＭＳから各ノードへ、指定時刻におけるスナップショットの作成を指示し、
各ノードにおいて、ＮＭＳからの指示に応じて指定時刻にトポロジーデータをスナップショットとしてコピーするサブステップを含む付記２記載の方法。
【００１８】
（付記４）サブステップ（ａ）（ii）において、各ノードは、ＮＭＳからの指示によらず、予め定められた時刻にトポロジーデータをスナップショットとしてコピーする付記２記載の方法。
（付記５）サブステップ（ａ）（ｉ）において、所定の猶予時間内に隣接ノードからリンクステートが受信されないときネットワークが安定していると判定される付記２記載の方法。
【００１９】
（付記６）前記猶予時間は、各ノードについて固有に設定される付記５記載の方法。
（付記７）前記猶予時間は、ＮＭＳから指定される付記５記載の方法。
（付記８）ステップ（ｂ）は、
（ｉ）各ノードにおいて、ネットワークの正常性を判定し、
（ii）ネットワークが異常であると判定されるとき、その旨をノードからＮＭＳへ通知し、
（iii ）ネットワークが異常であるとの通知に応答して、ＮＭＳから各ノードへ、トポロジーデータを、記録されているスナップショットへ指定時刻に切り替えるよう指示するサブステップを含む付記１記載の方法。（３）
（付記９）サブステップ（ｂ）（ｉ）において、ハードウェア障害によるものを除外した呼損率および／またはパケット廃棄率が所定の閾値を超えるときネットワークの異常と判定される付記８記載の方法。
【００２０】
（付記１０）前記閾値は、各ノードについて固有に設定される付記９記載の方法。
（付記１１）前記閾値は、ＮＭＳから指定される付記９記載の方法。
（付記１２）ネットワークに含まれる複数のノードの間で各ノードのリンクステートを互いに交換することより、各ノードが互いに同一のトポロジーデータを保持してルーティング制御に用いるネットワークにおけるネットワーク異常に対処するためのノード装置であって、
ネットワークが安定状態にあるときのトポロジーデータのスナップショットを他のノードと同一の時刻に記録する手段と、
ネットワークが異常状態にある間、該スナップショットをトポロジーデータとしてルーティング制御に使用する手段とを具備するノード装置。（４）
（付記１３）スナップショット記録手段は、
ネットワークの安定性を判定する手段と、
他のノードと同一の時刻にスナップショットを作成する手段と、
ネットワークが安定していると判定されるときのスナップショットの作成に成功したとき、その旨をＮＭＳに通知する手段とを含む付記１２記載のノード装置。（５）
（付記１４）スナップショット作成手段は、
ＮＭＳからの、指定時刻におけるスナップショットの作成の指示を受信する手段と、
ＮＭＳからの指示に応じて指定時刻にトポロジーデータをスナップショットとしてコピーする手段とを含む付記１３記載のノード装置。
【００２１】
（付記１５）スナップショット作成手段は、ＮＭＳからの指示によらず、予め定められた時刻にトポロジーデータをスナップショットとしてコピーする付記１３記載のノード装置。
（付記１６）ネットワーク安定性判定手段は、所定の猶予時間内に隣接ノードからリンクステートが受信されないときネットワークが安定していると判定する付記１３記載のノード装置。
【００２２】
（付記１７）前記猶予時間は、各ノードについて固有に設定される付記１６記載のノード装置。
（付記１８）前記猶予時間は、ＮＭＳから指定される付記１６記載のノード装置。
（付記１９）スナップショット使用手段は、
ネットワークの正常性を判定する手段と、
ネットワークが異常であると判定されるとき、その旨をＮＭＳへ通知する手段と、
ＮＭＳからの指示に応じて、トポロジーデータを、記録されているスナップショットへ指定時刻に切り替える手段とを含む付記１２記載のノード装置。
【００２３】
（付記２０）ネットワーク正常性判定手段は、ハードウェア障害によるものを除外した呼損率および／またはパケット廃棄率が所定の閾値を超えるときネットワークの異常と判定する付記１９記載のノード装置。
（付記２１）前記閾値は、各ノードについて固有に設定される付記２０記載のノード装置。
【００２４】
（付記２２）前記閾値は、ＮＭＳから指定される付記２０記載のノード装置。
【００２５】
【発明の効果】
以上説明した様に、本発明により、１つのノードの異常によりネットワーク全体が麻痺し、サービスを提供できなくなるといった最悪の事態を防ぐことができ、復旧をするまでの間でも各ノード間の通信を継続することができ、可用性を保持できる。
【図面の簡単な説明】
【図１】ネットワーク構成の一例を示す図である。
【図２】図１のネットワーク構成におけるトポロジーデータを示す図である。
【図３】図２のトポロジーデータから生成されるノード１のルーティングテーブルを示す図である。
【図４】本発明が適用されるネットワーク構成の一例を示す図である。
【図５】ノード装置のハードウェア構成を示すブロック図である。
【図６】スナップショット収集処理のフローチャートである。
【図７】安定状態判定処理のフローチャートである。
【図８】ノード動作正常性確認処理のフローチャートである。
【図９】ＮＭＳにおける異常時の動作のフローチャートである。
【図１０】異常時におけるＮＭＳと各ノードの動作を示すシーケンス図である。

[0001]
[Technical field of the invention]
The present invention relates to a method for coping with a case where the entire network falls into an abnormal state in a network adopting a dynamic routing method, and a node device therefor.
[0002]
[Prior art]
In protocols such as OSPF (Open Shortest Path First) in IP (Internet Protocol) networks and PNNI (Private Network-Network Interface) in ATM (Asynchronous Transfer Mode) networks, a dynamic routing method using a link state algorithm is employed to create and update the network topology database from which routing tables (route tables) are built.
[0003]
[Problems to be solved by the invention]
In this method, each node sends its own link state into the network periodically and whenever its configuration changes, so that the network topology database held by every node is updated automatically. It has advantages such as "short convergence time" and "routing loops are unlikely to occur". However, by the principle of its operation, if a node in the network malfunctions and an incorrect link state flows into the network, the error can spread to the topology database of the entire network and paralyze the routing operation of the network.
[0004]
Accordingly, an object of the present invention is to provide a coping method when the entire network falls into an abnormal state in a network adopting a dynamic routing method and a node device therefor.
[0005]
[Means for Solving the Problems]
According to the present invention, there is provided a method for dealing with a network abnormality in a network in which a plurality of nodes included in the network exchange their link states with one another so that each node holds the same topology data and uses it for routing control, the method comprising the steps of: recording, at the same time in every node, a snapshot of the topology data taken when the network is in a stable state; and using the snapshot as the topology data for routing control in each node while the network is in an abnormal state.
[0006]
According to the present invention, there is also provided a node device for dealing with a network abnormality in a network in which a plurality of nodes included in the network exchange their link states with one another so that each node holds the same topology data and uses it for routing control, the node device comprising: means for recording a snapshot of the topology data, taken when the network is in a stable state, at the same time as the other nodes; and means for using the snapshot as the topology data for routing control while the network is in an abnormal state.
[0007]
[Embodiments of the invention]
Prior to the description of the present invention, a dynamic routing method in OSPF in an IP network or PNNI in an ATM network will be described.
In the network configuration shown in FIG. 1, network A is connected to network C via node 2. Here, a node corresponds to a router in an IP network and to an ATM exchange in an ATM network. Network C is further connected to network D via node 1. More specifically, network C is directly connected to port 1 of node 1 and network D is directly connected to port 0 of node 1.
[0008]
By exchanging, in accordance with the link state algorithm, link states that indicate which networks are directly connected to it, each node constructs a topology database storing topology data such as that shown in FIG. 2. Based on this topology data, each node creates a routing table (route table), and IP packets or ATM cells are routed in accordance with it. FIG. 3 shows the routing table of node 1 in the example of FIG. 1. According to the routing table of FIG. 3, for example, an IP packet or ATM cell that arrives at node 1 with destination network A is routed from port 1 toward node 2.
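As an illustration of how a routing table can be derived from topology data, the following is a minimal Python sketch for the FIG. 1 example. The contents of FIG. 2 and FIG. 3 are not reproduced in this text, so the data layout, the port numbers of node 2, and the function name `build_routing_table` are all assumptions made for illustration. The sketch uses a plain breadth-first search rather than the shortest-path computation of any particular protocol.

```python
from collections import deque

# Hypothetical topology data: node -> {port: directly connected network}.
# Node 1's ports follow the description; node 2's port numbers are assumed.
TOPOLOGY = {
    1: {0: "D", 1: "C"},
    2: {0: "A", 1: "C"},
}

def build_routing_table(topology, self_node):
    """Return {destination network: (output port, next-hop node or None)}."""
    table = {}
    queue = deque()
    # Directly connected networks need no next hop.
    for port, net in sorted(topology[self_node].items()):
        table[net] = (port, None)
        queue.append((net, port, None))
    visited_nodes = {self_node}
    while queue:
        net, port, first_hop = queue.popleft()
        # Any other node attached to this network is one hop further away.
        for node, links in sorted(topology.items()):
            if node in visited_nodes or net not in links.values():
                continue
            visited_nodes.add(node)
            hop = first_hop if first_hop is not None else node
            for far_net in links.values():
                if far_net not in table:
                    # Reuse the output port and first hop of the path so far.
                    table[far_net] = (port, hop)
                    queue.append((far_net, port, hop))
    return table

ROUTES_NODE_1 = build_routing_table(TOPOLOGY, 1)
```

Under these assumptions, the entry for destination network A in `ROUTES_NODE_1` points to port 1 with node 2 as the next hop, which agrees with the routing example given in the text.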
[0009]
Packets or cells are of two kinds: general user data packets or cells, and routing protocol packets or cells. User data packets or cells are routed according to the routing table as described above. The handling of routing protocol packets or cells varies with their type, but a packet or cell carrying the link state described above is forwarded to other nodes and is also stored in the topology database as the topology data shown in FIG. 2, and a new routing table is created on that basis.
[0010]
FIG. 4 shows an example of a network configuration to which the present invention is applied. In FIG. 4, the nodes 10 are connected to each other via a network 12 and are also connected to an NMS (Network Management System) 14 for managing the entire network.
FIG. 5 shows the configuration of the node device 10 according to an embodiment of the present invention. The node device corresponds, for example, to a router in an IP network and to an ATM exchange in an ATM network. In FIG. 5, the NMS control unit 20 exchanges data with the NMS of FIG. 4, and the port control unit 22 routes packets or cells input to or output from the ports 24 according to the routing table 25. In the case of an ATM exchange, the port control unit 22 corresponds to the ATM switch. When an arriving packet or cell is a routing protocol packet or cell, it is also sent to the main processor 26. If the routing protocol packet or cell contains a link state, the topology data 28 is updated accordingly, and a new routing table 25 is created based on the updated topology data 28. The snapshot 30 will be described later.
[0011]
The embodiment described below presupposes that the clocks of all nodes under the NMS, and of the NMS itself, agree exactly; if they do not, they are brought into agreement by an appropriate method (not shown here). This time synchronization check is performed at a suitable interval so that the clocks of the nodes always agree.
FIG. 6 is a flowchart of the snapshot collection processing in the main processor 26 of the node device 10. When a maintenance operator at the NMS 14 gives an instruction to take a snapshot of the network, the NMS 14 specifies a time to each subordinate node and prompts each node to take a copy of the topology data it holds when that time arrives. When the time specified by the NMS 14 arrives (step 1000), each node 10 takes a copy of the topology data (step 1002) and, at the same time, verifies by the following method whether the data is sufficiently stable, judging from the exchange of topology data up to that moment (step 1004), and notifies the NMS of the result (step 1008).
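The snapshot collection of FIG. 6 can be sketched in Python roughly as follows. This is only an illustrative sketch: the class name `SnapshotCollector`, the callbacks `is_stable` and `notify_nms`, and the polling method `tick` are hypothetical names, and keeping the copy only when the stability check passes is an assumption rather than something the description specifies.

```python
import copy

class SnapshotCollector:
    """Takes a copy of the topology data at a time specified by the NMS."""

    def __init__(self, topology, is_stable, notify_nms):
        self.topology = topology      # live topology data (any copyable object)
        self.is_stable = is_stable    # hypothetical callback returning bool
        self.notify_nms = notify_nms  # hypothetical callback taking the result
        self.snapshot = None
        self.scheduled = None

    def schedule(self, snapshot_time):
        """Record the snapshot time specified by the NMS."""
        self.scheduled = snapshot_time

    def tick(self, now):
        """Call periodically; returns True when a snapshot attempt was made."""
        if self.scheduled is None or now < self.scheduled:
            return False                          # step 1000: not yet time
        self.scheduled = None
        candidate = copy.deepcopy(self.topology)  # step 1002: copy the data
        stable = self.is_stable()                 # step 1004: verify stability
        if stable:
            self.snapshot = candidate             # assumed: keep only if stable
        self.notify_nms(stable)                   # step 1008: report to the NMS
        return True

# Usage with an in-memory list standing in for the NMS.
reports = []
live = {"node1": {0: "D", 1: "C"}}
collector = SnapshotCollector(live, lambda: True, reports.append)
collector.schedule(100)
collector.tick(99)    # too early, nothing happens
collector.tick(100)   # snapshot taken and reported
live["node1"][0] = "X"  # later changes do not affect the stored snapshot
```

A deep copy is used so that subsequent link state updates to the live topology data cannot alter the stored snapshot.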
[0012]
In general, the packets used in the link state algorithm include database description packets, which carry network configuration information, and Hello packets, which indicate that the sending node is available. Hello packets flow on the links of the network at a fixed interval whenever the node is operating, whereas database description packets are used only when the network configuration has changed and the nodes are synchronizing with one another. Therefore, if only Hello packets have been flowing on the links of the network for a certain period of time, the network can be regarded as stable (synchronized) at that time.
[0013]
FIG. 7 is a flowchart of the stable state determination processing. In FIG. 7, when a packet is received (step 1100), it is determined whether or not it is a Hello packet (step 1102). If it is not a Hello packet, the stability timer is initialized (step 1104) and the state is set to "unstable" (step 1106). If the received packet is a Hello packet and the timer has expired (step 1108), the state is set to "stable" (step 1110).
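The determination of FIG. 7 amounts to a small state machine driven by received packets. The following Python sketch shows one way to express it; the class name `StabilityMonitor`, the use of numeric timestamps, and starting the timer at construction time are assumptions made for illustration.

```python
class StabilityMonitor:
    """Tracks whether only Hello packets have been seen for the grace time."""

    def __init__(self, grace_time, now=0.0):
        self.grace_time = grace_time
        self.timer_start = now   # assumed: the timer starts when monitoring begins
        self.state = "unstable"

    def on_packet(self, is_hello, now):
        """Process one received packet (step 1100) and return the state."""
        if not is_hello:                          # step 1102: not a Hello packet
            self.timer_start = now                # step 1104: restart the timer
            self.state = "unstable"               # step 1106
        elif now - self.timer_start >= self.grace_time:  # step 1108: expired
            self.state = "stable"                 # step 1110
        return self.state

monitor = StabilityMonitor(grace_time=30)
history = [
    monitor.on_packet(True, 10),   # Hello, timer not yet expired
    monitor.on_packet(True, 30),   # Hello, timer expired
    monitor.on_packet(False, 40),  # non-Hello packet restarts the timer
    monitor.on_packet(True, 60),   # only 20 time units since the restart
    monitor.on_packet(True, 70),   # 30 time units since the restart
]
```

With a grace time of 30, `history` is `["unstable", "stable", "unstable", "unstable", "stable"]`.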
[0014]
Note that the time that must elapse before the network is regarded as stable (the grace time) is either a time held individually by each node or a time specified by the NMS, and each node may be allowed to select one of these two approaches.
The NMS 14 waits for notifications from all the nodes and confirms whether or not sufficiently stable data has been obtained at all the nodes. Through this series of operations, a copy (snapshot) of topology data at a certain time can be held in all nodes in the network.
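The check performed by the NMS can be sketched as a simple aggregation over the per-node reports. The function name `snapshot_round_succeeded` and the dictionary of reports are hypothetical; the sketch assumes that a node that has not yet reported counts as not confirmed.

```python
def snapshot_round_succeeded(expected_nodes, reports):
    """Return True only if every expected node reported a stable snapshot.

    `reports` maps a node identifier to the boolean result it sent to the NMS.
    Nodes missing from `reports` have not answered yet and count as failures.
    """
    return all(reports.get(node) is True for node in expected_nodes)

all_ok = snapshot_round_succeeded([1, 2, 3], {1: True, 2: True, 3: True})
one_unstable = snapshot_round_succeeded([1, 2, 3], {1: True, 2: False, 3: True})
one_missing = snapshot_round_succeeded([1, 2, 3], {1: True, 2: True})
```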
[0015]
Note that, when creating a snapshot, all nodes may instead collect the snapshot autonomously at a predetermined time without receiving an instruction from the NMS 14. In this case, the rest of the processing is the same as described above.
Thereafter, each node continually checks the normality of the network according to the procedure shown in FIG. 8. Each node calculates a value such as the call loss rate for connection-oriented service or the packet discard rate for connectionless service (step 1202). In this calculation, however, the portion attributable to failures of the node's own hardware is subtracted, so that only the portion arising purely from routing by software is counted. When a particular route becomes unusable because of a hardware failure, the routing protocol itself automatically excludes that route from routing, so call loss or packet discard does not continue to occur. When the value calculated in this way exceeds a certain reference value (step 1204), the node determines that the network has entered an abnormal state and notifies the NMS to that effect (step 1206).
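The normality check of FIG. 8 can be sketched as follows. The description does not give a formula, so this Python sketch assumes that the rate is the number of losses not attributable to hardware divided by the total number of attempts (call attempts or received packets); the function names and counters are hypothetical.

```python
def routing_loss_rate(attempts, losses, hardware_losses):
    """Loss rate counting only losses not caused by the node's own hardware.

    attempts:        total call attempts or packets handled (assumed denominator)
    losses:          total lost calls or discarded packets
    hardware_losses: the part of `losses` attributed to hardware failure
    """
    if attempts <= 0:
        return 0.0
    routing_losses = max(losses - hardware_losses, 0)  # step 1202
    return routing_losses / attempts

def is_network_abnormal(attempts, losses, hardware_losses, threshold):
    """Step 1204: compare the calculated rate with the reference value."""
    return routing_loss_rate(attempts, losses, hardware_losses) > threshold

rate = routing_loss_rate(attempts=1000, losses=50, hardware_losses=30)
abnormal_strict = is_network_abnormal(1000, 50, 30, threshold=0.01)
abnormal_loose = is_network_abnormal(1000, 50, 30, threshold=0.05)
```

In this example 20 of 1000 attempts are lost for reasons other than hardware, so the rate is 0.02: abnormal against a reference value of 0.01, normal against 0.05.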
[0016]
This reference value may be a value that is unique to each node or a value that is instructed by the NMS, and each node may select one of these two methods.
FIG. 9 shows the processing of the NMS when it receives a notification of an abnormal state from any node. On receiving the notification, the NMS instructs each node to discard the topology data currently in use and to use the topology data stored as the snapshot, starting at a specified time (step 1302). At the specified time, each node that received the instruction discards the topology data that has become abnormal, regenerates the routing table from the stored topology data, and continues call processing with that data. In the meantime the cause of the abnormality is identified and repaired, and once the link state algorithm has produced topology data that appears to be normal, the NMS is notified (step 1304). In response, the NMS notifies each node of the time at which to return to normal mode (step 1306), and each node shifts to normal mode as instructed. FIG. 10 shows the operation sequence of the NMS and the nodes when an abnormality is detected.
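On the node side, the switch to the snapshot and the later return to normal mode can be sketched as a selector that changes mode at the time specified by the NMS. The class name `TopologySelector` and the mode names are hypothetical. As a simplification, the sketch keeps the live topology object available rather than discarding it, on the assumption that the link state algorithm rebuilds it in place during recovery.

```python
class TopologySelector:
    """Chooses which topology data the routing table is generated from."""

    def __init__(self, live, snapshot):
        self.live = live          # topology data maintained by the link state algorithm
        self.snapshot = snapshot  # copy recorded while the network was stable
        self.mode = "normal"
        self.pending = None       # (switch_time, target_mode) set by the NMS

    def instruct(self, switch_time, target_mode):
        """Record an NMS instruction (step 1302 or step 1306)."""
        if target_mode not in ("normal", "fallback"):
            raise ValueError("unknown mode: %r" % (target_mode,))
        self.pending = (switch_time, target_mode)

    def tick(self, now):
        """Apply a pending instruction once its specified time has arrived."""
        if self.pending is not None and now >= self.pending[0]:
            self.mode = self.pending[1]
            self.pending = None
        return self.mode

    def active_topology(self):
        """Topology data to use for regenerating the routing table."""
        return self.snapshot if self.mode == "fallback" else self.live

selector = TopologySelector(live={"A": "corrupted"}, snapshot={"A": "known good"})
selector.instruct(100, "fallback")
before = selector.tick(99)            # still "normal"
during = selector.tick(100)           # "fallback"
used = selector.active_topology()     # the snapshot
selector.instruct(200, "normal")
after = selector.tick(200)            # back to "normal"
```

Because every node applies the instruction at the same specified time, all nodes switch to identical topology data together, which is the point of recording the snapshots at the same time.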
[0017]
(Supplementary note 1) A method for dealing with a network abnormality in a network in which a plurality of nodes included in the network exchange their link states with one another so that each node holds the same topology data and uses it for routing control, the method comprising the steps of:
(a) recording, at the same time in every node, a snapshot of the topology data taken when the network is in a stable state; and
(b) using the snapshot as the topology data for routing control in each node while the network is in an abnormal state. (1)
(Supplementary note 2) The method according to supplementary note 1, wherein step (a) includes the sub-steps of:
(i) determining the stability of the network in each node;
(ii) creating the snapshot at the same time in every node; and
(iii) notifying a network management system (NMS) from each node when a snapshot has been successfully created at a time when the network is determined to be stable. (2)
(Supplementary note 3) The method according to supplementary note 2, wherein sub-step (a)(ii) includes the sub-steps of:
instructing each node, from the NMS, to create a snapshot at a specified time; and
copying, in each node, the topology data as a snapshot at the specified time in accordance with the instruction from the NMS.
[0018]
(Supplementary note 4) The method according to supplementary note 2, wherein in sub-step (a)(ii) each node copies the topology data as a snapshot at a predetermined time, without an instruction from the NMS.
(Supplementary note 5) The method according to supplementary note 2, wherein in sub-step (a)(i) the network is determined to be stable when no link state is received from an adjacent node within a predetermined grace time.
[0019]
(Supplementary note 6) The method according to supplementary note 5, wherein the grace time is set individually for each node.
(Supplementary note 7) The method according to supplementary note 5, wherein the grace time is specified by the NMS.
(Supplementary note 8) The method according to supplementary note 1, wherein step (b) includes the sub-steps of:
(i) determining the normality of the network in each node;
(ii) notifying the NMS from the node when the network is determined to be abnormal; and
(iii) in response to the notification that the network is abnormal, instructing each node, from the NMS, to switch the topology data to the recorded snapshot at a specified time. (3)
(Supplementary note 9) The method according to supplementary note 8, wherein in sub-step (b)(i) the network is determined to be abnormal when the call loss rate and/or the packet discard rate, excluding losses due to hardware failure, exceeds a predetermined threshold.
[0020]
(Supplementary note 10) The method according to supplementary note 9, wherein the threshold is set individually for each node.
(Supplementary note 11) The method according to supplementary note 9, wherein the threshold is specified by the NMS.
(Supplementary note 12) A node device for dealing with a network abnormality in a network in which a plurality of nodes included in the network exchange their link states with one another so that each node holds the same topology data and uses it for routing control, the node device comprising:
means for recording a snapshot of the topology data, taken when the network is in a stable state, at the same time as the other nodes; and
means for using the snapshot as the topology data for routing control while the network is in an abnormal state. (4)
(Supplementary note 13) The node device according to supplementary note 12, wherein the snapshot recording means includes:
means for determining the stability of the network;
means for creating the snapshot at the same time as the other nodes; and
means for notifying the NMS when a snapshot has been successfully created at a time when the network is determined to be stable. (5)
(Supplementary note 14) The node device according to supplementary note 13, wherein the snapshot creation means includes:
means for receiving, from the NMS, an instruction to create a snapshot at a specified time; and
means for copying the topology data as a snapshot at the specified time in accordance with the instruction from the NMS.
[0021]
(Supplementary note 15) The node device according to supplementary note 13, wherein the snapshot creation means copies the topology data as a snapshot at a predetermined time, without an instruction from the NMS.
(Supplementary note 16) The node device according to supplementary note 13, wherein the network stability determination means determines that the network is stable when no link state is received from an adjacent node within a predetermined grace time.
[0022]
(Supplementary note 17) The node device according to supplementary note 16, wherein the grace time is set individually for each node.
(Supplementary note 18) The node device according to supplementary note 16, wherein the grace time is specified by the NMS.
(Supplementary note 19) The node device according to supplementary note 12, wherein the snapshot use means includes:
means for determining the normality of the network;
means for notifying the NMS when the network is determined to be abnormal; and
means for switching the topology data to the recorded snapshot at a specified time in accordance with an instruction from the NMS.
[0023]
(Supplementary note 20) The node device according to supplementary note 19, wherein the network normality determination means determines that the network is abnormal when the call loss rate and/or the packet discard rate, excluding losses due to hardware failure, exceeds a predetermined threshold.
(Supplementary note 21) The node device according to supplementary note 20, wherein the threshold is set individually for each node.
[0024]
(Supplementary note 22) The node device according to supplementary note 20, wherein the threshold is specified by the NMS.
[0025]
[Effect of the invention]
As described above, the present invention prevents the worst case in which an abnormality in a single node paralyzes the entire network and makes it impossible to provide service. Communication between the nodes can continue even during the period until recovery, so availability is maintained.
[Brief description of the drawings]
FIG. 1 is a diagram illustrating an example of a network configuration.
FIG. 2 is a diagram showing topology data in the network configuration of FIG. 1.
FIG. 3 is a diagram showing a routing table of node 1 generated from the topology data of FIG. 2.
FIG. 4 is a diagram showing an example of a network configuration to which the present invention is applied.
FIG. 5 is a block diagram illustrating a hardware configuration of a node device.
FIG. 6 is a flowchart of snapshot collection processing.
FIG. 7 is a flowchart of a stable state determination process.
FIG. 8 is a flowchart of a node operation normality confirmation process.
FIG. 9 is a flowchart of an operation at the time of abnormality in the NMS.
FIG. 10 is a sequence diagram showing the operation of the NMS and each node at the time of abnormality.

Claims

A method for dealing with a network abnormality in a network in which a plurality of nodes included in the network exchange their link states with one another so that each node holds the same topology data and uses it for routing control, the method comprising the steps of:
(a) recording, at the same time in every node, a snapshot of the topology data taken when the network is in a stable state, the stable state being one in which, for a certain period of time, the only packets used by the link state algorithm that flow on the links of the network are packets other than those carrying network configuration information; and
(b) using the snapshot as the topology data for routing control in each node while the network is in an abnormal state, from the time an abnormality occurs in the network until the network is restored.

A method for dealing with a network abnormality in a network in which a plurality of nodes included in the network exchange their link states with one another so that each node holds the same topology data and uses it for routing control, the method comprising the steps of:
(a) recording, at the same time in every node, a snapshot of the topology data taken when the network is in a stable state; and
(b) using the snapshot as the topology data for routing control in each node while the network is in an abnormal state,
wherein step (a) includes the sub-steps of:
(i) determining the stability of the network in each node;
(ii) creating the snapshot at the same time in every node; and
(iii) notifying a network management system (NMS) from each node when a snapshot has been successfully created at a time when the network is determined to be stable.

The method according to claim 1, wherein step (b) includes the sub-steps of:
(i) determining the normality of the network in each node;
(ii) notifying the NMS from the node when the network is determined to be abnormal; and
(iii) in response to the notification that the network is abnormal, instructing each node, from the NMS, to switch the topology data to the recorded snapshot at a specified time.

A node device for dealing with a network abnormality in a network in which a plurality of nodes included in the network exchange their link states with one another so that each node holds the same topology data and uses it for routing control, the node device comprising:
means for recording, at the same time as the other nodes, a snapshot of the topology data taken when the network is in a stable state, the stable state being one in which, for a certain period of time, the only packets used by the link state algorithm that flow on the links of the network are packets other than those carrying network configuration information; and
means for using the snapshot as the topology data for routing control while the network is in an abnormal state, from the time an abnormality occurs in the network until the network is restored.

A node device for dealing with a network abnormality in a network in which a plurality of nodes included in the network exchange their link states with one another so that each node holds the same topology data and uses it for routing control, the node device comprising:
means for recording a snapshot of the topology data, taken when the network is in a stable state, at the same time as the other nodes; and
means for using the snapshot as the topology data for routing control while the network is in an abnormal state,
wherein the snapshot recording means includes:
means for determining the stability of the network;
means for creating the snapshot at the same time as the other nodes; and
means for notifying the NMS when a snapshot has been successfully created at a time when the network is determined to be stable.