JP2004229224A

JP2004229224A - Failure surveillance apparatus, method, and program

Info

Publication number: JP2004229224A
Application number: JP2003017725A
Authority: JP
Inventors: Takashi Aida; 敬合田
Original assignee: Bank of Tokyo Mitsubishi Ltd
Current assignee: MUFG Bank Ltd
Priority date: 2003-01-27
Filing date: 2003-01-27
Publication date: 2004-08-12

Abstract

PROBLEM TO BE SOLVED: To monitor the occurrence of a failure in a network without directly accessing any computer, even if the network contains a local network whose actual structure is unknown.

SOLUTION: In a computer network 10, a plurality of LANs 12, each with firewall (F/W) routers 24 arranged at its entrance, are connected to one another, and a plurality of routers 20, 24, 28 that update their routing information by dynamic routing are provided. The routers 28 located closest to the F/W routers 24 and outside the LANs 12 are designated as collection targets, and the routing information that each router 28 holds when no failure has occurred on the network 10 is registered in an HDD 44 of a failure surveillance apparatus 36. The apparatus periodically collects routing information from each router 28, compares it with the routing information registered in the HDD 44, and determines whether a failure has occurred and where, thereby monitoring the occurrence of failures.

COPYRIGHT: (C)2004,JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は障害監視装置、障害監視方法及びプログラムに係り、特に、複数台のルータを含んで構成されたネットワークにおける障害の発生を監視するための障害監視方法、該障害監視方法が適用された障害監視装置、及び、コンピュータを障害監視装置装置として機能させるためのプログラムに関する。
【０００２】
【従来の技術】
ネットワークにおける障害の発生を監視し、障害が発生した場合に障害発生箇所を判断するための技術としては、従来より種々の技術が提案されている。
【０００３】
具体的には、例えば障害監視コンピュータを設置すると共に、障害監視コンピュータと監視センタ用コンピュータを、ネットワークを介して情報を送受可能とし、障害監視コンピュータがｐｉｎｇ応答確認により障害を検知した場合に監視センタ用コンピュータへ障害検知信号を送出することで、監視対象ネットワークにおける障害発生等を、遠隔の監視センタで迅速かつ的確に把握することを可能とする技術が提案されている（例えば特許文献１参照）
また、例えば被監視対象のネットワークからネットワークの実構造に関する情報と各ネットワークノードに関する情報を取得し、以前に取得した同情報と比較し、比較結果に基づいて障害を検知する技術も提案されている（例えば特許文献２参照）
【０００４】
【特許文献１】
特開２００１−２９８４２６号公報
【特許文献２】
特開２００２−１４８８１号公報
【０００５】
【発明が解決しようとする課題】
ところで、複数のローカルエリアネットワーク（ＬＡＮ）が相互に接続された構成のネットワーク（ＷＡＮ（ＷｉｄｅＡｒｅａＮｅｔｗｏｒｋ）ともいう）においては、個々のＬＡＮの外部から個々のＬＡＮへの不正アクセスを防止するために、個々のＬＡＮと個々のＬＡＮの外部との境界（ＬＡＮの入口）にファイヤウォールを設けることが一般的であり、このファイヤウォールには、ｐｉｎｇコマンド等の監視コマンドによって送出される情報やその応答に相当する情報の通過を遮断する設定がされることも多い。
【０００６】
これに対して特許文献１に記載の技術は、監視コマンドを用いてコンピュータに直接アクセスし障害検知を行うものであるので、ＬＡＮの入口に設けられたファイヤウォールに、監視コマンドを用いることで送受される情報の通過を遮断する設定がされていた場合には、当該ＬＡＮ内に存在しているコンピュータへのアクセスが拒否されるため、上記構成のネットワークにおける障害の発生を検知することが困難である、という問題があった。
【０００７】
また、複数のＬＡＮが相互に接続された構成のネットワークでは、個々のＬＡＮを単位としてネットワークの管理が行われ、個々のＬＡＮのセキュリティ・ポリシーに基づき個々のＬＡＮの実構造がネットワーク全体を管理する管理者に公開されていなかったり、コンピュータの追加等のＬＡＮの実構造の変化があった場合にも前記管理者への報告が行われないことも多く、管理者がネットワークの個々のＬＡＮ内の実構造を把握していないネットワークも多数存在している。このようなネットワークにおいても、障害の発生に伴って当該ネットワークを利用して為される業務に支障が生じた等の場合、管理者は障害発生箇所を特定して対策を講ずる必要があるが、特許文献２に記載の技術はネットワークの実構造が既知であることを前提とした技術であり、当該技術を適用したとしても、実構造が不明な部分が存在するネットワークにおける障害の発生を検知することは困難であった。
【０００８】
本発明は上記事実を考慮して成されたもので、実構造が不明なローカルネットワークを含むネットワークであっても、コンピュータに直接アクセスすることなく当該ネットワークにおける障害の発生を監視できる障害監視装置、障害監視方法及びプログラムを得ることが目的である。
【０００９】
【課題を解決するための手段】
上記目的を達成するために請求項１記載の発明に係る障害監視装置は、通信回線を介して相互に接続された複数台のルータを含んで構成され、各ルータが、所定の宛先へ送達されるべき情報を受信した場合の当該情報の送信先を表すルーティング情報を各宛先毎に記憶し、受信した情報を、該情報の宛先に対応するルーティング情報に従って送信先を選択して送信すると共に、ネットワークの状態に応じて記憶しているルーティング情報を適宜更新する機能を備えているネットワークに接続され、当該ネットワークにおける障害の発生を監視する障害監視装置であって、特定のルータからルーティング情報を取得する取得手段と、前記特定のルータに本来記憶されているべきルーティング情報が予め登録された記憶手段と、前記取得手段によって取得されたルーティング情報を、前記記憶手段に登録されているルーティング情報と比較することで、前記ネットワークにおける障害発生の有無及び障害発生箇所を判断する判断手段と、を含んで構成されている。
【００１０】
請求項１記載の発明に係るネットワークは、通信回線を介して相互に接続された複数台のルータを含んで構成されており、各ルータは、所定の宛先へ送達されるべき情報を受信した場合の当該情報の送信先を表すルーティング情報を各宛先毎に記憶し、受信した情報を、該情報の宛先に対応するルーティング情報に従って送信先を選択して送信すると共に、ネットワークの状態に応じて記憶しているルーティング情報を適宜更新する機能を備えている。請求項１記載の発明に係る障害監視装置は上記構成のネットワークに接続されている。
【００１１】
ルータの上述した機能は一般に動的ルーティングと称され（ダイナミック・ルーティングともいう）、ルータによるネットワークの状態に応じたルーティング情報の更新は、具体的には、例えば請求項２に記載したように、各ルータは、自ルータが記憶しているルーティング情報に含まれる何れかの送信先と通信不能状態になった場合に自ルータのルーティング情報を更新する（例えば自ルータと通信不能状態になった送信先に対応する情報（前記送信先を通る通信経路に対応する情報）を削除又は無効化する）と共に、隣り合う他ルータと相互にルーティング情報を通知し、他ルータから通知されたルーティング情報に変化が有った場合にも該変化に応じて自ルータのルーティング情報を更新する（例えば上記と同様に、他ルータのルーティング情報上で削除又は無効化されている情報に対応する情報（他ルータと通信不能状態になった送信先に対応する情報（前記送信先を通る通信経路に対応する情報）を削除又は無効化する）ことによって行うことができる。
【００１２】
上述した機能（動的ルーティングの機能）を備えたルータを含んで構成されたネットワークにおいては、障害発生を含むネットワークの状態に応じて、ルータに記憶されているルーティング情報の内容が変化する。また、本発明に係るネットワークが、実構造が不明なローカルネットワーク（例えば請求項３に記載したように、障害監視のために必要な情報の送受を遮断するファイヤウォールを介して接続されたローカルネットワーク）を含んで構成されている場合にも、ルータのルーティング情報には、ローカルネットワーク外の他のコンピュータ（外部コンピュータ）と情報の送受を行うローカルネットワーク内のコンピュータに関する情報（当該コンピュータへ送達されるべき情報を受信した場合の当該情報の送信先を表す情報）も含まれているので、当該コンピュータと外部コンピュータと情報の送受に影響を与える障害が発生すれば、ルーティング情報のうち当該コンピュータに関する情報も変化することになり、障害発生箇所がローカルネットワーク内であったとしてもルーティング情報の変化に基づいて障害発生箇所を判断することは可能である。
【００１３】
上記事実に基づき請求項１記載の発明では、特定のルータに本来記憶されているべきルーティング情報が記憶手段に予め登録されており、取得手段によって特定のルータからルーティング情報が取得される。そして判断手段は、取得手段によって取得されたルーティング情報を、記憶手段に登録されているルーティング情報と比較することで、ネットワークにおける障害発生の有無及び障害発生箇所を判断する。なお、障害発生箇所の判断には、障害発生箇所を特定する以外に、障害発生箇所を絞り込むことも含まれる。
【００１４】
このように、請求項１記載の発明では、ルータのルーティング情報を利用して障害発生の有無及び障害発生箇所を判断するので、実構造が不明なローカルネットワークを含むネットワークであっても適用可能であり、コンピュータに直接アクセスする必要もないので、コンピュータへの直接アクセス（障害監視のために必要な情報の送受）を遮断するファイヤウォール等が設けられている場合にも、この影響を受けることはない。従って、請求項１記載の発明によれば、実構造が不明なローカルネットワークを含むネットワークであっても、コンピュータに直接アクセスすることなく当該ネットワークにおける障害の発生を監視することができる。
【００１５】
なお、請求項１記載の発明に係る記憶手段には、特定のルータに本来記憶されているべきルーティング情報の全てを登録することに限られるものではない。例えば、ネットワーク内の特定コンピュータ間の通信に影響を与える障害についてのみ発生を監視する等のように、障害発生の監視対象が限られている場合には、監視対象に対応するルーティング情報のみを登録しておくようにしてもよい。この場合、障害発生の有無及び障害発生箇所の判断は、例えば後述する請求項４のように、記憶手段に登録されているルーティング情報に含まれる各宛先毎の送信先が、取得手段によって取得されたルーティング情報の中に各々存在しているか否かを判断することで行えばよい。
【００１６】
また、本発明に係る取得手段は、特定の単一のルータからルーティング情報を取得するようにしてもよいし、ネットワーク内の全てのルータからルーティング情報を取得するようにしてもよいが、ルーティング情報に基づく障害発生箇所の判断では、ルーティング情報を取得したルータと障害発生箇所の位置関係に応じて判断精度が相違する場合があり、障害発生箇所に近いルータからルーティング情報を取得する方が判断精度が向上することが多い。
【００１７】
このため、本発明に係るネットワークが、障害監視のために必要な情報の送受を遮断するファイヤウォールを介して接続されたローカルネットワークを含んで構成されている場合には、請求項３に記載したように、少なくとも、ファイヤウォールよりも障害監視装置側でかつファイヤウォールから最も近い位置（ファイヤウォールとの間の通信経路上に別のルータが存在していない位置）に位置しているルータのルーティング情報を取得することが好ましい。上記構成のネットワークにおいて、ローカルネットワーク内で障害が発生した場合には、障害発生箇所の判断精度が低下する可能性が高いが、請求項３記載の発明では、少なくともファイヤウォールから最も近い位置に位置しているルータからルーティング情報を取得するので、ローカルネットワーク内で障害が発生した場合にも障害発生箇所を精度良く判断することができる。
【００１８】
また、請求項１記載の発明において、判断手段による障害発生の有無及び障害発生箇所の判断は、具体的には、例えば請求項４に記載したように、記憶手段に登録されているルーティング情報に含まれる各宛先毎の送信先が、取得手段によって取得されたルーティング情報の中に各々存在しているか否かを判断することで行うことができる。この場合、記憶手段に登録されているルーティング情報に含まれる各宛先毎の送信先の中に、取得手段によって取得されたルーティング情報の中に存在していない情報があれば、何らかの障害が発生していると判断できると共に、当該情報から障害発生箇所を判断できるので、障害発生の有無及び障害発生箇所の判断を簡易な処理で実現することができる。
【００１９】
また、特定の通信回線に障害が発生した場合、ルータに記憶されているルーティング情報のうち、特定の通信回線を経由する通信経路に対応する送信先は全て削除、或いは無効化されることになるが、通信回線は障害の発生に備えて多重化されていることが一般的であり、特定の通信回線に障害が発生したとしても、特定の宛先に対応する一部の送信先（特定の通信回線を経由しない通信経路に対応する送信先）がルーティング情報に残存（単に残存、或いは有効な送信先として残存）していることも多い。そして、この場合は特定の宛先への情報送信は可能である。
【００２０】
上記に基づき請求項５記載の発明は、請求項４記載の発明において、判断手段は、取得手段によって取得されたルーティング情報上で、対応する送信先の数が減少している特定の宛先が存在していることを認識した場合に、取得手段によって取得されたルーティング情報の中に特定の宛先に対応する送信先が残存していれば警告を発し、取得手段によって取得されたルーティング情報の中に特定の宛先に対応する送信先が残存していなければ障害の発生を報知することを特徴としている。
【００２１】
請求項５記載の発明では、取得されたルーティング情報の中に特定の宛先に対応する送信先が残存（単に残存、或いは有効な送信先として残存）している場合、すなわち特定の宛先への情報送信が可能な場合には警告を発し、取得されたルーティング情報の中に特定の宛先に対応する送信先が残存（単に残存、或いは有効な送信先として残存）していない場合、すなわち特定の宛先への情報送信が不可能な場合は障害の発生を報知するので、ネットワークに発生した障害の程度をネットワークの管理者等に容易に認識させることができる。
【００２２】
また、請求項１記載の発明において、例えば請求項６に記載したように、取得手段はルーティング情報の取得を定期的に（例えば数秒〜数分周期で）行い、判断手段はネットワークにおける障害発生の有無及び障害発生箇所を定期的に（例えば数秒〜数分周期で）判断することが好ましい。これにより、ネットワークに障害が発生した場合に、これを迅速に検知し、障害発生箇所を速やかに判断することができる。
【００２３】
なお、請求項６記載の発明において、取得手段によってルーティング情報を取得するルータを固定せず、ルーティング情報の取得対象のルータを動的に変更又は追加するようにしてもよい。例えば、通常時は特定のルータからルーティング情報を取得し、取得したルーティング情報に基づき障害が発生していることを認識した場合に、別のルータからもルーティング情報を取得し、取得したルーティング情報も勘案して障害発生箇所を判断するようにしてもよい。
【００２４】
請求項７記載の発明に係る障害監視方法は、通信回線を介して相互に接続された複数台のルータを含んで構成され、各ルータが、所定の宛先へ送達されるべき情報を受信した場合の当該情報の送信先を表すルーティング情報を各宛先毎に記憶し、受信した情報を、該情報の宛先に対応するルーティング情報に従って送信先を選択して送信すると共に、ネットワークの状態に応じて記憶しているルーティング情報を適宜更新する機能を備えているネットワークにおいて、ルーティング情報取得対象の特定のルータに本来記憶されているべきルーティング情報を予め登録しておき、前記特定のルータからルーティング情報を取得し、取得したルーティング情報を、前記予め登録したルーティング情報と比較することで、前記ネットワークにおける障害発生の有無及び障害発生箇所を判断するので、請求項１記載の発明と同様に、実構造が不明なローカルネットワークを含むネットワークであっても、コンピュータに直接アクセスすることなく当該ネットワークにおける障害の発生を監視することができる。
【００２５】
請求項８記載の発明に係るプログラムは、通信回線を介して相互に接続された複数台のルータを含んで構成され、各ルータが、所定の宛先へ送達されるべき情報を受信した場合の当該情報の送信先を表すルーティング情報を各宛先毎に記憶し、受信した情報を、該情報の宛先に対応するルーティング情報に従って送信先を選択して送信すると共に、ネットワークの状態に応じて記憶しているルーティング情報を適宜更新する機能を備えているネットワークに接続されたコンピュータを、特定のルータからルーティング情報を取得する取得手段、前記取得手段によって取得されたルーティング情報を、前記特定のルータに本来記憶されているべき予め登録されたルーティング情報と比較することで、前記ネットワークにおける障害発生の有無及び障害発生箇所を判断する判断手段として機能させる。
【００２６】
請求項８記載の発明に係るプログラムは、コンピュータを、上記の取得手段、判断手段として機能させるためのプログラムであるので、コンピュータが請求項８記載の発明に係るプログラムを実行することにより、コンピュータが請求項１に記載の障害監視装置として機能することになり、請求項１及び請求項７記載の発明と同様に、実構造が不明なローカルネットワークを含むネットワークであっても、コンピュータに直接アクセスすることなく当該ネットワークにおける障害の発生を監視することができる。
【００２７】
【発明の実施の形態】
以下、図面を参照して本発明の実施形態の一例を詳細に説明する。図１には本実施形態に係るコンピュータ・ネットワーク１０が示されている。このコンピュータ・ネットワーク１０は金融機関の各種業務を支援するために設置されたネットワークであり、詳しくは、各々金融機関の支店が設けられた複数の拠点（図１では例として拠点Ａ〜拠点Ｄ）に各々設置されたローカルエリア・ネットワーク（ＬＡＮ）１２が通信網１４を介して互いに接続されて構成されている。なお、このＬＡＮ１２は請求項３に記載のローカルネットワークに対応している。
【００２８】
個々の拠点に設置されているＬＡＮ１２は類似の構成であるので、以下、拠点Ａに設置されたＬＡＮ１２Ａを例に構成を説明すると、ＬＡＮ１２Ａは複数台のサーバ・コンピュータ１６（図１のＬＡＮ１２Ａではサーバ・コンピュータ１６Ａ〜１６Ｆを示す）を含んで構成されている。個々のサーバ・コンピュータ１６には、金融機関の特定の業務を支援するためのアプリケーション・プログラムが各々インストールされており、当該アプリケーション・プログラムを実行することで、対応する特定業務を支援するアプリケーション処理を行う。
【００２９】
本実施形態において、金融機関の個々の業務の支援は各々複数台のサーバ・コンピュータ１６（が実行するアプリケーション処理）によって実現され、同一の業務を支援するアプリケーション処理を実行するサーバ・コンピュータ１６同士が通信媒体１８を介して相互に接続されている。例として、図１のＬＡＮ１２Ａでは、サーバ・コンピュータ１６Ａ，１６Ｂが通信媒体１８Ａを介して相互に接続され、サーバ・コンピュータ１６Ｃ，１６Ｄが通信媒体１８Ｂを介して相互に接続され、サーバ・コンピュータ１６Ｅ，１６Ｆが通信媒体１８Ｃを介して相互に接続されている。
【００３０】
また、同一の業務を支援するアプリケーション処理を実行するサーバ・コンピュータ１６同士を相互に接続する個々の通信媒体１８には、各々ルータ（以下、業務ルータと称する）２０が接続されており（図１のＬＡＮ１２Ａでは業務ルータ２０Ａ〜２０Ｃを示す）、これらの業務ルータ２０は通信媒体２２を介して相互に接続されている。通信媒体２２には、各々ファイヤウォールとして機能する２台のルータ（以下、Ｆ／Ｗルータと称する）２４Ａ，２４Ｂが各々接続されている、Ｆ／Ｗルータ２４Ａ，２４ＢはＬＡＮ１２の入口に相当し、通信媒体２６Ａ及び通信媒体２６Ｂを介して相互に接続されている。
【００３１】
また、個々のＬＡＮ１２には、個々のＬＡＮに対応して各々２台のルータ（以下、バックボーン・ルータと称する）２８Ａ，２８Ｂが設けられており、このバックボーン・ルータ２８Ａ，２８Ｂは通信媒体２６Ａ及び通信媒体２６Ｂに各々接続されている。従って、個々のＬＡＮ１２のＦ／Ｗルータ２４と個々のＬＡＮ１２に対応するバックボーン・ルータ２８との間には、バックボーン・ルータ２８Ａ⇔Ｆ／Ｗルータ２４Ａ、バックボーン・ルータ２８Ａ⇔Ｆ／Ｗルータ２４Ｂ、バックボーン・ルータ２８Ｂ⇔Ｆ／Ｗルータ２４Ａ、バックボーン・ルータ２８Ｂ⇔Ｆ／Ｗルータ２４Ｂの４つの通信経路が設けられており、通信経路が多重化されている。バックボーン・ルータ２８Ａ，２８Ｂは通信回線を介して通信網１４に接続されている。
【００３２】
業務ルータ２０、Ｆ／Ｗルータ２４及びバックボーン・ルータ２８は互いに類似の構成であり、バックボーン・ルータ２８Ａを例にその構成を説明すると、バックボーン・ルータ２８ＡはＣＰＵ３０及び不揮発性メモリ３２を内蔵しており、不揮発性メモリ３２には外部から受信した情報の送信先を表すルーティング情報３４（詳細は後述）が記憶されている。また、不揮発性メモリ３２にはＣＰＵ３０が実行するための所定のプログラムも記憶されており、ＣＰＵ３０が所定のプログラムを実行することで、ＣＰＵ３０は、外部から情報を受信したか否かを監視し、外部から情報を受信すると受信した情報の宛先に基づきルーティング情報３４を参照することで送信先を判断し、受信した情報を判断した送信先へ送信すると共に、ネットワークの状態の変化に応じてルーティング情報を適宜更新する処理を行う（動的ルーティング）。
【００３３】
また、Ｆ／Ｗルータ２４の不揮発性メモリ３２には、Ｆ／Ｗルータ２４を通過することを阻止すべき情報を規定する規定情報も記憶されており、Ｆ／Ｗルータ２４では、外部から受信した情報が前記規定情報に規定された情報（通過を阻止すべき情報）であった場合に、受信した情報を破棄するフィルタリング処理（ファイヤウォールに相当する処理）も行われる。Ｆ／Ｗルータ２４は個々のＬＡＮ１２の入口に相当し、個々のＬＡＮ１２の入口でＦ／Ｗルータ２４が上記のフィルタリング処理を行うことで、個々のＬＡＮ１２内のサーバ・コンピュータ１６等の各コンピュータが外部から不正にアクセスされることを防止することができる。Ｆ／Ｗルータ２４は請求項３に記載のファイヤウォールに対応している。
【００３４】
ところで、ＬＡＮ１２Ａの通信媒体１８Ａには、本発明に係る障害監視装置に相当する障害監視端末３６が接続されている。障害監視端末３６はパーソナル・コンピュータ（ＰＣ）等から成り、ＣＰＵ３６Ａ、ＲＯＭ３６Ｂ、ＲＡＭ３６Ｃ、入出力ポート３６Ｄを備え、これらはアドレスバス、データバス、制御バス等のバス３６Ｅを介して互いに接続されている。入出力ポート３６Ｄには、ＣＲＴ又はＬＣＤから成るディスプレイ３８、マウス４０、キーボード４２、ハードディスク装置（ＨＤＤ）４４、通信制御装置４６が各々接続されている。通信制御装置４６は通信媒体１８Ａに接続されており、障害監視端末３６は、ルーティング情報収集対象のルータ（後述）と通信媒体１８Ａ等を介して通信可能とされている。
【００３５】
また、障害監視端末３６のＨＤＤ４４には、ＣＰＵ３６Ａが障害監視処理（詳細は後述）を行うための障害監視プログラムが予めインストールされている。この障害監視プログラムは請求項８に記載のプログラムに対応しており、ＣＰＵ３６Ａが障害監視プログラムを実行することで、障害監視端末３６が本発明に係る障害監視装置として機能することになる。
【００３６】
更に、ＨＤＤ４４には、ルーティング情報取得対象の複数のルータ（詳細は後述）の不揮発性メモリ３２に本来記憶されているべきルーティング情報（コンピュータ・ネットワーク１０に何ら障害が発生していない状態で記憶されているべきルーティング情報）が予め登録された登録テーブル４８も記憶されている。なお、ＨＤＤ４４は本発明に係る記憶手段に対応している。この登録テーブル４８に登録されているルーティング情報は、例えばコンピュータ・ネットワーク１０の全体を管理する管理者により、キーボード４２等を介して障害監視端末３６へ入力される。
【００３７】
なお、登録テーブル４８には、ルータの不揮発性メモリ３２に記憶されるルーティング情報を全て登録することに限られるものではなく、コンピュータ・ネットワーク１０内の各コンピュータのうち、例えば金融機関の特定業務を支援するコンピュータ間の通信に影響を与える障害についてのみ発生を監視する等のように、障害発生の監視対象が限られている場合には、監視対象に対応するルーティング情報のみが登録される。
【００３８】
次に本実施形態の作用として、まず各ルータにおけるルーティング情報の更新（動的ルーティング）について、図２を参照し、具体例を挙げて説明する。なお、図２では説明を簡単にするために、ルータａ〜ｄの４台のルータが設けられ、コンピュータＡが接続されたルータａがルータｂ及びルータｃに各々接続されると共に、コンピュータＢが接続されたルータｄもルータｂ及びルータｃに各々接続され、コンピュータＡ，Ｂが相互に情報を送受する構成のネットワークが示されている。
【００３９】
上記構成のネットワークにおいて、ネットワークに何ら障害が発生していない通常時には、ルータａ〜ルータｄに、特定の宛先（コンピュータＡ又はコンピュータＢ）へ送達されるべき情報を個々のルータが受信した場合の当該情報の全ての送信先が各宛先（コンピュータＡ，Ｂ）について各々設定されたルーティング情報が各々記憶されている（図２（Ａ）参照）。
【００４０】
すなわち、例えばルータａにおいて、コンピュータＢを宛先とする情報を受信した場合に、当該情報のコンピュータＢへの送達に利用可能な通信経路（情報の伝達経路）としては「ルータａ→ルータｂ→ルータｄ→コンピュータＢ」という通信経路と、「ルータａ→ルータｃ→ルータｄ→コンピュータＢ」という通信経路が存在している。このため、ルータａに記憶されているルーティング情報は、コンピュータＢを宛先とする情報の送信先として、上記の２種類の通信経路に対応する送信先である「ルータｂ」及び「ルータｃ」が設定されている。
【００４１】
また、例えばルータｂにおいて、コンピュータＡを宛先とする情報を受信した場合に、当該情報のコンピュータＡへの送達に利用可能な通信経路としては「ルータｂ→ルータａ→コンピュータＡ」という通信経路と、「ルータｂ→ルータｄ→ルータｃ→ルータａ→コンピュータＡ」という通信経路が存在している。このため、ルータｂに記憶されているルーティング情報は、コンピュータＡを宛先とする情報の送信先として、上記の２種類の通信経路に対応する送信先である「ルータａ」及び「ルータｄ」が設定されている。
【００４２】
また、各ルータは、自ルータが記憶しているルーティング情報に含まれる何れかの送信先と通信不能状態になったか否か、及び、以前に通信不能状態になっていた特定の送信先との通信不能状態が解消したか（通信可能状態になったか）否かを監視し、通信不能状態になるか又は通信不能状態が解消したことを検知した場合に、自ルータのルーティング情報を検知した事象に応じて更新すると共に、自ルータのルーティング情報を隣り合うルータと適宜相互に通知し、通知されたルーティング情報に変化がある（この変化も、隣り合うルータと任意の送信先が通信不能状態になるか、又は任意の送信先との通信不能状態が解消することによって生ずる）ことを検知した場合には、検知した変化に応じて自ルータのルーティング情報を更新する処理（動的ルーティング）を行う。
【００４３】
具体的には、例えば図２（Ｂ）に示すようにルータａとルータｂの間の通信回線に障害が発生し、ルータａとルータｂが通信不能状態となり、前記通信回線を経由する通信経路が使用不可の状態となった場合、上記通信不能状態がルータａ，ｂによって各々検知され、ルータａにおいては、ルータａのルーティング情報に設定されている各送信先のうち、通信不能状態となった送信先の情報（コンピュータＢを宛先とする情報の送信先の１つである「ルータｂ」）が無効化されると共に、ルータｂにおいては、ルータｂのルーティング情報に設定されている各送信先のうち、通信不能状態となった送信先の情報（コンピュータＡを宛先とする情報の送信先の１つである「ルータａ」、及びコンピュータＢを宛先とする情報の送信先の１つである「ルータａ」）が無効化される。
【００４４】
更に、ルータａ，ｂが隣り合うルータ（ルータｃ，ｄ）とルーティング情報を相互に通知することで、ルータｃにおいても、ルータｃのルーティング情報に設定されている各送信先のうち、通信不能状態となっている通信回線を経由する通信経路に対応する送信先の情報（コンピュータＡを宛先とする情報の送信先の１つである「ルータｄ」及びコンピュータＢを宛先とする情報の送信先の１つである「ルータａ」）が無効化され、ルータｄにおいても、ルータｄのルーティング情報に設定されている各送信先のうち、通信不能状態となっている通信回線を経由する通信経路に対応する送信先の情報（コンピュータＡを宛先とする情報の送信先の１つである「ルータｂ」）が無効化される。このようにして、ルータａとルータｂの間の通信不能状態（通信回線の障害）に応じて各ルータのルーティング情報が各々更新されることになる。
【００４５】
また、例えば図２（Ｃ）に示すようにルータｄとコンピュータＢの間の通信回線に障害が発生するか、又はコンピュータＢに障害が発生することで、ルータｄとコンピュータＢの間が通信不能状態になった場合、この通信不能状態の発生がルータｄによって検知され、ルータｄにおいて、ルータｄのルーティング情報に設定されている各送信先のうち、コンピュータＢを宛先とする情報の送信先である「コンピュータＢ」が無効化される。
【００４６】
また、ルータｄが隣り合うルータｃ，ｄとルーティング情報を相互に通知し、ルータｃ，ｄがルータａとルーティング情報を相互に通知することで、ルータｃにおいても、ルータｃのルーティング情報に設定されている各送信先のうち、コンピュータＢを宛先とする情報の送信先である「ルータｄ」及び「ルータａ」が各々無効化され、ルータｄにおいても、ルータｄのルーティング情報に設定されている各送信先のうち、コンピュータＢを宛先とする情報の送信先である「ルータｄ」及び「ルータａ」が各々無効化され、ルータａにおいても、ルータａのルーティング情報に設定されている各送信先のうち、コンピュータＢを宛先とする情報の送信先である「ルータｂ」及び「ルータｃ」が各々無効化される。このようにして、ルータｄとコンピュータＢの間の通信回線、又はコンピュータＢの障害に伴うルータｄとコンピュータＢの通信不能状態に応じて各ルータのルーティング情報が各々更新されることになる。このように、本実施形態に係る各ルータは、詳しくは請求項２に記載のルータに対応している。
【００４７】
なお、以前に発生していた障害が復旧し、各ルータがこれを検知した場合にも各ルータのルーティング情報が各々更新され、詳しくは、無効化されていた対応する情報が有効化される。また、隣り合うルータとのルーティング情報の相互通知には、定期的（例えば３０秒周期）に行う方式と、ルーティング情報の変化が生じた場合（ルーティング情報を更新した場合）にルーティング情報の差分（変化分）のみを相互に通知する方式があるが、何れの方式を採用してもよい。
【００４８】
続いて、障害監視端末３６のＣＰＵ３６Ａが障害監視プログラムを実行することで実現される障害監視処理について、図３のフローチャートを参照して説明する。なお、以下で説明する障害監視処理は、本発明に係る障害監視方法が適用された処理であり、障害監視端末３６によって所定周期で繰り返し実行される。
【００４９】
本実施形態では、ルーティング情報取得対象のルータが予め複数定められており、ステップ１００では、ルーティング情報取得対象の複数のルータの中から、以下で説明する処理を実行していない単一のルータを選択し、選択したルータに対してリモートログインをするための情報を送信することで、当該ルータへのリモートログインを試行する。
【００５０】
本実施形態では、各ＬＡＮ１２に対応して設けられた全てのバックボーン・ルータ２８Ａ，２８Ｂ（ＬＡＮ１２Ａの入口に設けられたＦ／Ｗルータ２４から最も近い位置に位置しているバックボーン・ルータ２８Ａ，２８Ｂ、及び、ＬＡＮ１２Ｂ〜１２Ｄの入口に設けられたＦ／Ｗルータ２４よりも障害監視端末側でかつ前記Ｆ／Ｗルータ２４から最も近い位置に位置しているバックボーン・ルータ２８Ａ，２８Ｂ）をルーティング情報取得対象のルータとしており、ステップ１００でルーティング情報取得対象のバックボーン・ルータ２８へ送信された情報は、Ｆ／Ｗルータ２４のファイヤウォール機能により破棄されることなくバックボーン・ルータ２８で受信される。
【００５１】
次のステップ１０２では、リモートログインのための情報を送信したバックボーン・ルータ２８（ルーティング情報取得対象のバックボーン・ルータ２８）から正常応答に相当する情報を受信したか否かを判断することで、ルーティング情報取得対象のバックボーン・ルータ２８へのログインに成功したか否かを判定する。上記判定が否定された場合は、ルーティング情報取得対象のバックボーン・ルータ２８自体に障害が発生していると推定できるため、ステップ１０４において、ルーティング情報取得対象のバックボーン・ルータ２８に障害が発生していることを通知するメッセージを出力する（例えば前記メッセージをディスプレイ３８に表示させる）ことで、当該バックボーン・ルータ２８の障害発生を管理者等へ通知し、ステップ１００へ戻る。
【００５２】
一方、ステップ１０２の判定が肯定された場合はステップ１０６へ移行し、ログインしたバックボーン・ルータ２８に対してルーティング情報の送信を要求し、この要求に従いバックボーン・ルータ２８によって不揮発性メモリ３２から読み出されて送信されたルーティング情報を受信することで、ログインしたバックボーン・ルータ２８のルーティング情報を取得する。なお、上述したステップ１００〜ステップ１０６は本発明に係る取得手段に相当する処理である。
【００５３】
次のステップ１０８以降は本発明に係る判断手段に相当する処理であり、ステップ１０８では、ＨＤＤ４４に記憶されている登録テーブル４８からルーティング情報収集対象のバックボーン・ルータ２８に対応するルーティング情報を読み出し、登録テーブル４８から読み出したルーティング情報をバックボーン・ルータ２８から取得したルーティング情報と比較する。そして、ステップ１１０において、登録テーブル４８から読み出したルーティング情報に含まれる各情報が、バックボーン・ルータ２８から取得したルーティング情報の中に有効な情報として存在しているか否か判定する。
【００５４】
上記の判定が肯定された場合は何ら処理を行うことなくステップ１１４へ移行するが、上記の判定が否定された場合はコンピュータ・ネットワーク１０に障害が発生していると判断できるので、ステップ１１２へ移行し、登録テーブル４８から読み出したルーティング情報に含まれる各情報のうち、バックボーン・ルータ２８から取得したルーティング情報では欠落している（有効な情報として存在していない）情報を全て抽出し、ＲＡＭ３６Ｃ又はＨＤＤ４４に記憶させる。
【００５５】
なお、コンピュータ・ネットワーク１０内の各ルータの動的ルーティングにより、任意のＬＡＮ１２内で障害が発生した場合にも、各ルータのルーティング情報に変化（一部の情報の欠落）が生ずるので、ステップ１１２で欠落している情報がＲＡＭ３６Ｃ又はＨＤＤ４４に記憶されることになる。
【００５６】
ステップ１１４では、バックボーン・ルータ２８から取得したルーティング情報を、障害監視処理を前回実行した際にＲＡＭ３６Ｃ又はＨＤＤ４４に記憶させた対応する欠落情報と比較し、次のステップ１１６において、当該欠落情報が取得したルーティング情報上で復旧している（有効となっている）か否か、すなわち障害監視処理を前回実行した際に発生していた障害が復旧しているか否かを判定する。判定が否定された場合には何ら処理を行うことなくステップ１２０へ移行するが、判定が肯定された場合にはステップ１１８へ移行し、復旧した欠落情報をＲＡＭ３６Ｃ又はＨＤＤ４４に記憶させる。
【００５７】
ステップ１２０では、ルーティング情報取得対象の全てのルータからルーティング情報を取得したか否か判定する。判定が否定された場合にはステップ１００に戻り、ステップ１００〜ステップ１２０を繰り返す。これにより、ルーティング情報取得対象の全てのルータからルーティング情報が取得され、登録テーブル４８から読み出したルーティング情報に含まれる各情報のうち取得したルーティング情報では欠落している情報が有れば、該情報が欠落情報として記憶され、前回の障害監視処理で記憶していた欠落情報のうち取得したルーティング情報上で復旧している情報が有れば、該情報が復旧した欠落情報として記憶されることになる。
【００５８】
ステップ１２０の判定が肯定されるとステップ１２２へ移行し、ＲＡＭ３６Ｃ又はＨＤＤ４４に記憶されている欠落情報（先のステップ１１２で記憶された欠落情報）が有るか否か判定する。ステップ１２２の判定が否定された場合には、コンピュータ・ネットワーク１０には何ら障害が発生していない（又は監視対象の障害は発生していない）と判断できるので、何ら処理を行うことなくステップ１３２へ移行する。
【００５９】
一方、ステップ１２２の判定が肯定された場合には、コンピュータ・ネットワーク１０に何らかの障害が発生していると判断できるので、ステップ１２４へ移行し、ＲＡＭ３６Ｃ又はＨＤＤ４４に記憶されている欠落情報に基づいて障害発生箇所を特定する。なお、ＬＡＮ１２内で障害が発生した場合にも各ルータのルーティング情報に変化（一部の情報の欠落）が生ずることで障害の発生を検知できると共に、各ＬＡＮ１２の入口に設けられたＦ／Ｗルータ２４から最も近い位置に位置しているバックボーン・ルータ２８からルーティング情報を取得しているので、特定のＬＡＮ１２内で障害が発生した場合にも、例えば特定のＬＡＮ１２内で障害が発生している、等のように、障害発生箇所を精度良く絞り込む（おおよそ特定する）ことができる。
【００６０】
そして、次のステップ１２６では、ステップ１２４で特定した障害発生箇所に基づいて、コンピュータ・ネットワーク１０の中に通信不能となっている区間が存在しているか否か判定する。例えば個々のＬＡＮ１２に対応するバックボーン・ルータ２８とＦ／Ｗルータ２４の間の区間には、前述のように４つの通信経路が設けられているが、この４つの通信経路の何れも通信不能状態となっている場合（例えば４つの通信経路の何れでも障害が発生している場合、或いはＦ／Ｗルータ２４Ａ，２４Ｂに各々障害が発生している場合）には、情報の送信元から送信先へ至る通信経路上に当該区間が含まれる情報の送受を行うことは不可能である。なお、このような状態が生ずると、例として図２（Ｃ）に示すように、バックボーン・ルータ２８より取得したルーティング情報から、特定の宛先に対応する送信先が全て欠落することになる。
【００６１】
ステップ１２４で特定した障害発生箇所に基づき、通信不能状態となっている区間がコンピュータ・ネットワーク１０の中に存在していると判断した場合には、ステップ１２６の判定が肯定されてステップ１３０へ移行し、通信不能状態となっている区間を明示して重度の障害（通信不能）が生じていることを通知する障害通知メッセージを出力する（例えばディスプレイ３８に表示させる）。
【００６２】
また、通信不能状態となっている区間がコンピュータ・ネットワーク１０の中に存在していないと判断した場合（この場合、図２（Ｂ）に示すように、バックボーン・ルータ２８より取得したルーティング情報から、特定の宛先に対応する送信先が部分的に欠落するものの、特定の宛先に対応する一部の送信先は残存している）には、ステップ１２６の判定が否定されてステップ１２８へ移行し、特定した障害発生箇所を明示して軽度の障害が生じている（通信可能であるものの一部の通信経路が通信不能状態となっている）ことを通知する警告メッセージを出力する（例えばディスプレイ３８に表示させる）。
【００６３】
これにより、管理者はコンピュータ・ネットワーク１０に障害が発生したこと及び障害発生箇所を容易に認識できると共に、発生した障害の程度も容易に認識することができ、発生した障害を復旧させるための対策を講ずることができる。
【００６４】
次のステップ１３２では、ＲＡＭ３６Ｃ又はＨＤＤ４４に記憶されている復旧した欠落情報（先のステップ１１８で記憶された情報）が有るか否か判定する。判定が否定された場合には障害監視処理を終了するが、判定が肯定された場合にはステップ１３４へ移行し、ＲＡＭ３６Ｃ又はＨＤＤ４４に記憶されている復旧した欠落情報に基づいて、以前に発生していた障害が復旧したことを通知する復旧メッセージを出力（例えばディスプレイ３８に表示させる）した後に、障害監視処理を終了する。これにより、コンピュータ・ネットワーク１０に以前に発生していた障害が復旧したことを管理者が認識することができる。
【００６５】
上述した障害監視処理は所定周期で繰り返し実行されるので、コンピュータ・ネットワーク１０に障害が発生した場合に、これを迅速に検知し、障害発生箇所を速やかに判断して管理者へ通知することができ、コンピュータ・ネットワーク１０が、障害が発生している状態で長時間放置されたり、ＬＡＮ１２内で障害が発生していたために障害発生箇所の特定に時間がかかったりすることを回避することができる。
【００６６】
また、例えば特定のＬＡＮ１２内に設けられ特定業務を支援するアプリケーション処理を行う特定のサーバ・コンピュータ１６と、特定のＬＡＮ１２の外部に設けられた特定のコンピュータの間で情報を送受できない異常が発生した場合、特定のコンピュータの側に問題がなく、特定のＬＡＮ１２の外部に障害が発生していないのであれば、発生した異常の原因としては、「特定のサーバ・コンピュータ１６で行われるアプリケーション処理に問題がある（例えばアプリケーション・プログラムのバグ）」又は「特定のＬＡＮ１２に障害が発生している」が考えられるが、従来の技術ではこの判別ができないという問題があった。これに対し、本実施形態に係るコンピュータ・ネットワーク１０では、コンピュータ・ネットワーク１０に障害が発生すれば、障害発生箇所がＬＡＮ１２内であっても障害通知メッセージ又は警告メッセージが出力されるので、コンピュータ・ネットワークが正常か否かを容易に判断することができ、上記の判別も容易に行うことができる。
【００６７】
なお、上記では複数台のルータからルーティング情報を収集する例を説明したが、これに限られるものではなく、単一のルータのみからルーティング情報を収集するようにしてもよい。但し、ルーティング情報収集対象のルータに障害が発生してルーティング情報を収集できない事態が生ずる可能性があることを考慮すると、複数台のルータからルーティング情報を収集するか、又は、ルーティング情報収集対象のルータに障害が発生していた場合には他のルータからルーティング情報を収集する（ルーティング情報収集対象のルータを動的に変更する）ことが好ましい。
【００６８】
また、上記では本発明に係るネットワークとして、複数のＬＡＮが相互に接続されたコンピュータ・ネットワーク１０を例に説明したが、これに限定されるものではなく、本発明は任意の構成のネットワークに適用可能であることは言うまでもない。
【００６９】
【発明の効果】
以上説明したように本発明は、通信回線を介して相互に接続された複数台のルータを含んで構成され、各ルータが、所定の宛先へ送達されるべき情報を受信した場合の当該情報の送信先を表すルーティング情報を各宛先毎に記憶し、受信した情報を、該情報の宛先に対応するルーティング情報に従って送信先を選択して送信すると共に、ネットワークの状態に応じて記憶しているルーティング情報を適宜更新する機能を備えているネットワークにおいて、特定のルータからルーティング情報を取得し、取得したルーティング情報を、特定のルータに本来記憶されているべきルーティング情報と比較することで、ネットワークにおける障害発生の有無及び障害発生箇所を判断するので、実構造が不明なローカルネットワークを含むネットワークであっても、コンピュータに直接アクセスすることなく当該ネットワークにおける障害の発生を監視できる、という優れた効果を有する。
【図面の簡単な説明】
【図１】本実施形態に係るコンピュータ・ネットワークの概略構成を示すブロック図である。
【図２】ルータによるルーティング情報の更新を説明するための概略図である。
【図３】障害監視端末によって実行される障害監視処理の内容を示すフローチャートである。
【符号の説明】
１０コンピュータ・ネットワーク
１２ＬＡＮ
１６サーバ・コンピュータ
２４Ｆ／Ｗルータ
２８バックボーン・ルータ
３２不揮発性メモリ
３４ルーティング情報
３６障害監視端末
３８ディスプレイ
４４ＨＤＤ
４８登録テーブル

[0001]
[Technical field of the invention]
The present invention relates to a fault monitoring device, a fault monitoring method, and a program, and more particularly to a fault monitoring method for monitoring the occurrence of a failure in a network that includes a plurality of routers, a fault monitoring device to which the fault monitoring method is applied, and a program for causing a computer to function as the fault monitoring device.
[0002]
[Prior art]
Various techniques have been proposed for monitoring the occurrence of a failure in a network and for determining the location of the failure when one occurs.
[0003]
Specifically, for example, a technique has been proposed in which a fault monitoring computer is installed, the fault monitoring computer and a monitoring-center computer are able to exchange information via a network, and the fault monitoring computer sends a failure detection signal to the monitoring-center computer when it detects a failure by checking ping responses, so that a remote monitoring center can quickly and accurately grasp the occurrence of a failure in the monitored network (see, for example, Patent Document 1).
In addition, a technique has been proposed in which information about the actual structure of a network and information about each network node are acquired from the monitored network and compared with the same information acquired previously, and a failure is detected based on the comparison result (see, for example, Patent Document 2).
[0004]
[Patent Document 1]
JP 2001-298426 A
[Patent Document 2]
JP 2002-14881 A
[0005]
[Problems to be solved by the invention]
Meanwhile, in a network in which a plurality of local area networks (LANs) are connected to one another (also called a WAN (Wide Area Network)), a firewall is generally provided at the boundary between each LAN and the outside (the entrance of the LAN) in order to prevent unauthorized access to the LAN from outside. Such a firewall is often configured to block the passage of information sent by monitoring commands such as the ping command, and of the information corresponding to their responses.
[0006]
In contrast, the technique described in Patent Document 1 detects failures by directly accessing computers with a monitoring command. Consequently, if the firewall provided at the entrance of a LAN is configured to block the information exchanged by the monitoring command, access to the computers in that LAN is denied, which makes it difficult to detect the occurrence of a failure in a network of the above configuration.
[0007]
Furthermore, in a network in which a plurality of LANs are connected to one another, the network is managed LAN by LAN. Under the security policy of each LAN, the actual structure of the LAN is often not disclosed to the administrator who manages the entire network, and changes to that structure, such as the addition of a computer, are often not reported to the administrator either. As a result, there are many networks in which the administrator does not know the actual structure inside the individual LANs. Even in such a network, when a failure disrupts the work carried out over the network, the administrator must identify the location of the failure and take countermeasures. However, the technique described in Patent Document 2 presupposes that the actual structure of the network is known, so even if it were applied, it would be difficult to detect the occurrence of a failure in a network that contains parts whose actual structure is unknown.
[0008]
The present invention has been made in view of the above facts, and its object is to obtain a fault monitoring device, a fault monitoring method, and a program that can monitor the occurrence of a failure in a network without directly accessing a computer, even when the network includes a local network whose actual structure is unknown.
[0009]
[Means for Solving the Problems]
In order to achieve the above object, the fault monitoring device according to the invention of claim 1 is connected to a network that includes a plurality of routers connected to one another via communication lines. Each router stores, for each destination, routing information indicating where to send information received for delivery to that destination, selects a transmission destination for received information according to the routing information corresponding to its destination and sends it there, and has a function of updating the stored routing information as appropriate according to the state of the network. The fault monitoring device monitors the occurrence of failures in this network and includes: acquisition means for acquiring routing information from a specific router; storage means in which the routing information that should normally be stored in the specific router is registered in advance; and determination means for determining whether a failure has occurred in the network, and where, by comparing the routing information acquired by the acquisition means with the routing information registered in the storage means.
[0010]
The network to which the invention of claim 1 applies includes a plurality of routers connected to one another via communication lines. Each router stores, for each destination, routing information indicating where to send information received for delivery to that destination, selects a transmission destination for received information according to the routing information corresponding to its destination and sends it there, and has a function of updating the stored routing information as appropriate according to the state of the network. The fault monitoring device according to the invention of claim 1 is connected to a network of this configuration.
[0011]
The above-described function of the router is generally called dynamic routing. The router's updating of its routing information according to the state of the network can be carried out, for example, as described in claim 2. When a router becomes unable to communicate with any of the transmission destinations included in its stored routing information, it updates its own routing information, for example by deleting or invalidating the information corresponding to that transmission destination (that is, the information corresponding to communication paths passing through it). Each router also exchanges routing information with its adjacent routers, and when the routing information notified by another router has changed, it updates its own routing information accordingly. For example, as above, it deletes or invalidates the entries that correspond to information deleted or invalidated in the other router's routing information, namely the information corresponding to a transmission destination with which the other router can no longer communicate (the information corresponding to communication paths passing through that destination).
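The update behaviour described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the table layout (destination mapped to next hops with a validity flag), the function names, and the four-node example are hypothetical and are not taken from the patent or from any real routing protocol implementation.

```python
from typing import Dict, Set

# Hypothetical layout: destination -> {next hop -> is the entry currently valid?}
RoutingTable = Dict[str, Dict[str, bool]]


def invalidate_next_hop(table: RoutingTable, next_hop: str) -> None:
    """Invalidate every entry that forwards through a next hop we can no longer reach."""
    for hops in table.values():
        if next_hop in hops:
            hops[next_hop] = False


def apply_neighbor_notice(table: RoutingTable, neighbor: str,
                          reachable_via_neighbor: Set[str]) -> None:
    """Invalidate routes through `neighbor` for destinations the neighbor reports as lost."""
    for destination, hops in table.items():
        if neighbor in hops and destination not in reachable_via_neighbor:
            hops[neighbor] = False


def valid_next_hops(table: RoutingTable, destination: str) -> Set[str]:
    """Return the next hops that are still valid for a destination."""
    return {hop for hop, ok in table.get(destination, {}).items() if ok}


# Assumed example: router "a" normally reaches computer "B" via router "b" or router "c".
router_a: RoutingTable = {"B": {"b": True, "c": True}}

invalidate_next_hop(router_a, "b")            # the link between "a" and "b" goes down
after_link_failure = valid_next_hops(router_a, "B")

apply_neighbor_notice(router_a, "c", set())   # "c" reports that it lost its route to "B"
after_notice = valid_next_hops(router_a, "B")
```

In this sketch a single link failure leaves one valid next hop for the destination, while the later notice from the remaining neighbour leaves none. Invalidation is modelled with a flag rather than deletion, which is one of the two options the text mentions.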
[0012]
In a network that includes routers having the above function (the dynamic routing function), the contents of the routing information stored in the routers change according to the state of the network, including the occurrence of failures. Moreover, even when the network according to the present invention includes a local network whose actual structure is unknown (for example, as described in claim 3, a local network connected via a firewall that blocks the exchange of information needed for fault monitoring), the routing information of the routers also contains information about any computer inside the local network that exchanges information with a computer outside the local network (an external computer), namely information indicating where to send information received for delivery to that computer. Therefore, if a failure occurs that affects the exchange of information between that computer and the external computer, the part of the routing information relating to that computer also changes, and the location of the failure can be determined from the change in the routing information even if the failure lies inside the local network.
[0013]
Based on the above facts, in the invention of claim 1, the routing information that should normally be stored in the specific router is registered in the storage means in advance, and the acquisition means acquires routing information from the specific router. The determination means then determines whether a failure has occurred in the network, and where, by comparing the routing information acquired by the acquisition means with the routing information registered in the storage means. Note that determining the location of a failure includes not only pinpointing the location but also narrowing it down.
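As a rough illustration of what "narrowing down" might look like, the following Python sketch groups the destinations whose routes have disappeared by the local network they belong to. The mapping from destination to LAN, the names used, and the function itself are assumptions made for this example; the text does not prescribe how the narrowing is done.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def narrow_down_by_lan(missing_destinations: Iterable[str],
                       lan_of: Dict[str, str]) -> Dict[str, List[str]]:
    """Group destinations that lost routes by the LAN each one is assumed to belong to."""
    suspects: Dict[str, List[str]] = defaultdict(list)
    for destination in missing_destinations:
        suspects[lan_of.get(destination, "unknown")].append(destination)
    return dict(suspects)


# Hypothetical mapping, assumed to be known when the baseline was registered.
lan_of = {"server-16A": "LAN-12A", "server-16B": "LAN-12A", "server-16X": "LAN-12B"}

# Destinations whose registered routes are absent from the acquired routing information.
suspects = narrow_down_by_lan(["server-16A", "server-16B"], lan_of)
```

Here every affected destination falls in the same LAN, so that LAN is the natural place to start looking, even though its internal structure is not known to the monitor.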
[0014]
As described above, the invention of claim 1 uses the routing information of a router to determine whether a failure has occurred and where, so it can be applied even to a network that includes a local network whose actual structure is unknown. Because there is no need to access computers directly, the invention is also unaffected by a firewall or the like that blocks direct access to computers (the exchange of information needed for fault monitoring). Therefore, according to the invention of claim 1, the occurrence of a failure in a network can be monitored without directly accessing a computer, even when the network includes a local network whose actual structure is unknown.
[0015]
Note that the storage means of the invention of claim 1 need not hold all of the routing information that should normally be stored in the specific router. When the scope of monitoring is limited, for example when only failures that affect communication between specific computers in the network are monitored, only the routing information corresponding to the monitored targets may be registered. In this case, whether a failure has occurred and where may be determined, for example as in claim 4 described below, by checking whether each transmission destination registered for each destination in the storage means is present in the routing information acquired by the acquisition means.
[0016]
The acquisition means according to the present invention may acquire routing information from a single specific router, or from all of the routers in the network. However, when the location of a failure is determined from routing information, the accuracy of the determination can vary with the positional relationship between the router from which the routing information was acquired and the location of the failure, and acquiring routing information from a router close to the failure often improves the accuracy.
[0017]
For this reason, when the network according to the present invention includes a local network connected via a firewall that blocks transmission and reception of the information necessary for fault monitoring, it is preferable, as described in claim 3, to obtain at least the routing information of the router located on the fault monitoring device side of the firewall and closest to the firewall (at a position where no other router exists on the communication path to the firewall). In a network having the above configuration, the accuracy of determining the location of a failure is likely to decrease when the failure occurs in the local network. In the invention according to the third aspect, however, the routing information is obtained from at least the router located closest to the firewall, so the location of the failure can be determined accurately even when the failure occurs in the local network.
[0018]
Further, in the invention according to the first aspect, the determination means can determine the presence or absence of a failure and the location of the failure, for example as described in the fourth aspect, by judging whether each transmission destination for each destination included in the routing information registered in the storage means exists in the routing information acquired by the acquisition means. In this case, if any of the transmission destinations for each destination included in the registered routing information does not exist in the acquired routing information, it can be determined that some failure has occurred, and the location of the failure can be determined from the missing information. The presence or absence of a failure and the location of the failure can thus be determined by simple processing.
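The comparison of the fourth aspect can be sketched as follows. This is a minimal illustration, not the implementation of the invention: the table layout (a mapping from each destination to its set of transmission destinations), the function name `find_missing`, and the sample data are all assumptions made for the example.

```python
def find_missing(registered, acquired):
    """Return {destination: registered transmission destinations absent from `acquired`}."""
    missing = {}
    for destination, hops in registered.items():
        lost = set(hops) - set(acquired.get(destination, ()))
        if lost:
            missing[destination] = lost
    return missing

# Hypothetical tables: destination -> set of transmission destinations (next hops)
registered = {"computer A": {"router a", "router d"},
              "computer B": {"router a", "router d"}}
acquired = {"computer A": {"router d"},
            "computer B": {"router d"}}

missing = find_missing(registered, acquired)
failure_detected = bool(missing)  # any missing entry implies some failure
```

An empty result means that every registered transmission destination is still present; a non-empty result both signals a failure and names the entries from which its location can be inferred.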
[0019]
When a failure occurs in a specific communication line, all of the routing information stored in the routers that corresponds to communication paths passing through that communication line is deleted or invalidated. However, communication lines are generally multiplexed in preparation for failures. Even if a failure occurs in a specific communication line, some of the transmission destinations corresponding to a specific destination (those corresponding to communication paths that do not pass through the failed communication line) often remain in the routing information (either simply remain, or remain as valid transmission destinations). In this case, information can still be transmitted to the specific destination.
[0020]
Based on the above, according to a fifth aspect of the present invention, in the invention of the fourth aspect, when the determining means recognizes that the routing information acquired by the acquiring means contains a specific destination for which the number of corresponding transmission destinations has decreased, the determining means issues a warning if a transmission destination corresponding to the specific destination remains in the acquired routing information, and reports the occurrence of a failure if no transmission destination corresponding to the specific destination remains in the acquired routing information.
[0021]
According to the fifth aspect of the present invention, when a transmission destination corresponding to the specific destination remains in the acquired routing information (simply remains, or remains as a valid transmission destination), that is, when information can still be transmitted to the specific destination, a warning is issued. When no transmission destination corresponding to the specific destination remains in the acquired routing information (none remains, or none remains as a valid transmission destination), that is, when information cannot be transmitted to the specific destination, the occurrence of a failure is reported. A network administrator or the like can therefore easily recognize the severity of the failure that has occurred in the network.
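The distinction drawn by the fifth aspect between a warning and a failure report can be illustrated with the following sketch. The function name `classify`, the string labels, and the table layout (destination mapped to a set of transmission destinations) are assumptions introduced for this example only.

```python
def classify(registered, acquired):
    """Map each affected destination to "warning" (some hops remain) or "failure" (none remain)."""
    result = {}
    for destination, hops in registered.items():
        expected = set(hops)
        remaining = expected & set(acquired.get(destination, ()))
        if remaining == expected:
            continue  # no transmission destination was lost for this destination
        result[destination] = "warning" if remaining else "failure"
    return result

# Hypothetical tables: destination -> set of transmission destinations
registered = {"computer A": {"router b", "router c"},
              "computer B": {"router b", "router c"},
              "computer C": {"router b"}}
acquired = {"computer A": {"router b", "router c"},  # unchanged
            "computer B": {"router c"},              # reduced, still reachable
            "computer C": set()}                     # unreachable

levels = classify(registered, acquired)
```

Here `levels` contains no entry for computer A, a warning for computer B, and a failure for computer C, which mirrors the two severities described above.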
[0022]
Further, in the invention described in claim 1, it is preferable, for example as described in claim 6, that the obtaining means obtains the routing information periodically (for example, at intervals of several seconds to several minutes) and that the determining means determines the presence or absence of a failure in the network and the location of the failure periodically (for example, at intervals of several seconds to several minutes). Thus, when a failure occurs in the network, the failure can be detected quickly and its location can be determined quickly.
[0023]
In the invention according to claim 6, the routers from which the obtaining means obtains the routing information need not be fixed, and may be changed or added dynamically. For example, routing information may normally be acquired from a specific router, and when it is recognized from the acquired routing information that a failure has occurred, routing information may also be acquired from other routers, and the location of the failure may be determined taking the additionally acquired routing information into account.
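One way to picture this dynamic addition of acquisition targets is the sketch below. It is only an illustration under stated assumptions: `collect_tables`, the `fetch` callback, the router names "r1" to "r3", and the dictionary layout are hypothetical and do not appear in the specification.

```python
def collect_tables(primary, others, fetch, registered):
    """Acquire from `primary`; acquire from `others` only if `primary` shows a discrepancy."""
    tables = {primary: fetch(primary)}
    if tables[primary] != registered[primary]:
        # A failure is suspected, so widen the set of acquisition targets.
        for router in others:
            tables[router] = fetch(router)
    return tables

# Hypothetical data: router name -> {destination: set of transmission destinations}
registered = {"r1": {"computer B": {"r2", "r3"}}}
current = {"r1": {"computer B": {"r3"}},
           "r2": {"computer B": set()}}

tables = collect_tables("r1", ["r2"], current.__getitem__, registered)
```

In normal operation only one router is queried per cycle; the additional tables are fetched only when they can help narrow down the location of a suspected failure.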
[0024]
The fault monitoring method according to claim 7 is applied to a network that includes a plurality of routers connected to each other via communication lines, in which each router stores, for each destination, routing information indicating the transmission destination of information to be delivered to that destination, transmits received information to a transmission destination selected according to the routing information corresponding to the destination of the information, and has a function of updating the stored routing information as appropriate according to the state of the network. In this method, the routing information that should originally be stored in the specific router from which routing information is to be obtained is registered in advance, the routing information is obtained from the specific router, and the acquired routing information is compared with the previously registered routing information to determine the presence or absence of a failure in the network and the location of the failure. Therefore, as in the first aspect, even in a network including a local network whose actual structure is unknown, the occurrence of a failure in the network can be monitored without directly accessing a computer.
[0025]
The program according to claim 8 is directed to a computer connected to a network that includes a plurality of routers connected to each other via communication lines, in which each router stores, for each destination, routing information indicating the transmission destination of information to be delivered to that destination, transmits received information to a transmission destination selected according to the routing information corresponding to the destination of the information, and has a function of updating the stored routing information as appropriate according to the state of the network. The program causes the computer to function as acquisition means for acquiring routing information from a specific router, and as determination means for determining the presence or absence of a failure in the network and the location of the failure by comparing the routing information acquired by the acquisition means with pre-registered routing information that should originally be stored in the specific router.
[0026]
Since the program according to claim 8 is a program for causing a computer to function as the above-described acquisition means and determination means, a computer that executes the program functions as the fault monitoring device according to the first aspect. As with the inventions according to the first and seventh aspects, the occurrence of a failure in the network can be monitored without directly accessing a computer, even in a network including a local network whose actual structure is unknown.
[0027]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, an example of an embodiment of the present invention will be described in detail with reference to the drawings. FIG. 1 shows a computer network 10 according to the present embodiment. The computer network 10 is a network installed to support various operations of a financial institution. More specifically, in the computer network 10, LANs 12 installed at a plurality of sites of the financial institution, such as its branches (sites A to D in FIG. 1, for example), are connected to one another via a communication network 14. The LAN 12 corresponds to the local network described in claim 3.
[0028]
Since the LANs 12 installed at the individual sites have similar configurations, the configuration will be described below taking the LAN 12A installed at site A as an example. The LAN 12A includes a plurality of server computers 16 (server computers 16A to 16F in the LAN 12A of FIG. 1). An application program for supporting a specific business of the financial institution is installed in each server computer 16, and by executing the application program, each server computer 16 performs application processing that supports the corresponding business.
[0029]
In the present embodiment, each individual business of the financial institution is supported by a plurality of server computers 16 (by the application processing that each of them executes), and the server computers 16 that execute application processing supporting the same business are interconnected via a communication medium 18. As an example, in the LAN 12A of FIG. 1, the server computers 16A and 16B are connected to each other via a communication medium 18A, the server computers 16C and 16D are connected to each other via a communication medium 18B, and the server computers 16E and 16F are connected to each other via a communication medium 18C.
[0030]
In addition, a router (hereinafter referred to as a business router) 20 is connected to each communication medium 18 that interconnects server computers 16 executing application processing for the same business (business routers 20A to 20C are shown in the LAN 12A of FIG. 1), and these business routers 20 are interconnected via a communication medium 22. Two routers 24A and 24B that each function as a firewall (hereinafter referred to as F/W routers) are connected to the communication medium 22. The F/W routers 24A and 24B correspond to the entrance of the LAN 12, and each is connected to a communication medium 26A and a communication medium 26B.
[0031]
Two routers 28A and 28B (hereinafter referred to as backbone routers) are provided for each LAN 12, and the backbone routers 28A and 28B are each connected to the communication medium 26A and the communication medium 26B. Therefore, four communication paths are provided between the F/W routers 24 of each LAN 12 and the backbone routers 28 corresponding to that LAN 12: backbone router 28A to F/W router 24A, backbone router 28A to F/W router 24B, backbone router 28B to F/W router 24A, and backbone router 28B to F/W router 24B. The communication paths are thus multiplexed. The backbone routers 28A and 28B are connected to the communication network 14 via communication lines.
[0032]
The business routers 20, the F/W routers 24, and the backbone routers 28 have similar configurations, so the configuration will be described taking the backbone router 28A as an example. The backbone router 28A includes a CPU 30 and a nonvolatile memory 32. The nonvolatile memory 32 stores routing information 34 (described in detail later) indicating the transmission destination of information received from outside, together with a predetermined program to be executed by the CPU 30. By executing the predetermined program, the CPU 30 monitors whether information has been received from outside. When information is received, the CPU 30 determines the transmission destination by referring to the routing information 34 based on the destination of the received information, and transmits the received information to the determined transmission destination. The CPU 30 also updates the routing information 34 as appropriate in response to changes in the state of the network (dynamic routing).
[0033]
The nonvolatile memory 32 of each F/W router 24 also stores regulation information that defines the information to be prevented from passing through the F/W router 24. If received information matches the information defined in the regulation information (information to be prevented from passing), the F/W router 24 also performs filtering processing (processing corresponding to a firewall) that discards the received information. The F/W routers 24 correspond to the entrance of each LAN 12, and because they perform the above filtering processing at that entrance, unauthorized access from outside to the computers in each LAN 12, such as the server computers 16, can be prevented. The F/W router 24 corresponds to the firewall described in claim 3.
[0034]
Incidentally, a failure monitoring terminal 36 corresponding to the failure monitoring device according to the present invention is connected to the communication medium 18A of the LAN 12A. The failure monitoring terminal 36 is composed of a personal computer (PC) or the like and includes a CPU 36A, a ROM 36B, a RAM 36C, and an input/output port 36D, which are connected to each other via a bus 36E consisting of an address bus, a data bus, a control bus, and the like. A display 38 composed of a CRT or LCD, a mouse 40, a keyboard 42, a hard disk drive (HDD) 44, and a communication control device 46 are connected to the input/output port 36D. The communication control device 46 is connected to the communication medium 18A, and the failure monitoring terminal 36 can communicate, via the communication medium 18A and the like, with the routers (described later) from which routing information is to be collected.
[0035]
In the HDD 44 of the failure monitoring terminal 36, a failure monitoring program for the CPU 36A to perform a failure monitoring process (details will be described later) is installed in advance. This fault monitoring program corresponds to the program described in claim 8, and when the CPU 36A executes the fault monitoring program, the fault monitoring terminal 36 functions as the fault monitoring device according to the present invention.
[0036]
Further, the HDD 44 also stores a registration table 48 in which the routing information that should originally be stored in the nonvolatile memory 32 of each of the plurality of routers from which routing information is to be acquired (described in detail later), that is, the routing information stored while no failure has occurred in the computer network 10, is registered in advance. The HDD 44 corresponds to the storage means according to the present invention. The routing information registered in the registration table 48 is input to the failure monitoring terminal 36 via the keyboard 42 or the like by, for example, an administrator who manages the entire computer network 10.
[0037]
Note that the registration table 48 is not limited to registering all the routing information stored in the nonvolatile memory 32 of the router. When the failures to be monitored are limited, for example to only those failures that affect communication between the supporting computers, only the routing information corresponding to the monitoring target may be registered.
[0038]
Next, as an operation of the present embodiment, the updating of routing information in each router (dynamic routing) will be described using the specific example of FIG. 2. For simplicity, FIG. 2 shows a network with four routers a to d, in which router a, to which computer A is connected, is connected to router b and router c, and router d, to which computer B is connected, is also connected to router b and router c, and in which computers A and B transmit information to and receive information from each other.
[0039]
In the network having the above configuration, in the normal state where no failure has occurred, each of routers a to d stores routing information in which, for each destination (computers A and B), all the transmission destinations to which the router can send information to be delivered to that destination are set (see FIG. 2A).
[0040]
That is, for example, when router a receives information destined for computer B, two communication paths (information transmission paths) are available for delivering the information to computer B: "router a → router b → router d → computer B" and "router a → router c → router d → computer B". For this reason, in the routing information stored in router a, "router b" and "router c", the transmission destinations corresponding to these two communication paths, are set as the transmission destinations of information destined for computer B.
[0041]
Similarly, when router b receives information destined for computer A, two communication paths are available for delivering the information to computer A: "router b → router a → computer A" and "router b → router d → router c → router a → computer A". For this reason, in the routing information stored in router b, "router a" and "router d", the transmission destinations corresponding to these two communication paths, are set as the transmission destinations of information destined for computer A.
[0042]
Further, each router monitors whether any of the transmission destinations included in its own routing information has become unable to communicate, and whether a transmission destination that was previously unable to communicate has recovered (become able to communicate again). When a router detects that a communication disabled state has occurred or has been resolved, it updates its own routing information accordingly. In addition, each router notifies its neighboring routers of its own routing information as appropriate. When the routing information notified by a neighboring router has changed (a change that likewise arises when the neighboring router loses, or regains, communication with one of its transmission destinations), the router updates its own routing information in accordance with the detected change (dynamic routing).
[0043]
Specifically, for example, as shown in FIG. 2B, when a failure occurs in the communication line between router a and router b, so that router a and router b cannot communicate with each other and the communication paths through that communication line become unavailable, the communication disabled state is detected by each of routers a and b. In router a, among the transmission destinations set in the routing information of router a, the information of the transmission destination in the communication disabled state ("router b", one of the transmission destinations of information destined for computer B) is invalidated. In router b, among the transmission destinations set in the routing information of router b, the information of the transmission destination in the communication disabled state ("router a", which is one of the transmission destinations of information destined for computer A and also one of the transmission destinations of information destined for computer B) is invalidated.
[0044]
Further, routers a and b notify their neighboring routers (routers c and d) of their routing information. As a result, in router c, among the transmission destinations set in the routing information of router c, the information of the transmission destinations corresponding to communication paths via the communication line in the communication disabled state ("router d", one of the transmission destinations of information destined for computer A, and "router a", one of the transmission destinations of information destined for computer B) is invalidated. Likewise, in router d, among the transmission destinations set in the routing information of router d, the information of the transmission destination corresponding to a communication path via the communication line in the communication disabled state ("router b", one of the transmission destinations of information destined for computer A) is invalidated. In this way, the routing information of each router is updated in accordance with the communication disabled state between router a and router b (the failure of the communication line).
[0045]
Further, for example, as shown in FIG. 2C, when a failure occurs in the communication line between router d and computer B, or in computer B itself, and communication between router d and computer B becomes impossible, the communication disabled state is detected by router d. In router d, among the transmission destinations set in the routing information of router d, "computer B", the transmission destination of information destined for computer B, is invalidated.
[0046]
Also, router d notifies its neighboring routers b and c of its routing information, and routers b and c in turn notify router a of their routing information. As a result, in router b, among the transmission destinations set in the routing information of router b, "router d" and "router a", the transmission destinations of information destined for computer B, are each invalidated. In router c, among the transmission destinations set in the routing information of router c, "router d" and "router a", the transmission destinations of information destined for computer B, are each invalidated. In router a, among the transmission destinations set in the routing information of router a, "router b" and "router c", the transmission destinations of information destined for computer B, are each invalidated. In this way, the routing information of each router is updated in accordance with the communication disabled state between router d and computer B caused by the failure of the communication line between them or of computer B. More specifically, each router according to the present embodiment, operating as described above, corresponds to the router described in claim 2.
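The three states of FIG. 2 described above can be reproduced with a small model. The sketch below is an illustration only, not the routing protocol itself: it assumes that a neighbor counts as a valid transmission destination when the destination can be reached from that neighbor without passing back through the router in question, which matches the loop-free paths listed in the example. The names `LINKS`, `neighbours`, `valid_next_hops`, and `routing_tables` are hypothetical.

```python
from collections import deque

# FIG. 2 topology: computers "A", "B" and routers "a" to "d" (undirected links)
LINKS = {("A", "a"), ("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "B")}
ROUTERS = ("a", "b", "c", "d")

def neighbours(links, node):
    return {v if u == node else u for u, v in links if node in (u, v)}

def valid_next_hops(links, router, destination):
    """Neighbours of `router` from which `destination` is reachable without revisiting `router`."""
    hops = set()
    for first in neighbours(links, router):
        seen, queue = {router, first}, deque([first])
        while queue:  # breadth-first search starting at the candidate next hop
            node = queue.popleft()
            if node == destination:
                hops.add(first)
                break
            for nxt in neighbours(links, node) - seen:
                seen.add(nxt)
                queue.append(nxt)
    return hops

def routing_tables(links):
    return {r: {dest: valid_next_hops(links, r, dest) for dest in ("A", "B")}
            for r in ROUTERS}

normal = routing_tables(LINKS)                 # FIG. 2A: no failure
fig_2b = routing_tables(LINKS - {("a", "b")})  # FIG. 2B: line a-b failed
fig_2c = routing_tables(LINKS - {("d", "B")})  # FIG. 2C: line d-B (or computer B) failed
```

In this model the line failure of FIG. 2B leaves at least one transmission destination for every destination, whereas the failure of FIG. 2C removes every transmission destination for computer B in all four routers.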
[0047]
In addition, when a previously occurring failure is recovered and each router detects this, the routing information of each router is updated; specifically, the corresponding information that had been invalidated is validated again. Methods of mutual notification of routing information between adjacent routers include a method of notifying the routing information periodically (for example, every 30 seconds) and a method of notifying only the difference (the changed portion) when the routing information changes (is updated); either method may be adopted.
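The second notification method, in which only the changed portion is exchanged, can be illustrated as follows. This is a hedged sketch: the function `routing_diff`, the result keys, and the table layout (destination mapped to a set of valid transmission destinations) are assumptions for the example and do not represent any particular routing protocol's message format.

```python
def routing_diff(old, new):
    """Return only the (destination, transmission destination) pairs that changed."""
    old_pairs = {(dest, hop) for dest, hops in old.items() for hop in hops}
    new_pairs = {(dest, hop) for dest, hops in new.items() for hop in hops}
    return {"invalidated": old_pairs - new_pairs,
            "validated": new_pairs - old_pairs}

# Hypothetical tables of router a before and after the line a-b fails
before = {"computer A": {"computer A"}, "computer B": {"router b", "router c"}}
after = {"computer A": {"computer A"}, "computer B": {"router c"}}

update = routing_diff(before, after)
recovery = routing_diff(after, before)
```

A periodic full notification would send `after` in its entirety every cycle, while the differential method would send only `update` when the table changes.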
[0048]
Next, a failure monitoring process realized by the CPU 36A of the failure monitoring terminal 36 executing the failure monitoring program will be described with reference to the flowchart of FIG. The failure monitoring process described below is a process to which the failure monitoring method according to the present invention is applied, and is repeatedly executed by the failure monitoring terminal 36 at a predetermined cycle.
[0049]
In the present embodiment, a plurality of routers from which routing information is to be obtained are determined in advance. In step 100, a single router for which the processing described below has not yet been executed is selected from among these routers, and remote login to the selected router is attempted by transmitting information for remote login to it.
[0050]
In the present embodiment, all the backbone routers 28A and 28B provided for each LAN 12 (that is, the backbone routers 28A and 28B located closest to the F/W routers 24 provided at the entrance of the LAN 12A, and the backbone routers 28A and 28B located on the failure monitoring terminal side of the F/W routers 24 provided at the entrances of the LANs 12B to 12D and closest to those F/W routers 24) are the routers from which routing information is to be acquired. The information transmitted in step 100 to a backbone router 28 from which routing information is to be acquired is received by that backbone router 28 without being discarded by the firewall function of the F/W routers 24.
[0051]
In the next step 102, whether login to the backbone router 28 from which routing information is to be acquired has succeeded is determined by judging whether information corresponding to a normal response has been received from the backbone router 28 to which the information for remote login was transmitted. If the determination is negative, it can be inferred that a failure has occurred in that backbone router 28 itself. In that case, the administrator or the like is notified of the failure in the backbone router 28 by outputting a message to that effect (for example, displaying the message on the display 38), and the process returns to step 100.
[0052]
On the other hand, if the determination in step 102 is affirmative, the process proceeds to step 106, in which the backbone router 28 that has been logged in to is requested to transmit its routing information. The routing information of that backbone router 28 is obtained by receiving the routing information that the backbone router 28 reads from its nonvolatile memory 32 and transmits in response to this request. Steps 100 to 106 described above correspond to the acquisition means according to the present invention.
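The acquisition flow of steps 100 to 106 can be sketched as follows. The specification does not state how remote login or the routing information request is carried out, so `login`, `fetch_routes`, `notify`, and `LoginError` are hypothetical placeholders supplied by the caller, not an actual router interface.

```python
class LoginError(Exception):
    """Raised by the hypothetical `login` callback when no normal response is received."""

def acquire_all(routers, login, fetch_routes, notify):
    """Try each target router once; report and skip routers that cannot be logged in to."""
    tables = {}
    for router in routers:                    # step 100: select the next unprocessed router
        try:
            session = login(router)           # step 102: did remote login succeed?
        except LoginError:
            notify(f"failure suspected in {router} itself (remote login failed)")
            continue                          # return to step 100 for the next router
        tables[router] = fetch_routes(session)  # step 106: request and receive routing info
    return tables
```

For instance, calling `acquire_all(["28A", "28B"], login, fetch_routes, print)` with suitable callbacks would return one table per reachable backbone router and print one message per unreachable one.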
[0053]
Step 108 and the subsequent steps correspond to the determination means according to the present invention. In step 108, the routing information corresponding to the backbone router 28 from which routing information was acquired is read from the registration table 48 stored in the HDD 44, and the routing information read from the registration table 48 is compared with the routing information obtained from the backbone router 28. Then, in step 110, it is determined whether every item of information included in the routing information read from the registration table 48 exists as valid information in the routing information obtained from the backbone router 28.
[0054]
If the determination is affirmative, the process proceeds to step 114 without performing any processing. If the determination is negative, it can be determined that a failure has occurred in the computer network 10, so the process proceeds to step 112. There, of the information included in the routing information read from the registration table 48, all information that is missing from the routing information acquired from the backbone router 28 (not existing as valid information) is extracted as missing information and stored in the RAM 36C or the HDD 44.
[0055]
Note that, because each router in the computer network 10 performs dynamic routing, the routing information of each router changes (part of the information is lost) even when a failure occurs inside any of the LANs 12, so in that case as well the missing information is stored in the RAM 36C or the HDD 44.
[0056]
In step 114, the routing information obtained from the backbone router 28 is compared with the corresponding missing information stored in the RAM 36C or the HDD 44 when the failure monitoring process was last executed. In the next step 116, it is determined whether any of that missing information has been restored (become valid) in the acquired routing information, that is, whether a failure that had occurred at the time of the previous failure monitoring process has been recovered. If the determination is negative, the process proceeds to step 120 without performing any processing. If the determination is affirmative, the process proceeds to step 118, and the restored missing information is stored in the RAM 36C or the HDD 44.
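The recovery check of steps 114 to 118 can be sketched in the same style. The function name `find_recovered` and the table layout (destination mapped to a set of transmission destinations) are assumptions made for this illustration.

```python
def find_recovered(previous_missing, acquired):
    """Return the entries recorded as missing last cycle that are valid again in `acquired`."""
    recovered = {}
    for destination, hops in previous_missing.items():
        back = set(hops) & set(acquired.get(destination, ()))
        if back:
            recovered[destination] = back
    return recovered

# Hypothetical state: missing information stored by the previous monitoring cycle
previous_missing = {"computer A": {"router a"}, "computer B": {"router a"}}
# Routing information acquired in the current cycle
acquired = {"computer A": {"router a", "router d"}, "computer B": {"router d"}}

restored = find_recovered(previous_missing, acquired)
```

In this example the entry for computer A has been restored, while the entry for computer B is still missing and would again be recorded as missing information in the current cycle.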
[0057]
In step 120, it is determined whether routing information has been obtained from all the routers from which routing information is to be obtained. If the determination is negative, the process returns to step 100, and steps 100 to 120 are repeated. As a result, routing information is obtained from all the target routers. If any of the information included in the routing information read from the registration table 48 is missing from the obtained routing information, that information is stored as missing information, and if any of the missing information stored in the previous failure monitoring process has been restored in the obtained routing information, that information is stored as restored missing information.
[0058]
If the determination in step 120 is affirmative, the process proceeds to step 122, where it is determined whether any missing information is stored in the RAM 36C or the HDD 44 (missing information stored in the preceding step 112). If the determination in step 122 is negative, it can be determined that no failure has occurred in the computer network 10 (or that no failure subject to monitoring has occurred), so the process proceeds to step 132 without performing any processing.
[0059]
On the other hand, if the determination in step 122 is affirmative, it can be determined that some kind of failure has occurred in the computer network 10, so the process proceeds to step 124, where the location of the failure is identified based on the missing information stored in the RAM 36C or the HDD 44. When a failure occurs in a LAN 12, its occurrence can be detected from a change in the routing information of each router (loss of some information). Moreover, since the routing information is obtained from the backbone routers 28 located closest to the F/W routers 24 provided at the entrance of each LAN 12, the location of the failure can be narrowed down accurately (roughly identified) even when the failure occurs inside a specific LAN 12, for example to the extent that the failure has occurred in that specific LAN 12.
[0060]
Then, in the next step 126, it is determined, based on the failure location identified in step 124, whether there is a section in the computer network 10 in which communication is impossible. For example, four communication paths are provided, as described above, in the section between the backbone routers 28 and the F/W routers 24 corresponding to each LAN 12. When all four of these communication paths are in the communication disabled state (for example, when failures have occurred in all four communication paths, or when failures have occurred in both of the F/W routers 24A and 24B), information cannot be transmitted or received over any communication path that includes that section. When such a state occurs, all the transmission destinations corresponding to a specific destination are missing from the routing information acquired from the backbone router 28, as shown in FIG. 2C, for example.
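The condition under which the section between the backbone routers 28 and the F/W routers 24 becomes unable to communicate can be illustrated with the sketch below. It is a simplification under stated assumptions: each of the four backbone-to-F/W pairs is treated as an independent path, and the names `usable_paths` and `section_down` and their parameters are hypothetical.

```python
from itertools import product

BACKBONE = ("28A", "28B")
FIREWALL = ("24A", "24B")

def usable_paths(failed_routers=(), failed_links=()):
    """Return the backbone-to-F/W paths that remain usable given the stated failures."""
    failed_routers, failed_links = set(failed_routers), set(failed_links)
    return [(b, f) for b, f in product(BACKBONE, FIREWALL)
            if b not in failed_routers
            and f not in failed_routers
            and (b, f) not in failed_links]

def section_down(failed_routers=(), failed_links=()):
    """The section is unable to communicate only when none of the four paths is usable."""
    return not usable_paths(failed_routers, failed_links)

one_fw_lost = section_down(failed_routers=["24A"])           # two paths remain
both_fw_lost = section_down(failed_routers=["24A", "24B"])   # no path remains
```

With a single F/W router failed, two of the four paths remain and the situation corresponds to a reduced but non-empty set of transmission destinations; with both failed, no path remains and all transmission destinations for the affected destinations disappear.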
[0061]
If it is determined, based on the failure location identified in step 124, that a section in which communication is impossible exists in the computer network 10, the determination in step 126 is affirmative and the process proceeds to step 130. There, a failure notification message reporting that a serious failure (loss of communication) has occurred is output (for example, displayed on the display 38), with the section in which communication is impossible clearly indicated.
[0062]
If it is determined that no section in which communication is impossible exists in the computer network 10 (in this case, as shown in FIG. 2B for example, some of the transmission destinations corresponding to a specific destination are missing from the routing information acquired from the backbone router 28, but others remain), the determination in step 126 is negative and the process proceeds to step 128. There, a warning message reporting that a minor failure has occurred (communication is still possible, but some communication paths are unable to communicate) is output (for example, displayed on the display 38), with the identified failure location clearly indicated.
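The distinction drawn in steps 126 to 130 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that routing information is represented as a mapping from each destination to the set of its transmission destinations; the function name `classify_destinations` and labels such as `LAN-B` and `fw-24A` are hypothetical and do not appear in the embodiment.

```python
from typing import Dict, Set

# Assumed representation: destination -> set of transmission destinations (next hops).
RoutingTable = Dict[str, Set[str]]


def classify_destinations(registered: RoutingTable, acquired: RoutingTable) -> Dict[str, str]:
    """Label each destination that lost transmission destinations.

    'failure' : no registered transmission destination remains (cf. FIG. 2C).
    'warning' : some are missing but at least one remains (cf. FIG. 2B).
    Destinations with nothing missing are omitted from the result.
    """
    result: Dict[str, str] = {}
    for destination, expected_hops in registered.items():
        current_hops = acquired.get(destination, set())
        missing = expected_hops - current_hops
        if not missing:
            continue  # nothing lost for this destination
        remaining = expected_hops & current_hops
        result[destination] = "warning" if remaining else "failure"
    return result


# Hypothetical registered (failure-free) routing information.
registered = {
    "LAN-A": {"fw-24A", "fw-24B"},
    "LAN-B": {"fw-24A", "fw-24B"},
    "LAN-C": {"fw-24A", "fw-24B"},
}
# Hypothetical routing information acquired from a backbone router.
acquired = {
    "LAN-A": {"fw-24A", "fw-24B"},  # unchanged
    "LAN-B": {"fw-24A"},            # one transmission destination missing
    # "LAN-C" has disappeared entirely
}
status = classify_destinations(registered, acquired)
```

In this sketch a destination mapped to `"failure"` would correspond to the failure notification message of step 130, and one mapped to `"warning"` to the warning message of step 128.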
[0063]
As a result, the administrator can easily recognize that a failure has occurred in the computer network 10, where it has occurred, and how severe it is, and can therefore take appropriate measures.
[0064]
In the next step 132, it is determined whether restored missing information (stored in the preceding step 118) exists in the RAM 36C or the HDD 44. If the determination is negative, the failure monitoring process ends. If the determination is affirmative, the process proceeds to step 134, where a recovery message reporting that a previously occurring failure has been recovered is output (for example, displayed on the display 38) based on the restored missing information stored in the RAM 36C or the HDD 44, and the failure monitoring process then ends. As a result, the administrator can recognize that a failure that previously occurred in the computer network 10 has been recovered.
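One way to picture the recovery check of steps 132 and 134 is as a set difference between the entries that were missing in the previous cycle and those missing now. The Python sketch below assumes that each piece of missing information is a (destination, transmission destination) pair; the names `update_missing` and `recovery_messages`, the message wording, and the sample labels are hypothetical.

```python
from typing import List, Set, Tuple

# Assumed representation of one piece of missing information.
MissingEntry = Tuple[str, str]  # (destination, transmission destination)


def update_missing(
    previous_missing: Set[MissingEntry],
    current_missing: Set[MissingEntry],
) -> Tuple[Set[MissingEntry], Set[MissingEntry]]:
    """Return (restored, newly_missing) between two monitoring cycles."""
    restored = previous_missing - current_missing
    newly_missing = current_missing - previous_missing
    return restored, newly_missing


def recovery_messages(restored: Set[MissingEntry]) -> List[str]:
    """Build one recovery message per restored entry, in a stable order."""
    return [f"Recovered: route to {dest} via {hop}" for dest, hop in sorted(restored)]


# Hypothetical data: one route came back since the previous cycle.
previous = {("LAN-B", "fw-24B"), ("LAN-C", "fw-24A")}
current = {("LAN-C", "fw-24A")}
restored, newly_missing = update_missing(previous, current)
messages = recovery_messages(restored)
```

If `restored` is empty, no recovery message would be produced, which matches the negative branch of step 132.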
[0065]
Because the failure monitoring process described above is executed repeatedly at a predetermined cycle, when a failure occurs in the computer network 10 it can be detected quickly, its location determined immediately, and the administrator notified. This prevents the computer network 10 from being left in a failed state for a long time, and prevents a failure inside a LAN 12 from making it take a long time to identify the failure location.
[0066]
Further, suppose for example that an abnormality arises in which information cannot be exchanged between a specific server computer 16, provided in a specific LAN 12 to perform application processing that supports a specific task, and a specific computer provided outside that LAN 12. If there is no problem on the specific computer side and no failure has occurred outside the specific LAN 12, the cause of the abnormality is either "a problem in the application processing performed on the specific server computer 16 (for example, a bug in an application program)" or "a failure inside the specific LAN 12", but the conventional technology could not distinguish between the two. In contrast, in the computer network 10 according to the present embodiment, a failure notification message or a warning message is output whenever a failure occurs in the computer network 10, even if the failure location is inside a LAN 12. It can therefore easily be determined whether the network is operating normally, and the above distinction can also be made easily.
[0067]
The above description covers an example in which routing information is collected from a plurality of routers. However, the present invention is not limited to this, and routing information may be collected from only a single router. Considering, however, that a failure may occur in the router from which routing information is collected and make collection impossible, it is preferable either to collect routing information from a plurality of routers or, when a failure has occurred in the collection-target router, to collect routing information from another router (that is, to change the collection-target router dynamically).
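The dynamic change of the collection-target router could be sketched as follows. This is only an illustration under stated assumptions: the actual collection mechanism is abstracted as a caller-supplied `fetch` function that raises `OSError` (or a subclass such as `ConnectionError`) when a router cannot be reached, and the router names and the function `collect_with_fallback` are hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional, Set, Tuple

# Assumed representation: destination -> set of transmission destinations.
RoutingTable = Dict[str, Set[str]]


def collect_with_fallback(
    routers: Iterable[str],
    fetch: Callable[[str], RoutingTable],
) -> Tuple[Optional[str], Optional[RoutingTable]]:
    """Try each candidate router in order and return the first successful result.

    Returns (router_name, routing_table), or (None, None) if every
    candidate router fails to respond.
    """
    for router in routers:
        try:
            return router, fetch(router)
        except OSError:
            continue  # this router is unreachable; try the next candidate
    return None, None


def fake_fetch(router: str) -> RoutingTable:
    """Stand-in for a real collection mechanism (hypothetical)."""
    if router == "backbone-28A":
        raise ConnectionError("router unreachable")
    return {"LAN-A": {"fw-24A"}}


source, table = collect_with_fallback(["backbone-28A", "backbone-28B"], fake_fetch)
```

Here the first candidate fails, so the routing information is taken from the second. A result of `(None, None)` would itself indicate that no collection-target router could be reached.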
[0068]
In the above, the computer network 10, in which a plurality of LANs are interconnected, has been described as an example of the network according to the present invention. However, the present invention is not limited to this, and it goes without saying that the present invention can be applied to a network of any configuration.
[0069]
[Effect of the invention]
As described above, the present invention is applied to a network that includes a plurality of routers connected to each other via communication lines, in which each router stores, for each destination, routing information indicating the transmission destination to which received information addressed to that destination is to be sent, forwards received information by selecting a transmission destination according to the routing information corresponding to its destination, and updates the stored routing information as needed according to the state of the network. In such a network, routing information is obtained from a specific router and compared with the routing information that should originally be stored in that router, and the presence or absence of a failure and the failure location are determined from the comparison. The invention therefore has the excellent effect that, even in a network that includes a local network whose actual structure is unknown, the occurrence of a failure in the network can be monitored without directly accessing any computer.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a schematic configuration of a computer network according to an embodiment.
FIG. 2 is a schematic diagram for explaining updating of routing information by a router.
FIG. 3 is a flowchart illustrating details of a failure monitoring process executed by the failure monitoring terminal.
[Explanation of symbols]
10 Computer network
12 LAN
16 Server computer
24 F/W router
28 Backbone router
32 Non-volatile memory
34 Routing information
36 Failure monitoring terminal
38 Display
44 HDD
48 Registration table

Claims

A fault monitoring device connected to a network that includes a plurality of routers connected to each other via communication lines, each router having a function of storing, for each destination, routing information indicating the transmission destination of received information to be delivered to that destination, a function of transmitting received information by selecting a transmission destination according to the routing information corresponding to the destination of the information, and a function of updating the stored routing information as needed according to the state of the network, the fault monitoring device monitoring the occurrence of a failure in the network and comprising:
acquisition means for acquiring routing information from a specific router;
storage means in which the routing information that should originally be stored in the specific router is registered in advance; and
determination means for determining the presence or absence of a failure in the network and the location of the failure by comparing the routing information acquired by the acquisition means with the routing information registered in the storage means.

The fault monitoring device according to claim 1, wherein each of the routers updates the routing information according to the state of the network by updating its own routing information when it becomes unable to communicate with any of the transmission destinations included in the routing information it stores, by exchanging routing information with neighboring routers, and by also updating its own routing information, in accordance with the change, when the routing information notified from another router changes.

The fault monitoring device according to claim 1, wherein the network includes a local network connected via a firewall that blocks the transmission and reception of information necessary for failure monitoring, and
the acquisition means acquires at least the routing information of the router that is located on the fault monitoring device side of the firewall and closest to the firewall.

The fault monitoring device according to claim 1, wherein the determination means determines the presence or absence of a failure in the network and the location of the failure by determining whether each transmission destination for each destination included in the routing information registered in the storage means is present in the routing information acquired by the acquisition means.

The fault monitoring device according to claim 4, wherein, when the determination means recognizes that there is a specific destination for which the number of corresponding transmission destinations in the routing information acquired by the acquisition means has decreased, the determination means issues a warning if a transmission destination corresponding to the specific destination remains in the acquired routing information, and reports the occurrence of a failure if no transmission destination corresponding to the specific destination remains in the acquired routing information.

The fault monitoring device according to claim 1, wherein the acquisition means periodically acquires routing information, and the determination means periodically determines the presence or absence of a failure in the network and the location of the failure.

A failure monitoring method for a network that includes a plurality of routers connected to each other via communication lines, each router having a function of storing, for each destination, routing information indicating the transmission destination of received information to be delivered to that destination, a function of transmitting received information by selecting a transmission destination according to the routing information corresponding to the destination of the information, and a function of updating the stored routing information as needed according to the state of the network, the method comprising:
registering in advance the routing information that should originally be stored in a specific router from which routing information is to be acquired;
acquiring routing information from the specific router; and
comparing the acquired routing information with the pre-registered routing information to determine the presence or absence of a failure in the network and the location of the failure.

A program for causing a computer connected to a network that includes a plurality of routers connected to each other via communication lines, each router having a function of storing, for each destination, routing information indicating the transmission destination of received information to be delivered to that destination, a function of transmitting received information by selecting a transmission destination according to the routing information corresponding to the destination of the information, and a function of updating the stored routing information as needed according to the state of the network, to function as:
acquisition means for acquiring routing information from a specific router; and
determination means for determining the presence or absence of a failure in the network and the location of the failure by comparing the routing information acquired by the acquisition means with pre-registered routing information that should originally be stored in the specific router.