JP2018152672A

JP2018152672A - Monitoring method, monitoring device, and program

Info

Publication number: JP2018152672A
Application number: JP2017046387A
Authority: JP
Inventors: Yasuaki Mori; Sotohiro Kobayashi; Masato Ichihashi; Yuichi Namioka; Yoichi Matsumura; Tomokazu Saito
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2017-03-10
Filing date: 2017-03-10
Publication date: 2018-09-27
Anticipated expiration: 2037-03-10
Also published as: JP6903960B2

Abstract

PROBLEM TO BE SOLVED: To distribute monitoring loads.
SOLUTION: A monitoring method for a network including a plurality of nodes includes the steps of: classifying the plurality of nodes according to their hop count from a monitoring device, measured on the minimum-hop communication path to the monitoring device, which is communicably connected to the network and monitors the network; and, when setting monitoring relationships among the nodes so that a node with a smaller hop count monitors a node with a larger hop count, allocating the monitorable monitoring-destination nodes so that the monitoring load is balanced among nodes having the same hop count.
SELECTED DRAWING: Figure 9

Description

The present invention relates to a monitoring method, a monitoring device, and a program.

There are network systems that include a plurality of network devices (hereinafter also referred to as “nodes”) and a network controller (hereinafter also referred to as a “monitoring server”) that monitors and controls each network device. The monitoring server performs monitoring by communicating directly with every monitored node, for example in a so-called server-client arrangement.

JP 2011-130144 A
JP 2013-47922 A

As networks have grown in scale in recent years, the number of nodes monitored by a monitoring server has increased, and so has the load on the monitoring server. Dealing with the increased load by adding monitoring servers or by adopting high-specification (high-performance) monitoring servers raises equipment costs.

An object of the present invention is to provide a technique capable of distributing the monitoring load.

One aspect is a program that causes a computer included in a monitoring device, which is communicably connected to a network including a plurality of nodes and monitors the network, to execute: a process of classifying the plurality of nodes according to their hop count from the monitoring device on the minimum-hop communication path to the monitoring device; and a process of, when setting monitoring relationships among the nodes so that a node with a smaller hop count monitors a node with a larger hop count, allocating the monitorable monitoring-destination nodes so that the monitoring load is balanced among nodes having the same hop count.

In one aspect, the monitoring load can be distributed.

FIG. 1 shows an example of a network system according to the embodiment.
FIG. 2 shows an example of a tree.
FIG. 3 is a diagram illustrating a hardware configuration example of an information processing apparatus (computer) applicable to the monitoring server and the nodes.
FIG. 4 is an explanatory diagram of monitoring cost information.
FIG. 5 shows an example in which layers and monitoring costs have been determined for a plurality of nodes in the network system shown in FIG. 1.
FIG. 6 shows an example of the allocation of child nodes in the example of FIG. 5.
FIG. 7 is a diagram illustrating an example of a method of allocating child nodes among parent nodes.
FIG. 8 is a diagram illustrating an example of a method of allocating child nodes among parent nodes.
FIG. 9 is a flowchart illustrating an example of a monitoring process executed by the CPU of the monitoring server (the information processing apparatus operating as the monitoring server).
FIG. 10 is a flowchart illustrating an example of the processing a node performs as a parent node.
FIG. 11 is a flowchart illustrating a processing example of the CPU of the monitoring server when a communication disconnection is detected.

Hereinafter, a monitoring method, a monitoring device, and a program according to an embodiment will be described with reference to the drawings. The configuration of the embodiment is an example, and the present invention is not limited to it.

<Network configuration>
FIG. 1 shows an example of a network system according to the embodiment. The network system includes a monitoring server 1 that monitors the network and a plurality of nodes 2 included in the network. The monitoring server 1 is an example of a “monitoring device”.

The plurality of nodes 2 are network devices monitored and controlled by the monitoring server 1. The nodes 2 (network devices) include terminal devices and relay devices. The terminal devices include, for example, a personal computer, a workstation, a server machine, a tablet terminal, a smartphone, and a sensor terminal with a communication function (also called a sensor node). The relay devices include a router, a layer 2 / layer 3 switch, a hub, and the like. The network devices may also include devices other than these examples. The topology among the nodes 2 can be chosen freely as long as each node 2 is connected to the monitoring server 1 directly or indirectly (via other nodes 2). The number of nodes 2 can also be set as appropriate.

The monitoring server 1 monitors and controls the plurality of nodes 2. Simple Network Management Protocol (SNMP), ping, telnet, or the like can be used as the monitoring protocol. Protocols other than these may also be used for monitoring.

The monitoring server 1 in the embodiment forms a tree with the monitoring server 1 as its root. FIG. 2 shows an example of the tree. The tree is used as the route along which monitoring information is transferred from the leaves of the tree toward the root.

A node 2 located one hop from the monitoring server 1 is set as a “representative node”. Nodes other than the representative nodes are set as “general nodes”. A general node is connected to a representative node or to another node.

In the example tree shown in FIG. 2, two representative nodes, representative node #1 and representative node #2, are set. The first hop, to which the representative nodes belong, is called “layer 1”. The second and subsequent hops from the monitoring server 1 are called layers 2, 3, and so on. The example of FIG. 2 shows general nodes a to e: the general nodes a and d belong to layer 2, and the general nodes b, c, and e belong to layer 3. Each arrow in FIG. 2 points from a child node to its parent node.

A representative node forms a parent-child relationship with the monitoring server 1, and a general node forms a parent-child relationship with a representative node or another general node. In the example shown in FIG. 2, nodes 2 serve as general nodes a to e. The general node a is a child node of the representative node #1, and the general node d is a child node of the representative node #2. The general node a is the parent node of the general nodes b and c, and the general node d is the parent node of the general node e.

The monitoring server 1 generates the tree (the parent-child relationships of the nodes) and transmits monitoring control data to each node 2. The monitoring control data includes data indicating the parent-child relationships relevant to each node 2. For example, the monitoring control data includes the following information.
- Information indicating one or more child nodes to be monitored (the monitored nodes)
- Information indicating the upper-layer node 2 or the monitoring server 1 to which the monitoring result is reported (the destination of the monitoring result)
- Information indicating the items to be monitored, for example, the alarms and states to be monitored
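
As an illustration only, the three items of monitoring control data listed above could be carried in a small record like the following Python sketch. The class name `MonitoringControlData`, its field names, and the item strings are assumptions made for this example; the source does not define a data format.

```python
from dataclasses import dataclass, field


@dataclass
class MonitoringControlData:
    """Hypothetical container for the data sent to one node 2."""
    node_id: str                      # node that receives this data
    child_nodes: list[str]            # monitored child nodes (may be empty)
    report_to: str                    # parent node or the monitoring server
    monitored_items: list[str] = field(default_factory=list)  # alarms/states


# Data for general node a in the tree of FIG. 2:
# it monitors general nodes b and c and reports to representative node #1.
data_for_a = MonitoringControlData(
    node_id="a",
    child_nodes=["b", "c"],
    report_to="#1",
    monitored_items=["link_down_alarm", "operating_state"],  # illustrative names
)
```

A leaf node such as general node b would receive the same record with an empty `child_nodes` list.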

Each node 2 monitors its child nodes according to the monitoring control data. Monitoring comprises two operations: “status collection”, in which the node collects monitoring results from a child node, and “event notification”, in which the node receives information that a child node sends on its own initiative. Each node 2 adds its own monitoring result to the monitoring results received from its child nodes and transmits them to its parent node.
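
The upward flow of monitoring results can be sketched as follows. This is a simplified single-process model, assuming each node's own result is available in a dictionary; the function name `collect` and the `"ok"` status values are hypothetical, and real nodes would exchange these results over the network.

```python
def collect(node, children, own_result):
    """Return the results a node forwards to its parent: its children's
    results plus its own (a recursive stand-in for per-node reporting)."""
    results = {}
    for child in children.get(node, []):
        results.update(collect(child, children, own_result))
    results[node] = own_result[node]
    return results


# Parent -> children relation of the tree in FIG. 2.
children = {
    "server": ["#1", "#2"],
    "#1": ["a"], "#2": ["d"],
    "a": ["b", "c"], "d": ["e"],
}
own_result = {n: "ok" for n in ["server", "#1", "#2", "a", "b", "c", "d", "e"]}

at_rep1 = collect("#1", children, own_result)
print(sorted(at_rep1))  # ['#1', 'a', 'b', 'c']
```

Representative node #1 ends up holding the results of its whole subtree, so the monitoring server only needs to talk to the representative nodes.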

<Hardware configuration>
FIG. 3 is a diagram illustrating a hardware configuration example of an information processing apparatus (computer) applicable to the monitoring server 1 and the nodes 2. As an example, the information processing apparatus 10 includes a Central Processing Unit (CPU) 11, a main storage device 12, an auxiliary storage device 13, a communication interface (communication IF) 14, an input device 15, an output device 16, and a sensor 17, which are connected to one another via a bus B1.

The main storage device 12 is used as a program loading area, a work area for the CPU 11, a storage area for data and programs, a buffer area for communication data, and the like. The main storage device 12 is formed of, for example, a Random Access Memory (RAM), or a combination of a RAM and a Read Only Memory (ROM).

The auxiliary storage device 13 is used as a storage area for data and programs. The auxiliary storage device 13 is formed of a nonvolatile storage medium such as a hard disk drive (HDD), a Solid State Drive (SSD), a flash memory, or an Electrically Erasable Programmable Read-Only Memory (EEPROM). Each of the main storage device 12 and the auxiliary storage device 13 is an example of a “storage device”, a “storage medium”, a “memory”, and a “storage unit”.

The communication IF 14 handles communication processing. For example, a Network Interface Card (NIC) is used as the communication IF 14. The input device 15 is, for example, a key, a button, a pointing device (such as a mouse), a touch panel, or a voice input device (microphone). The output device 16 is, for example, a display, a printer, a speaker, or a lamp.

The CPU 11 loads the program stored in the auxiliary storage device 13 into the main storage device 12 and executes it. By executing the program, the information processing apparatus 10 operates as the monitoring server 1. The CPU 11 generates the tree described above and transmits monitoring control data based on the tree to each node 2. The CPU 11 also receives the monitoring results from the nodes 2 and controls the nodes 2 and the network through analysis (monitoring) of those results. By executing the program, the CPU 11 can operate as a “classification unit” and an “allocation unit”.

The CPU 11 is an example of a “control device”, a “control unit”, a “controller”, and a “processor”. The CPU 11 is also called an MPU (Microprocessor) or a processor. The CPU 11 is not limited to a single processor and may have a multiprocessor configuration. A single CPU connected through a single socket may have a multi-core configuration. At least part of the processing performed by the CPU 11 may be executed by multiple cores or by a plurality of CPUs. At least part of the processing performed by the CPU may be performed by a processor other than the CPU, for example, a dedicated processor such as a Digital Signal Processor (DSP), a Graphics Processing Unit (GPU), a numerical operation processor, a vector processor, or an image processing processor.

At least part of the processing performed by the CPU 11 may be performed by an integrated circuit (IC) or another digital circuit. The integrated circuit or digital circuit may include an analog circuit. Integrated circuits include an LSI, an Application Specific Integrated Circuit (ASIC), and a programmable logic device (PLD). PLDs include, for example, a Field-Programmable Gate Array (FPGA). At least part of the processing performed by the CPU 11 may be executed by a combination of a processor and an integrated circuit. Such a combination is called, for example, a microcontroller (MCU), an SoC (System-on-a-chip), a system LSI, or a chipset.

<Monitoring server processing>
Next, the processing by which the monitoring server 1 generates and transmits the tree and the monitoring control data will be described. For example, the auxiliary storage device 13 of the information processing apparatus 10 operating as the monitoring server 1 stores topology information, node information, monitoring cost information, tree information, and the like, as shown in FIG. 2.

The topology information indicates how the nodes 2 are connected. The node information includes, for each node 2, its network addresses (Internet Protocol (IP) address and Media Access Control (MAC) address), its device type, and information indicating its device configuration. The monitoring cost information includes information used to calculate the cost incurred in monitoring each node 2. The tree information includes the layers, the nodes belonging to each layer, and information indicating the parent-child relationships of the nodes. The CPU 11 uses these pieces of information when it generates the tree and the monitoring control data by executing the program.

<<Determination of the monitoring cost of each node>>
The monitoring server 1 assigns each monitored node 2 a weight for the resources used for monitoring (such as the usage of the CPU 11 and of memory such as the main storage device 12). The weight represents the load placed on the monitoring server 1, such as the usage time and usage amount of the CPU 11 and memory used for monitoring and the data collection time. This weight is called the “monitoring cost”.

FIG. 4 is an explanatory diagram of the monitoring cost information. As an example, the monitoring cost is determined by a combination of monitoring cost elements such as those shown in the table of FIG. 4. The example of FIG. 4 lists “number of alarms/states collected”, “collection interface”, “collection complexity”, and “collection time” as monitoring cost elements. The monitoring cost elements may be at least one selected from these, and elements other than these may also be adopted.

The “number of alarms/states collected” indicates the number of alarms and states to be monitored (an example of the number of monitored information items). The monitoring cost grows in proportion to the number collected. For example, the number collected is used directly as the cost value: if 15 items are collected, the cost value is 15.

The “collection interface” indicates the method used to collect monitoring information. For example, monitoring information may be collected using SNMP TRAP or using SNMP GET. With GET, the SNMP agent returns a response (monitoring information) in reply to a request from the SNMP manager. With TRAP, the SNMP agent sends information (monitoring information) to the SNMP manager on its own initiative. Because the GET procedure is more complicated than TRAP, its monitoring cost is higher. For example, the cost value is set to 10 for GET, 2 for TRAP, 1 for ping, and 20 for telnet.

The “collection complexity” indicates the logic from the start to the completion of collection (the number of steps, for example, the number of commands executed). The more steps there are, the higher the cost value. For example, the number of commands executed can be used directly as the cost value.

The “collection time” indicates, for example, the time from sending a request for monitoring information to receiving the response. The longer the collection time, the higher the monitoring cost. For example, the cost value is set to 1 for a collection time of 0.1 second and to 5 for a collection time of 1 second. The cost value for each monitoring cost element can be set as appropriate, provided that a larger monitoring load yields a higher cost value.

The monitoring server 1 uses the node information of each monitored node 2 to calculate the monitoring cost of collecting monitoring information from that node. For example, the following formula can be applied.
Monitoring cost = Σ {(i + j + k) × h}
Here, i is the cost value of the collection interface, j is the cost value of the collection complexity, and k is the cost value of the collection time. h is the number of alarms/states collected for that combination of i, j, and k, and the sum is taken over all such combinations for the node. The monitoring cost may also be calculated using a formula other than the one above.
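
The formula above can be worked through with a short Python sketch. The interface cost values are the ones given as examples in the text; the function name `monitoring_cost`, the tuple layout of each group, and the sample node are assumptions made for illustration.

```python
# Cost values per collection interface, as given in the example for FIG. 4.
INTERFACE_COST = {"GET": 10, "TRAP": 2, "ping": 1, "telnet": 20}


def monitoring_cost(groups):
    """Sum (i + j + k) * h over groups of monitored items.

    Each group is (interface, complexity_cost, time_cost, count):
    i comes from INTERFACE_COST, j and k are given directly, h is count.
    """
    total = 0
    for interface, j, k, h in groups:
        i = INTERFACE_COST[interface]
        total += (i + j + k) * h
    return total


# Hypothetical node: 15 items polled with GET (3 commands, time cost 1)
# and 5 items received as TRAPs (1 step, time cost 1).
groups = [("GET", 3, 1, 15), ("TRAP", 1, 1, 5)]
print(monitoring_cost(groups))  # (10+3+1)*15 + (2+1+1)*5 = 230
```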

<<Determination of the monitoring layer of each node>>
Using the topology information, the monitoring server 1 takes the hop count of the shortest path from the monitoring server 1 to each node 2 as the “monitoring layer” to which that node 2 belongs. A node one hop from the monitoring server 1 belongs to layer 1.
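
One common way to obtain shortest-path hop counts in an unweighted topology is a breadth-first search from the monitoring server. The sketch below assumes the topology information is available as an adjacency dictionary; the source does not specify the search method, and the function name `monitoring_layers` and the sample topology are hypothetical.

```python
from collections import deque


def monitoring_layers(adjacency, server):
    """Breadth-first search from the server; hop count = monitoring layer."""
    layer = {server: 0}
    queue = deque([server])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency[current]:
            if neighbor not in layer:       # first visit is the shortest path
                layer[neighbor] = layer[current] + 1
                queue.append(neighbor)
    del layer[server]                       # the server itself has no layer
    return layer


# Hypothetical undirected topology (not the one in FIG. 1).
adjacency = {
    "server": ["#1", "#2"],
    "#1": ["server", "a", "b"],
    "#2": ["server", "b"],
    "a": ["#1", "c"],
    "b": ["#1", "#2", "c"],
    "c": ["a", "b"],
}
print(monitoring_layers(adjacency, "server"))
# {'#1': 1, '#2': 1, 'a': 2, 'b': 2, 'c': 3}
```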

As described above, a layer 1 node is called a “representative node”. The second hop is layer 2, the third hop is layer 3, and so on. Nodes in layer 2 and beyond are called “general nodes”. FIG. 5 shows an example in which layers and monitoring costs have been determined for the plurality of nodes 2 in the network system shown in FIG. 1.

A node 2 in each layer (the monitoring source) monitors nodes 2 in the layer immediately below it (the monitoring destinations). For example, a layer 1 node 2 monitors layer 2 nodes. The layer of the monitoring source is called the parent layer and its nodes 2 are called parent nodes; the layer of the monitoring destination is called the child layer and its nodes 2 are called child nodes. One node 2 may monitor two or more child nodes.

<<Determination of the monitoring destinations of each node>>
In the parent layer, the child-layer nodes that each parent node monitors are determined so that the monitoring cost is distributed among the parent nodes. Take the example shown in FIG. 5, and consider allocating the layer 3 child nodes (general nodes c, d, and e) to the layer 2 parent nodes (general nodes a and b). In this case, the allocation is made so that the total monitoring cost of the child nodes monitored by the general node a and the total monitoring cost of the child nodes monitored by the general node b are close to each other (balanced) (see FIG. 6).

As an example of the result, the monitoring server 1 determines the general node c and the general node d (total monitoring cost: 1500) as the monitoring destinations (child nodes) of the general node a. The monitoring server 1 likewise determines the remaining child node, the general node e, as the monitoring destination (child node) of the general node b.

FIGS. 7 and 8 are diagrams showing an example of a method of allocating child nodes among parent nodes. First, the monitoring server 1 (i) extracts the nodes belonging to the parent layer (the parent node group) and the nodes belonging to the child layer (the child node group). Note that FIGS. 7 and 8 are examples and use a topology different from the one shown in FIG. 5.

For example, as shown in FIG. 7, the monitoring server 1 extracts the data of the parent node group of the parent layer (layer 2) and of the child node group of the child layer (layer 3). The parent node group includes parent nodes A, B, and C, and the child node group includes child nodes H, I, J, and K. The monitoring costs of the child nodes H, I, J, and K are assumed to be 1000, 2000, 3000, and 4000, respectively.

The monitoring server 1 then (ii) generates a parent candidate table for each child node. As shown in FIG. 8, a parent candidate table T1 contains a list of parent node candidates (nodes in the layer immediately above that lie on a shortest path from the monitoring server 1). In other words, the parent candidate table T1 stores the identification information of one or more parent node candidates in association with the identification information of a child node. FIG. 8 shows parent candidate tables T11, T12, T13, and T14 corresponding to the child nodes H, I, J, and K. As shown in FIG. 8, the parent candidate table T1 may also store the monitoring cost of the child node.
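
A minimal sketch of building the parent candidate tables, assuming the topology is an adjacency dictionary and the layers have already been computed: a neighbor whose layer is exactly one less than the child's layer necessarily lies on a shortest path to the monitoring server. The function name `parent_candidates` and the link data are hypothetical; in particular, the candidates of child node K are assumed to be B and C for illustration.

```python
def parent_candidates(adjacency, layer, child_layer):
    """For each node in child_layer, list adjacent nodes one layer above."""
    table = {}
    for node, hops in layer.items():
        if hops == child_layer:
            table[node] = sorted(
                n for n in adjacency[node] if layer.get(n) == hops - 1
            )
    return table


# Hypothetical links between parents A, B, C (layer 2) and children H to K
# (layer 3); only the links relevant to this layer pair are listed.
adjacency = {
    "H": ["A"], "I": ["A", "B", "C"], "J": ["A"], "K": ["B", "C"],
}
layer = {"A": 2, "B": 2, "C": 2, "H": 3, "I": 3, "J": 3, "K": 3}
print(parent_candidates(adjacency, layer, 3))
# {'H': ['A'], 'I': ['A', 'B', 'C'], 'J': ['A'], 'K': ['B', 'C']}
```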

The monitoring server 1 then (iii) generates “parent node monitoring tables”, which indicate which child nodes each parent node monitors, according to the following rules.
(Rule 1) A child node that has only one parent node candidate (also written “parent candidate”), referred to as a first child node, is unconditionally placed under the monitoring of that parent node.
(Rule 2) For a child node that has a plurality of parent candidates (a second child node), the parent node monitoring tables of its parent candidates are checked, and the parent with the smallest total cost is selected.

In the example shown in FIG. 8, the monitoring server 1 determines the parent nodes in ascending order of node number. The order of determination can be changed as appropriate as long as the resulting allocation is balanced. The only parent node candidate for the child node H is the parent node A. The monitoring server 1 therefore follows rule 1 and registers the child node H in the parent node monitoring table T2 of the parent node A.

As shown in FIG. 8, for example, a parent node monitoring table T2 stores the identification information of each child node and the total monitoring cost of those child nodes (the total cost) in association with the identifier of the parent node. FIG. 8 shows parent node monitoring tables T21, T22, and T23 corresponding to the parent nodes A, B, and C. The parent candidate tables T1 and the parent node monitoring tables T2 may also store the monitoring costs of the parent node candidates and of the parent nodes themselves, respectively.

The child node I has the parent nodes A, B, and C as parent node candidates. The monitoring server 1 refers to the total cost of each parent node candidate and registers the child node I in the parent node monitoring table T2 of the candidate with the smallest total cost (rule 2). If a plurality of parent node monitoring tables T2 share the same smallest total cost, the child node I is registered in one of them according to a predetermined priority. In this example, the parent nodes B and C are tied, and the child node I is registered in the parent node monitoring table T22 of the parent node B, which has the lower node number. The priority can be set as appropriate.

The only parent node candidate for the child node J is the parent node A. The monitoring server 1 therefore follows rule 1 and registers the child node J in the parent node monitoring table T21. The monitoring server 1 updates the total cost in the parent node monitoring table T21 to 4000 (1000 + 3000).

For the child node K, the monitoring server 1 follows rule 2 and registers it in the parent node monitoring table T23 corresponding to the parent node C. As a result, the child nodes H, I, J, and K are allocated among the parent nodes A, B, and C with their monitoring costs balanced. That is, the load of monitoring the child nodes H, I, J, and K is distributed among the parent nodes A, B, and C. The parent candidate tables T1 and the parent node monitoring tables T2 are generated and stored in at least one of the main storage device 12 and the auxiliary storage device 13.

<Processing by CPU>
FIG. 9 is a flowchart illustrating an example of the monitoring process executed by the CPU 11 of the monitoring server 1 (the information processing apparatus 10 operating as the monitoring server 1). The process shown in FIG. 9 is started upon input of a predetermined initial trigger, such as activation of the monitoring server 1. However, the start condition of the process in FIG. 9 may be other than the above.

In 001, the CPU 11 of the monitoring server 1 determines the monitoring cost of each node 2 (see FIG. 4 and the like). In 002, the CPU 11 of the monitoring server 1 determines the monitoring layer of each node (see FIG. 5). In 003, the CPU 11 of the monitoring server 1 extracts the nodes 2 belonging to the parent layer and the nodes belonging to the child layer (see FIG. 7).

In 004, the CPU 11 of the monitoring server 1 generates the parent candidate table T1 for the child nodes extracted in 003. The processing of 005 and 006 is repeated once for each extracted child node. In 005, the CPU 11 of the monitoring server 1 determines whether the child node has exactly one parent node candidate. In 006, if it was determined in 005 that there is one parent node candidate, the CPU 11 determines that candidate as the parent node and registers the child node in the corresponding parent node monitoring table T2.

The processing of 007 and 008 is repeated once for each child node that has a plurality of parent node candidates. In addition, 007 is repeated for each parent node candidate in the parent candidate table T1. In 007, the CPU 11 of the monitoring server 1 selects the parent node candidate with the smallest total cost. In 008, the CPU 11 of the monitoring server 1 determines the candidate selected in 007 as the parent node and registers the child node in the corresponding parent node monitoring table T2.

The processing of 003 to 008 is also executed for the remaining layers. The extraction in 003 starts from the lowest layer. In the example of FIG. 5, the nodes of layer 2 and layer 3 are first extracted as the parent layer and the child layer, respectively. When the allocation for these layers is finished, processing moves to the next higher layer.

In the example of FIG. 5, the nodes of layer 1 and layer 2 are then extracted as the parent layer and the child layer, respectively (003), and the processing of 004 to 008 is performed. At this time, the monitoring cost used for each child node is the sum of the monitoring costs of the child nodes hanging from it and its own monitoring cost. The processing of 003 to 008 is repeated until layer 1 has been extracted as the parent layer.
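The layer-by-layer repetition of 003 to 008 can be sketched as below. This is an illustrative sketch under stated assumptions: the function name `build_monitoring_tree`, the argument layout, and the tie-break by node name are hypothetical, and every node below layer 1 is assumed to have at least one neighbor in the layer above it. The point shown is that a parent's cost, once its children are fixed, becomes its own cost plus the cost of everything hanging from it, and that this aggregated value is what the next higher layer balances.

```python
def build_monitoring_tree(layers, adjacency, own_cost):
    """Assign parents from the lowest layer upward.

    layers:    list of node-name lists; layers[0] is layer 1 (closest to the server)
    adjacency: dict mapping node -> set of directly connected nodes
    own_cost:  dict mapping node -> monitoring cost of that node alone
    Returns (parent_of, subtree_cost).
    """
    parent_of = {}
    subtree_cost = dict(own_cost)  # starts as the node's own cost

    # Walk pairs (parent layer, child layer) starting from the lowest layer.
    for depth in range(len(layers) - 1, 0, -1):
        parents, children = layers[depth - 1], layers[depth]
        load = {p: 0 for p in parents}
        candidates = {
            c: sorted(p for p in parents if p in adjacency[c]) for c in children
        }
        # Children with one candidate first, then those with several.
        ordered = [c for c in children if len(candidates[c]) == 1]
        ordered += [c for c in children if len(candidates[c]) > 1]
        for child in ordered:
            best = min(candidates[child], key=lambda p: (load[p], p))
            parent_of[child] = best
            load[best] += subtree_cost[child]
        # A parent now carries its own cost plus that of its whole subtree.
        for p in parents:
            subtree_cost[p] += load[p]

    return parent_of, subtree_cost
```

The nodes of layer 1 receive no entry in `parent_of` here, because their parent is the monitoring server itself.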

In this way, tree information having the monitoring server 1 at its apex (FIG. 2) is finally generated by the monitoring server 1 and stored in at least one of the main storage device 12 and the auxiliary storage device 13 (hereinafter referred to as "memory"). In 009, the CPU 11 of the monitoring server 1 transmits monitoring control data to each node 2 (representative nodes and general nodes). The monitoring control data is transmitted from the communication IF 14 to each node 2.

In the information processing apparatus 10 operating as each node 2, the monitoring control data is received by the communication IF 14 and stored in the memory (at least one of the main storage device 12 and the auxiliary storage device 13). The CPU 11 of each node 2 uses the monitoring control data to monitor the information to be monitored (such as alarms and status), and stores the monitoring result (also referred to as monitoring information) in the memory.

FIG. 10 is a flowchart illustrating an example of the processing of a node 2 acting as a parent node. Process 1 in FIG. 10 starts when the CPU 11 of the node 2 receives a periodic polling trigger. In 101, the CPU 11 collects monitoring information (information indicating the monitoring result of the monitoring target at the child node) from each child node it monitors. The collection is performed by sending each child node a request to transmit its monitoring result and receiving the response.

In 102, the CPU 11 of the node 2 adds its own monitoring information to the monitoring information collected from the child nodes and generates notification information for the parent node (notification destination). In 103, the CPU 11 of the node 2 determines whether communication with the notification destination (parent node) is normal. If the communication is determined to be normal, the CPU 11 transmits the notification information to the parent node (104). If the communication is determined not to be normal, the CPU 11 saves the notification information (stores it in the memory) (105). The saved data is transmitted to the parent node when communication with the parent node is restored.

Process 2 shown in FIG. 10 is started by an event, namely the reception of monitoring information from a child node. In 111, the CPU 11 of the node 2 adds its own monitoring information to the monitoring information collected from the child nodes and generates notification information for the parent node (notification destination). The processing of 111 is the same as that of 102. The process then proceeds to 103.

By executing process 1 and process 2 shown in FIG. 10 at each node 2, the monitoring server 1 can receive, from each representative node, notification information that includes the monitoring information of that representative node and of the general nodes below it. The monitoring server 1 uses the received monitoring information to control the network and each node 2.
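Steps 102 to 105 at a parent node can be sketched as below. This is a minimal sketch, not the node software of the embodiment: the class name `ParentNodeAgent`, the `send` callable, and the dictionary format of the notification are hypothetical. The text states only that saved data is sent once communication with the parent is restored; sending the saved notifications at the next successful report is an assumption of this sketch.

```python
class ParentNodeAgent:
    """Aggregates monitoring information and forwards it to the parent node."""

    def __init__(self, node_id, send):
        self.node_id = node_id
        self.send = send      # hypothetical transport: callable taking one notification
        self.saved = []       # notifications held while the link to the parent is down

    def report(self, own_info, child_infos, link_ok):
        # 102 / 111: append this node's own result to the children's results.
        notification = {
            "from": self.node_id,
            "infos": list(child_infos) + [own_info],
        }
        # 103: check the link to the notification destination.
        if not link_ok:
            # 105: keep the notification until the link recovers.
            self.saved.append(notification)
            return False
        # Assumed recovery behavior: deliver anything saved earlier, oldest first.
        for pending in self.saved:
            self.send(pending)
        self.saved.clear()
        # 104: deliver the current notification.
        self.send(notification)
        return True
```

For example, an agent constructed with `send=outbox.append` for some list `outbox` stores notifications in `saved` while `link_ok` is false and moves them to `outbox` on the next report made with `link_ok` true.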

FIG. 11 is a flowchart illustrating an example of the processing of the CPU 11 of the monitoring server 1 when a communication disconnection is detected. In 201, the CPU 11 of the monitoring server 1 receives the notification information sent from each representative node to the monitoring server 1.

In 202, the CPU 11 of the monitoring server 1 identifies the node X with which communication has been lost. Taking FIG. 6 as an example, when the representative node #1 can no longer communicate with the general node a, as detected by an existing method such as a life check, it transmits a notification of "communication disconnection with general node a" to the monitoring server 1. In 202, the CPU 11 of the monitoring server 1 determines whether such a communication disconnection notification has been received. Here, the CPU 11 identifies the general node a as the node X.

In 203, the CPU 11 of the monitoring server 1 identifies the nodes Y that were monitored by the node X. The nodes Y can be identified using the tree information (the monitoring control data for the node X). The CPU 11 identifies, as the nodes Y, the general node c and the general node d, which were child nodes of the node X (general node a). In this way, the CPU 11 detects all nodes 2 downstream of the node X on the tree as nodes Y.

In 204, the CPU 11 of the monitoring server 1 newly determines the parent nodes of the node X and the nodes Y, using tree information (path information) from which the node 2 that can no longer communicate with the node X and the nodes Y (representative node #1) has been excluded. The parent nodes are determined by the method described with reference to FIGS. 7 and 8.

In 205, the CPU 11 of the monitoring server 1 transmits monitoring control data to the node X, the nodes Y, and the nodes that have become their parent nodes. As a result, even if a communication disconnection occurs between nodes, the tree can be repaired based on the tree information indicating the transfer route of the monitoring information to the monitoring server 1, and monitoring can be continued.
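Step 203, the detection of every node downstream of the node X, can be sketched as a breadth-first walk over the stored tree. The function name `downstream_nodes` and the representation of the tree as a child-to-parent dictionary are assumptions of this sketch, and the node e in the usage note is hypothetical. The reassignment of step 204 is not repeated here, since it reuses the allocation method of FIGS. 7 and 8.

```python
from collections import defaultdict, deque


def downstream_nodes(parent_of, node_x):
    """Return every node below node_x in the monitoring tree (the nodes Y).

    parent_of: dict mapping each child node to its current parent node
    """
    # Invert the child -> parent map so that children can be looked up.
    children_of = defaultdict(list)
    for child, parent in parent_of.items():
        children_of[parent].append(child)

    found = []
    queue = deque([node_x])
    while queue:
        current = queue.popleft()
        for child in sorted(children_of[current]):
            found.append(child)
            queue.append(child)
    return found
```

For a tree in which the general nodes c and d hang from the general node a, and a hypothetical node e hangs from c, `downstream_nodes(parent_of, "a")` yields c, d, and e. These nodes, together with a itself, are the ones that need new parents.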

<Effects of Embodiment>
In the embodiment, the CPU 11 (an example of a computer) included in the monitoring server 1 (an example of a monitoring device) for a network including a plurality of nodes 2 performs the following processing.

The CPU 11 classifies the plurality of nodes according to their number of hops to the monitoring server 1 on the communication path having the minimum number of hops to the monitoring server 1, which is communicably connected to the network and monitors the network.

When setting monitoring relationships between nodes such that a node with a smaller number of hops monitors a node with a larger number of hops (that is, when setting parent-child relationships), the CPU 11 allocates the monitorable monitoring target nodes (child nodes) so that the monitoring load is balanced among nodes having the same number of hops.
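The classification by minimum hop count can be illustrated with a breadth-first search starting at the monitoring server. The text does not state how the hop counts are obtained, so breadth-first search is a technique substituted here for illustration. The function name `classify_by_hops`, the adjacency-list input, and the node names in the usage note are likewise assumptions.

```python
from collections import deque


def classify_by_hops(adjacency, server):
    """Group nodes into layers by minimum hop count from the monitoring server.

    adjacency: dict mapping each node (and the server) to its direct neighbors
    Returns a dict {hop count: sorted list of nodes}, server excluded.
    """
    hops = {server: 0}
    queue = deque([server])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency[current]:
            # The first visit in a breadth-first search is along a shortest path.
            if neighbor not in hops:
                hops[neighbor] = hops[current] + 1
                queue.append(neighbor)

    layers = {}
    for node, count in hops.items():
        if node != server:
            layers.setdefault(count, []).append(node)
    return {count: sorted(nodes) for count, nodes in sorted(layers.items())}
```

In a network where a node D is reachable both through A and through B, each one hop from the server, D falls into layer 2 and both A and B are its parent node candidates. This is the situation that the load-balancing allocation then resolves.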

According to the embodiment, the monitoring server 1 generates a tree that serves as the transfer route of the monitoring information, and transmits monitoring control data based on the tree to each node 2 that becomes a parent node. The load of monitoring the nodes 2 can thereby be distributed to the parent nodes, so the load on the monitoring server 1 is reduced. In other words, the rise in the load on the monitoring server 1 that accompanies an increase in the number of monitoring targets can be suppressed. Therefore, the monitoring server 1 can collect monitoring information from each monitored node without increasing the number of monitoring servers or adopting a high-performance monitoring server. That is, an increase in the number of monitored nodes can be accommodated while suppressing the rise in equipment cost.

In addition, in each layer, the total cost of the child nodes hanging from each parent node is balanced among the parent nodes (so that there is no bias). This prevents load imbalance among the parent nodes of the same layer and allows monitoring information to be collected smoothly.

Furthermore, when a communication disconnection occurs on a path of the tree, information indicating the disconnection is transmitted from a node 2 to the monitoring server 1, and the monitoring server 1 rebuilds the tree. As a result, even if a communication disconnection occurs, monitoring (collection of monitoring information) can be continued using the rebuilt tree. The configurations described in the embodiment are examples and can be combined as appropriate.

DESCRIPTION OF SYMBOLS
1 ... Monitoring server
2 ... Node
10 ... Information processing apparatus
11 ... CPU
12 ... Main storage device
13 ... Auxiliary storage device

Claims

A program that causes a computer included in a monitoring device communicably connected to a network including a plurality of nodes to execute:
a process of classifying the plurality of nodes according to the number of hops to the monitoring device on the communication path having the minimum number of hops to the monitoring device; and
a process of allocating monitorable monitoring target nodes so that the monitoring load is balanced among nodes having the same number of hops, when setting monitoring relationships between nodes such that a node with a smaller number of hops monitors a node with a larger number of hops.

The program according to claim 1, wherein, in the process of performing the allocation, the computer executes a process of:
taking a plurality of nodes on the side with the smaller number of hops as parent node candidates;
taking a plurality of nodes on the side with the larger number of hops as child nodes to be monitored by a parent node;
for a first child node, among the child nodes, that is connected to one parent node candidate on the communication path, determining that one parent node candidate as the parent node; and
for a second child node, among the child nodes, that is connected to a plurality of parent node candidates on the communication path, determining as the parent node the one of the plurality of parent node candidates for which the total monitoring cost of the child nodes it monitors is small.

The program according to claim 1, further causing the computer to execute a process of determining the monitoring cost of each of the plurality of nodes based on at least one of the number of information items to be collected, the collection method, the collection complexity, and the collection time.

The program according to claim 1 or 2, further causing the computer to execute:
a process of transmitting, based on the result of the allocation, information indicating the monitoring target node, information indicating the node transmitting the monitoring result, and information indicating the monitoring target item, to each node among the plurality of nodes that performs monitoring; and
a process of receiving, from a node located at the first hop on the communication path, the monitoring result of that node and the monitoring result of each node downstream of that node.

A monitoring method for a network including a plurality of nodes, the method comprising:
classifying the plurality of nodes according to the number of hops to a monitoring device, which is communicably connected to the network, on the communication path having the minimum number of hops to the monitoring device; and
allocating monitorable monitoring target nodes so that the monitoring load is balanced among nodes having the same number of hops, when setting monitoring relationships between nodes such that a node with a smaller number of hops monitors a node with a larger number of hops.

A monitoring device that is communicably connected to a network including a plurality of nodes and monitors the plurality of nodes, the monitoring device comprising:
a classification unit that classifies the plurality of nodes according to the number of hops to the monitoring device on the communication path having the minimum number of hops to the monitoring device; and
an allocation unit that allocates monitorable monitoring target nodes so that the monitoring load is balanced among nodes having the same number of hops, when setting monitoring relationships between nodes such that a node with a smaller number of hops monitors a node with a larger number of hops.