JP2000295259A - Device for detecting abnormality in LAN

Device for detecting abnormality in LAN

Info

Publication number
JP2000295259A
Authority
JP
Japan
Prior art keywords
communication control
lan
lan communication
control device
computer
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP11103783A
Other languages
Japanese (ja)
Inventor
Satoru Moriwake
哲 森分
Masaaki Nomoto
正明 野本
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1999-04-12
Filing date
1999-04-12
Publication date
2000-10-20
Application filed by Hitachi Ltd
Priority to JP11103783A
Publication of JP2000295259A

Landscapes

  • Small-Scale Networks (AREA)

Abstract

PROBLEM TO BE SOLVED: To maintain sound communication by stopping the function of a receiving-side LAN communication controller when, during packet transmission from a transmitting-side LAN communication controller, a fixed number or more of message packet interrupts occur within a fixed time in the receiving-side controller. SOLUTION: A distributed system is made up of computers A to E (101, 111 to 114). An abnormality detection program 116 in the LAN communication controller 106 counts the message packets stored in a reception buffer 117 and monitors the number of message packet interrupts that occur within the fixed time. When that number exceeds an index value set according to the performance of the computer on which the OS 105 runs, the LAN communication controller 106 stops its own processing function. A configuration control program 104 then switches the system over to LAN B 110 so that data is taken in from the LAN communication controller 107.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a LAN abnormality detection device.

[0002]

[Description of the Related Art] Conventionally, in a distributed system in which a plurality of computer systems are connected by a LAN, stoppage of a LAN communication control device has been handled by multiplexing the LAN communication control devices. A stopped device is detected by timeout monitoring based on health checks, and the system switches from the stopped LAN communication control device to a normal one, which improves the reliability of the computer system.
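The conventional scheme can be illustrated with a minimal sketch. The class name `HealthMonitor`, its methods, and the controller names are hypothetical and chosen only for illustration; the source states only that a stopped controller is found by health-check timeout and that the system switches to a normal one.

```python
from dataclasses import dataclass, field


@dataclass
class HealthMonitor:
    """Timeout-based health check over multiplexed LAN controllers (illustrative)."""
    timeout_s: float
    # Controller name -> time of its most recent health-check reply.
    last_reply: dict = field(default_factory=dict)

    def record_reply(self, controller: str, now: float) -> None:
        self.last_reply[controller] = now

    def stopped(self, now: float) -> list:
        # A controller counts as stopped once its last reply is older
        # than the timeout.
        return sorted(c for c, t in self.last_reply.items()
                      if now - t > self.timeout_s)

    def select_active(self, preferred: str, now: float):
        # Keep the preferred controller while it still answers; otherwise
        # switch to any controller that has not timed out.
        down = set(self.stopped(now))
        if preferred in self.last_reply and preferred not in down:
            return preferred
        for c in sorted(self.last_reply):
            if c not in down:
                return c
        return None
```

A check of this kind looks only at whether replies arrive in time, so a controller that keeps replying is never flagged, however many packets it sends.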

[0003]

[Problems to Be Solved by the Invention] A failure in which a LAN communication control device does not stop but transmits an abnormal number of packets increases the load on the OS of the receiving computer and causes that computer's application functions, or the OS itself, to stop. With the conventional technique, it was not possible to detect such a failure and switch from the failed LAN communication control device to a normal one.

[0004] An object of the present invention is to identify the LAN communication control device that is receiving an abnormal number of packets and to stop operation on a per-device basis, thereby minimizing the extent of the system stoppage and improving the reliability of the computer system.

[0005]

[Means for Solving the Problems] In a distributed system in which a plurality of computer systems are connected by a LAN, a failure of the LAN communication control device of one computer may cause it to transmit an abnormal number of message packets, exceeding the maximum number that the partner computer can receive normally. In that case, the abnormal packet transmission originating in the transmitting computer's LAN communication control device is detected by a function of the receiving-side LAN communication control device, on the condition that a fixed number or more of message packet interrupts have occurred within a fixed time. The function of the receiving-side LAN communication control device is then stopped, and transmission and reception continue through a sound LAN communication control device.
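The detection condition, a fixed number or more of interrupts within a fixed time, can be sketched as a count over fixed windows. The function name `abnormal_windows` and the use of arrival timestamps in seconds are assumptions made for illustration; the source does not say how the count is implemented.

```python
from collections import Counter


def abnormal_windows(arrival_times, t, n):
    """Return the start times of t-second windows that hold n or more
    message packet interrupts.

    arrival_times: non-negative interrupt timestamps in seconds (assumed input)
    t: window length in seconds (the "fixed time")
    n: threshold count (the "fixed number")
    """
    # Assign each interrupt to the fixed window it falls in, then count.
    counts = Counter(int(ts // t) for ts in arrival_times)
    return sorted(idx * t for idx, c in counts.items() if c >= n)
```

For example, with `t = 1.0` and `n = 3`, the arrivals `[0.1, 0.5, 1.2, 2.0, 2.1, 2.2, 2.9]` put four interrupts in the window starting at 2.0, so only that window is reported.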

[0006] Furthermore, the abnormal packet transmission originating in the transmitting computer's LAN communication control device is detected by a function of the receiving-side LAN communication control device, on the condition that a fixed number or more of message packet interrupts have occurred within a fixed time. The abnormality information is reported to the OS that controls the LAN communication control device, and the OS stops the function of the receiving-side LAN communication control device.
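In this variant the controller only reports the abnormality, and the OS performs the stop. A minimal sketch follows, assuming a callback-style notification; `LanController`, `OperatingSystem`, and their methods are hypothetical names, not interfaces described in the source.

```python
class LanController:
    """Receiving-side controller that reports abnormality to the OS (illustrative)."""

    def __init__(self, name, threshold_n, notify_os):
        self.name = name
        self.threshold_n = threshold_n  # fixed number of interrupts per window
        self.notify_os = notify_os      # callback standing in for the OS notification
        self.enabled = True
        self.count = 0

    def on_packet_interrupt(self):
        # Count one message packet interrupt while the controller is enabled.
        if self.enabled:
            self.count += 1

    def end_of_window(self):
        # Called once per fixed time window: reset the counter and report
        # to the OS if the window held threshold_n or more interrupts.
        count, self.count = self.count, 0
        if self.enabled and count >= self.threshold_n:
            self.notify_os(self, count)
        return count


class OperatingSystem:
    """Stand-in for the OS that controls the LAN controller (illustrative)."""

    def __init__(self):
        self.log = []

    def handle_abnormality(self, controller, count):
        # Record the abnormality information, then stop the controller.
        self.log.append((controller.name, count))
        controller.enabled = False
```

Leaving the stop decision to the OS keeps the controller's part to counting and reporting, at the cost of relying on the OS to act on the report.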

[0007] Because the present invention has the functions described above, when an abnormal number of message packet interrupts occurs as a result of a LAN communication control device failure, it is possible to identify the LAN communication control device that is receiving the abnormal number of message packets and to stop only that device. This minimizes the extent of the stoppage in a computer system configured with LANs and makes it possible to improve system reliability.

[0008]

[Embodiments of the Invention] Embodiments of the present invention are described below.

[0009] FIG. 1 shows a distributed system made up of five computers (computer A101, computer B102, computer C103, computer D104, and computer E105), the LAN 109 and LAN 110 that connect the five computers, and transceivers 108. Computer A101 contains an A-system LAN communication control device 106 and a B-system LAN communication control device 107, and its software consists of an OS 105, a configuration control program 104, a transmission processing program 102, and a reception processing program 103.

[0010] Computers 111, 112, 113, 114, and 115 have the same software configuration and LAN communication control devices. With the conventional technique, when computer E transmits an abnormal number of message packets via the A-system LAN 109, computer A101 receives those packets through its A-system LAN communication control device. This increases the load on the OS 105 and stops application functions that run at a lower level than the OS, such as the transmission processing 102, the reception processing 103, and the configuration control program 104. Unable even to switch to the normal B-system LAN communication control device, the computer then suffers a stoppage of the OS 105 itself, that is, a stoppage of the computer. Such a stoppage occurs on every computer connected to the A-system LAN and ultimately brings down the entire distributed system.

[0011] The present invention is a proposal for avoiding the system-wide failure described above.

[0012] The first embodiment is now described. As shown in FIG. 2, each LAN communication control device is provided with a transmission buffer 115, a reception buffer 117, and an abnormality detection program 116. FIG. 3 is a flowchart of the processing by which the LAN communication control device detects a LAN abnormality. The reception buffer accumulates message packets transmitted from other computers. At a period of t seconds, the abnormality detection program 116 counts the number of message packets accumulated in the reception buffer 117 during the preceding t seconds (301). It then determines whether N or more packets were received within the t seconds (302). N is an index value set according to the performance of the computer. If N or more packets are detected within t seconds, the LAN communication control device 106 determines that it is receiving an abnormal number of message packets and stops its own LAN communication control processing function (303). Stopping the abnormal LAN communication control device holds down the rise in OS load, so applications that run at a lower level than the OS, such as the configuration control and transmission/reception programs, continue to operate normally. The configuration control program detects that the LAN communication control device 106 has stopped (304) and switches systems so that data is taken in from the LAN communication control device 107 (305). If the LAN communication control device 107 is operating normally, computer 101 operates correctly from then on and the distributed system does not stop. If the LAN communication control device 107 also determines that it is receiving an abnormal number of message packets, it stops its own function as well, so data transmission and reception via the LAN ceases (306). Even then, the rise in OS load caused by receiving an abnormal number of message packets is held down, which prevents computer 101 from stopping and, in turn, prevents all the computers making up the distributed system from stopping.
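The flow of steps 301 to 306 can be sketched as follows. This is an illustrative model under stated assumptions, not the patented implementation: the names `Controller`, `configuration_control`, and `run_window` are hypothetical, and the per-window packet counts are passed in directly rather than read from a reception buffer.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    """One LAN communication control device (names here are illustrative)."""
    name: str
    running: bool = True

    def check_window(self, packets_in_window: int, n: int) -> None:
        # Steps 301-303: take the count for the last t seconds, compare it
        # with the index value N, and stop this controller's own function.
        if self.running and packets_in_window >= n:
            self.running = False


def configuration_control(primary: Controller, backup: Controller):
    # Steps 304-305: if the primary has stopped, take data from the backup.
    # Step 306: if both have stopped, LAN transmission and reception cease
    # (returned as None), but the computer itself keeps running.
    if primary.running:
        return primary
    if backup.running:
        return backup
    return None


def run_window(a: Controller, b: Controller,
               packets_a: int, packets_b: int, n: int):
    """Process one t-second window for both controllers and pick the active one."""
    a.check_window(packets_a, n)
    b.check_window(packets_b, n)
    return configuration_control(a, b)
```

With N = 100, a window in which controller A sees 500 packets and controller B sees 10 stops A and makes B the active controller; a later window with 500 packets on B stops B too, and `run_window` returns `None`, which corresponds to step 306.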

[0013]

[Effects of the Invention] According to the present invention, a function of the LAN communication control device detects that an abnormal number of packets is being received, on the condition that a fixed number or more of interrupts have occurred within a fixed time, and the function of the LAN communication control device performing the abnormal reception can then be stopped.

[0014] In addition, information that an abnormal number of packets is being received can be reported to the OS that controls the LAN communication control device, and the OS can stop the function of the receiving-side LAN communication control device.

[0015] As a result, the LAN communication control device that transmits an abnormal number of packets can be identified, operation can be stopped on a per-device basis, and the reliability of the computer system can be improved.

[Brief Description of the Drawings]

FIG. 1 is a diagram showing a distributed system configured with LANs according to the conventional technique.

FIG. 2 is a diagram showing a distributed system configured with LANs according to one embodiment of the present invention.

FIG. 3 is a processing flowchart, according to one embodiment of the present invention, for detecting a LAN abnormality and stopping the LAN communication control device.

Continued from the front page
F-terms (reference): 5K032 AA06 BA05 CA06 CC03 CD02 DB01 DB23 DB24 DB28 EA03 EA04 EA06 EA07 EB06; 5K033 AA06 BA05 CA06 CB03 CC02 DA13 DB15 DB16 DB17 DB20 EA03 EA04 EA06 EA07 EB06

Claims (2)

[Claims]

1. A LAN abnormality detection device for a large distributed system in which a plurality of computer systems are connected by multiplexed LANs, wherein, when the LAN communication control device of one computer becomes abnormal and transmits an abnormal number of message packets exceeding the maximum number of message packets that all the computer systems connected to the LAN can receive normally, the abnormal packet transmission originating in the LAN communication control device of the transmitting computer is detected by a function of the receiving-side LAN communication control device on the condition that a fixed number or more of message packet interrupts have occurred within a fixed time, the function of the receiving-side LAN communication control device is stopped, and transmission and reception continue on the sound LAN side.
2. A LAN abnormality detection device for a distributed system in which a plurality of computer systems are connected by multiplexed LANs, wherein, when the LAN communication control device of one computer becomes abnormal and transmits an abnormal number of message packets exceeding the maximum number of message packets that all the computer systems connected to the LAN can receive normally, the abnormal packet transmission originating in the LAN communication control device of the transmitting computer is detected by a function of the receiving-side LAN communication control device on the condition that a fixed number or more of message packet interrupts have occurred within a fixed time, the abnormality information is reported to the OS (operating system) that controls the LAN reception control device, and the OS stops the function of the receiving-side LAN communication control device.
JP11103783A 1999-04-12 1999-04-12 Device for detecting abnormality in lan Pending JP2000295259A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP11103783A JP2000295259A (en) 1999-04-12 1999-04-12 Device for detecting abnormality in lan

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP11103783A JP2000295259A (en) 1999-04-12 1999-04-12 Device for detecting abnormality in lan

Publications (1)

Publication Number Publication Date
JP2000295259A true JP2000295259A (en) 2000-10-20

Family

ID=14363021

Family Applications (1)

Application Number Title Priority Date Filing Date
JP11103783A Pending JP2000295259A (en) 1999-04-12 1999-04-12 Device for detecting abnormality in lan

Country Status (1)

Country Link
JP (1) JP2000295259A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010028529A (en) * 2008-07-22 2010-02-04 Nec Corp Communication control apparatus and method

Similar Documents

Publication Publication Date Title
US5983371A (en) Active failure detection
US20020120884A1 (en) Multi-computer fault detection system
US20070157052A1 (en) Protection of devices in a redundant configuration
JP2000295259A (en) Device for detecting abnormality in lan
JP2578985B2 (en) Redundant controller
JP6089766B2 (en) Information processing system and failure processing method for information processing apparatus
JP2002116920A (en) Cluster system, monitoring method in cluster system, and computer program
JP2006171995A (en) Control computer
JP2633351B2 (en) Control device failure detection mechanism
JPH10171769A (en) Composite computer system
JP7120678B1 (en) Communication processing device, communication processing system, failure notification method and failure notification program
KR100250888B1 (en) Network detection device of distributed control system
JP2977705B2 (en) Control system of networked multiplexed computer system
JP3812434B2 (en) Health check method
JPH02310755A (en) Health check system
JP2007026038A (en) Path monitoring system, path monitoring method and path monitoring program
JP2000349900A (en) Fault processing system for exchange
JPH06290126A (en) Fault monitoring system for computer system
JPH11331194A (en) Device and system for monitor
JP2020178297A (en) Information processing device, apparatus, program, and information processing system
JP2000222233A (en) Duplex system, and active system and stand-by system switching method
JP2834062B2 (en) Information processing system
JPH04340649A (en) Detection of subsystem down
JP4957068B2 (en) Redundant system switching method
JP2001325117A (en) Stand-by duplex system information processor and its system state checking method