JPH03266133A - Line system fault processing system in hot stand-by system - Google Patents

Line system fault processing system in hot stand-by system

Info

Publication number
JPH03266133A
Authority
JP
Japan
Prior art keywords
standby
central processing
active
processing unit
communication control
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2066551A
Other languages
Japanese (ja)
Inventor
Yoshinori Yamamoto
義則 山本
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to JP2066551A
Publication of JPH03266133A

Abstract

PURPOSE: To continue normal system operation by performing system switching, initialization of the communication controllers, and reconnection with the external terminals when a communication controller (FNP) fault occurs that does not directly bring the system down and the lines degrade beyond a certain extent. CONSTITUTION: If a fault occurs in FNP 7, operating system (OS) 14 detects the fault in the course of communicating with FNP 7, determines the extent of the fault's impact, and informs monitor program 11, which decides whether the lines have degraded beyond a predetermined number. If they have, monitor program 11 instructs OS 14 to switch systems, and OS 14 stops on-line transaction processing program 12 and informs standby OS 34 of the system shutdown. Monitor program 31 then takes the resources used by the active system into its own system, initializes FNPs 6 and 7 once, and reconnects FNP 7 to the external terminals. Consequently, normal operation continues.

Description

[Detailed Description of the Invention] [Field of Industrial Application] The present invention relates to a hot standby system, and in particular to a hot standby system characterized by a system switching control method for the case where a line degradation failure occurs that is, in terms of system operation, equivalent to a system down.

[Conventional Technology]

In recent years, systems are increasingly required to be highly reliable, and it has become essential that a system resume operation immediately even if it goes down.

Conventionally, information processing systems of this kind include the hot standby system, a standby-redundant configuration in which the standby system is kept ready to operate at all times. When a failure occurs in the active system, operation is switched automatically and at high speed from the active system to the standby system, so that operation continues with only a short online downtime.

[Problem to Be Solved by the Invention]

In the conventional hot standby system described above, system switching is not actively performed as long as a failure does not bring the system down. However, because the system is operated online, a failure that sharply reduces the number of usable lines, such as a failure in the interface between the host and the communication control device rather than in the communication control device itself, leaves the host side running while the system appears to be down to the many operators at the terminals. From the standpoint of system operation, this was a serious drawback.

[Means for Solving the Problem] In the line failure handling method for a hot standby system according to the present invention, the central processing unit of the active system and the central processing unit of the standby system are connected by an intersystem communication device. Communication control devices, which are cross-connected to the central processing units of the active and standby systems and each function as either the active or the spare device, are each connected to a line switching device. The standby system holds the same online transaction processing program as the active system in a waiting state immediately before the start of execution. The central processing unit of the active system can communicate with the central processing unit of the standby system through the intersystem communication device by using monitoring means. After a failure of the active communication control device has caused a switch to the spare communication control device, the monitoring means monitors the degradation of the lines of the spare communication control device. If the lines have degraded to a predetermined number or fewer, the monitoring means stops the online transaction processing program of the active system and sends a system-down notification to the central processing unit of the standby system through the intersystem communication device. The central processing unit of the standby system then starts the waiting online transaction processing program of the standby system, initializes the line status of the spare communication control device that had been in use by the central processing unit of the active system, and resumes system operation with the systems switched.

[Embodiment]

Next, the present invention will be described with reference to the drawing.

FIG. 1 is a configuration diagram of an embodiment of the present invention. The active system includes main storage device 1 and central processing unit (hereinafter, CPU) 2, and the standby system includes main storage device 3 and CPU 4. CPU 2 and CPU 4 are connected by intersystem communication device 5. Communication control devices (hereinafter, FNP) 6 and 7 are each connected to both CPU 2 and CPU 4 and to line switching device 8, and one of them (FNP 6 in this example) is connected to the external terminals over the lines for online operation.

Main storage device 1 is loaded with monitoring program 11, online transaction processing program (hereinafter, OLTP) 12, communication control program 13, and operating system (hereinafter, OS) 14, and each program runs under the control of OS 14.

OS 14 controls the basic operation of the computer system. Communication control program 13 receives, via FNP 6, messages from the external terminals connected over the lines, processes them into a form usable inside the computer, and passes them to OLTP 12. OLTP 12 performs the processing specified for the system on the received data, stores the result in a database (not shown), and sends a response carrying the result to the external terminal via FNP 6 by means of communication control program 13.
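The message path described above can be illustrated with a minimal Python sketch. The class names, the upper-casing step, and the "ACK:" response format are hypothetical placeholders chosen for illustration; the patent does not specify what processing OLTP 12 performs or how messages are encoded.

```python
from dataclasses import dataclass, field


@dataclass
class Oltp:
    """Hypothetical stand-in for OLTP 12."""
    database: list = field(default_factory=list)

    def process(self, data: str) -> str:
        result = data.upper()         # placeholder for the system-specific processing
        self.database.append(result)  # store the processing result in the database
        return f"ACK:{result}"        # response carrying the processing result


@dataclass
class CommControlProgram:
    """Hypothetical stand-in for communication control program 13."""
    oltp: Oltp

    def on_message(self, raw: bytes) -> bytes:
        data = raw.decode("utf-8").strip()  # make the message usable inside the computer
        response = self.oltp.process(data)  # hand the data over to the OLTP
        return response.encode("utf-8")     # returned to the terminal via the FNP


oltp = Oltp()
ccp = CommControlProgram(oltp)
reply = ccp.on_message(b" deposit 100 \n")
```

Keeping the communication control program separate from the OLTP mirrors the division of roles in the text: only the former deals with the FNP-facing message format.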

Monitoring program 11 monitors for a system down and, when one occurs, issues a system-down notification through intersystem communication device 5. It also monitors the degradation of the number of lines. Equivalent programs are loaded in main storage device 3 of the standby system, and all files to be used are kept available. When a system down occurs in the active system, monitoring program 31, running under the control of OS 34, receives the system-down notification sent from the active system, takes the resources (not shown) that the active system was using into its own system, and starts execution of OLTP 32 to complete the system switching process, so that the processing of the active system can be resumed immediately.
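The standby-side behaviour for an ordinary system down can be sketched as follows. This is only an illustration under assumptions: the class, its fields, and the resource names are hypothetical, and the real resources are not shown in the patent.

```python
from dataclasses import dataclass, field


@dataclass
class StandbySystem:
    """Hypothetical model of the standby system (monitoring program 31, OLTP 32)."""
    resources: set = field(default_factory=set)
    oltp_running: bool = False  # OLTP 32 waits just before the start of execution
    log: list = field(default_factory=list)

    def on_system_down(self, active_resources: set) -> None:
        """React to a system-down notification from the active system."""
        self.resources |= active_resources       # take over the active system's resources
        self.log.append("resources taken over")
        self.oltp_running = True                 # start execution of OLTP 32
        self.log.append("OLTP 32 started")


standby = StandbySystem()
standby.on_system_down({"database", "transaction journal"})  # resource names are made up
```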

Suppose now that a failure has occurred in the active FNP 6, that a switching instruction has been issued to line switching device 8 via FNP 7 under the control of OS 14, and that FNP 7 is connected as the active FNP. If a further failure then occurs in FNP 7, no spare FNP remains, so depending on how far the failure extends, either operation continues with some lines degraded or the system goes online-down and operation stops.
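The outcomes described in this paragraph, before the invention's additional control is applied, can be summarized in a short Python sketch. The function name, parameters, and the rule that an online down corresponds to all lines being lost are assumptions made for illustration.

```python
from enum import Enum


class Outcome(Enum):
    SWITCHED_TO_SPARE = "switched to the spare FNP"
    DEGRADED = "operation continues with some lines degraded"
    ONLINE_DOWN = "online down, operation stops"


def handle_fnp_failure(spare_available: bool, total_lines: int, failed_lines: int) -> Outcome:
    """Classify the result of an FNP failure without system switching."""
    if spare_available:
        return Outcome.SWITCHED_TO_SPARE  # e.g. FNP 6 fails and FNP 7 takes over
    if failed_lines < total_lines:
        return Outcome.DEGRADED           # part of the lines is still usable
    return Outcome.ONLINE_DOWN            # assumption: every line has been lost
```

For example, `handle_fnp_failure(False, 100, 40)` yields `Outcome.DEGRADED`, which is the situation the invention goes on to address.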

However, some failures are recognized as FNP failures even though the FNP itself is not faulty, such as an intermittent failure in the FNP common section or a failure in the interface between the CPU and the FNP. For this reason, in the present invention the system is controlled as follows when FNP 7 fails.

When a failure occurs in FNP 7, OS 14 detects it in the course of communicating with FNP 7, determines the extent of the failure's impact, and notifies monitoring program 11. On receiving the FNP 7 failure notification from OS 14, monitoring program 11 determines whether the lines have degraded beyond a predetermined amount. A failure in the interface between CPU 2 and FNP 7 is recognized as either a failure of all lines or a failure of some lines.

If the degradation is within the predetermined number of lines, operation continues as is. If the lines have degraded beyond that number, the system is judged unable to fulfil its function in operation, and monitoring program 11 instructs OS 14 to switch systems. On receiving the instruction, OS 14 stops OLTP 12 and sends a system-down notification to OS 34 of the standby system. On receiving the notification, OS 34 informs monitoring program 31, which takes the resources (not shown) used by the active system into its own system, initializes FNP 6 and FNP 7 once, and reconnects FNP 7 to the external terminals. It then starts execution of OLTP 32 to complete the system switching process, begins processing messages from the external terminals again, and system operation continues.
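The decision and the resulting switchover sequence can be sketched in Python as below. This is a minimal illustration, not the patented implementation: the function names are hypothetical, and treating a count exactly equal to the threshold as "continue" is an assumption, since the text does not settle that boundary case.

```python
def should_switch(degraded_lines: int, threshold: int) -> bool:
    """True when more lines have degraded than the predetermined number allows."""
    return degraded_lines > threshold  # assumption: equality means "continue"


def handle_fnp7_failure(degraded_lines: int, threshold: int) -> list[str]:
    """Return the ordered steps taken after OS 14 reports an FNP 7 failure."""
    if not should_switch(degraded_lines, threshold):
        return ["continue operation on the active system"]
    return [
        "monitoring program 11: instruct OS 14 to switch systems",
        "OS 14: stop OLTP 12",
        "OS 14: send system-down notification to OS 34",
        "OS 34: notify monitoring program 31",
        "monitoring program 31: take over the active system's resources",
        "monitoring program 31: initialize FNP 6 and FNP 7",
        "monitoring program 31: reconnect FNP 7 to the external terminals",
        "monitoring program 31: start OLTP 32",
    ]


steps = handle_fnp7_failure(degraded_lines=9, threshold=5)
```

Returning the steps as data rather than performing them keeps the sketch free of any assumed hardware interface while preserving the order given in the text.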

If the lines had been degraded because of a failure in the interface between CPU 2 and FNP 7, those lines become usable via the path from CPU 4 to FNP 7, and operation simply continues.

[Effect of the Invention]

As described above, even when an FNP failure occurs that does not directly bring the system down, the present invention performs system switching, initialization of the communication control devices, and reconnection with the external terminals if more than a certain number of lines have degraded. This rescues cases in which system operation would otherwise stop even though the communication control device itself has not failed, such as a failure in the interface between the central processing unit and a communication control device, and allows normal system operation to continue.

FIG. 1 is a system configuration diagram of an embodiment to which the present invention is applied.

1, 3: main storage device; 2, 4: central processing unit; 5: intersystem communication device; 6, 7: communication control device; 8: line switching device; 11, 31: monitoring program; 12, 32: online transaction processing program; 13, 33: communication control program; 14, 34: operating system.

Claims (1)

[Claims] A line system failure handling method in a hot standby system, in which the central processing unit of an active system and the central processing unit of a standby system are connected by an intersystem communication device; communication control devices, which are cross-connected to the central processing units of the active and standby systems and each function as either the active or the spare device, are each connected to a line switching device; the standby system holds the same online transaction processing program as the active system in a waiting state immediately before the start of execution; and the central processing unit of the active system can communicate with the central processing unit of the standby system through the intersystem communication device by using monitoring means, characterized in that: after a failure of the active communication control device has caused a switch to the spare communication control device, the monitoring means monitors the degradation of the lines of the spare communication control device and, if the lines have degraded to a predetermined number or fewer, stops the online transaction processing program of the active system and sends a system-down notification to the central processing unit of the standby system through the intersystem communication device; and the central processing unit of the standby system starts the waiting online transaction processing program of the standby system, initializes the line status of the spare communication control device that had been in use by the central processing unit of the active system, and then resumes system operation with the systems switched.
JP2066551A 1990-03-16 1990-03-16 Line system fault processing system in hot stand-by system Pending JPH03266133A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2066551A JPH03266133A (en) 1990-03-16 1990-03-16 Line system fault processing system in hot stand-by system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2066551A JPH03266133A (en) 1990-03-16 1990-03-16 Line system fault processing system in hot stand-by system

Publications (1)

Publication Number Publication Date
JPH03266133A true JPH03266133A (en) 1991-11-27

Family

ID=13319163

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2066551A Pending JPH03266133A (en) 1990-03-16 1990-03-16 Line system fault processing system in hot stand-by system

Country Status (1)

Country Link
JP (1) JPH03266133A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6377584B1 (en) 1997-10-13 2002-04-23 Fujitsu Limited Transmission equipment and a load-distribution transmitting method in the transmission equipment

Similar Documents

Publication Publication Date Title
US4823256A (en) Reconfigurable dual processor system
CN110740066B (en) Seat-invariant cross-machine fault migration method and system
JPH03266133A (en) Line system fault processing system in hot stand-by system
JPH08235132A (en) Hot stand-by control method for multiserver system
JPH044444A (en) Communication control system
JPH07282022A (en) Multiprocessor system
JPH03266131A (en) Power source state decision system for multiple system
JPH07321799A (en) Input output equipment management method
JPH07200334A (en) Duplicate synchronization operation system
JPS597982B2 (en) Restart method in case of system failure of computer system
JPH032957A (en) Starting process system for composite computer system
JPH04242467A (en) Combined computer system
JP2000057108A (en) Switching test method for duplex computer system, monitoring device for it, and computer readable recording medium
JPS6139138A (en) Multiplexing system
JPH0730616A (en) Communication control equipment
JPS6242252A (en) Switching system for communication controller
JP2795246B2 (en) Failure recovery device at the time of interrupt processing in redundant memory system
JPH0277943A (en) System recovering method
JPH10187473A (en) Duplex information processor
JPH01234966A (en) Fault detecting system for multiplexed computer system
CN117785568A (en) Dual-master dual-machine hot standby method and device
JPH02196341A (en) Fault restoring system for information processor
JPS63266549A (en) Automatic system switching system
JPS6375843A (en) Abnormality monitor system
JPS61167245A (en) Spare switching system