JPS62172437A

JPS62172437A - High reliability system

Info

Publication number: JPS62172437A
Application number: JP61014263A
Authority: JP
Inventors: Shizuo Mihashi; 三橋　鎮雄
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1986-01-24
Filing date: 1986-01-24
Publication date: 1987-07-29

Abstract

PURPOSE: To use real machines efficiently by having each real computer, when the partner real computer goes down, connect to itself the I/O devices that were attached to the partner and have a virtual computer under its own control take over the work that was being handled by a virtual computer on the partner. CONSTITUTION: When real computer 1 goes down, the information periodically sent from real computer 1 to real computer 2 stops, so the virtual computer monitor 21 on the real computer 2 side can recognize that real computer 1 is down. The system switching unit 23 on the real computer 2 side then issues a command to switch I/O devices 5 and 6 to the real computer 2 side. The virtual computer monitor 21 allocates I/O devices 5 and 6 to virtual computer 25-2, informs virtual computer 25-2 that it is to operate as the main system for those devices, and starts it. Virtual computer 25-2 initializes its communication control program, puts work A'' into an operable state, and resumes the work with virtual computer 25-2 as the main system. In this way, a high-reliability system that uses real machines efficiently is obtained.

Description

[Detailed Description of the Invention] [Summary] A plurality of virtual computers is implemented on each of two real computers. If one real computer fails, the other real computer takes over its work; if a virtual computer on one real computer fails, another virtual computer on the same real computer takes over its work. The result is a high-reliability system.

[Industrial application field]

The present invention relates to a high-reliability system having a real computer A on which a plurality of virtual computers are built and a real computer B on which a plurality of virtual computers are built.

[Prior art and problems]

Standby systems, parallel systems, and the like are known as high-reliability computer systems. In a standby system, the active system performs online work while the standby system runs batch jobs; if the active system fails, the roles are exchanged, and the former standby system becomes the active system and carries out the online work. In a parallel system, two computer systems perform the same work, so the work continues without interruption even if one of them fails.

In conventional high-reliability systems, only one operating system exists on each computer, so as many real machines are required as there are systems. The drawback is that expensive real machines cannot be used efficiently.

[Purpose of the invention]

The present invention is based on the above considerations, and its object is to provide a high-reliability system that can use real machines more efficiently than conventional high-reliability systems.

[Means to achieve the purpose]

To that end, the high-reliability system of the present invention comprises two real computers, a communication path installed between the two real computers, input/output devices, and an input/output monitoring device that connects a designated input/output device to a designated real computer in accordance with a switching command from either real computer. Each real computer has a virtual computer monitor; a plurality of virtual computers; monitoring means for detecting a system down of a virtual computer built on its own real computer; communication function means for notifying the partner real computer, via the communication path, that its own real computer is not down; and system switching means for sending a switching command to the input/output monitoring device when it recognizes that the partner real computer has gone down. The virtual computer monitor is configured so that, when a virtual computer under its control goes down, it has another virtual computer under its control perform the work that the failed virtual computer was doing, and so that, when the partner real computer goes down, the input/output devices that were connected to the partner real computer are connected to its own real computer and a virtual computer under its control performs the work that was being done by a virtual computer built on the partner real computer.

[Embodiments of the invention]

Hereinafter, the present invention will be described with reference to the drawing. The figure is a block diagram of one embodiment of the present invention. In the figure, 1 and 2 are real computers, 3 is an inter-CPU communication device, 4 is an I/O monitoring device, 5 is a display, 6 is a printer, 7 is a shared DASD, 11 is a virtual computer monitor, 12 is an inter-CPU communication function unit, 13 is a system switching unit, 14 is an intra-CPU monitoring unit, 15-1 to 15-3 are virtual computers, 21 is a virtual computer monitor, 22 is an inter-CPU communication function unit, 23 is a system switching unit, 24 is an intra-CPU monitoring unit, and 25-1 to 25-3 are virtual computers.

The virtual computer monitor 11, inter-CPU communication function unit 12, system switching unit 13, intra-CPU monitoring unit 14, and virtual computers 15-1 to 15-3 are each implemented in software, and a program exists for each part. The hardware and software on the real computer 2 side are the same as those on the real computer 1 side. The virtual computer monitor 11 is what is called an AVM (Advanced Virtual Machine) and controls the entire virtual computer system. The inter-CPU communication function unit 12 notifies the partner, at fixed intervals and via the inter-CPU communication device 3, that real computer 1 is operating normally. Naturally, this communication stops when real computer 1 goes down. When the partner real computer 2 goes down, the system switching unit 13 sends the I/O monitoring device 4 a command to physically connect I/O resources 5, 6, and so on to the real computer 1 side.
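The periodic "operating normally" notification and its absence can be sketched as a simple timeout check. This is only an illustrative sketch, not the patent's implementation: the class name `PeerHeartbeatWatcher`, the `timeout_s` threshold, and the rule "silence longer than the threshold means down" are all assumptions, since the source states only that the notification is sent at fixed intervals and stops when the real computer goes down.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PeerHeartbeatWatcher:
    """Hypothetical receiver side of the inter-CPU 'operating normally' notices."""

    timeout_s: float  # assumed threshold: longer silence is treated as "peer down"
    last_seen: float = field(default_factory=time.monotonic)

    def on_heartbeat(self, now: Optional[float] = None) -> None:
        # Called whenever a notification arrives from the partner real computer.
        self.last_seen = time.monotonic() if now is None else now

    def peer_is_down(self, now: Optional[float] = None) -> bool:
        # The partner is considered down once the notifications have stopped
        # for longer than the assumed timeout.
        now = time.monotonic() if now is None else now
        return (now - self.last_seen) > self.timeout_s
```

In such a sketch, the system switching unit would poll `peer_is_down()` and issue its switching command to the I/O monitoring device only once it returns true.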

The I/O monitoring device 4 physically switches I/O resources in accordance with commands from the system switching unit 13 or 23.

The intra-CPU monitoring unit 14 monitors virtual computers 15-1, 15-2, and 15-3. Although not shown, a monitoring table exists. When a virtual computer 15-i (i = 1, 2, 3) detects its own failure, it records that fact in the monitoring table; in addition, at fixed intervals it writes information indicating that it is alive, together with the time, into the monitoring table. The intra-CPU monitoring unit 14 reads the contents of the monitoring table to detect a system down of virtual computers 15-1 to 15-3.
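A minimal sketch of such a monitoring table follows. The names (`MonitoringTable`, `detect_down`, `max_silence`) are hypothetical, and treating a stale "alive" timestamp as a system down is an assumption; the source says only that each virtual computer records its own failure and its timestamped liveness, and that the monitoring unit reads the table.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class TableEntry:
    last_alive: float     # time of the most recent "alive" record
    failed: bool = False  # set when the virtual computer reports its own failure


class MonitoringTable:
    """Hypothetical table written by the virtual computers, read by the monitor."""

    def __init__(self, vm_ids: Iterable[str], now: float = 0.0) -> None:
        self.entries: Dict[str, TableEntry] = {
            vm: TableEntry(last_alive=now) for vm in vm_ids
        }

    def record_alive(self, vm: str, now: float) -> None:
        self.entries[vm].last_alive = now

    def record_failure(self, vm: str) -> None:
        self.entries[vm].failed = True


def detect_down(table: MonitoringTable, now: float, max_silence: float) -> List[str]:
    # A virtual computer counts as down if it flagged a failure itself, or
    # (assumed rule) if its last "alive" record is older than max_silence.
    return sorted(
        vm
        for vm, entry in table.entries.items()
        if entry.failed or (now - entry.last_alive) > max_silence
    )
```

The stale-timestamp rule is what lets the monitor catch a virtual computer that hangs without being able to report its own failure.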

Now suppose that I/O devices 5 and 6 are allocated to virtual computer 15-1 and that virtual computer 15-1 is performing work A (for example, banking work). In this state, the other virtual computers on the real computer 1 side and the virtual computers on the real computer 2 side are each performing other work. When virtual computer 15-1 goes down, the intra-CPU monitoring unit 14 detects the system down and notifies the virtual computer monitor 11. On receiving this notification, the virtual computer monitor 11 allocates I/O devices 5 and 6 to virtual computer 15-2 and passes control to virtual computer 15-2. Devices are allocated by issuing an attach (ATTACH) command. Virtual computer 15-2 initializes its communication control program, puts work A' into an operable state, and resumes the work with virtual computer 15-2 as the main system. Although not shown, communication between the OS of a virtual computer and the display and other devices goes through the communication control program. Work A and work A' are the same. In the illustrated example, the shared DASD 7 is accessible from virtual computers 15-1, 15-2, and 25-2 and is not subject to switching.
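The takeover within one real computer can be sketched as follows. This is an illustration under assumptions rather than the patent's method: `VirtualMachineMonitor`, `attach`, and `on_vm_down` are hypothetical names, `attach` merely stands in for the ATTACH command, and choosing the first surviving virtual computer as successor is an assumed policy (the source simply uses 15-2 in its example).

```python
from typing import Dict, Iterable, List, Set


class VirtualMachineMonitor:
    """Hypothetical stand-in for virtual computer monitor 11."""

    def __init__(self, vm_ids: Iterable[str]) -> None:
        self.vm_ids: List[str] = list(vm_ids)
        self.down: Set[str] = set()
        self.device_owner: Dict[str, str] = {}  # I/O device -> owning virtual computer

    def attach(self, device: str, vm: str) -> None:
        # Stands in for issuing the ATTACH command for one device.
        self.device_owner[device] = vm

    def on_vm_down(self, failed_vm: str) -> str:
        # Invoked when the intra-CPU monitoring unit reports a system down.
        self.down.add(failed_vm)
        # Assumed policy: the first virtual computer that is still up takes over.
        successor = next((vm for vm in self.vm_ids if vm not in self.down), None)
        if successor is None:
            raise RuntimeError("no surviving virtual computer on this real computer")
        for device, owner in list(self.device_owner.items()):
            if owner == failed_vm:
                self.attach(device, successor)
        return successor
```

After `on_vm_down` returns, the successor would still need to initialize its communication control program before resuming the work, which this sketch does not model.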

When real computer 1 goes down, the information periodically sent from real computer 1 to real computer 2 (information indicating that the system is operating normally) stops, so the virtual computer monitor 21 on the real computer 2 side can recognize that real computer 1 is down. When real computer 1 goes down, the system switching unit 23 on the real computer 2 side issues a command to switch I/O devices 5 and 6 to the real computer 2 side. The virtual computer monitor 21 then allocates I/O devices 5 and 6 to virtual computer 25-2, informs virtual computer 25-2 that it is to operate as the main system, and starts virtual computer 25-2. Virtual computer 25-2 initializes its communication control program, puts work A'' into an operable state, and resumes the work with virtual computer 25-2 as the main system. Work A and work A'' are the same.
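The cross-machine takeover can be sketched in two steps: the switching command to the I/O monitoring device, then the allocation to the standby virtual computer. All names here (`IOMonitoringDevice`, `take_over_from_peer`, the routing dictionary) are hypothetical, and the sketch assumes the survivor claims every device that was routed to the failed real computer.

```python
from typing import Dict, Iterable, List


class IOMonitoringDevice:
    """Hypothetical stand-in for device 4: routes each I/O device to a real computer."""

    def __init__(self, routing: Dict[str, int]) -> None:
        self.routing: Dict[str, int] = dict(routing)  # I/O device -> real computer number

    def switch(self, devices: Iterable[str], target: int) -> None:
        # Models the physical switch performed on a switching command.
        for device in devices:
            self.routing[device] = target


def take_over_from_peer(
    io_switch: IOMonitoringDevice, own_id: int, peer_id: int, standby_vm: str
) -> Dict[str, str]:
    # Step 1 (system switching unit): move the failed peer's devices to this side.
    moved: List[str] = sorted(
        device for device, host in io_switch.routing.items() if host == peer_id
    )
    io_switch.switch(moved, own_id)
    # Step 2 (virtual computer monitor): allocate them to the standby virtual computer.
    return {device: standby_vm for device in moved}
```

For the example in the text, real computer 2 would call `take_over_from_peer(io_switch, 2, 1, "25-2")` once it has recognized that real computer 1 is down.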

[Effect of the invention]

As is clear from the above description, the present invention provides a high-reliability system that can use real machines efficiently.

[Brief Description of the Drawings] The figure is a block diagram of one embodiment of the present invention. 1 and 2: real computers; 3: inter-CPU communication device; 4: I/O monitoring device; 5: display; 6: printer; 7: shared DASD; 11: virtual computer monitor; 12: inter-CPU communication function unit; 13: system switching unit; 14: intra-CPU monitoring unit; 15-1 to 15-3: virtual computers; 21: virtual computer monitor; 22: inter-CPU communication function unit; 23: system switching unit; 24: intra-CPU monitoring unit; 25-1 to 25-3: virtual computers.

Claims

[Claims]

A high-reliability system comprising two real computers, a communication path installed between the two real computers, input/output devices, and an input/output monitoring device that connects a designated input/output device to a designated real computer in accordance with a switching command from either real computer, wherein each real computer has a virtual computer monitor, a plurality of virtual computers, monitoring means for detecting a system down of a virtual computer built on its own real computer, communication function means for notifying the partner real computer via the communication path that its own real computer is not down, and system switching means for sending a switching command to the input/output monitoring device when it recognizes that the partner real computer has gone down, and wherein the virtual computer monitor is configured so that, when a virtual computer under its control goes down, it has another virtual computer under its control perform the work that the failed virtual computer was doing, and so that, when the partner real computer goes down, the input/output devices that were connected to the partner real computer are connected to its own real computer and a virtual computer under its control performs the work that was being done by a virtual computer built on the partner real computer.