JPS63231538A

JPS63231538A - I/O device failure detection method

Info

Publication number: JPS63231538A
Application number: JP62064158A
Authority: JP
Inventors: Takeshi Nakajima; 猛中嶋
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1987-03-20
Filing date: 1987-03-20
Publication date: 1988-09-27

Abstract

PURPOSE: To detect failures early without placing an excessive load on the system, by automatically activating an access test task only when the access frequency to an input/output device is lower than a reference value. CONSTITUTION: A central processing unit 1 is provided with a use frequency measuring means 6, which measures the use frequency of the active-side input/output device 3, and a comparing means 7, which compares the measured use frequency with a reference value. Only when the measured use frequency is lower than the reference value does a failure detecting means 5 operate the input/output devices 3 and 3a, transfer data, and detect the presence or absence of a failure. Because the failure detecting means 5 is activated and performs input/output operations only when the use frequency of the input/output device 3 is lower than the reference value, failure-detection accesses are made at times when there are few general processing requests. The time from the occurrence of a failure in the standby-side input/output device until its detection can therefore be shortened without increasing the burden on the system.

Description

[Detailed Description of the Invention]

[Summary]

This is a failure detection method for input/output devices. It addresses the problem that detection of a failure in the standby-side input/output device is delayed because the device is accessed infrequently or because accesses are concentrated on only part of it. An access test task is started automatically only when the access frequency to the input/output device is below a reference value, which enables early failure detection without placing extra load on the system.

[Industrial application field]

The present invention relates to a failure detection method for input/output devices capable of random access.

In computer systems generally, and in the field of electronic switching in particular, all devices are duplicated (hot standby) in order to achieve 24-hour non-stop operation. The standby-side devices are important, yet because they are accessed infrequently, detection of their failures tends to be delayed.

For this reason, early detection of failures is required for devices, or parts within devices, that are accessed infrequently.

[Conventional technology]

FIG. 6 shows an example of a conventional failure detection method.

In FIG. 6, the task scheduler 61 activates the I/O access routine 62 periodically, for example every 8 ms. Periodic activation is used in order to preserve real-time behavior in switching processing.

At each periodic activation, if there is a processing request in the wait queue 63, the duplicated random access input/output devices 64 and 64a (external storage devices or the like) are accessed, and data is written or read in accordance with the processing request.

Apart from the accesses made by the I/O access routine 62, the active-side and standby-side devices are checked periodically for failures. For this purpose the task scheduler 61 activates the failure detection access task 65, for example once a day, at a fixed time when the system load is expected to be light, such as the middle of the night or early morning. The task places access requests covering the entire area of the active-side and standby-side input/output devices into the wait queue 63. On its periodic activations, the I/O access routine 62 picks up these access requests for the entire area and determines whether the active-side or standby-side device has failed.
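The conventional arrangement of FIG. 6 can be pictured with a small model. The Python sketch below is illustrative only and is not taken from the patent: the class name `ConventionalScheduler`, the constant `CHECK_HOUR`, and the use of an hour-of-day trigger are assumptions, chosen to show that the full-area check is tied to a fixed time regardless of the actual load.

```python
from collections import deque

CHECK_HOUR = 3  # assumed fixed hour; the text only says "middle of the night or early morning"


class ConventionalScheduler:
    """Toy model of FIG. 6: a fixed-time, once-a-day failure check (hypothetical names)."""

    def __init__(self, num_areas):
        self.wait_queue = deque()    # stands in for wait queue 63
        self.num_areas = num_areas   # assumed granularity of "the entire area"
        self.last_check_day = None

    def tick(self, day, hour):
        """One periodic activation (every 8 ms in the text's example)."""
        # Failure detection access task 65: fires once per day at the fixed
        # hour, whether or not general requests are already waiting.
        if hour == CHECK_HOUR and self.last_check_day != day:
            self.last_check_day = day
            for area in range(self.num_areas):
                self.wait_queue.append(("check", area))
        # I/O access routine 62: serve one queued request per activation.
        if self.wait_queue:
            return self.wait_queue.popleft()
        return None
```

In this model the check requests share the same queue as general requests, so a burst of general traffic at the fixed hour competes directly with the check.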

Reference numeral 66 denotes the general tasks that generate the processing requests placed in the wait queue 63.

[Problem that the invention seeks to solve]

In the conventional method described above, the failure detection access task 65 is activated only at a predetermined time, and the number of activations per day is also limited. Consequently, even if a failure has occurred in the standby-side device, it takes time to detect it. If the active-side device fails in the meantime and operation is switched to the standby side, the standby-side device may turn out to be faulty as well.

If the number of daily activations of the failure detection access task 65 is increased to avoid this, the system load rises accordingly, and processing requests from the general tasks 66 are restricted.

In addition, the activation time of the failure detection access task 65 is fixed at night or in the early morning, when the system load is expected to be light. Even at those times, however, general processing requests can become concentrated, for example in an emergency, so the fixed time is not necessarily a time when the system load is actually light.

[Means for solving problems]

FIG. 1 is a block diagram showing the principle of the present invention.

In FIG. 1, the early failure detection method for input/output devices according to the present invention applies to a system in which a central processing unit 1, a main storage device 2, and duplicated input/output devices 3 and 3a capable of random access are connected via a bus 4, and in which the central processing unit 1 includes a failure detection means 5 that detects failures of the input/output device 3. The central processing unit 1 is provided with a usage frequency measuring means 6, which measures the frequency of use of the active-side input/output device 3, and a comparison means 7, which compares the measured frequency of use with a reference value. Only when the measured frequency of use is below the reference value does the failure detection means 5 operate the input/output devices 3 and 3a, transfer data, and detect the presence or absence of a failure.
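The gating principle of FIG. 1 can be written in a few lines. This is a minimal Python sketch under stated assumptions, not the patented implementation: the function name `detect_failures` and the representation of each device as a callable that performs a test transfer and returns `True` on success are hypothetical.

```python
def detect_failures(devices, measured_frequency, reference_value):
    """Run the failure check only when usage is below the reference value.

    devices: dict mapping a device name to a zero-argument callable that
             performs a test data transfer and returns True on success
             (an assumed stand-in for failure detection means 5).
    Returns None when the check is skipped, otherwise the list of device
    names whose test transfer failed.
    """
    # Comparison means 7: skip the check while the system is busy.
    if measured_frequency >= reference_value:
        return None
    # Failure detection means 5: exercise both the active and standby device.
    return [name for name, transfer_ok in devices.items() if not transfer_ok()]
```

For example, with `devices = {"active": lambda: True, "standby": lambda: False}`, a measured frequency of 2 against a reference value of 5 runs the check and reports the standby device, while a measured frequency of 10 skips the check entirely.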

[Operation]

Because the failure detection means 5 is activated and performs input/output operations only when the frequency of use of the input/output device 3 is below the reference value, accesses for failure detection are made at times when there are few general processing requests. The time from the occurrence of a failure in the standby-side input/output device until its detection can therefore be shortened without increasing the burden on the system.

[Embodiment]

FIG. 2 is a block diagram showing an early failure detection method for input/output devices according to one embodiment of the present invention.

In FIG. 2, 21 is a task scheduler, 22 is an I/O access routine, 23 is a wait queue, 24 is the active-side random access input/output device, 24a is the standby-side random access input/output device, 25 is a failure detection access task, and 26 denotes general tasks. These are the same as the corresponding elements of the conventional method in FIG. 6. In addition, 27 is a processing request counter, 28 is a periodic activation table, and 29 is an access address information table.

FIG. 3 is a diagram showing an example of the contents of the periodic activation table 28.

In FIG. 3, the horizontal axis indicates the task type, and the values 1, 0, 0, ... in the top row are flags indicating whether an activation request exists for tasks A, B, C, ... respectively. In the figure, the flag of task A is "1", indicating that there is an activation request for task A. The vertical axis indicates time; for example, the activation cycle of task A is 1001. That is, when the activation flag of task A becomes "1", task A is activated at times t1 and t4. Similarly, the activation cycle of task B is 0101..., and the activation cycle of task C is 1010....
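One way to picture the table of FIG. 3 is as a per-task request flag plus a per-task bit pattern over time slots. The Python sketch below is an illustration under assumptions: the class name `PeriodicActivationTable`, the string encoding of the patterns, and the wrap-around of a pattern once its last slot is reached are not stated in the source.

```python
class PeriodicActivationTable:
    """Toy model of periodic activation table 28 (hypothetical structure)."""

    def __init__(self, patterns):
        # patterns: task name -> string of '0'/'1', one character per time slot,
        # e.g. {"A": "1001", "B": "0101", "C": "1010"} as in FIG. 3.
        self.patterns = patterns
        # Top row of FIG. 3: one activation-request flag per task.
        self.request_flags = {task: False for task in patterns}

    def request(self, task):
        """Set the activation-request flag of a task to '1'."""
        self.request_flags[task] = True

    def tasks_to_run(self, slot):
        """Return the tasks that are both requested and marked '1' in this slot."""
        due = []
        for task, pattern in self.patterns.items():
            # Assumption: the pattern repeats after its last slot.
            if self.request_flags[task] and pattern[slot % len(pattern)] == "1":
                due.append(task)
        return due
```

With the patterns of FIG. 3 and only task A requested, the tasks due in slots 0 to 3 are `["A"]`, `[]`, `[]`, `["A"]`, matching the 1001 pattern described in the text.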

FIG. 4 is a flowchart showing the operation of the I/O access routine 22, and FIG. 5 is a flowchart showing the operation of updating the periodic activation table 28.

The operation of the embodiment of FIG. 2 will now be described with reference to FIGS. 3 to 5.

The task scheduler 21 activates the I/O access routine 22 periodically, for example every 8 ms. At each periodic activation, if there is an access request from a general task 26 in the wait queue 23, the I/O access routine 22 takes out that processing request and accesses the random access input/output devices 24 and 24a (external storage devices or the like), and data is written or read in accordance with the processing request. Whenever a processing request is executed, the processing request counter 27 is incremented (see steps 41 to 43 in FIG. 4).
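The per-activation behavior of FIG. 4 can be sketched as follows. This Python model is an assumption-laden illustration: the class name `IOAccessRoutine`, the tuple format of a request, the use of dictionaries as stand-ins for the devices, and the choice to write to both devices but read from the active one are all hypothetical, and the text does not spell out which action belongs to which of steps 41 to 43.

```python
from collections import deque


class IOAccessRoutine:
    """Toy model of I/O access routine 22 with processing request counter 27."""

    def __init__(self, active, standby):
        self.wait_queue = deque()   # stands in for wait queue 23
        self.request_counter = 0    # stands in for processing request counter 27
        self.active = active        # dict used as a stand-in for device 24
        self.standby = standby      # dict used as a stand-in for device 24a

    def on_periodic_activation(self):
        """Serve at most one queued request; return True if one was executed."""
        # Check the wait queue for a request from a general task.
        if not self.wait_queue:
            return False
        op, address, data = self.wait_queue.popleft()
        # Execute the request (assumed: writes go to both devices).
        if op == "write":
            self.active[address] = data
            self.standby[address] = data
        else:
            self.active.get(address)
        # Count the executed request so that usage can be measured later.
        self.request_counter += 1
        return True
```

The counter only advances when a request is actually executed, so idle activations leave it unchanged and a low count over an interval indicates a lightly loaded system.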

Meanwhile, as shown in FIG. 5, the I/O access routine 22 reads the value of the processing request counter 27 at fixed intervals (step 51) and determines whether this value is at or below the reference value (step 52). If it is, the routine clears the counter 27 (step 53) and then sets to "1" the activation flag of the failure detection access task, for example task A, in the periodic activation table 28 (step 54).
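Steps 51 to 54 of FIG. 5 translate almost directly into code. The Python sketch below is a minimal illustration: the function name `update_activation_flag` and the dictionary of flags are assumptions, and because the text does not say what happens when the counter exceeds the reference value, the sketch simply leaves the counter and flag unchanged in that case.

```python
def update_activation_flag(counter, reference_value, flags, task="A"):
    """Run one pass of the FIG. 5 update and return the new counter value.

    counter: current value of processing request counter 27
    flags:   dict of activation-request flags (top row of table 28), mutated in place
    task:    name of the failure detection access task ("A" in the text's example)
    """
    value = counter                  # step 51: read the counter
    if value <= reference_value:     # step 52: at or below the reference value?
        counter = 0                  # step 53: clear the counter
        flags[task] = True           # step 54: request the failure detection task
    # Assumption: when the value is above the reference, nothing is changed here.
    return counter
```

Note that the embodiment tests "at or below" the reference value, whereas the statement of principle speaks of the frequency being "below" it; the sketch follows the embodiment.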

When the task scheduler 21 detects that the activation flag of the failure detection access task A in the periodic activation table 28 is on, it activates the failure detection access task 25 in accordance with the activation cycle of task A in the table. Each time the failure detection access task 25 is activated, the address taken from the access address information table 29 is advanced in sequence, so that the entire area of the input/output devices 24 and 24a is accessed and the presence or absence of a failure is detected.

In this embodiment, failure detection processing is executed only when the system load is light, so failures in the input/output devices can be detected early without increasing the burden on the system.

[Effect of the invention]

As is clear from the above description, according to the present invention the failure detection means is operated only when the frequency of use of the input/output device is below the reference value. Failure detection for the input/output device is therefore carried out promptly whenever the system load is light, and high system reliability is achieved.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing the principle of the present invention. FIG. 2 is a block diagram showing an early failure detection method for input/output devices according to one embodiment of the present invention. FIG. 3 is a diagram showing an example of the contents of the periodic activation table in FIG. 2. FIG. 4 is a flowchart showing the operation of the I/O access routine. FIG. 5 is a flowchart showing the operation of updating the periodic activation table. FIG. 6 is a block diagram showing an example of a conventional failure detection method.

1: central processing unit, 2: main storage device, 3: input/output device, 4: bus, 5: failure detection means, 6: usage frequency measuring means, 7: comparison means.

Claims

[Claims]

A failure detection method for input/output devices in a system in which a central processing unit (1), a main storage device (2), and duplicated input/output devices (3, 3a) capable of random access are connected via a bus (4), and in which the central processing unit (1) is equipped with a failure detection means (5) for detecting a failure of the input/output device (3), wherein the central processing unit (1) is provided with a usage frequency measuring means (6) for measuring the frequency of use of the input/output devices (3, 3a) and a comparison means (7) for comparing the measured frequency of use with a reference value, and wherein, only when the measured frequency of use is below the reference value, the failure detection means (5) operates the input/output devices (3, 3a) to transfer data and detect the presence or absence of a failure.