JPH0512039A

JPH0512039A - Trouble detecting mechanism

Info

Publication number: JPH0512039A
Application number: JP3189338A
Authority: JP
Inventors: Yukihiro Matsumoto; 行礼松本
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1991-07-03
Filing date: 1991-07-03
Publication date: 1993-01-22

Abstract

PURPOSE: To prevent tasks from waiting indefinitely for a message, and to quickly release the waiting state of a task when trouble occurs, by monitoring whether message exchange between tasks is executed within a certain time. CONSTITUTION: A timer value C and event information ID are stored in a user area Y, and a queue area Q is provided with a timer counter 5 and an event flag 6. A response monitoring unit 1 of an operating system OS is provided with a monitoring table 10, which is set when a task requests trouble monitoring. The trouble monitoring time is determined by the timer value C. Each item of the monitoring table 10 consists of a queue area address Qa and event information ID reported from each task. In response to requests from tasks A and B, the queue area Q is referred to and the presence or absence of a response is monitored for the certain time; if no response is obtained, the occurrence of trouble is reported to the requesting tasks A and B to prevent them from continuing to wait for a response.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a failure detection mechanism for detecting a failure that occurs when a plurality of tasks exchange messages with each other to proceed with processing.

[0002]

[Prior Art] In the OSI (Open Systems Interconnection) specification, the communication protocol is divided into seven layers. Predetermined protocol processing is performed for each layer. To perform protocol processing quickly and smoothly, a task is prepared for each protocol process, and the tasks exchange messages with one another to complete the protocol processing. An operating system is provided to control and manage the plurality of tasks, and each task exchanges messages via this operating system.

[0003] FIG. 2 is an explanatory diagram of conventional protocol processing. The figure shows the case of waiting for a response (reception of a message) after performing a notification (transmission of a message). In the figure, task A first sends a message M1 to task B via the operating system OS (step S1). Task A then shifts to a waiting state in which it waits to receive a message M2 (step S2).

[0004] Meanwhile, when task B receives the message M1 (step S11), it executes predetermined processing (protocol processing) based on the message M1 and generates the message M2 (step S12). It then transmits the message M2 (step S13) and shifts to a waiting state in which it waits for a new message from task A (step S14). When task A receives the message M2 (step S3), it executes predetermined processing based on the message M2 and returns to step S1.
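The exchange in steps S1 to S3 and S11 to S14 can be sketched with two threads and two mailboxes. This is only an illustration: the mailbox names `to_task_b` and `to_task_a`, the `None` stop sentinel, and the string form of the messages are assumptions, and Python's `queue.Queue` merely stands in for the message service of the operating system OS.

```python
import queue
import threading

# Hypothetical mailboxes standing in for the OS message service.
to_task_b = queue.Queue()
to_task_a = queue.Queue()
log = []  # replies processed by task A, in order


def task_a(rounds):
    for i in range(rounds):
        to_task_b.put(f"M1#{i}")   # S1: send message M1 to task B
        reply = to_task_a.get()    # S2/S3: wait, then receive message M2
        log.append(reply)          # process M2, then go back to S1
    to_task_b.put(None)            # illustrative sentinel to stop task B


def task_b():
    while True:
        msg = to_task_b.get()      # S14/S11: wait for and receive M1
        if msg is None:
            break
        reply = msg.replace("M1", "M2")  # S12: processing that produces M2
        to_task_a.put(reply)       # S13: send M2 back to task A


b = threading.Thread(target=task_b)
a = threading.Thread(target=task_a, args=(3,))
b.start()
a.start()
a.join()
b.join()
print(log)
```

The receive in step S2 has no time limit here, so task A depends entirely on task B reaching step S13.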

[0005] Next, the case where a failure occurs in the exchange of messages between task A and task B will be described. FIG. 3 is an explanatory diagram of a failure in the conventional scheme. First, as described above with reference to FIG. 2, assume that task A has transmitted the message M1 (step S1) and shifted to the waiting state (step S2), and that task B has received the message M1 (step S11) and begun the processing based on the message M1 (step S12). Assume further that the processing of step S12, that is, the operation of task B, stops because of a bug or the like in the program of task B (step S15).

[0006]

[Problems to be Solved by the Invention] As described above, when processing proceeds through the mutual exchange of messages, a stop of task B causes task A, which is waiting for a response from task B, to enter a stopped state (deadlock state) as well. In addition, because the operating system does not monitor the states of tasks A and B and acts only on requests from the individual tasks, it cannot recognize that both tasks A and B have stopped. The present invention has been made in view of these points, and its object is to provide a failure detection mechanism that detects the occurrence of a failure and prevents the stopped state from continuing.

[0007]

[Means for Solving the Problems] The failure detection mechanism of the present invention is for a system comprising a plurality of tasks, each performing preset processing, and an operating system that controls the operation of each task. Each task is provided with a queue area having a flag indicating that a notification requiring a response from another task has been executed and a counter indicating the waiting time for that response. The operating system is provided with a response monitoring unit that, upon receiving the notification, monitors the presence or absence of the response within the waiting time indicated by the counter.

[0008]

[Operation] The operating system OS of this mechanism, acting on failure monitoring requests from tasks A and B, refers to the queue area Q and monitors the presence or absence of a response for a certain time. If no response is made, it notifies the task A or B that requested the failure monitoring of the occurrence of a failure, so that the task does not continue to wait for the response.

[0009]

[Embodiment] FIG. 1 is a conceptual diagram of the failure detection mechanism according to the present invention. The figure shows task A, task B, and the operating system OS. Each task is provided with event management tables Ta1, Ta2, ..., Tb1, Tb2, ..., which are set up in a storage device. Each event management table has a user area Y, in which the user (event) that activates the task stores its own information, and a queue area Q, which the operating system OS accesses.

[0010] The user area Y stores a timer value C and event information ID. The queue area Q is provided with a timer counter 5 and an event flag 6. The timer value C is the length of time for which the operating system OS monitors whether a failure related to the task has occurred; an arbitrary value can be set for each event management table. The event information ID is information that identifies the event management table. The timer counter 5 is used by the operating system OS to count the timer value C. The event flag 6 identifies an event management table for which the operating system OS is currently monitoring for a failure.
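The layout of an event management table can be pictured with a few record types. This is a minimal sketch: the class and field names are hypothetical, and the unit of the timer value (monitoring periods) and the string form of the event information ID are assumptions not stated in the text.

```python
from dataclasses import dataclass, field


@dataclass
class UserArea:
    """User area Y: information owned by the user (event)."""
    timer_value: int  # timer value C, assumed to be in monitoring periods
    event_id: str     # event information ID identifying this table


@dataclass
class QueueArea:
    """Queue area Q: the part accessed by the operating system OS."""
    timer_counter: int = 0    # timer counter 5
    event_flag: bool = False  # event flag 6, set while monitoring is active


@dataclass
class EventManagementTable:
    user: UserArea
    queue: QueueArea = field(default_factory=QueueArea)


# Each table may carry its own timer value C.
ta1 = EventManagementTable(UserArea(timer_value=5, event_id="Ta1"))
tb1 = EventManagementTable(UserArea(timer_value=20, event_id="Tb1"))
print(ta1)
print(tb1)
```

Using `default_factory` gives every table its own queue area, which matches the description of one queue area Q per event management table.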

[0011] The operating system OS is provided with a response monitoring unit 1. The response monitoring unit 1 has a monitoring table 10, which is set when a failure monitoring request is received from a task. Each item of the monitoring table 10 consists of a queue area address Qa and event information ID notified by the task. The queue area address Qa is the address that identifies the queue area Q of the event management table. The event information ID corresponds to the event information of the event management table of the task that made the monitoring request.

[0012] The operation of the failure detection mechanism configured as above will be described with reference to FIG. 4 and the subsequent figures. Here it is assumed that the user associated with event management table Ta1 of task A sends a notification to the user associated with event management table Tb1 of task B and waits for a response. FIG. 4 is a flowchart of the monitoring request. When task A performs step S2 of FIG. 2, it notifies the operating system OS of a monitoring request command (for example, the command WAIT D) together with the queue area address identifying the queue area Q, the timer value C, and the event information ID (step S21).

[0013] On receiving this notification, the operating system OS sets an item (queue area address Qa, event information ID) in the monitoring table 10 (step S22), sets the timer value C in the timer counter 5 of the queue area Q (step S23), and sets the event flag 6 (step S24). The operating system OS then shifts to the failure monitoring operation.
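Steps S21 to S24 amount to registering one item and arming the counter. The sketch below assumes hypothetical names (`ResponseMonitor`, `wait_d`, `MonitorEntry`); a Python object reference stands in for the queue area address Qa, and the method name is only borrowed from the example command WAIT D.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QueueArea:
    timer_counter: int = 0    # timer counter 5
    event_flag: bool = False  # event flag 6


@dataclass
class MonitorEntry:
    queue_area: QueueArea  # stands in for queue area address Qa
    event_id: str          # event information ID


@dataclass
class ResponseMonitor:
    table: List[MonitorEntry] = field(default_factory=list)  # monitoring table 10

    def wait_d(self, queue_area: QueueArea, timer_value: int, event_id: str) -> None:
        """Handle a monitoring request received from a task (S21)."""
        self.table.append(MonitorEntry(queue_area, event_id))  # S22: set the item
        queue_area.timer_counter = timer_value                 # S23: load timer value C
        queue_area.event_flag = True                           # S24: set the event flag


monitor = ResponseMonitor()
qa = QueueArea()
monitor.wait_d(qa, timer_value=5, event_id="Ta1")
print(monitor.table, qa)
```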

[0014] FIG. 5 is a flowchart of the monitoring operation. At a fixed period, the operating system OS interrupts the response monitoring unit 1 and instructs it to perform failure monitoring. On receiving this instruction, the response monitoring unit 1 reads an item of the monitoring table 10 (step S31). It then identifies the queue area Q from the queue area address Qa (step S32) and decrements the timer counter 5 (step S33).

[0015] The response monitoring unit 1 then determines whether the value of the timer counter 5 is "0" (step S34). If the result is NO, it determines whether another item to be monitored exists in the monitoring table 10 (step S35). If the result of step S35 is YES, that is, if a plurality of items are set in the monitoring table 10, the process returns to step S31; if the result is NO, the process ends and waits for the next interrupt.

[0016] If the result of step S34 is YES, that is, if counting of the timer value C has completed, the response monitoring unit 1 notifies the operating system OS that a failure has occurred in task A (step S36) and ends the process. On accepting the notification from the response monitoring unit 1, the operating system OS notifies task A of the occurrence of the failure, and task A then performs cleanup such as resetting the event flag 6 of the event management table in which the failure occurred.
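One periodic pass through steps S31 to S36 can be sketched as a single function. The names `monitor_tick` and `MonitorEntry` are hypothetical, and the return value (the event information ID of the expired item, or `None`) is an assumed way of modelling the notification of step S36. Following the flow as described, the pass ends as soon as one expiry is reported; the text does not say how the remaining items are treated in that period, and this sketch does not fill that in.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueueArea:
    timer_counter: int       # timer counter 5
    event_flag: bool = True  # event flag 6


@dataclass
class MonitorEntry:
    queue_area: QueueArea  # stands in for queue area address Qa
    event_id: str          # event information ID


def monitor_tick(table: List[MonitorEntry]) -> Optional[str]:
    """Run one monitoring pass; return the expired event ID, if any."""
    for entry in table:             # S31, S35: take each item in turn
        area = entry.queue_area     # S32: locate queue area Q via Qa
        area.timer_counter -= 1     # S33: decrement timer counter 5
        if area.timer_counter == 0: # S34: has the wait time elapsed?
            return entry.event_id   # S36: report the failure and end the pass
    return None                     # nothing expired; wait for the next interrupt


qa = QueueArea(timer_counter=2)
qb = QueueArea(timer_counter=3)
table = [MonitorEntry(qa, "Ta1"), MonitorEntry(qb, "Ta2")]
first = monitor_tick(table)
second = monitor_tick(table)
print(first, second)
```

In this run the first pass reports nothing and the second pass reports the item registered for Ta1, whose counter started at 2.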

[0017] Next, the case where task B responds normally will be described. FIG. 6 is a flowchart of the response. When task B has received the message and completed the predetermined processing, it issues to the operating system OS the processed message together with the destination of the message, that is, the partner ID (step S41). The operating system OS activates the response monitoring unit 1, has it search the monitoring table 10, and has it determine whether event information ID corresponding to the partner ID is registered (step S42).

[0018] If the result of step S42 is YES, the matching item of the monitoring table 10 is deleted (step S43), the event flag 6 of the queue area Q set in that item is reset (step S44), and the timer counter 5 is reset (step S45). The operating system OS then transfers the message received from task B to task A (step S46) and ends the process. If the result of step S42 is NO, it is assumed that no task is in the waiting state, and the process proceeds to step S46.
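The response path of steps S41 to S46 can be sketched as follows. The function name `handle_response` and the `deliver` callback are hypothetical stand-ins for the message transfer performed by the operating system OS, and resetting the timer counter is assumed to mean setting it to 0.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QueueArea:
    timer_counter: int = 0    # timer counter 5
    event_flag: bool = False  # event flag 6


@dataclass
class MonitorEntry:
    queue_area: QueueArea  # stands in for queue area address Qa
    event_id: str          # event information ID


def handle_response(table: List[MonitorEntry], partner_id: str, message: str,
                    deliver: Callable[[str, str], None]) -> None:
    """Process a response issued with a partner ID (S41)."""
    for i, entry in enumerate(table):          # S42: search monitoring table 10
        if entry.event_id == partner_id:
            del table[i]                        # S43: delete the matching item
            entry.queue_area.event_flag = False # S44: reset event flag 6
            entry.queue_area.timer_counter = 0  # S45: reset timer counter 5 (assumed 0)
            break
    deliver(partner_id, message)                # S46: forward the message either way


delivered = []
qa = QueueArea(timer_counter=3, event_flag=True)
table = [MonitorEntry(qa, "Ta1")]
handle_response(table, "Ta1", "M2", lambda dest, msg: delivered.append((dest, msg)))
handle_response(table, "Tx9", "M3", lambda dest, msg: delivered.append((dest, msg)))
print(table, qa, delivered)
```

The second call uses an ID that is not registered, which corresponds to the NO branch of step S42: nothing is cleared, and the message is still forwarded.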

[0019] As described above, the response monitoring unit 1 of the operating system OS monitors whether a response is made within the time specified by the timer value C, and thereby prevents a task from remaining in the waiting state.

[0020]

[Effects of the Invention] As described above, by having the operating system monitor whether the exchange of messages between tasks takes place within a certain time, it is possible to prevent a task from waiting indefinitely for a message. Furthermore, when the occurrence of a failure is detected, the waiting state of the task can be released promptly, allowing the task to start new processing.

[Brief description of drawings]

FIG. 1 is a conceptual diagram of the failure detection mechanism according to the present invention.

FIG. 2 is an explanatory diagram of conventional protocol processing.

FIG. 3 is an explanatory diagram of a failure in the conventional scheme.

FIG. 4 is a flowchart of the monitoring request.

FIG. 5 is a flowchart of the monitoring operation.

FIG. 6 is a flowchart of the response.

[Explanation of symbols]

1 Response monitoring unit
5 Timer counter
6 Event flag
10 Monitoring table

Claims

Claim: What is claimed is: 1. A failure detection mechanism for a system comprising a plurality of tasks, each of which performs preset processing, and an operating system which controls the operation of each task, wherein each task is provided with a queue area having a flag indicating that a notification requiring a response from another task has been executed and a counter indicating a waiting time for the response, and the operating system is provided with a response monitoring unit which, upon receiving the notification, monitors the presence or absence of the response within the waiting time indicated by the counter.