JP5822783B2 - Failure detection device

Info

Publication number: JP5822783B2
Application number: JP2012108377A
Authority: JP
Inventors: 賢一佐々木
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2012-05-10
Filing date: 2012-05-10
Publication date: 2015-11-24
Anticipated expiration: 2032-05-10
Also published as: JP2013235481A

Description

The present invention relates to a failure detection device that performs dynamic failure detection in a distributed system in which a plurality of nodes capable of realizing the same function are grouped, and the nodes in the group make the function redundant by exchanging input/output data with one another and comparing it.

In current automobiles, various functions are realized by exchanging data among a plurality of nodes. Therefore, to achieve finer-grained control when a failure occurs, it is important that the normal nodes in the system agree on which node has failed.

Conventional failure detection and failure information sharing techniques detect a failure of a node to be monitored (hereinafter, a monitored node) by having an arbitrary number of nodes (hereinafter, monitoring nodes) monitor its data. In each cycle, each monitoring node exchanges failure information, that is, its own monitoring result, with the other monitoring nodes, and the failure information is shared by applying a predetermined method such as majority voting to the collected results (see, for example, Patent Document 1).

Patent Document 1: JP 2009-37575 A

However, because the conventional method transmits the control information used to confirm a failure and the failure information used for information sharing separately, monitoring one node with multiple nodes increases the reliability of failure detection but also increases the bus load. Conversely, when the number of monitoring nodes is small, the effect on the bus load is small, but a failure of a monitoring node or the like may make it impossible to monitor the target. It is possible to detect the failure of a monitoring node and switch to another monitoring node, but complicated settings must then be considered, such as how to respond when the switchover destination has itself failed.

Thus, with the conventional method of sharing failure information, increasing tolerance to monitoring-node failures has a large effect on the bus load, while reducing the number of monitoring nodes to lessen the bus load risks leaving the target node unmonitored.

Furthermore, depending on the means used to select data during information sharing, the conventional method may make an erroneous determination when the number of failed nodes increases. For example, when the failed node is identified by majority vote, an erroneous result may be output if a majority of the nodes have failed.

The present invention has been made to solve the problems described above. Its object is to obtain a failure detection device that can achieve functional redundancy, the addition of a dynamic failure detection function, and agreement among normal nodes on which node has failed, while suppressing the increase in network load caused by sharing failure information, and that can also perform highly reliable failure detection.

The failure detection device according to the present invention detects node failures in a distributed system in which a plurality of nodes connected to a network are grouped and the nodes in the group periodically exchange input/output data with one another. Each node comprises: a calculation unit that performs a calculation on input data and outputs the calculation result; a transfer unit that transfers an input/output set, in which the output data (the calculation result) is paired with the input data, to another node in the group, and that extracts the input data when an input/output set is received from another node; a comparison unit that compares the output data in the input/output set with the calculation result and, when the calculation result matches any output data in the input/output set, outputs that value and, if the set also contains output data that does not match, determines that the node of the non-matching output data has failed, whereas when the calculation result matches none of the output data in the input/output set, adds the calculation result to the output data of the input/output set to form a new input/output set; and a failure detection unit that notifies the other nodes in the group of the node failure determined by the comparison unit. When a node receives failure notifications concerning the same node from a plurality of nodes in one cycle, it ignores failure notifications from the node concerned from the next cycle onward.

Because the failure detection device of the present invention determines that the node of any non-matching output data has failed and notifies the other nodes, and because a node that receives failure notifications concerning the same node from a plurality of nodes in one cycle ignores failure notifications from that node from the next cycle onward, it can achieve functional redundancy, the addition of a dynamic failure detection function, and agreement among normal nodes on which node has failed, while suppressing the increase in network load caused by sharing failure information. It can also perform highly reliable failure detection.

FIG. 1 is a configuration diagram showing the failure detection device according to Embodiment 1 of the present invention.
FIG. 2 is a configuration diagram of the system assumed as the application target of the failure detection device of Embodiment 1.
FIG. 3 is an explanatory diagram showing the operation between two ECUs in the failure detection device of Embodiment 1.
FIG. 4 is a configuration diagram of a system composed of four ECUs in the failure detection device of Embodiment 1.
FIG. 5 is an explanatory diagram showing the data flow between the ECUs of FIG. 4.
FIG. 6 is an explanatory diagram showing the normal operation with the ECU 1 as the start node in the failure detection device of Embodiment 1.
FIG. 7 is an explanatory diagram showing the operation under failure with the ECU 1 as the start node in the failure detection device of Embodiment 1.
FIG. 8 is an explanatory diagram showing the normal operation of the entire system in the four-ECU configuration of the failure detection device of Embodiment 1.
FIG. 9 is an explanatory diagram showing the operation of the entire system under failure in the four-ECU configuration of the failure detection device of Embodiment 1.
FIG. 10 is an explanatory diagram showing the failure notification concerning the ECU 2 issued by the ECU 3 in the failure detection device of Embodiment 1.
FIG. 11 is an explanatory diagram showing the failure notification concerning the ECU 2 issued by the ECU 4 in the failure detection device of Embodiment 1.
FIG. 12 is a flowchart showing the operation of an ECU that transmits an input/output set in the failure detection device of Embodiment 1.
FIG. 13 is a flowchart showing the operation of an ECU that has received an input/output set in the failure detection device of Embodiment 1.
FIG. 14 is a flowchart showing the operation of an ECU that has received a failure notification in the failure detection device of Embodiment 1.
FIG. 15 is an explanatory diagram showing an erroneous failure notification against the normal ECU 1 issued by the failed ECU 2 in the failure detection device of Embodiment 1.
FIG. 16 is an explanatory diagram showing an erroneous failure notification against the normal ECU 1 issued by the failed ECU 2 in the failure detection device of Embodiment 1.
FIG. 17 is an explanatory diagram showing an erroneous failure notification against the normal ECU 1 issued by the failed ECU 3 in the failure detection device of Embodiment 1.
FIG. 18 is an explanatory diagram showing the operation of the failure detection device of Embodiment 2 of the present invention.
FIG. 19 is an explanatory diagram showing the operation of the failure detection device of Embodiment 3 of the present invention.
FIG. 20 is an explanatory diagram showing the operation of the failure detection device of Embodiment 4 of the present invention.
FIG. 21 is an explanatory diagram showing the operation of the failure detection device of Embodiment 5 of the present invention.

Embodiment 1.
In Embodiment 1, ECUs (electronic control units) are described as an example of the plurality of nodes connected to a network. In Embodiment 1, a plurality of ECUs capable of realizing the same function are grouped, and the ECUs in the group exchange input/output data with one another and compare it, thereby achieving functional redundancy and adding a dynamic failure detection capability. Failure information is also shared when a failure is detected.

Here, "capable of realizing the same function" means that the ECUs concerned have the same calculation function and produce exactly the same calculation result for a given input. "Dynamic failure detection capability" means that the ECU that detects a failure switches automatically according to how failures occur.

The present invention assumes that two ECUs do not fail simultaneously within one cycle. This assumption becomes a problem only when two nodes fail simultaneously within one cycle and both produce the same erroneous result at the same time. It is reasonable because the probability of this is considered sufficiently small.

Each ECU acquires input data and performs a calculation on it to obtain output data. It then transmits the pair of input data and output data (hereinafter, the input/output set) to an ECU in the group. The ECU that receives the input/output set performs the calculation on the input data to obtain output data and compares it with the received output data set. Because the receiving ECU has the same function as the transmitting ECU, it obtains the same output data for the same input data as long as it is normal. If the calculated output data matches any element of the received output data set, the ECU outputs that value. If the calculated output data matches none of the received output data set, the ECU appends its own calculation result to the end of the input/output set and transmits the set to another ECU in the group. This forwarding on mismatch continues for as long as the response time required by the application (hereinafter, the deadline) can be met. Consequently, a correct result is output as long as two normal ECUs exist in the system, which makes the function redundant. If no matching output is found before the deadline would be violated, the response time can still be met by outputting the previous value or a default value.
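The forward-until-match procedure with a deadline fallback can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `resolve_output`, the use of a hop count (`max_hops`) to stand in for the deadline, and the integer data type are all assumptions made for the example.

```python
from typing import Callable, List, Sequence


def resolve_output(
    input_data: int,
    first_output: int,
    receivers: Sequence[Callable[[int], int]],
    max_hops: int,
    fallback: int,
) -> int:
    """Forward the input/output set until some receiver's result matches.

    `receivers` holds the calculation function of each ECU in forwarding
    order; `max_hops` is a hypothetical stand-in for the deadline.
    """
    outputs: List[int] = [first_output]   # the output data set so far
    for compute in receivers[:max_hops]:
        own = compute(input_data)         # receiver recomputes from the input
        if own in outputs:                # agreement with any earlier output
            return own
        outputs.append(own)               # no match: append and forward
    return fallback                       # deadline reached: previous/default value
```

With a start node that produced 10 for input 5, a failed second ECU, and a normal third ECU, the call returns 10 after two hops. If the hop budget runs out first, the fallback value is returned instead.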

Further, when output data is output after matching an element of the output data set, the ECU that added any non-matching output data can be judged to have failed. With the above procedure, the two normal ECUs whose output path includes the failed ECU can identify the failed ECU.

On the other hand, an ECU whose output path does not include the failed ECU cannot recognize the failed ECU. Therefore, for the normal ECUs to agree on which ECU has failed, an ECU that detects a failed ECU must notify all ECUs in the group of it. Because two normal ECUs can recognize the failed ECU in the present invention, each ECU, on receiving two or more failure notifications, marks the notified ECU as failed and ignores failure notifications from that ECU in subsequent cycles. As a result, even if a failed ECU issues a failure notification against a normal ECU, the notification is ignored, which prevents the normal ECU from being judged to have failed.
An embodiment of the failure detection device in such a distributed system is described below.
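The notification rule described above (mark an ECU as failed once two notifications about it arrive in one cycle, then ignore notifications it sends in later cycles) might be tracked as in the sketch below. The class name `FaultTable`, its methods, and the explicit `end_of_cycle` call are hypothetical. The source does not specify how cycle boundaries are signalled.

```python
from collections import defaultdict
from typing import Dict, Set


class FaultTable:
    """Per-ECU bookkeeping of failure notifications (illustrative only)."""

    def __init__(self, threshold: int = 2) -> None:
        self.threshold = threshold                  # notifications needed to mark
        self.faulty: Set[int] = set()               # ECUs marked as failed
        self._pending: Dict[int, Set[int]] = defaultdict(set)

    def on_notification(self, reporter: int, suspect: int) -> None:
        if reporter in self.faulty:
            return                                  # ignore a marked ECU's notifications
        self._pending[suspect].add(reporter)        # count distinct reporters

    def end_of_cycle(self) -> None:
        for suspect, reporters in self._pending.items():
            if len(reporters) >= self.threshold:
                self.faulty.add(suspect)            # takes effect from the next cycle
        self._pending.clear()
```

Counting distinct reporters, rather than raw messages, is a design choice of this sketch so that a single ECU cannot reach the threshold by repeating its own notification.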

FIG. 1 is a configuration diagram of an ECU provided with the failure detection device according to Embodiment 1 of the present invention.
The ECU 1 shown in FIG. 1 includes an input unit 11, a calculation unit 12, a transfer unit 13, a comparison unit 14, an output unit 15, and a failure detection unit 16, and is connected to an in-vehicle LAN 100.
FIG. 2 is a configuration diagram of the distributed system assumed as the target of the present invention. A plurality of ECUs having the same configuration as the ECU 1 shown in FIG. 1 are connected to the same network. Here, the ECUs 1, 2, ..., n (n is an arbitrary integer) each have the configuration of the ECU 1 in FIG. 1. The functions of each ECU are described below.

The input unit 11 is a functional unit that acquires the data required for control using a sensor or the like. The calculation unit 12 is a functional unit that performs an application-specific calculation on the input data acquired by the input unit 11 or the transfer unit 13. The transfer unit 13 is a functional unit that pairs the input data with the calculated value obtained from the calculation unit 12 (hereinafter, the input/output set) and transmits it as a message to an ECU in the group. The input/output set has the following structure.
{input data: output data 1: output data 2: ...: output data n}
Here, n is an arbitrary integer, and the output data (calculated value) of the ECU concerned is added each time the set is transferred. For this reason, the calculated values in the input/output set are referred to as the output data set in each embodiment. When an input/output set is received, the transfer unit 13 extracts its input data and has the calculation unit 12 perform the calculation.
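As an illustration of the structure above, the input/output set can be modelled as one input value followed by a growing list of outputs. The class name `IOSet`, its field names, and the integer payloads are assumptions for this sketch. The brace-and-colon rendering simply mirrors the notation used in this description and is not a wire format taken from the source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IOSet:
    """One input value plus the output data set accumulated so far."""

    input_data: int
    outputs: List[int] = field(default_factory=list)

    def append(self, value: int) -> None:
        # Called by an ECU whose own result matched none of the outputs.
        self.outputs.append(value)

    def __str__(self) -> str:
        # Render in the {input:output1:output2:...} notation of the text.
        parts = [self.input_data, *self.outputs]
        return "{" + ":".join(str(p) for p in parts) + "}"
```

For example, `IOSet(5, [10])` renders as `{5:10}`, and after `append(11)` it renders as `{5:10:11}`.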

The comparison unit 14 is a functional unit that compares the value calculated by the calculation unit 12 for the input data of the received input/output set with the output data set of the received input/output set. If the comparison finds a match, the comparison unit passes the matching value to the output unit 15. If no data in the output data set matches, it adds its own calculation result (the result of the calculation unit 12) to the output data set and sends it to the transfer unit 13. If, when a match is found, the output data set also contains a non-matching output, the comparison unit determines that the ECU that produced it has failed and outputs the determination to the failure detection unit 16. The output unit 15 is a functional unit that outputs the calculated value that the comparison unit 14 has determined to match.
The failure detection unit 16 is a functional unit that realizes the sharing of failure information among the ECUs. It notifies all ECUs in the group of the ECU that the comparison unit 14 has determined to have failed. When failure notifications concerning the same ECU are received from a plurality of ECUs within one cycle, it marks that ECU as failed and ignores information from that ECU in subsequent cycles.
The in-vehicle LAN 100 is a network that connects the ECUs 1, 2, ..., n for communication, as shown in FIG. 2 and elsewhere.

Each of the ECUs 1, 2, ..., n is implemented as a computer, and the processing of each unit from the input unit 11 to the failure detection unit 16 is realized by software corresponding to that processing and by hardware such as a CPU and memory for executing the software. Alternatively, any of the functional units may be implemented in dedicated hardware.

FIG. 3 is an explanatory diagram showing the comparison processing between the ECU 1 and the ECU 2 of FIG. 2.
The ECU 1 acquires input data using the input unit 11 and obtains a calculated value as output data using the calculation unit 12. It then uses the transfer unit 13 to pair the input data with the calculated value (the input/output set) and transmits the set to an ECU in the group (here, the ECU 2). On receiving the input/output set, the ECU 2 extracts the input data from it and supplies the data to the calculation unit 12. The comparison unit 14 compares the data output from the calculation unit 12 with the output data set contained in the received input/output set. If the calculated value obtained from the calculation unit 12 matches any element of the output data set, the matching value is output using the output unit 15. In FIG. 3, the flow taken when the output data matches is drawn with bold lines. No non-matching result is present in FIG. 3 because the two ECUs 1 and 2 agree. If the output data set did contain a non-matching result, the ECU that produced that result would be treated as failed and reported to the failure detection unit 16.

For a system composed of the four ECUs shown in FIG. 4 (ECU 1, ECU 2, ECU 3, and ECU 4), FIG. 6 and FIG. 7 show the operation with the ECU 1 as the start node when all ECUs are normal and when the ECU 2 has failed, respectively. FIG. 5 shows the flow of data between the ECUs of FIG. 4.

In FIG. 6, the ECU 1 obtains input data (5) using the input unit 11, performs the calculation on it using the calculation unit 12, and obtains output data (10). Here, values in parentheses represent data, and a normal ECU is assumed to produce twice the input data. The ECU 1 then transmits the input/output set {5:10} using the transfer unit 13. The ECU 2 receives the input/output set {5:10}, performs the calculation on the input data (5) using the calculation unit 12, and obtains the result (10). The comparison unit 14 compares the output data (10) of the calculation unit 12 with the received output data set {10}. Because the values match at (10), (10) is output using the output unit 15.

In FIG. 7, the ECU 1 obtains input data (5) using the input unit 11, performs the calculation on it using the calculation unit 12, and obtains output data (10). It then transmits the input/output set {5:10} using the transfer unit 13. The ECU 2 receives the input/output set {5:10}, performs the calculation on the input data (5) using the calculation unit 12, and obtains the result (11). Because the ECU 2 has failed, it does not obtain the correct result, and its output data is (11). This does not match the received output data set, so the comparison unit 14 outputs a mismatch. At this point there are only two pieces of data and it cannot be determined which is correct, so the failed ECU cannot be identified. Because the comparison result is a mismatch, the transfer unit 13 appends the value (11) output by the calculation unit 12 to the end of the input/output set and transfers the input/output set {5:10:11} to the ECU 3 in the group. The ECU 3 receives the input/output set {5:10:11}, performs the calculation on the input data (5) using the calculation unit 12, and obtains the result (10). The comparison unit 14 compares the value (10) output by the calculation unit 12 with the received output data set {10:11}. Because the output (10) matches, (10) is output using the output unit 15. Once the comparison is complete, the comparison unit 14 can determine that the ECU 2, which added the erroneous output data (11), has failed, and it uses the failure detection unit 16 to send a failure notification to all ECUs in the group. FIG. 10 shows the failure notification concerning the ECU 2 issued by the ECU 3.

FIG. 8 shows the operation of the entire system when all ECUs are normal. As in the operation of the ECU 1 described for FIG. 6, every comparison process yields a match at the comparison between two ECUs, and the processing ends.

FIG. 9 shows the operation of the entire system when the ECU 2 has failed. The comparison process started from the ECU 1 operates as described for FIG. 7.
The comparison process started from the ECU 2 operates as follows. The ECU 2 obtains input data (1) using the input unit 11, performs the calculation on it using the calculation unit 12, and obtains output data (3). Because the ECU 2 has failed, it does not obtain the correct result (2). It then transmits the input/output set {1:3} using the transfer unit 13. The ECU 3 receives the input/output set {1:3}, performs the calculation on the input data (1) using the calculation unit 12, and obtains the result (2). Because the ECU 2 has failed, the value {3} in the output data set does not match the output data (2) of the ECU 3, and the comparison unit 14 outputs a mismatch. Because the comparison result is a mismatch, the transfer unit 13 appends the value (2) output by the calculation unit 12 to the end of the input/output set and transfers the input/output set {1:3:2} to the ECU 4 in the group. The ECU 4 receives the input/output set {1:3:2}, performs the calculation on the input data (1) using the calculation unit 12, and obtains the result (2). The comparison unit 14 compares the value (2) output by the calculation unit 12 with the received output data set {3:2}. Because the output (2) matches, (2) is output using the output unit 15. At this point the comparison unit 14 of the ECU 4 can determine that the ECU 2, which produced the non-matching output, has failed, and the ECU 4 uses the failure detection unit 16 to send a failure notification to all ECUs in the group. FIG. 11 shows the failure notification concerning the ECU 2 issued by the ECU 4.

The comparison process started from the ECU 3 operates as follows. The ECU 3 obtains input data (3) using the input unit 11, performs the calculation on it using the calculation unit 12, and obtains output data (6). It then transmits the input/output set {3:6} using the transfer unit 13. The ECU 4 receives the input/output set {3:6}, performs the calculation on the input data (3) using the calculation unit 12, and obtains the result (6). The comparison unit 14 compares the output data (6) of the calculation unit 12 with the received output data set {6}. Because the values match at (6), (6) is output using the output unit 15. Because the ECU 3 and the ECU 4 are both normal, the comparison process started from the ECU 3 finds a match at the comparison in the second ECU (the ECU 4) and produces its output there, with no extra calculation or message transfer. In addition, because the path to the output contains no failed ECU, no failed ECU is detected.

The comparison process started from the ECU 4 operates as follows. The ECU 4 obtains input data (4) using the input unit 11, performs the calculation on it using the calculation unit 12, and obtains output data (8). It then transmits the input/output set {4:8} using the transfer unit 13. The ECU 1 receives the input/output set {4:8}, performs the calculation on the input data (4) using the calculation unit 12, and obtains the result (8). The comparison unit 14 compares the output data (8) of the calculation unit 12 with the received output data set {8}. Because the values match at (8), (8) is output using the output unit 15. Because the ECU 4 and the ECU 1 are both normal, the comparison process started from the ECU 4 finds a match at the comparison in the second ECU (the ECU 1) and produces its output there, with no extra calculation or message transfer. In addition, because the path to the output contains no failed ECU, no failed ECU is detected.

Therefore, at this point the normal ECUs in the system do not agree about which ECU is faulty: ECUs 3 and 4 recognize that ECU 2 has failed, whereas ECU 1 cannot recognize the failure of ECU 2. A failure notification is therefore needed to bring the normal ECUs to a common view of the faulty ECU.

In the procedure of the present invention, two normal ECUs can always identify the faulty ECU, as in this example, so the faulty-ECU information can be shared with only two failure notifications. An ECU that has identified a failure issues a failure notification, and an ECU that receives two failure notifications for the same ECU within one cycle marks that ECU as faulty. From then on, failure notifications from the marked ECU are ignored.

FIG. 12 is a flowchart showing the operation of an ECU that acquires input data from the input device and starts the transfer of an input/output set. This ECU is referred to here as the start node. In each cycle, the start node acquires input data from the input unit 11 (step ST100) and applies the calculation unit 12 to the input data to obtain output data (step ST101). It then uses the transfer unit 13 to transfer the combination of input data and output data (the input/output set) to an ECU in the group (steps ST102 and ST103).

FIG. 13 is a flowchart showing the operation of an ECU that has received an input/output set from another ECU in the group. On receiving the input/output set (step ST120), the ECU extracts the input data using the transfer unit 13 and applies the calculation unit 12 to it to obtain output data (step ST121). The comparison unit 14 then checks whether the obtained output data matches any entry in the output data set of the input/output set (steps ST122 and ST123). If a matching entry exists, the output unit 15 outputs that data (step ST124). Furthermore, if any ECU contributed an output other than the matched data, that ECU is judged to be faulty (steps ST125 and ST126), and the failure detection unit 16 notifies all ECUs in the group of the faulty ECU (step ST127). If the calculated output data matches none of the entries in the output data set, the transfer unit 13 appends the ECU's own output data to the end of the input/output set (step ST128) and transfers the set to the next ECU in the group (step ST129).
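
The receive-side flow of FIG. 13 can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `IOSet`, `Result`, `start`, and `on_receive` are hypothetical, each output is tagged with the id of the ECU that produced it so that non-matching ECUs can be identified, and the doubling function in the usage note is only an assumption consistent with the example values {3:6} and {4:8}.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class IOSet:
    """Hypothetical input/output set: one input plus the outputs computed so far."""
    input_value: int
    outputs: List[Tuple[int, int]] = field(default_factory=list)  # (ecu_id, output)


@dataclass
class Result:
    output: Optional[int]        # value emitted, or None if the set was forwarded
    faulty: List[int]            # ids of ECUs whose output disagreed with the match
    forwarded: Optional[IOSet]   # extended set to pass on, or None if output was emitted


def start(ecu_id: int, compute: Callable[[int], int], input_value: int) -> IOSet:
    """Start node (FIG. 12): compute the output and build the initial set."""
    return IOSet(input_value, [(ecu_id, compute(input_value))])


def on_receive(ecu_id: int, compute: Callable[[int], int], io_set: IOSet) -> Result:
    """Receiving node (FIG. 13): recompute, compare, then output or forward."""
    own = compute(io_set.input_value)                      # ST121
    if any(out == own for _, out in io_set.outputs):       # ST122, ST123
        # ST124 to ST126: emit the matched value and flag every disagreeing ECU.
        faulty = [sender for sender, out in io_set.outputs if out != own]
        return Result(own, faulty, None)
    # ST128, ST129: no match, so append our own output and pass the set on.
    extended = IOSet(io_set.input_value, io_set.outputs + [(ecu_id, own)])
    return Result(None, [], extended)
```

With a healthy doubling function, `on_receive(4, double, start(3, double, 3))` emits 6 and reports no faulty ECU. If ECU 2 starts a set with a wrong output, ECU 3 forwards the extended set and ECU 4, matching ECU 3, reports ECU 2 as faulty.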

FIG. 14 is a flowchart of the processing performed when the failure detection unit 16 receives a failure notification. If the notification comes from an ECU that has already been identified as faulty, it is ignored (step ST200). When a failure notification about a given ECU is received from another ECU for the first time (step ST201), the notification count for the reported ECU is set to 1, and the reported ECU is not yet marked as faulty (step ST203). This prevents a normal ECU from being wrongly judged faulty when a faulty ECU issues a failure notification against it. When the notification count for the reported ECU reaches two or more in step ST201, that ECU is marked as a faulty ECU (step ST202), and information from it is ignored thereafter. The notification counts are cleared every cycle.
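
The notification handling of FIG. 14 can be sketched as below. The class and method names are hypothetical, and the sketch counts distinct notifying ECUs rather than raw messages, which is an assumption drawn from the later statement that an ECU is marked on notifications "from two or more ECUs".

```python
from collections import defaultdict
from typing import Dict, Set


class FailureDetector:
    """Hypothetical sketch of the per-ECU failure notification bookkeeping."""

    def __init__(self, threshold: int = 2) -> None:
        self.threshold = threshold                      # notifications needed to mark
        self.marked: Set[int] = set()                   # ECUs marked as faulty
        self.accusers: Dict[int, Set[int]] = defaultdict(set)  # suspect -> notifiers

    def on_notification(self, sender: int, suspect: int) -> bool:
        """Handle one notification; return True if it caused `suspect` to be marked."""
        if sender in self.marked:        # ST200: ignore notifications from faulty ECUs
            return False
        if suspect in self.marked:       # already marked, nothing further to do
            return False
        self.accusers[suspect].add(sender)
        if len(self.accusers[suspect]) >= self.threshold:   # ST201 -> ST202
            self.marked.add(suspect)
            return True
        return False                     # ST203: counted, but not marked yet

    def end_of_cycle(self) -> None:
        """Clear the per-cycle counts; marks persist across cycles."""
        self.accusers.clear()
```

In this sketch, notifications from ECU 3 and ECU 4 about ECU 2 mark ECU 2 in the same cycle. A later notification from ECU 2 about ECU 1 is dropped, and a single notification from any other ECU about ECU 1 is not enough to mark it.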

FIG. 15 shows an example in which the faulty ECU 2 reports the normal ECU 1 as faulty. In this case the other normal ECUs issue no failure notification against the normal ECU 1, so the number of failure notifications against ECU 1 never reaches two, and the normal ECU 1 is not wrongly judged to be faulty.

FIGS. 16 and 17, on the other hand, show the case in which two faulty ECUs (ECU 2 and ECU 3) issue failure notifications against the normal ECU 1 in the same cycle. The present invention assumes that two ECUs do not fail simultaneously within one cycle, so one of them has already been marked as a faulty ECU and its failure notification is ignored. Consequently only one failure notification against ECU 1 is counted, and the normal ECU 1 is not wrongly judged to be faulty.

Regarding suppression of the increase in network load in Embodiment 1: in the conventional method, when four nodes monitor a given node, the faulty node is determined by a majority vote over the four monitoring results in every cycle. Four monitoring messages are therefore sent every cycle, which increases the bus load. In the present embodiment, by contrast, even with four nodes, as long as the ECU that sent the message (input/output set) has not failed, the outputs match, no failure notification is issued, and the bus load is unaffected. However, each ECU must perform the calculation for its own input and also for the input of another node, which doubles the CPU load, so the embodiment is applicable only to systems whose CPU load is half or less. In the present embodiment the monitoring node switches dynamically for any faulty node, which gives tolerance to failures. Moreover, because faulty nodes are marked as faulty, a correct node is not wrongly judged faulty even if faulty nodes come to form a majority, and the normal nodes remain in agreement.

As described above, the failure detection device of Embodiment 1 detects node failures in a distributed system in which a plurality of nodes connected to a network are grouped and input/output data are exchanged periodically within the group. Each node includes: a calculation unit that performs a calculation on input data and outputs a calculation result; a transfer unit that transfers an input/output set, in which the output data that is the calculation result is paired with the input data, to another node in the group, and that extracts the input data when an input/output set is received from another node; a comparison unit that compares the output data in the input/output set with the calculation result, outputs the value when the calculation result matches any output data in the set and, if the set also contains non-matching output data, judges the node of that non-matching output data to be faulty, and, when the calculation result matches none of the output data in the set, adds the calculation result to the output data of the set to form a new input/output set; and a failure detection unit that notifies the other nodes in the group of the node failure judged by the comparison unit. When a node receives failure notifications for the same node from a plurality of nodes within one cycle, it ignores failure notifications from the reported node from the next cycle onward. This suppresses the increase in network load for sharing failure information while providing functional redundancy, dynamic failure detection, and agreement among normal nodes about the faulty node. It also prevents a normal node from being wrongly judged faulty, so highly reliable failure detection can be performed.

Embodiment 2.
Embodiment 2 relates to a failure detection device applied to a system that requires higher data reliability. In this and the following embodiments, the configuration shown in the drawings is the same as in Embodiment 1, so the description refers to the configurations of FIG. 1 and FIG. 3.

Embodiment 2 improves data reliability by continuing to transfer data until an arbitrary number n of data items match, where n is chosen according to the safety level required of the system. FIG. 18 shows an operation example in which output is produced when three data items match. The data already match at ECU 1 and ECU 2, but the transfer unit 13 still forwards the input/output set to ECU 3, where three data items match and the output unit 15 produces the output. When output requires three matching output data, faulty-ECU notifications are issued by three ECUs. Here too, the operation need not change from Embodiment 1: the failure detection unit 16 of an ECU that has detected a failure issues a failure notification, and each ECU marks the reported ECU as faulty on receiving failure notifications from two or more ECUs.
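
A rough sketch of the n-match rule follows. It simulates the chain of ECUs in a single function for brevity, and the function name, the `(ecu_id, compute)` chain representation, and the return shape are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Chain = List[Tuple[int, Callable[[int], int]]]  # (ecu_id, calculation function)


def run_until_n_agree(
    input_value: int, chain: Chain, n: int
) -> Tuple[Optional[int], List[int], Optional[int]]:
    """Forward the set along `chain` until n outputs agree.

    Returns (output value, ids of disagreeing ECUs, id of the ECU that emitted),
    or (None, [], None) if the chain ends before n outputs agree.
    """
    outputs: List[Tuple[int, int]] = []
    for ecu_id, compute in chain:
        own = compute(input_value)
        outputs.append((ecu_id, own))
        # Count how many collected outputs, including our own, equal our result.
        agree = sum(1 for _, out in outputs if out == own)
        if agree >= n:
            faulty = [e for e, out in outputs if out != own]
            return own, faulty, ecu_id
    return None, [], None
```

With n = 3 and three healthy ECUs the output is emitted at the third ECU. If the second ECU produces a wrong value, the set travels one hop further and the fourth ECU emits the output and identifies the second as faulty.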

Embodiment 3.
Embodiment 3 is an example in which transfer on a data mismatch is bounded by a number of transfers rather than by a deadline. FIG. 19 shows an operation example with a maximum transfer count of 5. In this example none of the output data match before ECU 6, so the transfer unit 13 forwards the set as far as ECU 6. In FIG. 19 the output data of ECU 6 matches one entry in the output data set, so the output unit 15 produces the output. If the output data of ECU 6, at which the maximum transfer count is reached, matched none of the entries in the output data set, the previous value or a default value would be output instead. In Embodiment 3 as well, the ECUs that produced non-matching results can be judged faulty, and the failure detection unit 16 of the ECU that detected the failure issues the notification. Each ECU marks the reported ECU as faulty on receiving failure notifications from two or more ECUs.
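
The transfer-count limit can be sketched as follows. The function name and the `fallback` parameter (standing in for the previous or default value) are hypothetical, and the chain of ECUs is again simulated in one function.

```python
from typing import Callable, List, Tuple

Chain = List[Tuple[int, Callable[[int], int]]]  # (ecu_id, calculation function)


def run_with_transfer_limit(
    input_value: int, chain: Chain, max_transfers: int, fallback: int
) -> Tuple[int, List[int]]:
    """Forward the set at most `max_transfers` times; fall back if nothing matches.

    Returns (output value, ids of ECUs whose output disagreed with the match).
    """
    start_id, start_compute = chain[0]
    outputs: List[Tuple[int, int]] = [(start_id, start_compute(input_value))]
    # Each receiving ECU after the start node consumes one transfer.
    for ecu_id, compute in chain[1:max_transfers + 1]:
        own = compute(input_value)
        if any(out == own for _, out in outputs):
            faulty = [e for e, out in outputs if out != own]
            return own, faulty
        outputs.append((ecu_id, own))
    # Transfer count exhausted without a match: emit the previous/default value.
    return fallback, []
```

With six ECUs and `max_transfers=5`, a match found only at the sixth ECU is still emitted, while six mutually different outputs yield the fallback value.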

Embodiment 4.
In Embodiment 4 the set is transferred a predetermined number of times regardless of whether the data match. After the predetermined number of transfers, the comparison unit 14 compares the collected results and the output unit 15 outputs the value that occurs most often. FIG. 20 shows an operation example with a transfer count of 5: at ECU 5 the output data set {10:11:10:9:10} has been obtained, and the most frequent value, {10}, is output. In Embodiment 4 as well, the ECU that produces the output can judge the ECUs that produced non-matching results to be faulty, and its failure detection unit 16 issues the notification. Each ECU marks the reported ECU as faulty on receiving failure notifications from two or more ECUs.
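
The most-frequent-value selection can be sketched with the values from FIG. 20. The function name and the `(ecu_id, output)` tagging are hypothetical, and the assignment of the five values to ECUs 1 through 5 in order is an assumption. The source does not say how ties are resolved, so this sketch simply takes whichever value `Counter.most_common` returns first.

```python
from collections import Counter
from typing import List, Tuple


def vote(outputs: List[Tuple[int, int]]) -> Tuple[int, List[int]]:
    """Return the most frequent output and the ids of ECUs that disagreed with it.

    `outputs` must be non-empty; each entry is (ecu_id, output).
    """
    counts = Counter(out for _, out in outputs)
    winner, _ = counts.most_common(1)[0]
    faulty = [ecu_id for ecu_id, out in outputs if out != winner]
    return winner, faulty


# Output data set {10:11:10:9:10}, assumed here to come from ECUs 1..5 in order.
collected = [(1, 10), (2, 11), (3, 10), (4, 9), (5, 10)]
winner, faulty = vote(collected)
print(winner, faulty)  # prints: 10 [2, 4]
```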

Embodiment 5.
In Embodiment 5 the transfer is repeated until the deadline regardless of whether the data match. The comparison unit 14 takes a majority vote over the collected results, and the output unit 15 outputs the result. FIG. 21 shows an operation example: the output data set {10:9:10:10} has been obtained just before the deadline, and the majority result, {10}, is output. Here too, the ECU that produces the output can judge the ECU that produced a non-matching result to be faulty, and its failure detection unit 16 issues the notification. Each ECU marks the reported ECU as faulty on receiving failure notifications from two or more ECUs.
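
A deadline-bounded variant might look like the sketch below. The source gives no timing model, so the fixed per-hop cost, the millisecond units, and both function names are assumptions made purely for illustration. The vote here requires a strict majority, which is one possible reading of "majority decision".

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

Chain = List[Tuple[int, Callable[[int], int]]]  # (ecu_id, calculation function)


def collect_until_deadline(
    input_value: int, chain: Chain, hop_cost_ms: int, deadline_ms: int
) -> List[Tuple[int, int]]:
    """Collect outputs along the chain while another hop still fits the deadline."""
    outputs: List[Tuple[int, int]] = []
    elapsed_ms = 0
    for ecu_id, compute in chain:
        if elapsed_ms + hop_cost_ms > deadline_ms:
            break  # the next hop would overrun the deadline
        outputs.append((ecu_id, compute(input_value)))
        elapsed_ms += hop_cost_ms
    return outputs


def majority(outputs: List[Tuple[int, int]]) -> Tuple[Optional[int], List[int]]:
    """Return (value, disagreeing ECU ids) if a strict majority exists, else (None, [])."""
    if not outputs:
        return None, []
    value, count = Counter(out for _, out in outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        return None, []
    return value, [ecu_id for ecu_id, out in outputs if out != value]
```

With a hop cost of 1 ms and a deadline of 4 ms, four outputs are collected. For the set {10:9:10:10} the majority value is 10 and the second contributor is reported as faulty.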

Within the scope of the present invention, the embodiments may be freely combined, any component of any embodiment may be modified, and any component of any embodiment may be omitted.

1, 2, 3, 4, ..., n: ECU; 11: input unit; 12: calculation unit; 13: transfer unit; 14: comparison unit; 15: output unit; 16: failure detection unit.

Claims

A failure detection device that detects node failures in a distributed system in which a plurality of nodes connected to a network are grouped and input/output data are exchanged periodically among the nodes in the group, wherein
each of the nodes comprises:
a calculation unit that performs a calculation on input data and outputs a calculation result;
a transfer unit that transfers an input/output set, in which output data that is the calculation result is paired with the input data, to another node in the group, and that extracts the input data when an input/output set is received from another node;
a comparison unit that compares the output data in the input/output set with the calculation result; that, when the calculation result matches any output data in the input/output set, outputs that value and, if the set also contains output data that does not match, determines the node of the non-matching output data to be faulty; and that, when the calculation result matches none of the output data in the input/output set, adds the calculation result to the output data of the input/output set to form a new input/output set; and
a failure detection unit that notifies the other nodes in the group of the node failure determined by the comparison unit,
and wherein each node, on receiving failure notifications for the same node from a plurality of nodes in one cycle, ignores failure notifications from the node that was the subject of those notifications from the next cycle onward.