JP3627545B2 - CPU abnormality detection method

Info

Publication number: JP3627545B2
Application number: JP35179298A
Authority: JP
Inventors: 清之内田
Original assignee: Toyota Motor Corp
Current assignee: Toyota Motor Corp
Priority date: 1998-12-10
Filing date: 1998-12-10
Publication date: 2005-03-09
Anticipated expiration: 2018-12-10
Also published as: JP2000172521A

Description

【０００１】
【発明の属する技術分野】
本発明は、ＣＰＵの異常検出方法に係り、特に、複数のＣＰＵを有するコンピュータシステムにおけるＣＰＵの異常検出方法に関する。
【０００２】
【従来の技術】
従来より、内部に複数のＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）を設けて動作信頼性を向上させたコンピュータシステムが広く知られている。このようなコンピュータシステムでは、複数のＣＰＵの演算結果を比較することにより、ＣＰＵの故障等の有無が検出される。例えば、特開平５−３２４３９１号では、バス比較器によってＣＰＵの故障の有無を検出する方法が開示されている。
【０００３】
このバス比較器は、複数のＣＰＵ毎に対応して設けられた圧縮処理部と比較器とを有している。そして、それぞれの圧縮処理部は、対応するＣＰＵのビット毎のデータ圧縮部と直列転送部を有している。このような構成において、ＣＰＵの出力データは、バス比較器内のデータ圧縮部によって符号圧縮される。符号圧縮されたデータは、直列転送部によって直列に転送された後に比較器側に出力される。比較器は、各圧縮処理部から出力された圧縮データを順次比較して、不一致のデータがある場合にＣＰＵの故障等を検出する。
【０００４】
このような故障検出方法によれば、ＣＰＵの出力データがデータ圧縮部によって圧縮されるので、ＣＰＵのバスのビット数が増加しても対応することができる。また、ＣＰＵの出力データが圧縮されて比較結果の出力周波数も低くなるので、フェールセーフを確保する分周回路等を設ける必要もなく、バス比較器等の低コスト化が図られる。
【０００５】
【発明が解決しようとする課題】
しかし、上記従来例のような方法でＣＰＵの異常を検出するには、複数のＣＰＵの演算タイミングや外部センサ等から複数のＣＰＵに与えられるデータの入力タイミング等の同期をとる必要がある。このため、同期信号発生回路から各ＣＰＵに同期信号を与える必要があった。
【０００６】
また、ＣＰＵの同期をとるためにＣＰＵが各処理毎に要する演算時間等を厳密に見積もる必要があった。更に、複数のＣＰＵの同期をとるための同期信号がうまく働かず、ＣＰＵの同期がとれない場合の対策を準備しておく必要があった。
上記従来例において、完全な同期がとられていない複数のＣＰＵが演算を行なうとそれぞれの演算結果に差異が生じ、互いの演算結果の比較の結果、正常なＣＰＵでも異常であると誤認識されたり、異常なＣＰＵでも正常であると誤認識される可能性がある。
【０００７】
本発明は、上記の点に鑑みてなされたものであり、複数のＣＰＵを有するコンピュータシステムにおいて、複数ＣＰＵの同期をとることなく、ＣＰＵの異常を容易に検出する方法を提供することを目的とする。
【０００８】
【課題を解決するための手段】
上記目的は、請求項１に記載する如く、複数のＣＰＵを有するコンピュータシステムにおけるＣＰＵの異常検出方法であって、
各ＣＰＵそれぞれに所定の間隔で順次入力信号を取得させる第１のステップと、
各ＣＰＵそれぞれに他のＣＰＵが算出した入力信号に対する最新の演算値を取得させる第２のステップと、
各ＣＰＵそれぞれに、前記第２のステップの実行により他のＣＰＵが算出した前記最新の演算値を取得させた後に、入力信号に対する演算値を算出させる第３のステップと、
各ＣＰＵそれぞれに、算出した最新の所定数の各演算値を、他のＣＰＵが算出した前記最新の演算値と大小比較させる第４のステップと、
前記第４のステップによる大小比較の結果に基づきＣＰＵの異常を検出する第５のステップとを備えるＣＰＵの異常検出方法により達成される。
【０００９】
このようなＣＰＵの異常検出方法では、複数のＣＰＵが互いの演算値を比較し合うことにより、複数のＣＰＵの同期をとることなくＣＰＵの異常が容易に検出される。また、本発明は、他のＣＰＵの最新の演算値を取得してから、各ＣＰＵが自己の演算値を算出する構成であるため、各ＣＰＵにとって、取得した他のＣＰＵの最新の演算値は、常に自己の最新の演算値よりも先に算出されたものとなる。このため、各ＣＰＵは、既に算出した最新の所定数の演算値と他のＣＰＵが算出した最新の演算値との誤差に基づき、ＣＰＵの異常の有無を検出すればよく、自己の最新の演算値に続いて算出する演算値を比較時に考慮する必要がない。従って、比較時の誤差の許容範囲を小さく設定することができ、高精度なＣＰＵの異常検出が実現する。
【００１０】
また、上記目的は、請求項２に記載する如く、請求項１記載のＣＰＵの異常検出方法であって、
前記第４のステップにおける最新の所定数の各演算値は、最新の３つの演算値であるＣＰＵの異常検出方法により達成される。
各ＣＰＵが他のＣＰＵが算出した最新の演算値を取得するまでには、所定の微小な時間を要するため、各ＣＰＵが取得した他のＣＰＵの最新の演算値は、自己の前々回の演算処理時から最新の演算処理時の間に算出されたものであると限定できる。本発明によると、相手側ＣＰＵの最新の演算値と比較する自己ＣＰＵの演算値が最新の３つの演算値に限定されるので、比較時の誤差の許容範囲をより小さな値に設定することができる。誤差許容範囲をより小さな値に設定することで、ＣＰＵの異常の検出がより高精度に行なわれる。
【００１１】
【発明の実施の形態】
以下、図１〜図４を用いて本発明の実施の形態について説明する。
図１は、本発明の異常検出方法によってＣＰＵの異常を検出する制御用コンピュータ１０の構成図である。この制御用コンピュータ１０は、外部機器１２から順次与えられる入力信号に基づいて演算処理を行う。そして、制御用コンピュータ１０は、演算処理の結果に応じた制御信号を外部機器１４に与えて外部機器１４の動作制御を行なう。制御用コンピュータ１０に入力信号を供給する外部機器１２は、例えば、スイッチやセンサ等である。また、制御用コンピュータ１０によって制御される外部機器１４は、例えば、アクチュエータやＬＥＤ表示器等である。
【００１２】
図１に示すように、制御コンピュータ１０は、入力ポート１６、ＣＰＵ１８ａ、１８ｂ、メモリ２０ａ、２０ｂ、受信バッファ２２ａ、２２ｂ及び出力ポート２４等を有する。
入力ポート１６は、スイッチやセンサ等で構成される外部機器１２からの入力信号を順次取り込む。そして、入力ポート１６は、ノイズ消去処理やレベルシフト処理等を施した後の入力信号をＣＰＵ１８ａ、１８ｂに与える。
【００１３】
ＣＰＵ１８ａは、外部機器１２から入力ポート１６を介して与えられた入力信号に基づき演算処理を行なって、入力信号に応じた演算値ａ１、ａ２、ａ３、・・・を算出する。同様にＣＰＵ１８ｂも外部機器１２から入力ポート１６を介して与えられた入力信号に基づき演算処理を行なって、入力信号に応じた演算値ｂ１、ｂ２、ｂ３、・・・を算出する。
【００１４】
ＣＰＵ１８ａ、１８ｂは、同期がとられておらず、共に所定の間隔ｔ（例えば、約６ｍｓ）で演算値の算出を繰り返す。従って、演算値ａ１、ａ２、ａ３、・・・と演算値ｂ１、ｂ２、ｂ３、・・・は、例えば、ｂ１、ａ１、ｂ２、ａ２、ｂ３、ａ３、・・・のような順で交互に算出される。
ＣＰＵ１８ａによって算出された演算値ａ１、ａ２、ａ３、・・・は、メモリ２０ａと受信バッファ２２ｂに順次格納される。また、ＣＰＵ１８ｂによって算出された演算値ｂ１、ｂ２、ｂ３、・・・は、メモリ２０ｂと受信バッファ２２ａに順次格納される。
【００１５】
また、ＣＰＵ１８ａは、最新の３つの自己の演算値（例えば、ａ１、ａ２、ａ３）をメモリ２０ａから読み出し、最新のＣＰＵ１８ｂの演算値（例えば、ｂ３）を受信バッファ２２ａから読み出す。そして、ＣＰＵ１８ａは、演算値ａ１、ａ２、ａ３と演算値ｂ３を比較し、比較結果に応じた制御信号を出力ポート２４を介してアクチュエータやＬＥＤ表示器等で構成される外部機器１４に与える。この時、外部機器１４は、ＣＰＵ１８ａから与えられた制御信号に従って作動する。
【００１６】
一方、ＣＰＵ１８ｂは、最新の３つの自己の演算値（例えば、ｂ１、ｂ２、ｂ３）をメモリ２０ｂから読み出し、最新のＣＰＵ１８ａの演算値（例えば、ａ３）を受信バッファ２２ｂから読み出す。そして、ＣＰＵ１８ｂは、演算値ｂ１、ｂ２、ｂ３と演算値ａ３を比較し、比較の結果、必要に応じてＣＰＵ１８ａから外部機器１４への制御信号の出力を禁止にする。
【００１７】
メモリ２０ａ、２０ｂは、それぞれＣＰＵ１８ａ、１８ｂの演算処理の結果である演算値ａ１、ａ２、・・・、ｂ１、ｂ２、・・・の他、ＣＰＵ１８ａ、１８ｂの動作プログラム等を格納する。
次に、ＣＰＵ１８ａ、１８ｂの動作説明をフローチャートを用いて行なう。
図２は、ＣＰＵ１８ａが実行するルーチンを示すフローチャートである。図２に示すルーチンは、その処理が終了する毎に繰り返し起動させる。なお、メモリ２０ａ内には、前々回と前回のルーチンで取得された入力信号に基づくＣＰＵ１８ａの演算処理の結果である演算値ａ１、ａ２が既に格納されているものとする。また、受信バッファ２２ａには、ＣＰＵ１８ｂの演算処理の結果である最新の演算値ｂ３が既に格納されているものとする。
【００１８】
図２に示すルーチンが起動されると、先ず、ステップ１００において、外部機器１２からの入力信号が入力ポート１６を介して取得される。このステップ１００の処理が終了すると、次に、ステップ１０２の処理が実行される。
ステップ１０２では、ＣＰＵ１８ｂによる演算処理の結果である演算値ｂ３が受信バッファ２２ａから取得される。そして、次に、ステップ１０４の処理が実行される。
【００１９】
ステップ１０４では、ステップ１００において取得された入力信号に基づいた所定の演算が実行され、その結果、演算値ａ３が算出される。そして、続くステップ１０６では、ステップ１０４で算出された演算値ａ３がメモリ２０ａに格納される。また、演算値ａ３は、受信バッファ２２ｂにも与えられ、受信バッファ２２ｂ内に格納される。このステップ１０６の処理が終了すると、次に、ステップ１０８の処理が実行される。
【００２０】
ステップ１０８では、メモリ２０ａに格納されていた最新の３つの演算値ａ１、ａ２、ａ３と、ステップ１０２で取得されたＣＰＵ１８ｂの最新の演算値ｂ３との大きさの比較が行なわれる。ここで、例えば、演算値ａ１、ａ２、ａ３のうちの最大値をａ_ＭＡＸ、最小値をａ_ＭＩＮとする。また、比較時の微小な誤差等の許容範囲を定める判定余裕値をαとする。この判定余裕値αは、予め設定されているものとする。ステップ１０８では、演算値ｂ３と演算値ａ_ＭＡＸ＋α及びａ_ＭＩＮ−αとの大小関係の比較が行なわれる。そして、この比較処理の後にステップ１１０の処理が実行される。
【００２１】
ステップ１１０では、ステップ１０８の比較結果に基づいた判別処理が実行される。ステップ１０８において、ｂ３＞ａ_ＭＡＸ＋α、又は、ｂ３＜ａ_ＭＩＮ−αが不成立ならば、ステップ１１０において、ＣＰＵ１８ｂの演算値ｂ３は、ＣＰＵ１８ａの演算値ａ１、ａ２、ａ３と近似しており、ＣＰＵ１８ａ、１８ｂは共に正常であると判断される。このステップ１１０の処理が終了すると、次に、ステップ１１２の処理が実行される。一方、ステップ１０８において、ｂ３＞ａ_ＭＡＸ＋α、又は、ｂ３＜ａ_ＭＩＮ−αが成立するならば、ステップ１１０において、ＣＰＵ１８ｂの演算値ｂ３は、ＣＰＵ１８ａの演算値ａ１、ａ２、ａ３と大きく乖離しており、ＣＰＵ１８ａ、１８ｂの少なくとも一方は異常であると判断される。そして、ＣＰＵ１８ａから外部機器１４に対する制御信号の出力は停止され、今回のルーチンは終了となる。
【００２２】
ステップ１１２では、ステップ１０４で算出された演算値ａ３に基づき、制御信号が出力ポート２４を介して外部機器１４に与えられる。この時、外部機器１４は、ＣＰＵ１８ａから与えられた制御信号に従って作動する。そして、再び、ステップ１００の処理が実行される。
一方、ＣＰＵ１８ｂは、以下のようなルーチンを実行する。
【００２３】
図３は、ＣＰＵ１８ｂが実行するルーチンを示すフローチャートである。図３に示すルーチンは、その処理が終了する毎に繰り返し起動される。なお、メモリ２０ｂ内には、前々回と前回のルーチンで取得された入力信号に基づくＣＰＵ１８ｂの演算処理の結果である演算値ｂ１、ｂ２が既に格納されているものとする。また、受信バッファ２２ｂには、ＣＰＵ１８ａの最新の演算処理の結果である演算値ａ３が既に格納されているものとする。
【００２４】
図３に示すルーチンが起動されると、先ず、ステップ２００において、外部機器１２からの入力信号が入力ポート１６を介して取得される。ステップ２００の処理が終了すると、次に、ステップ２０２の処理が実行される。
ステップ２０２では、図２に示すルーチンのステップ１０６において受信バッファ２２ｂ内に格納された演算値ａ３が取得される。そして、次に、ステップ２０４の処理が実行される。
【００２５】
ステップ２０４では、ステップ２００において取得された入力信号に基づいた所定の演算が実行され、その結果、演算値ｂ３が算出される。そして、次のステップ２０６では、ステップ２０４で算出された演算値ｂ３がメモリ２０ｂに格納される。また、演算値ｂ３は、受信バッファ２２ａにも与えられ、受信バッファ２２ａ内に格納される。このステップ２０６の処理が終了すると、次に、ステップ２０８の処理が実行される。
【００２６】
ステップ２０８では、メモリ２０ｂに格納されていた最新の３つの演算値ｂ１、ｂ２、ｂ３と、ステップ２０２で取得されたＣＰＵ１８ａの最新の演算値ａ３との大きさの比較が行なわれる。ここで、例えば、演算値ｂ１、ｂ２、ｂ３のうちの最大値をｂ_ＭＡＸ、最小値をｂ_ＭＩＮとする。また、比較時の微小な誤差等の許容範囲を定める判定余裕値をβとする。この判定余裕値βは、予め設定されているものとする。ステップ２０８では、演算値ａ３と演算値ａ_ＭＡＸ＋β及びｂ_ＭＩＮ−βとの大小関係の比較が行なわれる。そして、この比較処理の後にステップ２１０の処理が実行される。
【００２７】
ステップ２１０では、ステップ２０８の比較結果に基づいた判別処理が実行される。ステップ２０８において、ａ３＞ｂ_ＭＡＸ＋β、又は、ａ３＜ｂ_ＭＩＮ−βが不成立ならば、ステップ２１０において、ＣＰＵ１８ａの演算値ａ３は、ＣＰＵ１８ｂの演算値ｂ１、ｂ２、ｂ３と近似しており、ＣＰＵ１８ａ、１８ｂは共に正常であると判断される。そして、このステップ２１０の処理が終了すると、再び、ステップ２００の処理が実行される。一方、ステップ２０８において、ａ３＞ｂ_ＭＡＸ＋β、又は、ａ３＜ｂ_ＭＩＮ−βが成立するならば、ステップ２１０において、ＣＰＵ１８ａの演算値ａ３は、ＣＰＵ１８ｂの演算値ｂ１、ｂ２、ｂ３と大きく乖離しており、ＣＰＵ１８ａ、１８ｂの少なくとも一方は異常であると判断される。この場合、次に、ステップ２１２の処理が実行される。
【００２８】
ステップ２１２では、ＣＰＵ１８ａに停止信号が与えられ、ＣＰＵ１８ａから外部機器１４への制御信号の出力が禁止となる。そして、今回のルーチンは終了となる。
上記のように、ＣＰＵ１８ａ、１８ｂが互いの演算値を比較し合うので、ＣＰＵ１８ａ、１８ｂが正常であるか否かが容易に判別できる。また、本発明によれば、ＣＰＵ１８ａ、１８ｂの同期をとる必要がないため、同期信号発生回路からＣＰＵ１８ａ、１８ｂに同期信号を与えなくてもよい。
【００２９】
ここで、ＣＰＵ１８ａ、１８ｂが互いの演算値を任意のタイミングで取得して、比較処理を行なうようにすると、相手側ＣＰＵ（例えば、ＣＰＵ１８ａにとってのＣＰＵ１８ｂ）の最新の演算値が自己ＣＰＵの最新の演算値より先に算出された値なのか後に算出された値なのか判別できない。例えば、ＣＰＵ１８ｂの最新の演算値ｂ３がＣＰＵ１８ａの最新の演算値ａ３より先に算出された値なのか後に算出された値なのか判別できない。この場合、ＣＰＵ１８ａの前回の演算処理で算出された演算値ａ２と、今回（最新）の演算値ａ３と、演算値ａ３の次に算出される演算値ａ４とを考慮して、上記ステップ１０８の比較処理における誤差等の許容範囲を定める判定余裕値αを大きめに設定する必要がある。
【００３０】
しかし、本発明では、上記ステップ１０２、１０４及びステップ２０２、２０４に示すように、ＣＰＵ１８ａ、１８ｂは、共に相手側ＣＰＵの最新の演算値を受信バッファ２２ａ、２２ｂを介して取得してから、自己の演算処理を行なっている。このため、相手際ＣＰＵの最新の演算値は、常に自己ＣＰＵの最新の演算値よりも先に算出されたものであると確定できる。
【００３１】
また、相手側ＣＰＵが最新の演算値を算出した後に、自己ＣＰＵがその演算値を受信バッファを介して取得するまでは所定の微小な時間τ（τ＜ｔ）を要するので、本発明において、自己ＣＰＵが受信バッファを介して取得した相手側ＣＰＵの最新の演算値は、自己ＣＰＵの前々回の演算処理時から今回（最新）の演算処理時の間に算出されたものであると限定できる。
【００３２】
そこで、本発明のステップ１０８、２０８では、相手側ＣＰＵの最新の演算値と、自己ＣＰＵの前々回、前回及び今回の演算処理時の演算値である最新の３つの演算値とを比較する構成にしている。このように、本発明では、相手側ＣＰＵの最新の演算値と比較する自己ＣＰＵの演算値を最小数の３つに限定しているので、比較処理における誤差等の許容範囲を定める判定余裕値α、βをより小さな値に設定することができる。判定余裕値α、βをより小さな値に設定することで、ＣＰＵ１８ａ、１８ｂの異常の検出がより高精度に行なわれる。
【００３３】
ここで、ＣＰＵ１８ａ、１８ｂの演算値の比較処理が不要な入力信号に対しては、ＣＰＵ１８ａ、１８ｂが個別に演算処理を行なうようにしてもよい。演算値の比較処理が不要な入力信号をＣＰＵ１８ａ、１８ｂのいずれか一方に分配することで、２つのＣＰＵ１８ａ、１８ｂを有効に利用することができる。
なお、上記実施例は、ＣＰＵ１８ａ、１８ｂが共にそれぞれのルーチンに従って動作し、受信バッファを介して最新の演算値を相手側ＣＰＵに与える構成であるが、例えば、ＣＰＵ１８ａをマスタＣＰＵとし、ＣＰＵ１８ｂをスレーブＣＰＵとしてもよい。この場合、マスタＣＰＵであるＣＰＵ１８ａがスレーブＣＰＵであるＣＰＵ１８ｂを起動させる。そして、ＣＰＵ１８ａによって起動したＣＰＵ１８ｂが送信バッファに格納していた最新の演算値をＣＰＵ１８ａに与える。
【００３４】
図４は、ＣＰＵ１８ａ、１８ｂが共に正常である時の演算値ａ１、ａ２、ａ３及びｂ３を示す図である。なお、演算値ａ１、ａ２、ｂ３、ａ３にそれぞれ対応する時刻ｔ１、ｔ２、ｔ３、ｔ４は、各演算値がＣＰＵ１８ａ、１８ｂによって算出された時刻を示す。また、演算値ａ１、ａ２、ｂ３、ａ３にそれぞれ対応する値Ａ、Ｂ、Ｃ、Ｄは、各演算値ａ１、ａ２、ｂ１、ａ３の大きさを示す。
【００３５】
図４に示すように、ＣＰＵ１８ｂによって時刻ｔ３に算出された最新の演算値ｂ３の値Ｃは、ＣＰＵ１８ａによって時刻ｔ１に算出された演算値ａ１の値Ａより大きく、時刻ｔ４に算出された演算値ａ３の値Ｄより小さい。従って、この場合、図２に示したルーチンのステップ１０８において、ＣＰＵ１０ａは、ＣＰＵ１８ａ、１８ｂが共に正常であると判断して外部機器１４を制御するための制御信号を出力する。
【００３６】
図５は、ＣＰＵ１８ａ、１８ｂの少なくとも一方が異常である時の演算値ａ１、ａ２、ａ３及びｂ３を示す図である。なお、演算値ａ１、ａ２、ｂ３、ａ３にそれぞれ対応する時刻ｔ１、ｔ２、ｔ３、ｔ４は、各演算値がＣＰＵ１８ａ、１８ｂによって算出された時刻を示す。また、演算値ａ１、ａ２、ｂ３、ａ３にそれぞれ対応する値Ａ、Ｂ、Ｃ、Ｄは、各演算値ａ１、ａ２、ｂ１、ａ３の大きさを示す。
【００３７】
図５に示すように、ＣＰＵ１８ｂによって時刻ｔ３に算出された最新の演算値ｂ３の値Ｃは、ＣＰＵ１８ａによって時刻ｔ４に算出された演算値ａ３の値Ｄに判定余裕値αを加えた値よりも大きい。従って、この場合、図２に示したルーチンのステップ１０８において、ＣＰＵ１０ａは、ＣＰＵ１８ａ、１８ｂの少なくとも一方が異常であると判断して、外部機器１４を制御するための制御信号の出力を停止する。同様に、ＣＰＵ１０ｂがＣＰＵ１８ａ、１８ｂの少なくとも一方が異常であると判断する場合は、ＣＰＵ１８ａから外部機器１４への制御信号の出力を禁止にする。
【００３８】
なお、制御用コンピュータ１０内のＣＰＵの数は２個に限らず、制御用コンピュータ１０内に３個以上のＣＰＵを設けて、互いの演算値の比較によってＣＰＵの異常検出を行なうようにしてもよい。
上記実施例において、図２のステップ１００及び図３のステップ２００の処理が特許請求の範囲に記載の第１のステップに相当し、図２のステップ１０２及び図３のステップ２０２の処理が特許請求の範囲に記載の第２のステップに相当し、図２のステップ１０４及び図３のステップ２０４の処理が特許請求の範囲に記載の第３のステップに相当する。また、図２のステップ１０８及び図３のステップ２０８の処理が特許請求の範囲に記載の第４のステップに相当し、図２のステップ１１０及び図３のステップ２１０の処理が特許請求の範囲に記載の第５のステップに相当する。
【００３９】
【発明の効果】
上述の如く、請求項１記載の発明によれば、複数のＣＰＵを有するコンピュータシステムにおいて、ＣＰＵの同期をとることなく、ＣＰＵの異常が容易に検出される。また、ＣＰＵの異常検出を高精度に行うことができる。
また、請求項２記載の発明によれば、ＣＰＵの異常検出をより高精度に行うことができる。
【図面の簡単な説明】
【図１】本発明の異常検出方法が適用される制御用コンピュータの構成図である。
【図２】ＣＰＵが実行するルーチンを示すフローチャートである。
【図３】ＣＰＵが実行するルーチンを示すフローチャートである。
【図４】２つのＣＰＵが共に正常である時の演算値の値を示す図である。
【図５】２つのＣＰＵのうちの少なくとも一方が異常である時の演算値の値を示す図である。
【符号の説明】
１０制御用コンピュータ
１２、１４外部機器
１６入力ポート
１８ａ、１８ｂＣＰＵ
２０ａ、２０ｂメモリ
２２ａ、２２ｂ受信バッファ
２４出力ポート
ａ１、ａ２、ａ３、ｂ１、ｂ２、ｂ３演算値
α、β 判定余裕値

[0001]
[Technical Field of the Invention]
The present invention relates to a CPU abnormality detection method, and more particularly to a CPU abnormality detection method in a computer system having a plurality of CPUs.
[0002]
[Prior art]
Computer systems that incorporate a plurality of CPUs (Central Processing Units) to improve operational reliability have long been widely known. In such a computer system, the presence or absence of a CPU failure or the like is detected by comparing the calculation results of the plurality of CPUs. For example, Japanese Patent Laid-Open No. 5-324391 discloses a method for detecting the presence or absence of a CPU failure by means of a bus comparator.
[0003]
This bus comparator has a compression processing unit and a comparator provided for each of a plurality of CPUs. Each compression processing unit has a data compression unit and a serial transfer unit for each bit of the corresponding CPU. In such a configuration, the output data of the CPU is code-compressed by the data compression unit in the bus comparator. The code-compressed data is serially transferred by the serial transfer unit and then output to the comparator side. The comparator sequentially compares the compressed data output from each compression processing unit, and detects a CPU failure or the like when there is mismatched data.
[0004]
According to such a failure detection method, since the output data of the CPU is compressed by the data compression unit, it is possible to cope with an increase in the number of bits of the CPU bus. Further, since the CPU output data is compressed and the output frequency of the comparison result is lowered, it is not necessary to provide a frequency dividing circuit or the like for ensuring fail-safe, and the cost of the bus comparator or the like can be reduced.
[0005]
[Problems to be solved by the invention]
However, detecting a CPU abnormality by the method of the above conventional example requires synchronizing the calculation timings of the plurality of CPUs, the input timings of data given to the plurality of CPUs from external sensors and the like, and so on. For this reason, a synchronization signal had to be supplied to each CPU from a synchronization signal generation circuit.
[0006]
In addition, to synchronize the CPUs, the calculation time and the like that each CPU requires for each process had to be estimated strictly. Furthermore, countermeasures had to be prepared for the case where the synchronization signal for synchronizing the plurality of CPUs does not work properly and the CPUs cannot be synchronized.
In the above conventional example, when a plurality of CPUs that are not perfectly synchronized perform calculations, their calculation results differ. When the results are then compared with each other, a normal CPU may be erroneously recognized as abnormal, or an abnormal CPU may be erroneously recognized as normal.
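The false-alarm risk described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the ramp-shaped input `ramp`, the 6 ms period, and the 3 ms phase offset are hypothetical values chosen for the example, not figures taken from the conventional method.

```python
def ramp(t_ms):
    """Hypothetical sensor input that rises by 2 units per millisecond."""
    return 2 * t_ms


def sample(signal, times_ms):
    """Values a healthy CPU would calculate when it samples at the given times."""
    return [signal(t) for t in times_ms]


# Both CPUs run with the same 6 ms period but are not synchronized:
# CPU B samples 3 ms later than CPU A (assumed offset).
values_a = sample(ramp, [0, 6, 12])
values_b = sample(ramp, [3, 9, 15])

# An exact-match comparison flags every pair, although both CPUs are healthy.
false_alarms = [a != b for a, b in zip(values_a, values_b)]
print(values_a, values_b, false_alarms)
```

Every pair differs even though neither CPU is faulty, which is why an exact comparison depends on tight synchronization.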
[0007]
The present invention has been made in view of the above points, and an object of the present invention is to provide a method for easily detecting an abnormality of a CPU in a computer system having a plurality of CPUs, without synchronizing the plurality of CPUs.
[0008]
[Means for Solving the Problems]
The above object is achieved, as set forth in claim 1, by a method for detecting an abnormality of a CPU in a computer system having a plurality of CPUs, the method comprising:
a first step of causing each CPU to sequentially acquire input signals at predetermined intervals;
a second step of causing each CPU to acquire the latest calculated value that another CPU has calculated for an input signal;
a third step of causing each CPU, after it has acquired the latest calculated value calculated by the other CPU through execution of the second step, to calculate a calculated value for the input signal;
a fourth step of causing each CPU to compare, in magnitude, each of the latest predetermined number of calculated values that it has calculated with the latest calculated value calculated by the other CPU; and
a fifth step of detecting an abnormality of a CPU based on the result of the magnitude comparison in the fourth step.
[0009]
In such a CPU abnormality detection method, the plurality of CPUs compare each other's calculated values, so that a CPU abnormality is easily detected without synchronizing the plurality of CPUs. In addition, because each CPU calculates its own calculated value only after acquiring the latest calculated value of the other CPU, the latest calculated value acquired from the other CPU was, for each CPU, always calculated before that CPU's own latest calculated value. For this reason, each CPU need only detect the presence or absence of a CPU abnormality based on the error between the latest predetermined number of calculated values it has already calculated and the latest calculated value calculated by the other CPU; it need not take into account, at the time of comparison, the calculated value that it will calculate after its own latest calculated value. Accordingly, the allowable error range used in the comparison can be set small, and highly accurate CPU abnormality detection is realized.
[0010]
The above object is also achieved, as set forth in claim 2, by the CPU abnormality detection method according to claim 1,
wherein the latest predetermined number of calculated values in the fourth step are the latest three calculated values.
Because each CPU requires a predetermined minute time to acquire the latest calculated value calculated by the other CPU, the latest calculated value of the other CPU acquired by each CPU can be narrowed down to one calculated between that CPU's own calculation before last and its latest calculation. According to the present invention, the calculated values of the own CPU that are compared with the latest calculated value of the counterpart CPU are limited to the latest three calculated values, so the allowable error range used in the comparison can be set to a smaller value. Setting the allowable error range to a smaller value allows a CPU abnormality to be detected with higher accuracy.
[0011]
[Embodiments of the Invention]
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
FIG. 1 is a configuration diagram of a control computer 10 that detects an abnormality of a CPU by the abnormality detection method of the present invention. The control computer 10 performs arithmetic processing based on input signals sequentially given from the external device 12. Then, the control computer 10 controls the operation of the external device 14 by giving a control signal corresponding to the result of the arithmetic processing to the external device 14. The external device 12 that supplies an input signal to the control computer 10 is, for example, a switch or a sensor. The external device 14 controlled by the control computer 10 is, for example, an actuator or an LED display.
[0012]
As shown in FIG. 1, the control computer 10 includes an input port 16, CPUs 18a and 18b, memories 20a and 20b, reception buffers 22a and 22b, an output port 24, and the like.
The input port 16 sequentially takes in input signals from the external device 12, which is configured by switches, sensors, and the like. The input port 16 then supplies the CPUs 18a and 18b with the input signals after noise removal processing, level shift processing, and the like have been applied.
[0013]
The CPU 18a performs arithmetic processing based on the input signal given from the external device 12 via the input port 16, and calculates calculated values a1, a2, a3, ... according to the input signal. Similarly, the CPU 18b performs arithmetic processing based on the input signal given from the external device 12 via the input port 16, and calculates calculated values b1, b2, b3, ... according to the input signal.
[0014]
The CPUs 18a and 18b are not synchronized with each other, and both repeat the calculation of a calculated value at a predetermined interval t (for example, about 6 ms). Therefore, the calculated values a1, a2, a3, ... and the calculated values b1, b2, b3, ... are calculated alternately, for example in the order b1, a1, b2, a2, b3, a3, and so on.
The calculated values a1, a2, a3, ... calculated by the CPU 18a are sequentially stored in the memory 20a and the reception buffer 22b. Likewise, the calculated values b1, b2, b3, ... calculated by the CPU 18b are sequentially stored in the memory 20b and the reception buffer 22a.
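The storage arrangement described in this paragraph can be sketched as follows. This is a minimal illustration, not the patented implementation: the class names `ReceiveBuffer` and `CpuNode` are hypothetical, and a `deque` with `maxlen=3` is assumed as a stand-in for the part of memory 20a/20b that keeps the latest three calculated values.

```python
from collections import deque


class ReceiveBuffer:
    """Hypothetical model of reception buffer 22a/22b: keeps only the newest value."""

    def __init__(self):
        self.latest = None

    def write(self, value):
        self.latest = value

    def read(self):
        return self.latest


class CpuNode:
    """Hypothetical model of one CPU with its own memory and the two buffers it touches."""

    def __init__(self, own_rx, partner_rx):
        self.history = deque(maxlen=3)  # memory 20a/20b: latest three own values
        self.own_rx = own_rx            # buffer this CPU reads (partner's values)
        self.partner_rx = partner_rx    # buffer this CPU writes (for the partner)

    def store(self, value):
        """Store a new calculated value in own memory and in the partner's buffer."""
        self.history.append(value)
        self.partner_rx.write(value)


rx_22a, rx_22b = ReceiveBuffer(), ReceiveBuffer()
cpu_18a = CpuNode(own_rx=rx_22a, partner_rx=rx_22b)
cpu_18b = CpuNode(own_rx=rx_22b, partner_rx=rx_22a)

for value in (1.0, 2.0, 3.0, 4.0):
    cpu_18a.store(value)
```

After four stores, `cpu_18a.history` holds only the three newest values, and reception buffer 22b holds the newest one for the CPU 18b to read.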
[0015]
The CPU 18a reads the latest three calculated values (for example, a1, a2, and a3) from the memory 20a, and reads the latest calculated value (for example, b3) of the CPU 18b from the reception buffer 22a. Then, the CPU 18a compares the calculated values a1, a2, and a3 with the calculated value b3, and provides a control signal corresponding to the comparison result to the external device 14 configured by an actuator, an LED display, or the like via the output port 24. At this time, the external device 14 operates according to a control signal given from the CPU 18a.
[0016]
On the other hand, the CPU 18b reads the latest three calculated values (for example, b1, b2, and b3) from the memory 20b, and reads the latest calculated value (for example, a3) of the CPU 18a from the reception buffer 22b. Then, the CPU 18b compares the calculated values b1, b2, and b3 with the calculated value a3, and as a result of the comparison, prohibits the output of the control signal from the CPU 18a to the external device 14 as necessary.
[0017]
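The memories 20a and 20b store the operating programs of the CPUs 18a and 18b and the like, in addition to the calculated values a1, a2, ..., b1, b2, ... that result from the arithmetic processing of the CPUs 18a and 18b, respectively.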
Next, the operation of the CPUs 18a and 18b will be described using a flowchart.
FIG. 2 is a flowchart showing the routine executed by the CPU 18a. The routine shown in FIG. 2 is started again each time its processing ends. It is assumed that the memory 20a already stores the calculated values a1 and a2, which are the results of the arithmetic processing of the CPU 18a based on the input signals acquired in the routine before last and in the previous routine. It is also assumed that the reception buffer 22a already stores the latest calculated value b3, which is the result of the arithmetic processing of the CPU 18b.
[0018]
When the routine shown in FIG. 2 is started, first, in step 100, an input signal from the external device 12 is acquired via the input port 16. When the process of step 100 is completed, the process of step 102 is executed next.
In step 102, a calculation value b3 that is a result of the calculation process by the CPU 18b is acquired from the reception buffer 22a. Next, the process of step 104 is executed.
[0019]
In step 104, a predetermined calculation based on the input signal acquired in step 100 is executed, and as a result, a calculation value a3 is calculated. In the subsequent step 106, the calculated value a3 calculated in step 104 is stored in the memory 20a. The calculated value a3 is also given to the reception buffer 22b and stored in the reception buffer 22b. When the process of step 106 is completed, the process of step 108 is executed next.
[0020]
In step 108, the latest three calculated values a1, a2, and a3 stored in the memory 20a are compared in magnitude with the latest calculated value b3 of the CPU 18b acquired in step 102. Here, for example, let a_MAX be the maximum and a_MIN the minimum of the calculated values a1, a2, and a3. Let α be a determination margin value that defines the allowable range for minute errors and the like in the comparison. The determination margin value α is set in advance. In step 108, the calculated value b3 is compared in magnitude with a_MAX + α and with a_MIN − α. After this comparison, the process of step 110 is executed.
[0021]
In step 110, a determination is made based on the comparison result of step 108. If neither b3 > a_MAX + α nor b3 < a_MIN − α holds in step 108, it is determined in step 110 that the calculated value b3 of the CPU 18b approximates the calculated values a1, a2, and a3 of the CPU 18a and that the CPUs 18a and 18b are both normal. When the process of step 110 ends, the process of step 112 is executed next. On the other hand, if b3 > a_MAX + α or b3 < a_MIN − α holds in step 108, it is determined in step 110 that the calculated value b3 of the CPU 18b deviates greatly from the calculated values a1, a2, and a3 of the CPU 18a and that at least one of the CPUs 18a and 18b is abnormal. In that case, the output of the control signal from the CPU 18a to the external device 14 is stopped, and the current routine ends.
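The comparison of step 108 and the determination of step 110 amount to a range check with a margin. The sketch below is one way to express it; the function name `partner_in_range` and the sample numbers are hypothetical and chosen only for illustration.

```python
def partner_in_range(own_latest_three, partner_latest, margin):
    """Return True when the partner value is judged normal.

    Abnormal means partner_latest > a_MAX + margin or
    partner_latest < a_MIN - margin, so normal is the complement.
    """
    lower = min(own_latest_three) - margin  # a_MIN - alpha
    upper = max(own_latest_three) + margin  # a_MAX + alpha
    return lower <= partner_latest <= upper


own_values = [10.0, 12.0, 14.0]  # hypothetical a1, a2, a3
alpha = 0.5                      # hypothetical determination margin value

inside = partner_in_range(own_values, 13.0, alpha)
on_upper_bound = partner_in_range(own_values, 14.5, alpha)
above = partner_in_range(own_values, 14.6, alpha)
below = partner_in_range(own_values, 9.4, alpha)
```

A value exactly equal to a_MAX + α is judged normal here, because the abnormality condition in the text uses strict inequalities.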
[0022]
In step 112, a control signal based on the calculated value a3 calculated in step 104 is given to the external device 14 via the output port 24. At this time, the external device 14 operates according to the control signal given from the CPU 18a. The process of step 100 is then executed again.
On the other hand, the CPU 18b executes the following routine.
[0023]
FIG. 3 is a flowchart showing the routine executed by the CPU 18b. The routine shown in FIG. 3 is started again each time its processing ends. It is assumed that the memory 20b already stores the calculated values b1 and b2, which are the results of the arithmetic processing of the CPU 18b based on the input signals acquired in the routine before last and in the previous routine. It is also assumed that the reception buffer 22b already stores the calculated value a3, which is the result of the latest arithmetic processing of the CPU 18a.
[0024]
When the routine shown in FIG. 3 is started, first, in step 200, an input signal from the external device 12 is acquired via the input port 16. When the process of step 200 is completed, the process of step 202 is then executed.
In step 202, the operation value a3 stored in the reception buffer 22b in step 106 of the routine shown in FIG. 2 is acquired. Next, the process of step 204 is executed.
[0025]
In step 204, a predetermined calculation based on the input signal acquired in step 200 is executed, and as a result, a calculated value b3 is calculated. In the next step 206, the calculated value b3 calculated in step 204 is stored in the memory 20b. The calculated value b3 is also given to the reception buffer 22a and stored in the reception buffer 22a. When the process of step 206 is completed, the process of step 208 is then executed.
[0026]
In step 208, the latest three calculated values b1, b2, and b3 stored in the memory 20b are compared in magnitude with the latest calculated value a3 of the CPU 18a acquired in step 202. Here, for example, let b_MAX be the maximum and b_MIN the minimum of the calculated values b1, b2, and b3. Let β be a determination margin value that defines the allowable range for minute errors and the like in the comparison. The determination margin value β is set in advance. In step 208, the calculated value a3 is compared in magnitude with b_MAX + β and with b_MIN − β. After this comparison, the process of step 210 is executed.
[0027]
In step 210, a determination is made based on the comparison result of step 208. If neither a3 > b_MAX + β nor a3 < b_MIN − β holds in step 208, it is determined in step 210 that the calculated value a3 of the CPU 18a approximates the calculated values b1, b2, and b3 of the CPU 18b and that the CPUs 18a and 18b are both normal. When the process of step 210 ends, the process of step 200 is executed again. On the other hand, if a3 > b_MAX + β or a3 < b_MIN − β holds in step 208, it is determined in step 210 that the calculated value a3 of the CPU 18a deviates greatly from the calculated values b1, b2, and b3 of the CPU 18b and that at least one of the CPUs 18a and 18b is abnormal. In this case, the process of step 212 is executed next.
[0028]
In step 212, a stop signal is given to the CPU 18a, and the output of the control signal from the CPU 18a to the external device 14 is prohibited. The current routine then ends.
As described above, since the CPUs 18a and 18b compare the calculated values with each other, it can be easily determined whether or not the CPUs 18a and 18b are normal. Further, according to the present invention, since it is not necessary to synchronize the CPUs 18a and 18b, it is not necessary to provide a synchronization signal from the synchronization signal generating circuit to the CPUs 18a and 18b.
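One pass through either routine can be sketched as a single function. This is an illustrative model under stated assumptions, not the routine as implemented in the embodiment: the name `run_cycle`, the dictionaries that stand in for the reception buffers, and the callables `read_input` and `compute` are all hypothetical. The essential point it preserves is the order of operations: the partner's value is read before the own value is calculated.

```python
from collections import deque


def run_cycle(read_input, compute, history, own_rx, partner_rx, margin):
    """Run one cycle and return (own_value, normal)."""
    signal = read_input()              # step 100 / 200: acquire the input signal
    partner_value = own_rx["latest"]   # step 102 / 202: read partner value first
    value = compute(signal)            # step 104 / 204: calculate own value
    history.append(value)              # step 106 / 206: store in own memory
    partner_rx["latest"] = value       # ... and in the partner's reception buffer
    low = min(history) - margin        # step 108 / 208: range with margin
    high = max(history) + margin
    normal = low <= partner_value <= high  # step 110 / 210: determination
    return value, normal


# View of the CPU 18a: memory 20a already holds a1 and a2, buffer 22a holds b3.
rx_22a = {"latest": 2.9}
rx_22b = {"latest": None}
memory_20a = deque([1.0, 2.0], maxlen=3)

a3, normal_first = run_cycle(lambda: 3.0, lambda s: s, memory_20a,
                             rx_22a, rx_22b, margin=0.2)

# Next cycle with an implausible partner value (hypothetical fault).
rx_22a["latest"] = 5.0
a4, normal_second = run_cycle(lambda: 4.0, lambda s: s, memory_20a,
                              rx_22a, rx_22b, margin=0.2)
```

In the routine of FIG. 2, a normal result leads to step 112 (output of the control signal), while an abnormal result stops the output. In the routine of FIG. 3, an abnormal result leads to step 212 (stop signal to the CPU 18a).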
[0029]
If the CPUs 18a and 18b were to acquire each other's calculated values at arbitrary timings and then perform the comparison, it could not be determined whether the latest calculated value of the counterpart CPU (for example, the CPU 18b from the viewpoint of the CPU 18a) was calculated before or after the latest calculated value of the own CPU. For example, it could not be determined whether the latest calculated value b3 of the CPU 18b was calculated before or after the latest calculated value a3 of the CPU 18a. In that case, the determination margin value α, which defines the allowable range for errors and the like in the comparison of step 108, would have to be set rather large to take into account the calculated value a2 calculated in the previous calculation of the CPU 18a, the current (latest) calculated value a3, and the calculated value a4 that will be calculated after a3.
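The effect of arbitrary read timing on the margin can be made concrete with a simplified numerical sketch. Everything in it is an assumption made for illustration: the helper `required_margin`, the integer values, and the premise that a healthy calculated value rises by a fixed `step` per cycle.

```python
def required_margin(own_window, healthy_partner_values):
    """Smallest margin that keeps every healthy partner value inside
    [min(own_window) - margin, max(own_window) + margin]."""
    low, high = min(own_window), max(own_window)
    return max(max(0, low - p, p - high) for p in healthy_partner_values)


step = 10                # hypothetical change of the calculated value per cycle
a1, a2, a3 = 10, 20, 30  # hypothetical own values
a4 = a3 + step           # not yet calculated when a3 is compared

# Arbitrary read timing: a healthy partner value may lie anywhere from a2 to a4,
# but a4 is not available, so the margin must absorb one future step.
margin_arbitrary = required_margin([a2, a3], range(a2, a4 + 1))

# Partner value read before the own calculation: a healthy partner value
# cannot be newer than a3 and is covered by the window a1..a3.
margin_ordered = required_margin([a1, a2, a3], range(a1, a3 + 1))
```

Under these assumptions the margin must cover a full step in the first case and nothing in the second, which matches the direction of the argument in the text; real margins would also have to cover noise and calculation error.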
[0030]
In the present invention, however, as shown in steps 102 and 104 and steps 202 and 204, the CPUs 18a and 18b each acquire the latest calculated value of the counterpart CPU via the reception buffers 22a and 22b before performing their own calculation. It is therefore certain that the latest calculated value of the counterpart CPU was always calculated before the latest calculated value of the own CPU.
[0031]
Further, after the counterpart CPU calculates its latest calculated value, a predetermined minute time τ (τ < t) is required until the own CPU acquires that value via the reception buffer. In the present invention, therefore, the latest calculated value of the counterpart CPU that the own CPU acquires via the reception buffer can be narrowed down to one calculated between the own CPU's calculation before last and its current (latest) calculation.
[0032]
Therefore, steps 108 and 208 of the present invention are configured to compare the latest calculated value of the counterpart CPU with the latest three calculated values of the own CPU, that is, the values from its calculation before last, its previous calculation, and its current calculation. Because the present invention thus limits the calculated values of the own CPU that are compared with the latest calculated value of the counterpart CPU to the minimum number of three, the determination margin values α and β, which define the allowable range for errors and the like in the comparison, can be set to smaller values. Setting the determination margin values α and β to smaller values allows an abnormality of the CPUs 18a and 18b to be detected with higher accuracy.
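The timing argument behind the choice of three values can be checked numerically. The sketch below assumes a simplified model that is not stated in this form in the text: both CPUs calculate with the same period t, the counterpart is shifted by an unknown phase, a value becomes readable τ after it is calculated, and the own CPU reads the buffer and calculates at the same instant `read_time`. The function name and the integer microsecond units are hypothetical.

```python
def latest_partner_calc_time(read_time, period, phase, delay):
    """Time of the newest partner calculation whose result has reached the
    reception buffer by read_time. All arguments are integers in microseconds."""
    cycles = (read_time - delay - phase) // period
    return cycles * period + phase


period = 6000            # t: about 6 ms, as in the embodiment
read_time = 10 * period  # own CPU reads the buffer, then calculates
checked = 0

for phase in range(0, period, 250):      # unknown offset between the CPUs
    for delay in range(0, period, 250):  # transfer time tau, with tau < t
        s = latest_partner_calc_time(read_time, period, phase, delay)
        # Newer than the own calculation before last (read_time - 2t),
        # and not newer than the own current calculation.
        assert read_time - 2 * period < s <= read_time
        checked += 1
```

For every combination of phase and delay in this model, the counterpart's value falls inside the interval spanned by the own CPU's last three calculations, which is consistent with comparing against exactly three values.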
[0033]
For input signals that do not require comparison of the calculated values of the CPUs 18a and 18b, the CPUs 18a and 18b may perform arithmetic processing individually. Distributing such input signals to one or the other of the CPUs 18a and 18b allows the two CPUs to be used effectively.
In the above embodiment, the CPUs 18a and 18b each operate according to their own routine and give their latest calculated value to the counterpart CPU via the reception buffer. Alternatively, for example, the CPU 18a may be a master CPU and the CPU 18b a slave CPU. In this case, the CPU 18a as the master CPU activates the CPU 18b as the slave CPU. The CPU 18b activated by the CPU 18a then gives the CPU 18a the latest calculated value that it had stored in a transmission buffer.
[0034]
FIG. 4 is a diagram showing the calculated values a1, a2, a3, and b3 when the CPUs 18a and 18b are both normal. The times t1, t2, t3, and t4 corresponding to the calculated values a1, a2, b3, and a3, respectively, indicate the times at which those values were calculated by the CPUs 18a and 18b. The values A, B, C, and D corresponding to the calculated values a1, a2, b3, and a3, respectively, indicate the magnitudes of the calculated values a1, a2, b3, and a3.
[0035]
As shown in FIG. 4, the value C of the latest calculated value b3 calculated by the CPU 18b at time t3 is larger than the value A of the calculated value a1 calculated by the CPU 18a at time t1, and smaller than the value D of the calculated value a3 calculated at time t4. Therefore, in this case, in step 108 of the routine shown in FIG. 2, the CPU 18a determines that the CPUs 18a and 18b are both normal and outputs a control signal for controlling the external device 14.
[0036]
FIG. 5 is a diagram showing the calculated values a1, a2, a3, and b3 when at least one of the CPUs 18a and 18b is abnormal. The times t1, t2, t3, and t4 corresponding to the calculated values a1, a2, b3, and a3, respectively, indicate the times at which those values were calculated by the CPUs 18a and 18b. The values A, B, C, and D corresponding to the calculated values a1, a2, b3, and a3, respectively, indicate the magnitudes of the calculated values a1, a2, b3, and a3.
[0037]
As shown in FIG. 5, the value C of the latest calculated value b3 calculated by the CPU 18b at time t3 is larger than the value obtained by adding the determination margin value α to the value D of the calculated value a3 calculated by the CPU 18a at time t4. Therefore, in this case, in step 108 of the routine shown in FIG. 2, the CPU 18a determines that at least one of the CPUs 18a and 18b is abnormal and stops outputting the control signal for controlling the external device 14. Similarly, when the CPU 18b determines that at least one of the CPUs 18a and 18b is abnormal, it prohibits the output of the control signal from the CPU 18a to the external device 14.
[0038]
Note that the number of CPUs in the control computer 10 is not limited to two; three or more CPUs may be provided in the control computer 10, and CPU abnormality detection may be performed by comparing their calculated values with one another.
In the above embodiment, the processing of step 100 in FIG. 2 and step 200 in FIG. 3 corresponds to the first step recited in the claims, the processing of step 102 in FIG. 2 and step 202 in FIG. 3 corresponds to the second step recited in the claims, and the processing of step 104 in FIG. 2 and step 204 in FIG. 3 corresponds to the third step recited in the claims. Likewise, the processing of step 108 in FIG. 2 and step 208 in FIG. 3 corresponds to the fourth step recited in the claims, and the processing of step 110 in FIG. 2 and step 210 in FIG. 3 corresponds to the fifth step recited in the claims.
[0039]
[Effects of the Invention]
As described above, according to the invention of claim 1, in a computer system having a plurality of CPUs, a CPU abnormality is easily detected without synchronizing the CPUs. In addition, the CPU abnormality can be detected with high accuracy.
Further, according to the invention of claim 2, the CPU abnormality can be detected with still higher accuracy.
[Brief description of the drawings]
FIG. 1 is a configuration diagram of a control computer to which an abnormality detection method of the present invention is applied.
FIG. 2 is a flowchart showing the routine executed by the CPU 18a.
FIG. 3 is a flowchart showing the routine executed by the CPU 18b.
FIG. 4 is a diagram showing the calculated values when the two CPUs are both normal.
FIG. 5 is a diagram showing the calculated values when at least one of the two CPUs is abnormal.
[Explanation of symbols]
10 Control computer
12, 14 External device
16 Input port
18a, 18b CPU
20a, 20b Memory
22a, 22b Reception buffer
24 Output port
a1, a2, a3, b1, b2, b3 Calculated value
α, β Determination margin value

Claims

A method for detecting an abnormality of a CPU in a computer system having a plurality of CPUs, the method comprising:
a first step of causing each CPU to sequentially acquire input signals at predetermined intervals;
a second step of causing each CPU to acquire the latest calculated value that another CPU has calculated for an input signal;
a third step of causing each CPU, after it has acquired the latest calculated value calculated by the other CPU through execution of the second step, to calculate a calculated value for the input signal;
a fourth step of causing each CPU to compare, in magnitude, each of the latest predetermined number of calculated values that it has calculated with the latest calculated value calculated by the other CPU; and
a fifth step of detecting an abnormality of a CPU based on the result of the magnitude comparison in the fourth step.

The CPU abnormality detection method according to claim 1,
wherein the latest predetermined number of calculated values in the fourth step are the latest three calculated values.