JP3564310B2 - Redundancy device failure information collection method

Info

Publication number: JP3564310B2
Application number: JP32925698A
Authority: JP
Inventors: 將夫浅井; 智紀奈良; 新伊藤
Original assignee: Fujitsu Ltd; Nippon Telegraph and Telephone Corp
Current assignee: Fujitsu Ltd; Nippon Telegraph and Telephone Corp
Priority date: 1998-11-19
Filing date: 1998-11-19
Publication date: 2004-09-08
Anticipated expiration: 2018-11-19
Also published as: JP2000156687A

Description

【０００１】
【発明の属する技術分野】
本発明は二重化装置の障害情報収集方法に係わり、特に、運用系（ＡＣＴ）の制御系装置と予備系（ＳＢＹ）の制御系装置を備えた交換機の障害情報収集方法に関する。
【０００２】
【従来の技術】
交換機を構成する各ユニットは信頼度を向上するために二重化されており、運用系装置が障害によりダウンしても、予備系装置が代わって動作し、サービスを続行できるようになっている。図１５は２台の交換機Ａ、Ｂを通話回線ＬＮで接続し、各交換機に電話機ＴＡ，ＴＢを接続したネットワークを示しており、各交換機Ａ，Ｂは通話路系装置ＳＰＵと制御系装置ＣＴＵを備え、これら通話路系装置ＳＰＵと制御系装置ＣＴＵは共に二重化されている。
通話路系装置ＳＰＵは、交換機がたとえばＡＴＭ交換機であれば、伝送路に接続された回線ＩＦ部、集線分離部、セルスイッチ部などを有している。回線ＩＦ部は、所定フォーマットのフレーム例えばＤＳ１，ＤＳ３フレームあるいはＳＯＮＥＴフレームをＡＴＭセルフォーマットに変換してセルスイッチ側に送出したり、ＡＴＭセルフォーマットをＤＳ１，ＤＳ３フレームあるいはＳＯＮＥＴフレームに変換して回線に送出する機能を有している。集線分離部は多数の回線インタフェースに接続されてセルを多重化してセルスイッチに入力すると共に、セルスイッチ側からの多重セルを分離して所定の回線インタフェースに送出する。セルスイッチ部は入力された多重セルを所定の方路にスイッチングする。
【０００３】
制御系装置ＣＴＵは通話路系装置を制御するもので、ＭＰＵ、メモリ、システム制御部等を有している。図１６は交換機における制御系装置ＣＴＵの構成図であり、１０_０，１０_１はそれぞれ０系，１系の制御系装置で同一の構成を備えている。１１はＰＲＯＣ制御部（処理制御部）であり、プログラムの実行、制御を行うマイクロプロセッサユニット（ＭＰＵ部）１１ａ、ファイル（交換プログラム、緊急制御動作プログラム等）やデータを記憶するメモリ部１１ｂ、システムを二重化構成にするための制御（運用／予備切替制御、ファイル更新制御等）を行うシステム制御部１１ｃ、内部バス１１ｄ、内部バスと拡張バス間のインタフェース制御を行うインタフェース制御部１１ｅを備えている。０系／１系のメモリ部１１ｂ，１１ｂ間はメモリ交絡線ＭＣＬで接続され、０系／１系のシステム制御部１１ｃ，１１ｃ間はシステム交絡線ＳＣＬで接続されている。１２はＳＣＳＩインタフェースを備え、光磁気ディスクドライブやハードディスクドライブの制御を行うＩＯ制御部、１３は保守コンソール（操作パネル）とのインタフェースを行うＥｔｈｅｒｎｅｔコントローラなどのＬＡＮ制御部、１４はスイッチ等の通話路系装置ＳＰＵとのインタフェース制御部、１５は拡張バス、１６は他系とのインタフェースを司るバスインタフェース部であり、拡張バス交絡線ＢＣＬを介して系間交絡制御を行う。
【０００４】
上記二重化された交換機において、一般的な運用方法は片系が処理を実行し（ＡＣＴ状態〔以下ＡＣＴ系と記す〕）、他の片系が待機して（ＳＢＹ状態〔ＳＢＹ系と記す〕）運用される。このような形態での運用中に重大な障害が発生すると緊急動作として、ＡＣＴ／ＳＢＹチェンジ等の動作（旧ＡＣＴ系→新ＳＢＹ系、旧ＳＢＹ系→新ＡＣＴ系）が行われ、サービスに影響をあたえることなく運用を継続する。また、このような障害発生時には、障害情報（各種レジスタ、メモリの障害時の内容）をログ情報とし収集してファイル等に出力するのが一般的な方法となっている。
０系がＡＣＴ系、１系がＳＢＹ系でシステム運用中と仮定すると、０系のＭＰＵ部１１ａはメモリ部１１ｂに書かれた交換プログラム等を読み出し、その内容にしたがって各機能ブロックの制御や演算結果のメモリ部への書き込みを行う。例えば、ＡＣＴ系のＭＰＵ部１１ａは交換プログラムにしたがって運用系及び予備系の通話路系装置ＳＰＵに各種制御データを送出すると共に、ＡＣＴ系の通話路系装置から受信したデータに基づいて所定の処理を行い、処理結果をメモリ部１１ｂに格納する。また、ＡＣＴ系のメモリ部１１ｂに書き込まれたデータの内容はメモリ交絡線ＭＣＬを経由して常にＳＢＹ系のメモリ部１１ｂに反映され、メモリの内容はＡＣＴ系、ＳＢＹ系共に等しく保たれる。
【０００５】
かかる状態において、ＡＣＴ／ＳＢＹチェンジ（系切替）を要する障害、例えば、障害検出用タイマーＴＦ（ＦａｕｌｔＴｉｍｅｒ）のオーバーフローが発生すると、ＡＣＴ／ＳＢＹチェンジ割り込みが発生する。障害検出用タイマーＴＦは例えばウォッチドッグタイマ（ｗａｔｃｈｄｏｇｔｉｍｅｒ）であり、自動的にカウントアップし、そのカウント値を定期的にクリアするものである。プログラムの異常ループや暴走あるいはハード的障害が発生すると、カウント値をクリアできずオーバーフローが発生し、ＡＣＴ／ＳＢＹの系切替の契機となる。
ＭＰＵ部１１ａは、系切替の割込みにより緊急制御動作プログラムに従って処理を実行する。すなわち、ＭＰＵ部１１ａは緊急制御動作プログラムに従って、ＭＰＵ内の全レジスタ情報、その他各機能ブロックの制御情報などを障害情報ログとして収集して所定の記憶部に格納する。格納先はシステムに依存し、メモリ部１１ｂであったり、ＩＯ制御部１２配下のハードディスク、光ディスク等のデバイスであったりする。
【０００６】
ついで、０系、１系にハードウェアリセットを掛ける。しかる後、ＡＣＴ系のシステム制御部１１ｃはＳＢＹ系のシステム制御部１１ｃにシステム交絡線ＳＣＬを介してＡＣＴ／ＳＢＹを切り替えを指示し、ＡＣＴ／ＳＢＹチェンジを行う。これにより、１系（旧ＳＢＹ）が新ＡＣＴとして立ち上がり、以後、１系（新ＡＣＴ）が０系（旧ＡＣＴ）の処理を継続してサービスを提供する。
以上のようなステップを踏んで、ＡＣＴ／ＳＢＹチェンジ動作を行い、また、障害情報ログにより障害内容を解析しその解析内容に応じた処置を行う。
【０００７】
【発明が解決しようとする課題】
以上では、障害検出用タイマーＴＦ（ＦａｕｌｔＴｉｍｅｒ）のオーバーフローによりＡＣＴ／ＳＢＹチェンジが行われた場合であるが、ＡＣＴ／ＳＢＹチェンジの要因となる障害はそのほかにも、
（１）ＭＰＵ自体がデッドロック状態に陥った障害、
（２）障害機能ブロックがバスをロックして固まってしまった障害、
等がある。ＭＰＵのデッドロックとは、ＭＰＵは正常であるけれどもコマンドに対する応答がなくウェイト状態になっている障害であり、又、バスロックとは障害機能ブロックがバス線のレベルをハイレベルまたはローレベルに固定する障害である。（１）及び（２）の障害にもＡＣＴ／ＳＢＹチェンジが行われるが、かかる障害時には「障害情報ログの収集及び格納」が出来ず、障害情報を残すことが出来なくなる。このため、ＡＣＴ／ＳＢＹチェンジ後の障害情報解析が出来ず、故障装置や壊れかけた装置に対する対処が困難であるという問題が発生する。
【０００８】
以上から本発明の目的は、ＡＣＴ／ＳＢＹチェンジが発生するような障害が発生したとき、確実に障害情報の収集ができ、障害箇所を的確に認識して対処できる二重化装置の障害情報収集方法を提供することである。
【０００９】
【課題を解決するための手段】
上記課題は本発明によれば、（１）ＡＣＴ／ＳＢＹ切替を必要とする障害発生時に障害情報の収集を行い、（２）ハードウェアリセット前に障害情報の収集が完了すればフラグをセット状態にし、収集が完了してなければフラグを未セット状態のままにし、（３）両系をハードウェアリセット後、前記フラグを参照して障害情報の収集が完了しているか判断し、（４）完了していなければ障害情報を収集する二重化装置の障害情報収集方法により達成される。以上のようにすれば、障害情報の収集が完了していない場合であっても、ハードウェアリセット後に確実に障害情報の収集ができる。
【００１０】
又、上記課題は本発明によれば、（１）ＡＣＴ／ＳＢＹ切替を必要とする障害発生時に、各系は自系の障害情報の収集を行い、（２）ハードウェアリセット前に障害情報の収集が完了すれば自系の情報収集フラグをセット状態にし、収集が完了しなければフラグを未セット状態のままにし、（３）両系をハードウェアリセット後、各系のフラグを参照して障害情報の収集が完了しているかチェックし、（４）完了していない系の障害情報を収集することにより達成される。以上のようにすれば、両系あるいは片系で障害情報の収集が完了していない場合であっても、ハードウェアリセット後に系毎に確実に障害情報の収集ができる。
【００１１】
【発明の実施の形態】
（Ａ）制御系装置の構成
図１は本発明の交換機の制御系装置の構成図であり、１０_０，１０_１はそれぞれ０系，１系の制御系装置で同一の構成を備えている。１１_０，１１_１は０系及び１系のＰＲＯＣ制御部（処理制御部）であり、プログラムの実行、制御を行うマイクロプロセッサユニット（ＭＰＵ部）１１ａ_０，１１ａ_１、ファイル（交換プログラム、緊急制御動作プログラム等）やデータなどを記憶するメモリ部１１ｂ_０，１１ｂ_１、システムを二重化構成にするための制御（運用／予備切替制御、ファイル更新制御等）を行うシステム制御部１１ｃ_０，１１ｃ_１、内部バス１１ｄ_０，１１ｄ_１、内部バスと拡張バス間のインタフェース制御を行うインタフェース制御部１１ｅ_０，１１ｅ_１、システム制御を行うためのファームウェア部１１ｆ_０，１１ｆ_１を備えている。０系／１系のメモリ部１１ｂ_０，１１ｂ_１間はメモリ交絡線ＭＣＬで接続され、０系／１系のシステム制御部１１ｃ_０，１１ｃ_１間はシステム交絡線ＳＣＬで接続されている。メモリ部１１ｂ_０，１１ｂ_１には、
（１）交換プログラムＰ１、
（２）緊急制御動作プログラムＰ２、
（３）各種データ類ＤＴ、
（４）障害情報ログの収集完了／未完了を示すログ情報収集完了フラグＬＣＦ
等が記憶される。
【００１２】
１２_０，１２_１はＳＣＳＩインタフェースを備え、光磁気ディスクドライブやハードディスクドライブの制御を行うＩＯ制御部、１３_０，１３_１は保守コンソール（操作パネル）とのインタフェースを行うＥｔｈｅｒｎｅｔコントローラなどのＬＡＮ制御部、１４_０，１４_１はセルスイッチ等の通話路系装置ＳＰＵとのインタフェース制御部、１５_０，１５_１は拡張バス、１６_０，１６_１は他系とのインタフェースを司るバスインタフェース部であり、拡張バス交絡線ＢＣＬを介して系間交絡制御を行う。
【００１３】
（Ｂ）第１の発明における障害情報収集処理
第１の発明では、システム運用中にＡＣＴ／ＳＢＹチェンジの障害が発生すると、（１）ＡＣＴ系が両系（ＡＣＴ系、ＳＢＹ系）の障害情報を収集する。（２）ＡＣＴ系はハードウェアリセット前に障害情報の収集が完了すればフラグをセット状態にし、収集が完了しなければフラグを未セット状態のままにする。（３）両系をハードウェアリセット後、ＡＣＴ系あるいはＳＢＹ系は前記ログ情報収集フラグを参照して障害情報の収集が完了しているか判断し、（４）完了していなければ障害情報を収集する。
なお、ハードウェアリセットは、ＴＦオーバーフロー、ＭＰＵのデッドロック、バスロックなどの障害要因を解消するものであり、収集すべき障害情報が記憶されたＭＰＵ内のレジスタやメモリ１１ｂ_０、１１ｂ_１の記憶内容はリセットされない。
【００１４】
上記（１），（２）の処理をするのはＡＣＴ系であるが、（３），（４）の処理をいずれの系で行うかにより４つの形態（第１〜第４実施例）がある。
図２は０系をＡＣＴ系、１系をＳＢＹ系とした場合において、ハードウェアリセット後の（３），（４）の処理を行う系を示す図表である。
▲１▼第１実施例では、旧ＡＣＴ系（０系）が、フラグチェック処理（３）及びフラグ未設定時における障害情報ログ収集処理（４）を実行する。
▲２▼第２実施例では、旧ＡＣＴ系（０系）がフラグチェック処理（３）を行い、新ＡＣＴ系（１系）がフラグ未設定時における障害情報ログ収集処理（４）を実行する。
▲３▼第３実施例では、新ＡＣＴ系（１系）がフラグチェック処理（３）及びフラグ未設定時における障害情報ログ収集処理（４）を実行する。
▲４▼第４実施例では、新ＡＣＴ系（１系）がフラグチェック処理（３）を行い、旧ＡＣＴ系（０系）がフラグ未設定時における障害情報ログ収集処理（４）を実行する。
【００１５】
（ａ）第１実施例
図３は第１実施例の０系及び１系の処理フローである。
当初、０系がＡＣＴ系、１系がＳＢＹ系として運用している（ステップ１０１，２０１）。
かかる状態において、０系のＭＰＵ部１１ａ_０に対してＡＣＴ／ＳＢＹチェンジを要する割込みが発生する（ステップ１０２）。
ＭＰＵ部１１ａ_０はこの割込みをトリガとして緊急制御動作プログラムＰ２を起動する。そして、このプログラムにより、両系におけるＭＰＵ内の全レジスタ情報、その他各機能ブロックの制御情報など障害情報ログを収集し、所定の記憶エリアに格納する（ステップ１０３）。格納先はシステムに依存し、メモリ部１１ｂ_０であったり、ＩＯ制御部１２_０配下のデバイス（ハードディスク、光ディスク）であったりする。又、他系の障害情報ログはプロセッサ間通信により収集する。
【００１６】
０系のＭＰＵ部１１ａ_０は各種障害情報ログが正常に退避出来たなら、ログ収集完了フラグＬＣＦを”１”に設定する（ステップ１０４）。しかし、各種障害情報ログが正常に退避できなければ、ログ収集完了フラグＬＣＦを未設定（”０”）にしたままとする。
しかる後、０系、１系の両系にハードウェアリセットを行う（ステップ１０５，２０２）。ハードウェアリセット後、ＡＣＴ系のシステム制御部１１ｃ_０はＳＢＹ系のシステム制御部１１ｃ_１にシステム交絡線ＳＣＬを介してＡＣＴ／ＳＢＹの切り替えを指示し、ＡＣＴ／ＳＢＹチェンジを行う。これにより、０系（旧ＡＣＴ）が新ＳＢＹとして立上り、１系（旧ＳＢＹ）が新ＡＣＴとして立ち上がる（ステップ１０６、２０３）。
ついで、０系（旧ＡＣＴ）のＭＰＵ部１１ａ_０はログ収集完了フラグＬＣＦの”１”，”０”をチェックし（ステップ１０７）、フラグが未設定（”０”）なら両系のログ収集動作を実行する（ステップ１０８）。
【００１７】
０系（旧ＡＣＴ）のＭＰＵ部１１ａ_０は、ログ収集完了後、あるいは、ステップ１０７においてログ収集完了フラグＬＣＦが設定（”１”）されていれば、１系（旧ＳＢＹ）のそれまでの待機運転処理を継続する（待機運転状態、ステップ１０９）。
又、１系（新ＡＣＴ）のＭＰＵ部１１ａ_１は０系の待機運転と同期して、それまでの０系（旧ＡＣＴ）の運用運転処理を継続してサービスを提供する（運用運転状態、ステップ２０４）。
以上のように、第１実施例によれば、系切替後のログ収集は全て新ＳＢＹ系（０系）で行うので、新ＡＣＴ系によるサービス処理に影響を与えないで障害情報ログの収集ができ、これにより障害箇所を的確に認識して対処できる。
【００１８】
（ｂ）第２実施例
図４は第２実施例の０系及び１系の処理フローであり、第１実施例と同一ステップには同一番号を付している。第２実施例において、０系（旧ＡＣＴ）が新ＳＢＹとして起動し（ステップ１０１〜１０６）、１系（旧ＳＢＹ）が新ＡＣＴとして起動するまで（ステップ２０１〜２０３）、第１実施例と同様の処理が行われる。
ステップ１０６の処理実行後、０系（旧ＡＣＴ）のＭＰＵ部１１ａ_０はログ収集完了フラグＬＣＦの”１”，”０”をチェックし（ステップ１２１）、フラグが未設定（”０”）なら、プロセッサ間通信により、その旨を１系（新ＡＣＴ）のＭＰＵ部１１ａ_１に通知する（ステップ１２２）。
【００１９】
１系（新ＡＣＴ）のＭＰＵ部１１ａ_１は、０系（旧ＡＣＴ）よりフラグ未設定通知を受信したか、すなわち、ログ情報収集指示を受信したかチェックし（ステップ２２１）、受信すれば、両系のログ収集動作を実行する（ステップ２２２）。ログ収集完了後、あるいは、ステップ２２１においてログ情報収集指示を受信しなければ、１系（新ＡＣＴ）のＭＰＵ部１１ａ_１は、それまでの０系（旧ＡＣＴ）の運用運転処理を継続してサービスを提供する（運用運転状態、ステップ２２３）。
又、０系（新ＳＢＹ）のＭＰＵ部１１ａ_０は、１系（新ＡＣＴ）の運用運転と同期してそれまでの１系の待機運転処理を継続する（待機運転状態、ステップ１２３）。
以上第２実施例によれば、系切り替え後のフラグチェックを新ＳＢＹ系（０系）で行うので、ハードウェアリセット前にログ収集が完了していれば新ＡＣＴ系のサービス処理に全く影響を与えないようにできる。
また、ハードウェアリセット前にログ収集が完了していなければ、ハードウェアリセット後に新ＡＣＴ系（１系）で障害情報ログを収集するためより確実にログ収集が可能になる。
【００２０】
（ｃ）第３実施例
図５は第３実施例の０系及び１系の処理フローであり、第１実施例と同一ステップには同一番号を付している。第３実施例において、０系（旧ＡＣＴ）が新ＳＢＹとして起動し（ステップ１０１〜１０６）、１系（旧ＳＢＹ）が新ＡＣＴとして起動するまで（ステップ２０１〜２０３）、第１実施例と同様の処理が行われる。
ステップ２０３の処理実行後、１系（新ＡＣＴ）のＭＰＵ部１１ａ_１はログ収集完了フラグＬＣＦの”１”，”０”をチェックし（ステップ２３１）、フラグが未設定（”０”）なら両系のログ収集動作を実行する（ステップ２３２）。
ＭＰＵ部１１ａ_１はログ収集完了後、あるいは、ステップ２３１においてログ収集完了フラグＬＣＦが設定（”１”）されていれば、それまでの０系（旧ＡＣＴ）の運用運転処理を継続してサービスを提供する（運用運転状態、ステップ２３３）。
又、０系（新ＳＢＹ）のＭＰＵ部１１ａ_０は１系の運用運転と同期してそれまでの１系（旧ＳＢＹ）の待機運転処理を継続する（待機運転状態、ステップ１２３）。
以上第３実施例によれば、系切り替え後のフラグチェック及び障害情報ログの収集を新ＡＣＴ系（１系）で行うので、より確実なフラグチェック及びログ収集が可能となる。
【００２１】
（ｄ）第４実施例
図６は第４実施例の０系及び１系の処理フローであり、第１実施例と同一ステップには同一番号を付している。第４実施例において、０系（旧ＡＣＴ）が新ＳＢＹとして起動し（ステップ１０１〜１０６）、１系（旧ＳＢＹ）が新ＡＣＴとして起動するまで（ステップ２０１〜２０３）、第１実施例と同様の処理が行われる。
ステップ２０３の処理実行後、１系（新ＡＣＴ）のＭＰＵ部１１ａ_１はログ収集完了フラグＬＣＦの”１”，”０”をチェックし（ステップ２４１）、フラグが未設定（”０”）なら、プロセッサ間通信により、その旨を０系（旧ＡＣＴ）のＭＰＵ部１１ａ_０に通知する（ステップ２４２）。
０系（新ＳＢＹ）のＭＰＵ部１１ａ_０は、１系（旧ＳＢＹ系）よりフラグ未設定通知を受信したか、すなわち、ログ情報収集指示を受信したかチェックし（ステップ１４１）、受信すれば、両系の障害情報ログの収集動作を実行する（ステップ１４２）。
【００２２】
０系（新ＳＢＹ）のＭＰＵ部１１ａ_０は、ログ収集完了後、あるいは、ステップ１４１においてログ情報収集指示を受信しなければ、以後、１系（旧ＳＢＹ）のそれまでの待機運転処理を継続する（待機運転状態、ステップ１４３）。
又、１系（新ＡＣＴ）のＭＰＵ部１１ａ_１は０系の待機運転と同期してそれまでの０系（旧ＡＣＴ）の運用運転処理を継続し、サービスを提供する（運用運転状態、ステップ２４３）。
以上、第４実施例によれば、系切り替え後のフラグチェックを新ＡＣＴ系（１系）で行うので、より確実なチェックが行える。
また、ハードリセット前に障害情報ログの収集が完了していなくても、ハードリセット後に新ＳＢＹ系（０系）でログ収集を行うのでサービスに影響を与えない。
【００２３】
（Ｃ）第２の発明における障害ログ収集処理
第２の発明では、システム運用中にＡＣＴ／ＳＢＹチェンジの障害発生すると、（１）各系が自系の障害情報ログを収集し、（２）ハードウェアリセット前に障害情報ログの収集が完了すれば自系のフラグをセット状態にし、収集が完了しなければフラグを未セット状態のままにする。そして、（３）両系をハードウェアリセット後、ＡＣＴ系あるいはＳＢＹ系は各系のフラグを参照して障害情報ログの収集が完了しているかチェックし、（４）完了していない系の障害情報ログを収集する。
上記（１），（２）の処理をするのは各系であるが、（３），（４）の処理をいずれの系で行うかにより７つの形態（第５〜第１１実施例）がある。
【００２４】
図７は０系をＡＣＴ系、１系をＳＢＹ系とした場合におけるハードウェアリセット後の上記（３），（４）の処理を行う系を示す図表である。
▲１▼第５実施例では、旧ＡＣＴ系（０系）及び新ＡＣＴ系（１系）がそれぞれ、自系のフラグチェック処理（３）及びフラグ未設定時における障害情報ログ収集処理（４）を実行する。
▲２▼第６実施例では、旧ＡＣＴ系（０系）が両系のフラグチェック処理（３）を行うと共にそれぞれの系のフラグ未設定時における障害情報ログ収集処理（４）を実行する。
▲３▼第７実施例では、旧ＡＣＴ系（０系）が両系のフラグチェック処理（３）を行い、新ＡＣＴ系（１系）が両系のフラグ未設定時における障害情報ログ収集処理（４）を実行する。
▲４▼第８実施例では、旧ＡＣＴ系（０系）が両系のフラグチェック処理（３）を行い、各系がそれぞれ自系のフラグ未設定時における障害情報ログ収集処理（４）を実行する。
▲５▼第９実施例では、新ＡＣＴ系（１系）が両系のフラグチェック処理（３）を行うと共に、それぞれの系のフラグ未設定時における障害情報ログ収集処理（４）を実行する。
▲６▼第１０実施例では、新ＡＣＴ系（１系）が両系のフラグチェック処理（３）を行い、旧ＡＣＴ系（０系）が両系のフラグ未設定時における障害情報ログ収集処理（４）を実行する。
▲７▼第１１実施例では、新ＡＣＴ系（１系）が両系のフラグチェック処理（３）を行い、各系がそれぞれ自系のフラグ未設定時における障害情報ログ収集処理（４）を実行する。
【００２５】
（ａ）第５実施例
図８は第５実施例の０系及び１系の処理フローであり、当初、０系がＡＣＴ系、１系がＳＢＹ系として運用している（ステップ３０１、４０１）。
かかる状態において、０系のＭＰＵ部１１ａ_０に対してＡＣＴ／ＳＢＹチェンジを要する割込みが発生すると、ＭＰＵ部１１ａ_０はこの割込みをシステム制御部１１ｃ_０、システム交絡線ＳＣＬを介して１系のＭＰＵ部１１ａ_１に通知する（ステップ３０２，４０２）。
両系のＭＰＵ部１１ａ_０，１１ａ_１はＡＣＴ／ＳＢＹチェンジ割込みをトリガとして緊急制御動作プログラムＰ２を起動する。そして、このプログラムにより、それぞれ、自系のＭＰＵ内の全レジスタ情報、その他各機能ブロックの制御情報など障害情報ログを収集し、所定の記憶エリアに格納する（ステップ３０３，４０３）。尚、格納先はシステムに依存し、メモリ部１１ｂ_０，１１ｂ_１であったり、ＩＯ制御部１２_０，１２_１配下のデバイス（ハードディスク、光ディスク）であったりする。
【００２６】
各系のＭＰＵ部１１ａ_０，１１ａ_１は各種障害情報ログを正常に退避出来たなら、ログ収集完了フラグＬＣＦ_０，ＬＣＦ_１を”１”に設定する（ステップ３０４，４０４）。しかし、各種情報ログが正常に退避できなければ、ログ収集完了フラグＬＣＦ_０，ＬＣＦ_１を未設定（”０”）にしたままとする。
しかる後、０系、１系の両系にハードウェアリセットを行う（ステップ３０５，４０５）。ハードウェアリセット後、０系のシステム制御部１１ｃ_０は１系のシステム制御部１１ｃ_１にシステム交絡線ＳＣＬを介してＡＣＴ／ＳＢＹを切り替えを指示し、ＡＣＴ／ＳＢＹチェンジを行う。これにより、０系（旧ＡＣＴ）が新ＳＢＹとして起動し、１系（旧ＳＢＹ）が新ＡＣＴとして立ち上がる（ステップ３０６、４０６）。
０系及び１系のＭＰＵ部１１ａ_０，１１ａ_１はそれぞれ自系のログ収集完了フラグＬＣＦ_０，ＬＣＦ_１の”１”，”０”をチェックし（ステップ３０７，４０７）、フラグが未設定（”０”）なら自系のログ収集動作を実行する（ステッ３０８，４０８）。
【００２７】
０系（旧ＡＣＴ）のＭＰＵ部１１ａ_０は、ログ収集完了後、あるいは、ステップ３０７においてログ収集完了フラグＬＣＦ_０が設定（”１”）されていれば、１系（旧ＳＢＹ）のそれまでの待機運転処理を継続する（待機運転状態、ステップ３０９）。
又、１系（新ＡＣＴ）のＭＰＵ部１１ａ_１は、ログ収集完了後、あるいは、ステップ４０７においてログ収集完了フラグＬＣＦ_１が設定（”１”）されていれば、０系の処理と同期して、それまでの０系（旧ＡＣＴ）の運用運転処理を継続する（運用運転状態、ステップ４０９）。
以上、第５実施例によれば、各系は自系の障害情報ログの収集を行うために、ハードウェアリセット前に障害情報を確実に収集可能となる。また、他系の障害情報ログの収集を行わないのでより高速に処理の完了が可能となる。
【００２８】
（ｂ）第６実施例
図９は第６実施例の０系及び１系の処理フローであり、第５実施例と同一ステップには同一番号を付している。第６実施例において、０系（旧ＡＣＴ）が新ＳＢＹとして起動し（ステップ３０１〜３０６）、１系（旧ＳＢＹ）が新ＡＣＴとして起動するまで（ステップ４０１〜４０６）、第５実施例と同様の処理が行われる。
ステップ３０６の処理実行後、０系（旧ＡＣＴ）のＭＰＵ部１１ａ_０は両系のログ収集完了フラグＬＣＦ_０，ＬＣＦ_１の”１”，”０”をそれぞれチェックし（ステップ３１１）、▲１▼自系（０系）のみのフラグＬＣＦ_０が未設定（”０”）であれば、自系のログ収集動作を実行し（ステップ３１２）、▲２▼両系（０系，１系）のフラグＬＣＦ_０，ＬＣＦ_１が共に未設定（”０”）であれば、両系のログ収集動作を実行し（ステップ３１３）、▲３▼他系（１系）のみのフラグＬＣＦ_１が未設定（”０”）であれば、他系のログ収集動作を実行する（ステップ３１４）。
【００２９】
０系（旧ＡＣＴ）のＭＰＵ部１１ａ_０は、以上によりログ収集処理を完了すれば、あるいは、ステップ３１１においてログ収集完了フラグＬＣＦ_０，ＬＣＦ_１が共に設定（”１”）されていれば、１系（旧ＳＢＹ）のそれまでの待機運転処理を継続する（待機運転状態、ステップ３１５）。
又、１系（新ＡＣＴ）のＭＰＵ部１１ａ_１は０系の待機運転と同期して、それまでの０系（旧ＡＣＴ）の運用運転処理を継続してサービスを提供する（運用運転状態、ステップ４１１）。
以上第６実施例によれば、各系は自系の障害情報ログの収集を行うために、ハードウェアリセット前に障害情報を確実に収集可能となる。
また、ハードリセット前に両系のログ収集が完了していなくてもハードリセット後に新ＳＢＹ系（０系）がログ収集を行うので新ＡＣＴ系のサービス処理に影響を与えない。
【００３０】
（ｃ）第７実施例
図１０は第７実施例の０系及び１系の処理フローであり、第５実施例と同一ステップには同一番号を付している。第７実施例において、０系（旧ＡＣＴ）が新ＳＢＹとして起動し（ステップ３０１〜３０６）、１系（旧ＳＢＹ）が新ＡＣＴとして起動するまで（ステップ４０１〜４０６）、第５実施例と同様の処理が行われる。
ステップ３０６の処理実行後、０系（旧ＡＣＴ）のＭＰＵ部１１ａ_０は両系のログ収集完了フラグＬＣＦ_０，ＬＣＦ_１の”１”，”０”をそれぞれチェックし（ステップ３２１）、いずれかのフラグが未設定（”０”）なら、プロセッサ間通信により、フラグが未設定の系を１系（新ＡＣＴ）のＭＰＵ部１１ａ_１に通知し、ログ情報収集を指示する（ステップ３２２）。
１系（新ＡＣＴ）のＭＰＵ部１１ａ_１は、０系（旧ＡＣＴ）よりフラグ未設定の系を示す通知を受信したかチェックし、受信すれば、いずれの系がフラグ未設定であるか判断する（ステップ４２１）。
【００３１】
１系（新ＡＣＴ）のＭＰＵ部１１ａ_１は、▲１▼他系（０系）のみのフラグＬＣＦ_０が未設定（”０”）であれば、他系のログ収集動作を実行し（ステップ４２２）、▲２▼両系（０系，１系）のフラグＬＣＦ_０，ＬＣＦ_１が共に未設定（”０”）であれば、両系のログ収集動作を実行し（ステップ４２３）、▲３▼自系（１系）のみのフラグＬＣＦ_１が未設定（”０”）であれば、自系のログ収集動作を実行する（ステップ４２４）。
以上のログ収集処理が完了すれば、あるいは、ステップ４２１においてログ情報収集指示を受信しなければ、１系（新ＡＣＴ）のＭＰＵ部１１ａ_１は、それまでの０系（旧ＡＣＴ）の運用運転処理を継続してサービスを提供する（運用運転状態、ステップ４２５）。
又、０系（新ＳＢＹ）のＭＰＵ部１１ａ_０は、１系（新ＡＣＴ）の運用運転処理と同期してそれまでの１系の待機運転処理を継続する（待機運転状態、ステップ３２３）。
【００３２】
以上第７実施例によれば、各系は自系の障害情報ログの収集を行うために、ハードウェアリセット前に障害情報を確実に収集可能となる。
また、ログ収集フラグのチェックを新ＳＢＹ系（０系）で行うため、ハードウェアリセット前に両系のログ収集が完了していれば新ＡＣＴ系のサービス処理に全く影響を与えない。また、たとえ、ハードウェアリセット前にログ収集が完了していなくても、新ＡＣＴ系でログ収集を行うためにより確実にログ収集が可能となる。
【００３３】
（ｄ）第８実施例
図１１は第８実施例の０系及び１系の処理フローであり、第５実施例と同一ステップには同一番号を付している。第８実施例において、０系（旧ＡＣＴ）が新ＳＢＹとして起動し（ステップ３０１〜３０６）、１系（旧ＳＢＹ）が新ＡＣＴとして起動するまで（ステップ４０１〜４０６）、第５実施例と同様の処理が行われる。
ステップ３０６の処理実行後、０系（旧ＡＣＴ）のＭＰＵ部１１ａ_０は他系（１系）のログ収集完了フラグＬＣＦ_１の”１”，”０”をチェックし（ステップ３３１）、フラグＬＣＦ_１が未設定（”０”）であれば、プロセッサ間通信により、その旨を１系（新ＡＣＴ）のＭＰＵ部１１ａ_１に通知する（ステップ３３２）。１系のＭＰＵ部１１ａ_１は、０系（旧ＡＣＴ）よりフラグ未設定通知を受信したか、すなわち、ログ情報収集指示を受信したかチェックし（ステップ４３１）、受信すれば、自系のログ収集動作を実行する（ステップ４３２）。ログ収集完了後、あるいは、ステップ４３１においてログ情報収集指示を受信しなければ、１系（新ＡＣＴ）のＭＰＵ部１１ａ_１は、それまでの０系（旧ＡＣＴ）の運用運転処理を継続し、サービスを提供する（運用運転状態、ステップ４３３）。
【００３４】
一方、０系（旧ＡＣＴ）のＭＰＵ部１１ａ_０は、ステップ３３２の通知処理が終了すれば、あるいは、ステップ３３１においてログ収集完了フラグＬＣＦ_１が設定（”１”）されていれば、自系のログ収集完了フラグＬＣＦ_０の”１”，”０”をチェックし（ステップ３３３）、フラグＬＣＦ_０が未設定（”０”）であれば、自系のログ情報収集動作を実行する（ステップ３３４）。
ついで、あるいは、ステップ３３３においてフラグＬＣＦ_０が設定（”１”）されていれば、０系（旧ＡＣＴ）のＭＰＵ部１１ａ_０は、１系（新ＡＣＴ）の運用運転処理と同期してそれまでの１系の待機運転処理を継続する（待機運転状態、ステップ３３５）。
以上第８実施例によれば、各系は自系の障害情報ログの収集を行うために、ハードウェアリセット前に障害情報を確実に収集可能となる。
また、ログ収集フラグのチェックを新ＳＢＹ系（０系）で行うため、ハードウェアリセット前に両系のログ収集が完了していれば新ＡＣＴ系のサービス処理に全く影響を与えない。又、たとえ、ハードウェアリセット前にログ収集が完了していなくても、各系は自系のみのログ収集を行えばよいため高速でログ収集を行う事が可能となる。
【００３５】
（ｅ）第９実施例
図１２は第９実施例の０系及び１系の処理フローであり、第５実施例と同一ステップには同一番号を付している。第９実施例において、０系（旧ＡＣＴ）が新ＳＢＹとして起動し（ステップ３０１〜３０６）、１系（旧ＳＢＹ）が新ＡＣＴとして起動するまで（ステップ４０１〜４０６）、第５実施例と同様の処理が行われる。
ステップ４０６の処理実行後、１系（新ＡＣＴ）のＭＰＵ部１１ａ_１は両系のログ収集完了フラグＬＣＦ_０，ＬＣＦ_１の”１”，”０”をそれぞれチェックし（ステップ４４１）、▲１▼他系（０系）のみのフラグＬＣＦ_０が未設定（”０”）であれば、他系のログ収集動作を実行し（ステップ４４２）、▲２▼両系（０系，１系）のフラグＬＣＦ_０，ＬＣＦ_１が共に未設定（”０”）であれば、両系のログ収集動作を実行し（ステップ４４３）、▲３▼自系（１系）のみのフラグＬＣＦ_１が未設定（”０”）であれば、自系のログ収集動作を実行する（ステップ４４４）。
【００３６】
１系（新ＡＣＴ）のＭＰＵ部１１ａ_１は、以上によりログ収集処理を完了すれば、あるいは、ステップ４４１においてログ収集完了フラグＬＣＦ_０，ＬＣＦ_１が共に設定（”１”）されていれば、以後、それまでの０系（旧ＡＣＴ）の運用運転処理を継続する（運用運転状態、ステップ４４５）。
又、０系（新ＳＢＹ）のＭＰＵ部１１ａ_０は１系の運用運転処理と同期して、それまでの１系（旧ＳＢＹ）の待機運転処理を継続する（待機運転状態、ステップ３４１）。
以上第９実施例によれば、各系は自系の障害情報ログの収集を行うために、ハードウェアリセット前に障害情報を確実に収集可能となる。また、ハードウェアリセット後のログ収集完了フラグのチェック及び未完了時のログ収集を新ＡＣＴ系（１系）で行うのでより確実にログ収集が可能となる。
【００３７】
（ｆ）第１０実施例
図１３は第１０実施例の０系及び１系の処理フローであり、第５実施例と同一ステップには同一番号を付している。第１０実施例において、０系（旧ＡＣＴ）が新ＳＢＹとして起動し（ステップ３０１〜３０６）、１系（旧ＳＢＹ）が新ＡＣＴとして起動するまで（ステップ４０１〜４０６）、第５実施例と同様の処理が行われる。
ステップ４０６の処理実行後、１系（旧ＳＢＹ）のＭＰＵ部１１ａ_１は両系のログ収集完了フラグＬＣＦ_０，ＬＣＦ_１の”１”，”０”をそれぞれチェックし（ステップ４５１）、いずれかのフラグが未設定（”０”）なら、プロセッサ間通信により、フラグが未設定の系を０系（新ＳＢＹ）のＭＰＵ部１１ａ_０に通知し、ログ情報収集を指示する（ステップ４５２）。
【００３８】
０系（新ＳＢＹ）のＭＰＵ部１１ａ_０は、１系（旧ＳＢＹ）よりフラグ未設定の系を示す通知を受信したかチェックし、受信すれば、いずれの系がフラグ未設定であるか判断する（ステップ３５１）。
０系（新ＳＢＹ）のＭＰＵ部１１ａ_０は、▲１▼自系（０系）のみのフラグＬＣＦ_０が未設定（”０”）であれば、自系のログ収集動作を実行し（ステップ３５２）、▲２▼両系（０系，１系）のフラグＬＣＦ_０，ＬＣＦ_１が共に未設定（”０”）であれば、両系のログ収集動作を実行し（ステップ３５３）、▲３▼他系（１系）のみのフラグＬＣＦ_１が未設定（”０”）であれば、他系のログ収集動作を実行する（ステップ３５４）。
以上のログ収集処理が完了すれば、あるいは、ステップ３５１においてログ情報収集指示を受信しなければ、０系（新ＳＢＹ）のＭＰＵ部１１ａ_０は、それまでの１系（旧ＳＢＹ）の待機運転処理を継続する（待機運転状態、ステップ３５５）。
【００３９】
又、１系（新ＡＣＴ）のＭＰＵ部１１ａ_１は、０系（新ＳＢＹ）の待機運転処理と同期してそれまでの０系の運用運転処理を継続する（運用運転状態、ステップ４５３）。
以上第１０実施例によれば、各系は自系の障害情報ログの収集を行うために、ハードウェアリセット前に障害情報を確実に収集可能となる。
また、ハードウェアリセット後のログ収集完了フラグのチェックを新ＡＣＴ系（１系）で行うのでより確実にチェックが行える。又、たとえ、ハードウェアリセット前にログ収集が完了していなくても、新ＳＢＹ（０系）でログ収集を行う為に新ＡＣＴ系のサービス処理に影響を与えない。
【００４０】
（ｇ）第１１実施例
図１４は第１１実施例の０系及び１系の処理フローであり、第５実施例と同一ステップには同一番号を付している。第１１実施例において、０系（旧ＡＣＴ）が新ＳＢＹとして起動し（ステップ３０１〜３０６）、１系（旧ＳＢＹ）が新ＡＣＴとして起動するまで（ステップ４０１〜４０６）、第５実施例と同様の処理が行われる。
ステップ４０６の処理実行後、１系（旧ＳＢＹ）のＭＰＵ部１１ａ_１は自系のログ収集完了フラグＬＣＦ_１の”１”，”０”をチェックし（ステップ４６１）、フラグＬＣＦ_１が未設定（”０”）であれば、自系のログ情報収集動作を実行する（ステップ４６２）。
【００４１】
ついで、あるいは、ステップ４６１において自系のフラグＬＣＦ_１が設定（”１”）されていれば、他系（０系）のログ収集完了フラグＬＣＦ_０の”１”，”０”をチェックし（ステップ４６３）、フラグＬＣＦ_０が未設定（”０”）であれば、プロセッサ間通信により、その旨を０系（新ＳＢＹ）のＭＰＵ部１１ａ_０に通知する（ステップ４６４）。０系のＭＰＵ部１１ａ_０は、１系（旧ＳＢＹ）よりフラグ未設定通知を受信したか、すなわち、ログ情報収集指示を受信したかチェックし（ステップ３６１）、受信すれば、自系のログ収集動作を実行する（ステップ３６２）。ログ収集完了後、あるいは、ステップ３６１においてログ情報収集指示を受信しなければ、０系（新ＳＢＹ）のＭＰＵ部１１ａ_０は、それまでの１系（旧ＳＢＹ）の待機運転処理を継続する（待機運転状態、ステップ３６３）。
【００４２】
一方、１系（新ＡＣＴ）のＭＰＵ部１１ａ_１は、ステップ４６４の通知処理が終了すれば、あるいは、ステップ４６３においてログ収集完了フラグＬＣＦ_０が設定（”１”）されていれば、０系（新ＳＢＹ）の待機運転処理と同期して０系のそれまでの運用運転処理を継続する（運用運転状態、ステップ４６５）。
以上第１１実施例１１によれば、各系は自系の障害情報ログの収集を行うために、ハードウェアリセット前にログ情報をより確実に収集可能となる。
また、ハードウェアリセット後のログ収集完了フラグのチェックを新ＡＣＴ系（１系）で行うのでより確実にチェックが行える。
以上では、０系を運用運転状態（ＡＣＴ運転中）、１系を待機運転状態（ＳＢＹ運転中）として運転中にＡＣＴ／ＳＢＹチェンジを要する割り込みが発生した場合について説明したが、１系がＡＣＴ運転中、０系がＳＢＹ運転中にＡＣＴ／ＳＢＹチェンジ割込が発生すれば、同様にＡＣＴ／ＳＢＹ切替を行う。
以上、本発明を実施例により説明したが、本発明は請求の範囲に記載した本発明の主旨に従い種々の変形が可能であり、本発明はこれらを排除するものではない。
【００４３】
【発明の効果】
以上本発明によれば、ＡＣＴ／ＳＢＹチェンジを必要とする障害が発生した場合、確実に障害ログを得ることができ、障害箇所を的確に抑えることが可能となる。
又、本発明によれば、ＡＣＴ／ＳＢＹ切替を必要とする障害発生時に障害情報ログの収集を行い、ハードウェアリセット前に障害情報ログの収集が完了すればフラグをセット状態にし、収集が完了してなければフラグを未セット状態のままにし、両系をハードウェアリセット後、フラグを参照して障害情報ログの収集が完了しているか判断し、完了していなければ障害情報ログを収集するようにしたから、障害情報ログの収集が完了していない場合であっても、ハードウェアリセット後に確実に障害情報ログの収集ができる。
【００４４】
又、本発明によれば、ＡＣＴ／ＳＢＹ切替を必要とする障害発生時に、各系は自系の障害情報ログの収集を行い、ハードウェアリセット前に障害情報ログの収集が完了すれば自系のログ情報収集フラグをセット状態にし、収集が完了しなければフラグを未セット状態のままにし、両系をハードウェアリセット後、各系のフラグを参照して障害情報ログの収集が完了しているかチェックし、完了していない系の障害情報ログを収集するようにしたから、両系あるいは片系で障害情報ログの収集が完了していない場合であっても、ハードウェアリセット後に系毎に確実に障害情報ログの収集ができる。
【００４５】
第１実施例の発明によれば、ログ収集を全て新ＳＢＹ系で行うため新ＡＣＴ系のサービス処理に影響を与えないようにできる。
第２実施例の発明によれば、系切り替え後のフラグチェックを新ＳＢＹ系（０系）で行うので、ハードウェアリセット前のログ収集が完了していれば新ＡＣＴ系のサービス処理に影響を与えない。また、ハードウェアリセット前のログ収集が完了していなければ、ハードウェアリセット後に新ＡＣＴ系（１系）で障害情報ログを収集するためより確実にログ収集が可能となる。
第３実施例の発明によれば、系切り替え後のフラグチェック処理及び障害情報ログの収集処理を新ＡＣＴ系（１系）で行うので、より確実なチェック及びログ収集が可能となる。
第４実施例の発明によれば、系切り替え後のフラグチェックを新ＡＣＴ系（１系）で行うので、より確実なチェックが行える。また、ハードウェアリセット前にログ収集が完了していなくても、ハードウェアリセット後に新ＳＢＹ系（０系）でログ収集を行うので新ＡＣＴ系のサービス処理に影響を与えない。
【００４６】
第５〜第１１実施例の発明によれば、各系が自系のログ収集を行ってフラグの設定を行うために、ハードウェアリセット前に収集するログ情報をより確実に収集できる。
第５実施例の発明によれば、他系に対するフラグチェック処理及び障害情報のログ収集を行わないのでより高速に処理を完了できる。
第６実施例の発明によれば、ハードウェアリセット前に両系のログ収集が完了していなくてもハードウェアリセット後に新ＳＢＹ系（０系）がログ収集を行うので新ＡＣＴ系のサービス処理に影響を与えない。
第７実施例の発明によれば、ログ収集フラグのチェックを新ＳＢＹ系（０系）で行うため、ハードウェアリセット前に両系のログ収集が完了していれば新ＡＣＴ系のサービス処理に影響を与えない。又、たとえ、ハードウェアリセット前にログ収集が完了していなくても、新ＡＣＴ系でログ収集を行うためにより確実にログ収集が可能となる。
【００４７】
第８実施例の発明によれば、ログ収集フラグのチェックを新ＳＢＹ系（０系）で行うため、ハードウェアリセット前に両系のログ収集が完了していれば新ＡＣＴ系のサービス処理に影響を与えない。又、たとえ、ハードウェアリセット前にログ収集が完了していなくても、自系のみのログ収集を行えばよいため高速でログ収集を行う事が可能となる。
第９実施例の発明によれば、ハードウェアリセット後のフラグチェック処理及び未完了時のログ収集を全て新ＡＣＴ系で行うのでより確実にログ収集が可能となる。
第１０実施例の発明によれば、ハードウェアリセット後のログ収集完了フラグのチェックを新ＡＣＴ系で行うのでより確実にチェックが行える。又、たとえ、ハードウェアリセット前にログ収集が完了していなくても、新ＳＢＹ（０系）でログ収集を行う為に新ＡＣＴ系のサービス処理に影響を与えない。
第１１実施例の発明によれば、ハードウェアリセット後のログ収集完了フラグのチェックを新ＡＣＴ系で行うのでより確実にチェックが行える。
【図面の簡単な説明】
【図１】本発明の交換機の制御系装置の構成図である。
【図２】０系が両系（０系、１系）の障害情報ログを収集してログ収集完了フラグを設定する場合におけるハードウェアリセット後の処理形態説明図である。
【図３】第１実施例の処理フローである。
【図４】第２実施例の処理フローである。
【図５】第３実施例の処理フローである。
【図６】第４実施例の処理フローである。
【図７】各系が自系の障害情報ログを収集して自系のログ収集完了フラグを設定する場合におけるハードウェアリセット後の処理形態説明図である。
【図８】第５実施例の処理フローである。
【図９】第６実施例の処理フローである。
【図１０】第７実施例の処理フローである。
【図１１】第８実施例の処理フローである。
【図１２】第９実施例の処理フローである。
【図１３】第１０実施例の処理フローである。
【図１４】第１１実施例の処理フローである。
【図１５】ネットワーク構成図である。
【図１６】従来の交換機の制御系装置の構成図である。
【符号の説明】
１０_０，１０_１・・０系，１系の制御系装置
１１_０，１１_１・・０系及び１系のＰＲＯＣ制御部（処理制御部）１１ａ_０，１１ａ_１・・マイクロプロセッサユニット（ＭＰＵ部）
１１ｂ_０，１１ｂ_１・・メモリ部
１１ｃ_０，１１ｃ_１・・システム制御部
１１ｄ_０，１１ｄ_１・・内部バス
１１ｅ_０，１１ｅ_１・・インタフェース制御部
１１ｆ_０，１１ｆ_１・・ファームウェア部
１２_０，１２_１・・ＩＯ制御部
１３_０，１３_１・・ＬＡＮ制御部
１４_０，１４_１・・通話路系装置とのインタフェース制御部
１５_０，１５_１・・拡張バス
１６_０，１６_１・・バスインタフェース部
ＭＣＬ・・メモリ交絡線
ＳＣＬ・・システム交絡線
ＢＣＬ・・拡張バス交絡線
Ｐ１・・交換プログラム
Ｐ２・・緊急制御動作プログラム
ＬＣＦ・・ログ情報収集完了フラグ
[0001]
[Technical field of the invention]
The present invention relates to a failure information collection method for a duplexed apparatus, and more particularly to a failure information collection method for an exchange provided with an active (ACT) control system device and a standby (SBY) control system device.
[0002]
[Prior art]
Each unit constituting an exchange is duplexed to improve reliability, so that even if the active unit goes down due to a failure, the standby unit takes over and service continues. FIG. 15 shows a network in which two exchanges A and B are connected by a communication line LN, and telephones TA and TB are connected to the respective exchanges. Each of the exchanges A and B has a communication path unit SPU and a control unit CTU, both of which are duplexed.
If the exchange is, for example, an ATM exchange, the communication path unit SPU has a line IF unit connected to a transmission line, a line concentration/separation unit, a cell switch unit, and the like. The line IF unit converts frames of a predetermined format, for example DS1, DS3, or SONET frames, into the ATM cell format and sends them to the cell switch side, and converts the ATM cell format into DS1, DS3, or SONET frames and sends them out to the line. The line concentration/separation unit is connected to a number of line interfaces; it multiplexes cells and inputs them to the cell switch, and it demultiplexes the multiplexed cells from the cell switch side and sends them to the appropriate line interfaces. The cell switch unit switches the input multiplexed cells to predetermined routes.
[0003]
The control unit CTU controls the communication path unit and includes an MPU, a memory, a system control unit, and the like. FIG. 16 is a configuration diagram of the control unit CTU in the exchange; 10_0 and 10_1 are the control system devices of system 0 and system 1, respectively, and have the same configuration. Reference numeral 11 denotes a PROC control unit (processing control unit), which includes a microprocessor unit (MPU unit) 11a that executes and controls programs, a memory unit 11b that stores files (exchange program, emergency control operation program, etc.) and data, a system control unit 11c that performs the control needed to duplex the system (active/standby switching control, file update control, etc.), an internal bus 11d, and an interface control unit 11e that controls the interface between the internal bus and the expansion bus. The memory units 11b of system 0 and system 1 are connected by a memory cross-connect line MCL, and the system control units 11c of system 0 and system 1 are connected by a system cross-connect line SCL. Reference numeral 12 denotes an IO control unit that has a SCSI interface and controls a magneto-optical disk drive and a hard disk drive; 13 denotes a LAN control unit, such as an Ethernet controller, that interfaces with a maintenance console (operation panel); 14 denotes an interface control unit for the communication path unit SPU, such as a switch; 15 denotes an expansion bus; and 16 denotes a bus interface unit that handles the interface with the other system and performs inter-system cross-connect control via an expansion bus cross-connect line BCL.
[0004]
In the duplexed exchange described above, the usual mode of operation is that one system executes processing (ACT state, hereinafter the ACT system) while the other system stands by (SBY state, hereinafter the SBY system). If a serious failure occurs during such operation, an ACT/SBY change (old ACT system → new SBY system, old SBY system → new ACT system) or a similar operation is performed as an emergency action, and operation continues without affecting service. When such a failure occurs, it is common practice to collect failure information (the contents of various registers and memory at the time of the failure) as log information and output it to a file or the like.
Assuming that system 0 is the ACT system and system 1 is the SBY system during operation, the MPU unit 11a of system 0 reads the exchange program and the like written in the memory unit 11b and, according to their contents, controls each functional block and writes computation results to the memory unit. For example, the MPU unit 11a of the ACT system sends various control data to the active and standby communication path units SPU according to the exchange program, performs predetermined processing based on data received from the ACT communication path unit, and stores the processing results in the memory unit 11b. The data written to the ACT memory unit 11b is always reflected in the SBY memory unit 11b via the memory cross-connect line MCL, so the memory contents are kept identical in the ACT system and the SBY system.
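The memory mirroring between the two systems can be pictured with a minimal sketch. The class name `MirroredMemory` and the use of dictionaries as memory are assumptions made for illustration only; the actual mirroring is performed in hardware over the line MCL.

```python
class MirroredMemory:
    """Toy model of the two memory units 11b kept identical via the MCL.

    Hypothetical illustration: dictionaries stand in for the memory units.
    """

    def __init__(self):
        self.act = {}  # memory unit of the ACT system
        self.sby = {}  # memory unit of the SBY system

    def write(self, address, value):
        # Every write on the ACT side is also reflected on the SBY side,
        # so the standby system can take over with the same contents.
        self.act[address] = value
        self.sby[address] = value


memory = MirroredMemory()
memory.write(0x10, "call-state")
```

Because both sides always hold the same contents, the standby system can continue the processing of the active system after a switch.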
[0005]
In this state, if a failure requiring an ACT/SBY change (system switching) occurs, for example an overflow of the failure detection timer TF (Fault Timer), an ACT/SBY change interrupt is generated. The failure detection timer TF is, for example, a watchdog timer: it counts up automatically, and its count value is cleared periodically. If an abnormal program loop, a runaway, or a hardware failure occurs, the count value can no longer be cleared and the timer overflows, which triggers ACT/SBY system switching.
On the system switching interrupt, the MPU unit 11a executes processing according to the emergency control operation program. That is, following the emergency control operation program, the MPU unit 11a collects all register information in the MPU, the control information of the other functional blocks, and the like as a failure information log and stores it in a predetermined storage unit. The storage destination depends on the system; it may be the memory unit 11b or a device such as a hard disk or an optical disk under the IO control unit 12.
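The behavior of the failure detection timer TF can be sketched as follows. This is a simplified software model under stated assumptions: the names `FaultTimer` and `run`, the overflow threshold `limit`, and the `clear_every` parameter are all hypothetical, and a real watchdog is a hardware counter.

```python
class FaultTimer:
    """Toy watchdog: counts up on every tick and overflows unless cleared."""

    def __init__(self, limit):
        self.limit = limit  # hypothetical overflow threshold
        self.count = 0

    def clear(self):
        # Healthy software clears the count value periodically.
        self.count = 0

    def tick(self):
        # Returns True on overflow, which would trigger the ACT/SBY change.
        self.count += 1
        return self.count > self.limit


def run(timer, ticks, clear_every=None):
    """Advance the timer; return the tick at which it overflowed, or None."""
    for t in range(1, ticks + 1):
        if clear_every and t % clear_every == 0:
            timer.clear()
        if timer.tick():
            return t
    return None


healthy = run(FaultTimer(limit=3), ticks=10, clear_every=2)  # never overflows
hung = run(FaultTimer(limit=3), ticks=10)  # software stopped clearing
```

In the healthy case the count never exceeds the limit; once clearing stops, as in an abnormal loop or runaway, the overflow follows after a bounded number of ticks.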
[0006]
Next, a hardware reset is applied to system 0 and system 1. Thereafter, the system control unit 11c of the ACT system instructs the system control unit 11c of the SBY system, via the system cross-connect line SCL, to switch ACT/SBY, and the ACT/SBY change is performed. As a result, system 1 (old SBY) starts up as the new ACT, and from then on system 1 (new ACT) takes over the processing of system 0 (old ACT) and provides service.
Through the above steps, the ACT/SBY change operation is performed, the failure is analyzed using the failure information log, and action is taken according to the analysis.
[0007]
[Problems to be solved by the invention]
The above describes the case where the ACT/SBY change is caused by an overflow of the failure detection timer TF (Fault Timer). Other failures that cause an ACT/SBY change include:
(1) a failure in which the MPU itself falls into a deadlock state,
(2) a failure in which a faulty functional block locks the bus and hangs,
and so on. An MPU deadlock is a failure in which the MPU is normal but receives no response to a command and remains in a wait state; a bus lock is a failure in which a faulty functional block fixes the level of a bus line at high or low. An ACT/SBY change is also performed for failures (1) and (2), but in these cases the "collection and storage of the failure information log" cannot be performed, and no failure information is left behind. As a result, the failure cannot be analyzed after the ACT/SBY change, which makes it difficult to deal with failed or failing devices.
[0008]
In view of the above, an object of the present invention is to provide a failure information collection method for a duplexed apparatus that can reliably collect failure information when a failure that causes an ACT/SBY change occurs, so that the failure location can be accurately identified and dealt with.
[0009]
[Means for Solving the Problems]
According to the present invention, the above object is achieved by a failure information collection method for a duplexed apparatus that (1) collects failure information when a failure requiring ACT/SBY switching occurs, (2) sets a flag if collection of the failure information is completed before the hardware reset and leaves the flag unset if it is not, (3) after a hardware reset of both systems, refers to the flag to determine whether collection of the failure information was completed, and (4) collects the failure information if it was not. In this way, even if collection of the failure information was not completed, the failure information can be reliably collected after the hardware reset.
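Steps (1) to (4) can be illustrated with a minimal sketch. It assumes, as the description of the embodiments states, that the hardware reset does not clear the MPU registers or the memory holding the flag. The class `DuplexedControl`, its attribute names, and the placeholder register values are hypothetical.

```python
class DuplexedControl:
    """Toy model of the flag-based collection across a hardware reset."""

    def __init__(self):
        # State assumed to survive the hardware reset.
        self.mpu_registers = {"sys0": "regs-0", "sys1": "regs-1"}
        self.failure_log = None
        self.lcf = 0  # collection completion flag: 0 = unset, 1 = set

    def collect(self):
        # Save the register contents as the failure information log.
        self.failure_log = dict(self.mpu_registers)

    def on_fault(self, collection_possible):
        # Steps (1) and (2): collect before the reset if possible, and set
        # the flag only when collection actually completed.
        if collection_possible:
            self.collect()
            self.lcf = 1

    def hardware_reset(self):
        # Clears the failure cause only; registers, log and flag are kept.
        pass

    def after_reset(self):
        # Steps (3) and (4): check the flag and collect if it is unset.
        # Returns True when a post-reset collection was needed.
        if self.lcf == 0:
            self.collect()
            return True
        return False


deadlocked = DuplexedControl()
deadlocked.on_fault(collection_possible=False)  # e.g. MPU deadlock
deadlocked.hardware_reset()
collected_late = deadlocked.after_reset()
```

The flag is the only information carried across the reset that tells the restarted system whether the log still has to be taken.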
[0010]
Further, according to the present invention, the above object is achieved by (1) having each system collect its own failure information when a failure requiring ACT/SBY switching occurs, (2) setting the system's own information collection flag if collection is completed before the hardware reset and leaving the flag unset if it is not, (3) after a hardware reset of both systems, referring to the flag of each system to check whether collection of the failure information was completed, and (4) collecting the failure information of any system for which it was not. In this way, even if collection was not completed in both systems or in one system, the failure information can be reliably collected for each system after the hardware reset.
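The per-system variant can be sketched in the same spirit. Here each system has its own flag, and only the systems whose flag is still unset are collected after the reset. The function names, the system labels `sys0` and `sys1`, and the dictionary representation are assumptions for illustration.

```python
def unfinished_systems(flags):
    """Return the systems whose collection flag is still unset (0)."""
    return [name for name, flag in flags.items() if flag == 0]


def collect_missing(flags, registers, logs):
    """After the reset, collect logs only for systems that did not finish."""
    for name in unfinished_systems(flags):
        logs[name] = registers[name]
    return logs


# Example: system 1 finished before the reset, system 0 did not.
flags = {"sys0": 0, "sys1": 1}
registers = {"sys0": "regs-0", "sys1": "regs-1"}
logs = collect_missing(flags, registers, {"sys1": "regs-1"})
```

Collecting only the missing side keeps the post-reset work proportional to what was actually lost.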
[0011]
[Embodiments of the invention]
(A) Configuration of control system device
FIG. 1 is a configuration diagram of the control system device of an exchange according to the present invention; 10_0 and 10_1 are the control system devices of system 0 and system 1, respectively, and have the same configuration. 11_0 and 11_1 are the PROC control units (processing control units) of system 0 and system 1, which include microprocessor units (MPU units) 11a_0 and 11a_1 that execute and control programs, memory units 11b_0 and 11b_1 that store files (exchange program, emergency control operation program, etc.) and data, system control units 11c_0 and 11c_1 that perform the control needed to duplex the system (active/standby switching control, file update control, etc.), internal buses 11d_0 and 11d_1, interface control units 11e_0 and 11e_1 that control the interface between the internal bus and the expansion bus, and firmware units 11f_0 and 11f_1 for performing system control. The memory units 11b_0 and 11b_1 of system 0 and system 1 are connected by a memory cross-connect line MCL, and the system control units 11c_0 and 11c_1 are connected by a system cross-connect line SCL. The memory units 11b_0 and 11b_1 store
(1) the exchange program P1,
(2) the emergency control operation program P2,
(3) various data DT,
(4) a log information collection completion flag LCF indicating whether collection of the failure information log is complete or incomplete,
and the like.
[0012]
12_0 and 12_1 are IO control units that have a SCSI interface and control a magneto-optical disk drive and a hard disk drive; 13_0 and 13_1 are LAN control units, such as Ethernet controllers, that interface with a maintenance console (operation panel); 14_0 and 14_1 are interface control units for the communication path unit SPU, such as a cell switch; 15_0 and 15_1 are expansion buses; and 16_0 and 16_1 are bus interface units that handle the interface with the other system and perform inter-system cross-connect control via the expansion bus cross-connect line BCL.
[0013]
(B) Fault information collection processing in the first invention
In the first invention, when a fault requiring an ACT/SBY change occurs during system operation: (1) the ACT system collects the failure information of both systems (ACT system and SBY system); (2) the ACT system sets the flag if collection of the failure information completes before the hardware reset, and leaves the flag unset if it does not; (3) after the hardware reset of both systems, the ACT system or the SBY system refers to the log information collection completion flag to determine whether collection of the failure information was completed; and (4) if it was not completed, the failure information is collected.
The hardware reset clears failure causes such as TF overflow, MPU deadlock, and bus lock. It does not reset the MPU registers or the memory units 11b₀, 11b₁ that hold the failure information to be collected.
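The flag-based procedure (1) to (4) can be sketched as follows. This is a minimal illustration in Python, not the implementation of the invention: the class `RetainedState`, the function names, and the dictionary contents are hypothetical, and the sketch assumes, as stated above, that the hardware reset leaves the stored failure information and the flag LCF intact.

```python
from dataclasses import dataclass, field


@dataclass
class RetainedState:
    """Hypothetical model of state that survives the hardware reset."""
    failure_info: dict = field(default_factory=dict)  # raw register/memory contents
    lcf: int = 0                                      # flag LCF: 1 = set, 0 = unset
    saved_log: dict = field(default_factory=dict)     # collected failure information log


def collect(state):
    """Copy the failure information of both systems into the saved log."""
    state.saved_log = dict(state.failure_info)


def handle_fault(state, finishes_before_reset):
    # (1) The ACT system tries to collect the failure information of both systems.
    # (2) The flag is set only if collection completed before the reset.
    if finishes_before_reset:
        collect(state)
        state.lcf = 1
    # Otherwise LCF stays 0 and nothing has been saved yet.


def hardware_reset(state):
    # Assumption taken from the description: the reset clears the failure
    # cause but not the registers or memory holding failure information,
    # so this model deliberately erases nothing.
    pass


def after_reset(state):
    """(3) Check the flag; (4) collect if it is unset.

    Returns True if a post-reset collection was needed.
    """
    if state.lcf == 0:
        collect(state)
        return True
    return False
```

In this model a collection that was cut short by the reset is simply repeated afterwards, which is only possible because the failure information itself is retained.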
[0014]
The ACT system performs processes (1) and (2). Depending on which system performs processes (3) and (4), there are four modes (the first to fourth embodiments).
FIG. 2 is a chart showing which system performs processes (3) and (4) after the hardware reset when system 0 is the ACT system and system 1 is the SBY system.
① In the first embodiment, the old ACT system (system 0) executes the flag check processing (3) and, when the flag is not set, the failure information log collection processing (4).
② In the second embodiment, the old ACT system (system 0) executes the flag check processing (3), and the new ACT system (system 1) executes the failure information log collection processing (4) when the flag is not set.
③ In the third embodiment, the new ACT system (system 1) executes the flag check processing (3) and, when the flag is not set, the failure information log collection processing (4).
④ In the fourth embodiment, the new ACT system (system 1) executes the flag check processing (3), and the old ACT system (system 0) executes the failure information log collection processing (4) when the flag is not set.
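The assignment of processes (3) and (4) in the four embodiments can be written down as a small lookup table. The sketch below is illustrative only; the constant and function names are hypothetical. It also derives which embodiments need an inter-processor notification, namely those in which the checking system and the collecting system differ.

```python
# Roles after the ACT/SBY change, assuming system 0 was the ACT system.
OLD_ACT = "system 0 (old ACT, new SBY)"
NEW_ACT = "system 1 (old SBY, new ACT)"

# embodiment -> (system doing flag check (3), system doing log collection (4))
FIRST_INVENTION_ROLES = {
    1: (OLD_ACT, OLD_ACT),
    2: (OLD_ACT, NEW_ACT),
    3: (NEW_ACT, NEW_ACT),
    4: (NEW_ACT, OLD_ACT),
}


def needs_interprocessor_notice(embodiment):
    """True when the checker must tell the other system to collect the log."""
    checker, collector = FIRST_INVENTION_ROLES[embodiment]
    return checker != collector
```

With this table, `needs_interprocessor_notice` is true for the second and fourth embodiments only.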
[0015]
(A) First embodiment
FIG. 3 shows the processing flow of system 0 and system 1 in the first embodiment.
Initially, system 0 operates as the ACT system and system 1 operates as the SBY system (steps 101 and 201).
In this state, an interrupt requiring an ACT/SBY change is raised to the system-0 MPU unit 11a₀ (step 102).
The MPU unit 11a₀ starts the emergency control operation program P2, using this interrupt as a trigger. This program collects failure information logs, such as all register information in the MPUs of both systems and the control information of each functional block, and stores them in a predetermined storage area (step 103). The storage destination depends on the system; it may be the memory unit 11b₀ or a device (hard disk, optical disk) subordinate to the IO control unit 12₀. The failure information log of the other system is collected by inter-processor communication.
[0016]
The system-0 MPU unit 11a₀ sets the log collection completion flag LCF to "1" if the various failure information logs were saved normally (step 104). If they could not be saved normally, the log collection completion flag LCF is left unset ("0").
Thereafter, a hardware reset is performed on both system 0 and system 1 (steps 105 and 202). After the hardware reset, the ACT-side system control unit 11c₀ instructs the SBY-side system control unit 11c₁, via the system cross-connect line SCL, to perform ACT/SBY switching, and the ACT/SBY change is carried out. As a result, system 0 (old ACT) starts up as the new SBY, and system 1 (old SBY) starts up as the new ACT (steps 106 and 203).
Next, the MPU unit 11a₀ of system 0 (old ACT) checks whether the log collection completion flag LCF is "1" or "0" (step 107). If the flag is not set ("0"), it executes the log collection operation for both systems (step 108).
[0017]
After the log collection is completed, or if the log collection completion flag LCF is found to be set ("1") in step 107, the MPU unit 11a₀ of system 0 (old ACT) takes over the standby operation processing that system 1 (old SBY) had been performing (standby operation state, step 109).
Meanwhile, the MPU unit 11a₁ of system 1 (new ACT), in synchronization with the standby operation of system 0, takes over the active operation processing that system 0 (old ACT) had been performing and continues to provide service (active operation state, step 204).
As described above, in the first embodiment all log collection after the system switchover is performed by the new SBY system (system 0), so the failure information log can be collected without affecting service processing by the new ACT system. This makes it possible to identify the failure location accurately and deal with it.
[0018]
(B) Second embodiment
FIG. 4 shows the processing flow of system 0 and system 1 in the second embodiment; steps identical to those of the first embodiment are given the same numbers. In the second embodiment, the processing is the same as in the first embodiment up to the point where system 0 (old ACT) starts up as the new SBY (steps 101 to 106) and system 1 (old SBY) starts up as the new ACT (steps 201 to 203).
After step 106, the MPU unit 11a₀ of system 0 (old ACT) checks whether the log collection completion flag LCF is "1" or "0" (step 121). If the flag is not set ("0"), it notifies the MPU unit 11a₁ of system 1 (new ACT) of that fact by inter-processor communication (step 122).
[0019]
The MPU unit 11a₁ of system 1 (new ACT) checks whether a flag-unset notification, that is, a log information collection instruction, has been received from system 0 (old ACT) (step 221); if so, it executes the log collection operation for both systems (step 222). After the log collection is completed, or if no log information collection instruction was received in step 221, the MPU unit 11a₁ of system 1 (new ACT) takes over the active operation processing that system 0 (old ACT) had been performing and provides service (active operation state, step 223).
Meanwhile, the MPU unit 11a₀ of system 0 (new SBY) takes over the standby operation processing previously performed by system 1, in synchronization with the active operation of system 1 (new ACT) (standby operation state, step 123).
In the second embodiment, the flag check after the system switchover is performed by the new SBY system (system 0). Therefore, if the log collection was completed before the hardware reset, the service processing of the new ACT system is not affected at all.
If the log collection was not completed before the hardware reset, the new ACT system (system 1) collects the failure information log after the hardware reset, so the log can be collected more reliably.
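The hand-off between the two systems in the second embodiment (steps 121 to 122 on system 0, steps 221 to 222 on system 1) can be sketched with a message queue standing in for the inter-processor communication. The queue, the message string `"LCF_UNSET"`, and the function names are assumptions made for illustration; the description does not specify the message format.

```python
import queue


def new_sby_check(lcf, link):
    """Steps 121-122 on system 0: check LCF and notify system 1 if it is unset."""
    if lcf == 0:
        link.put("LCF_UNSET")  # hypothetical notification message


def new_act_poll(link, collect_both):
    """Steps 221-222 on system 1: collect both systems' logs if instructed.

    Returns True if a collection was performed.
    """
    try:
        link.get_nowait()
    except queue.Empty:
        return False           # no instruction: go straight to step 223
    collect_both()
    return True
```

When the flag is already set, nothing is queued, so the new ACT system does no log-related work beyond polling for the notification.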
[0020]
(C) Third embodiment
FIG. 5 shows the processing flow of system 0 and system 1 in the third embodiment; steps identical to those of the first embodiment are given the same numbers. In the third embodiment, the processing is the same as in the first embodiment up to the point where system 0 (old ACT) starts up as the new SBY (steps 101 to 106) and system 1 (old SBY) starts up as the new ACT (steps 201 to 203).
After step 203, the MPU unit 11a₁ of system 1 (new ACT) checks whether the log collection completion flag LCF is "1" or "0" (step 231). If the flag is not set ("0"), it executes the log collection operation for both systems (step 232).
After the log collection is completed, or if the log collection completion flag LCF is found to be set ("1") in step 231, the MPU unit 11a₁ takes over the active operation processing that system 0 (old ACT) had been performing and provides service (active operation state, step 233).
Meanwhile, the MPU unit 11a₀ of system 0 (new SBY) takes over the standby operation processing previously performed by system 1 (old SBY), in synchronization with the active operation of system 1 (standby operation state, step 123).
As described above, in the third embodiment the new ACT system (system 1) performs both the flag check after the system switchover and the collection of the failure information log, so the flag check and the log collection can be performed more reliably.
[0021]
(D) Fourth embodiment
FIG. 6 shows the processing flow of system 0 and system 1 in the fourth embodiment; steps identical to those of the first embodiment are given the same numbers. In the fourth embodiment, the processing is the same as in the first embodiment up to the point where system 0 (old ACT) starts up as the new SBY (steps 101 to 106) and system 1 (old SBY) starts up as the new ACT (steps 201 to 203).
After step 203, the MPU unit 11a₁ of system 1 (new ACT) checks whether the log collection completion flag LCF is "1" or "0" (step 241). If the flag is not set ("0"), it notifies the MPU unit 11a₀ of system 0 (old ACT) of that fact by inter-processor communication (step 242).
The MPU unit 11a₀ of system 0 (new SBY) checks whether a flag-unset notification, that is, a log information collection instruction, has been received from system 1 (old SBY) (step 141); if so, it executes the log collection operation (step 142).
[0022]
After the log collection is completed, or if no log information collection instruction was received in step 141, the MPU unit 11a₀ of system 0 (new SBY) takes over the standby operation processing previously performed by system 1 (old SBY) (standby operation state, step 143).
Meanwhile, the MPU unit 11a₁ of system 1 (new ACT), in synchronization with the standby operation of system 0, takes over the active operation processing that system 0 (old ACT) had been performing and provides service (active operation state, step 243).
As described above, in the fourth embodiment the flag check after the system switchover is performed by the new ACT system (system 1), so the check can be performed more reliably.
Even if collection of the failure information log was not completed before the hardware reset, the service is not affected, because the log is collected by the new SBY system (system 0) after the hardware reset.
[0023]
(C) Failure log collection processing in the second invention
In the second invention, when a fault requiring an ACT/SBY change occurs during system operation: (1) each system collects its own failure information log; (2) each system sets its own flag if collection of the failure information log completes before the hardware reset, and leaves the flag unset if it does not; (3) after the hardware reset of both systems, the ACT system or the SBY system refers to the flag of each system to check whether collection of the failure information log was completed; and (4) the failure information log of any system whose collection was not completed is collected.
Processes (1) and (2) are performed by each system itself. Depending on which system performs processes (3) and (4), there are seven modes (the fifth to eleventh embodiments).
[0024]
FIG. 7 is a chart showing which system performs processes (3) and (4) after the hardware reset when system 0 is the ACT system and system 1 is the SBY system.
① In the fifth embodiment, the old ACT system (system 0) and the new ACT system (system 1) each execute the flag check processing (3) for their own system and, when the flag is not set, the failure information log collection processing (4).
② In the sixth embodiment, the old ACT system (system 0) executes the flag check processing (3) for both systems and also executes the failure information log collection processing (4) for each system whose flag is not set.
③ In the seventh embodiment, the old ACT system (system 0) executes the flag check processing (3) for both systems, and the new ACT system (system 1) executes the failure information log collection processing (4) for each system whose flag is not set.
④ In the eighth embodiment, the old ACT system (system 0) executes the flag check processing (3) for both systems, and each system executes the failure information log collection processing (4) when its own flag is not set.
⑤ In the ninth embodiment, the new ACT system (system 1) executes the flag check processing (3) for both systems and also executes the failure information log collection processing (4) for each system whose flag is not set.
⑥ In the tenth embodiment, the new ACT system (system 1) executes the flag check processing (3) for both systems, and the old ACT system (system 0) executes the failure information log collection processing (4) for each system whose flag is not set.
⑦ In the eleventh embodiment, the new ACT system (system 1) executes the flag check processing (3) for both systems, and each system executes the failure information log collection processing (4) when its own flag is not set.
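The seven modes differ only in who checks the flags and who collects the missing logs, so they can be summarised as data. The following Python sketch is illustrative: the labels `"sys0"`, `"sys1"`, `"each"`, and `"own"` and the function `collection_plan` are hypothetical names. Given the two flags after the reset, it returns which system collects which log.

```python
# embodiment -> (who checks the flags, who collects a missing log)
# "own" means each system collects the log of its own system.
SECOND_INVENTION_ROLES = {
    5: ("each", "own"),
    6: ("sys0", "sys0"),
    7: ("sys0", "sys1"),
    8: ("sys0", "own"),
    9: ("sys1", "sys1"),
    10: ("sys1", "sys0"),
    11: ("sys1", "own"),
}


def collection_plan(embodiment, lcf0, lcf1):
    """Return {collecting system: [systems whose logs it collects]}."""
    # Systems whose flag is still unset ("0") after the hardware reset.
    pending = [name for name, flag in (("sys0", lcf0), ("sys1", lcf1)) if flag == 0]
    _, collector = SECOND_INVENTION_ROLES[embodiment]
    plan = {}
    for target in pending:
        who = target if collector == "own" else collector
        plan.setdefault(who, []).append(target)
    return plan
```

For example, with both flags unset the sixth embodiment yields `{"sys0": ["sys0", "sys1"]}`, whereas the eighth yields one entry per system.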
[0025]
(A) Fifth embodiment
FIG. 8 shows the processing flow of system 0 and system 1 in the fifth embodiment. Initially, system 0 operates as the ACT system and system 1 operates as the SBY system (steps 301 and 401).
In this state, when an interrupt requiring an ACT/SBY change is raised to the system-0 MPU unit 11a₀, the MPU unit 11a₀ forwards this interrupt to the MPU unit 11a₁ of system 1 via the system control unit 11c₀ and the system cross-connect line SCL (steps 302 and 402).
The MPU units 11a₀, 11a₁ of both systems start the emergency control operation program P2, triggered by the ACT/SBY change interrupt. This program collects the failure information log of the own system, such as all register information in the MPU and the control information of each functional block, and stores it in a predetermined storage area (steps 303 and 403). The storage destination depends on the system; it may be the memory unit 11b₀, 11b₁ or a device (hard disk, optical disk) subordinate to the IO control unit 12₀, 12₁.
[0026]
The MPU unit 11a₀, 11a₁ of each system sets its log collection completion flag LCF₀, LCF₁ to "1" if the various failure information logs were saved normally (steps 304 and 404). If they could not be saved normally, the log collection completion flag LCF₀, LCF₁ is left unset ("0").
Thereafter, a hardware reset is performed on both system 0 and system 1 (steps 305 and 405). After the hardware reset, the system control unit 11c₀ of system 0 instructs the system control unit 11c₁ of system 1, via the system cross-connect line SCL, to perform ACT/SBY switching, and the ACT/SBY change is carried out. As a result, system 0 (old ACT) starts up as the new SBY, and system 1 (old SBY) starts up as the new ACT (steps 306 and 406).
The MPU units 11a₀, 11a₁ of system 0 and system 1 each check whether the log collection completion flag LCF₀, LCF₁ of their own system is "1" or "0" (steps 307 and 407). If the flag is not set ("0"), the log collection operation for the own system is executed (steps 308 and 408).
[0027]
After the log collection is completed, or if the log collection completion flag LCF₀ is found to be set ("1") in step 307, the MPU unit 11a₀ of system 0 (old ACT) takes over the standby operation processing previously performed by system 1 (old SBY) (standby operation state, step 309).
Meanwhile, after the log collection is completed, or if the log collection completion flag LCF₁ is found to be set ("1") in step 407, the MPU unit 11a₁ of system 1 (new ACT) takes over the active operation processing that system 0 (old ACT) had been performing, in synchronization with the processing of system 0 (active operation state, step 409).
As described above, in the fifth embodiment each system collects its own failure information log, so the failure information can be collected reliably before the hardware reset. Furthermore, because neither system collects the failure information log of the other system, the processing completes faster.
[0028]
(B) Sixth embodiment
FIG. 9 shows the processing flow of system 0 and system 1 in the sixth embodiment; steps identical to those of the fifth embodiment are given the same numbers. In the sixth embodiment, the processing is the same as in the fifth embodiment up to the point where system 0 (old ACT) starts up as the new SBY (steps 301 to 306) and system 1 (old SBY) starts up as the new ACT (steps 401 to 406).
After step 306, the MPU unit 11a₀ of system 0 (old ACT) checks whether the log collection completion flags LCF₀, LCF₁ of both systems are "1" or "0" (step 311). ① If only the flag LCF₀ of the own system (system 0) is not set ("0"), it executes the log collection operation for the own system (step 312); ② if the flags LCF₀, LCF₁ of both systems (system 0 and system 1) are not set ("0"), it executes the log collection operation for both systems (step 313); and ③ if only the flag LCF₁ of the other system (system 1) is not set ("0"), it executes the log collection operation for the other system (step 314).
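The three-way branch of step 311 amounts to a decision on the pair of flags. The short Python sketch below maps the flag values to the step number that system 0 executes next; the function name is hypothetical, and the return value 315 for the case where both flags are set assumes that system 0 then proceeds directly to its standby operation step.

```python
def sixth_embodiment_next_step(lcf0, lcf1):
    """Return the step system 0 executes after the flag check of step 311."""
    if lcf0 == 0 and lcf1 == 1:
        return 312  # only own (system 0) flag unset: collect own log
    if lcf0 == 0 and lcf1 == 0:
        return 313  # both flags unset: collect logs of both systems
    if lcf0 == 1 and lcf1 == 0:
        return 314  # only other (system 1) flag unset: collect other system's log
    return 315      # both set: no collection, continue with standby operation
```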
[0029]
When the above log collection processing is completed, or if both log collection completion flags LCF₀, LCF₁ are set ("1"), the MPU unit 11a₀ of system 0 (old ACT) takes over the standby operation processing previously performed by system 1 (old SBY) (standby operation state, step 315).
Meanwhile, the MPU unit 11a₁ of system 1 (new ACT), in synchronization with the standby operation of system 0, takes over the active operation processing that system 0 (old ACT) had been performing and provides service (active operation state, step 411).
As described above, in the sixth embodiment each system collects its own failure information log, so the failure information can be collected reliably before the hardware reset.
Even if the log collection of both systems was not completed before the hardware reset, the new SBY system (system 0) collects the logs after the hardware reset, so the service processing of the new ACT system is not affected.
[0030]
(C) Seventh embodiment
FIG. 10 shows the processing flow of system 0 and system 1 in the seventh embodiment; steps identical to those of the fifth embodiment are given the same numbers. In the seventh embodiment, the processing is the same as in the fifth embodiment up to the point where system 0 (old ACT) starts up as the new SBY (steps 301 to 306) and system 1 (old SBY) starts up as the new ACT (steps 401 to 406).
After step 306, the MPU unit 11a₀ of system 0 (old ACT) checks whether the log collection completion flags LCF₀, LCF₁ of both systems are "1" or "0" (step 321). If either flag is not set ("0"), it notifies the MPU unit 11a₁ of system 1 (new ACT), by inter-processor communication, of the system whose flag is not set and instructs it to collect the log information (step 322).
The MPU unit 11a₁ of system 1 (new ACT) checks whether a notification identifying a flag-unset system has been received from system 0 (old ACT) and, if so, determines which system's flag is not set (step 421).
[0031]
The MPU unit 11a₁ of system 1 (new ACT) then proceeds as follows: ① if only the flag LCF₀ of the other system (system 0) is not set ("0"), it executes the log collection operation for the other system (step 422); ② if the flags LCF₀, LCF₁ of both systems (system 0 and system 1) are not set ("0"), it executes the log collection operation for both systems (step 423); and ③ if only the flag LCF₁ of the own system (system 1) is not set ("0"), it executes the log collection operation for the own system (step 424).
When the above log collection processing is completed, or if no log information collection instruction was received in step 421, the MPU unit 11a₁ of system 1 (new ACT) takes over the active operation processing that system 0 (old ACT) had been performing and provides service (active operation state, step 425).
Meanwhile, the MPU unit 11a₀ of system 0 (new SBY) takes over the standby operation processing previously performed by system 1, in synchronization with the active operation processing of system 1 (new ACT) (standby operation state, step 323).
[0032]
As described above, in the seventh embodiment each system collects its own failure information log, so the failure information can be collected reliably before the hardware reset.
Further, because the log collection completion flags are checked by the new SBY system (system 0), the service processing of the new ACT system is not affected at all if the log collection of both systems was completed before the hardware reset. Even if the log collection was not completed before the hardware reset, the log can be collected more reliably because the new ACT system performs the collection.
[0033]
(D) Eighth embodiment
FIG. 11 shows the processing flow of system 0 and system 1 in the eighth embodiment; steps identical to those of the fifth embodiment are given the same numbers. In the eighth embodiment, the processing is the same as in the fifth embodiment up to the point where system 0 (old ACT) starts up as the new SBY (steps 301 to 306) and system 1 (old SBY) starts up as the new ACT (steps 401 to 406).
After step 306, the MPU unit 11a₀ of system 0 (old ACT) checks whether the log collection completion flag LCF₁ of the other system (system 1) is "1" or "0" (step 331). If the flag LCF₁ is not set ("0"), it notifies the MPU unit 11a₁ of system 1 (new ACT) of that fact by inter-processor communication (step 332). The system-1 MPU unit 11a₁ checks whether a flag-unset notification, that is, a log information collection instruction, has been received from system 0 (old ACT) (step 431); if so, it executes the log collection operation for its own system (step 432). After the log collection is completed, or if no log information collection instruction was received in step 431, the MPU unit 11a₁ of system 1 (new ACT) takes over the active operation processing that system 0 (old ACT) had been performing and provides service (active operation state, step 433).
[0034]
Meanwhile, after the notification processing of step 332 is completed, or if the log collection completion flag LCF₁ is found to be set ("1") in step 331, the MPU unit 11a₀ of system 0 (old ACT) checks whether the log collection completion flag LCF₀ of its own system is "1" or "0" (step 333). If the flag LCF₀ is not set ("0"), it executes the log information collection operation for its own system (step 334).
After that, or if the flag LCF₀ is found to be set ("1") in step 333, the MPU unit 11a₀ of system 0 (old ACT) takes over the standby operation processing previously performed by system 1, in synchronization with the active operation processing of system 1 (new ACT) (standby operation state, step 335).
In the eighth embodiment, each system collects its own failure information log, so the failure information can be collected reliably before the hardware reset.
Further, because the log collection completion flags are checked by the new SBY system (system 0), the service processing of the new ACT system is not affected at all if the log collection of both systems was completed before the hardware reset. Even if the log collection was not completed before the hardware reset, each system only needs to collect the log of its own system, so the log collection completes quickly.
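The order of the checks in the eighth embodiment (the other system's flag first, then the own flag) can be traced with a short Python sketch. The function name and the returned dictionary are hypothetical; the step numbers are those of FIG. 11 as described above, and the notification is modelled as a simple boolean rather than real inter-processor communication.

```python
def eighth_embodiment_trace(lcf0, lcf1):
    """Return the post-reset step numbers each system passes through."""
    sys0 = [331]                # system 0 checks the other system's flag LCF1
    notified = (lcf1 == 0)
    if notified:
        sys0.append(332)        # notify system 1 that LCF1 is unset
    sys0.append(333)            # system 0 checks its own flag LCF0
    if lcf0 == 0:
        sys0.append(334)        # system 0 collects its own log
    sys0.append(335)            # standby operation state

    sys1 = [431]                # system 1 checks for a collection instruction
    if notified:
        sys1.append(432)        # system 1 collects its own log
    sys1.append(433)            # active operation state
    return {"sys0": sys0, "sys1": sys1}
```

When both flags are set, system 1 passes through only steps 431 and 433, which matches the statement that the service processing of the new ACT system is not affected.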
[0035]
(E) Ninth embodiment
FIG. 12 shows the processing flow of system 0 and system 1 in the ninth embodiment; steps identical to those of the fifth embodiment are given the same numbers. In the ninth embodiment, the processing is the same as in the fifth embodiment up to the point where system 0 (old ACT) starts up as the new SBY (steps 301 to 306) and system 1 (old SBY) starts up as the new ACT (steps 401 to 406).
After step 406, the MPU unit 11a₁ of system 1 (new ACT) checks whether the log collection completion flags LCF₀, LCF₁ of both systems are "1" or "0" (step 441). ① If only the flag LCF₀ of the other system (system 0) is not set ("0"), it executes the log collection operation for the other system (step 442); ② if the flags LCF₀, LCF₁ of both systems (system 0 and system 1) are not set ("0"), it executes the log collection operation for both systems (step 443); and ③ if only the flag LCF₁ of the own system (system 1) is not set ("0"), it executes the log collection operation for the own system (step 444).
[0036]
When the above log collection processing is completed, or if both log collection completion flags LCF₀, LCF₁ are set ("1"), the MPU unit 11a₁ of system 1 (new ACT) takes over the active operation processing that system 0 (old ACT) had been performing (active operation state, step 445).
Meanwhile, the MPU unit 11a₀ of system 0 (new SBY) takes over the standby operation processing previously performed by system 1 (old SBY), in synchronization with the active operation processing of system 1 (standby operation state, step 341).
As described above, in the ninth embodiment each system collects its own failure information log, so the failure information can be collected reliably before the hardware reset. In addition, because both the check of the log collection completion flags after the hardware reset and the log collection when it was not completed are performed by the new ACT system (system 1), the log can be collected more reliably.
[0037]
(F) Tenth embodiment
FIG. 13 shows the processing flow of system 0 and system 1 in the tenth embodiment; steps identical to those of the fifth embodiment are given the same numbers. In the tenth embodiment, the processing is the same as in the fifth embodiment up to the point where system 0 (old ACT) starts up as the new SBY (steps 301 to 306) and system 1 (old SBY) starts up as the new ACT (steps 401 to 406).
After step 406, the MPU unit 11a₁ of system 1 (old SBY) checks whether the log collection completion flags LCF₀, LCF₁ of both systems are "1" or "0" (step 451). If either flag is not set ("0"), it notifies the MPU unit 11a₀ of system 0 (new SBY), by inter-processor communication, of the system whose flag is not set and instructs it to collect the log information (step 452).
[0038]
The MPU unit 11a₀ of system 0 (new SBY) checks whether a notification identifying a flag-unset system has been received from system 1 (old SBY) and, if so, determines which system's flag is not set (step 351).
The MPU unit 11a₀ of system 0 (new SBY) then proceeds as follows: ① if only the flag LCF₀ of the own system (system 0) is not set ("0"), it executes the log collection operation for the own system (step 352); ② if the flags LCF₀, LCF₁ of both systems (system 0 and system 1) are not set ("0"), it executes the log collection operation for both systems (step 353); and ③ if only the flag LCF₁ of the other system (system 1) is not set ("0"), it executes the log collection operation for the other system (step 354).
When the above log collection processing is completed, or if no log information collection instruction was received in step 351, the MPU unit 11a₀ of system 0 (new SBY) takes over the standby operation processing previously performed by system 1 (old SBY) (standby operation state, step 355).
[0039]
Meanwhile, the MPU unit 11a₁ of system 1 (new ACT) takes over the active operation processing previously performed by system 0, in synchronization with the standby operation processing of system 0 (new SBY) (active operation state, step 453).
In the tenth embodiment, each system collects its own failure information log, so the failure information can be collected reliably before the hardware reset.
Further, because the log collection completion flags are checked after the hardware reset by the new ACT system (system 1), the check can be performed more reliably. Even if the log collection was not completed before the hardware reset, the new SBY system (system 0) performs the log collection, so the service processing of the new ACT system is not affected.
[0040]
(G) Eleventh embodiment
FIG. 14 shows the processing flow of system 0 and system 1 in the eleventh embodiment; steps identical to those of the fifth embodiment are given the same numbers. In the eleventh embodiment, the processing is the same as in the fifth embodiment up to the point where system 0 (old ACT) starts up as the new SBY (steps 301 to 306) and system 1 (old SBY) starts up as the new ACT (steps 401 to 406).
After step 406, the MPU unit 11a₁ of system 1 (old SBY) checks whether the log collection completion flag LCF₁ of its own system is "1" or "0" (step 461). If the flag LCF₁ is not set ("0"), it executes the log information collection operation for its own system (step 462).
[0041]
After that, or if the flag LCF₁ of its own system is found to be set ("1") in step 461, the MPU unit 11a₁ checks whether the log collection completion flag LCF₀ of the other system (system 0) is "1" or "0" (step 463). If the flag LCF₀ is not set ("0"), it notifies the MPU unit 11a₀ of system 0 (new SBY) of that fact by inter-processor communication (step 464). The system-0 MPU unit 11a₀ checks whether a flag-unset notification, that is, a log information collection instruction, has been received from system 1 (old SBY) (step 361); if so, it executes the log collection operation for its own system (step 362). After the log collection is completed, or if no log information collection instruction was received in step 361, the MPU unit 11a₀ of system 0 (new SBY) takes over the standby operation processing previously performed by system 1 (old SBY) (standby operation state, step 363).
[0042]
Meanwhile, after the notification processing of step 464 is completed, or if the log collection completion flag LCF₀ is found to be set ("1") in step 463, the MPU unit 11a₁ of system 1 (new ACT) takes over the active operation processing previously performed by system 0, in synchronization with the standby operation processing of system 0 (new SBY) (active operation state, step 465).
As described above, in the eleventh embodiment each system collects its own failure information log, so the log information can be collected more reliably before the hardware reset.
Further, because the log collection completion flags are checked after the hardware reset by the new ACT system (system 1), the check can be performed more reliably.
The above describes the case where an interrupt requiring an ACT/SBY change occurs while system 0 is in the active operation state (ACT operation) and system 1 is in the standby operation state (SBY operation). If an ACT/SBY change interrupt occurs while system 0 is in SBY operation, ACT/SBY switching is performed in the same way.
The present invention has been described above with reference to embodiments. However, the present invention can be modified in various ways in accordance with the gist of the invention described in the claims, and the present invention does not exclude such modifications.
[0043]
【Effects of the Invention】
As described above, according to the present invention, when a failure requiring an ACT/SBY change occurs, the failure log can be obtained reliably and the failure location can be identified accurately.
According to the present invention, a failure information log is collected when a failure requiring ACT/SBY switching occurs. If collection of the failure information log completes before the hardware reset, the flag is set; if it does not, the flag is left unset. After both systems have been reset by hardware, the flag is referred to in order to determine whether collection of the failure information log was completed, and if it was not, the failure information log is collected. Thus, even if collection of the failure information log was not completed before the hardware reset, the failure information log can be collected reliably after it.
[0044]
Further, according to the present invention, when a failure requiring ACT/SBY switching occurs, each system collects its own failure information log. If collection of the failure information log completes before the hardware reset, the system sets its own log information collection flag; if it does not, the flag is left unset. After both systems have been reset by hardware, the flag of each system is referred to in order to check whether collection of the failure information log was completed, and the failure information log of any system whose collection was not completed is collected. Thus, even if collection of the failure information log was not completed in one or both systems, the failure information log of each system can be collected reliably after the hardware reset.
[0045]
According to the first embodiment, all log collection is performed by the new SBY system, so the service processing of the new ACT system is not affected.
According to the second embodiment, the flag check after system switching is performed by the new SBY system (system 0), so if log collection was completed before the hardware reset, the service processing of the new ACT system is not affected. If log collection was not completed before the hardware reset, the failure information log is collected by the new ACT system (system 1) after the hardware reset, so the log can be collected more reliably.
According to the third embodiment, both the flag check after system switching and the failure information log collection are performed by the new ACT system (system 1), so the check and the log collection are more reliable.
According to the fourth embodiment, the flag check after system switching is performed by the new ACT system (system 1), so the check is more reliable. Even if log collection was not completed before the hardware reset, it is performed by the new SBY system (system 0) after the hardware reset, so the service processing of the new ACT system is not affected.
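The trade-off among the first to fourth embodiments comes down to which system checks the flag and which system collects the log after the hardware reset. The table and helper below are a hypothetical summary of the statements above, not part of the disclosed method; they list the post-reset work that falls on the new ACT system in each case.

```python
# (flag checker, log collector) after hardware reset, per embodiment.
EMBODIMENTS = {
    1: ("SBY", "SBY"),
    2: ("SBY", "ACT"),
    3: ("ACT", "ACT"),
    4: ("ACT", "SBY"),
}


def new_act_work(embodiment, collected_before_reset):
    """List the post-reset tasks that load the new ACT system."""
    checker, collector = EMBODIMENTS[embodiment]
    work = []
    if checker == "ACT":
        work.append("flag check")
    # Collection after reset is needed only if it did not finish beforehand.
    if not collected_before_reset and collector == "ACT":
        work.append("log collection")
    return work


for number in EMBODIMENTS:
    print(number, new_act_work(number, collected_before_reset=False))
```

An empty list means the new ACT system's service processing is untouched; a longer list corresponds to the cases where the text favors reliability of the check or of the collection instead.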
[0046]
According to the fifth to eleventh embodiments, each system collects its own log and sets its own flag, so the log information can be collected more reliably before the hardware reset.
According to the fifth embodiment, neither system performs the flag check or failure log collection for the other system, so the processing can be completed faster.
According to the sixth embodiment, even if log collection in both systems was not completed before the hardware reset, the new SBY system (system 0) performs the log collection after the hardware reset, so the service processing of the new ACT system is not affected.
According to the seventh embodiment, the log collection flags are checked by the new SBY system (system 0), so if log collection in both systems was completed before the hardware reset, the service processing of the new ACT system is not affected. If log collection was not completed before the hardware reset, the new ACT system performs it, so the logs can be collected more reliably.
[0047]
According to the eighth embodiment, the log collection flags are checked by the new SBY system (system 0), so if log collection in both systems was completed before the hardware reset, the service processing of the new ACT system is not affected. Even if log collection was not completed before the hardware reset, each system only has to collect its own log, so the logs can be collected quickly.
According to the ninth embodiment, the flag check after the hardware reset and the log collection performed when collection is incomplete are both carried out by the new ACT system, so the logs can be collected more reliably.
According to the tenth embodiment, the log collection completion flags are checked after the hardware reset by the new ACT system, so the check is more reliable. Even if log collection was not completed before the hardware reset, the new SBY system (system 0) performs the log collection, so the service processing of the new ACT system is not affected.
According to the eleventh embodiment, the log collection completion flags are checked after the hardware reset by the new ACT system, so the check is more reliable.
[Brief description of the drawings]
FIG. 1 is a configuration diagram of a control system device of an exchange according to the present invention.
FIG. 2 is an explanatory diagram of the processing modes after hardware reset when system 0 collects the failure information logs of both systems (system 0 and system 1) and sets the log collection completion flag.
FIG. 3 is a processing flow of the first embodiment.
FIG. 4 is a processing flow of the second embodiment.
FIG. 5 is a processing flow of the third embodiment.
FIG. 6 is a processing flow of the fourth embodiment.
FIG. 7 is an explanatory diagram of the processing modes after hardware reset when each system collects its own failure information log and sets its own log collection completion flag.
FIG. 8 is a processing flow of the fifth embodiment.
FIG. 9 is a processing flow of the sixth embodiment.
FIG. 10 is a processing flow of the seventh embodiment.
FIG. 11 is a processing flow of the eighth embodiment.
FIG. 12 is a processing flow of the ninth embodiment.
FIG. 13 is a processing flow of the tenth embodiment.
FIG. 14 is a processing flow of the eleventh embodiment.
FIG. 15 is a network configuration diagram.
FIG. 16 is a configuration diagram of a control system device of a conventional exchange.
[Explanation of symbols]
10₀, 10₁ ... Control system devices of system 0 and system 1
11₀, 11₁ ... PROC control units (processing control units) of system 0 and system 1
11a₀, 11a₁ ... Microprocessor units (MPU units)
11b₀, 11b₁ ... Memory units
11c₀, 11c₁ ... System control units
11d₀, 11d₁ ... Internal buses
11e₀, 11e₁ ... Interface control units
11f₀, 11f₁ ... Firmware units
12₀, 12₁ ... IO control units
13₀, 13₁ ... LAN control units
14₀, 14₁ ... Interface control units for the communication channel devices
15₀, 15₁ ... Expansion buses
16₀, 16₁ ... Bus interface units
MCL ... Memory cross-connection line
SCL ... System cross-connection line
BCL ... Expansion bus cross-connection line
P1 ... Exchange program
P2 ... Emergency control operation program
LCF ... Log information collection completion flag

Claims

A failure information collection method for a duplexed device comprising an active (ACT) control system device and a standby (SBY) control system device, the method comprising:
collecting failure information when a failure requiring ACT/SBY switching occurs;
setting a flag if the collection of the failure information, executed before a hardware reset, is completed, and leaving the flag unset if the collection is not completed; and
after hardware-resetting both systems, determining by referring to the flag whether the collection of the failure information was completed, and collecting the failure information if it was not completed.

The failure information collection method according to claim 1, wherein the old ACT side determines, by referring to the flag, whether the collection of the failure information was completed and, if not, collects the failure information.

The failure information collection method according to claim 1, wherein the old ACT side determines, by referring to the flag, whether the collection of the failure information was completed and, if not, notifies the new ACT side, and the new ACT side collects the failure information.

The failure information collection method according to claim 1, wherein the new ACT side determines, by referring to the flag, whether the collection of the failure information was completed and, if not, collects the failure information.

The failure information collection method according to claim 1, wherein the new ACT side determines, by referring to the flag, whether the collection of the failure information was completed and, if not, notifies the old ACT side, and the old ACT side collects the failure information.

A failure information collection method for a duplexed device comprising an active (ACT) control system device and a standby (SBY) control system device, the method comprising:
collecting, by each system, its own failure information when a failure requiring ACT/SBY switching occurs;
setting the information collection flag of a system if that system's collection of the failure information is completed before a hardware reset, and leaving the flag unset if the collection is not completed; and
after hardware-resetting both systems, checking by referring to the flag of each system whether the collection of the failure information was completed, and collecting the failure information of any system for which it was not completed.

The failure information collection method according to claim 6, wherein each system checks its own flag and, if the collection of its failure information was not completed, collects its own failure information.

The failure information collection method according to claim 6, wherein the old ACT side checks the flags of both systems and, if the collection of the failure information was not completed, the old ACT side collects the failure information.

The failure information collection method according to claim 6, wherein the old ACT side checks the flags of both systems and, if the collection of the failure information was not completed, notifies the new ACT side, and the new ACT side collects the failure information.

The failure information collection method according to claim 6, wherein the old ACT side checks the flags of both systems; if the collection of the failure information on the old ACT side was not completed, the old ACT side collects that failure information; and if the collection of the failure information on the new ACT side was not completed, the old ACT side notifies the new ACT side, and the new ACT side collects that failure information.

The failure information collection method according to claim 6, wherein the new ACT side checks the flags of both systems and, if the collection of the failure information was not completed, the new ACT side collects the failure information.

The failure information collection method according to claim 6, wherein the new ACT side checks the flags of both systems and, if the collection of the failure information was not completed, notifies the old ACT side, and the old ACT side collects the failure information.

The failure information collection method according to claim 6, wherein the new ACT side checks the flags of both systems; if the collection of the failure information on the new ACT side was not completed, the new ACT side collects that failure information; and if the collection of the failure information on the old ACT side was not completed, the new ACT side notifies the old ACT side, and the old ACT side collects that failure information.