JP2004030513A

JP2004030513A - Bus monitoring system and its program

Info

Publication number: JP2004030513A
Application number: JP2002189380A
Authority: JP
Inventors: Noboru Kinoshita; 木下　登; Manabu Chikada; 近田　学; Takahiko Shirakawa; 白川　孝彦; Shinji Karasaki; 唐崎　真二
Original assignee: Hitachi Ltd; Hitachi Asahi Electronics Co Ltd
Current assignee: Hitachi Ltd; Hitachi Asahi Electronics Co Ltd
Priority date: 2002-06-28
Filing date: 2002-06-28
Publication date: 2004-01-29

Abstract

PROBLEM TO BE SOLVED: To enable fault analysis and fault handling, in bus monitoring and analysis that checks whether the connection buses to various devices are operating normally, on the basis of information acquired from an analysis board, without a dedicated monitoring CPU, even when an abnormality occurs that the OS itself cannot handle.

SOLUTION: A bus monitoring system comprises a bus monitor 113 that is connected to a system bus (PCI bus) 115 of an information processor and has a bus monitoring and analysis function for checking whether the connection buses (IDE bus 121 and SCSI bus 116) are operating normally; a function 131 that separates hardware resources and makes them independent for a plurality of OSs by means of virtual hardware 132, which allocates hardware interrupts and CPU processing time; and a function 131 that allows data to be written and read between the plurality of OSs. A monitoring program 134 that monitors and analyzes the buses is executed on an OS independent of the OS that normally operates.

COPYRIGHT: (C)2004,JPO

Description

[0001]
[Technical field of the invention]
The present invention relates to an input/output bus monitoring system for an information processing apparatus and a program therefor, and more particularly to an information processing apparatus in a multi-OS environment that incorporates a bus monitor controlled by a monitoring program running on an OS separate from the normal OS, and to the corresponding program.
[0002]
[Prior art]
Conventional bus monitor devices include, for example, the SCSI bus monitor device and bus monitor system disclosed in Japanese Patent Application Laid-Open No. 7-319804. This device has an event detecting means that detects events to be recorded from transitions of the SCSI signals captured by the bus receiving section, and a trace memory that sequentially records these states as trace data. When the amount of recorded data exceeds a set value, the device outputs an alarm and asserts a busy signal on the bus, and transfers the trace data to the outside in the meantime. That is, signals are taken into the SCSI bus monitor device via an adapter connected to the bus line to be monitored (SCSI bus), the trace information processed in the monitor device is sent to a host collection device (trace memory), and an operator analyzes the information by operating, for example, a keyboard included in the host collection device.
[0003]
In the SCSI bus monitoring system disclosed in Japanese Patent Application Laid-Open No. 10-222432, a SCSI bus monitoring unit monitors the SCSI bus as needed, and a bus information analysis unit converts signal information of the bus into a form that can be provided as data. The bus monitor daemon periodically issues a SCSI command to the bus monitor to acquire SCSI bus information and feed it back to the system. In other words, the function as a SCSI device is incorporated in the SCSI bus monitor so that it can be operated from the host computer, and the performance of the SCSI bus is improved based on the SCSI bus information obtained from the SCSI bus monitor.
[0004]
[Problems to be solved by the invention]
However, these conventional techniques have the following problems: a separate host collection device is required for monitoring, and when the host computer program hangs because of an abnormality that the OS itself cannot detect, the information collected by the SCSI bus monitor cannot be used.
Further, Japanese Patent Application Laid-Open No. 10-222432 has the problem that a program for controlling the SCSI bus monitor must be developed for each OS used.
[0005]
An object of the present invention is to solve these conventional problems and to provide a bus monitoring system, and a program therefor, that requires no monitoring CPU and that offers inexpensive and reliable fault analysis and recovery means capable of using the information collected by the bus monitor even when the normal OS has hung or the system has stopped.
[0006]
[Means for Solving the Problems]
In order to achieve the above object, the bus monitoring system of the present invention is applied to an information processing apparatus in a multi-OS environment that comprises: a bus monitor that is connected to the system bus (PCI bus) of the information processing apparatus and has a bus monitoring and analysis function for checking whether the connection buses to various devices (IDE (Integrated Device Electronics) bus, SCSI bus) are operating normally; a function of separating hardware resources and making them independent for a plurality of OSs by means of virtual hardware that distributes hardware interrupts and CPU processing time; and a function of enabling data to be written or read between the plurality of OSs. In this apparatus, a monitoring program for controlling the bus monitor is executed on an OS independent of the normal OS.
As a result, even when an abnormality that cannot be processed by the normal OS itself occurs, accurate failure analysis and countermeasures can be performed based on information from the monitoring program without a separate monitoring CPU. Further, by using the virtual hardware, it is possible to develop a monitoring program for controlling the bus monitor without depending on the OS.
[0007]
The monitor monitoring program of the present invention causes an information processing apparatus to execute each of the following: processing of a business OS basic unit; processing of a hardware driver group that performs hardware control; processing of a SCSI bus monitor management AP for controlling a SCSI bus monitor; processing of a LOG control AP for editing and outputting failure information logged in the hardware; processing of an IDE bus monitor management AP for controlling an IDE bus monitor; and processing of a remote control AP that allows the editing results of these APs to be obtained under control from a client PC via a LAN.
[0008]
[Embodiments of the invention]
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
FIG. 1 is a configuration diagram of a bus monitoring system according to an embodiment of the present invention in which a monitored bus is a SCSI bus.
In FIG. 1, software 130 includes two programs, a business program 133 and a monitoring program 134, and the hardware 101 is accessed by the business program 133 and the monitoring program 134 as virtual hardware 132.
The multiple OS control program 131 distributes interrupts from the hardware 101 and the processing time of the CPU 108 between the business program 133 and the monitoring program 134. By presenting the virtual hardware 132 to each OS as if it were real hardware, it keeps the hardware resources visible to the business program 133 separate from, and independent of, those visible to the monitoring program 134.
[0009]
Further, the multiple OS control program 131 has a function of enabling data to be written or read between the OSs. With these functions, a plurality of OSs can run independently of one another on a single piece of hardware. The technique used here for running a plurality of OSs independently on one piece of hardware is disclosed in Japanese Patent Application Laid-Open No. H11-149385. With it, the business program 133 and the monitoring program 134 can be executed independently, and even when the business program 133 stops because of a failure, the monitoring program 134 can continue to operate.
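The division of labor described in paragraphs [0008] and [0009] can be pictured with a small model. The following Python sketch is an illustration under stated assumptions, not the mechanism disclosed in JP H11-149385: the class name `MultiOSControl`, the device names, and the round-robin time slicing are all hypothetical.

```python
from collections import defaultdict


class MultiOSControl:
    """Toy model of the multiple OS control program 131 (hypothetical API)."""

    def __init__(self, owners):
        # owners maps a device name to the OS that sees it as virtual hardware.
        self.owners = dict(owners)
        self.halted = set()
        self.delivered = defaultdict(list)
        self.shared = {}  # area that both OSs may read and write

    def halt(self, os_name):
        """Mark an OS as stopped, e.g. after a fault it cannot handle."""
        self.halted.add(os_name)

    def interrupt(self, device):
        """Route a hardware interrupt to the OS that owns the device."""
        os_name = self.owners[device]
        if os_name in self.halted:
            return None  # the owning OS is down; the other OS is not disturbed
        self.delivered[os_name].append(device)
        return os_name

    def schedule(self, slices):
        """Hand out CPU time slices round-robin among the OSs still running."""
        running = [name for name in sorted(set(self.owners.values()))
                   if name not in self.halted]
        if not running:
            return []
        return [running[i % len(running)] for i in range(slices)]


if __name__ == "__main__":
    ctl = MultiOSControl({"scsi_adapter": "business",
                          "scsi_bus_monitor": "monitoring"})
    ctl.halt("business")                      # business OS has stopped
    print(ctl.interrupt("scsi_bus_monitor"))  # still reaches the monitoring OS
    print(ctl.schedule(3))                    # all slices go to the monitoring OS
```

The point of the model is only that interrupt routing and CPU time are decided per OS, so a halted business OS does not prevent the monitoring OS from being served.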
[0010]
In the hardware 101, a CPU 108, a main memory 109, a SCSIA 112, a SCSI bus monitor 113, a LANA 114, a SuperIO 111 for controlling various devices, and a system bus controller 120 for controlling each bus are connected to a PCI bus 115. The CRTA 110 is connected to the system bus controller 120 and controls display on the CRT 107. The system bus controller 120 also controls the IDE bus 121. The keyboard 105 and the mouse 106 are connected to the SuperIO 111.
HDDs (hard disks) 102 to 104 are connected to the SCSIA 112 via a SCSI bus 116. Although the number of HDDs is three in this embodiment, it is not limited to this number. The SCSI bus monitor 113 is also connected so that the state transition of the SCSI bus 116 can be monitored.
A plurality of client PCs are connected via the LAN 122. In this embodiment, an example is shown in which three client PCs 117 to 119 are connected.
[0011]
FIG. 2 is a diagram showing the configuration of the business program in FIG. 1.
As shown in FIG. 2, the business program 133 consists of a business OS basic unit 201; a hardware driver group 202 that performs hardware control; a business AP 203 made up of various business application programs; a system management AP 204, which is a system management application that logs and reports errors reported by the CPU 108 and the hardware driver group 202; and a remote control AP 205 that communicates with the client PCs 117 to 119 via the LAN 122.
[0012]
FIG. 3 is a diagram showing the configuration of the monitoring program in FIG. 1.
As shown in FIG. 3, the monitoring program 134 consists of a monitoring OS basic unit 301; a hardware driver group 302 that performs hardware control; a SCSI bus monitor management AP 303 for controlling the SCSI bus monitor 113; a LOG control AP 304 for editing and outputting the failure information logged in the hardware 101; an IDE bus monitor management AP 306 for controlling the IDE bus monitor 901 (see FIG. 9); and a remote control AP 305 that allows the editing results of these APs to be obtained under control from the client PCs 117 to 119 via the LAN 122.
The monitoring program 134 is operated either with the keyboard 105, the mouse 106, and the CRT 107 connected to the hardware 101, or from a client PC connected through the LANA 114.
In this embodiment, the client PC 117 is described as a management PC that operates the monitoring program 134.
[0013]
FIG. 4 is a diagram showing the configuration of the SCSI bus monitor 113 in FIG. 1.
The SCSI bus monitor 113 consists of a bus receiver 401 for receiving the SCSI bus signals; a bus signal latch 402 for synchronizing the received bus signals with the internal timing of the SCSI bus monitor 113; a trace memory 409 for storing bus state transitions; a PCI bus control unit 410 for controlling transfers to and from the PCI bus; a time stamp generation unit 403 for indicating the time at which each bus state transition occurred; a trace stop condition setting unit 404 for setting the condition under which writing to the trace memory 409 stops; an event detection unit 405 for detecting the trace stop condition and bus state transitions; a monitor sequencer unit 406 for controlling the access timing and address update timing of the trace memory 409; a memory address control unit 407 for managing the addresses of the trace memory 409; and a monitor internal control unit 408 for controlling the SCSI bus monitor as a whole.
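How the units in FIG. 4 cooperate can be sketched in software. The model below is a minimal sketch under assumptions that the source does not state: the trace memory is treated as a ring buffer that overwrites its oldest record when full, bus states are plain strings, and the class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TraceMonitor:
    """Toy model of the SCSI bus monitor 113 (names are hypothetical)."""
    depth: int = 4                 # capacity of trace memory 409, in records
    stop_condition: object = None  # None models the free-run state
    running: bool = False          # state of monitor sequencer unit 406
    clock: int = 0                 # time stamp generation unit 403
    address: int = 0               # memory address control unit 407
    memory: list = field(default_factory=list)

    def start(self):
        self.running = True

    def sample(self, bus_state):
        """Record one latched bus state; return True if tracing stopped."""
        self.clock += 1
        if not self.running:
            return False
        record = (self.clock, bus_state)
        if len(self.memory) < self.depth:
            self.memory.append(record)
        else:
            self.memory[self.address] = record  # overwrite the oldest record
        self.address = (self.address + 1) % self.depth
        if self.stop_condition is not None and bus_state == self.stop_condition:
            self.running = False  # event detection unit 405 saw the condition
            return True
        return False

    def dump(self):
        """Return the stored records, oldest first."""
        if len(self.memory) < self.depth:
            return list(self.memory)
        return self.memory[self.address:] + self.memory[:self.address]
```

In this model the record that satisfies the stop condition is itself stored before writing stops, so the trace ends with the triggering state; whether the actual hardware behaves this way is not stated in the source.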
[0014]
FIG. 5 is a flowchart showing a system startup procedure according to one embodiment of the present invention.
The startup process will be described with reference to FIG. 5.
When the power of the hardware 101 is turned on (or rebooted), the business program 133 is started (step 501). The business program 133 starts the initialization process, but the initialization process is interrupted by the multiple OS control program 131 and the monitoring program 134 is started (step 502). The monitoring program 134 clears the trace memory 409 (step 503), initializes the time stamp generator 403 (step 504), and initializes the trace stop condition setting unit 404 (step 505).
The trace stop condition set at this time is a free-run state (no stop condition).
[0015]
Next, the monitor sequencer unit 406 is activated to start a trace operation (step 506). After that, the remote control AP 305 is activated while waiting for input, and waits for an operation from the client PC 117 (management PC) (steps 507 and 508). In this state, the processing is transferred to the business program 133 by the multiple OS control program 131, and the startup processing of the business program 133 is restarted (step 509). Next, the business AP 203 is started, and various related information is initialized (step 510). Thereafter, the normal business by the business AP 203 is started (step 511).
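The ordering of steps 501 to 511 matters because tracing must already be running before the business OS finishes booting. The table below restates the steps as data; the assignment of each step to an actor ("business", "multi-os", "monitoring") is one reading of the text rather than something FIG. 5 is known to label, and the helper function names are hypothetical.

```python
STARTUP_STEPS = [
    (501, "business", "start business program 133"),
    (502, "multi-os", "suspend business initialization, start monitoring program 134"),
    (503, "monitoring", "clear trace memory 409"),
    (504, "monitoring", "initialize time stamp generation unit 403"),
    (505, "monitoring", "set trace stop condition to free-run"),
    (506, "monitoring", "start monitor sequencer unit 406 (tracing begins)"),
    (507, "monitoring", "start remote control AP 305"),
    (508, "monitoring", "wait for operations from client PC 117"),
    (509, "multi-os", "resume startup of business program 133"),
    (510, "business", "start business AP 203 and initialize its data"),
    (511, "business", "begin normal business processing"),
]


def tracing_starts_before_business(steps):
    """Check that tracing is running before the business OS resumes booting."""
    order = [number for number, _, _ in steps]
    return order.index(506) < order.index(509)


def steps_for(actor, steps=STARTUP_STEPS):
    """List the step numbers carried out by one actor, in order."""
    return [number for number, who, _ in steps if who == actor]
```

Because the monitor is armed in free-run mode before step 509, bus activity during the rest of the business OS boot is already being recorded.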
[0016]
FIG. 6 is a flowchart illustrating a processing procedure when a failure related to the SCSI bus 116 in FIGS. 1 and 2 is notified from the system management AP 204 to the client PC 117.
The system management AP 204 detects a failure related to the SCSI bus 116 by monitoring responses of the hardware driver group 202 and the business OS basic unit 201 (step 601). The system management AP 204 logs the failure information and notifies the client PC 117 of the failure via the remote control AP 205 (step 602). Generally, the system administrator performs a failure analysis from the information of the client PC 117 (step 603). If the cause of the failure can be identified at this stage, appropriate measures will be taken based on the cause. If the cause cannot be specified from the information notified from the system management AP 204, the system administrator accesses the SCSI bus monitor management AP 303 from the client PC 117 and specifies the operation mode of the SCSI bus monitor (step 604).
[0017]
The SCSI bus monitor management AP 303, having received the operation parameters specified in step 604, first instructs the monitor sequencer unit 406 to stop (step 605). Next, it sets the stop condition specified in step 604, such as detection of a CHECK CONDITION status (error response), in the trace stop condition setting unit 404 (step 606). It then starts the monitor sequencer unit 406 to begin tracing (step 607). The monitoring program 134 then waits for an event and becomes dormant (step 608). These operations of the monitoring program 134 are executed by the multiple OS control program 131 independently of the business program 133.
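Steps 605 to 607 follow a stop, reconfigure, restart pattern. The sketch below illustrates that ordering; the `Sequencer` class is hypothetical, and the guard that rejects a condition change while tracing is running is an assumption added for illustration, since the source only gives the order of the steps.

```python
class Sequencer:
    """Toy stand-in for monitor sequencer unit 406 plus setting unit 404."""

    def __init__(self):
        self.running = True         # free-run tracing was started at boot (step 506)
        self.stop_condition = None  # None means free-run, i.e. no stop condition

    def stop(self):
        self.running = False

    def start(self):
        self.running = True

    def set_stop_condition(self, condition):
        # Assumed guard: reconfigure only while tracing is stopped.
        if self.running:
            raise RuntimeError("stop the sequencer before changing the condition")
        self.stop_condition = condition


def rearm(sequencer, condition):
    """Apply steps 605 to 607 in order and return the re-armed sequencer."""
    sequencer.stop()                         # step 605
    sequencer.set_stop_condition(condition)  # step 606
    sequencer.start()                        # step 607
    return sequencer
```

A call such as `rearm(Sequencer(), "CHECK_CONDITION")` leaves the model tracing again, now with a stop condition in place of free-run.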
[0018]
FIG. 7 is a processing flowchart for the case where the SCSI bus monitor detects the stop condition that has been set.
The event detection unit 405 compares the output of the bus signal latch 402 with the output of the trace stop condition setting unit 404 and detects that the stop condition has been satisfied (step 701). As a result, the monitor sequencer unit 406 stops writing the SCSI bus signal state to the trace memory 409 (step 702). An interrupt from the SCSI bus monitor 113 notifies the SCSI bus monitor management AP 303 that the stop condition has been satisfied (step 703). The SCSI bus monitor management AP 303 reads the contents of the trace memory 409 into a predetermined area of the main memory 109 via the PCI bus 115 (step 704). After that, the SCSI bus monitor management AP 303 edits the trace data in the main memory 109 and transfers the result to the client PC 117 via the remote control AP 305 (step 705). The system administrator analyzes the trace transferred to the client PC 117 and takes measures against the failure (step 706).
When an abnormality occurs that the business OS basic unit 201 of the business program 133 cannot handle by itself, the business program 133 may fall into a non-functional state such as a halted state.
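The source does not say what the "editing" of trace data in step 705 consists of. As one plausible illustration, the sketch below turns raw records into readable report lines with timestamps relative to the first record; the record layout `(timestamp, bus_state)` and the function name `edit_trace` are assumptions.

```python
def edit_trace(records):
    """Turn raw (timestamp, bus_state) records into report lines.

    Timestamps are shown relative to the first record, together with the
    interval since the previous record, so gaps between transitions stand out.
    """
    if not records:
        return []
    first = records[0][0]
    previous = first
    lines = []
    for stamp, state in records:
        lines.append(f"+{stamp - first:>6} (delta {stamp - previous:>4})  {state}")
        previous = stamp
    return lines


if __name__ == "__main__":
    for line in edit_trace([(100, "SELECT"), (103, "COMMAND"), (250, "CHECK_CONDITION")]):
        print(line)
```

A long interval just before the final record, as in the sample data, is the kind of detail an administrator would look for when analyzing the transferred trace in step 706.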
[0019]
FIG. 8 is a processing flowchart for the case where the business program has stopped because of a failure.
The system administrator operates the client PC 117 to instruct the SCSI bus monitor management AP 303 to stop the monitor and transfer the trace data (step 801). The SCSI bus monitor management AP 303, having been instructed to stop the monitor, sets the monitor sequencer unit 406 to stop (step 802). The monitor sequencer unit 406 stops writing the SCSI bus signal state to the trace memory 409 (step 803). The SCSI bus monitor management AP 303 reads the contents of the trace memory 409 into a predetermined area of the main memory 109 via the PCI bus 115 (step 804). After that, the SCSI bus monitor management AP 303 edits the trace data in the main memory 109 and transfers the result to the client PC 117 via the remote control AP 305 (step 805). The system administrator analyzes the trace transferred to the client PC 117 and takes measures against the failure (step 806).
[0020]
When an abnormality that the business OS basic unit 201 of the business program 133 cannot handle by itself appears to have a cause other than the SCSI bus 116 sequence, the system administrator operates the client PC 117 to access the LOG control AP 304 and has it edit and output the various hardware logs recorded in the hardware 101. By analyzing the information obtained in this way, the cause of the failure is determined and appropriate measures are taken.
[0021]
In this embodiment, the SCSI bus monitor 113 is controlled only by the SCSI bus monitor management AP 303. However, by changing the settings of the virtual hardware 132 in the multiple OS control program 131, it can also be controlled from the system management AP 204. If the system management AP 204 is given a function that automatically controls the SCSI bus monitor 113 when a failure related to the SCSI bus 116 is detected, the status information of the SCSI bus 116 can be collected without intervention by the system administrator.
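The automatic variant described in paragraph [0021] amounts to a fault handler that captures the trace itself. The following sketch assumes a hypothetical `FakeMonitorControl` stand-in for access to the SCSI bus monitor 113 and a hypothetical handler name `on_scsi_fault`; neither appears in the source.

```python
class FakeMonitorControl:
    """Stand-in for access to the SCSI bus monitor 113 (hypothetical)."""

    def __init__(self, trace):
        self._trace = list(trace)
        self.running = True

    def stop(self):
        self.running = False

    def read_trace(self):
        if self.running:
            raise RuntimeError("stop tracing before reading the trace memory")
        return list(self._trace)


def on_scsi_fault(fault, monitor, log):
    """Log the fault, then capture the bus trace without operator action."""
    log.append(fault)   # logging, as the system management AP 204 already does
    monitor.stop()      # freeze the trace memory contents
    return {"fault": fault, "trace": monitor.read_trace()}


if __name__ == "__main__":
    log = []
    monitor = FakeMonitorControl([(1, "SELECT"), (2, "CHECK_CONDITION")])
    print(on_scsi_fault("scsi timeout", monitor, log))
```

Bundling the fault description with the captured trace means the administrator receives both at once rather than having to request the trace after the notification.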
[0022]
FIG. 9 is a diagram showing an additional configuration when tracing the IDE bus according to the present invention.
In the present embodiment, monitoring of the SCSI bus 116 has been described as an example. However, as shown in FIG. 9, the IDE bus can also be monitored by connecting an IDE bus monitor 901 and incorporating an IDE bus monitor management AP into the monitoring program 134. In FIG. 9, the IDE bus monitor 901 is controlled from the PCI bus 115 and monitors the IDE bus 121. The HDD 902 and the HDD 903 are HDDs with an IDE interface.
[0023]
[Effect of the invention]
As described above, according to the present invention, the bus monitor is controlled by the bus monitor monitoring program operating on an OS different from the normally operating business OS using the multi-OS environment. When an abnormality occurs that the business OS or the business program itself cannot process, accurate failure analysis and countermeasures can be performed based on information from the monitoring program without a separate monitoring CPU. Further, by using the virtual hardware method, the same monitoring program can be used without depending on the OS type of the business program.
[Brief description of the drawings]
FIG. 1 is a configuration diagram of a bus monitoring system according to an embodiment of the present invention.
FIG. 2 is a diagram showing the configuration of the business program in FIG. 1.
FIG. 3 is a diagram showing the configuration of the monitoring program in FIG. 1.
FIG. 4 is a diagram showing the configuration of the SCSI bus monitor in FIG. 1.
FIG. 5 is a flowchart illustrating a procedure for starting a system according to an embodiment of the present invention.
FIG. 6 is a processing flowchart when a failure is detected in the present invention.
FIG. 7 is a processing flowchart when a trace stop condition is detected in the present invention.
FIG. 8 is a processing flowchart for the case where the business program has stopped because of a failure in the present invention.
FIG. 9 is a diagram showing an additional configuration when tracing an IDE bus according to the present invention.
[Explanation of symbols]
101: hardware, 102 to 104: HDD (hard disk),
105: keyboard, 106: mouse, 107: CRT, 108: CPU,
109: Main memory, 110: CRTA, 111: Super IO,
112: SCSIA, 113: SCSI bus monitor, 114: LANA,
115: PCI bus, 116: SCSI bus,
117 to 119: client PC, 120: system bus controller,
121: IDE bus, 122: LAN, 130: software,
131: multiple OS control program, 132: virtual hardware,
133: business program, 134: monitoring program,
201: business OS basic unit, 202: hardware driver group,
203: business AP (AP: application program),
204: system management AP, 205: remote control AP,
301: monitoring OS basic unit, 302: hardware driver group,
303: SCSI bus monitor management AP, 304: LOG control AP,
305: remote control AP, 306: IDE bus monitor management AP,
401: bus receiver, 402: bus signal latch,
403: time stamp generation unit, 404: trace stop condition setting unit,
405: event detection unit, 406: monitor sequencer unit,
407: memory address control unit, 408: monitor internal control unit,
409: trace memory, 410: PCI bus control unit,
901: IDE bus monitor, 902, 903: HDD.

Claims

A bus monitoring system comprising an information processing apparatus in a multi-OS environment, the apparatus having a bus monitor for analyzing and monitoring connection buses to various devices, means for separating hardware resources among a plurality of OSs by virtual hardware that distributes interrupts from hardware and processing time of a CPU, and a multiple OS control program for writing or reading data between the plurality of OSs,
wherein the bus monitoring system includes a bus monitor monitoring program that operates on an OS different from the OS that is normally operating, controls the bus monitor for analysis, and performs bus monitoring and abnormality detection.

The bus monitoring system according to claim 1,
wherein the bus monitor monitoring program does not depend on the type of the OS that is normally operating.

A monitor monitoring program for causing an information processing apparatus to execute each of: processing of a business OS basic unit; processing of a hardware driver group that performs hardware control; processing of a SCSI bus monitor management AP for controlling a SCSI bus monitor; processing of a LOG control AP for editing and outputting failure information logged in the hardware; processing of an IDE bus monitor management AP for controlling an IDE bus monitor; and processing of a remote control AP that allows the editing results of these APs to be obtained under control from a client PC via a LAN.