JP2012003511A

JP2012003511A - Computer system switching

Info

Publication number: JP2012003511A
Application number: JP2010137842A
Authority: JP
Inventors: Taro Nakamura; 太郎中村
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2010-06-17
Filing date: 2010-06-17
Publication date: 2012-01-05

Abstract

PROBLEM TO BE SOLVED: To shorten the system switching time of a cold standby computer. SOLUTION: When a fault occurs in an active computer, its memory dump is stored in an external computer via a service processor instead of in the system disk. The system disk thereby becomes available to the standby computer before the active computer starts its memory dump processing. System switching time is shortened because the standby computer starts without waiting for the active computer to complete the memory dump processing.

Description

The present invention relates to a system switching method for redundant computers.

In computer systems, hardware redundancy is widely used as a technique for improving the availability of the entire system.

N+1 cold standby is one hardware redundancy technique for computer systems. In N+1 cold standby, a system consists of N active computers and one standby computer. During normal operation, the standby computer is not running, and the business applications are executed only by the N active computers. When a failure occurs in any one of the active computers, the standby computer starts up and, as a new active computer, takes over the processing of the business application that the failed computer had been executing.

In an N+1 cold standby configuration, OS system disks are needed only for the N active computers; there is no need to prepare a separate OS for the standby computer. The N OS images are stored in, for example, a disk array connected to an external SAN, and the system is configured so that all N+1 computers can physically access them. The correspondence between the system disks holding the OS images and the computers is controlled by switching a logical mapping. During normal operation, the system disks are mapped to the active computers and not to the standby computer. When a failure occurs in any one of the active computers, the system disk that corresponded to that active computer is remapped to the standby computer. The standby computer then boots the OS from the OS image stored on that system disk.

System switching in N+1 cold standby is executed according to the following procedure.
(1) A failure occurs on one active computer.
(2) The OS on the active computer flushes the in-memory file system cache to the disk.
(3) The OS on the active computer dumps the memory image to the dump area of the system disk. The dump data is needed later to analyze the cause of the failure.
(4) After the dump completes, the active computer is shut down.
(5) The system disk mapping is switched from the active computer to the standby computer.
(6) The standby computer boots the OS from the system disk.
(7) The OS on the standby computer reads the dump data stored in the dump area and copies it to the file system as a file.
(8) The business application is resumed on the standby computer.

Thus, in N+1 cold standby system switching, the business application could not resume on the standby computer until steps (1) to (7) above had completed. In particular, steps (3) and (7), which handle the dump data, tend to take longer as the memory capacity of the computer grows, and this has hindered quick resumption of business applications.

In addition, dump data is usually analyzed by a specialized support engineer, but for security reasons the support engineer often cannot access the OS of the computer that runs the business application. The support engineer therefore prepares a separate computer for dump analysis and performs the analysis on that computer. In this case, the dump data must be transferred from the computer running the business application to the dump analysis computer.

JP 2010-009293 A

The problem with cold standby system switching is that the switchover must wait for dump data processing to complete, so the standby computer cannot quickly resume the business application.

In the present invention, the memory dump is not stored in the dump area of the system disk but is transferred to the dump analysis computer via the service processor. The system disk mapping is switched before the active computer starts its memory dump. The standby computer boots the OS immediately, without waiting for the active computer to complete the memory dump.

In the present invention, the standby computer does not need to wait for the active computer to complete its memory dump processing before resuming the business application, nor does the OS on the standby computer need to convert dump data in the dump area of the system disk into a file. This shortens the time from the occurrence of a failure in the active computer to the resumption of the business application on the standby computer, and thereby improves the availability of the system as a whole.
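The effect can be illustrated with simple arithmetic. The sketch below sums the steps that must finish before the business application resumes, first for the conventional procedure and then for the scheme described here. Every duration is a hypothetical placeholder chosen for illustration, not a figure from this document, and notification and power-on latencies are ignored.

```python
# Hypothetical step durations in seconds; none of these figures come from the source.
CONVENTIONAL_STEPS = {
    "flush_fs_cache": 5,
    "dump_memory_to_system_disk": 600,
    "shutdown_active": 10,
    "switch_disk_mapping": 2,
    "boot_standby_os": 120,
    "copy_dump_to_file": 300,
}

# In the described scheme the dump runs in parallel via the service processor,
# so it no longer sits on the path to resuming the application.
PROPOSED_CRITICAL_PATH = {
    "flush_fs_cache": 5,
    "switch_disk_mapping": 2,
    "boot_standby_os": 120,
}


def time_to_resume(steps):
    """Sum the durations of the steps that must finish before the application resumes."""
    return sum(steps.values())


conventional = time_to_resume(CONVENTIONAL_STEPS)
proposed = time_to_resume(PROPOSED_CRITICAL_PATH)
print(f"conventional: {conventional} s, proposed: {proposed} s")
```

With these assumed numbers, the two dump-related steps dominate the conventional total, which is why removing them from the critical path matters more as memory capacity grows.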

FIG. 1 is a diagram showing an embodiment of the computer system switching method.
FIG. 2 shows the processing procedure of the computer system switching method.
FIG. 3 shows the format of the DMA area used in the memory dump process.
FIG. 4 shows the memory dump processing procedure performed by the CPU of the active computer.
FIG. 5 shows the memory dump processing procedure performed by the service processor.

FIG. 1 is a diagram showing an embodiment of the present invention.

The active computer 101 is a computer that executes the business application during normal operation of the system, and includes a CPU 103, a memory 104, an IO bridge 105, an HBA 106, and a service processor 107. The CPU 103 has a built-in memory controller and is connected to the memory 104 and the IO bridge 105. A disk array device 201 is connected to the HBA 106 as an external storage device. The OS and the business application are loaded into the memory 104 from the disk 202 mounted in the disk array device 201 and executed on the CPU 103.

The service processor 107 is a computer independent of the active computer 101, with its own CPU, memory, and IO devices, and performs tasks such as controlling the main power supply of the active computer 101. Because the service processor 107 runs on the standby power supply of the active computer 101, it keeps operating even when the main power supply of the active computer 101 is off. The service processor 107 is also connected to the IO bridge 105 and can access the memory 104 via the memory controller of the CPU 103.

The standby computer 102 is a computer with the same hardware as the active computer 101, and its role is to continue execution of the business application when a failure occurs in the active computer 101. While the active computer 101 is operating, the standby computer 102 waits in a powered-off state.

The HBAs 106 of the active computer 101 and the standby computer 102 are physically connected to the disk array device 201. The controller 203 has a host mapping function and can control the flow of input/output data according to the ID of the connected HBA. While the active computer 101 is operating, the path connecting the HBA of the active computer 101 to the disk array device 201 is enabled, and the path connecting the HBA of the standby computer 102 to the disk array device 201 is disabled.
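The host mapping behavior described above can be sketched as a small model. This is only an illustration under stated assumptions: the class name, method names, disk name, and HBA IDs are all hypothetical and do not correspond to any real disk array interface.

```python
class DiskArrayController:
    """Toy model of a controller with a host mapping function (hypothetical API)."""

    def __init__(self):
        # disk name -> ID of the HBA whose path to that disk is currently enabled
        self._mapping = {}

    def map_disk(self, disk, hba_id):
        """Enable the path from hba_id to disk; any previous mapping is replaced."""
        self._mapping[disk] = hba_id

    def is_path_enabled(self, disk, hba_id):
        return self._mapping.get(disk) == hba_id

    def read(self, disk, hba_id):
        """Serve I/O only for the HBA that the disk is mapped to."""
        if not self.is_path_enabled(disk, hba_id):
            raise PermissionError(f"{hba_id} is not mapped to {disk}")
        return f"data from {disk}"


controller = DiskArrayController()
controller.map_disk("disk202", "hba-active")            # normal operation
print(controller.is_path_enabled("disk202", "hba-standby"))
controller.map_disk("disk202", "hba-standby")           # after system switching
print(controller.read("disk202", "hba-standby"))
```

Both HBAs stay physically connected throughout; only the logical mapping decides which one can reach the disk.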

The management server 301 is a computer that controls cold standby system switching. The management server 301 is connected to the service processor 107 and the disk array device 201 via the management network 501. When a failure occurs in the active computer 101, the management server 301 instructs the disk array device 201 to switch the I/O path and instructs the standby computer 102 to turn on its main power.

The dump analysis computer 401 is a computer for storing and analyzing the memory dump data of the active computer 101. The dump analysis computer 401 includes a dump storage disk 402 and is connected to the service processor 107 via the management network 501.

FIG. 2 shows the system switching process of the present invention when a failure occurs in the active computer 101.

When a failure occurs in the active computer 101, an interrupt is raised to the CPU 103, all processing of the business application is suspended, and control passes to the OS.

The OS on the active computer 101 saves the context information of the CPU 103 to the memory 104 in step 1001, flushes the in-memory file system cache to the disk 202 in step 1002, and then, in step 1003, notifies the service processor 107 via the IO bridge 105 that a failure has occurred.

In step 1004, the service processor 107 notifies the management server 301 that a failure has occurred in the active computer 101 and that system switching is required. In step 1005, the management server 301 immediately instructs the disk array device 201 to switch the I/O path. On receiving this instruction, the disk array device 201 switches the host mapping for the disk 202 from the active computer 101 to the standby computer 102 in step 1006 and, in step 1007, notifies the management server 301 that the I/O path switch is complete. In step 1008, the management server 301 instructs the service processor 107 of the standby computer 102 to turn on the main power. On receiving this instruction, the service processor 107 of the standby computer 102 turns on the main power of the standby computer 102 in step 1009. The CPU 103 of the standby computer 102 loads the OS into the memory 104 of the standby computer 102 in step 1010 and resumes processing of the business application in step 1011.
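The ordering of steps 1005 to 1011 can be sketched as follows. This is a minimal simulation, not the actual management software: the class and function names are hypothetical, network messages are replaced by direct method calls, and the log strings exist only to make the step order visible.

```python
class DiskArray:
    def __init__(self, owner):
        self.owner = owner  # computer the system disk is currently mapped to

    def switch_mapping(self, new_owner, log):
        self.owner = new_owner
        log.append("1006 host mapping switched")
        log.append("1007 completion reported to management server")


class StandbyComputer:
    def __init__(self):
        self.powered_on = False
        self.app_running = False

    def power_on(self, disk_array, log):
        log.append("1009 main power on")
        self.powered_on = True
        # Booting can only succeed if the system disk is already mapped here.
        if disk_array.owner != "standby":
            raise RuntimeError("system disk not mapped to standby computer")
        log.append("1010 OS loaded")
        self.app_running = True
        log.append("1011 business application resumed")


def handle_failure_notice(disk_array, standby, log):
    """Management server reaction to the failure notice sent in step 1004."""
    log.append("1005 I/O path switch requested")
    disk_array.switch_mapping("standby", log)
    log.append("1008 power-on requested")
    standby.power_on(disk_array, log)


events = []
handle_failure_notice(DiskArray(owner="active"), StandbyComputer(), events)
print("\n".join(events))
```

Note that nothing in this sequence waits on the memory dump of the failed computer; the mapping switch in step 1006 is the only precondition for booting the standby computer.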

Meanwhile, on the active computer 101 side, the memory dump process 2000 starts after step 1004.

The memory dump process uses a DMA area reserved in the memory 104 for exchanging data between the CPU 103 and the service processor 107. The CPU 103 reserves this DMA area when the active computer 101 starts up. FIG. 3 shows the format of the DMA area. The data area 901 buffers the memory data to be dumped during the memory dump process. The control code area 902 stores the control codes used by the CPU 103 and the service processor 107 during the memory dump process. The control code area 902 holds one of three codes: a data valid code, a data invalid code, or a completion code. When the active computer 101 starts up, the CPU 103 stores the data invalid code in the control code area 902.
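One possible byte layout for such a DMA area is sketched below. The source names only the two areas and the three codes; the buffer size, the numeric code values, the 32-bit little-endian encoding, and the placement of the control code area directly after the data area are all assumptions made for illustration.

```python
import struct
from enum import IntEnum

DATA_AREA_SIZE = 4096             # hypothetical size of data area 901
CONTROL_OFFSET = DATA_AREA_SIZE   # assumed: control code area 902 follows the data area
CONTROL_FORMAT = "<I"             # assumed: 32-bit little-endian control code


class ControlCode(IntEnum):
    # Numeric values are hypothetical; the source only names the three codes.
    DATA_INVALID = 0
    DATA_VALID = 1
    COMPLETE = 2


def write_control(area, code):
    """Store a control code in control code area 902."""
    struct.pack_into(CONTROL_FORMAT, area, CONTROL_OFFSET, int(code))


def read_control(area):
    """Read the current control code from control code area 902."""
    (value,) = struct.unpack_from(CONTROL_FORMAT, area, CONTROL_OFFSET)
    return ControlCode(value)


def allocate_dma_area():
    """Reserve the area at startup and mark the data area as writable."""
    area = bytearray(DATA_AREA_SIZE + struct.calcsize(CONTROL_FORMAT))
    write_control(area, ControlCode.DATA_INVALID)
    return area


dma = allocate_dma_area()
print(len(dma), read_control(dma).name)
```

Initializing the control code to the data invalid code means the CPU side finds the data area writable on its first check.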

The memory dump process is carried out by the CPU 103 of the active computer 101, the service processor 107, and the dump analysis computer 401.

FIG. 4 shows the memory dump procedure performed by the CPU 103. Steps 2001 and 2002 wait for the data area 901 to become writable. In step 2001, the CPU 103 reads the value of the control code area 902. In step 2002, if the value read in step 2001 is the data invalid code, processing proceeds to step 2003; otherwise it returns to step 2001. In step 2003, the CPU 103 copies part of the not-yet-processed dump target data into the data area 901. In step 2004, if step 2003 finished copying all of the dump target data into the data area 901, processing proceeds to step 2006; otherwise it proceeds to step 2005. In step 2005, the CPU 103 writes the data valid code to the control code area 902 and returns to step 2001. In step 2006, the CPU 103 writes the completion code to the control code area 902 and ends the memory dump process.
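The CPU-side loop of steps 2001 to 2006 can be sketched as a Python generator. Several simplifications are assumed: the DMA area is modeled as a dictionary rather than raw memory, the code values are hypothetical, and the busy-wait of steps 2001 and 2002 is replaced by a `yield` so that a single-threaded reader (standing in for the service processor) can run between chunks.

```python
DATA_INVALID, DATA_VALID, COMPLETE = 0, 1, 2  # hypothetical code values


def cpu_dump(area, memory, chunk_size):
    """CPU-side loop of FIG. 4; yields whenever it would otherwise spin.

    chunk_size must be positive, otherwise no progress is made.
    """
    offset = 0
    while True:
        # Steps 2001-2002: wait until the data area is writable.
        while area["control"] != DATA_INVALID:
            yield
        # Step 2003: copy the next unprocessed part into the data area.
        chunk = memory[offset:offset + chunk_size]
        area["data"] = chunk
        offset += len(chunk)
        # Step 2004: was that the last part of the dump target data?
        if offset >= len(memory):
            area["control"] = COMPLETE    # step 2006
            return
        area["control"] = DATA_VALID      # step 2005


def run_with_simple_reader(memory, chunk_size):
    """Drive cpu_dump with a trivial reader standing in for the service processor."""
    area = {"data": b"", "control": DATA_INVALID}
    received = bytearray()
    for _ in cpu_dump(area, memory, chunk_size):
        received += area["data"]          # reader takes the chunk...
        area["control"] = DATA_INVALID    # ...and frees the data area
    received += area["data"]              # final chunk, flagged by COMPLETE
    return bytes(received)


print(run_with_simple_reader(b"example memory image", 8))
```

The final chunk is delivered together with the completion code rather than the data valid code, so the reader must still consume the data area once after seeing completion.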

FIG. 5 shows the memory dump procedure performed by the service processor 107. First, in step 3001, the service processor 107 connects to the dump analysis computer 401 and establishes a session for data transfer. Steps 3002 and 3003 wait for the CPU 103 to finish writing to the data area 901. In step 3002, the service processor 107 reads the value of the control code area 902. In step 3003, if the value read in step 3002 is the data invalid code, processing returns to step 3002; otherwise it proceeds to step 3004. In step 3004, the service processor 107 reads part of the dump data from the data area 901 and transfers it to the dump analysis computer 401 via the management network 501. In step 3005, it writes the data invalid code to the control code area 902. In step 3006, if the value of the control code area 902 read in step 3002 was the completion code, processing proceeds to step 3007; otherwise it returns to step 3002. In step 3007, the service processor 107 powers off the active computer 101. In step 3008, it closes the session with the dump analysis computer and ends the memory dump process.
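The service processor side of the handshake can be sketched with two threads sharing one object. This is a simulation under stated assumptions: `DmaArea`, `cpu_stub`, the code values, and the polling interval are hypothetical, the network session of steps 3001 and 3008 is reduced to a `send` callback supplied by the caller, and the dump is assumed to be non-empty.

```python
import threading
import time
from enum import IntEnum


class ControlCode(IntEnum):
    DATA_INVALID = 0   # numeric values are hypothetical
    DATA_VALID = 1
    COMPLETE = 2


class DmaArea:
    """Stand-in for the DMA area: data area 901 plus control code area 902."""

    def __init__(self):
        self.data = b""
        self.control = ControlCode.DATA_INVALID


def service_processor_dump(area, send, power_off):
    """Steps 3002-3007 of FIG. 5; session setup and teardown are left to the caller."""
    while True:
        code = area.control                        # step 3002
        if code == ControlCode.DATA_INVALID:       # step 3003: nothing to read yet
            time.sleep(0.001)
            continue
        send(area.data)                            # step 3004
        area.control = ControlCode.DATA_INVALID    # step 3005
        if code == ControlCode.COMPLETE:           # step 3006
            break
    power_off()                                    # step 3007


def cpu_stub(area, memory, chunk_size):
    """Minimal stand-in for the CPU side so the sketch runs end to end."""
    for offset in range(0, len(memory), chunk_size):
        while area.control != ControlCode.DATA_INVALID:
            time.sleep(0.001)
        area.data = memory[offset:offset + chunk_size]
        last = offset + chunk_size >= len(memory)
        area.control = ControlCode.COMPLETE if last else ControlCode.DATA_VALID


area = DmaArea()
received, events = [], []
memory = bytes(range(256)) * 4
sp = threading.Thread(target=service_processor_dump,
                      args=(area, received.append, lambda: events.append("power off")))
cpu = threading.Thread(target=cpu_stub, args=(area, memory, 100))
sp.start()
cpu.start()
cpu.join(timeout=5)
sp.join(timeout=5)
print(len(b"".join(received)), events)
```

The check in step 3006 uses the value captured in step 3002, because by that point step 3005 has already overwritten the control code area with the data invalid code.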

The dump analysis computer 401 stores the dump data transferred from the service processor 107 on the dump storage disk 402 in the order received.

As described above, in the computer system switching method according to the present invention, the memory dump of the failed active computer is not stored in the system disk but is stored in an external dump analysis computer via the service processor. The active computer can therefore release the system disk it was using before the memory dump process starts, and the standby computer can be started immediately. This shortens the downtime of the business application and improves the availability of the system as a whole.

Although this embodiment shows a case with one active computer and one standby computer, the computer switching method according to the present invention is also applicable when there are multiple active computers and standby computers.

Also, although the OS performs the memory dump process in this embodiment, the OS may instead terminate once it has flushed the file system to the disk, and the system firmware may perform the memory dump process.

DESCRIPTION OF SYMBOLS: 101 ... Active computer, 102 ... Standby computer, 103 ... CPU, 104 ... Memory, 105 ... IO bridge, 106 ... HBA, 107 ... Service processor, 201 ... Disk array device, 202 ... Disk, 203 ... Controller, 301 ... Management server, 401 ... Dump analysis computer, 402 ... Dump storage disk, 501 ... Management network.

Claims

A computer system switching method for cold standby computers that share an OS startup disk, characterized in that a memory dump of the active computer is stored in an external computer via a service processor, so that the OS startup disk becomes available before the memory dump process of the active computer starts, and the standby computer can be started without waiting for the active computer to complete the memory dump process.