JP2006114064A

JP2006114064A - Storage subsystem

Info

Publication number: JP2006114064A
Application number: JP2005376967A
Authority: JP
Inventors: Rie Kobayashi; 利恵小林; Yoshiko Matsumoto; 佳子松本; Kenji Muraoka; 健司村岡
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2005-12-28
Filing date: 2005-12-28
Publication date: 2006-04-27

Abstract

PROBLEM TO BE SOLVED: To provide a storage subsystem in which exclusive control of the caches among a plurality of controllers sharing those caches is eliminated.

SOLUTION: In the caches 33 and 43, to which data is multiplex-written, the cache area is divided per processor, and each of the controllers 30 and 40 accesses only the area it controls. Fixing the cache area used by each controller makes exclusive control between processors unnecessary and prevents the performance degradation that would otherwise accompany an increase in the number of processors.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a storage subsystem having a control device that controls information input/output requests from a host device, and more particularly to a storage subsystem in which the controllers and cache memories within the control device are configured redundantly.

One kind of storage subsystem that gives redundancy to its controllers and to storage devices such as disks is built as a duplex system, in which one system operates as the active system and the other as the standby system.

The storage subsystem described in Japanese Patent Laid-Open No. 4-215142 improves data integrity against controller and disk device failures either by copying the information stored on the active disk device to the standby disk device via a shared disk device accessible from both systems, or, when the active controller fails, by allowing the standby controller to extract the information stored on the active disk device.

Japanese Patent Laid-Open No. 4-215142 (JP-A-4-215142)

As a recent market trend, demand for higher-performance, larger-capacity, and lower-priced storage devices is growing, and RAID technology has become important. In a storage subsystem that applies RAID technology, a plurality of disk devices are configured as an array. When data is written, redundant data is written in addition to the write data, to a disk device different from the one that stores the write data. If any disk device in the array fails, the data on the failed disk device can be restored from the data on the other disk devices and the redundant data, which improves the integrity of the data on the disk devices.
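The recovery described above can be illustrated with a minimal sketch. It assumes the redundant data is a bytewise XOR parity, as in RAID 5; the text itself does not specify the redundancy code, and the function names are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild(surviving_blocks, parity):
    """Recover the single missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])

data = [b"AAAA", b"BBBB", b"CCCC"]   # data blocks on three disk devices
parity = xor_blocks(data)            # redundant data on a fourth disk device
# Assume the disk holding data[1] has failed; rebuild it from the rest.
recovered = rebuild([data[0], data[2]], parity)
```

XOR parity tolerates the loss of any one block per stripe, which matches the single-disk failure case the paragraph describes.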

However, while a storage subsystem that applies RAID technology improves data integrity, the generation and writing of the redundant data described above lengthen processing time. If redundant data were generated and written synchronously with the I/O processing from the host, write performance as seen from the host would degrade significantly. A write cache is therefore indispensable in the controller of a storage subsystem that applies RAID technology.

A write cache is a cache mounted in the controller to which data is written temporarily. For a write request from the host, completion is reported to the host as soon as the data has been written to this cache. The redundant data is then generated, and the write data and redundant data are stored on the disk devices, asynchronously with the host I/O processing, which prevents performance degradation during write processing. With a write cache, however, completion is reported to the host when the data is written to the cache, so the cache holds host data that has not yet been reflected on the disk devices. If the cache has no redundancy, a cache failure therefore loses user data. For this reason, particularly in control devices for storage subsystems that demand high data reliability, the cache is generally made redundant in addition to the conventional redundant configuration of controllers and storage devices.
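The write-cache behavior described above can be sketched as follows. This is a minimal illustration, not the controller's actual firmware: the class name, the dictionary standing in for a disk device, and the "done" status string are all assumptions, and redundant-data generation is omitted.

```python
class WriteBackCache:
    """Hypothetical write cache: acknowledge first, write to disk later."""

    def __init__(self, disk):
        self.disk = disk      # dict standing in for a disk device
        self.dirty = {}       # host data not yet reflected on the disk

    def write(self, address, data):
        self.dirty[address] = data
        return "done"         # completion is reported before any disk write

    def destage(self):
        # Runs asynchronously with host I/O in a real controller, which
        # would also generate and store the redundant data at this point.
        for address, data in list(self.dirty.items()):
            self.disk[address] = data
            del self.dirty[address]

disk = {}
cache = WriteBackCache(disk)
status = cache.write(0, b"x")
pending = dict(cache.dirty)   # data that would be lost if the cache failed now
cache.destage()
```

The contents of `pending` are exactly the exposure the paragraph describes: data acknowledged to the host but present only in the cache.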

In a storage subsystem with multiplexed controllers, simply multiplexing the cache requires exclusive control among the multiple control devices at cache access time, to prevent data inconsistency caused by multiple control devices accessing data in the cache simultaneously. Because of this exclusive control, a storage subsystem with multiplexed controllers performs worse than a single-controller storage subsystem.

An object of the present invention is to eliminate the inter-processor exclusive control of the cache that accompanies multiplexing of controllers and caches, and thereby to raise reliability without lowering performance.

To achieve the above object, the storage subsystem according to the present invention has: means for exclusively determining, for each processor, the logical volumes it is responsible for processing; means by which a processor that receives a request from the host computer for a volume outside its responsibility communicates a processing request to the responsible processor; means by which the processor that receives that communication communicates the processing result to the requesting processor; means for providing each processor with its own cache components, such as directories and data segments; means for dynamically changing the state of those components according to processor load; means for multiplex-writing write data from the host computer to caches on a plurality of controllers; means for switching, when a controller fails, the control right over the cache components held by the processor in the failed controller to a processor in a normal controller; means for returning that control right to the recovered processor when the controller is restored; and means for writing to the disk device from the other cache holding the multiplex-written copy when a cache memory failure occurs while writing to the disk device.

With these means, data can be multiplex-written to caches on a plurality of controllers without exclusive control of the caches among the processors. This prevents the performance degradation that would otherwise accompany the use of multiple processors and improves reliability without lowering performance.

Also, with these means, when a cache memory fails, data is written to the disk device from the other cache holding the multiplex-written copy, so data loss can be prevented.
Furthermore, with these means, when a controller fails, processing can be switched automatically to the normal system and continued, and when the controller is restored, processing can be returned automatically to the restored system, so non-stop operation of the system can be achieved.

According to the present invention, in a storage subsystem in which the controllers and cache memories are duplicated, assigning a part of the cache memory and a logical volume to each controller eliminates exclusive control of the cache memory between the processors in the controllers, and so prevents the degradation in response performance that the use of multiple processors would otherwise cause.

In addition, because data is multiplex-written to a plurality of caches, when a cache fails the data can be written to disk from the other cache holding the copy, which prevents data loss. Furthermore, providing means for switching control of the cache memory to a normal controller when a controller fails, and means for recovering from the controller failure, allows the system to be operated without interruption.

FIG. 1 is a conceptual diagram of the present invention.

In FIG. 1, 10 and 11 are host computers, 20 is a control device with a dual-controller configuration, and 50 is a disk device. The disk device 50 is divided into two logical volumes, logical volume 0 and logical volume 1.

Host A 10 processes logical volume 0 via controller A 30 in the control device 20, and host B 11 processes logical volume 1 via controller B 40 in the control device 20.

Here, logical volume 0 is assigned to controller A 30, and logical volume 1 to controller B 40, as the logical volume each is responsible for processing. The cache area in each controller is divided in two: controller A caches 31 and 41, and controller B caches 32 and 42. Dual writing is performed between controller A caches 31 and 41, and likewise between controller B caches 32 and 42.

Controller A 30 normally performs I/O processing using controller A caches 31 and 41, and controller B 40 likewise performs I/O processing using controller B caches 32 and 42. Assigning each controller its own cache area in this way eliminates exclusive control between controllers and prevents the performance degradation that would accompany an increase in the number of controllers.
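The partitioning in FIG. 1 can be sketched as two caches, each split into one area per owning controller. This is a minimal illustration under stated assumptions: the class names are hypothetical, dictionaries stand in for cache memory, and the mapping of reference numerals to physical caches is inferred from the numbering rather than stated in the text.

```python
class Cache:
    """One physical cache, split into one area per owning controller."""

    def __init__(self, owners):
        self.areas = {owner: {} for owner in owners}

class Controller:
    def __init__(self, name, local_cache, peer_cache):
        self.name = name
        self.caches = (local_cache, peer_cache)

    def write(self, address, data):
        # Only this controller's own area is touched in each cache, so
        # no lock is taken against the other controller.
        for cache in self.caches:
            cache.areas[self.name][address] = data

cache_a = Cache(["A", "B"])   # assumed to hold areas 31 and 32 of FIG. 1
cache_b = Cache(["A", "B"])   # assumed to hold areas 41 and 42 of FIG. 1
ctrl_a = Controller("A", cache_a, cache_b)
ctrl_b = Controller("B", cache_b, cache_a)
ctrl_a.write(0, b"vol0")      # lands in both controller A areas
ctrl_b.write(0, b"vol1")      # lands in both controller B areas
```

Both controllers write address 0 here without interfering, because each write is confined to the writer's own area in each cache.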

When controller B 40 fails, controller A 30 uses controller B caches 32 and 42, so that processing of logical volume 1, for which controller B 40 was responsible, can continue from host A 10 via controller A 30.

An embodiment of a control device with a multi-controller configuration according to the present invention is described below with reference to the drawings.

FIG. 2 is a configuration diagram of the present invention applied to a magnetic disk array subsystem with a multi-controller configuration.

In FIG. 2, 1000, 1100, 1200, and 1300 are host computers, which are central processing units that perform data processing; 2000 is a control device that has a multi-controller configuration and controls the disk devices; and 7000 and 7100 are disk devices that store the host computers' data. The control device 2000 may be inserted into a slot connected directly to the host bus and built into the host housing, may be built into an independent housing as a control device, or may be realized as a housing that also incorporates the disk devices. The disk device groups 7000 and 7100 each include a parity group consisting of data disks and a parity disk. Disk device group 7000 is divided into logical volume 0 and logical volume 1, and disk device group 7100 into logical volume 2 and logical volume 3.

The control device 2000 consists of controllers 3000 and 4000, which control data transfer between host computers 1000 and 1100 and the disk device 7000, and controllers 5000 and 6000, which control data transfer between host computers 1200 and 1300 and the disk device 7100.

The controller 3000 consists of a host I/F control unit 3100, which performs protocol control with the host computer 1000; a microprocessor 3200 (hereinafter "processor"), which controls the controller as a whole; a data transfer control unit 3300, which executes data transfers; a cache 3400, which is used for data transfer between the host computer 1000 and the disk device 7000 and for inter-processor communication; and a DRV I/F control unit 3500, which performs protocol control with each disk device 7000. Controllers 4000, 5000, and 6000 have the same configuration as controller 3000.

The processor 3200 processes the logical volumes assigned exclusively to it in advance, by the means described later. The responsible logical volume for each processor can be set dynamically, by receiving from the host computer a command designating the responsible processor for each logical volume. The correspondence between processors and their logical volumes is stored in the common memory areas 3410 and 4410 on the caches, described later.
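The exclusive, dynamically changeable assignment described above amounts to a table from logical volume to responsible processor. The following is a minimal sketch; the class and method names are hypothetical, and the format of the host's designation command is not given in the text.

```python
class VolumeOwnership:
    """Hypothetical table of responsible processors, one per logical volume."""

    def __init__(self):
        self.owner_of = {}   # logical volume number -> processor number

    def assign(self, volume, processor):
        # Each volume has exactly one responsible processor; a later
        # designation for the same volume replaces the earlier one.
        self.owner_of[volume] = processor

    def is_responsible(self, volume, processor):
        return self.owner_of.get(volume) == processor

table = VolumeOwnership()
table.assign(0, 3200)
table.assign(1, 4200)
was_responsible = table.is_responsible(1, 4200)
table.assign(1, 3200)   # stands in for a designation command from the host
```

Keeping one owner per volume is what makes the assignment exclusive: no volume can be claimed by two processors at once.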

On instruction from the processor 3200, the data transfer control unit 3300 can multiplex-write the write data from the host computer 1000 to designated caches. In this embodiment, dual writing is performed between cache 3400 and cache 4400, and between cache 5400 and cache 6400. The scheme for dual writing to the two caches 3400 and 4400 is described below.

The contents of caches 3400 and 4400 are described with reference to FIG. 3. Because caches 3400 and 4400 have the same internal configuration, cache 3400 is used as the example. Cache 3400 consists of a common memory area 3410, which stores control information used for inter-processor communication, an area 3480 for processor 3200, and an area 3490 for processor 4200.

The area 3480 for processor 3200 consists of a data storage area 3482, which temporarily holds data during transfer between the host computer and the disk device, and data management information 3481, which manages the data storage area 3482. The write data stored in the data storage area 3482, together with its management information, is dual-written to the area 4480 for processor 3200 in cache 4400. Similarly, the area 3490 for processor 4200 holds, dual-written by processor 4200, the write data and management information of the area 4490 for processor 4200 in cache 4400.

The common memory area 3410 consists of logical volume responsible processor information 3420, processor load information 3430, multiplex-write information 3450, and an inter-processor communication memory 3460. All of this information is dual-written to caches 3400 and 4400 by the data transfer control units 3300 and 4300.

FIG. 3(c) shows the configuration of the inter-processor communication memory. The inter-processor communication memory 3460 consists of write memories 3461, 3462, 3463, and 3464, one for each of processors 3200, 4200, 5200, and 6200. FIG. 3(d) shows the configuration of a processor write memory. The write memory 3461 for processor 3200 consists of request areas 3471, 3472, and 3473, for requests to the other processors 4200, 5200, and 6200, and response areas 3474, 3475, and 3476, for responses to requests from the other processors 4200, 5200, and 6200. The write memories 3462, 3463, and 3464 for processors 4200, 5200, and 6200 have the same internal configuration as the write memory 3461 for processor 3200.
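The layout in FIG. 3(c) and 3(d) can be modeled as one write memory per processor, each holding a request slot and a response slot for every other processor. The sketch below is illustrative only: the dictionary keys, field names, and message contents are assumptions, not the patent's memory format.

```python
PROCESSORS = (3200, 4200, 5200, 6200)

def make_write_memory(owner):
    """Memory written only by `owner`: one request slot and one response
    slot for each of the other processors."""
    others = [p for p in PROCESSORS if p != owner]
    return {
        "request_to": {p: None for p in others},
        "response_to": {p: None for p in others},
    }

comm_memory = {p: make_write_memory(p) for p in PROCESSORS}

# Processor 3200 posts a request for 4200 in its own write memory;
# processor 4200 answers in its own write memory.
comm_memory[3200]["request_to"][4200] = {"type": "write", "address": 0x10, "length": 512}
comm_memory[4200]["response_to"][3200] = "done"
```

In this sketch every slot has a single writer, so two processors never write the same slot; other processors only read it.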

Caches 5400 and 6400 are duplicated in the same way as caches 3400 and 4400, except for the common memory area. The common memory area does not exist in caches 5400 and 6400, because the information dual-written to caches 3400 and 4400 is shared by all processors in the control device.

In a control device embodying the present invention, controllers are added in units of two, and dual writing is performed only between the caches of paired controllers. On the drive-side data bus as well, each disk device is connected only to its pair of controllers. This simplifies the hardware configuration and avoids contention on the drive-side data bus.

Next, I/O processing from the host computer 1000 in the magnetic disk subsystem of this embodiment is described with reference to FIGS. 4, 5, and 6. First, I/O processing to a logical volume for which processor 3200 is responsible is described.
FIG. 4 is a flowchart showing I/O processing from the host. On a write request from the host computer 1000, the processor 3200 first obtains the responsible-processor information for the requested logical volume from the logical volume responsible processor information 3420 in the common memory area 3410, determines whether the request is for a logical volume (LUN) it is responsible for (step 902), and recognizes that it is. It then determines the processing type (step 903) and recognizes a write. The host I/F control unit 3100 receives the write data, and the data transfer control unit 3300 stores it, together with its management information, in duplicate in the area 3480 for controller 3000 in cache 3400 and the area 4480 for controller 3000 in cache 4400 (step 904). At this point, completion is reported to the host computer 1000 (step 905).

FIG. 5 is a flowchart showing the processing that stores data in the cache to the disk device. Asynchronously with the I/O processing from the host computer 1000, the processor 3200 stores the write data in the area 3480 for processor 3200 to the disk device group 7000, using the data transfer control unit 3300 and the DRV I/F control unit 3500 (step 922). If a read error occurs at this time because of a memory failure in the cache (step 923), the data is stored to the disk device 7000 from the duplicated area 4480 for processor 3200 (step 924), which prevents data loss.
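The fallback in FIG. 5 can be sketched as a destage routine that retries from the mirrored area when the primary cache read fails. This is a minimal illustration: the exception type, the function names, and the `primary_failed` flag used to simulate a memory failure are all assumptions.

```python
class CacheReadError(Exception):
    """Stands in for a read error caused by a cache memory failure."""

def read_area(area, key, failed=False):
    if failed:
        raise CacheReadError(key)
    return area[key]

def destage(key, primary, mirror, disk, primary_failed=False):
    """Write one cached block to disk, falling back to the mirrored copy."""
    try:
        data = read_area(primary, key, failed=primary_failed)  # step 922
    except CacheReadError:                                     # step 923
        data = read_area(mirror, key)                          # step 924
    disk[key] = data
    return data

area_3480 = {}                   # primary copy, assumed unreadable
area_4480 = {("vol0", 4): b"x"}  # duplicated copy in the paired cache
disk = {}
written = destage(("vol0", 4), area_3480, area_4480, disk, primary_failed=True)
```

The data reaches the disk even though the primary cache copy cannot be read, which is the data-loss prevention the paragraph describes.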

On a read request from the host computer 1000, the processor 3200, as in the write processing above, recognizes that the request is for a logical volume (LUN) it is responsible for (step 902) and then determines the processing type (step 903). On recognizing a read, it stores the data from the disk device group 7000 into the area 3480 for controller 3000 in cache 3400, using the data transfer control unit 3300 and the DRV I/F control unit 3500 (step 906), and transfers it to the host computer (step 907).
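The branch structure of FIG. 4 for a processor's own logical volume can be sketched as a single dispatch function. The request format, the function name, and the `forward` callback (which stands in for steps 908 onward) are assumptions for illustration; dictionaries stand in for the cache areas and the disk.

```python
def handle_io(processor, request, owner_of, own_areas, disk, forward):
    """Return the value reported to the host for one I/O request."""
    volume, kind, address = request["volume"], request["type"], request["address"]
    if owner_of[volume] != processor:               # step 902: not our volume
        return forward(owner_of[volume], request)   # steps 908 onward
    if kind == "write":                             # step 903
        for area in own_areas:                      # step 904: both caches
            area[(volume, address)] = request["data"]
        return "done"                               # step 905
    data = disk[(volume, address)]                  # step 906: stage from disk
    own_areas[0][(volume, address)] = data
    return data                                     # step 907

owner_of = {0: 3200, 1: 4200}
area_3480, area_4480 = {}, {}
disk = {(0, 8): b"old"}
forwarded = []

def forward(target, request):
    forwarded.append(target)
    return "forwarded"

areas = [area_3480, area_4480]
w = handle_io(3200, {"volume": 0, "type": "write", "address": 4, "data": b"new"},
              owner_of, areas, disk, forward)
r = handle_io(3200, {"volume": 0, "type": "read", "address": 8},
              owner_of, areas, disk, forward)
f = handle_io(3200, {"volume": 1, "type": "read", "address": 0},
              owner_of, areas, disk, forward)
```

Note that the write path returns before any disk access, while the read path in this sketch always stages from disk, as the text describes; a cache-hit check is not mentioned in the source and is not modeled.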

Next, I/O processing from the host computer 1000 to a logical volume for which controller 4000 is responsible is described.

On a write request from the host computer 1000, the processor 3200 first obtains the responsible-processor information for the requested logical volume from the logical volume responsible processor information 3420 in the common memory area 3410, determines whether the request is for a logical volume it is responsible for (step 902), and recognizes that it is not. It then determines the processing type (step 908) and recognizes a write. It stores the write data from the host computer 1000 in the area 3480 for controller 3000 in the cache memory and requests the write processing from controller 4000, which is responsible for this logical volume (step 909).

To request write processing from the processor 4200, the processor 3200 uses the data transfer control unit 3300 to store, in duplicate, the logical address of the write data, the address at which the write data is stored in the cache, the data length, and the processing type information in the request area for processor 4200 within the write memory for processor 3200 in the common memory areas 3410 and 4410. The processing type information indicates whether the request is a write or a read. The processor 4200 checks the request areas addressed to it in the common memory areas 3410 and 4410 at a fixed interval, for example every 10 ms, and so recognizes requests from other processors.

FIG. 6 is a flowchart showing the processing of the processor 4200 when it receives a processing request from the processor 3200. Having recognized the request from the processor 3200 by the method described above (step 931), the processor 4200 refers to the processing type in the request area for processor 4200 within the write memory for processor 3200 and recognizes a write request (step 932). It then obtains, from the same request area, the write logical address, the address at which the write data is stored in the cache, and the data length (step 933), and stores the write data, for the data length starting at that storage address in cache 3400, in duplicate in the areas 3490 and 4490 for processor 4200, together with its management information, namely the write logical address and data length (step 934). Finally, it sets completion information in the response area for requests from processor 3200 within the write memory for processor 4200 in the common memory areas 3410 and 4410, thereby notifying the processor 3200 that processing is complete (step 935).

After issuing the request to the processor 4200, the processor 3200 monitors for completion of the processing by the processor 4200 (step 910; see FIG. 4) by referring to the response area for requests from processor 3200 within the write memory for processor 4200. On receiving the completion notification, it reports completion to the host computer 1000 (step 905). The processor 4200 afterwards writes this write data to the disk device 7000 in accordance with FIG. 5, asynchronously with the host I/O processing.
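The forwarded write can be sketched as a request/response exchange through shared memory that the responsible processor polls. This is an illustration under stated assumptions: a flat dictionary keyed by (source, destination, kind) stands in for the request and response areas, the function names and message fields are hypothetical, and one function call stands in for one polling tick (the text suggests an interval such as 10 ms).

```python
def post_request(mailbox, src, dst, request):
    """Requester writes into its own request area for `dst` (step 909)."""
    mailbox[(src, dst, "request")] = request

def poll_and_serve_write(mailbox, me, requester_cache, my_areas):
    """Run by the responsible processor on each polling tick."""
    for (src, dst, kind), request in list(mailbox.items()):
        if dst != me or kind != "request" or request["type"] != "write":
            continue                                  # steps 931 and 932
        start = request["cache_address"]              # step 933
        data = bytes(requester_cache[start:start + request["length"]])
        for area in my_areas:                         # step 934: both caches
            area[request["logical_address"]] = data
        del mailbox[(src, dst, kind)]
        mailbox[(me, src, "response")] = "done"       # step 935

def check_response(mailbox, me, peer):
    """One poll of the response area by the requester (step 910)."""
    return mailbox.pop((peer, me, "response"), None)

mailbox = {}
staging_3480 = bytearray(b"....HELLO...")   # host data staged by processor 3200
area_3490, area_4490 = {}, {}
post_request(mailbox, 3200, 4200,
             {"type": "write", "logical_address": 100, "cache_address": 4, "length": 5})
before = check_response(mailbox, 3200, 4200)   # nothing yet
poll_and_serve_write(mailbox, 4200, staging_3480, [area_3490, area_4490])
after = check_response(mailbox, 3200, 4200)    # completion notice
```

Only the responsible processor writes into areas 3490 and 4490; the requester merely stages the data and waits, so the two processors never contend for the same cache area.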

In FIG. 4, when a read request arrives from the host computer 1000, the processor 3200, as when it receives a write request, recognizes that the request is for a logical volume (LUN) outside its responsibility (step 902) and then determines the processing type (step 908). On recognizing a read, the processor 3200 stores the read request logical address, the address in the cache at which the read data may be stored, the data length, and the processing type information in the request area for processor 4200 within the write memory for processor 3200 in the common memory areas 3410 and 4410, thereby communicating the read request to the processor 4200, which is responsible for that LUN (step 911).

In FIG. 6, having recognized the request from the processor 3200 (step 931), the processor 4200 recognizes from the information in the common memory area that it is a read (step 932). It obtains from the common memory area the read request logical address, the address in the cache at which the read data may be stored, and the data length (step 936). It then stores the data from the disk device 7000 into the area 4490 for processor 4200 and stores this data at the permitted storage address in cache 3400 (step 937). Finally, it sets completion information in the response area for requests from processor 3200 within the write memory for processor 4200 in the common memory areas 3410 and 4410, thereby notifying the processor 3200 that the read is complete (step 935).

In FIG. 4, the processor 3200, which has been monitoring for completion of the processing by the processor 4200 (step 912), receives this read completion report and transfers the data to the host computer (step 913).
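The data movement in the forwarded read can be sketched as two small functions, one per processor. The function names, the request fields, and the use of a bytearray for cache 3400 are assumptions for illustration; the request/response signalling through the common memory area is omitted here.

```python
def serve_read(request, disk, owner_area, requester_cache):
    """Responsible processor side (steps 936, 937, and 935)."""
    data = disk[request["logical_address"]][:request["length"]]
    owner_area[request["logical_address"]] = data     # stage into area 4490
    start = request["permitted_address"]
    requester_cache[start:start + len(data)] = data   # copy into cache 3400
    return "done"

def finish_read(request, requester_cache):
    """Requesting processor side: hand the data to the host (step 913)."""
    start = request["permitted_address"]
    return bytes(requester_cache[start:start + request["length"]])

disk = {200: b"PAYLOAD"}
area_4490 = {}
cache_3400 = bytearray(16)
request = {"type": "read", "logical_address": 200, "permitted_address": 8, "length": 7}
status = serve_read(request, disk, area_4490, cache_3400)
host_data = finish_read(request, cache_3400)
```

The requester names the address where the data may be placed, so the responsible processor writes into the requester's cache only at a location the requester has set aside for it.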

In this way, the processor 3200 normally performs I/O processing using the areas 3480 and 4480 for processor 3200. Likewise, the processor 4200 normally performs I/O processing using the areas 3490 and 4490 for processor 4200.

Fixing the cache area used by each processor in this way eliminates exclusive control between processors and prevents the performance degradation that would accompany an increase in the number of processors. In particular, in a system where host computers do not share files (logical volumes), assigning each logical volume to the processor in the controller to which its host is connected makes inter-processor communication unnecessary during I/O processing and allows a further performance improvement.

Next, the automatic switchover and recovery scheme used when controller 4000 fails is described with reference to FIGS. 7 and 8. A processor that detects a failure of controller 4000 during I/O processing uses common memory area 3410 to notify all remaining processors of the failure. At that time, a takeover request is also sent to processor 3200 in controller 3000, which duplexes its cache with controller 4000. This embodiment describes the case in which processor 3200 detects the failure.

FIG. 7 is a flowchart showing the processing of processor 3200 when it detects a failure of controller 4000. When processor 3200 detects a failure of controller 4000 (step 951) during I/O processing (step 950), it notifies processors 5200 and 6200 of the failure by the method described above. Then, to disconnect the failed controller from the system, it instructs data transfer control unit 3300 to change the write data from the host computer and the data in the common memory area, both of which are duplexed to caches 3400 and 4400, to single writing to cache 3400 (step 952). Processors 5200 and 6200, having recognized the request from processor 3200, likewise change the common memory area to single writing to cache 3400. Next, to take over the processing of processor 4200, processor 3200 switches the control right for the areas for processor 4200 to itself (step 953). This completes the switchover of the control right, and processor 3200 resumes normal I/O processing (step 954).
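The switchover in steps 952 and 953 can be sketched as a pair of state changes. The state layout below (dictionaries keyed by processor and area numbers) is a hypothetical model chosen for illustration; the patent does not describe its data structures at this level.

```python
def initial_state():
    """Hypothetical bookkeeping for the four-processor example."""
    processors = [3200, 4200, 5200, 6200]
    return {
        "processors": processors,
        "cache_of": {3200: 3400, 4200: 4400},
        # Caches that receive each processor's host write data.
        "host_write_caches": {3200: [3400, 4400], 4200: [3400, 4400]},
        # Caches that receive each processor's common memory updates.
        "common_memory_caches": {p: [3400, 4400] for p in processors},
        # Control right for each per-processor cache area.
        "area_owner": {3480: 3200, 4480: 3200, 3490: 4200, 4490: 4200},
    }


def fail_over(state, survivor, failed):
    """Apply steps 952 and 953 after `survivor` detects that `failed` is down."""
    own_cache = state["cache_of"][survivor]
    # Step 952: host write data goes to the surviving cache only.
    state["host_write_caches"][survivor] = [own_cache]
    # Step 952, also done by the notified processors: single-write common memory.
    for proc in state["processors"]:
        if proc != failed:
            state["common_memory_caches"][proc] = [own_cache]
    # Step 953: the survivor takes over control of the failed processor's areas.
    for area, owner in list(state["area_owner"].items()):
        if owner == failed:
            state["area_owner"][area] = survivor
    return state
```

Calling `fail_over(initial_state(), survivor=3200, failed=4200)` leaves processor 3200 owning areas 3490 and 4490 in addition to its own, after which it would resume normal I/O (step 954).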

FIG. 8 is a flowchart showing the recovery processing of failed controller 4000. When the failed part of controller 4000 has been replaced (step 971), processor 4200 uses common memory area 3410 to notify all processors that recovery is starting (step 972). On receiving this notification (step 955), processors 3200, 5200, and 6200 each instruct the data transfer control unit of their own controller to perform duplex writing to caches 3400 and 4400, and use common memory areas 3410 and 4410 to send a processing-complete response to processor 4200 (step 956). After receiving this completion report from all processors (step 973), processor 4200 recovers the data of cache 4400 (step 974). When data recovery is complete, it uses common memory areas 3410 and 4410 to notify processor 3200 that recovery is complete (step 975).

In FIG. 7, on receiving this completion notification (step 958), processor 3200 returns the control right for the areas for processor 4200 to processor 4200 (step 959) and uses the common memory area to notify processor 4200 that the control right has been restored (step 960). In FIG. 8, on receiving this notification (step 976), processor 4200 resumes I/O processing (step 977).
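The ordering constraints of the recovery sequence can be sketched in one function. Two things here are assumptions rather than statements from the text: the repaired cache is rebuilt by copying the surviving cache (the text says only that data recovery is performed), and the acknowledgements and ownership tables are modeled as plain Python collections with hypothetical names.

```python
def recover(surviving_cache, repaired_cache, area_home, area_owner, acks, others):
    """Return True once the repaired controller may resume I/O.

    area_home  -- area -> processor that normally controls it
    area_owner -- area -> processor that currently controls it
    acks       -- processors that have reported duplex writing re-enabled (step 956)
    others     -- every processor other than the recovering one
    """
    if set(acks) != set(others):             # step 973: wait for all completion reports
        return False
    repaired_cache.clear()                   # step 974: recover the cache contents
    repaired_cache.update(surviving_cache)   # (assumed: copy from the surviving cache)
    for area, home in area_home.items():     # step 959: return the control rights
        area_owner[area] = home
    return True                              # steps 960, 976, 977: I/O resumes
```

Data recovery does not start until every other processor has switched back to duplex writing, which keeps new writes from bypassing the cache that is being rebuilt.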

In the embodiment above, each controller has one processor and one host I/F control unit. However, any number of these may be used; the same scheme can be realized by having the processor that receives a command from the host computer forward the processing request to the processor in charge.

The cache need not be divided equally among the processors; the division can be set and changed by user specification. In particular, when a specific processor operates as a hot standby, the cache can be used effectively by allocating no cache area to the hot-standby processor. The division can also be changed dynamically according to processor load. In this embodiment, a host command specifies whether the division follows the user's specification or is changed according to processor load; a device such as a panel may of course be connected and used for this input instead.

Next, a method for realizing dynamic allocation of the controller cache is described.

First, the cache management method is described with reference to FIG. 9.
The data storage area held for each processor is divided into management units called segments 983. Each segment has a segment management block 981 (hereinafter referred to as SGCB) in the data management information, which stores the information for managing the segment and the segment address. The SGCBs are linked to one of two queues, dirty queue 980 or clean queue 982, according to the attribute of their segment. The SGCBs of segments holding write data not yet reflected on disk are linked to dirty queue 980, and all other SGCBs are linked to clean queue 982.
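The two-queue arrangement can be sketched as follows. This is a simplified model under stated assumptions: the class names, the fields of `SGCB`, and the policy of reusing the oldest clean segment are all chosen for illustration and are not taken from the patent.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class SGCB:
    segment_address: int                    # where the segment lives in the cache
    logical_address: Optional[int] = None   # what the segment currently holds
    dirty: bool = False                     # True while the data is not yet on disk


class ProcessorCacheArea:
    """One processor's data storage area, managed through SGCB queues."""

    def __init__(self, segment_addresses):
        self.clean = deque(SGCB(a) for a in segment_addresses)
        self.dirty = deque()

    def write(self, logical_address):
        sgcb = self.clean.popleft()         # assumed policy: reuse the oldest clean segment
        sgcb.logical_address = logical_address
        sgcb.dirty = True
        self.dirty.append(sgcb)             # clean -> dirty transition
        return sgcb

    def destage(self):
        sgcb = self.dirty.popleft()         # the data has been written to disk
        sgcb.dirty = False
        self.clean.append(sgcb)             # dirty -> clean transition
        return sgcb
```

The length of `clean` is the quantity the dynamic allocation scheme uses as load information, so each transition between queues is a natural point at which to update it.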

To realize dynamic allocation of the cache, load information for each processor is held in the common memory area. The amount of clean SGCBs in the cache is used as this load information, for example. Each processor updates this information whenever an SGCB moves between the clean and dirty queues. At a fixed interval, for example one minute, a processor refers to this information and moves SGCBs and the segments they manage from the clean queue of the least-loaded processor to the clean queue of the most-loaded processor, among the processors sharing the cache, until their loads are equal. SGCBs in use are excluded from migration. The stored-data information of an SGCB is cleared when it is migrated. During the migration, the I/O processing of the processors involved is stopped by means of inter-processor communication.
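The periodic rebalancing step can be sketched as below. The sketch assumes that the least-loaded processor is the one holding the most clean SGCBs and the most-loaded processor is the one holding the fewest, which is one reading of using the clean SGCB amount as load information. The function name and the representation of clean queues as lists of SGCB identifiers are hypothetical.

```python
def rebalance(clean_queues):
    """Move clean SGCBs from the least-loaded to the most-loaded processor.

    clean_queues -- processor -> list of clean SGCB identifiers (modified in place)
    Returns (donor, receiver, number_moved).
    """
    donor = max(clean_queues, key=lambda p: len(clean_queues[p]))      # lowest load
    receiver = min(clean_queues, key=lambda p: len(clean_queues[p]))   # highest load
    moved = 0
    # Stop when the two queues are equal in length (or differ by one SGCB).
    while len(clean_queues[donor]) - len(clean_queues[receiver]) > 1:
        sgcb = clean_queues[donor].pop()
        # A real controller would clear the SGCB's stored-data information here.
        clean_queues[receiver].append(sgcb)
        moved += 1
    return donor, receiver, moved
```

With `{3200: [1, 2, 3, 4, 5, 6], 4200: [7, 8]}`, two SGCBs move from processor 3200 to processor 4200, leaving four in each queue. Only SGCBs on the clean queue are considered, which matches the rule that SGCBs in use are not migrated.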

In the embodiment above, the cache is shared between two controllers, each of which writes in duplicate to the cache of its partner controller. However, as long as the cache area is divided per processor, any cache sharing scheme and any multiple-writing scheme can be used in the same way.

Examples of multiple writing to the cache are shown in FIG. 10.

(1) is a scheme in which the cache is shared across the whole apparatus and written in duplicate. That is, processor 3200 performs duplex writing using caches 3400 and 4400, processor 4200 using caches 4400 and 5400, processor 5200 using caches 5400 and 6400, and processor 6200 using caches 6400 and 3400.

(2) is a scheme in which the cache is shared across the whole apparatus and every write is multiplexed to all caches. That is, each of processors 3200, 4200, 5200, and 6200 performs multiple writing using caches 3400, 4400, 5400, and 6400. In this case, when a controller fails, the least-loaded processor among those sharing the cache takes over the processing of the logical volumes for which the failed controller was responsible. In these cases, the disk-side data bus is connected to a bus common to all disk devices and all controllers in the apparatus so that any processor can take over the processing of the logical volumes of the failed controller. These multiple-writing schemes can of course be mixed within one apparatus. The scheme is specified by holding multiple-writing information in common memory areas 3410 and 4410; each of processors 3200, 4200, 5200, and 6200 uses this information to instruct data transfer control units 3300, 4300, 5300, and 6300 how to transfer write data.
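The two schemes of FIG. 10 and the takeover rule can be sketched as simple lookups. The scheme names "ring" and "all", the function names, and the numeric load values are assumptions made for this example; the patent only states that multiple-writing information is held in the common memory areas.

```python
CACHES = [3400, 4400, 5400, 6400]
PROCESSORS = [3200, 4200, 5200, 6200]


def write_targets(processor, scheme):
    """Caches that receive a processor's write data under each scheme."""
    i = PROCESSORS.index(processor)
    if scheme == "ring":     # scheme (1): own cache plus the next controller's cache
        return [CACHES[i], CACHES[(i + 1) % len(CACHES)]]
    if scheme == "all":      # scheme (2): every cache in the apparatus
        return list(CACHES)
    raise ValueError(f"unknown scheme: {scheme}")


def pick_successor(load, failed):
    """Least-loaded surviving processor takes over the failed controller's volumes."""
    candidates = {p: v for p, v in load.items() if p != failed}
    return min(candidates, key=candidates.get)
```

Here `write_targets(6200, "ring")` gives `[6400, 3400]`, matching the wrap-around described for scheme (1), and `pick_successor({3200: 5, 4200: 1, 5200: 2, 6200: 9}, failed=4200)` selects processor 5200.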

FIG. 1 is a block diagram showing an outline of the present invention.
FIG. 2 is a block diagram of a control device according to an embodiment of the present invention.
FIG. 3 is a diagram showing the configuration of the cache of a controller according to an embodiment of the present invention.
FIG. 4 is a flowchart showing the operation of I/O processing from the host by a controller according to an embodiment of the present invention.
FIG. 5 is a flowchart showing the operation of storing data in the cache of a controller to a disk device according to an embodiment of the present invention.
FIG. 6 is a flowchart showing the operation of a controller that has received a processing request from another controller according to an embodiment of the present invention.
FIG. 7 is a flowchart showing the operation of a controller that has detected a failure of another controller according to an embodiment of the present invention.
FIG. 8 is a flowchart showing the recovery processing of a failed controller according to an embodiment of the present invention.
FIG. 9 is a diagram showing the cache management method used in a controller according to an embodiment of the present invention.
FIG. 10 is a diagram showing the configuration of the cache of a controller according to another embodiment of the present invention.

Explanation of symbols

10/11: Host computer
20: Control device
30/40: Controller
31/41: Cache memory for controller A
32/42: Cache memory for controller B
33/43: Cache memory
50: Disk device
1000/1100/1200/1300: Host computer
2000: Control device
3000/4000/5000/6000: Controller
3100/4100/5100/6100: Host I/F control unit
3200/4200/5200/6200: Microprocessor
3300/4300/5300/6300: Data transfer control unit
3400/4400/5400/6400: Cache
3500/4500/5500/6500: DRV I/F control unit
7000/7100: Disk device group

Claims

1. A storage subsystem comprising:
a storage device that stores data of a host computer and has a plurality of storage areas; and
a control device comprising a plurality of controllers, each of which controls the storage device based on an instruction from the host computer, controls data transfer between the host computer and the disk device, and has a cache memory having a plurality of areas that temporarily hold data transferred between the host computer and the storage device, and a path connecting the plurality of controllers,
wherein each controller is allocated at least one of the plurality of storage areas of the storage device, at least one of the plurality of areas of the cache memory of that controller, and at least one of the plurality of areas of the cache memory of another controller connected by the path.

2. The storage subsystem according to claim 1, wherein the controller writes data transferred from the host computer into a plurality of the cache memories assigned to the controller.

3. The storage subsystem according to claim 2, wherein when a failure occurs in the controller, the other controller performs processing for the storage area of the storage device for which the failed controller was responsible.

4. The storage subsystem according to claim 3, wherein the other controller is a hot standby controller, and a storage area of a cache memory is not allocated to the hot standby controller.

5. The storage subsystem according to claim 1, wherein the control device has a path connecting the plurality of controllers, and when the controller receives from a host computer a processing request for a storage area of the storage device allocated to another controller, the controller communicates the processing request to the other controller.

6. The storage subsystem according to claim 1, wherein the division of the cache area is changed according to a load on the controller.

7. A storage subsystem comprising:
a magnetic disk device having a plurality of logical volumes that store data of a host computer; and
a control device having a plurality of controllers, each of which controls the magnetic disk device based on an instruction from the host computer and has a cache memory having a plurality of areas that temporarily hold data transferred between the host computer and the disk device, and a data transfer control unit that is connected to the cache memory and controls transfer of the data,
wherein each controller is assigned at least one of the plurality of logical volumes of the magnetic disk device, at least one of the plurality of areas of the cache memory of that controller, and at least one of the plurality of areas of the cache memory of another controller.

8. The storage subsystem according to claim 7, wherein the controller writes data transferred from the host computer to the cache memory area of the controller assigned to the controller and to the cache memory area of another controller assigned to the controller.

9. The storage subsystem according to claim 8, wherein when a failure occurs in the controller, the other controller performs processing of the logical volume for which the failure controller was responsible.

10. The storage subsystem according to claim 9, wherein the other controller is a hot standby controller, and a storage area of a cache memory is not allocated to the hot standby controller.

11. The storage subsystem according to claim 7, wherein when the controller receives from a host computer a processing request for a logical volume assigned to another controller, the data transfer control unit of the controller transfers the processing request to the other controller through one path, and the other controller that has received the processing request performs processing on the logical volume and transfers a processing result to the controller.

12. The storage subsystem according to claim 7, wherein the division of the cache area is changed according to a load on the controller.

13. The storage subsystem according to claim 7, wherein the path between the controllers includes a first path that connects the two controllers and a second path that connects the set of the two controllers.

14. The storage subsystem according to claim 13, wherein when a controller is added to the control device, the controller is added in units of two of the controllers.

15. A storage subsystem comprising:
a storage device that stores data of a host computer and has a plurality of storage areas; and
a control device comprising a plurality of controllers, each of which controls the storage device based on an instruction from the host computer, controls data transfer between the host computer and the storage device, and has a cache memory having a plurality of areas that temporarily hold data transferred between the host computer and the storage device, and a path connecting the plurality of controllers,
wherein each controller is allocated at least one of the plurality of storage areas of the storage device, at least one of the plurality of areas of the cache memory of that controller, and at least one of the plurality of areas of the cache memory of another controller connected by the path, and
wherein data transferred from the host computer is written to the cache memory area of the controller allocated to the controller and to the cache memory area of another controller allocated to the controller.

16. The storage subsystem according to claim 15, wherein when a failure occurs in the controller, the other controller performs processing for the storage area of the storage device for which the failed controller was responsible.

17. The storage subsystem according to claim 15, wherein the other controller is a hot standby controller, and the storage area of the cache memory is not allocated to the hot standby controller.

18. The storage subsystem according to claim 15, wherein the control device has a path connecting the plurality of controllers, and when the controller receives from a host computer a processing request for a storage area of the storage device allocated to another controller, the controller communicates the processing request to the other controller.

19. The storage subsystem according to claim 15, wherein the division of the cache area is changed according to a load on the controller.