JP2007141264A

JP2007141264A - Storage device system

Info

Publication number: JP2007141264A
Application number: JP2007035620A
Authority: JP
Inventors: Naoto Matsunami; 直人松並; Takashi Oeda; 高大枝; Akira Yamamoto; 山本　　彰; Yasuyuki Ajimatsu; 康行味松; Masahiko Sato; 雅彦佐藤
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1998-12-22
Filing date: 2007-02-16
Publication date: 2007-06-07

Abstract

PROBLEM TO BE SOLVED: To construct a storage device system suited to the scale and requirements of a computer system, so that future expansion and improved reliability of the storage device system can be achieved easily.

SOLUTION: A storage device system 1 includes a plurality of subsets 10, each having a storage device for holding data and a controller for controlling the storage device, and a switch device 20 disposed between the subsets 10 and a host 30. The switch device 20 includes a management table holding management information for managing the configuration of the storage device system 1. In accordance with the management information, the switch device 20 converts address information contained in frame information output by the host 30 and distributes the frame information to the subsets 10.

COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to a method for realizing a disk control system that controls a plurality of disk devices, and more particularly to a method for increasing the speed, reducing the cost, and improving the cost performance of such a disk control system.

Disk array systems, which control a plurality of disk devices, are one kind of storage device system used in computer systems. A disk array system is disclosed, for example, in Non-Patent Document 1.

A disk array is a technology that operates a plurality of disk devices in parallel to achieve higher speed than a storage device system that uses a single disk device.

One method of interconnecting a plurality of disk array systems with a plurality of hosts uses a Fibre Channel fabric. An example of a computer system to which this method is applied is shown in Non-Patent Document 2.

In the computer system disclosed there, a plurality of host computers (hereinafter simply called hosts) and a plurality of disk array systems are each connected to a fabric device via Fibre Channel. The fabric device is a Fibre Channel switch that connects transfer paths between arbitrary devices attached to it. The fabric device is transparent to the transfer of "frames", which are Fibre Channel packets, so a host and a disk array system communicate point to point without being aware of the fabric device.

[Non-Patent Document 1] David A. Patterson and two others, "A Case for Redundant Arrays of Inexpensive Disks (RAID)", In Proc. ACM SIGMOD, USA, June 1988, pp. 109-116.
[Non-Patent Document 2] "Serial SCSI is finally on the market", Nikkei Electronics, no. 639, July 3, 1995, p. 79, FIG. 3.

In a conventional disk array system, when the number of disk devices is increased for greater capacity and a controller with performance matching that number is sought for higher performance, the performance limits of the controller's internal bus and of the processor that performs transfer control become apparent. To cope with these limits, the internal bus is expanded and the number of processors is increased. However, this approach complicates the controller configuration because many buses must be controlled, and it complicates the control software and adds overhead because of, for example, exclusive control of data shared among processors. As a result, cost rises sharply while performance levels off, and cost performance deteriorates. Moreover, although such a device can deliver performance commensurate with its cost in a large-scale system, it is poorly suited to systems that are not so large, its expandability is limited, and it lengthens the development period and raises development cost.

By arranging a plurality of disk array systems side by side and interconnecting them with a fabric device, the capacity and performance of the system as a whole can be increased. With this method, however, the disk array systems are entirely unrelated to one another, and even if accesses concentrate on a particular disk array system they cannot be distributed to the others, so higher performance is not achieved in practical use. In addition, the capacity of a logical disk device as seen from a host (called a logical unit) is limited to the capacity of a single disk array system, so a larger logical unit cannot be realized.

When the reliability of the disk array systems as a whole is to be raised, a mirrored configuration of two disk array systems can be realized with the mirroring function provided by the host. However, control overhead for mirroring arises on the host and limits system performance. Furthermore, when many disk array systems exist individually in a system, the burden on the system administrator increases. Management cost therefore rises, for example because many maintenance personnel and maintenance fees for multiple units are required. In addition, because the disk array systems and the fabric device are independent devices, their settings must be made by a different method for each device. Operating cost therefore rises with administrator training and longer operation time.

An object of the present invention is to solve these problems of the prior art and to realize a storage device system that can be constructed to suit the scale and requirements of a computer system and that can easily accommodate future expansion and improved reliability.

A storage device system of the present invention includes: a plurality of storage device subsystems, each having a storage device with a storage medium for holding data and a control device for controlling the storage device; a first interface node connected to a computer that uses the data held in the storage device subsystems; a plurality of second interface nodes, each connected to one of the storage device subsystems; and transfer means, to which the first interface node and the second interface nodes are connected, for transferring frames between the first interface node and the second interface nodes.

Preferably, the first interface node has a configuration management table that stores configuration information of the storage device system. In response to a frame sent from the computer, the first interface node analyzes the frame, converts the information on the frame's transfer destination based on the configuration information held in the configuration management table, and forwards the frame to the transfer means.

When transferring a frame, the first interface node adds to the frame the node address information of the node that should receive it. The transfer means transfers the frame according to the added node address information. The second interface node removes the node address information from the frame received from the transfer means, re-forms the frame, and transfers it to the target storage device subsystem.
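The add, route, and strip steps described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `Frame`, `SwitchPacket`, `encapsulate`, `transfer`, and `decapsulate`, the reduced set of frame fields, and the use of per-node inbox lists to stand in for the transfer means are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """A host frame, reduced to the fields this sketch needs (assumed)."""
    s_id: int
    d_id: int
    payload: bytes


@dataclass(frozen=True)
class SwitchPacket:
    """A frame plus the address of the node that should receive it."""
    dest_node: int
    frame: Frame


def encapsulate(frame: Frame, dest_node: int) -> SwitchPacket:
    """First interface node: attach node address information to the frame."""
    return SwitchPacket(dest_node=dest_node, frame=frame)


def transfer(packet: SwitchPacket, inboxes: dict[int, list[SwitchPacket]]) -> None:
    """Transfer means: deliver using only the attached node address."""
    inboxes[packet.dest_node].append(packet)


def decapsulate(packet: SwitchPacket) -> Frame:
    """Second interface node: drop the node address and re-form the frame."""
    return packet.frame


# Hypothetical node numbers and frame contents, for illustration only.
inboxes = {0: [], 1: []}
sent = Frame(s_id=0x010200, d_id=0x010300, payload=b"READ")
transfer(encapsulate(sent, dest_node=1), inboxes)
received = decapsulate(inboxes[1][0])
```

The point of the wrapper is that the transfer step never looks inside the frame, so the frame that leaves the second interface node is the same one the first interface node produced.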

In one aspect of the present invention, the storage device system has a management processor connected to the transfer means. The management processor sets configuration information in the configuration management table in accordance with instructions from an operator. The configuration information includes information that restricts access from the computer.

According to the present invention, a storage device system can be realized in which expansion, improved reliability, and the like can easily be achieved in accordance with the scale and requirements of the computer system.

[First Embodiment]
FIG. 1 is a configuration diagram of one embodiment of a computer system using a disk array system to which the present invention is applied.

Reference numeral 1 denotes a disk array system, and 30 denotes a host computer (host) to which the disk array system is connected. The disk array system 1 has disk array subsets 10, disk array switches 20, disk array system configuration management means 70 that manages the settings of the entire disk array system, and a communication interface (communication I/F) 80 between the disk array switches 20 and the disk array system configuration management means 70 and between the disk array subsets 10 and the disk array system configuration management means 70. Each host 30 is connected to the disk array system 1 by a host interface (host I/F) 31, which connects to a disk array switch 20 of the disk array system 1. Inside the disk array system 1, the disk array switches 20 and the disk array subsets 10 are connected by disk array interfaces (disk array I/F) 21.

The figure shows four hosts 30 and four disk array subsets 10, but these numbers are arbitrary and not limited. The number of hosts 30 may differ from the number of disk array subsets 10. In this embodiment the disk array switch 20 is duplicated, as shown. Each host 30 and each disk array subset 10 is connected to both of the duplicated disk array switches 20 by separate host I/Fs 31 and disk array I/Fs 21, respectively. Thus, even if one disk array switch 20, host I/F 31, or disk array I/F 21 fails, the host 30 can still access the disk array system 1 through the other, which provides high availability. Such duplication is not essential, however, and can be chosen according to the reliability level required of the system.

FIG. 2 is a configuration diagram showing one configuration example of the disk array subset 10. Reference numeral 101 denotes an upper adapter that interprets commands from the upper-level system (host 30), performs cache hit/miss determination, and controls data transfer between the upper-level system and the cache. 102 denotes a memory that serves both as a cache for speeding up disk data access and as a shared memory for storing data shared among the multiple processors (hereinafter called the cache/shared memory). 104 denotes a plurality of disk units housed in the disk array subset 10. 103 denotes a lower adapter that controls the disk units 104 and controls data transfer between the disk units 104 and the cache. 106 denotes disk array subset configuration management means, which communicates via the communication I/F 80 with the disk array system configuration management means 70 that manages the entire disk array system 1, and handles management tasks such as setting configuration parameters and reporting failure information.

The upper adapter 101, the cache/shared memory 102, and the lower adapter 103 are each duplicated. As with the duplication of the disk array switch 20 described above, this is done to achieve high availability and is not essential. Each disk unit 104 can be controlled from either of the duplicated lower adapters 103. In this embodiment the same memory means is shared by the cache and the shared memory to reduce cost, but the two can of course be separated.

The upper adapter 101 includes an upper MPU 1010 that controls the upper adapter 101, a disk array I/F controller 1011 that controls the disk array I/F 21, which is the connection interface to the upper-level system (that is, the disk array switch 20), and an upper bus 1012 that carries communication and data transfer among the cache/shared memory 102, the upper MPU 1010, and the disk array I/F controller 1011.

The figure shows one disk array I/F controller 1011 for each upper adapter 101, but a plurality of disk array I/F controllers 1011 may be provided for one upper adapter.

The lower adapter 103 includes a lower MPU 1030 that controls the lower adapter 103, disk I/F controllers 1031 that control the disk I/F, which is the interface to the disk units 104, and a lower bus 1032 that carries communication and data transfer among the cache/shared memory 102, the lower MPU 1030, and the disk I/F controllers 1031.

The figure shows four disk I/F controllers 1031 for each lower adapter 103, but the number is arbitrary and can be changed according to the configuration of the disk array and the number of connected disks.

FIG. 3 is a configuration diagram showing one configuration example of the disk array switch 20. Reference numeral 200 denotes a management processor (MP) that controls and manages the entire disk array switch; 201 denotes a crossbar switch that forms n × n mutual switch paths; 202 denotes a disk array I/F node provided for each disk array I/F 21; 203 denotes a host I/F node provided for each host I/F 31; and 204 denotes a communication controller that communicates with the disk array system configuration management means 70. 2020 denotes a path connecting a disk array I/F node 202 to the crossbar switch 201; 2030 denotes a path connecting a host I/F node 203 to the crossbar switch 201; 2040 denotes an inter-cluster I/F for connecting to another disk array switch 20 to form a cluster; and 2050 denotes a path connecting the MP 200 to the crossbar switch 201.

FIG. 4 is a configuration diagram showing the structure of the crossbar switch 201. Reference numeral 2010 denotes a switching port (SWP), a port to which the paths 2020, 2030, and 2050 and the inter-cluster I/F 2040 are connected. All SWPs 2010 have the same structure and perform switching control of the transfer path from one SWP to another. The figure shows transfer paths for only one SWP, but similar transfer paths exist among all SWPs.

FIG. 5 is a configuration diagram showing one configuration example of the host I/F node 203. To make the description concrete, this embodiment assumes that Fibre Channel is used for both the host I/F 31 and the disk array I/F 21. Interfaces other than Fibre Channel can of course be used for the host I/F 31 and the disk array I/F 21. Using the same interface for both allows the host I/F node 203 and the disk array I/F node 202 to have the same structure. In this embodiment, the disk array I/F node 202 is configured in the same way as the host I/F node 203 shown in the figure. The description below takes the host I/F node 203 as an example.

Reference numeral 2021 denotes a search processor (SP) that determines the node to which a received Fibre Channel frame (hereinafter simply called a frame) should be transferred. 2023 denotes an interface controller (IC) that transmits and receives frames to and from the host 30 (or, in the case of a disk array I/F node 202, the disk array subset 10). 2022 denotes a switching controller (SC) that converts a frame received by the IC 2023 on the basis of the result of the search by the SP 2021. 2024 denotes a packet generation unit (SPG) that packetizes a frame converted by the SC 2022 into a form that can pass through the crossbar switch 201 for transfer to another node. 2025 denotes a frame buffer (FB) that temporarily stores received frames. 2026 denotes an exchange table (ET) that manages exchange numbers for identifying an exchange, which is the sequence of frames corresponding to one disk array access request command (hereinafter simply called a command) from a host. 2027 denotes a disk array configuration management table (DCT) that stores the configuration information of the plurality of disk array subsets 10.

For performance, it is desirable that every component of the disk array switch 20 be implemented in hardware logic. However, if the required performance can be met, the functions of the SP 2021 and the SC 2022 can also be realized by program control on a general-purpose processor.

Each disk array subset 10 manages its disk units 104 as one or more logical disk units. Such a logical disk unit is called a logical unit (LU). LUs need not correspond one-to-one to physical disk units 104: a plurality of LUs may be configured on one disk unit 104, or one LU may be configured from a plurality of disk units 104.

Seen from outside the disk array subset 10, one LU is recognized as one disk device. In this embodiment, the disk array switch 20 builds a further logical LU on top of these LUs, and the host 30 operates by accessing that LU. In this specification, an LU recognized by the host 30 is called an independent LU (ILU) when it consists of one LU, and an integrated LU (CLU) when it consists of a plurality of LUs.

FIG. 12 shows the correspondence of address spaces between the layers when one integrated LU is composed of the LUs of four disk array subsets. In the figure, 1000 denotes, as an example, the address space of one integrated LU of the disk array system 1 as seen from host "#2"; 1100 denotes the address spaces of the LUs of the disk array subsets 10; and 1200 denotes the address spaces of the disk units 104 (shown only for disk array subset "#0").

Here, the LU of each disk array subset 10 is assumed to be configured as a RAID 5 (Redundant Arrays of Inexpensive Disks Level 5) disk array of four disk units 104. The LUs of the four disk array subsets 10 have capacities n0, n1, n2, and n3, respectively. The disk array switch 20 integrates the address spaces of these four LUs into a single address space of capacity (n0 + n1 + n2 + n3), thereby realizing the integrated LU recognized by the host 30.

In this embodiment, for example, when host #2 accesses area A 1001, the disk array switch 20 converts the access request specifying area A 1001 into a request to access area A′ 1101 of the LU of disk array subset #0 and transfers it to disk array subset #0. Disk array subset #0 then maps area A′ 1101 to area A″ 1201 on the disk units 104 and performs the access. The mapping between address space 1000 and address space 1100 is performed on the basis of the configuration information held in the DCT 2027 of the disk array switch 20. The details of this processing are described later. Mapping within a disk array subset is a well-known technique, and its detailed description is omitted in this specification.
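As a rough illustration of the first mapping step (from the integrated LU's address space to the address space of one subset's LU), the following sketch locates the subset by cumulative capacity. It assumes the LUs are concatenated in subset order and that addresses are plain block numbers; the function name `map_joined_address` and the example capacities are hypothetical and do not come from the specification.

```python
from bisect import bisect_right
from itertools import accumulate


def map_joined_address(host_lba: int, lu_sizes: list[int]) -> tuple[int, int]:
    """Map a block address in the integrated LU to (subset index, block address in that LU).

    lu_sizes holds the capacities n0, n1, ... of the constituent LUs, in the
    order in which they are concatenated (an assumption of this sketch).
    """
    if not lu_sizes:
        raise ValueError("at least one LU is required")
    ends = list(accumulate(lu_sizes))  # end offset (exclusive) of each LU
    if not 0 <= host_lba < ends[-1]:
        raise ValueError("address outside the integrated LU")
    subset = bisect_right(ends, host_lba)  # first LU whose end lies beyond host_lba
    start = ends[subset] - lu_sizes[subset]
    return subset, host_lba - start


# Hypothetical capacities n0..n3, in blocks.
sizes = [100, 200, 50, 150]
print(map_joined_address(250, sizes))  # (1, 150)
```

With these capacities the integrated LU spans 500 blocks, and block 250 falls 150 blocks into the second LU.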

In this embodiment, the DCT 2027 includes a system configuration table and a subset configuration table. FIG. 6 shows the structure of the system configuration table, and FIG. 7 shows the structure of the subset configuration table.

As shown in FIG. 6, the system configuration table 20270 has a host LU configuration table 20271, which holds information indicating the configuration of the host LUs, and a disk array I/F node configuration table 20272, which indicates the connections between the disk array I/F nodes 202 of the disk array switch 20 and the disk array subsets 10.

For each LU as seen from the host 30, the host LU configuration table 20271 has: Host-LU No., a number that identifies the LU; LU Type, CLU Class, and CLU Stripe Size, which indicate the attributes of the LU; Condition, which indicates the state of the host LU; and LU information (LU Info.), which is information on the LUs of the disk array subsets 10 that make up the host LU.

LU Type indicates the kind of LU, that is, whether the host LU is a CLU or an ILU. CLU Class indicates, when LU Type shows that the host LU is a CLU, whether its class is "Joined", "Mirrored", or "Striped". "Joined" indicates a CLU in which several LUs are concatenated into one large storage space, as described with reference to FIG. 11. "Mirrored" indicates an LU duplicated across two LUs, as described later in the sixth embodiment. "Striped" indicates an LU composed of a plurality of LUs across which the data is distributed, as described later in the seventh embodiment. CLU Stripe Size indicates the striping size (the size of the block that serves as the unit of data distribution) when CLU Class is "Striped".
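To show how a stripe size could enter the address calculation for a "Striped" CLU, here is a small sketch. It assumes conventional round-robin striping across equally sized LUs; the seventh embodiment is not part of this section, so the actual layout may differ, and the function name `map_striped_address` is hypothetical.

```python
def map_striped_address(host_lba: int, stripe_size: int, lu_count: int) -> tuple[int, int]:
    """Map a block address in a striped LU to (LU index, block address in that LU).

    Assumes round-robin placement: stripe 0 on LU 0, stripe 1 on LU 1, and so
    on, wrapping around after lu_count stripes.
    """
    if host_lba < 0 or stripe_size <= 0 or lu_count <= 0:
        raise ValueError("invalid argument")
    stripe_index, offset = divmod(host_lba, stripe_size)
    row, lu = divmod(stripe_index, lu_count)
    return lu, row * stripe_size + offset


# Hypothetical values: 8-block stripes over 4 LUs.
print(map_striped_address(35, stripe_size=8, lu_count=4))  # (0, 11)
```

Block 35 lies in stripe 4 at offset 3; stripe 4 wraps back to LU 0 on its second row, giving block 11 within that LU.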

Condition indicates one of four states: "Normal", "Warning", "Fault", and "Not Defined". "Normal" indicates that the host LU is in a normal state. "Warning" indicates that the host LU is running in degraded mode, for example because a failure has occurred in one of the disk units corresponding to an LU that makes up the host LU. "Fault" indicates that the host LU cannot be operated, for example because of a failure of a disk array subset 10. "Not Defined" indicates that no host LU is defined for the corresponding Host-LU No.

For each LU that makes up the host LU, LU Info. includes information identifying the disk array subset 10 to which the LU belongs, its LUN within that disk array subset, and its size. When the host LU is an ILU, information on its single LU is registered. When the host LU is a CLU, information on every LU that makes it up is registered. For example, in the figure, the host LU whose Host-LU No. is "0" is a CLU composed of four LUs, namely LUN "0" of disk array subset "#0", LUN "0" of disk array subset "#1", LUN "0" of disk array subset "#2", and LUN "0" of disk array subset "#3", and its CLU class is "Joined".
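One way to picture a row of the host LU configuration table is as a small record type. The sketch below models the fields named above and builds the Host-LU "0" example; the class and field names, the `problems` consistency check, the placeholder sizes, and the "Normal" condition chosen for the example row are assumptions, since the figure's actual values are not reproduced in this text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LUType(Enum):
    ILU = "ILU"
    CLU = "CLU"


class CLUClass(Enum):
    JOINED = "Joined"
    MIRRORED = "Mirrored"
    STRIPED = "Striped"


class Condition(Enum):
    NORMAL = "Normal"
    WARNING = "Warning"
    FAULT = "Fault"
    NOT_DEFINED = "Not Defined"


@dataclass(frozen=True)
class LUInfo:
    subset_no: int  # disk array subset that holds the LU
    lun: int        # LUN within that subset
    size: int       # capacity; the unit (blocks) is an assumption


@dataclass(frozen=True)
class HostLU:
    host_lu_no: int
    lu_type: LUType
    condition: Condition
    lu_info: tuple[LUInfo, ...]
    clu_class: Optional[CLUClass] = None
    clu_stripe_size: Optional[int] = None

    def problems(self) -> list[str]:
        """Return consistency problems implied by the field descriptions."""
        found = []
        if self.lu_type is LUType.ILU and len(self.lu_info) != 1:
            found.append("an ILU must have exactly one LU")
        if self.lu_type is LUType.CLU and self.clu_class is None:
            found.append("a CLU needs a CLU Class")
        if self.clu_class is CLUClass.STRIPED and not self.clu_stripe_size:
            found.append("a Striped CLU needs a CLU Stripe Size")
        return found


PLACEHOLDER_SIZE = 1000  # hypothetical; stands in for n0..n3
host_lu_0 = HostLU(
    host_lu_no=0,
    lu_type=LUType.CLU,
    condition=Condition.NORMAL,  # assumed for the example
    lu_info=tuple(LUInfo(subset_no=n, lun=0, size=PLACEHOLDER_SIZE) for n in range(4)),
    clu_class=CLUClass.JOINED,
)
```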

The disk array I/F node configuration table 20272 holds, for each port of a disk array subset 10 to which a disk array I/F 21 is connected, information indicating which disk array I/F node 202 of which disk array switch 20 is connected to that port.

Specifically, the table has Subset No., which identifies the disk array subset 10; Subset Port No., which identifies the port; Switch No., which identifies the disk array switch 20 connected to that port; and I/F Node No., which identifies the disk array I/F node 202 of that disk array switch 20. When a disk array subset 10 has a plurality of ports, this information is set for each port.

As shown in FIG. 7, the subset configuration table has a plurality of tables 202720 to 202723, one for each disk array subset 10. Each table includes a RAID group configuration table 202730, which holds information indicating the configuration of the RAID groups built in the disk array subset 10, and an LU configuration table 202740, which holds information indicating the configuration of the LUs built in the disk array subset 10.

ＲＡＩＤグループ構成テーブル202730は、ＲＡＩＤグループに付加された番号を示すGroup No.、そのＲＡＩＤグループのレベルを示すLevel、そのＲＡＩＤグループを構成するディスクの数を示す情報であるDisks、そのＲＡＩＤグループがＲＡＩＤレベル０，５等のストライピングされた構成の場合、そのストライプサイズを示すStripe Sizeを情報として含む。例えば、図に示されるテーブルにおいて、ＲＡＩＤグループ“０”は、４台のディスクユニットにより構成されたＲＡＩＤグループであり、ＲＡＩＤレベルが５、ストライプサイズがＳ０である。 The RAID group configuration table 202730 includes, as information, Group No., which indicates the number assigned to the RAID group; Level, which indicates the RAID level of the group; Disks, which indicates the number of disks constituting the group; and Stripe Size, which indicates the stripe size when the RAID group has a striped configuration such as RAID level 0 or 5. For example, in the table shown in the figure, RAID group “0” is composed of four disk units, its RAID level is 5, and its stripe size is S0.

ＬＵ構成テーブル202740は、ＬＵに付加された番号（ＬＵＮ）を示すLU No.、このＬＵがどのＲＡＩＤグループに構成されているのかを示すRAID Group、ＬＵの状態を示すCondition、このＬＵのサイズ（容量）を示すSize、このＬＵがディスクアレイサブセット１０のどのポートからアクセス可能なのかを示すPort、及びその代替となるポートを示すAlt. Portを情報として含む。Conditionで示される状態は、ホストＬＵについてのConditionと同様、“Normal”、“Warning”、“Fault”、“Not Defined”の４種類がある。Alt. Portに設定された情報により特定されるポートは、Portに設定された情報で特定されるポートに障害が発生したときに用いられるが、単に複数のポートから同一のＬＵをアクセスするために用いることもできる。 The LU configuration table 202740 includes, as information, LU No., which indicates the number (LUN) assigned to the LU; RAID Group, which indicates the RAID group in which the LU is configured; Condition, which indicates the state of the LU; Size, which indicates the size (capacity) of the LU; Port, which indicates the port of the disk array subset 10 through which the LU can be accessed; and Alt. Port, which indicates the alternative port. As with the Condition of a host LU, there are four states: “Normal”, “Warning”, “Fault”, and “Not Defined”. The port specified by Alt. Port is used when a failure occurs in the port specified by Port, but it can also be used simply to access the same LU from a plurality of ports.
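
The role of Port and Alt. Port can be illustrated with a short sketch. The Python below models one row of each table and picks the port to use when some ports have failed. All names and the numeric stripe size standing in for S0 are assumptions made for illustration; the specification does not define a selection routine.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class RaidGroup:
    group_no: int
    level: int                  # RAID level, e.g. 5
    disks: int                  # number of disks in the group
    stripe_size: Optional[int]  # None when the level is not striped


@dataclass
class LUConfig:
    lu_no: int
    raid_group: int
    condition: str              # "Normal", "Warning", "Fault", "Not Defined"
    size: int
    port: int
    alt_port: Optional[int]


def select_port(lu: LUConfig, failed_ports: Set[int]) -> Optional[int]:
    """Return Port if it is healthy, otherwise Alt. Port, otherwise None."""
    if lu.port not in failed_ports:
        return lu.port
    if lu.alt_port is not None and lu.alt_port not in failed_ports:
        return lu.alt_port
    return None


# RAID group 0 from the example: four disks, RAID level 5 (64 is a stand-in for S0).
group0 = RaidGroup(group_no=0, level=5, disks=4, stripe_size=64)
lu0 = LUConfig(lu_no=0, raid_group=0, condition="Normal", size=1000, port=0, alt_port=1)
```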

図８は、ファイバチャネルにおけるフレームの構成図である。ファイバチャネルのフレーム４０は、フレームの先頭を示すＳＯＦ（Start Of Frame）４００、フレームヘッダ４０１、転送の実態データを格納する部位であるフレームペイロード４０２、３２ビットのエラー検出コードであるＣＲＣ（Cyclic Redundancy Check）４０３、フレームの最後尾を示すＥＯＦ（End Of Frame）４０４を含む。フレームヘッダ４０１は、図９に示すような構造になっており、フレーム転送元のＩＤ（S_ID）、フレーム転送先のＩＤ（D_ID)、エクスチェンジの起動元、応答先が指定するそれぞれのエクスチェンジＩＤ（OX_ID、RX_ID）、エクスチェンジ中のフレームグループを指定するシーケンスのＩＤ（SEQ_ID）等が格納されている。 FIG. 8 shows the structure of a Fibre Channel frame. The Fibre Channel frame 40 includes an SOF (Start Of Frame) 400 indicating the head of the frame, a frame header 401, a frame payload 402 that stores the actual data being transferred, a CRC (Cyclic Redundancy Check) 403 that is a 32-bit error detection code, and an EOF (End Of Frame) 404 indicating the end of the frame. The frame header 401 has the structure shown in FIG. 9 and stores the ID of the frame transfer source (S_ID), the ID of the frame transfer destination (D_ID), the exchange IDs specified by the originator and the responder of the exchange (OX_ID and RX_ID, respectively), the ID of the sequence that identifies a group of frames within the exchange (SEQ_ID), and so on.

本実施形態では、ホスト３０により発行されるフレームには、S_IDとしてホスト３０に割り当てられたＩＤが、また、D_IDとしてディスクアレイスイッチ２０のポートに割り当てられたＩＤが使用される。一つのホストコマンドに対し、１ペアのエクスチェンジＩＤ（OX_ID、RX_ID）が割り当てられる。複数のデータフレームを同一のエクスチェンジに対し発行する必要があるときは、その全データフレームに対して同一のSEQ_IDが割り当てられ、おのおのはシーケンスカウント（SEQ_CNT）で識別される。フレームペイロード４０２の最大長は２１１０バイトであり、フレーム種毎に格納される内容が異なる。例えば、後述するFCP_CMDフレームの場合、図１０に示すように、ＳＣＳＩのLogical Unit Number（ＬＵＮ）、Command Description Block（ＣＤＢ）等が格納される。ＣＤＢは、ディスク（ディスクアレイ）アクセスに必要なコマンドバイト、転送開始論理アドレス（ＬＢＡ）、転送長（ＬＥＮ）を含む。 In this embodiment, an ID assigned to the host 30 as S_ID and an ID assigned to the port of the disk array switch 20 as D_ID are used for the frame issued by the host 30. One pair of exchange IDs (OX_ID, RX_ID) is assigned to one host command. When it is necessary to issue a plurality of data frames to the same exchange, the same SEQ_ID is assigned to all the data frames, and each is identified by a sequence count (SEQ_CNT). The maximum length of the frame payload 402 is 2110 bytes, and the content stored for each frame type is different. For example, in the case of an FCP_CMD frame to be described later, as shown in FIG. 10, a SCSI Logical Unit Number (LUN), a Command Description Block (CDB), and the like are stored. The CDB includes a command byte necessary for disk (disk array) access, a transfer start logical address (LBA), and a transfer length (LEN).
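
As a rough illustration of how the LUN, LBA, and transfer length might be extracted from an FCP_CMD payload, the sketch below assumes a payload laid out as an 8-byte LUN field, a 4-byte control field, a 16-byte CDB, and a 4-byte data length, with the LUN in byte 1 of the LUN field and a SCSI READ(10) CDB. FIG. 10 is not reproduced here, so this layout and the function name are assumptions rather than the layout the specification defines.

```python
import struct

READ_10 = 0x28  # SCSI READ(10) operation code


def parse_fcp_cmd(payload: bytes):
    """Return (lun, lba, transfer_length) from an assumed FCP_CMD payload layout."""
    # Assumed layout: LUN field (8) | control (4) | CDB (16) | data length (4)
    lun_field, _control, cdb, _data_length = struct.unpack(">8s4s16sI", payload[:32])
    lun = lun_field[1]  # assumption: single-level addressing, LUN in byte 1
    if cdb[0] != READ_10:
        raise ValueError("only READ(10) is handled in this sketch")
    (lba,) = struct.unpack(">I", cdb[2:6])   # READ(10): bytes 2-5 hold the LBA
    (length,) = struct.unpack(">H", cdb[7:9])  # READ(10): bytes 7-8 hold the length
    return lun, lba, length
```

A switch that rewrites the LUN and LBA, as described for the first embodiment, would pack the converted values back into the same byte positions.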

以下、本実施形態のディスクアレイシステムの動作を説明する。 Hereinafter, the operation of the disk array system of this embodiment will be described.

ディスクアレイシステムを使用するのに先立ち、ディスクアレイスイッチ２０に対して、ディスクアレイサブセット１０の構成情報を設定する必要がある。システム管理者は、管理端末５からディスクアレイシステム構成手段７０を介して、すべてのディスクアレイサブセット１０およびディスクアレイスイッチ２０の構成設定情報を獲得する。管理者は、管理端末５から所望のシステム構成になるよう論理ユニットの構成設定、RAIDレベルの設定、障害発生時の交代パスの設定等、各種設定に必要な設定情報を入力する。ディスクアレイシステム構成管理手段７０は、その設定情報を受け、各ディスクアレイサブセット１０およびディスクアレイスイッチ２０に設定情報を転送する。なお、管理端末５における設定情報の入力については第５実施形態にて別途説明する。 Prior to using the disk array system, it is necessary to set the configuration information of the disk array subset 10 for the disk array switch 20. The system administrator acquires configuration setting information of all the disk array subsets 10 and the disk array switches 20 from the management terminal 5 through the disk array system configuration means 70. The administrator inputs setting information necessary for various settings, such as logical unit configuration settings, RAID level settings, and alternate path settings when a failure occurs, from the management terminal 5 so as to obtain a desired system configuration. The disk array system configuration management means 70 receives the setting information and transfers the setting information to each disk array subset 10 and the disk array switch 20. The input of setting information in the management terminal 5 will be described separately in the fifth embodiment.

ディスクアレイスイッチ２０では、通信コントローラ２０４が設定情報を獲得し、ＭＰ２００により各ディスクアレイサブセット１０のアドレス空間情報等の構成情報が設定される。ＭＰ２００は、クロスバスイッチ２０１経由で各ホストＩ／Ｆノード２０３およびディスクアレイＩ／Ｆノード２０２に、ディスクアレイサブセット１０の構成情報を配信する。 In the disk array switch 20, the communication controller 204 acquires setting information, and configuration information such as address space information of each disk array subset 10 is set by the MP 200. The MP 200 distributes the configuration information of the disk array subset 10 to each host I / F node 203 and disk array I / F node 202 via the crossbar switch 201.

各ノード２０３、および２０２はこの情報を受信すると、ＳＰ２０２１により構成情報をＤＣＴ２０２７に格納する。ディスクアレイサブセット１０では、ディスクアレイサブセット構成管理手段１０６が、設定情報を獲得し、共有メモリ１０２に格納する。各上位ＭＰＵ１０１０および下位ＭＰＵ１０３０は、共有メモリ１０２上の設定情報を参照し、各々の構成管理を実施する。 When the nodes 203 and 202 receive this information, the SP2021 stores the configuration information in the DCT 2027. In the disk array subset 10, the disk array subset configuration management means 106 acquires setting information and stores it in the shared memory 102. Each of the upper MPU 1010 and the lower MPU 1030 refers to setting information on the shared memory 102 and performs configuration management of each.

以下では、ホスト“＃２”がディスクアレイシステム1に対し、リードコマンドを発行した場合の動作を説明する。図１１に、ホストからのリード動作時にファイバチャネルを通して転送されるフレームのシーケンスを示す模式図を、図１３にこのときのディスクアレイスイッチのホストＩ／Ｆノード２０３における動作のフローチャートを示す。 The operation when the host “# 2” issues a read command to the disk array system 1 will be described below. FIG. 11 is a schematic diagram showing a sequence of frames transferred through the fiber channel during a read operation from the host, and FIG. 13 is a flowchart of the operation in the host I / F node 203 of the disk array switch at this time.

なお、以下の説明では、ホスト“＃２”が、図１２における記憶領域Ａ１００１をアクセスすることを仮定する。記憶領域Ａ１００１に対応する実際の記憶領域Ａ″は、ディスクアレイサブセット“＃０”のＬＵＮ＝０のＬＵを構成するディスクユニット＃２のアドレス空間内に存在するものとする。また、アドレス空間１０００を構成するＬＵを定義しているホストＬＵ構成テーブル20271のLU Typeには「ＣＬＵ」が、CLU Classには「Joined」が設定されているものとする。 In the following description, it is assumed that the host “#2” accesses the storage area A 1001 in FIG. 12. The actual storage area A″ corresponding to the storage area A 1001 is assumed to exist in the address space of disk unit #2, which constitutes the LU with LUN = 0 of disk array subset “#0”. It is also assumed that, in the host LU configuration table 20271 defining the LU that constitutes the address space 1000, LU Type is set to “CLU” and CLU Class is set to “Joined”.

データのリード時、ホスト３０は、リードコマンドを格納したコマンドフレーム「FCP_CMD」をディスクアレイスイッチ２０に発行する（図１１矢印（ａ））。ディスクアレイスイッチ２０のホストＩ／Ｆノード“＃２”は、ＩＣ２０２３によりホストＩ／Ｆ３１経由でコマンドフレーム「FCP_CMD」を受信する（ステップ20001）。ＩＣ２０２３は、ＳＣ２０２２にコマンドフレームを転送する。ＳＣ２０２２は、受け取ったコマンドフレームを一旦ＦＢ２０２５に格納する。この際、ＳＣ２０２２は、コマンドフレームのＣＲＣを計算し、受信情報が正しいことを検査する。ＣＲＣの検査に誤りがあれば、ＳＣ２０２２は、その旨をＩＣ２０２３に通知する。ＩＣ２０２３は、誤りの通知をＳＣ２０２２から受けると、ホストＩ／Ｆ３１を介してホスト３０にＣＲＣエラーを報告する。（ステップ20002）。 At the time of reading data, the host 30 issues a command frame “FCP_CMD” storing the read command to the disk array switch 20 (arrow (a) in FIG. 11). The host I / F node “# 2” of the disk array switch 20 receives the command frame “FCP_CMD” via the host I / F 31 by the IC 2023 (step 20001). The IC 2023 transfers the command frame to the SC 2022. The SC 2022 temporarily stores the received command frame in the FB 2025. At this time, the SC 2022 calculates the CRC of the command frame and checks whether the received information is correct. If there is an error in the CRC check, the SC 2022 notifies the IC 2023 to that effect. Upon receiving the error notification from the SC 2022, the IC 2023 reports a CRC error to the host 30 via the host I / F 31. (Step 20002).
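
The receive-side check in step 20002 amounts to recomputing a 32-bit CRC over the received frame and comparing it with the value carried in the frame. The sketch below uses Python's `zlib.crc32` purely as a stand-in for the Fibre Channel CRC; it does not reproduce the exact bit ordering or frame delimiters of a real Fibre Channel frame, and the function names are hypothetical.

```python
import zlib


def append_crc(body: bytes) -> bytes:
    """Append a 32-bit CRC (stand-in algorithm) to a frame body."""
    return body + zlib.crc32(body).to_bytes(4, "big")


def crc_ok(frame: bytes) -> bool:
    """Recompute the CRC over the body and compare it with the trailing 4 bytes."""
    body, received = frame[:-4], int.from_bytes(frame[-4:], "big")
    return zlib.crc32(body) == received


frame = append_crc(b"FCP_CMD example body")
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one bit in the body
```

In this sketch `crc_ok(frame)` holds and `crc_ok(corrupted)` does not, which corresponds to the case where the SC 2022 notifies the IC 2023 of the error.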

ＣＲＣが正しい場合、ＳＣ２０２２は、ＦＢ２０２５に保持したフレームをリードし、それがコマンドフレームであることを認識してフレームヘッダ４０１を解析する（ステップ20003）。そして、ＳＣ２０２２は、ＳＰ２０２１に指示し、S_ID、D_ID、OX_ID等のエクスチェンジ情報をＥＴ２０２６に登録する（ステップ20004）。 If the CRC is correct, the SC 2022 reads the frame held in the FB 2025, recognizes that it is a command frame, and analyzes the frame header 401 (step 20003). The SC 2022 then instructs the SP 2021 to register exchange information such as S_ID, D_ID, and OX_ID in the ET 2026 (step 20004).

次に、ＳＣ２０２２は、フレームペイロード４０２を解析し、ホスト３０により指定されたＬＵＮおよびＣＤＢを取得する（ステップ20005）。ＳＰ２０２１は、ＳＣ２０２２の指示により、ＤＣＴ２０２７を検索し、ディスクアレイサブセット１０の構成情報を得る。具体的には、ＳＰ２０２１は、ホストＬＵ構成テーブル20271を検索し、受信したフレームペイロード４０２に格納されたＬＵＮと一致するHost-LU No.を有する情報を見つける。ＳＰ２０２１は、LU Type、CLU Classに設定された情報からホストＬＵの構成を認識し、LU Info.に保持されている情報に基づきアクセスすべきディスクサブセット１０とその中のＬＵのＬＵＮ、及びこのＬＵ内でのＬＢＡを判別する。次に、ＳＰ２０２１は、サブセット構成テーブル202720のＬＵ構成テーブル202740を参照し、目的のディスクアレイサブセット１０の接続ポートを確認し、ディスクアレイＩ／Ｆノード構成テーブル20272からそのポートに接続するディスクアレイＩ／Ｆノード２０２のノードNo.を得る。ＳＰ２０２１は、このようにして得たディスクアレイサブセット１０を識別する番号、ＬＵＮ、ＬＢＡ等の変換情報をＳＣ２０２２に報告する。（ステップ20006）。 Next, the SC 2022 analyzes the frame payload 402 and acquires the LUN and CDB specified by the host 30 (step 20005). At the instruction of the SC 2022, the SP 2021 searches the DCT 2027 and obtains the configuration information of the disk array subset 10. Specifically, the SP 2021 searches the host LU configuration table 20271 and finds the entry whose Host-LU No. matches the LUN stored in the received frame payload 402. The SP 2021 recognizes the configuration of the host LU from the information set in LU Type and CLU Class and, based on the information held in LU Info., determines the disk array subset 10 to be accessed, the LUN of the LU within it, and the LBA within that LU. Next, the SP 2021 refers to the LU configuration table 202740 of the subset configuration table 202720, confirms the connection port of the target disk array subset 10, and obtains from the disk array I/F node configuration table 20272 the node No. of the disk array I/F node 202 connected to that port. The SP 2021 reports the conversion information obtained in this way, such as the number identifying the disk array subset 10, the LUN, and the LBA, to the SC 2022 (step 20006).
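
The address determination in step 20006 can be sketched for a CLU of class “Joined”. The Python below assumes that “Joined” means the component LUs are concatenated in the order they appear in LU Info, so that a host LBA is resolved by walking the list and subtracting sizes. The function name, the tuple layout, and the sizes are illustrative assumptions.

```python
from typing import List, Tuple

# Each LU Info element as (subset_no, lun, size_in_blocks); sizes are hypothetical.
LUInfo = Tuple[int, int, int]


def translate_joined(host_lba: int, lu_info: List[LUInfo]) -> Tuple[int, int, int]:
    """Map a host LBA to (subset_no, lun, lba_within_lu) for a concatenated CLU."""
    if host_lba < 0:
        raise ValueError("LBA must be non-negative")
    offset = host_lba
    for subset_no, lun, size in lu_info:
        if offset < size:
            return subset_no, lun, offset
        offset -= size  # skip past this component LU
    raise ValueError("LBA is beyond the end of the host LU")


# Host-LU No. 0 from the earlier example: LUN 0 of subsets #0 to #3.
host_lu0_info = [(0, 0, 1000), (1, 0, 1000), (2, 0, 1000), (3, 0, 1000)]
```

With these sizes, host LBA 3500 falls in the fourth component LU and resolves to subset 3, LUN 0, LBA 500. The resulting subset number would then be used to look up the connection port and the disk array I/F node number.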

次に、ＳＣ２０２２は、獲得した変換情報を使用しフレームペイロード４０２のＬＵＮとＣＤＢのなかのＬＢＡを変換する。また、フレームヘッダ４０１のD_IDを対応するディスクアレイサブセット１０のホストＩ／Ｆコントローラ１０１１のD_IDに変換する。なお、この時点ではS_IDは書き換えない（ステップ20007）。 Next, the SC 2022 converts the LUN of the frame payload 402 and the LBA in the CDB using the acquired conversion information. Further, the D_ID of the frame header 401 is converted into the D_ID of the host I / F controller 1011 of the corresponding disk array subset 10. At this time, S_ID is not rewritten (step 20007).

ＳＣ２０２２は、変換後のコマンドフレームと、対象ディスクアレイサブセット１０に接続するディスクアレイＩ／Ｆノード番号を、ＳＰＧ２０２４に転送する。ＳＰＧ２０２４は、受け取った変換後のコマンドフレームに対し、図１４に示すような簡単な拡張ヘッダ６０１を付加したパケットを生成する。このパケットをスイッチングパケット（S Packet）６０と呼ぶ。S Packet６０の拡張ヘッダ６０１には、転送元（自ノード）番号、転送先ノード番号、及び転送長が付加含まれる。ＳＰＧ２０２４は、生成したS Packet６０をクロスバスイッチ２０１に送信する（ステップ20008）。 The SC 2022 transfers the converted command frame and the disk array I / F node number connected to the target disk array subset 10 to the SPG 2024. The SPG 2024 generates a packet in which a simple extension header 601 as shown in FIG. 14 is added to the received command frame after conversion. This packet is called a switching packet (S Packet) 60. The extension header 601 of S Packet 60 additionally includes a transfer source (own node) number, a transfer destination node number, and a transfer length. The SPG 2024 transmits the generated S Packet 60 to the crossbar switch 201 (step 20008).
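
The S Packet is simply the converted frame with a small routing header in front of it. The sketch below assumes a 4-byte extension header made of a 1-byte source node number, a 1-byte destination node number, and a 2-byte transfer length; FIG. 14 is not reproduced here, so these field widths and the function names are assumptions.

```python
import struct

# Assumed extension header layout: source node, destination node, transfer length.
EXT_HEADER = struct.Struct(">BBH")


def make_s_packet(src_node: int, dst_node: int, frame: bytes) -> bytes:
    """Prefix a frame with the extension header to form an S Packet."""
    return EXT_HEADER.pack(src_node, dst_node, len(frame)) + frame


def strip_s_packet(packet: bytes):
    """Remove the extension header and return (src_node, dst_node, frame)."""
    src_node, dst_node, length = EXT_HEADER.unpack_from(packet)
    frame = packet[EXT_HEADER.size:EXT_HEADER.size + length]
    if len(frame) != length:
        raise ValueError("packet is shorter than its transfer length")
    return src_node, dst_node, frame


# Host I/F node #2 sends a converted command frame to disk array I/F node #0.
packet = make_s_packet(2, 0, b"converted FCP_CMD frame")
```

The crossbar switch needs only the destination node number from this header to select a path, which is why the frame itself can pass through unchanged.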

クロスバスイッチ２０１は、ホストＩ／Ｆノード“＃２”と接続するＳＷＰ２０１０によりS Packet６０を受信する。ＳＷＰ２０１０は、S Packet６０の拡張ヘッダ６０１を参照し、転送先のノードが接続するＳＷＰへのスイッチ制御を行って経路を確立し、S Packet６０を転送先のディスクアレイＩ／Ｆノード２０２（ここでは、ディスクアレイＩ／Ｆノード“＃０”）に転送する。ＳＷＰ２０１０は、経路の確立をS Packet６０の受信の度に実施し、S Packet６０の転送が終了したら、その経路を解放する。ディスクアレイＩ／Ｆノード“＃０”では、ＳＰＧ２０２４がS Packet６０を受信し、拡張ヘッダ６０１を外してコマンドフレームの部分をＳＣ２０２２に渡す。 The crossbar switch 201 receives the S Packet 60 by the SWP 2010 connected to the host I / F node “# 2”. The SWP 2010 refers to the extension header 601 of the S Packet 60, establishes a path by performing switch control to the SWP to which the transfer destination node is connected, and transfers the S Packet 60 to the transfer destination disk array I / F node 202 (here, To the disk array I / F node “# 0”). The SWP 2010 establishes a path each time the S Packet 60 is received, and releases the path when the transfer of the S Packet 60 is completed. In the disk array I / F node “# 0”, the SPG 2024 receives the S Packet 60, removes the extension header 601, and passes the command frame portion to the SC 2022.

ＳＣ２０２２は、受け取ったコマンドフレームのフレームヘッダのS_IDに自分のＩＤを書き込む。次にＳＣ２０２２は、ＳＰ２０２１に対し、コマンドフレームのS_ID、D_ID、OX_ID等のエクスチェンジ情報、及びフレーム転送元ホストＩ／Ｆノード番号をＥＴ２０２６に登録するよう指示し、ＩＣ２０２３にコマンドフレームを転送する。ＩＣ２０２３は、フレームヘッダ４０１の情報に従い、接続するディスクアレイサブセット１０（ここでは、ディスクアレイサブセット“＃０”）にコマンドフレームを転送する（図１１矢印（ｂ））。 The SC 2022 writes its own ID in the S_ID of the frame header of the received command frame. Next, the SC 2022 instructs the SP 2021 to register the exchange information such as S_ID, D_ID, and OX_ID of the command frame and the frame transfer source host I / F node number in the ET 2026, and transfers the command frame to the IC 2023. The IC 2023 transfers the command frame to the disk array subset 10 to be connected (here, disk array subset “# 0”) according to the information in the frame header 401 (arrow (b) in FIG. 11).

ディスクアレイサブセット“＃０”は、変換後のコマンドフレーム「FCP_CMD」をディスクアレイＩ／Ｆコントローラ１０１１で受信する。上位ＭＰＵ１０１０は、コマンドフレームのフレームペイロード４０２に格納されたＬＵＮとＣＤＢを取得し、指定された論理ユニットのＬＢＡからＬＥＮ長のデータをリードするコマンドであると認識する。 The disk array subset “# 0” receives the converted command frame “FCP_CMD” by the disk array I / F controller 1011. The upper MPU 1010 acquires the LUN and CDB stored in the frame payload 402 of the command frame, and recognizes that the command reads LEN length data from the LBA of the designated logical unit.

上位ＭＰＵ１０１０は、共有メモリ１０２に格納されたキャッシュ管理情報を参照し、キャッシュヒットミス／ヒット判定を行う。ヒットすればキャッシュ１０２からデータ転送を実施する。ミスの場合、ディスクユニットからデータをリードする必要があるので、ＲＡＩＤ５の構成に基づくアドレス変換を実施し、キャッシュ空間を確保する。そして、ディスクユニット２からのリード処理に必要な処理情報を生成し、下位ＭＰＵ１０３０に処理を引き継ぐべく、共有メモリ１０２に処理情報を格納する。 The upper MPU 1010 refers to the cache management information stored in the shared memory 102 and performs a cache hit/miss determination. On a hit, the data is transferred from the cache 102. On a miss, the data must be read from the disk unit, so the upper MPU 1010 performs address conversion based on the RAID 5 configuration and secures cache space. It then generates the processing information required for the read from the disk unit 2 and stores it in the shared memory 102 in order to hand the processing over to the lower MPU 1030.

下位ＭＰＵ１０３０は、共有メモリ１０２に処理情報が格納されたことを契機に処理を開始する。下位ＭＰＵ１０３０は、適切なディスクＩ／Ｆコントローラ１０３１を特定し、ディスクユニット２へのリードコマンドを生成して、ディスクＩ／Ｆコントローラ１０３１にコマンドを発行する。ディスクＩ／Ｆコントローラ１０３１は、ディスクユニット2からリードしたデータをキャッシュ１０２の指定されたアドレスに格納して下位ＭＰＵ１０３０に終了報告を通知する。下位ＭＰＵ１０３０は、処理が正しく終了したことを上位ＭＰＵ１０１０に通知すべく共有メモリ１０２に処理終了情報を格納する。 The lower MPU 1030 starts processing when the processing information is stored in the shared memory 102. The lower MPU 1030 identifies an appropriate disk I / F controller 1031, generates a read command for the disk unit 2, and issues the command to the disk I / F controller 1031. The disk I / F controller 1031 stores the data read from the disk unit 2 at the specified address of the cache 102 and notifies the lower MPU 1030 of the end report. The lower MPU 1030 stores the process end information in the shared memory 102 so as to notify the upper MPU 1010 that the process has been correctly completed.

上位ＭＰＵ１０１０は、共有メモリ１０２に処理終了情報が格納されたことを契機に処理を再開し、ディスクアレイＩ／Ｆコントローラ１０１１にリードデータ準備完了を通知する。ディスクアレイＩ／Ｆコントローラ１０１１は、ディスクアレイスイッチ２０の当該ディスクアレイＩ／Ｆノード“＃０”に対し、ファイバチャネルにおけるデータ転送準備完了フレームである「FCP_XFER_RDY」を発行する（図１１矢印（ｃ））。 The upper MPU 1010 resumes processing when the processing end information is stored in the shared memory 102, and notifies the disk array I/F controller 1011 that the read data is ready. The disk array I/F controller 1011 issues “FCP_XFER_RDY”, the Fibre Channel data transfer ready frame, to the disk array I/F node “#0” of the disk array switch 20 (arrow (c) in FIG. 11).

ディスクアレイＩ／Ｆノード“＃０”では、データ転送準備完了フレーム「FCP_XFER_RDY」を受信すると、ＳＣ２０２２が、ディスクアレイサブセット１０から受信した応答先エクスチェンジＩＤ（RX_ID）を獲得し、S_ID、D_ID、OX_IDを指定して、ＳＰ２０２１に指示しＥＴ２０２６の当該エクスチェンジ情報にRX_IDを登録する。ＳＣ２０２２は、データ転送準備完了フレームの転送先（コマンドフレームの転送元）のホストＩ／Ｆノード番号を獲得する。ＳＣ２０２２は、このフレームのS_IDを無効化し、ＳＰＧ２０２４に転送する。ＳＰＧ２０２４は、先に述べたようにしてS Packetを生成し、クロスバスイッチ２０１経由で対象ホストＩ／Ｆノード“＃２”に転送する。 When the disk array I/F node “#0” receives the data transfer ready frame “FCP_XFER_RDY”, the SC 2022 acquires the responder exchange ID (RX_ID) received from the disk array subset 10 and, specifying S_ID, D_ID, and OX_ID, instructs the SP 2021 to register the RX_ID in the corresponding exchange information in the ET 2026. The SC 2022 acquires the host I/F node number of the transfer destination of the data transfer ready frame (the transfer source of the command frame). The SC 2022 invalidates the S_ID of this frame and transfers it to the SPG 2024. The SPG 2024 generates an S Packet as described above and transfers it to the target host I/F node “#2” via the crossbar switch 201.

ホストＩ／Ｆノード“＃２”では、ＳＰＧ２０２４がデータ転送準備完了フレームのS Packetを受信すると、S Packetの拡張ヘッダを外し「FCP_XFER_RDY」を再生してＳＣ２０２２に渡す（ステップ20011）。ＳＣ２０２２は、ＳＰ２０２１に指示しＥＴ２０２６をサーチして該当するエクスチェンジを特定する（ステップ20012）。 In the host I/F node “#2”, when the SPG 2024 receives the S Packet of the data transfer ready frame, it removes the extension header of the S Packet, reconstructs “FCP_XFER_RDY”, and passes it to the SC 2022 (step 20011). The SC 2022 instructs the SP 2021 to search the ET 2026 and identify the corresponding exchange (step 20012).

次に、ＳＣ２０２２は、フレームが「FCP_XFER_RDY」であるかどうか調べ（ステップ20013）、「FCP_XFER_RDY」であれば、ＥＴ２０２６の応答先エクスチェンジＩＤ（RX_ID）の更新をＳＰ２０２１に指示する。応答先エクスチェンジＩＤとしては、このフレームに付加されていた値が使用される（ステップ20014）。そして、ＳＣ２０２２は、フレームヘッダ４０１のS_ID、D_IDをホストＩ／Ｆノード２０３のＩＤとホスト３０のＩＤを用いた適切な値に変換する（ステップ20015）。これらの処理によりフレームヘッダ４０１は、ホスト“＃２”に対するフレームに変換される。ＩＣ２０２３は、ホスト“＃２”に対し、このデータ転送準備完了フレーム「FCP_XFER_RDY」を発行する（図１１の矢印（ｄ）：ステップ20016）。 Next, the SC 2022 checks whether the frame is “FCP_XFER_RDY” (step 20013). If it is “FCP_XFER_RDY”, the SC 2022 instructs the SP 2021 to update the responder exchange ID (RX_ID) in the ET 2026, using the value carried by this frame (step 20014). The SC 2022 then converts the S_ID and D_ID of the frame header 401 into the appropriate values, using the ID of the host I/F node 203 and the ID of the host 30 (step 20015). Through this processing, the frame header 401 becomes the header of a frame addressed to the host “#2”. The IC 2023 issues this data transfer ready frame “FCP_XFER_RDY” to the host “#2” (arrow (d) in FIG. 11: step 20016).

ディスクアレイサブセット“＃０”のディスクアレイＩ／Ｆコントローラ１０１１は、データ転送を行うため、データフレーム「FCP_DATA」を生成し、ディスクアレイスイッチ２０に転送する（図１１矢印（ｅ））。フレームペイロードの転送長には制限があるため、１フレームで転送できる最大のデータ長は２ＫＢである。データ長がこれを越える場合は、必要数だけデータフレームを生成し発行する。すべてのデータフレームには同一のSEQ_IDが割り当てられる。データフレームの発行は、同一のSEQ_IDに対し複数のフレームが生成されることを除き（すなわちSEQ_CNTが変化する）、データ転送準備完了フレームの場合と同様である。 The disk array I / F controller 1011 of the disk array subset “# 0” generates a data frame “FCP_DATA” and transfers the data frame “FCP_DATA” to the disk array switch 20 (arrow (e) in FIG. 11). Since the transfer length of the frame payload is limited, the maximum data length that can be transferred in one frame is 2 KB. If the data length exceeds this, the required number of data frames are generated and issued. All data frames are assigned the same SEQ_ID. Issuing data frames is the same as for data transfer ready frames except that multiple frames are generated for the same SEQ_ID (ie, SEQ_CNT changes).
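
The splitting of read data into data frames can be sketched as follows, taking the 2 KB per-frame limit stated above as 2048 bytes. Every frame carries the same SEQ_ID and a SEQ_CNT that increases by one per frame. The dictionary representation and the function name are illustrative assumptions.

```python
from typing import Dict, List, Union

MAX_DATA_PER_FRAME = 2048  # 2 KB per frame, as stated in the text


def split_into_data_frames(data: bytes, seq_id: int) -> List[Dict[str, Union[int, bytes]]]:
    """Split data into FCP_DATA frames that share one SEQ_ID."""
    frames = []
    for seq_cnt, offset in enumerate(range(0, len(data), MAX_DATA_PER_FRAME)):
        frames.append({
            "seq_id": seq_id,    # identical for every frame of the sequence
            "seq_cnt": seq_cnt,  # distinguishes frames within the sequence
            "data": data[offset:offset + MAX_DATA_PER_FRAME],
        })
    return frames


frames = split_into_data_frames(bytes(5000), seq_id=7)
```

For 5000 bytes this yields three frames carrying 2048, 2048, and 904 bytes, with SEQ_CNT values 0, 1, and 2.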

ディスクアレイスイッチ２０は、データ転送準備完了フレームの処理と同様に、データフレーム「FCP_DATA」のフレームヘッダ４０１の変換を実施する。ただし、データフレームの転送の場合、RX_IDが既に確立されているので、データ転送準備完了フレームの処理におけるステップ20014の処理はスキップされる。フレームヘッダ４０１の変換後、ディスクアレイスイッチ２０は、ホスト“＃２”にデータフレームを転送する（図１１矢印（ｆ））。 The disk array switch 20 performs conversion of the frame header 401 of the data frame “FCP_DATA” in the same manner as the data transfer preparation completion frame processing. However, in the case of data frame transfer, since RX_ID has already been established, the processing of step 20014 in the processing of the data transfer preparation complete frame is skipped. After the conversion of the frame header 401, the disk array switch 20 transfers the data frame to the host “# 2” (arrow (f) in FIG. 11).

次に、ディスクアレイサブセット“＃０”のディスクアレイＩ／Ｆコントローラ１０１１は、終了ステータス転送を行うため、ステータスフレーム「FCP_RSP」を生成し、ディスクアレイスイッチ２０に対し発行する（図１１矢印（ｇ））。ディスクアレイスイッチ２０では、データ転送準備完了フレームの処理と同様に、ＳＰＧ２０２４がS Packetから拡張ヘッダを外し「FCP_RSP」ステータスフレームを再現し（ステップ20021）、ＳＰ２０２１によりＥＴ２０２６を検索しエクスチェンジ情報を獲得する（ステップ20022）。ＳＣ２０２２は、その情報に基づきフレームを変換する（ステップの20023）。変換されたフレームは、ＩＣ２０２３によりホスト“＃２”に転送される（図１１矢印（ｈ）：ステップ20024）。最後にＳＰ２０２１は、ＥＴ２０２６からエクスチェンジ情報を削除する（ステップ20025）。 Next, the disk array I / F controller 1011 of the disk array subset “# 0” generates a status frame “FCP_RSP” and issues it to the disk array switch 20 to transfer the end status (arrow (g )). In the disk array switch 20, the SPG 2024 removes the extension header from the S Packet, reproduces the “FCP_RSP” status frame (step 20021), and searches the ET 2026 by the SP 2021 to acquire the exchange information in the same manner as the data transfer preparation completion frame processing. (Step 20022). The SC 2022 converts the frame based on the information (Step 20023). The converted frame is transferred to the host “# 2” by the IC 2023 (arrow (h) in FIG. 11: step 20024). Finally, the SP 2021 deletes the exchange information from the ET 2026 (Step 20025).
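
The life cycle of an entry in the ET 2026 of the host I/F node (registration in step 20004, RX_ID update in step 20014, lookup for each returning frame, and deletion in step 20025) can be summarized in a short sketch. Keying the table by OX_ID and the class and function names are assumptions for illustration; the specification does not state how the table is indexed.

```python
from typing import Dict, Optional


class ExchangeTable:
    """Hypothetical model of the ET 2026 in a host I/F node, keyed by OX_ID."""

    def __init__(self) -> None:
        self._entries: Dict[int, Dict[str, Optional[int]]] = {}

    def register(self, s_id: int, d_id: int, ox_id: int) -> None:
        # Step 20004: record the exchange when the command frame arrives.
        self._entries[ox_id] = {"s_id": s_id, "d_id": d_id, "rx_id": None}

    def set_rx_id(self, ox_id: int, rx_id: int) -> None:
        # Step 20014: store the RX_ID carried by FCP_XFER_RDY.
        self._entries[ox_id]["rx_id"] = rx_id

    def lookup(self, ox_id: int) -> Dict[str, Optional[int]]:
        # Steps 20012 and 20022: find the exchange for a returning frame.
        return self._entries[ox_id]

    def delete(self, ox_id: int) -> None:
        # Step 20025: remove the exchange after FCP_RSP has been forwarded.
        del self._entries[ox_id]

    def __contains__(self, ox_id: int) -> bool:
        return ox_id in self._entries


def header_for_host(table: ExchangeTable, ox_id: int) -> Dict[str, Optional[int]]:
    """Build the header of a frame returned to the host: the IDs of the
    original command are swapped so that the switch port is the source."""
    entry = table.lookup(ox_id)
    return {"s_id": entry["d_id"], "d_id": entry["s_id"],
            "ox_id": ox_id, "rx_id": entry["rx_id"]}


et = ExchangeTable()
et.register(s_id=0x010200, d_id=0x020100, ox_id=0x1234)  # hypothetical IDs
et.set_rx_id(0x1234, 0x0055)
reply_header = header_for_host(et, 0x1234)
et.delete(0x1234)
```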

以上のようにしてディスクアレイからのリード処理が行われる。ディスクアレイシステム１に対するライト処理についてもデータフレームの転送方向が逆転するのみで、上述したリード処理と同様の処理が行われる。 As described above, the read processing from the disk array is performed. The write processing for the disk array system 1 is also performed in the same manner as the above-described read processing, only the data frame transfer direction is reversed.

図３に示したように、ディスクアレイスイッチ２０は、クロスバスイッチ２０１にクラスタ間Ｉ／Ｆ２０４０を備えている。図１に示したシステム構成では、クラスタ間Ｉ／Ｆ２０４０は使用されていない。本実施形態のディスクアレイスイッチ２０は、クラスタ間Ｉ／Ｆ２０４０を利用して図１５に示すように、他のディスクアレイスイッチと相互に接続されることができる。 As shown in FIG. 3, the disk array switch 20 includes an inter-cluster I / F 2040 in the crossbar switch 201. In the system configuration shown in FIG. 1, the inter-cluster I / F 2040 is not used. The disk array switch 20 of this embodiment can be connected to other disk array switches as shown in FIG. 15 by using the inter-cluster I / F 2040.

本実施形態におけるディスクアレイスイッチ２０単独では、ホスト３０とディスクアレイサブセット１０を合計８台までしか接続できないが、クラスタ間Ｉ／Ｆ２０４０を利用して複数のディスクアレイスイッチを相互接続し、接続できるホスト３０とディスクアレイの数を増やすことができる。例えば、図１５に示すシステムでは、４台のディスクアレイスイッチ２０を使ってホスト３０とディスクアレイサブセット１０を合計３２台まで接続でき、これらの間で相互にデータ転送が可能になる。 With the disk array switch 20 of this embodiment alone, only up to eight hosts 30 and disk array subsets 10 in total can be connected. By interconnecting a plurality of disk array switches through the inter-cluster I/F 2040, however, the number of hosts 30 and disk arrays that can be connected can be increased. For example, in the system shown in FIG. 15, four disk array switches 20 allow up to 32 hosts 30 and disk array subsets 10 in total to be connected, and data can be transferred among them.

このように、本実施形態では、ディスク容量や性能の必要性に合わせて、ディスクアレイサブセットやホストの接続台数を増加していくことができる。また、必要な転送帯域分のホストＩ／Ｆを用いてホスト−ディスクアレイシステム間を接続することができるので、容量、性能、接続台数の拡張性を大幅に向上させることができる。 As described above, in this embodiment, the number of connected disk array subsets and hosts can be increased in accordance with the necessity of disk capacity and performance. Further, since the host-disk array system can be connected using the host I / F corresponding to the necessary transfer bandwidth, the capacity, performance, and expandability of the number of connected devices can be greatly improved.

以上説明した実施形態によれば、１台のディスクアレイサブセットの性能が、内部のＭＰＵや内部バスで制限されたとしても、複数のディスクアレイサブセットを用いて、ディスクアレイスイッチによりホストとディスクアレイサブセット間を相互接続することができる。これにより、ディスクアレイシステムトータルとして高い性能を実現することができる。ディスクアレイサブセットの性能が比較的低いものであっても、複数のディスクアレイサブセットを用いることで高性能化を実現できる。したがって、低コストのディスクアレイサブセットをコンピュータシステムの規模に合わせて必要な台数だけ接続することができ、規模に応じた適切なコストでディスクアレイシステムを構築することが可能となる。 According to the embodiment described above, even if the performance of a single disk array subset is limited by its internal MPU or internal bus, a plurality of disk array subsets can be used, with the hosts and the disk array subsets interconnected by the disk array switch. This makes it possible to achieve high performance for the disk array system as a whole. Even if each disk array subset has relatively low performance, high performance can be achieved by using a plurality of them. Accordingly, low-cost disk array subsets can be connected in just the number required for the scale of the computer system, and a disk array system can be built at a cost appropriate to that scale.

また、ディスク容量の増大や性能の向上が必要になったときは、ディスクアレイサブセットを必要なだけ追加すればよい。さらに、複数のディスクアレイスイッチを用いて任意の数のホスト及びディスクアレイサブセットを接続できるので、容量、性能、接続台数のいずれをも大幅に向上させることができ、高い拡張性を有するシステムが実現できる。 When it becomes necessary to increase disk capacity or improve performance, disk array subsets can simply be added as needed. Furthermore, since any number of hosts and disk array subsets can be connected using a plurality of disk array switches, capacity, performance, and the number of connected devices can all be greatly increased, and a highly scalable system can be realized.

さらにまた、本実施形態によれば、ディスクアレイサブセットとして、従来のディスクアレイシステムそのものの縮小機を用いることができるので、既に開発した大規模な制御ソフトウェア資産をそのまま利用でき、開発コストの低減と開発期間の短縮を実現することができる。 Furthermore, according to this embodiment, a scaled-down version of a conventional disk array system itself can be used as a disk array subset, so the large-scale control software assets already developed can be used as they are, reducing development cost and shortening the development period.

［第２実施形態］ [Second Embodiment]
図１６は、本発明の第２の実施形態におけるコンピュータシステムの構成図である。本実施形態は、ディスクアレイスイッチのホストＩ／Ｆノードにおいて、フレームヘッダ４０１のみを変換し、フレームペイロード４０２は操作しない点、及び、ディスクアレイスイッチ、ホストＩ／Ｆ、ディスクアレイＩ／Ｆが二重化されていない点で第１実施形態と構成上相違する。したがって、各部の構成は、第１実施形態と大きく変わるところがなく、その詳細については説明を省略する。 FIG. 16 is a configuration diagram of a computer system according to the second embodiment of the present invention. This embodiment differs in configuration from the first embodiment in that the host I/F node of the disk array switch converts only the frame header 401 and does not manipulate the frame payload 402, and in that the disk array switch, host I/F, and disk array I/F are not duplicated. The configuration of each part is otherwise largely the same as in the first embodiment, and a detailed description is omitted.

図１６において、各ディスクアレイサブセット１０は、複数の論理ユニット（ＬＵ）１１０で構成されている。各ＬＵ１１０は、独立ＬＵとして構成される。一般に、各ディスクアレイサブセット１０内のＬＵ１１０に割り当てられるＬＵＮは、０から始まる連続番号である。このため、ホスト３０に対して、ディスクアレイシステム1内のすべてのＬＵ１１０のＬＵＮを連続的に見せる場合には、第１実施形態と同様に、フレームペイロード４０２のＬＵＮフィールドを変換する必要がある。本実施形態では、各ディスクアレイサブセット１０のＬＵＮをそのままホスト３０に見せることで、フレームペイロード４０２の変換を不要とし、ディスクアレイスイッチの制御を簡単なものとしている。 In FIG. 16, each disk array subset 10 is composed of a plurality of logical units (LUs) 110. Each LU 110 is configured as an independent LU. In general, LUNs assigned to LUs 110 in each disk array subset 10 are sequential numbers starting from 0. For this reason, when the LUNs of all the LUs 110 in the disk array system 1 are continuously displayed to the host 30, it is necessary to convert the LUN field of the frame payload 402 as in the first embodiment. In this embodiment, the LUN of each disk array subset 10 is shown to the host 30 as it is, so that the conversion of the frame payload 402 is unnecessary and the control of the disk array switch is simplified.

本実施形態のディスクアレイスイッチ２０は、ホストＩ／Ｆノード２０３ごとに特定のディスクアレイサブセット１０をアクセスできるものと仮定する。この場合、一つのホストＩ／Ｆ３１を使うと、１台のディスクアレイサブセット１０にあるＬＵ１１０のみがアクセス可能である。１台のホストから複数のディスクアレイサブセット１０のＬＵ１１０をアクセスしたい場合には、そのホストを複数のホストＩ／Ｆノード２０３に接続する。また、複数のホスト３０から１台のディスクアレイサブセット１０のＬＵ１１０をアクセスできるようにする場合は、同一のホストＩ／Ｆノード２０３にループトポロジーや、ファブリックトポロジー等を用い、複数のホスト３０を接続する。このように構成すると、１台のホスト３０から１つのＬＵ１１０をアクセスする際に、ホストＩ／Ｆノード２０３のD_ID毎にディスクアレイサブセット１０が確定することになるため、各ＬＵのＬＵＮをそのままホスト３０に見せることが可能である。 It is assumed that, in the disk array switch 20 of this embodiment, each host I/F node 203 can access one specific disk array subset 10. In this case, only the LUs 110 in one disk array subset 10 are accessible through one host I/F 31. To access the LUs 110 of a plurality of disk array subsets 10 from one host, that host is connected to a plurality of host I/F nodes 203. To allow a plurality of hosts 30 to access the LUs 110 of one disk array subset 10, the hosts 30 are connected to the same host I/F node 203 using a loop topology, a fabric topology, or the like. With this configuration, when one host 30 accesses one LU 110, the disk array subset 10 is determined by the D_ID of the host I/F node 203, so the LUN of each LU can be shown to the host 30 as is.

本実施形態では、上述した理由により、ホスト３０に、各ディスクアレイサブセット１０内のＬＵ１１０のＬＵＮをそのままホスト３０に見せているため、ディスクアレイスイッチ２０におけるＬＵＮの変換は不要となる。このため、ディスクアレイスイッチ２０は、ホスト３０からフレームを受信すると、フレームヘッダ４０１のみを第１実施例と同様にして変換し、フレームペイロード４０２は変換せずにディスクアレイサブセット１０に転送する。本実施形態における各部の動作は、フレームペイロード４０２の変換が行われないことを除くと第１実施形態と同様であるので、ここでは詳細な説明を省略する。本実施形態によれば、ディスクアレイスイッチ２０の開発を容易にできる。 In the present embodiment, for the reason described above, the LUN of the LU 110 in each disk array subset 10 is shown to the host 30 as it is to the host 30, so that LUN conversion in the disk array switch 20 is not necessary. For this reason, when receiving a frame from the host 30, the disk array switch 20 converts only the frame header 401 in the same manner as in the first embodiment, and transfers the frame payload 402 to the disk array subset 10 without conversion. Since the operation of each unit in the present embodiment is the same as that in the first embodiment except that the frame payload 402 is not converted, detailed description thereof is omitted here. According to this embodiment, development of the disk array switch 20 can be facilitated.
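
The header-only forwarding of this embodiment can be sketched as follows. Because each host I/F node serves one fixed disk array subset, the switch only has to replace the D_ID and can pass the payload, including its LUN field, through untouched. The frame model, the mapping table, and the ID values are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    s_id: int
    d_id: int
    payload: bytes  # carries the LUN and CDB; never modified in this embodiment


# Hypothetical mapping: host I/F node number -> port ID of the subset it serves.
NODE_TO_SUBSET_PORT_ID = {0: 0xA0, 1: 0xA1}


def forward_header_only(frame: Frame, host_if_node: int) -> Frame:
    """Rewrite only the D_ID; S_ID and payload are carried over unchanged."""
    return replace(frame, d_id=NODE_TO_SUBSET_PORT_ID[host_if_node])


incoming = Frame(s_id=0x10, d_id=0x20, payload=b"\x00\x03 command bytes")
outgoing = forward_header_only(incoming, host_if_node=1)
```

Compared with the first embodiment, no table search on the LUN or LBA is needed, which is the simplification the text attributes to this embodiment.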

［第３実施形態］ [Third Embodiment]
第２実施形態では、ディスクアレイスイッチのホストＩ／Ｆノードにおいて、フレームヘッダのみを変換しているが、以下に説明する第３実施形態ではフレームヘッダも含め、フレームの変換を行わない形態について説明する。本実施形態のコンピュータシステムは、図１に示す第１実施形態におけるコンピュータシステムと同様に構成される。 In the second embodiment, only the frame header is converted in the host I/F node of the disk array switch. The third embodiment described below is a form in which no frame conversion is performed, not even of the frame header. The computer system of this embodiment is configured in the same way as the computer system of the first embodiment shown in FIG. 1.

第１、および第２実施形態では、ホスト３０に対し、ディスクアレイサブセット１０の台数や、ＬＵ１１０の構成等、ディスクアレイシステム1の内部構成を隠蔽している。このため、ホスト３０からはディスクアレイシステム1が全体で１つの記憶装置として見える。これに対し、本実施形態では、ディスクアレイサブセット１０をそのままホスト３０に公開し、ホスト３０がフレームヘッダのD_IDとして直接ディスクアレイサブセットのポートのＩＤを使えるようにする。これにより、ディスクアレイスイッチは、フレームヘッダの情報に従ってフレームの転送を制御するだけで済み、従来技術におけるファイバチャネルのファブリック装置と同等のスイッチ装置をディスクアレイスイッチ２０に替えて利用することができる。 In the first and second embodiments, the internal configuration of the disk array system 1 such as the number of disk array subsets 10 and the configuration of the LU 110 is concealed from the host 30. For this reason, the disk array system 1 appears to the host 30 as a single storage device as a whole. In contrast, in this embodiment, the disk array subset 10 is disclosed to the host 30 as it is, and the host 30 can directly use the ID of the port of the disk array subset as the D_ID of the frame header. Thus, the disk array switch only needs to control frame transfer according to the information of the frame header, and a switch device equivalent to the fiber channel fabric device in the prior art can be used in place of the disk array switch 20.

ディスクアレイシステム構成管理手段７０は、ディスクアレイサブセット１０の通信コントローラ１０６、及びディスクアレイスイッチ２０の通信手段２０４と通信して各ディスクアレイサブセット１０及びディスクアレイスイッチ２０の構成情報を獲得し、あるいは、設定する。 The disk array system configuration management means 70 communicates with the communication controller 106 of each disk array subset 10 and with the communication means 204 of the disk array switch 20 to acquire or set the configuration information of each disk array subset 10 and of the disk array switch 20.

ディスクアレイスイッチ２０は、基本的には図３に示す第１実施形態におけるディスクアレイスイッチと同様の構成を有する。しかし、本実施形態では、ホスト３０が発行するフレームのフレームヘッダの情報をそのまま使ってフレームの転送を制御するため、第１実施形態、あるいは第２実施形態でディスクアレイスイッチ２０のホストＩ／Ｆノード２０３、ディスクアレイＩ／Ｆノード２０２が有するＤＣＴ２０２７や、ＳＣ２０２２、ＳＰＧ２０２４等により実現されるフレームヘッダ等の変換の機能は不要となる。ディスクアレイスイッチ２０が有するクロスバスイッチ２０１は、フレームヘッダの情報に従ってホストＩ／Ｆノード２０３、及びディスクアレイＩ／Ｆノード２０２の間でファイバチャネルのフレームの転送を行う。 The disk array switch 20 basically has the same configuration as the disk array switch of the first embodiment shown in FIG. 3. In this embodiment, however, frame transfer is controlled using the frame header information of the frames issued by the host 30 as is. The functions for converting frame headers and the like, which in the first and second embodiments are realized by the DCT 2027, the SC 2022, the SPG 2024, and so on of the host I/F node 203 and the disk array I/F node 202 of the disk array switch 20, are therefore unnecessary. The crossbar switch 201 of the disk array switch 20 transfers Fibre Channel frames between the host I/F node 203 and the disk array I/F node 202 according to the frame header information.
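
The header-only forwarding described above can be pictured with a minimal sketch. It assumes a plain lookup from the frame header's D_ID to an output node; the table contents, the port IDs, and the names `FORWARDING_TABLE` and `select_output_node` are hypothetical and are not taken from the specification.

```python
# Hypothetical forwarding table: header D_ID -> crossbar output node.
# The port IDs and node names are invented for this illustration.
FORWARDING_TABLE = {
    0x010100: "disk_array_if_node_0",
    0x010200: "disk_array_if_node_1",
    0x020100: "host_if_node_0",
}


def select_output_node(frame: dict) -> str:
    """Choose the output node from the frame header's D_ID.

    The frame is only inspected, never rewritten: no header or payload
    conversion takes place.
    """
    d_id = frame["header"]["d_id"]
    if d_id not in FORWARDING_TABLE:
        raise LookupError(f"no route for D_ID {d_id:#08x}")
    return FORWARDING_TABLE[d_id]
```

Because the frame passes through unchanged, a switch of this kind needs none of the conversion tables used in the first and second embodiments.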

本実施形態では、ディスクアレイシステムの構成をディスクアレイシステム構成管理手段７０で一括して管理するために、ディスクアレイ管理用テーブル（以下、このテーブルもＤＣＴと呼ぶ）をディスクアレイシステム構成管理手段７０に備える。ディスクアレイシステム構成管理手段７０が備えるＤＣＴは、図６、７に示す、システム構成テーブル20270とサブセット構成テーブル202720〜202723の２つのテーブル群を含む。なお、本実施形態では、ホストＬＵは全てＩＬＵとして構成されるため、ホストＬＵ構成テーブル20271のLU Typeは全て「ＩＬＵ」となり、CLU Class、CLU Stripe Sizeは意味をなさない。 In this embodiment, so that the configuration of the disk array system can be managed collectively by the disk array system configuration management means 70, a disk array management table (hereinafter this table is also called a DCT) is provided in the disk array system configuration management means 70. The DCT of the disk array system configuration management means 70 includes two table groups shown in FIGS. 6 and 7: the system configuration table 20270 and the subset configuration tables 202720 to 202723. In this embodiment, every host LU is configured as an ILU, so every LU Type in the host LU configuration table 20271 is "ILU", and CLU Class and CLU Stripe Size have no meaning.

管理者は、管理端末５を操作してディスクアレイシステム構成管理手段７０と通信し、ディスクアレイサブセット１０のディスク容量、ディスクユニットの台数等の情報を得て、ディスクアレイサブセット１０のＬＵ１１０の設定、ＲＡＩＤレベルの設定等を行う。次に管理者は、管理端末５によりディスクアレイシステム構成管理手段７０と通信し、ディスクアレイスイッチ２０を制御して、各ホスト３０とディスクアレイサブセット１０間の関係情報を設定する。 The administrator operates the management terminal 5 to communicate with the disk array system configuration management means 70, obtains information such as the disk capacity of each disk array subset 10 and the number of disk units, and then configures the LUs 110 of the disk array subset 10, sets the RAID level, and so on. Next, the administrator communicates with the disk array system configuration management means 70 via the management terminal 5 and controls the disk array switch 20 to set the relationship information between each host 30 and the disk array subsets 10.

以上の操作により、ディスクアレイシステム1の構成が確立し、ホスト３０から管理者が望む通りにＬＵ１１０が見えるようになる。ディスクアレイ構成管理手段７０は以上の設定情報を保存し、管理者からの操作に応じ構成の確認や、構成の変更を行うことができる。 With the above operation, the configuration of the disk array system 1 is established, and the LU 110 can be seen from the host 30 as desired by the administrator. The disk array configuration management means 70 stores the above setting information, and can confirm the configuration or change the configuration in accordance with an operation from the administrator.

本実施形態によれば、ひとたびディスクアレイシステム1を構成すれば、管理者からディスクアレイスイッチ２０の存在を認識させることが無く、複数のディスクアレイサブシステムを１台のディスクアレイシステムと同様に扱うことができる。また、本実施形態によれば、ディスクアレイスイッチ２０とディスクアレイサブセット１０は、同一の操作環境によって統一的に操作することができ、その構成確認や、構成変更も容易になる。さらに、本実施形態によれば、従来使用していたディスクアレイシステムを本実施形態におけるディスクアレイシステムに置き換える場合に、ホスト３０の設定を変更することなく、ディスクアレイシステム1の構成をそれまで使用していたディスクアレイシステムの構成に合わせることができ、互換性を維持できる。 According to the present embodiment, once the disk array system 1 is configured, the administrator does not recognize the existence of the disk array switch 20, and a plurality of disk array subsystems are handled in the same manner as a single disk array system. be able to. Further, according to the present embodiment, the disk array switch 20 and the disk array subset 10 can be operated uniformly in the same operation environment, and the configuration confirmation and configuration change are facilitated. Furthermore, according to the present embodiment, when replacing the disk array system used conventionally with the disk array system in the present embodiment, the configuration of the disk array system 1 is used so far without changing the setting of the host 30. It is possible to match the configuration of the disk array system that has been used and maintain compatibility.

［第４実施形態］
以上説明した第１から第３の実施形態では、ホストＩ／Ｆにファイバチャネルを使用している。以下に説明する実施形態では、ファイバチャネル以外のインタフェースが混在した形態について説明する。 [Fourth Embodiment]
In the first to third embodiments described above, a fiber channel is used for the host I / F. In the embodiment described below, a mode in which interfaces other than the fiber channel are mixed will be described.

図１７は、ホストＩ／ＦがパラレルＳＣＳＩである場合のホストＩ／Ｆノード２０３内部のＩＣ２０２３の一構成例を示す。20230はパラレルＳＣＳＩのプロトコル制御を行うＳＣＳＩプロトコルコントローラ（ＳＰＣ）、20233はファイバチャネルのプロトコル制御を行うファイバチャネルプロトコルコントローラ（ＦＰＣ）、20231はパラレルＳＣＳＩとファイバチャネルのシリアルＳＣＳＩをプロトコル変換するプロトコル変換プロセッサ（ＰＥＰ）、20232はプロトコル変換中データを一時保存するバッファ（ＢＵＦ）である。 FIG. 17 shows a configuration example of the IC 2023 inside the host I / F node 203 when the host I / F is a parallel SCSI. 20230 is a SCSI protocol controller (SPC) that performs parallel SCSI protocol control, 20233 is a Fiber Channel protocol controller (FPC) that performs Fiber Channel protocol control, and 20231 is a protocol conversion processor that performs protocol conversion between parallel SCSI and Fiber Channel serial SCSI. (PEP), 20232 is a buffer (BUF) for temporarily storing data during protocol conversion.

本実施形態において、ホスト３０は、ホストＩ／Ｆノード２０３に対してＳＣＳＩコマンドを発行する。リードコマンドの場合、ＳＰＣ20230は、これをＢＵＦ20232に格納し、ＰＥＰ20231に割り込みでコマンドの受信を報告する。ＰＥＰ20231は、ＢＵＦ20232に格納されたコマンドを利用し、ＦＰＣ20233へのコマンドに変換し、ＦＰＣ20233に送る。ＦＰＣ20233は、このコマンドを受信すると、フレーム形式に変換し、ＳＣ２０２２に引き渡す。この際、エクスチェンジＩＤ、シーケンスＩＤ、ソースＩＤ、デスティネイションＩＤは、以降の処理が可能なようにＰＥＰ20231により付加される。あとのコマンド処理は、第１実施形態と同様に行われる。 In this embodiment, the host 30 issues a SCSI command to the host I/F node 203. In the case of a read command, the SPC 20230 stores the command in the BUF 20232 and reports its reception to the PEP 20231 by an interrupt. The PEP 20231 takes the command stored in the BUF 20232, converts it into a command for the FPC 20233, and sends it to the FPC 20233. On receiving this command, the FPC 20233 converts it into frame format and passes it to the SC 2022. At this point the PEP 20231 adds the exchange ID, sequence ID, source ID, and destination ID so that the subsequent processing is possible. The rest of the command processing is performed in the same manner as in the first embodiment.

ディスクアレイサブセット１０は、データの準備が完了すると、データ転送準備完了フレームの発行、データ転送、正常終了後ステータスフレームの発行を実施する。ディスクアレイサブセット１０からＩＣ２０２３までの間では、フレームヘッダ４０１やフレームペイロード４０２が必要に応じ変換されながら、各種フレームの転送が行われる。ＩＣ２０２３のＦＰＣ20233は、データ転送準備完了フレームを受信し、続いてデータを受信してＢＵＦ20232に格納し、続けて正常に転送が終わったならば、ステータスフレームを受信し、ＰＥＰ20231に割り込みをかけてデータの転送完了を報告する。ＰＥＰ20231は、割り込みを受けると、ＳＰＣ20230を起動し、ホスト３０に対しデータ転送を開始するよう指示する。ＳＰＣ20230はホスト３０にデータを送信し、正常終了を確認するとＰＥＰ20231に対し割り込みで正常終了を報告する。 When the data is ready, the disk array subset 10 issues a data transfer ready frame, transfers the data, and, after normal completion, issues a status frame. Between the disk array subset 10 and the IC 2023, the various frames are transferred while the frame header 401 and the frame payload 402 are converted as necessary. The FPC 20233 of the IC 2023 receives the data transfer ready frame, then receives the data and stores it in the BUF 20232, and, if the transfer completes normally, receives the status frame and interrupts the PEP 20231 to report that the data transfer is complete. On receiving the interrupt, the PEP 20231 activates the SPC 20230 and instructs it to start transferring the data to the host 30. The SPC 20230 sends the data to the host 30 and, once normal completion is confirmed, reports it to the PEP 20231 by an interrupt.

ここでは、ファイバチャネル以外のホストＩ／Ｆの例としてパラレルＳＣＳＩを示したが、他のインタフェース、例えば、メインフレームへのホストＩ／ＦであるESCON等に対しても同様に適用することが可能である。ディスクアレイスイッチ２０のホストＩ／Ｆノード２０３として、例えば、ファイバチャネル、パラレルＳＣＳＩ、及びESCONに対応したホストＩ／Ｆノードを設けることで、１台のディスクアレイシステム1に、メインフレームと、パーソナルコンピュータ、ワークステーション等のいわゆるオープンシステムの両方を混在させて接続することが可能である。本実施形態では、ディスクアレイＩ／Ｆとしては、第１から第３実施形態と同様、ファイバチャネルを用いているが、ディスクアレイＩ／Ｆに対しても任意のＩ／Ｆを使用することが可能である。 Parallel SCSI is shown here as an example of a host I/F other than Fibre Channel, but the same approach can be applied to other interfaces, for example ESCON, a host I/F for mainframes. By providing, as the host I/F nodes 203 of the disk array switch 20, host I/F nodes that support Fibre Channel, parallel SCSI, and ESCON, for example, both mainframes and so-called open systems such as personal computers and workstations can be connected together to a single disk array system 1. In this embodiment Fibre Channel is used for the disk array I/F, as in the first to third embodiments, but any I/F can be used for the disk array I/F as well.

［第５実施形態］
次に、ディスクアレイシステム1の構成管理の方法について、第５実施形態として説明する。図１８は、本実施形態のシステム構成図である。本実施形態では、ホスト３０が4台設けられている。ホスト“＃０”、“＃１”とディスクアレイシステム1の間のＩ／Ｆ３０はファイバチャネル、ホスト“＃２”とディスクアレイシステム1の間は、パラレルＳＣＳＩ（Ultra SCSI）、ホスト“＃３”とディスクアレイシステム1の間は、パラレルＳＣＳＩ（Ultra2 SCSI）で接続されている。 [Fifth Embodiment]
Next, a configuration management method for the disk array system 1 will be described as a fifth embodiment. FIG. 18 is a system configuration diagram of the present embodiment, in which four hosts 30 are provided. The I/F 30 between hosts “#0” and “#1” and the disk array system 1 is Fibre Channel, host “#2” is connected to the disk array system 1 by parallel SCSI (Ultra SCSI), and host “#3” is connected to the disk array system 1 by parallel SCSI (Ultra2 SCSI).

パラレルＳＣＳＩのディスクアレイスイッチ２０への接続は第４実施形態と同様に行われる。ディスクアレイシステム1は、４台のディスクアレイサブセット１０を有する。ディスクアレイサブセット“＃０”には４つの独立ＬＵ、ディスクアレイサブセット“＃１”には２つの独立ＬＵがそれぞれ構成されている。ディスクアレイサブセット“＃２”と“＃３”で１つの統合ＬＵが構成されている。本実施形態では、第１実施形態と同様、ホスト３０に対しディスクアレイサブセット１０を隠蔽し、ファイバチャネルのフレームを変換するものとする。各ＬＵに割り当てられるＬＵＮは、ディスクアレイサブセット“＃０”のＬＵから順に、ＬＵＮ＝０、１、２、・・・６までの７つである。 Parallel SCSI is connected to the disk array switch 20 in the same manner as in the fourth embodiment. The disk array system 1 has four disk array subsets 10. Four independent LUs are configured in disk array subset “#0” and two independent LUs in disk array subset “#1”. Disk array subsets “#2” and “#3” together form one integrated LU. In this embodiment, as in the first embodiment, the disk array subsets 10 are concealed from the host 30 and Fibre Channel frames are converted. Seven LUNs are assigned to the LUs, LUN = 0, 1, 2, ... 6, in order starting from the LUs of disk array subset “#0”.
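
As a worked illustration of the numbering just described, the following sketch assigns host-visible LUNs sequentially. The function name `assign_host_luns` and the tuple-based layout format are assumptions made for this example and are not part of the specification; only the LU counts per subset come from the text.

```python
def assign_host_luns(layout):
    """Number the LUs 0, 1, 2, ... in the order the groups are listed.

    layout: list of (subset_ids, lu_count) tuples. An integrated LU spanning
    several subsets is written as one group with several subset IDs.
    """
    table = []
    for subset_ids, lu_count in layout:
        for _ in range(lu_count):
            table.append({"lun": len(table), "subsets": subset_ids})
    return table


# Layout from the text: subset #0 has 4 independent LUs, subset #1 has 2,
# and subsets #2 and #3 together form 1 integrated LU.
LAYOUT = [((0,), 4), ((1,), 2), ((2, 3), 1)]
HOST_LUNS = assign_host_luns(LAYOUT)
```

With this layout, LUNs 0 to 3 fall on subset #0, LUNs 4 and 5 on subset #1, and LUN 6 is the integrated LU spanning subsets #2 and #3.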

図１９は、管理端末５の表示画面上に表示される画面の一例である。図は、ホストＩ／Ｆ３１と各論理ユニット（ＬＵ）との対応を示した論理接続構成画面である。 FIG. 19 is an example of a screen displayed on the display screen of the management terminal 5. The figure is a logical connection configuration screen showing the correspondence between the host I / F 31 and each logical unit (LU).

論理接続構成画面５０には、各ホストＩ／Ｆ３１に関する情報３１００、各ＬＵ１１０に関する情報11000、ディスクアレイサブセット１０とＬＵ１１０の関係等が表示される。ホストＩ／Ｆ３１に関する情報としては、Ｉ／Ｆ種類、Ｉ／Ｆ速度、ステータス等が含まれる。ＬＵ１１０に関する情報としては、格納サブセット番号、ＬＵＮ、容量、ＲＡＩＤレベル、ステータス、情報、等が表示される。管理者はこの画面を参照することで、容易にディスクアレイシステム１の構成を管理することができる。 The logical connection configuration screen 50 displays information 3100 related to each host I / F 31, information 11000 related to each LU 110, the relationship between the disk array subset 10 and the LU 110, and the like. Information relating to the host I / F 31 includes I / F type, I / F speed, status, and the like. Information relating to the LU 110 includes a storage subset number, LUN, capacity, RAID level, status, information, and the like. The administrator can easily manage the configuration of the disk array system 1 by referring to this screen.

論理接続構成画面５０上で、ホストＩ／ＦとＬＵの間に引かれている線は、各ホストＩ／Ｆ３１を経由してアクセス可能なＬＵ１１０を示している。ホストＩ／Ｆから線の引かれていないＬＵ１１０に対して、そのホストＩ／Ｆに接続するホスト３０からはアクセスできない。ホスト３０によって、扱うデータ形式が異なり、また使用者も異なることから、セキュリティ維持上、適切なアクセス制限を設けることが不可欠である。そこで、システムを設定する管理者が、この画面を用いて、各ＬＵ１１０とホストＩ／Ｆとの間のアクセス許可をあたえるか否かによって、アクセス制限を実施する。図において、例えば、ＬＵ“＃０”は、ホストＩ／Ｆ“＃０”および“＃１”からアクセス可能であるが、ホストＩ／Ｆ“＃２”、“＃３”からはアクセスできない。ＬＵ“＃４”は、ホストＩ／Ｆ“＃２”からのみアクセス可能である。 On the logical connection configuration screen 50, a line drawn between the host I / F and the LU indicates the LU 110 that can be accessed via each host I / F 31. An LU 110 that is not drawn from the host I / F cannot be accessed from the host 30 connected to the host I / F. Since data formats to be handled and users are different depending on the host 30, it is indispensable to provide appropriate access restrictions in order to maintain security. Therefore, an administrator who configures the system uses this screen to restrict access depending on whether or not to grant access permission between each LU 110 and the host I / F. In the figure, for example, the LU “# 0” can be accessed from the host I / Fs “# 0” and “# 1”, but cannot be accessed from the host I / Fs “# 2” and “# 3”. The LU “# 4” can be accessed only from the host I / F “# 2”.

このようなアクセス制限を実現するためアクセス制限情報は、ディスクアレイシステム構成管理手段７０からディスクアレイスイッチ２０に対して送信される。ディスクアレイスイッチ２０に送られたアクセス制限情報は、各ホストＩ／Ｆノード２０３に配信され、各ホストＩ／Ｆノード２０３のＤＣＴ２０２７に登録される。ホストにより、アクセスが制限されたＬＵに対するＬＵ存在有無の検査コマンドが発行された場合、各ホストＩ／Ｆノード２０３は、ＤＣＴ２０２７の検査を行い、検査コマンドに対し応答しないか、あるいは、エラーを返すことで、そのＬＵは、ホストからは認識されなくなる。ＬＵ存在有無の検査コマンドとしては、ＳＣＳＩプロトコルの場合、Test Unit Readyコマンドや、Inquiryコマンドが一般に用いられる。この検査なしに、リード／ライトが実施されることはないため、容易にアクセスの制限をかけることが可能である。 In order to realize such access restriction, access restriction information is transmitted from the disk array system configuration management means 70 to the disk array switch 20. The access restriction information sent to the disk array switch 20 is distributed to each host I / F node 203 and registered in the DCT 2027 of each host I / F node 203. When the host issues an LU existence check command for an LU whose access is restricted, each host I / F node 203 checks the DCT 2027 and does not respond to the check command or returns an error. As a result, the LU is not recognized by the host. In the case of the SCSI protocol, a test unit ready command or an inquiry command is generally used as a check command for the presence / absence of an LU. Since reading / writing is not performed without this inspection, it is possible to easily limit access.

本実施形態ではホストＩ／Ｆ３１毎にアクセス制限をかけているが、これを拡張することで、ホスト３０毎にアクセス制限をかけることも容易に実現できる。また、ホストＩ／Ｆ３１、ホスト３０、あるいは、アドレス空間を特定して、リードのみ可、ライトのみ可、リード／ライトとも可、リード／ライトとも不可といった、コマンドの種別に応じたアクセス制限をかけることもできる。この場合、アクセス制限情報としてホストＩ／Ｆ番号、ホストＩＤ、アドレス空間、制限コマンド等を指定してディスクアレイスイッチ２０に制限を設定する。 In this embodiment access is restricted per host I/F 31, but this can easily be extended to restrict access per host 30. It is also possible to specify a host I/F 31, a host 30, or an address space and apply access restrictions by command type, such as read only, write only, both read and write permitted, or neither read nor write permitted. In this case, the restriction is set in the disk array switch 20 by specifying the host I/F number, host ID, address space, restricted commands, and so on as the access restriction information.
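
The access restriction scheme can be sketched as a small permission table consulted by each host I/F node. This is an illustrative model only: the `PERMISSIONS` structure, the function names, and the string return values are assumptions, and the read-only setting for LU #4 is invented to show a command-type restriction. Only the visibility of LU #0 (host I/Fs #0 and #1) and LU #4 (host I/F #2) is taken from the text.

```python
# (host I/F number, LUN) -> permitted operations. A missing key means the
# LU is hidden from that host I/F.
PERMISSIONS = {
    (0, 0): "rw",
    (1, 0): "rw",
    (2, 4): "r",  # read-only here is an assumed setting, for illustration
}


def probe_lu(host_if: int, lun: int) -> str:
    """Answer an LU existence check such as Inquiry or Test Unit Ready.

    A hidden LU is answered with an error (the text also allows not
    responding at all), so the host never recognizes it.
    """
    return "GOOD" if (host_if, lun) in PERMISSIONS else "ERROR"


def is_allowed(host_if: int, lun: int, op: str) -> bool:
    """Check a read ('r') or write ('w') against the permission table."""
    if op not in ("r", "w"):
        raise ValueError("op must be 'r' or 'w'")
    return op in PERMISSIONS.get((host_if, lun), "")
```

Keying the table on the host I/F number matches the per-interface restriction of this embodiment; a per-host variant would add the host ID to the key.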

次に、新たなディスクアレイサブセット１０の追加について説明する。ディスクアレイサブセット１０を新規に追加する場合、管理者は、ディスクアレイスイッチ２０の空いているディスクアレイＩ／Ｆノード２０２に追加するディスクアレイサブセット１０を接続する。つづけて、管理者は、管理端末５を操作し、論理接続構成画面５０に表示されている「最新状態を反映」ボタン５００１を押下する。この操作に応答して、未設定のディスクアレイサブセットを表す絵が画面上に表示される（図示せず）。このディスクアレイサブセットの絵が選択されると、ディスクアレイサブセットの設定画面が現れる。管理者は、表示された設定画面上で、新規に追加されたディスクアレイサブセットの各種設定を実施する。ここで設定される項目にはＬＵの構成、ＲＡＩＤレベル等がある。続けて、図１９の論理接続構成図の画面に切り替えると、新規ディスクアレイサブセットとＬＵが現れる。以降、ホストＩ／Ｆ３１毎に対するアクセス制限を設定し、「設定実行」ボタン５００２を押下すると、ディスクアレイスイッチ２０に対し、アクセス制限情報、およびディスクアレイサブセット、ＬＵの情報が転送され、設定が実行される。 Next, the addition of a new disk array subset 10 will be described. To add a disk array subset 10, the administrator connects it to a free disk array I/F node 202 of the disk array switch 20. The administrator then operates the management terminal 5 and presses the “reflect latest state” button 5001 displayed on the logical connection configuration screen 50. In response, a picture representing the unconfigured disk array subset is displayed on the screen (not shown). Selecting this picture brings up the disk array subset setting screen, on which the administrator makes the various settings for the newly added disk array subset. The items set here include the LU configuration and the RAID level. When the display is then switched back to the logical connection configuration screen of FIG. 19, the new disk array subset and its LUs appear. The administrator then sets the access restrictions for each host I/F 31 and presses the “execute setting” button 5002, whereupon the access restriction information and the disk array subset and LU information are transferred to the disk array switch 20 and the settings are applied.

各ディスクアレイサブセット１０にＬＵ１１０を追加する際の手順も上述した手順で行われる。また、ディスクアレイサブセット、およびＬＵの削除についてもほぼ同様の手順で行われる。異なる点は、管理者が各削除部位を画面上で選択して「削除」ボタン５００３を押下し、適切な確認が行われたのち、実行される点である。以上のように、管理端末７０を用いることで、管理者はディスクアレイシステム全体を一元的に管理できる。 The procedure for adding the LU 110 to each disk array subset 10 is also performed according to the procedure described above. Also, the disk array subset and LU deletion are performed in substantially the same procedure. The difference is that the administrator selects each deletion part on the screen, presses a “delete” button 5003, and is executed after appropriate confirmation is performed. As described above, by using the management terminal 70, the administrator can manage the entire disk array system in an integrated manner.

［第６実施形態］
次に、ディスクアレイスイッチ２０によるミラーリングの処理について、第６実施形態として説明する。ここで説明するミラーリングとは、２台のディスクアレイサブセットの２つの独立ＬＵにより二重書きをサポートする方法であり、ディスクアレイサブセットのコントローラまで含めた二重化である。従って、信頼性は、ディスクのみの二重化とは異なる。 [Sixth Embodiment]
Next, mirroring processing by the disk array switch 20 will be described as a sixth embodiment. The mirroring described here supports dual writing using two independent LUs on two disk array subsets, so the duplication extends to the controllers of the disk array subsets. Its reliability therefore differs from that of duplicating the disks alone.

本実施形態におけるシステムの構成は図１に示すものと同じである。図１に示す構成おいて、ディスクアレイサブセット“＃０”と“＃１”は全く同一のＬＵ構成を備えており、この２つのディスクアレイサブセットがホスト３０からは１つのディスクアレイとして見えるものとする。便宜上、ミラーリングされたディスクアレイサブセットのペアの番号を“＃０１”と呼ぶ。また、各ディスクアレイサブセットのＬＵ“＃０”とＬＵ“＃１”によってミラーリングペアが形成され、このＬＵのペアを便宜上、ＬＵ“＃０１”と呼ぶ。ＤＣＴ２０２７のホストＬＵ構成テーブル20271上でＬＵ＃０１を管理するための情報は、CLU Classに「Mirrored」が設定され、LU Info.として、ＬＵ＃０とＬＵ＃１に関する情報が設定される。その他の各部の構成は第１実施形態と同様である。 The system configuration in the present embodiment is the same as that shown in FIG. In the configuration shown in FIG. 1, the disk array subsets “# 0” and “# 1” have exactly the same LU configuration, and these two disk array subsets appear to the host 30 as one disk array. To do. For convenience, the number of the mirrored disk array subset pair is referred to as “# 01”. Further, a mirroring pair is formed by LU “# 0” and LU “# 1” of each disk array subset, and this LU pair is called LU “# 01” for convenience. As information for managing LU # 01 on the host LU configuration table 20271 of the DCT 2027, “Mirrored” is set in CLU Class, and information on LU # 0 and LU # 1 is set as LU Info. The configuration of other parts is the same as that of the first embodiment.

本実施形態における各部の動作は、第１実施例とほぼ同様である。以下、第１実施形態と相違する点について、ディスクアレイスイッチ２０のホストＩ／Ｆノード２０３の動作を中心に説明する。図２０は、本実施形態におけるライト動作時に転送されるフレームのシーケンスを示す模式図、図２１、２２は、ライト動作時におけるホストＩ／Ｆノード２０３による処理の流れを示すフローチャートである。 The operation of each part in the present embodiment is substantially the same as in the first example. Hereinafter, differences from the first embodiment will be described focusing on the operation of the host I / F node 203 of the disk array switch 20. FIG. 20 is a schematic diagram showing a sequence of frames transferred during a write operation in the present embodiment, and FIGS. 21 and 22 are flowcharts showing a processing flow by the host I / F node 203 during the write operation.

ライト動作時、ホスト３０が発行したライトコマンドフレーム（FCP_CMD）は、ＩＣ２０２３により受信される（図２０の矢印（ａ）：ステップ21001）。ＩＣ２０２３により受信されたライトコマンドフレームは、第１実施形態で説明したリード動作時におけるステップ20002 - 20005と同様に処理される（ステップ21002 - 21005）。 During a write operation, the write command frame (FCP_CMD) issued by the host 30 is received by the IC 2023 (arrow (a) in FIG. 20: step 21001). The write command frame received by the IC 2023 is processed in the same manner as steps 20002 - 20005 of the read operation described in the first embodiment (steps 21002 - 21005).

ＳＣ２０２２は、ＳＰ２０２１を使ってＤＣＴ２０２７を検索し、ミラー化されたディスクアレイサブセット“＃０１”のＬＵ“＃０１”へのライトアクセス要求であることを認識する（ステップ21006）。ＳＣ２０２２は、ＦＢ２０２５上に、受信したコマンドフレームの複製を作成する（ステップ21007）。ＳＣ２０２２は、ＤＣＴ２０２７に設定されている構成情報に基づいてコマンドフレームの変換を行い、ＬＵ“＃０”とＬＵ“＃１”の両者への別々のコマンドフレームを作成する（ステップ21008）。ここで、ＬＵ“＃０”を主ＬＵ、ＬＵ“＃１”を従ＬＵと呼び、コマンドフレームにもそれぞれ主コマンドフレーム、従コマンドフレームと呼ぶ。そして、両者別々にＥＴ２０２６にエクスチェンジ情報を格納し、ディスクアレイサブセット“＃０”およびディスクアレイサブセット“＃１”に対し作成したコマンドフレームを発行する（図２０の矢印（ｂ０）（ｂ１）：ステップ21009）。 The SC 2022 searches the DCT 2027 using the SP 2021 and recognizes that it is a write access request to the LU “# 01” of the mirrored disk array subset “# 01” (step 21006). The SC 2022 creates a copy of the received command frame on the FB 2025 (step 21007). The SC 2022 converts the command frame based on the configuration information set in the DCT 2027, and creates separate command frames for both LU “# 0” and LU “# 1” (step 21008). Here, LU “# 0” is called a main LU, LU “# 1” is called a sub-LU, and command frames are also called a main command frame and a sub-command frame, respectively. Then, the exchange information is separately stored in the ET 2026, and the created command frames are issued to the disk array subset “# 0” and the disk array subset “# 1” (arrows (b0) and (b1) in FIG. 20: step). 21009).

各ディスクアレイサブセット“＃０”、“＃１”は、コマンドフレームを受信し、それぞれ独立にデータ転送準備完了フレーム（FCP_XFER_RDY）をディスクアレイスイッチ２０に送信する（図２０の矢印（ｃ０）（ｃ１））。ディスクアレイスイッチ２０では、ホストＩ／Ｆノード２０３が、第１実施形態におけるリード動作のステップ20011 - 20013と同様の処理により転送されてきたデータ転送準備完了フレームを処理する（ステップ21011 - 21013）。 Each of the disk array subsets “#0” and “#1” receives its command frame and independently sends a data transfer ready frame (FCP_XFER_RDY) to the disk array switch 20 (arrows (c0) and (c1) in FIG. 20). In the disk array switch 20, the host I/F node 203 processes the transferred data transfer ready frames in the same manner as steps 20011 - 20013 of the read operation in the first embodiment (steps 21011 - 21013).

各ディスクアレイサブセットからのデータ転送準備完了フレームがそろった段階で（ステップ21014）、ＳＣ２０２２は、主データ転送準備完了フレームに対する変換を実施し（ステップ21015）、ＩＣ２０２３により変換後のフレームをホスト３０に送信する（図２０の矢印（ｄ）：ステップ21016）。 Once the data transfer ready frames from both disk array subsets have arrived (step 21014), the SC 2022 converts the main data transfer ready frame (step 21015), and the IC 2023 sends the converted frame to the host 30 (arrow (d) in FIG. 20: step 21016).

ホスト３０は、データ転送準備完了フレームを受信した後、ライトデータ送信のため、データフレーム（FCP_DATA）をディスクアレイスイッチ２０に送信する（図２０の矢印（ｅ））。ホスト３０からのデータフレームは、ＩＣ２０２３により受信されると（ステップ21031）、リードコマンドフレームやライトコマンドフレームと同様に、ＦＢ２０２５に格納され、ＣＲＣ検査、フレームヘッダの解析が行われる（ステップ21032、21033）。フレームヘッダの解析結果に基づき、ＥＴ２０２６がＳＰ２０２１により検索され、エクスチェンジ情報が獲得される（ステップ21034）。 After receiving the data transfer preparation completion frame, the host 30 transmits a data frame (FCP_DATA) to the disk array switch 20 for transmission of write data (arrow (e) in FIG. 20). When the data frame from the host 30 is received by the IC 2023 (step 21031), the data frame is stored in the FB 2025 in the same manner as the read command frame and the write command frame, and CRC check and frame header analysis are performed (steps 21032 and 21033). ). Based on the analysis result of the frame header, the ET 2026 is searched by the SP 2021 to obtain exchange information (step 21034).

ＳＣ２０２２は、ライトコマンドフレームのときと同様に複製を作成し（ステップ21035）、その一方をディスクアレイサブセット“＃０”内のＬＵ“＃０”に、他方をディスクアレイサブセット“＃１”内のＬＵ“＃１”に向けて送信する（図２０の矢印（ｆ０）（ｆ１）：ステップ21037）。 The SC 2022 creates a replica in the same manner as in the write command frame (step 21035), one of them in the LU “# 0” in the disk array subset “# 0” and the other in the disk array subset “# 1”. Transmission is performed toward LU “# 1” (arrows (f0) and (f1) in FIG. 20: step 21037).

ディスクアレイサブセット“＃０”、“＃１”は、各々、データフレームを受信し、ディスクユニット１０４に対しそれぞれライトし、ステータスフレーム（FCP_RSP）をディスクアレイスイッチ２０に送信する。 Each of the disk array subsets “# 0” and “# 1” receives the data frame, writes it to the disk unit 104, and transmits a status frame (FCP_RSP) to the disk array switch 20.

ＳＣ２０２２は、ディスクアレイサブセット“＃０”、“＃１”それぞれからステータスフレームを受信すると、それらのステータスフレームから拡張ヘッダを外してフレームヘッダを再現し、ＥＴ２０２６からエクスチェンジ情報を獲得する（ステップ21041、21042）。 When the SC 2022 receives status frames from the disk array subsets “# 0” and “# 1”, the SC 2022 removes the extension header from the status frames, reproduces the frame header, and obtains exchange information from the ET 2026 (step 21041, 21042).

ディスクアレイサブセット“＃０”、“＃１”の両者からのステータスフレームが揃うと（ステップ21043）、ステータスが正常終了であることを確認のうえ、ＬＵ“＃０”からの主ステータスフレームに対する変換を行い（ステップ21044）、従ステータスフレームを消去する（ステップ21045）。そして、ＩＣ２０２３は、正常終了を報告するためのコマンドフレームをホストに送信する（図２０の矢印（ｈ）：ステップ21046）。最後にＳＰ２０２１は、ＥＴ２０２６のエクスチェンジ情報を消去する（ステップ21047）。 Once the status frames from both disk array subsets “#0” and “#1” have arrived (step 21043), the status is confirmed to be normal completion, the main status frame from LU “#0” is converted (step 21044), and the sub status frame is discarded (step 21045). The IC 2023 then sends a command frame for reporting normal completion to the host (arrow (h) in FIG. 20: step 21046). Finally, the SP 2021 deletes the exchange information from the ET 2026 (step 21047).

以上でミラーリング構成におけるライト処理が終了する。ミラーリングされたＬＵ“＃０１”に対するリード処理は、データの転送方向が異なるだけで、上述したライト処理とほぼ同様に行われるが、ライトとは異なり、２台のディスクアレイサブセットにリードコマンドを発行する必要はなく、どちらか一方に対してコマンドフレームを発行すればよい。たとえば、常に主ＬＵに対してコマンドフレームを発行してもよいが、高速化のため、主／従双方のＬＵに対して、交互にコマンドフレームを発行するなどにより、負荷を分散すると有効である。 This completes the write processing in the mirroring configuration. Read processing for the mirrored LU “#01” is performed in almost the same way as the write processing described above, differing only in the direction of data transfer. Unlike a write, however, the read command does not need to be issued to both disk array subsets; it is enough to issue the command frame to either one. For example, the command frame may always be issued to the main LU, but for higher speed it is effective to distribute the load, for instance by issuing command frames alternately to the main and sub LUs.
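
The synchronous mirrored write and the alternating read can be summarized in a short sketch. It is a simplified model rather than the switch's actual frame processing: the classes `InMemoryLU` and `MirroredLU`, the `"GOOD"` status string, and the dictionary-backed storage are all assumptions introduced for illustration.

```python
import itertools


class InMemoryLU:
    """Stand-in for one LU on one disk array subset (illustrative only)."""

    def __init__(self):
        self.blocks = {}
        self.reads = 0

    def write(self, block, data):
        self.blocks[block] = data
        return "GOOD"

    def read(self, block):
        self.reads += 1
        return self.blocks[block]


class MirroredLU:
    """Pair of LUs: index 0 is the main LU, index 1 is the sub LU."""

    def __init__(self, main, sub):
        self.members = [main, sub]
        self._next_reader = itertools.cycle([0, 1])

    def write(self, block, data):
        # Synchronous control: the write is duplicated to both members and
        # completion is reported only after both statuses are in.
        statuses = [m.write(block, data) for m in self.members]
        if any(s != "GOOD" for s in statuses):
            raise IOError("mirrored write failed on at least one member")
        return statuses[0]  # the main status is reported, the sub one dropped

    def read(self, block):
        # Reads go to one member only, alternating to spread the load.
        return self.members[next(self._next_reader)].read(block)
```

Alternation is only one possible read policy; always reading from the main LU, as the text also permits, would replace the cycle with a constant index.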

上述した処理では、ステップ21014、及びステップ21043で２台のディスクアレイサブセット“＃０”、“＃１”の応答を待ち、両者の同期をとって処理が進められる。このような制御では、双方のディスクアレイサブセットでの処理の成功が確認されてから処理が進むため、エラー発生時の対応が容易になる。その一方で、全体の処理速度が、どちらか遅いほうの応答に依存してしまうため、性能が低下するという欠点がある。 In the above-described processing, in Step 21014 and Step 21043, the response of the two disk array subsets “# 0” and “# 1” is waited for, and the processing proceeds with the two synchronized. In such control, the processing proceeds after confirming the success of the processing in both disk array subsets, so that it is easy to cope with an error. On the other hand, since the overall processing speed depends on the slower response, there is a disadvantage that the performance is lowered.

この問題を解決するため、ディスクアレイスイッチにおいて、ディスクアレイサブセットの応答を待たずに次の処理に進んだり、ディスクアレイサブセットのどちらか一方からの応答があった時点で次の処理に進む「非同期型」の制御をすることも可能である。非同期型の制御を行った場合のフレームシーケンスの一例を、図２０において破線矢印で示す。 To solve this problem, the disk array switch can also perform “asynchronous” control, in which it proceeds to the next step without waiting for the responses of the disk array subsets, or proceeds as soon as either one of them has responded. An example of the frame sequence under asynchronous control is indicated by the dashed arrows in FIG. 20.

破線矢印で示されるフレームシーケンスでは、ステップ21016で行われるホストへのデータ転送準備完了フレームの送信が、ステップ21009の処理の後、ディスクアレイサブセット１０からのデータ転送準備完了フレームを待たずに実施される。この場合、ホストに送信されるデータ転送準備完了フレームは、ディスクアレイスイッチ２０のＳＣ２０２２により生成される（破線矢印（ｄ′））。 In the frame sequence indicated by the dashed arrows, the transmission of the data transfer ready frame to the host in step 21016 is carried out after the processing of step 21009 without waiting for the data transfer ready frames from the disk array subsets 10. In this case, the data transfer ready frame sent to the host is generated by the SC 2022 of the disk array switch 20 (dashed arrow (d′)).

ホスト３０からは、破線矢印（ｅ′）で示されるタイミングでデータフレームがディスクアレイスイッチ２０に転送される。ディスクアレイスイッチ２０では、このデータフレームが一旦ＦＢ２０２５に格納される。ＳＣ２０２２は、ディスクアレイサブセット１０からのデータ転送準備完了フレームの受信に応答して、データ転送準備完了フレームが送られてきたディスクアレイサブセット１０に対し、ＦＢ２０２５に保持されたデータフレームを転送する（破線矢印（ｆ０′）、（ｆ１′））。 From the host 30, the data frame is transferred to the disk array switch 20 at the timing indicated by the dashed arrow (e ′). In the disk array switch 20, this data frame is temporarily stored in the FB 2025. In response to receiving the data transfer ready frame from the disk array subset 10, the SC 2022 transfers the data frame held in the FB 2025 to the disk array subset 10 to which the data transfer ready frame has been sent (dashed line). Arrows (f0 '), (f1')).

ディスクアレイスイッチ２０からホスト３０への終了報告は、双方のディスクアレイサブシステム１０からの報告（破線矢印（ｇ０′）、（ｇ１′））があった時点でおこなわれる（破線矢印（ｈ′））。このような処理により、図２０に示される時間Ｔａの分だけ処理時間を短縮することが可能である。 The disk array switch 20 reports completion to the host 30 (dashed arrow (h′)) once both disk array subsystems 10 have reported (dashed arrows (g0′), (g1′)). This processing shortens the processing time by the time Ta shown in FIG. 20.

ディスクアレイスイッチ２０とディスクアレイサブセット１０間のフレーム転送の途中でエラーが発生した場合、以下の処理が実施される。 When an error occurs during frame transfer between the disk array switch 20 and the disk array subset 10, the following processing is performed.

実行中の処理がライト処理の場合、エラーが発生したＬＵに対し、リトライ処理が行われる。リトライが成功すれば、処理はそのまま継続される。あらかじめ設定された規定の回数のリトライが失敗した場合、ディスクアレイスイッチ２０は、このディスクアレイサブセット１０（もしくはＬＵ）に対するアクセスを禁止し、そのことを示す情報をＤＣＴ２０２７に登録する。また、ディスクアレイスイッチ２０は、ＭＰ２００、通信コントローラ２０４を経由して、ディスクシステム構成手段７０にそのことを通知する。 If the process being executed is a write process, a retry process is performed on the LU in which an error has occurred. If the retry is successful, the process continues. When a predetermined number of retries set in advance fail, the disk array switch 20 prohibits access to the disk array subset 10 (or LU) and registers information indicating this in the DCT 2027. Further, the disk array switch 20 notifies the disk system configuration means 70 via the MP 200 and the communication controller 204.

ディスクシステム構成手段７０は、この通知に応答して管理端末５にアラームを発行する。これにより管理者は、トラブルが発生したことを認識できる。その後、ディスクアレイスイッチ２０は、正常なディスクアレイサブセットを用いて運転を継続する。ホスト３０は、エラーが発生したことを認識することはなく、処理を継続できる。 The disk system configuration means 70 issues an alarm to the management terminal 5 in response to this notification. As a result, the administrator can recognize that a trouble has occurred. Thereafter, the disk array switch 20 continues to operate using the normal disk array subset. The host 30 does not recognize that an error has occurred and can continue processing.
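
The error handling for writes can be illustrated with the following sketch, which retries a failed transfer a fixed number of times and then takes the member out of service. The class `MirrorErrorHandler`, its `blocked` set (standing in for the access-prohibited information registered in the DCT), and its `alarms` list (standing in for the notification to the configuration management means) are hypothetical names. The default of three retries is an arbitrary choice, since the text says only that the count is set in advance.

```python
class MirrorErrorHandler:
    """Retry-then-isolate policy for one mirrored pair (illustrative)."""

    def __init__(self, max_retries=3):
        self.max_retries = max_retries
        self.blocked = set()  # subsets to which access is now prohibited
        self.alarms = []      # notifications raised toward the administrator

    def write(self, subset_id, send):
        """Try send() once plus up to max_retries retries.

        send is a callable returning True on success. Returns True if the
        write reached the subset, False if the subset is out of service.
        """
        if subset_id in self.blocked:
            return False
        for _ in range(1 + self.max_retries):
            if send():
                return True
        # Every attempt failed: prohibit further access and raise an alarm.
        self.blocked.add(subset_id)
        self.alarms.append(subset_id)
        return False
```

A caller would keep serving the host from the remaining healthy subset whenever `write` returns False, which is what keeps the failure invisible to the host.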

本実施形態によれば、２台のディスクアレイサブシステムでミラー構成を実現できるので、ディスクの耐障害性を上げることことができる。また、ディスクアレイコントローラ、ディスクアレイＩ／Ｆ、及びディスクアレイＩ／Ｆノードの耐障害性を上げることができ、内部バスの二重化等することなくディスクアレイシステム全体の信頼性を向上させることができる。 According to this embodiment, since the mirror configuration can be realized by two disk array subsystems, the fault tolerance of the disk can be increased. Further, the fault tolerance of the disk array controller, the disk array I / F, and the disk array I / F node can be improved, and the reliability of the entire disk array system can be improved without duplicating the internal bus. .

[Seventh Embodiment]
Next, a method of integrating three or more disk array subsets 10 into one logical group of disk array subsets will be described. In this embodiment, data is distributed and stored across a plurality of disk array subsets 10. This spreads accesses over the disk array subsets and prevents accesses from concentrating on a particular disk array subset, thereby improving total throughput. In this embodiment, this striping is performed by the disk array switch.

FIG. 23 is an address map of the disk array system 1 in this embodiment. The address space of the disk array subsets 10 is striped with a stripe size S. The address space of the disk array system 1 as seen from the host is distributed, in units of the stripe size S, across the disk array subsets "#0", "#1", "#2", and "#3". The stripe size S may be chosen freely, but it should not be too small. If the stripe size S is too small, overhead may be incurred when stripe straddling occurs, that is, when the data to be accessed belongs to more than one stripe. Increasing the stripe size S reduces the probability of stripe straddling and is therefore preferable for performance. The number of LUs can be set arbitrarily.

The operation of the host I/F node 203 in this embodiment is described below with reference to the flowchart shown in FIG. 24, focusing on the differences from the first embodiment. In this embodiment, in the host LU configuration table 20271 of the DCT 2027, "Striped" is set in the CLU Class and the stripe size "S" is set in the CLU Stripe Size of the information on a striped host LU.

When the host 30 issues a command frame, the disk array switch 20 receives it at the IC 2023 of the host I/F node 203 (step 22001). The SC 2022 receives this command frame from the IC 2023, searches the DCT 2027 using the SP 2021, and recognizes that striping is required (step 22005).

Next, the SC 2022 searches the DCT 2027 using the SP 2021, obtains from the configuration information, which includes the stripe size S, the stripe number of the stripe to which the data to be accessed belongs, and identifies the disk array subset 10 in which this stripe is stored (step 22006). Stripe straddling may occur at this point; the processing for that case is described later. If no stripe straddling occurs, the SC 2022 converts the command frame based on the calculation result of the SP 2021 (step 22007) and stores the exchange information in the ET 2026 (step 22008). Thereafter, the same processing as in the first embodiment is performed.
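As an illustration of the calculation in step 22006, the following sketch derives the stripe number, the target disk array subset, and the address inside that subset from a host logical block address. It assumes that stripes are placed on the subsets in round-robin order, which the text does not state explicitly; the function names `locate` and `straddles` are hypothetical.

```python
from typing import Tuple


def locate(lba: int, stripe_size: int, n_subsets: int) -> Tuple[int, int, int]:
    """Map a host LBA to (stripe number, subset number, LBA inside the subset).

    Assumes round-robin placement: stripe k is stored on subset k % n_subsets.
    """
    stripe_no = lba // stripe_size
    offset = lba % stripe_size
    subset = stripe_no % n_subsets
    subset_lba = (stripe_no // n_subsets) * stripe_size + offset
    return stripe_no, subset, subset_lba


def straddles(lba: int, length: int, stripe_size: int) -> bool:
    """True if the blocks [lba, lba + length) belong to more than one stripe."""
    return lba // stripe_size != (lba + length - 1) // stripe_size
```

With a stripe size of 128 blocks and four subsets, for example, host block 130 falls in stripe 1 on subset "#1", and a 16-block access starting at block 120 straddles two stripes.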

When stripe straddling occurs, the SP 2021 generates two command frames, for example by duplicating the command frame issued by the host 30. The frame header, frame payload, and so on of each generated command frame are newly set. As in the sixth embodiment, the SC 2022 could instead create a copy of the command frame and then convert it, but here the frames are assumed to be newly created by the SP 2021. Once the two command frames have been generated, the SC 2022 transmits them to the respective disk array subsets 10.
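The splitting of a straddling access into two commands might look like the sketch below. It is an illustration under stated assumptions: round-robin stripe placement, an access that spans at most two stripes (matching the two command frames mentioned in the text), and a hypothetical function name `split_command`. Frame headers and payload fields are not modelled.

```python
from typing import List, Tuple


def split_command(lba: int, length: int, stripe_size: int,
                  n_subsets: int) -> List[Tuple[int, int, int]]:
    """Return (subset, LBA inside the subset, block count) per command to send."""
    if length <= 0:
        raise ValueError("length must be positive")
    boundary = (lba // stripe_size + 1) * stripe_size   # first block of the next stripe
    end = lba + length
    if end <= boundary:
        pieces = [(lba, length)]                         # no straddling: one command
    elif end <= boundary + stripe_size:
        pieces = [(lba, boundary - lba), (boundary, end - boundary)]
    else:
        raise ValueError("request spans more than two stripes")
    result = []
    for start, count in pieces:
        stripe_no = start // stripe_size
        subset = stripe_no % n_subsets                   # assumed round-robin placement
        subset_lba = (stripe_no // n_subsets) * stripe_size + start % stripe_size
        result.append((subset, subset_lba, count))
    return result
```

For a 16-block access at host block 120 with a 128-block stripe, this yields one 8-block command for subset "#0" and one 8-block command for subset "#1".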

Thereafter, data transfer is performed as in the first embodiment. Unlike the first or sixth embodiment, however, in this embodiment the data itself must be transferred between one host 30 and two disk array subsets 10. In read processing, for example, all of the data frames transferred from the two disk array subsets 10 must be forwarded to the host 30. In doing so, the SC 2022 attaches the appropriate exchange information to the data frames arriving from each disk array subset 10, in accordance with the exchange information registered in the ET 2026, and transmits them to the host 30 in the appropriate order.
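One way to picture the ordering step for read data is the sketch below. It assumes, purely for illustration, that the exchange information records the host-relative offset at which each subset's portion begins and that all frames are buffered before forwarding; the function `merge_read_frames` and its tuple layout are hypothetical and do not reflect the actual frame format.

```python
from typing import Dict, Iterable, List, Tuple


def merge_read_frames(frames: Iterable[Tuple[int, int, bytes]],
                      base_offset: Dict[int, int]) -> List[Tuple[int, bytes]]:
    """Order data frames from several subsets by their offset as seen by the host.

    frames: (subset, offset within that subset's partial transfer, payload).
    base_offset: host-relative offset where each subset's portion starts.
    """
    host_frames = [(base_offset[subset] + rel, payload)
                   for subset, rel, payload in frames]
    host_frames.sort(key=lambda f: f[0])     # ascending host-relative offset
    return host_frames
```

A real switch would more likely forward frames as they become deliverable rather than buffering the whole transfer; the sort is used here only to keep the example short.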

In write processing, as with the command frame, the data is divided into two data frames, which are transferred to the corresponding disk array subsets 10. Ordering control of the data frames is not essential if the host or the disk array subsets support out-of-order processing, known as the Out of Order function.

Finally, when all data transfers have completed and the disk array switch 20 has received the two status frames from the disk array subsets 10, the SP 2021 (or the SC 2022) creates a status frame for the host 30 and transmits it to the host 30 through the IC 2023.
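The text does not say how the two status frames are combined into one. A plausible rule, shown here only as an assumption, is to wait until both have arrived and to report success only if every subset reported success. The function name `combine_status` is hypothetical; the status strings follow the SCSI status names GOOD and CHECK CONDITION.

```python
from typing import List, Optional


def combine_status(received: List[str], expected: int = 2) -> Optional[str]:
    """Return the status to report to the host, or None while still waiting."""
    if len(received) < expected:
        return None                      # not all subsets have answered yet
    if all(status == "GOOD" for status in received):
        return "GOOD"
    return "CHECK CONDITION"             # assumed: any failure fails the whole access
```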

According to this embodiment, accesses can be distributed across a plurality of disk array subsets, so total throughput can be improved and average access latency can be reduced.

[Eighth Embodiment]
Next, the creation of a replica between two disk array systems (or disk array subsets) will be described as the eighth embodiment. In the system described here, one of the two disk array systems is placed at a remote location to provide resilience against failure of the other disk array system due to a natural disaster or the like. Such a countermeasure is called disaster recovery, and the creation of a replica on a remote disk array system is called remote copy.

In the mirroring described in the sixth embodiment, the mirror is formed from disk array subsets 10 installed at essentially the same geographical location, so the disk array I/F 21 may be Fibre Channel. However, when the disk array (disk array subset) that is the remote copy target is installed at a remote site more than 10 km away, frames cannot be transferred over Fibre Channel without relaying. For disaster recovery, the distance between the sites is usually several hundred kilometers or more. Connecting the disk arrays by Fibre Channel is therefore impractical, and a high-speed public line using ATM (Asynchronous Transfer Mode) or the like, satellite communication, or similar is used instead.

FIG. 25 shows a configuration example of the disaster recovery system in this embodiment.

Reference numeral 81 denotes site A and 82 denotes site B; the two sites are geographically remote from each other. Reference numeral 9 denotes a public line, through which ATM packets pass. Site A 81 and site B 82 each have a disk array system 1. Here, site A 81 is the primary site used in normal operation, and site B 82 is a remote disaster recovery site used when site A 81 goes down due to a disaster or the like.

The contents of disk array subsets "#0" and "#1" of the disk array system 1 at site A 81 are copied to the remote copy disk array subsets "#0" and "#1" of the disk array system 1 at site B 82. Among the I/F nodes of the disk array switch 20, the one connected to the remote site is connected to the public line 9 using ATM. This node is called the ATM node 205. The ATM node 205 is configured in the same way as the host I/F node shown in FIG. 5, and its IC 2023 performs ATM-Fibre Channel conversion. This conversion is realized by the same method as the SCSI-Fibre Channel conversion in the fourth embodiment.

Remote copy processing in this embodiment is similar to the mirroring processing in the sixth embodiment. The differences from the mirroring processing of the sixth embodiment are described below.

When the host 30 issues a write command frame, the disk array system 1 at site A 81 duplicates the frame as in the sixth embodiment and transfers one copy to its own disk array subset 10. The other copy is converted from a Fibre Channel frame into ATM packets by the ATM node 205 and sent to site B 82 over the public line 9.

At site B 82, the ATM node 205 of the disk array switch 20 receives these packets. The IC 2023 of the ATM node 205 reconstructs the Fibre Channel frame from the ATM packets and transfers it to the SC 2022. The SC 2022 performs frame conversion in the same way as when a write command is received from the host 30 and transfers the frame to the disk array subset for remote copy. Thereafter, remote copy is realized by performing Fibre Channel-ATM conversion at the ATM node 205 for the data transfer ready frames, data frames, and status frames alike and carrying out the same frame transfer processing.

When the host 30 issues a read command frame, the disk array switch 20 transfers the command frame only to the disk array subset 10 at its own site and reads data only from that disk array subset 10. The operation in this case is the same as in the first embodiment.
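The difference between write and read handling in remote copy comes down to which nodes receive the command frame. The sketch below captures only that routing decision; the function `route_command` and the string identifiers for the destinations are hypothetical, and the Fibre Channel-ATM conversion itself is not modelled.

```python
from typing import List


def route_command(command: str, local_subset: str, atm_node: str) -> List[str]:
    """Return the destinations of a host command at the primary site."""
    if command == "WRITE":
        # Duplicate the frame: one copy stays local, one goes to the remote site.
        return [local_subset, atm_node]
    if command == "READ":
        # Reads are served by the local disk array subset only.
        return [local_subset]
    raise ValueError(f"unsupported command: {command}")
```

Because reads never cross the public line, the long-distance link affects only write traffic in this scheme.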

According to this embodiment, user data is backed up in real time, providing resilience against site failures and disk array system failures caused by natural disasters or the like.

[Ninth Embodiment]
Next, the integration of a plurality of LUs contained in a single disk array subset 10 will be described. For example, in disk devices for mainframes, the maximum logical volume size is set to 2 GB in order to maintain compatibility with past systems. When such a disk array system is also shared with open systems, the LUs are subject to the same logical volume size limit, so the host sees a large number of small LUs. This makes operation difficult as capacities grow. The function of the disk array switch 20 is therefore used to integrate these logical volumes (that is, LUs) into one large integrated LU. In this embodiment, the integrated LU is created by the disk array switch 20.

LU integration in this embodiment is the same as the creation of an integrated LU from a plurality of disk array subsets 10 in the first embodiment. The only difference is that the LUs being integrated belong to the same disk array subset 10. The operation as a disk array system is exactly the same as in the first embodiment.
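Assuming that the integrated LU is formed by simple concatenation of the member LUs (this passage does not spell out the mapping), the address conversion from the integrated LU to a member LU could be sketched as follows. The function name `map_integrated_lba` and the example sizes are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Tuple


def map_integrated_lba(lba: int, lu_sizes: List[int]) -> Tuple[int, int]:
    """Map an LBA of the integrated LU to (member LU index, LBA inside that LU).

    Assumes the member LUs are concatenated in list order.
    """
    ends = list(accumulate(lu_sizes))        # exclusive end address of each member LU
    if lba < 0 or not ends or lba >= ends[-1]:
        raise ValueError("LBA outside the integrated LU")
    index = bisect_right(ends, lba)          # first member whose end lies beyond lba
    start = ends[index - 1] if index else 0
    return index, lba - start
```

With member LUs of 100, 100, and 50 blocks, block 100 of the integrated LU maps to block 0 of the second member, and block 249 maps to block 49 of the third.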

By integrating a plurality of LUs contained in the same disk array subset 10 into one large LU in this way, the host no longer needs to manage a large number of LUs, and a disk array system with excellent operability and reduced management cost can be constructed.

[Tenth Embodiment]
Next, a method of setting alternate paths by the disk array switch 20 will be described with reference to FIG. 26.

The configuration of each part of the computer system shown in FIG. 26 is the same as in the first embodiment. Here, it is assumed that two hosts 30 are configured to access the disk array subset 10 through different disk array I/Fs 21. The figure shows only as many disk array subsets, host I/F nodes 203, and disk array I/F nodes 202 of the disk array switch 20 as are needed for this description.

The disk array subset 10 has the same configuration as in FIG. 2, and its two disk array I/F controllers are each connected to one disk array switch 20. Alternate paths for the disk array I/Fs 21 are set in the DCT 2027 of each node of the disk array switch 20. An alternate path is a substitute path provided so that access remains possible even when a failure occurs on a given path. Here, the alternate path of disk array I/F "#0" is set to disk array I/F "#1", and the alternate path of disk array I/F "#1" is set to disk array I/F "#0". Alternate paths are likewise set between the upper adapters, between the cache/alternate memories, and between the lower adapters in the disk array subset 10.

Next, the alternate path setting operation is described on the assumption that, as shown in FIG. 26, the disk array I/F 21 connected to upper adapter "#1" of the disk array subset 10 has been disconnected and a failure has occurred. At this point, host "#1", which uses the failed disk array I/F 21, can no longer access the disk array subset 10. The disk array switch 20 detects the abnormality in frame transfer to and from the disk array subset 10 and, if retry processing does not restore it, recognizes that a failure has occurred on this path.

When a path failure occurs, the SP 2021 registers in the DCT 2027 that a failure has occurred on disk array I/F "#1" and that disk array I/F "#0" is to be used as the alternate path. From then on, the SC 2022 of the host I/F node 203 forwards frames from host "#1" to the disk array I/F node 202 connected to disk array I/F "#0".
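The alternate path bookkeeping described above can be illustrated with a small table keyed by disk array I/F number. This is a sketch under assumptions: the class `PathTable` and its methods are hypothetical stand-ins for the relevant entries of the DCT 2027, and only a single level of alternate is modelled.

```python
from typing import Dict, Set


class PathTable:
    """Hypothetical stand-in for the alternate path entries held in the DCT."""

    def __init__(self, alternates: Dict[int, int]) -> None:
        self.alternates = dict(alternates)   # path number -> alternate path number
        self.failed: Set[int] = set()

    def mark_failed(self, path: int) -> None:
        """Record that a failure was recognized on the given path."""
        self.failed.add(path)

    def resolve(self, path: int) -> int:
        """Return the path that frames intended for `path` should actually use."""
        if path not in self.failed:
            return path
        alternate = self.alternates.get(path)
        if alternate is None or alternate in self.failed:
            raise RuntimeError(f"no usable path for disk array I/F #{path}")
        return alternate
```

With disk array I/F "#0" and "#1" set as each other's alternates, marking "#1" as failed makes `resolve(1)` return 0, so frames from host "#1" are forwarded without any change on the host side.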

The upper adapter 101 of the disk array subset 10 takes over and processes commands from host "#1". The disk array switch 20 also notifies the disk array system configuration management means 70 that a failure has occurred, and the disk array system configuration management means 70 reports the failure to the administrator.

According to this embodiment, switching to the alternate path when a path failure occurs can be performed without the host being aware of it, and alternate path settings on the host side become unnecessary. This improves the availability of the system.

Each of the embodiments above describes a disk array system in which all of the storage media are disk devices. However, the present invention is not limited to this and applies equally when optical disk devices, tape devices, DVD devices, semiconductor storage devices, or the like are used as the storage media.

FIG. 1 is a block diagram of the computer system of the first embodiment.
FIG. 2 is a block diagram of the disk array subset of the first embodiment.
FIG. 3 is a block diagram of the disk array switch of the first embodiment.
FIG. 4 is a block diagram of the crossbar switch of the disk array switch in the first embodiment.
FIG. 5 is a block diagram of the host I/F node of the disk array switch in the first embodiment.
FIG. 6 is a diagram showing the structure of the system configuration table.
FIG. 7 is a diagram showing the structure of the subset configuration table.
FIG. 8 is a diagram showing the structure of a Fibre Channel frame.
FIG. 9 is a diagram showing the structure of a Fibre Channel frame header.
FIG. 10 is a diagram showing the structure of a Fibre Channel frame payload.
FIG. 11 is a schematic diagram showing the sequence of frames transferred over Fibre Channel during a read operation from the host.
FIG. 12 is a schematic diagram showing the correspondence among the host LU, the LUs of each disk array subset, and each disk unit.
FIG. 13 is a flowchart of processing in the host I/F node during write processing.
FIG. 14 is a diagram showing the structure of a switching packet.
FIG. 15 is a block diagram of a disk array system in which a plurality of disk array switches are cluster-connected.
FIG. 16 is a block diagram of the computer system in the second embodiment.
FIG. 17 is a block diagram of the interface controller of the disk array switch in the fourth embodiment.
FIG. 18 is a block diagram of the computer system in the fifth embodiment.
FIG. 19 is a screen diagram showing a display example of the logical connection configuration screen.
FIG. 20 is a schematic diagram showing the frame sequence in the sixth embodiment.
FIG. 21 is a flowchart of processing in the host I/F node during mirroring write processing in the sixth embodiment.
FIG. 22 is a flowchart of processing in the host I/F node during mirroring write processing in the sixth embodiment.
FIG. 23 is a schematic diagram showing the correspondence between the host LU and the LUs of each disk array subset in the seventh embodiment.
FIG. 24 is a flowchart showing the processing of the host I/F node in the seventh embodiment.
FIG. 25 is a block diagram of the disaster recovery system in the eighth embodiment.
FIG. 26 is an explanatory diagram of alternate path setting.

Explanation of symbols

1 ... Disk array system
5 ... Management terminal
10 ... Disk array subset
20 ... Disk array switch
30 ... Host computer
70 ... Disk array system configuration management means
200 ... Management processor
201 ... Crossbar switch
202 ... Disk array I/F node
203 ... Host I/F node
204 ... Communication controller

Claims

A storage system comprising:
A plurality of storage subsystems each including a control unit and a storage medium controlled by the control unit;
A first interface node connected to a computer and having configuration information of the storage system, including information on a plurality of storage areas in the storage system, and a switching controller that, in response to a frame transmitted from the computer, analyzes the frame and converts the frame based on the configuration information;
A plurality of second interface nodes each connected to any one of the storage subsystems; and
Switching means connected to the first interface node and the plurality of second interface nodes, for executing frame transfer between the first interface node and the plurality of second interface nodes.

The storage system of claim 1, wherein:
The first interface node has a packet generation unit that outputs a frame including node address information of the second interface node; and
The switching means executes frame transfer between the first interface node and the plurality of second interface nodes based on the node address information.

The storage system according to claim 2, wherein:
In response to the write command frame from the computer, the first interface node generates a plurality of copies of each of the write command frame and the data frame received from the computer. Here, each copy of the write data and the command frame has different node address information, and transfers the copy frame to the switching means, whereby the write command frame and the data are transferred. A storage system for transferring a frame to at least two storage subsystems.

The storage system according to claim 3, wherein:
The first interface node generates a plurality of copies of the read command frame in response to a read command frame from the computer that instructs to read data, wherein each of the copies of the read command frame is A storage system having different node address information and transferring the duplicate frame to the switching means, thereby transferring the read command frame to at least two storage subsystems.

The storage system according to claim 4, wherein:
The first interface node receives data frames transferred from at least two of the storage subsystems in response to the read command frame, and receives a plurality of data frames from the at least two storage subsystems to the computer A storage system characterized by selecting and transferring one of them.

The storage system according to claim 5, wherein:
The first interface node is accompanied by node address information of the second interface node connected to a predetermined storage subsystem in response to a read command frame sent from the computer and instructing data reading. A storage system for transferring the read command frame to the switching means.

The storage system according to claim 1, wherein
Here, the frame includes a frame header having an identifier characterizing a transfer destination and a transfer source, a frame payload having data to be transferred, and
Here, the switching controller sets the transfer destination identifier included in the frame header based on the configuration information of the storage system to the storage destination that is the transfer destination of the frame converted from the one representing the first interface node. A storage system characterized by converting to a subsystem representation.

The storage system according to claim 7, wherein
Here, the frame has first logical address information recognized from the computer, and the switching controller determines the first logical address information based on the configuration information of the storage system. The storage system is converted to second logical address information managed in the storage subsystem.

The storage system according to claim 1, further comprising:
A management processor connected to the switching means;
Here, the management processor receives the setting information for setting the configuration of the storage system input by an operator, and the configuration information based on the received setting information in response to the input of the setting information. A storage system that is set as a first interface node.

The storage system of claim 9, wherein:
The input configuration information includes information for restricting access from the computer to the plurality of storage subsystems.

The storage system of claim 1, wherein:
The information of the plurality of storage areas includes logical unit information of the plurality of storage subsystems.

The storage system according to claim 11, wherein:
The logical unit information includes information on a logical unit including at least two storage areas in a plurality of different storage subsystems.

A storage switch connected between a computer and a plurality of storage subsystems each including a control unit and a storage medium controlled by the control unit, the plurality of storage subsystems and the storage switch constituting a storage system, the storage switch comprising:
A first interface node connected to the computer and having configuration information for the storage system, including information on a plurality of storage areas in the storage system, and a switching controller that, in response to a frame transferred from the computer, analyzes the frame and converts the frame based on the configuration information for the storage system;
A plurality of second interface nodes each connected to any one of the storage subsystems; and
Switching means connected to the first interface node and the plurality of second interface nodes, for executing frame transfer between the first interface node and the plurality of second interface nodes.

The storage switch according to claim 13, wherein:
The first interface node includes a packet generation unit that outputs a frame including node address information of the second interface node; and
The switching means executes frame transfer between the first interface node and the plurality of second interface nodes based on the node address information.

The storage switch according to claim 14, wherein:
In response to the write command frame from the computer, the first interface node generates a plurality of copies of each of the write command frame and the data frame received from the computer. Here, each copy of the write data and the command frame has different node address information, and transfers the copy frame to the switching means, whereby the write command frame and the data are transferred. A storage switch characterized by transferring a frame to at least two storage subsystems.

The storage switch according to claim 15, wherein:
The first interface node generates a plurality of copies of the read command frame in response to a read command frame from the computer that instructs to read data, wherein each of the copies of the read command frame is A storage switch having different node address information, and transferring the duplicate frame to the switching means, thereby transferring the read command frame to at least two storage subsystems.

The storage switch of claim 16, wherein:
The first interface node receives data frames transferred from at least two of the storage subsystems in response to the read command frame, and receives a plurality of data frames from the at least two storage subsystems to the computer A storage switch characterized by selecting and transferring one of them.

The storage switch according to claim 15, wherein:
The first interface node is accompanied by node address information of the second interface node connected to a predetermined storage subsystem in response to a read command frame sent from the computer and instructing data reading. A storage switch, wherein the read command frame is transferred to the switching means.

The storage switch according to claim 13,
Here, the frame includes a frame header having an identifier characterizing a transfer destination and a transfer source, a frame payload having actual data to be transferred, and
Here, the switching controller sets the transfer destination identifier included in the frame header based on the configuration information of the storage system to the storage sub that is the transfer destination of the frame converted from the one representing the first interface node. A storage switch characterized by converting it into something that represents the system.

The storage switch according to claim 19, wherein
Here, the frame has first logical address information recognized from the computer in the frame payload,
In addition, here, the switching controller converts the first logical address information into second logical address information managed in the storage subsystem that is the frame transfer destination based on the configuration information of the storage system. Storage switch characterized by converting.

The storage switch according to claim 13, further comprising:
A management processor connected to the switching means;
Here, the management processor receives the setting information for setting the configuration of the storage system input by an operator, and the configuration information based on the received setting information in response to the input of the setting information. A storage switch that is set as a first interface node.

The storage switch according to claim 13, wherein:
The storage switch characterized in that the information of the plurality of storage areas includes logical unit information of the plurality of storage subsystems.

The storage switch according to claim 22, wherein:
The logical unit information includes information on a logical unit including at least two storage areas in a plurality of storage subsystems different from each other.

A storage system,
A plurality of storage subsystems each having a control unit and a storage medium controlled by the control unit;
A first interface node connected to the computer;
A plurality of second interface nodes each connected to any of the plurality of storage subsystems;
Switching means connected to the first interface node and the plurality of second interface nodes to execute frame transfer between the first interface node and the plurality of second interface nodes; and
A management processor connected to the switching means and having setting information including information on a plurality of storage areas of the storage system input by an operator;
here,
The storage system, wherein the management processor manages the configuration of the storage system based on the setting information.

A storage system,
A plurality of storage subsystems each having a control unit and a storage medium controlled by the control unit;
A plurality of first interface nodes each connected to a computer; a plurality of second interface nodes each connected to any one of the plurality of storage subsystems; the plurality of first interface nodes; A switch having switching means connected to the second interface node of
Here, the switch has configuration information of the storage system including information on a plurality of storage areas of the storage system.