JP2853624B2

JP2853624B2 - Data storage system

Info

Publication number: JP2853624B2
Application number: JP7330765A
Authority: JP
Inventors: 芳秀菊地
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1995-12-19
Filing date: 1995-12-19
Publication date: 1999-02-03
Anticipated expiration: 2015-12-19
Also published as: JPH09171479A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a data storage system that stores various kinds of data using storage devices such as disk devices, and in particular to a data storage system that prepares for system failures by storing a copy of the data held in one storage device in other storage devices.

[0002]

[Prior Art] In a system that stores data in storage devices such as disk devices, data is stored in a redundant configuration so that it is not lost even when a storage device fails. A conventional countermeasure against disk device failures that stores data redundantly across multiple disk devices is RAID (Redundant Array of Inexpensive Disks). RAID levels 1 through 5 are generally known.

[0003] Level 1 stores the same contents as the normally operated disk device in a disk device reserved for failures, and is called the mirror method. A mirror-method data storage system has a normal disk device for regular operation and a mirror disk device as a failure countermeasure. When the host computer writes data, the same data written to the normal disk device is also stored in the mirror disk device. When the normal disk device fails, the data is read from the mirror disk device, which avoids the situation in which the data can no longer be read.

[0004] Levels 2 through 5 generate parity information; when one disk device fails, the contents stored in the failed disk device are restored using the other disk devices and the parity information.

[0005] Japanese Unexamined Patent Publication No. 4-318640 discloses a data storage system that restores the stored contents of a failed disk device by a RAID method using parity information. When parity information is used, typically, for example, data is stored in three of four disk devices, and the remaining disk device stores the parity information corresponding to the data stored in the other three. A set of disk devices storing data and its parity information in this way is called an ECC group. When one of the three disk devices storing data fails, its data is restored from the data on the remaining two normal disk devices and the parity information.
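
The parity-based recovery described for the four-disk ECC group can be sketched as follows. This is a minimal illustration, assuming the parity is a byte-wise XOR of the data blocks (a common choice in RAID; the parity function of the cited publication is not stated in this text). The helper `xor_blocks` and the sample bytes are hypothetical.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Three data disks and one parity disk form one ECC group (sample contents are made up).
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
parity = xor_blocks(data)

# Assume the second data disk fails: rebuild it from the two survivors and the parity.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Under the XOR assumption, `rebuilt` equals the contents of the failed disk, which is why every surviving member of the ECC group has to be read during restoration.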

[0006] With eight disk devices, dividing them into ECC groups of four devices each concentrates the load of restoring a failed disk device's data within its own ECC group, so the restoration cannot be performed efficiently. In the prior art disclosed in Japanese Unexamined Patent Publication No. 4-318640, the storage area of each disk device is therefore divided into a plurality of blocks, and each block forms an ECC group with a different set of four disk devices. The load of restoring one disk device's data can thereby be distributed over the remaining seven disk devices.

[0007] In a data storage system using parity information, the parity information must be recalculated every time the stored contents of a disk device are updated, which makes high-speed writing difficult. In addition, when a disk device fails, its data must be restored before it can be read, so reads take a long time and high-speed reading may not be possible. The mirror method, by contrast, has high redundancy and therefore requires a large disk capacity for failure countermeasures, but has the advantage of allowing high-speed access.

[0008] FIG. 8 outlines the configuration of a conventional data storage system using the mirror method. The disk device 301 used in normal operation and the failure disk device 302 are each connected to a disk controller 304 through a bus 303. The disk controller 304 includes an interface circuit (not shown) for exchanging data with each disk device through the bus, and a CPU (central processing unit, also not shown) for controlling the reading and writing of data. When the disk controller 304 issues a write command to the normal disk device 301, the data is stored in the normal disk device 301 and simultaneously written to the failure disk device 302.

[0009] Writing the same data to two disk devices in this way makes the contents of the normal disk device 301 and the failure disk device 302 identical. Because the failure disk device 302 holds the same data as the normal disk device 301, the disk device 302 is called a mirror disk. When the normal disk device 301 fails, the disk controller 304 handles the failure by reading the data from the failure disk device 302.
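
The mirror behaviour of FIG. 8 can be sketched as follows: every write goes to both disks, and a read falls back to the mirror when the normal disk has failed. This is only an illustrative model; the classes `Disk` and `MirrorController` and the use of `IOError` to signal a failed device are assumptions, not part of the described system.

```python
class Disk:
    """In-memory stand-in for a disk device (hypothetical helper)."""

    def __init__(self):
        self.blocks = {}
        self.failed = False

    def write(self, address, data):
        if self.failed:
            raise IOError("disk failed")
        self.blocks[address] = data

    def read(self, address):
        if self.failed:
            raise IOError("disk failed")
        return self.blocks[address]


class MirrorController:
    """Writes every block to both disks; reads fall back to the mirror."""

    def __init__(self, normal, mirror):
        self.normal = normal
        self.mirror = mirror

    def write(self, address, data):
        self.normal.write(address, data)
        self.mirror.write(address, data)

    def read(self, address):
        try:
            return self.normal.read(address)
        except IOError:
            # The normal disk is unavailable, so serve the identical copy.
            return self.mirror.read(address)
```

Note that the single `MirrorController` object is itself a single point of failure in this model, which is the weakness the following paragraphs address.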

[0010] The data storage system shown in FIG. 8 addresses failures that occur in a disk device. However, a failure may also occur in the disk controller that controls reading and writing of the disk devices, in which case the contents of both the normal and the failure disk devices can become unreadable.

[0011] FIG. 9 outlines the configuration of a data storage system that can cope with a failure of a disk controller. In this system, a first normal disk device 311 and a first failure disk device 312 are connected to a first disk controller 314 by a bus 313. The disk devices 311 and 312 are also connected to a second disk controller 316 through a bus 315. Similarly, a second normal disk device 321 and a second failure disk device 322 are connected to the second disk controller 316 by a bus 323 and to the first disk controller 314 by a bus 324. The first and second disk controllers 314 and 316 are connected to each other via a network 325. A host computer 326, which issues data read and write requests, is also connected to the network 325. The disk controllers function as servers and the host computer as a client.

[0012] When the first disk controller 314 writes data to the first normal disk device 311, it also writes the same data to the first failure disk device 312. Similarly, when the second disk controller 316 writes data to the second normal disk device 321, it also writes the same data to the second failure disk device 322. In this way, the first normal disk device 311 and the first failure disk device 312 hold the same contents, as do the second normal disk device 321 and the second failure disk device 322.

[0013] Suppose that the host computer 326 for overall management, connected to the network 325, requests data stored in the first normal disk device 311. When the system is operating normally, the first disk controller 314 reads the contents of the first normal disk device 311 through the bus 313 and transfers them to the host computer 326 through the network 325. When the first normal disk device 311 has failed, the first disk controller 314 reads the requested data through the bus 313 from the first failure disk device 312, which stores the same contents as the first normal disk device 311, and transfers it to the host computer 326 via the network 325.

[0014] When the first disk controller 314 fails, on the other hand, the second disk controller 316 reads the contents of the first normal disk device 311 through the bus 315 and transfers them to the host computer 326 via the network 325. Stored data can thus be read even when a disk controller fails. There are also data storage systems that, like this one, can cope with both disk device failures and disk controller failures, but with a simpler bus configuration.

[0015] FIG. 10 outlines a data storage system with a simple bus configuration that can cope with both disk device failures and disk controller failures. A first normal disk device 332 and a first failure disk device 333 are connected to a first disk controller 331 through a bus 334. A second normal disk device 342 and a second failure disk device 343 are connected to a second disk controller 341 through a bus 344. The first disk controller 331 and the second disk controller 341 are connected to each other through a network 351. A host computer 352, which issues data read and write requests, is also connected to the network 351.

[0016] In this system, the mirror disk device for the first normal disk device 332, which is connected to the first disk controller 331, is the second failure disk device 343 connected to the bus 344 of the second disk controller 341. Likewise, the mirror disk device for the second normal disk device 342, which is connected to the second disk controller 341, is the first failure disk device 333 connected to the first disk controller 331.

[0017] Data identical to that written to the first normal disk device 332 is also written to the second failure disk device 343 through the network 351 and the second disk controller 341. Data identical to that written to the second normal disk device 342 is also written to the first failure disk device 333 through the first disk controller 331. When the first disk controller 331 fails, the contents of the first normal disk device 332 are also stored in the second failure disk device 343, so they are read from the second failure disk device 343 through the second disk controller 341.

[0018]

[Problem to Be Solved by the Invention] If the contents of a normally operated disk device connected to one disk controller are copied in this way to a failure disk device connected to another disk controller, data can be read whether the failure occurs in a disk device or in a disk controller. However, when a disk controller fails, the load doubles on the other disk controller that holds the mirror disk device for the disk device connected to the failed controller. In the example of FIG. 10, when the first disk controller fails, the contents of the first normal disk device are read through the second disk controller from the second failure disk device, which holds the same contents. Because the second disk controller is also responsible for reading the second normal disk device, it must serve reads for both devices and its load doubles. As a result, reading data takes longer.

[0019] An object of the present invention is therefore to provide a data storage system that, when a disk controller controlling reading and writing of disk devices fails, limits the increase in load on the other disk controllers that read and write in its place.

[0020]

[Means for Solving the Problem] In the invention of claim 1, each of a plurality of unit servers connected by a predetermined network has first and second data storage means for storing data and a disk controller for controlling these data storage means. The data storage system provides each disk controller with: communication means for exchanging, with the network, data to be input to or output from the data storage means; first writing means for writing data to the first data storage means when the communication means receives, through the network, data to be stored there; divided sending means for sending, when the first writing means has written data to the first data storage means, the same data through the network to predetermined unit servers other than its own, divided into roughly equal parts according to the address written in the first data storage means; second writing means for writing data to the second data storage means when data sent by the divided sending means of another unit server is received through the network; failure detection means for detecting whether any of the unit servers has failed; and failure reading means for reading, in place of data stored in the first data storage means of a unit server found by the failure detection means to have failed, the identical data that was divided into roughly equal parts and sent, from each destination's second data storage means on the basis of the address corresponding to the address written in the first data storage means.

[0021] That is, in the invention of claim 1, each of the unit servers connected by the predetermined network has first and second data storage means for storing data and a disk controller for controlling them. Each disk controller includes communication means, first writing means, divided sending means, second writing means, failure detection means, and failure reading means. When data stored in the first data storage means of a failed unit server is to be read, the required data is read from the unit servers that hold the same data in their second data storage means. Because the contents of each first data storage means are divided among the second data storage means of several other unit servers, the load on the other unit servers does not increase greatly even when one unit server fails.
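
The load argument can be made concrete with a small calculation. The sketch below assumes that every first data storage means receives the same read load and that the copy is split evenly among the other unit servers; neither assumption is stated as a measured fact in the text, and the function names are hypothetical.

```python
from fractions import Fraction


def paired_mirror_load():
    """Relative read load on the surviving controller in the paired scheme of FIG. 10."""
    # Its own normal disk plus the whole mirror of the failed server's disk.
    return Fraction(2)


def distributed_load(num_servers):
    """Relative read load on each survivor when the copy is spread over all other servers."""
    if num_servers < 2:
        raise ValueError("at least two unit servers are needed to hold a copy elsewhere")
    # Each survivor keeps its own load and takes over 1/(num_servers - 1) of the failed one.
    return 1 + Fraction(1, num_servers - 1)
```

With four unit servers, `distributed_load(4)` gives 4/3 of the normal load per surviving server under these assumptions, compared with a factor of 2 in the paired arrangement.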

[0022] In the invention of claim 2, data is written to the first data storage means in units of files, and the divided sending means distributes the contents of the first data storage means among a plurality of predetermined destinations in units of the files written to the first data storage means.

[0023] That is, in the invention of claim 2, the contents of the first data storage means are divided among a plurality of other unit servers and stored file by file.
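
One way to picture file-unit distribution is a deterministic mapping from each file to one of the other unit servers. The text only says that the destinations are predetermined; the CRC-based selection below, the function name `file_destination`, and the integer server identifiers are assumptions chosen purely for illustration.

```python
import zlib


def file_destination(file_name, own_server, servers):
    """Pick the unit server that stores the backup copy of one file."""
    # Never choose the server that holds the original.
    others = [s for s in servers if s != own_server]
    # Assumed rule: a stable checksum of the name spreads files roughly evenly.
    index = zlib.crc32(file_name.encode("utf-8")) % len(others)
    return others[index]
```

Because the mapping depends only on the file name and the server list, a server reading on behalf of a failed peer can recompute where each file's copy was sent.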

[0024] In the invention of claim 3, the storage area of the first data storage means is divided into a plurality of blocks, and the divided sending means divides the contents of the first data storage means among a plurality of predetermined destinations in units of these blocks and has them stored there.

[0025] That is, in the invention of claim 3, the storage area of the first data storage means is divided into a plurality of blocks, and the contents of the first data storage means are divided among a plurality of unit servers and stored block by block.

[0026] In the invention of claim 4, the unit servers are divided into two or more groups, each consisting of a plurality of unit servers, and the plurality of predetermined destinations are set to other unit servers within the same group.

[0027] That is, in the invention of claim 4, data identical to that stored in the first data storage means of each unit server is distributed among the second data storage means of the other unit servers in the group to which that unit server belongs. Because the failure of one unit server can be recovered within each group, the data storage system as a whole can cope with failures of two or more unit servers.

[0028] In the invention of claim 5, the unit servers are divided into two or more groups, each consisting of a plurality of unit servers, and the plurality of predetermined destinations are each set to unit servers in other groups.

[0029] That is, in the invention of claim 5, data identical to that stored in the first data storage means of each unit server is stored in the second data storage means of unit servers in groups other than the one to which that unit server belongs.
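
The two grouping policies of claims 4 and 5 differ only in where the destinations are drawn from. A minimal sketch, assuming unit servers are identified by integers and groups are given as lists (the function `destinations` and its `within_group` flag are hypothetical names):

```python
def destinations(server, groups, within_group=True):
    """List the unit servers that hold the backup of `server`.

    within_group=True models claim 4 (other members of the same group);
    within_group=False models claim 5 (members of the other groups).
    """
    own = next(g for g in groups if server in g)
    if within_group:
        return [s for s in own if s != server]
    return [s for g in groups if g is not own for s in g]
```

For example, with eight unit servers in two groups of four, server 1 sends its copy to servers 2 to 4 under the claim 4 policy and to servers 5 to 8 under the claim 5 policy.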

[0030]

BEST MODE FOR CARRYING OUT THE INVENTION

[0031]

[Embodiment] FIG. 1 outlines the configuration of a data storage system according to one embodiment of the present invention. In this system, first to fourth disk controllers 12 to 15 are connected through a network 11. Connected to the first disk controller 12 through a bus 16 are a first normal storage device 17, a disk device used for regular operation, and a first failure storage device 18, a disk device provided as a failure countermeasure. Similarly, a second normal storage device 22 and a second failure storage device 23 are connected to the second disk controller 13 through a bus 21. A third normal storage device 25 and a third failure storage device 26 are connected to the third disk controller 14 through a bus 24, and a fourth normal storage device 28 and a fourth failure storage device 29 are connected to the fourth disk controller 15 through a bus 27.

[0032] A disk controller that controls reading and writing of disk devices, together with the normal storage device and failure storage device connected to it via a bus, is called a unit server. A set of unit servers connected to a network is called a cluster. Here, first to fourth unit servers 31 to 34 form a cluster. The system shown in FIG. 1 uses hard disk devices as the normal and failure storage devices. Floppy disk devices, silicon disk devices, and the like can also be used, and CD-ROMs can be used if no write operation is required.

[0033] The bus connected to each disk controller is a SCSI (Small Computer System Interface) bus. A serial bus such as Fibre Channel, or a network such as ATM (Asynchronous Transfer Mode), can also be used. A cluster needs three or more unit servers; this embodiment shows a cluster in which four unit servers are connected to the network.

[0034] Data identical to the contents of the first normal storage device 17 is distributed among the failure storage devices 23, 26, and 29 belonging to the second to fourth unit servers 32 to 34, that is, the unit servers other than the first unit server 31 to which it belongs. A copy of the contents of the second normal storage device 22 is distributed among the failure storage devices 18, 26, and 29 of the unit servers 31, 33, and 34 other than the second unit server 32. Similarly, a copy of the contents of the third normal storage device 25 is distributed among the first, second, and fourth failure storage devices 18, 23, and 29, and a copy of the contents of the fourth normal storage device 28 among the first to third failure storage devices 18, 23, and 26. The contents of each normal storage device are thus distributed among the failure storage devices belonging to unit servers other than its own.
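
The copy placement just described can be tabulated with a few lines of code. The sketch below uses the reference numerals of FIG. 1 as identifiers; the dictionary `UNIT_SERVERS` and the function `backup_devices` are hypothetical names introduced only for this illustration.

```python
# Reference numerals from FIG. 1: unit server -> (normal storage device, failure storage device).
UNIT_SERVERS = {31: (17, 18), 32: (22, 23), 33: (25, 26), 34: (28, 29)}


def backup_devices(normal_device):
    """Return the failure storage devices that share the copy of one normal storage device."""
    # Every failure storage device qualifies except the one in the same unit server.
    return [failure for normal, failure in UNIT_SERVERS.values() if normal != normal_device]
```

Calling `backup_devices(17)` yields the failure storage devices 23, 26, and 29, matching the placement given for the first normal storage device.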

[0035] FIG. 2 outlines the configuration of the disk controller in a unit server. The disk controller 41 has a CPU (central processing unit) 42 that plays the central role in read and write control. Various circuit devices are connected to the CPU 42 via a bus 43. Among them, a ROM (read-only memory) 44 stores programs and fixed data. A RAM (random access memory) 45 stores data needed temporarily while programs run. A network control device 46 is a circuit device for exchanging data and commands with the network. A SCSI controller 47 is a control circuit for transferring data to and from a normal storage device 48 and a failure storage device 49, both of which are connected to the SCSI bus driven by the SCSI controller 47.

[0036] FIG. 3 schematically shows an example of the contents of each storage device in the data storage system of FIG. 1. Parts identical to those in FIG. 1 carry the same reference numerals, and their description is omitted where appropriate. Because a copy of the data stored in a normal storage device is distributed among the failure storage devices of the other three unit servers, the storage area of each normal storage device is divided into three, matching the number of failure storage devices that hold the copy. In other words, it is divided into one fewer divided storage areas than the number of unit servers (the number of disk controllers). Each divided region is assigned an identification name.

[0037] In the first normal storage device 17, the first divided region 51 is named "D1-1", the second divided region 52 "D1-2", and the third divided region 53 "D1-3". The first to third divided regions 54 to 56 of the second normal storage device 22 are named "D2-1", "D2-2", and "D2-3". The first to third divided regions 57 to 59 of the third normal storage device 25 are named "D3-1", "D3-2", and "D3-3", and the first to third divided regions 61 to 63 of the fourth normal storage device 28 are named "D4-1", "D4-2", and "D4-3". The storage areas of the first to fourth failure storage devices 18, 23, 26, and 29 are likewise each divided into three.

[0038] When data is written to the first normal storage device 17 and the destination divided region is "D1-1" (51), the data is written to region "D1-1" of the first normal storage device 17 and, at the same time, the same data is passed to the second disk controller 13 through the network 11. The second disk controller 13 writes the received data to the first divided region 67 of the second failure storage device 23. Similarly, data written to divided region "D1-2" (52) of the first normal storage device 17 is transferred to the third disk controller 14 and written to the first divided region 71 of the third failure storage device 26. When data is written to divided region "D1-3" (53) of the first normal storage device 17, the same data is also transferred to and written in the first divided region 74 of the fourth failure storage device 29.

【００３９】このようにして、第１の通常用記憶装置１
７に格納されるデータは、第１の通常通常用記憶装置１
７に書き込まれると同時に第２〜第４の障害用記憶装置
２３、２６、２９に分散して格納される。図３では、各
記憶装置の記憶領域をＳＣＳＩで用いられる論理ブロッ
クを単位に分割した場合を示してある。通常用記憶装置
および障害用記憶装置の記憶容量はそれぞれ同一であ
り、各記憶装置の記憶領域はそれぞれ“Ｎ”個の論理ブ
ロックに分割されている。In this way, data stored in the first normal storage device 17 is written to the first normal storage device 17 and, at the same time, distributed across and stored in the second to fourth failure storage devices 23, 26, and 29. FIG. 3 shows the case where the storage area of each storage device is divided in units of the logical blocks used in SCSI. The normal storage devices and the failure storage devices all have the same storage capacity, and the storage area of each storage device is divided into "N" logical blocks.

【００４０】このとき、Ｎ≧３ｎ−１となる最大のｎを
選び、各論理ブロックに“０”〜“３ｎ−１”の番号を
割り付けてある。そして、各記憶装置の論理ブロックの
番号が“０”〜“ｎ−１”の範囲を第１の分割領域に、
論理ブロックの番号が“ｎ”〜“２ｎ−１”の範囲を第
２の分割領域にそれぞれ対応させている。さらに論理ブ
ロックの番号が“２ｎ”〜“３ｎ−１”の範囲を第３の
分割領域に対応させている。Here, the largest n satisfying N ≧ 3n−1 is chosen, and the logical blocks are numbered "0" to "3n−1". In each storage device, logical blocks "0" to "n−1" correspond to the first divided area, and logical blocks "n" to "2n−1" correspond to the second divided area. Logical blocks "2n" to "3n−1" correspond to the third divided area.
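The numbering rule above can be sketched in a few lines of Python. This is only an illustration: the function names `largest_n` and `divided_area` are hypothetical, and the inequality N ≧ 3n−1 is taken exactly as written in the text.

```python
def largest_n(total_blocks: int) -> int:
    """Largest n with N >= 3n - 1, taking the inequality exactly as written."""
    return (total_blocks + 1) // 3


def divided_area(block: int, n: int) -> int:
    """Return 1, 2 or 3: the divided area that holds logical block `block`.

    Blocks 0..n-1 form the first area, n..2n-1 the second, 2n..3n-1 the third.
    """
    if not 0 <= block <= 3 * n - 1:
        raise ValueError("block number outside 0 .. 3n-1")
    return block // n + 1
```

With N = 11, for instance, `largest_n` gives n = 4, and blocks 4 to 7 fall in the second divided area.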

【００４１】第１の通常用記憶装置１７の第１の分割領
域（５１）“Ｄ１−１”に格納されるものと同一のデー
タは第２の障害用記憶装置２３の第１の分割領域６７に
格納される。したがって、第１の通常用記憶装置１７の
“０”〜“ｎ−１”の論理ブロックに格納されるデータ
は、第２の障害用記憶装置２３の“０”〜“ｎ−１”の
論理ブロックにその複製が作成される。同様にして第１
の通常用記憶装置１７の第２の分割領域５２としての
“ｎ”〜“２ｎ−１”の論理ブロックに格納されるデー
タは、第３の障害用記憶装置２６の“０”〜“ｎ−１”
の論理ブロックにその複製が作成されている。このよう
に論理ブロック番号によってコピー先の障害用記憶装置
およびその記憶領域を対応付けることができる。このよ
うな対応関係は、各単位サーバのディスク制御装置の有
するＲＯＭあるいはＲＡＭに登録される。The same data as that stored in the first divided area (51) "D1-1" of the first normal storage device 17 is stored in the first divided area 67 of the second failure storage device 23. Accordingly, data stored in logical blocks "0" to "n−1" of the first normal storage device 17 is duplicated in logical blocks "0" to "n−1" of the second failure storage device 23. Similarly, data stored in logical blocks "n" to "2n−1", which form the second divided area 52 of the first normal storage device 17, is duplicated in logical blocks "0" to "n−1" of the third failure storage device 26. In this way, the logical block number determines the copy-destination failure storage device and the storage area within it. This correspondence is registered in the ROM or RAM of the disk control device of each unit server.
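As a rough illustration of this correspondence, the sketch below maps a logical block of the first normal storage device 17 to its copy location. The device numbers 23, 26, and 29 come from the description of FIG. 3; the function name `copy_location` and the tuple layout are assumptions, and the third divided area is assumed to land in blocks "0" to "n−1" of device 29 because the text places it in that device's first divided area.

```python
from typing import Tuple

# Failure storage devices holding the copies of divided areas 1, 2 and 3
# of normal storage device 17 (numbers taken from the description of FIG. 3).
COPY_DEVICES_FOR_17 = (23, 26, 29)


def copy_location(block: int, n: int) -> Tuple[int, int]:
    """Map a logical block of device 17 to (failure device, block on that device)."""
    if not 0 <= block < 3 * n:
        raise ValueError("block number outside 0 .. 3n-1")
    area_index = block // n          # 0, 1 or 2
    return COPY_DEVICES_FOR_17[area_index], block % n
```

A table of this kind is what each disk control device would hold in ROM or RAM; here it is reduced to arithmetic because the areas are equal in size.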

【００４２】図３のようにデータの格納されているデー
タ格納システムにおいて、通常用記憶装置もしくはディ
スク制御装置に障害が生じた場合の動作を説明する。The operation of the data storage system in which data is stored as shown in FIG. 3 when a failure occurs in the normal storage device or the disk control device will be described.

【００４３】図３に示したシステムでは、各通常用記憶
装置のそれぞれの分割領域とそのコピー先となる障害用
記憶装置およびその障害用記憶装置内における格納領域
との対応関係が、他の単位サーバのディスク制御装置に
予め通知されている。たとえば、第１の通常用記憶装置
１７の第１の分割領域５１、すなわち“０”〜“ｎ−
１”の論理ブロックのコピー先が第２の障害用記憶装置
２３の第１の分割領域６７（“０”〜“ｎ−１”の論理
ブロック）であることが他の単位サーバのディスク制御
装置に通知されている。データの要求元となるホストコ
ンピュータ７７はネットワーク１１につながれている。In the system shown in FIG. 3, the correspondence between each divided area of each normal storage device, the failure storage device that is its copy destination, and the storage area within that failure storage device is notified in advance to the disk control devices of the other unit servers. For example, the disk control devices of the other unit servers are notified that the copy destination of the first divided area 51 of the first normal storage device 17, that is, logical blocks "0" to "n−1", is the first divided area 67 (logical blocks "0" to "n−1") of the second failure storage device 23. The host computer 77, which issues data requests, is connected to the network 11.

【００４４】ホストコンピュータ７７には、ディスク制
御装置１２に障害が起きた場合はその代替としてディス
ク制御装置１３と通信すること、またディスク制御装置
１３に障害が起きたときは代替としてディスク制御装置
１４と通信することが予め設定されている。さらに、デ
ィスク制御装置１４に障害が起きたときは代替としてデ
ィスク制御装置１５と、ディスク制御装置１５に障害が
発生したときはその代替としてディスク制御装置１２と
それぞれ通信することが予め設定されている。The host computer 77 is configured in advance to communicate with the disk control device 13 as a substitute when the disk control device 12 fails, and with the disk control device 14 as a substitute when the disk control device 13 fails. It is likewise configured in advance to communicate with the disk control device 15 when the disk control device 14 fails, and with the disk control device 12 when the disk control device 15 fails.

【００４５】各単位サーバのディスク制御装置は、他の
単位サーバから送られてくる情報を基にして、自身のバ
スに接続されている障害用記憶装置の各記憶領域に格納
されるデータのコピー元の通常用記憶装置を自身の有す
るＲＡＭに記憶するようになっている。また、データの
読出要求の送出元となるホストコンピュータ７７は、各
単位サーバのディスク制御装置に障害が生じているか否
かを示した情報をホストコンピュータ７７の有するＲＡ
Ｍに記憶する。また障害が起きたときにその代替となる
ディスク制御装置を内部のメモリに記憶するようになっ
ている。Based on information sent from the other unit servers, the disk control device of each unit server stores, in its own RAM, which normal storage device is the copy source of the data held in each storage area of the failure storage device connected to its own bus. The host computer 77, which issues data read requests, stores in its own RAM information indicating whether a failure has occurred in the disk control device of each unit server. It also stores in its internal memory the substitute disk control device to be used when a failure occurs.

【００４６】今、第１のディスク制御装置１２に障害が
起きたものとする。データの要求元となるホストコンピ
ュータ７７は第１のディスク制御装置１２の障害発生を
検出した以後は、第１の通常用記憶装置１７のデータを
読み出す場合は、読出要求を第１のディスク制御装置１
２の代替として設定されている第２のディスク制御装置
１３へ送る。読み出すべき領域は第１の通常用記憶装置
１７における論理ブロックの番号で表わされている。Suppose now that a failure has occurred in the first disk control device 12. Once the host computer 77, which issues data requests, has detected the failure of the first disk control device 12, it sends any read request for data in the first normal storage device 17 to the second disk control device 13, which is set as the substitute for the first disk control device 12. The area to be read is expressed as a logical block number in the first normal storage device 17.
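The host-side bookkeeping described in paragraphs 0044 to 0046 might look like the following sketch. The substitute assignments (12 to 13, 13 to 14, 14 to 15, 15 to 12) are those given in the text; the class `HostFailoverTable` and its method names are hypothetical.

```python
# Substitute disk control device for each disk control device, as preset
# in the host computer 77 according to the description.
SUBSTITUTE = {12: 13, 13: 14, 14: 15, 15: 12}


class HostFailoverTable:
    """Hypothetical model of the failure flags the host keeps in its RAM."""

    def __init__(self) -> None:
        self.failed = {controller: False for controller in SUBSTITUTE}

    def mark_failed(self, controller: int) -> None:
        self.failed[controller] = True

    def target_for(self, controller: int) -> int:
        """Disk control device that should receive a read meant for `controller`."""
        if self.failed[controller]:
            return SUBSTITUTE[controller]
        return controller
```

The sketch handles only the single-failure case the text describes; it does not chain to a further substitute if the substitute itself has failed.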

【００４７】第２のディスク制御装置１３では、ホスト
コンピュータ７７から送られてきた読出要求から論理ブ
ロック番号を計算する。計算された論理ブロック番号が
“０”〜“ｎ−１”の範囲内であれば第２のディスク制
御装置１３のバスに接続されている第２の障害用記憶装
置２３に読み出すべきデータのコピーが格納されている
と判別する。そこで、第２のディスク制御装置１３は自
身のバス２１に接続されている第２の障害用記憶装置２
３から対応するデータを読み出し、ホストコンピュータ
７７にネットワーク１１を通じて送信する。The second disk control device 13 calculates the logical block number from the read request sent from the host computer 77. If the calculated logical block number is in the range "0" to "n−1", it determines that the copy of the data to be read is stored in the second failure storage device 23 connected to its own bus. The second disk control device 13 then reads the corresponding data from the second failure storage device 23 connected to its own bus 21 and transmits it to the host computer 77 through the network 11.

【００４８】一方、読み出しの要求された論理ブロック
の番号が“ｎ”〜“２ｎ−１”の範囲であれば第２のデ
ィスク制御装置１３は第３のディスク制御装置１４にネ
ットワーク１１を通じて読出要求を転送する。第３のデ
ィスク制御装置１４は、受信した読出要求を基にして自
身のバス２４に接続されている第３の障害用記憶装置２
６から該当するデータを読み出しホストコンピュータ７
７にネットワーク１１を介して送信する。同様に論理ブ
ロック番号が“２ｎ”〜“３ｎ−１”の範囲であれば、
第２のディスク制御装置１３は第４のディスク制御装置
１５に読出要求を転送する。これにより読出要求の転送
された第４のディスク制御装置１５によって第４の障害
用記憶装置２９から該当するデータが読み出されホスト
コンピュータ７７に送られる。If, on the other hand, the number of the logical block requested to be read is in the range "n" to "2n−1", the second disk control device 13 forwards the read request to the third disk control device 14 through the network 11. Based on the received read request, the third disk control device 14 reads the corresponding data from the third failure storage device 26 connected to its own bus 24 and transmits it to the host computer 77 via the network 11. Similarly, if the logical block number is in the range "2n" to "3n−1", the second disk control device 13 forwards the read request to the fourth disk control device 15. The fourth disk control device 15 then reads the corresponding data from the fourth failure storage device 29 and sends it to the host computer 77.
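The decision made by the second disk control device 13 when it stands in for the failed device 12 can be summarized as a small dispatch function. This is a sketch under the assumptions of FIG. 3; the function name `dispatch_read` and the action labels "read_local" and "forward" are invented for illustration.

```python
from typing import Tuple


def dispatch_read(block: int, n: int) -> Tuple[str, int]:
    """Decide how disk control device 13 serves a read for a block of device 17.

    Returns ("read_local", failure device number) when the copy is on its own
    bus, or ("forward", disk control device number) when another unit server
    holds the copy.
    """
    if 0 <= block < n:
        return ("read_local", 23)   # copy is in failure storage device 23
    if n <= block < 2 * n:
        return ("forward", 14)      # device 14 reads failure storage device 26
    if 2 * n <= block < 3 * n:
        return ("forward", 15)      # device 15 reads failure storage device 29
    raise ValueError("block number outside 0 .. 3n-1")
```

In the forwarded cases the device that finally reads the data replies to the host computer 77 directly, so device 13 only relays the request.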

【００４９】通常用記憶装置に障害が発生したことの検
出は次のようにして行われる。第１〜第４のディスク制
御装置１２〜１５は自身のバスに接続されている通常用
記憶装置に対してデータの読出要求を出す。そして読出
要求の送出先の通常用記憶装置から一定時間内に応答の
到来しないタイムアウトエラーが生じたり、応答が無い
ときはその通常用記憶装置に障害が発生したものと判別
する。The detection of the occurrence of a failure in the normal storage device is performed as follows. Each of the first to fourth disk controllers 12 to 15 issues a data read request to the normal storage device connected to its own bus. Then, when a timeout error occurs in which a response does not arrive within a predetermined time from the normal storage device to which the read request is sent, or when there is no response, it is determined that the normal storage device has failed.

【００５０】単位サーバのディスク制御装置に障害が発
生したことは次のようにして検出している。クラスタを
構成してる単位サーバのディスク制御装置は、次の単位
サーバのディスク制御装置に対して一定時間毎に自己が
正常に動作していることを表わした動作確認メッセージ
を送る。１つ手前のディスク制御装置から一定時間以上
に渡って動作確認メッセージが届かないときは、２つ手
前のディスク制御装置へ正常動作を確認するための問い
合わせを行う。２つ手前のディスク制御装置から問い合
わせに対する応答が無いときはさらにその１つ手前のデ
ィスク制御装置に動作確認の問い合わせを行う。このよ
うに応答が得られるまで、次々と遡って正常動作を確認
するための問い合わせを行う。そして問い合わせを始め
たディスク制御装置から応答の帰ってきたディスク制御
装置までの間のディスク制御装置に障害が生じたものと
判別する。A failure of the disk control device of a unit server is detected as follows. The disk control device of each unit server in the cluster sends, at regular intervals, an operation confirmation message to the disk control device of the next unit server to indicate that it is operating normally. When no operation confirmation message has arrived from the immediately preceding disk control device for a certain period of time or longer, the disk control device queries the second preceding disk control device to confirm that it is operating normally. If the second preceding disk control device does not respond to the query, the next one further back is queried. Queries continue backward in this way, one device at a time, until a response is obtained. The disk control devices lying between the one that started the queries and the one that responded are then judged to have failed.
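The backward search described above can be sketched as follows. The sketch assumes the disk control devices form a ring in a fixed order and that a callable `responds` reports whether a queried device answers; both the function name `find_failed` and that callable are hypothetical stand-ins for the real message exchange.

```python
from typing import Callable, List


def find_failed(ring: List[int], observer: int,
                responds: Callable[[int], bool]) -> List[int]:
    """Return the devices judged failed by `observer`.

    Called once `observer` has missed the operation confirmation message of
    its immediate predecessor in `ring`.
    """
    size = len(ring)
    start = ring.index(observer)
    failed = [ring[(start - 1) % size]]      # predecessor whose message is overdue
    for step in range(2, size):              # second preceding device, then further back
        candidate = ring[(start - step) % size]
        if responds(candidate):
            break                            # first device that answers ends the search
        failed.append(candidate)
    return failed
```

For the ring 12, 13, 14, 15, if device 14 misses the message from 13 and device 12 answers the query, only device 13 is judged to have failed.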

【００５１】障害用記憶装置へのデータの分散の仕方
は、データを格納した通常用記憶装置の属している単位
サーバ以外の単位サーバに属している障害用記憶装置に
格納すれば良く、格納される順序やその領域については
図３に示した例に限定されるものではない。Data may be distributed to the failure storage devices in any manner, provided that it is stored in failure storage devices belonging to unit servers other than the one to which the normal storage device holding the data belongs. The order and areas of storage are not limited to the example shown in FIG. 3.

【００５２】その一例として通常用記憶装置の記憶領域
を物理的に分割し、それぞれの領域を他の単位サーバに
属する障害用記憶装置に振り分けて割り当てることがで
きる。また、通常用記憶装置に書き込むファイルを単位
にして、障害用記憶装置を割り当てても良い。As an example, a storage area of a normal storage device can be physically divided, and each area can be allocated to a failure storage device belonging to another unit server. Further, a failure storage device may be assigned in units of a file to be written to the normal storage device.

【００５３】論理ブロック番号を用いてい障害用記憶装
置を割り当てる場合、先に説明したように論理ブロック
を所定数連続するように割り当てるほか、隣り合う論理
ブロックを互いに異なる障害用記憶装置に割り当てるこ
ともできる。すなわち、ＳＣＳＩで用いる論理ブロック
を単位に障害用記憶装置を割り振る場合に、通常用記憶
装置にデータを書き込んだ論理ブロックの番号を、分散
する障害用記憶装置の数で割った余りの値を基準に割り
振る。When failure storage devices are assigned by logical block number, logical blocks may be assigned in contiguous runs of a predetermined number, as described above, or adjacent logical blocks may instead be assigned to different failure storage devices. That is, when failure storage devices are allocated in units of the logical blocks used in SCSI, the allocation is based on the remainder obtained by dividing the number of the logical block written in the normal storage device by the number of failure storage devices over which the data is distributed.

【００５４】たとえば、１つの通常用記憶装置の記憶内
容を３つの障害用記憶装置に分散して格納する場合、論
理ブロックの番号を“３”で割った余りを基準に格納す
べき障害用記憶装置を割り振る。この場合、余りは
“０”〜“２”の３つの値をとるので、これらの値にそ
れぞれ障害用記憶装置を予め割り当てることにより、デ
ータをコピー先に論理ブロック単位で分散して格納する
ことができる。For example, when the contents of one normal storage device are distributed across three failure storage devices, the destination failure storage device is chosen by the remainder obtained by dividing the logical block number by "3". The remainder takes one of the three values "0" to "2", so assigning a failure storage device to each value in advance allows the data to be distributed to the copy destinations in units of logical blocks.
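A minimal sketch of this remainder-based allocation is shown below. The function name `failure_device_for` is hypothetical, and which failure storage device is tied to which remainder is an assumption, since the text only says the assignment is made in advance.

```python
from typing import Sequence


def failure_device_for(block: int, devices: Sequence[int]) -> int:
    """Pick the copy destination from the remainder of block / number of devices."""
    return devices[block % len(devices)]


# Assumed assignment: remainder 0 -> device 23, 1 -> device 26, 2 -> device 29.
DEVICES = (23, 26, 29)
```

With this assignment adjacent logical blocks 0, 1, and 2 go to devices 23, 26, and 29, and block 3 returns to device 23.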

【００５５】また、単位サーバに接続されている通常用
記憶装置が複数台ある場合、それぞれの通常用記憶装置
をこれまで説明した記憶装置の分割領域として扱うこと
も可能である。たとえば、第１〜第４の単位サーバによ
ってクラスタが構成されており、それぞれの単位サーバ
に第１〜第３の通常用記憶装置が接続されているものと
する。このとき第１の単位サーバの第１の通常用記憶装
置の内容を第２の単位サーバの障害用記憶装置にコピー
する。また、第１の単位サーバの第２の通常用記憶装置
の内容を第３の単位サーバの障害用記憶装置にコピー
し、第１の単位サーバの第３の通常用記憶装置の内容を
第４の単位サーバの障害用記憶装置にコピーする。When a plurality of normal storage devices are connected to a unit server, each normal storage device can be treated in the same way as a divided area of the storage device described so far. For example, suppose that a cluster consists of first to fourth unit servers and that first to third normal storage devices are connected to each unit server. In this case, the contents of the first normal storage device of the first unit server are copied to the failure storage device of the second unit server. The contents of the second normal storage device of the first unit server are copied to the failure storage device of the third unit server, and the contents of the third normal storage device of the first unit server are copied to the failure storage device of the fourth unit server.

【００５６】第１の変形例 First Modified Example

【００５７】これまで説明した実施例では、各単位サー
バの通常用記憶装置の記憶内容を他の全ての単位サーバ
の障害用記憶装置に分散して格納しているが、第１の変
形例では、一部の単位サーバに分散して格納するように
なっている。In the embodiment described so far, the contents of the normal storage device of each unit server are distributed across the failure storage devices of all the other unit servers. In the first modification, they are distributed across only some of the unit servers.

【００５８】図４は、本発明の第１の変形例におけるデ
ータ格納システムの構成の概要を表わしたものである。
このシステムでは、第１〜第６のディスク制御装置８１
〜８６がネットワーク８７に接続されている。それぞれ
のディスク制御装置８１〜８６には、バス９１〜９６を
介して通常用記憶装置１０１〜１０６と障害用記憶装置
１１１〜１１６が接続されている。第１の変形例におけ
るデータ格納システムでは、これらを第１の部分クラス
タ１２１と、第２の部分クラスタ１２２に分割してい
る。それぞれの部分クラスタにおいて実施例と同様にデ
ータが分散して格納される。すなわち、第１の部分クラ
スタ１２１では、第１のディスク制御装置８１に接続さ
れている通常用記憶装置１０１の内容のコピーは、同一
の部分クラスタ１２１に属する第２および第３のディス
ク制御装置８２、８３に接続されている障害用記憶装置
１１２、１１３に分散されて記憶される。FIG. 4 shows an outline of the configuration of a data storage system according to the first modification of the present invention. In this system, first to sixth disk control devices 81 to 86 are connected to a network 87. Normal storage devices 101 to 106 and failure storage devices 111 to 116 are connected to the respective disk control devices 81 to 86 via buses 91 to 96. In the data storage system of the first modification, these are divided into a first partial cluster 121 and a second partial cluster 122. Within each partial cluster, data is distributed and stored in the same way as in the embodiment. That is, in the first partial cluster 121, the copy of the contents of the normal storage device 101 connected to the first disk control device 81 is distributed across the failure storage devices 112 and 113 connected to the second and third disk control devices 82 and 83, which belong to the same partial cluster 121.

【００５９】これにより、実施例ではクラスタ全体で１
つの単位サーバの障害しか許されないのに対し、第１の
変形例では各部分クラスタ内で１つまでの単位サーバの
障害に対応することが可能となる。もちろん、部分クラ
スタの数は２つに限らず、これ以上の数であっても同様
の効果が得られる。Consequently, whereas the embodiment tolerates the failure of only one unit server in the entire cluster, the first modification can cope with the failure of up to one unit server in each partial cluster. The number of partial clusters is of course not limited to two; the same effect is obtained with more.
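The grouping of the first modification, and the failure tolerance it gives, can be sketched as below. The membership of partial clusters 121 and 122 follows the description of FIG. 4 and FIG. 5; the names `PARTIAL_CLUSTERS`, `copy_destinations`, and `tolerable` are hypothetical.

```python
from typing import Dict, List, Set

# Disk control devices in each partial cluster, as described for FIG. 4.
PARTIAL_CLUSTERS: Dict[int, List[int]] = {
    121: [81, 82, 83],
    122: [84, 85, 86],
}


def copy_destinations(controller: int) -> List[int]:
    """Other unit servers of the same partial cluster, which hold the copy."""
    for members in PARTIAL_CLUSTERS.values():
        if controller in members:
            return [m for m in members if m != controller]
    raise KeyError(controller)


def tolerable(failed: Set[int]) -> bool:
    """True if no partial cluster contains more than one failed unit server."""
    return all(len(failed & set(members)) <= 1
               for members in PARTIAL_CLUSTERS.values())
```

Under this model a failure of devices 81 and 85 together is tolerable, while a failure of 81 and 82 together is not, because both belong to partial cluster 121.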

【００６０】第２の変形例 Second Modification

【００６１】図５は、第２の変形例におけるデータ格納
システムの構成の概要を表わしたものである。図４と同
一の部分には同一の符号を付してありそれらの説明を適
宜省略する。第２の変形例も第１の変形例と同様に第１
および第２の部分クラスタ１２１、１２２にクラスタが
分割されている。第１の変形例では、各部分クラスタ内
の通常用記憶装置の記憶内容のコピーは同じ部分クラス
タ内の障害用記憶装置に分散されて格納されている。こ
れに対して第２の変形例では、各部分クラスタに属する
通常用記憶装置の記憶内容のコピーは、他の部分クラス
タに属する障害用記憶装置に作成されるようになってい
る。FIG. 5 shows an outline of the configuration of the data storage system in the second modification. Parts identical to those in FIG. 4 are given the same reference numerals, and their description is omitted as appropriate. In the second modification, as in the first, the cluster is divided into first and second partial clusters 121 and 122. In the first modification, the copy of the contents of the normal storage devices in each partial cluster is distributed across the failure storage devices in the same partial cluster. In the second modification, by contrast, the copy of the contents of the normal storage devices belonging to each partial cluster is created in failure storage devices belonging to another partial cluster.

【００６２】第１の部分クラスタ１２１に属する第１〜
第３の通常用記憶装置１０１〜１０３で構成される第１
の通常用記憶装置クラスタ１３１の記憶内容は、第２の
部分クラスタ１２２に属する第４〜第６の障害用記憶装
置１１４〜１１６で構成される第２の障害用記憶装置ク
ラスタ１３２にコピーされる。同様に第２の各部分クラ
スタ１２２に属する第４〜第６の通常用記憶装置１０４
〜１０６で構成される第２の通常用記憶装置クラスタ１
３３の記憶内容は、第１の部分クラスタ１２１に属する
第１〜第３の障害用記憶装置１１１〜１１３で構成され
る第１の障害用記憶装置クラスタ１３４にコピーされ
る。The contents of a first normal storage device cluster 131, made up of the first to third normal storage devices 101 to 103 belonging to the first partial cluster 121, are copied to a second failure storage device cluster 132, made up of the fourth to sixth failure storage devices 114 to 116 belonging to the second partial cluster 122. Similarly, the contents of a second normal storage device cluster 133, made up of the fourth to sixth normal storage devices 104 to 106 belonging to the second partial cluster 122, are copied to a first failure storage device cluster 134, made up of the first to third failure storage devices 111 to 113 belonging to the first partial cluster 121.

【００６３】このように１つの部分クラスタに属する通
常用記憶装置の記憶内容を他の１つの部分クラスタに属
する障害用記憶装置に分散して格納することにより、複
数の部分クラスタから構成されるクラスタ内で複数の単
位サーバの故障に対応することが可能となる。図５に示
したように部分クラスタの数が３つ以下の場合は、複数
のクラスタで同時に障害が起きると回復できないが、部
分クラスタの数が４つ以上の場合には、複数の単位サー
バの障害に対応することができる。By distributing the contents of the normal storage devices belonging to one partial cluster across the failure storage devices belonging to one other partial cluster in this way, a cluster made up of a plurality of partial clusters can cope with the failure of a plurality of unit servers. When the number of partial clusters is three or fewer, as shown in FIG. 5, recovery is not possible if failures occur simultaneously in more than one partial cluster; when the number of partial clusters is four or more, however, failures of a plurality of unit servers can be handled.

【００６４】図６は、４つの部分クタスタから構成され
るデータ格納システムの概要を表わしたものである。こ
のシステムでは、第１〜第１２のディスク制御装置１４
１〜１５３がネットワーク１５４に接続されている。そ
れぞれのディスク制御装置１４１〜１５３には、バス１
６１〜１７３を介して通常用記憶装置１８１〜１９３と
障害用記憶装置２０１〜２１３が接続されている。ネッ
トワーク１５４を介して構成されるクラスタは、第１〜
第４の部分クラスタ２２１〜２２５に分割されている。FIG. 6 shows an outline of a data storage system made up of four partial clusters. In this system, first to twelfth disk control devices 141 to 153 are connected to a network 154. Normal storage devices 181 to 193 and failure storage devices 201 to 213 are connected to the respective disk control devices 141 to 153 via buses 161 to 173. The cluster formed over the network 154 is divided into first to fourth partial clusters 221 to 224.

【００６５】ここでは、第１の部分クラスタ２２１に属
する通常用記憶装置１８１〜１８３で構成された第１の
通常用記憶装置クラスタ２３１の記憶内容は第２の部分
クタスタ２２２に属する障害用記憶装置２０４〜２０６
で構成された第２の障害用記憶装置クラスタ２３２に分
散されてい格納される。また、第２の部分クラスタ２２
２に属する通常用記憶装置１８４〜１８６で構成された
第２の通常用記憶装置クラスタ２３３の記憶内容は第３
の部分クタスタ２２３に属する障害用記憶装置２０７〜
２０９で構成された第３の障害用記憶装置クラスタ２３
４に分散されてい格納される。Here, the contents of a first normal storage device cluster 231, made up of the normal storage devices 181 to 183 belonging to the first partial cluster 221, are distributed across a second failure storage device cluster 232, made up of the failure storage devices 204 to 206 belonging to the second partial cluster 222. Likewise, the contents of a second normal storage device cluster 233, made up of the normal storage devices 184 to 186 belonging to the second partial cluster 222, are distributed across a third failure storage device cluster 234, made up of the failure storage devices 207 to 209 belonging to the third partial cluster 223.

【００６６】同様に第３の部分クラスタ２２３に属する
第３の通常用記憶装置クラスタ２３５の記憶内容は、第
４の部分クラスタ２２４に属する第４の障害用記憶装置
クラスタ２３６に分散されて格納される。また、第４の
部分クラスタ２２４に属する第４の通常用記憶装置クラ
スタ２３７の記憶内容は、第１の部分クラスタ２２１に
属する第１の障害用記憶装置クラスタ２３８に分散され
て格納される。このように各部分クラスタの通常用記憶
装置クラスタの記憶内容のコピー先を次の部分クラスタ
の障害用記憶装置クラスタにすることで、制限はあるも
のの複数の部分クラスタにおいて障害が起きる場合に対
応できる。すなわち、隣り合わない部分クラスタで同時
に障害が起きても対応可能となる。Similarly, the contents of a third normal storage device cluster 235 belonging to the third partial cluster 223 are distributed across a fourth failure storage device cluster 236 belonging to the fourth partial cluster 224. The contents of a fourth normal storage device cluster 237 belonging to the fourth partial cluster 224 are distributed across a first failure storage device cluster 238 belonging to the first partial cluster 221. Making the failure storage device cluster of the next partial cluster the copy destination for the normal storage device cluster of each partial cluster in this way allows the system, within limits, to cope with failures in more than one partial cluster. Specifically, simultaneous failures in non-adjacent partial clusters can be handled.

【００６７】たとえば、第１の部分クラスタ２２１と第
３の部分クラスタ２２３で同時に障害が起きた場合に対
応可能になる。この場合には、第１の部分クラスタ２２
１の内容を第２の部分クラスタ２２２の障害用記憶装置
クラスタ２３２から読み出し、第３の部分クラスタ２２
３の内容を第４の部分クラスタ２２４の障害用記憶装置
クラスタ２３６から読み出すことができる。同様に第２
の部分クラスタ２２２と第４の部分クラスタ２２４で同
時に障害が起きた場合には、第１および第３の部分クラ
スタ２２１、２２３の障害用記憶装置クラスタ２３８、
２３４を用いて障害を回復することができる。For example, the system can cope with simultaneous failures in the first partial cluster 221 and the third partial cluster 223. In this case, the contents of the first partial cluster 221 can be read from the failure storage device cluster 232 of the second partial cluster 222, and the contents of the third partial cluster 223 can be read from the failure storage device cluster 236 of the fourth partial cluster 224. Similarly, when failures occur simultaneously in the second partial cluster 222 and the fourth partial cluster 224, recovery is possible using the failure storage device clusters 238 and 234 of the first and third partial clusters 221 and 223.
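The "next partial cluster holds the copy" rule and the resulting recoverability condition can be checked with a short sketch. Partial clusters are indexed from 0 here (0 for 221, 1 for 222, and so on), which is a convention of this example; the function names `copy_holder` and `recoverable` are hypothetical.

```python
from typing import Set


def copy_holder(cluster_index: int, cluster_count: int) -> int:
    """Index of the partial cluster whose failure storage devices hold the copy."""
    return (cluster_index + 1) % cluster_count


def recoverable(failed: Set[int], cluster_count: int) -> bool:
    """True if every failed partial cluster still has its copy holder intact."""
    return all(copy_holder(i, cluster_count) not in failed for i in failed)
```

With four partial clusters, failures in clusters 0 and 2 (221 and 223) are recoverable, whereas failures in clusters 0 and 1 (221 and 222) are not, because the copy of 221 lives in 222.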

【００６８】第３の変形例 Third Modification

【００６９】図７は、第３の変形例におけるデータ格納
システムの構成の概要を表わしたものである。第３の変
形例では、部分クラスタのようなグループ分けを行わ
ず、各単位サーバの通常用記憶装置の内容を他の任意数
の単位サーバの障害用記憶装置に分散してコピーするよ
うになっている。このシステムでは、第１〜第９のディ
スク制御装置２４１〜２４９がネットワーク２５１に接
続されている。それぞれのディスク制御装置２４１〜２
４９には、バス２６１〜２６９を介して通常用記憶装置
２７１〜２７９と障害用記憶装置２８１〜２８９が接続
されている。FIG. 7 shows an outline of the configuration of the data storage system in the third modification. In the third modification, no grouping such as partial clusters is performed; instead, the contents of the normal storage device of each unit server are distributed and copied to the failure storage devices of an arbitrary number of other unit servers. In this system, first to ninth disk control devices 241 to 249 are connected to a network 251. Normal storage devices 271 to 279 and failure storage devices 281 to 289 are connected to the respective disk control devices 241 to 249 via buses 261 to 269.

【００７０】各単位サーバの通常用記憶装置の記憶内容
のコピーを格納する障害用記憶装置は、各単位サーバの
通常用記憶装置毎に設定される。例えば、通常用記憶装
置２７１に格納されているデータのコピー先として、障
害用記憶装置２８２〜２８４を割り当てている。各通常
用記憶装置毎にばらばらに設定すると管理が複雑になる
ため、ここでは、通常用記憶装置の属する単位サーバの
次の単位サーバから連続する３台の障害用記憶装置をコ
ピー先としている。もちろんコピー先は連続する必要は
無く、ばらばらであっても構わない。The failure storage devices that hold the copy of the contents of a normal storage device are set individually for the normal storage device of each unit server. For example, the failure storage devices 282 to 284 are assigned as the copy destinations for data stored in the normal storage device 271. Because setting them arbitrarily for each normal storage device would complicate management, the copy destinations here are the three consecutive failure storage devices starting from the unit server that follows the one to which the normal storage device belongs. The copy destinations need not, of course, be consecutive and may be scattered.
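The "three consecutive unit servers starting from the next one" rule can be sketched as follows. Unit servers are numbered 1 to 9 in this example, so server 1 corresponds to normal storage device 271 and failure storage device 281. The function name `next_servers` is hypothetical, and wrapping around from the ninth server back to the first is an assumption, since the text gives only the example for the first server.

```python
from typing import List


def next_servers(server: int, total: int = 9, count: int = 3) -> List[int]:
    """Unit servers (1-based) whose failure storage devices hold the copy.

    Wrap-around past the last server is assumed, not stated in the text.
    """
    return [(server - 1 + k) % total + 1 for k in range(1, count + 1)]
```

For server 1 this yields servers 2, 3, and 4, matching failure storage devices 282 to 284 in the text.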

【００７１】以上説明した実施例および第１〜第３の変
形例では、通常用記憶装置と障害用記憶装置は物理的に
別々の装置を用いたが、同一の記憶装置内を通常用記憶
装置としての記憶領域と障害用記憶装置用の記憶領域に
分割して用いることもできる。また、ディスク制御装置
は、ワークステーションやホストコンピュータを用いる
こともできる。In the embodiment and the first to third modifications described above, physically separate devices are used as the normal storage device and the failure storage device, but a single storage device may instead be divided into a storage area serving as the normal storage device and a storage area serving as the failure storage device. A workstation or a host computer can also be used as the disk control device.

【００７２】[0072]

【発明の効果】このように請求項１記載の発明によれ
ば、各単位サーバの第１のデータ記憶手段の記憶内容の
複製が他の複数の単位サーバの第２のデータ記憶手段に
分割して格納されるので、１つの単位サーバに障害が起
きたときでも、他の単位サーバの負担が大幅に増大する
ことがない。また単位サーバの有する第１のデータ記憶
手段に障害が生じた場合に限らず、第１の書き込み手段
などの単位サーバの他の回路部分に障害が生じたときで
も、障害の生じた単位サーバの第１のデータ記憶手段に
記憶されているものと同一のデータを他の単位サーバか
ら読み出すことができる。しかも請求項１記載の発明で
は、分割送出手段が自己以外の予め定められた複数の転
送先に第１のデータ記憶手段に書き込まれたのと同一の
データをそれぞれほぼ等分に分割して送出することにし
た。このため、データの分割される量が均一化し、障害
発生時の負荷が最小限に抑えられるという効果がある。As described above, according to the invention of claim 1, the copy of the contents of the first data storage means of each unit server is divided and stored in the second data storage means of a plurality of other unit servers, so that even when one unit server fails, the load on the other unit servers does not increase significantly. Moreover, not only when the first data storage means of a unit server fails but also when another circuit part of the unit server, such as the first writing means, fails, the same data as that stored in the first data storage means of the failed unit server can be read from the other unit servers. Furthermore, in the invention of claim 1, the divided sending means divides the same data as that written to the first data storage means approximately equally and sends it to a plurality of predetermined transfer destinations other than itself. The amount of data in each division is therefore uniform, which has the effect of minimizing the load when a failure occurs.

【００７３】また請求項２記載の発明によれば、ファイ
ル単位に他の複数の単位サーバに分割して格納している
ので、データの分割や、分割されたデータの読み出しを
容易に管理することができる。According to the invention of claim 2, the data is divided among a plurality of other unit servers in units of files, so that the division of data and the reading of the divided data can be managed easily.

【００７４】さらに請求項３記載の発明によれば、第１
のデータ記憶装置の記憶領域を複数に分割したブロック
ごとに同一のデータの転送先が割り振られているので、
データの分割や、分割されたデータの読み出しを容易に
管理することができる。According to the invention of claim 3, a transfer destination for the same data is allocated to each of the blocks into which the storage area of the first data storage device is divided, so that the division of data and the reading of the divided data can be managed easily.

【００７５】また請求項４記載の発明によれば、各単位
サーバの有する第１のデータ記憶手段に記憶されたもの
と同一のデータは、その単位サーバの属するグループ内
における他の複数の単位サーバの第２のデータ記憶手段
に分散された格納される。これにより、各グループ内で
１つの単位サーバの障害をリカバすることができ、デー
タ格納システム全体として２以上の単位サーバの障害に
対応することができる。According to the invention of claim 4, the same data as that stored in the first data storage means of each unit server is distributed across the second data storage means of a plurality of other unit servers in the group to which that unit server belongs. A failure of one unit server can therefore be recovered within each group, and the data storage system as a whole can cope with failures of two or more unit servers.

【００７６】さらに請求項５記載の発明によれば、各単
位サーバの有する第１のデータ記憶手段に記憶されたも
のと同一のデータは、その単位サーバの属する以外のグ
ループの単位サーバの第２のデータ記憶手段に格納され
る。たとえば、４以上のグループに分ければ、２以上の
グループの単位サーバの障害に対応することができる。According to the invention of claim 5, the same data as that stored in the first data storage means of each unit server is stored in the second data storage means of unit servers in a group other than the one to which that unit server belongs. For example, dividing the unit servers into four or more groups makes it possible to cope with failures of unit servers in two or more groups.

[Brief description of the drawings]

【図１】本発明の一実施例におけるデータ格納システム
の構成の概要を表わしたシステム構成図である。FIG. 1 is a system configuration diagram showing an outline of a configuration of a data storage system according to an embodiment of the present invention.

【図２】図１に示したディスク制御装置の回路構成の概
要を表わしたブロック図である。FIG. 2 is a block diagram showing an outline of the circuit configuration of the disk control device shown in FIG. 1.

【図３】図１に示したデータ格納システムの各記憶装置
の記憶内容の一例を模式的に表わした説明図である。FIG. 3 is an explanatory diagram schematically showing an example of the storage contents of each storage device of the data storage system shown in FIG. 1.

【図４】本発明の第１の変形例におけるデータ格納シス
テムの構成の概要を表わしたシステム構成図である。FIG. 4 is a system configuration diagram showing an outline of a configuration of a data storage system according to a first modified example of the present invention.

【図５】本発明の第２の変形例におけるデータ格納シス
テムの構成の概要を表わしたシステム構成図である。FIG. 5 is a system configuration diagram showing an outline of a configuration of a data storage system according to a second modified example of the present invention.

【図６】４つの部分クタスタから構成されるデータ格納
システムの概要を表わしたシステム構成図である。FIG. 6 is a system configuration diagram showing an outline of a data storage system composed of four partial clusters.

【図７】本発明の第３の変形例におけるデータ格納シス
テムの構成の概要を表わしたシステム構成図である。FIG. 7 is a system configuration diagram showing an outline of a configuration of a data storage system according to a third modification of the present invention.

【図８】従来から使用されているミラー方式を用いたデ
ータ格納システムの構成の概要を表わしたシステム構成
図である。FIG. 8 is a system configuration diagram showing an outline of a configuration of a data storage system using a mirror system which has been conventionally used.

【図９】従来から使用されているディスク制御装置の障
害に対応することのできるデータ格納システムの構成の
概要を表わしたシステム構成図である。FIG. 9 is a system configuration diagram showing an outline of a configuration of a data storage system capable of coping with a failure of a conventionally used disk control device.

【図１０】従来から使用されているディスク装置の障害
およびディスク制御装置の障害の双方に対応することの
できるバス構成の簡易なデータ格納システムの概要を表
わしたシステム構成図である。FIG. 10 is a system configuration diagram showing an outline of a simple data storage system having a bus configuration capable of coping with both a failure of a disk device and a failure of a disk control device conventionally used.

[Explanation of symbols]

１１、８７、１５４、２５１ネットワーク１２〜１５、４１、８１〜８６、１４１〜１５３、２４
１〜２４９ディスク制御装置１６、２１、２４、２７、９１〜９６、２６１〜２６９
バス１７、２２、２５、２８、４８、１０１〜１０６、１８
１〜１９３、２７１〜２７９通常用記憶装置１８、２３、２６、２９、４９、１１１〜１１６、２０
１〜２１３、２８１〜２８９障害用記憶装置３１〜３４単位サーバ４２ＣＰＵ４３ＣＰＵバス４４ＲＯＭ４５ＲＡＭ４６ネットワーク制御装置４７ＳＣＳＩコントローラ５１〜５９、６１〜６９、７１〜７６分割領域７７ホストコンピュータ１２１、１２２、２２１〜２２４部分クラスタ１３１、１３３、２３１、２３３、２３５、２３７通
常用記憶装置クラスタ１３２、１３４、２３２、２３４、２３６、２３８障
害用記憶装置クラスタ
11, 87, 154, 251: Network
12 to 15, 41, 81 to 86, 141 to 153, 241 to 249: Disk control device
16, 21, 24, 27, 91 to 96, 261 to 269: Bus
17, 22, 25, 28, 48, 101 to 106, 181 to 193, 271 to 279: Normal storage device
18, 23, 26, 29, 49, 111 to 116, 201 to 213, 281 to 289: Failure storage device
31 to 34: Unit server
42: CPU
43: CPU bus
44: ROM
45: RAM
46: Network control device
47: SCSI controller
51 to 59, 61 to 69, 71 to 76: Divided area
77: Host computer
121, 122, 221 to 224: Partial cluster
131, 133, 231, 233, 235, 237: Normal storage device cluster
132, 134, 232, 234, 236, 238: Failure storage device cluster

フロントページの続き (58)調査した分野(Int.Cl.⁶，ＤＢ名) G06F 12/00 G06F 12/16 G06F 3/06 G06F 11/16 - 11/20 G06F 15/16 Continued from the front page (58) Fields searched (Int. Cl.⁶, DB name): G06F 12/00, G06F 12/16, G06F 3/06, G06F 11/16 - 11/20, G06F 15/16

Claims

(57) [Claims]

1. A data storage system in which each of a plurality of unit servers connected via a predetermined network comprises first and second data storage means for storing data and a disk control device for controlling these data storage means, wherein the disk control device comprises:
communication means for transmitting to and receiving from the network the data to be input to and output from these data storage means;
first writing means for writing data into the first data storage means when data to be stored in the first data storage means is received through the network by the communication means;
division transmitting means for, when the first writing means writes data to the first data storage means, sending the same data through the network to a predetermined plurality of unit servers other than itself as transfer destinations, such that the stored contents of the first data storage means are divided approximately equally among those transfer destinations;
second writing means for, upon receiving via the network data sent from the division transmitting means of another unit server, writing the data to the second data storage means in correspondence with the address at which the data was written;
failure detecting means for detecting the presence or absence of a failure in a unit server; and
failure reading means for, when data stored in the first data storage means of a unit server in which the failure detecting means has detected a failure is to be read, instead reading the same data, which was divided approximately equally and transmitted, from the second data storage means of each transfer destination, based on the correspondence between the address at which the data was written in the first data storage means and the transfer destination.

2. The data storage system according to claim 1, wherein data is written to the first data storage means in units of files, and the division transmitting means distributes the stored contents of the first data storage means among the predetermined plurality of transfer destinations in units of the files written to the first data storage means.

3. The data storage system according to claim 1, wherein the storage area of the first data storage means is divided into a plurality of blocks, and the division transmitting means divides the stored contents of the first data storage means among the predetermined plurality of transfer destinations in units of the blocks.

4. The data storage system according to claim 1, wherein the plurality of unit servers are divided into two or more groups each composed of a plurality of unit servers, and the predetermined plurality of transfer destinations are the other unit servers within the respective group.

5. The data storage system according to claim 1, wherein the plurality of unit servers are divided into two or more groups each composed of a plurality of unit servers, and the predetermined plurality of transfer destinations are unit servers of another group.
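
The mechanism recited in claims 1 and 3 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `UnitServer`, `Cluster`, `mirror_target`, `write` and `read` are hypothetical, storage is modelled as in-memory dictionaries, and the round-robin rule (block address modulo the number of transfer destinations) is only one assumed way of dividing the mirrored contents approximately equally.

```python
from dataclasses import dataclass, field


@dataclass
class UnitServer:
    """Hypothetical model of one unit server (illustrative names only)."""
    name: str
    # First data storage means: block address -> data written by clients.
    primary: dict = field(default_factory=dict)
    # Second data storage means: (origin server, block address) -> mirrored data.
    mirror: dict = field(default_factory=dict)
    failed: bool = False


class Cluster:
    def __init__(self, names):
        self.servers = {n: UnitServer(n) for n in names}

    def destinations(self, origin):
        # Predetermined transfer destinations: every other unit server, fixed order.
        return [n for n in self.servers if n != origin]

    def mirror_target(self, origin, block):
        # Assumed division rule: round-robin on the block address, so each
        # destination receives roughly 1/(N-1) of the origin's blocks.
        dests = self.destinations(origin)
        return dests[block % len(dests)]

    def write(self, origin, block, data):
        # First writing means: store the block on the origin server.
        self.servers[origin].primary[block] = data
        # Division transmitting means plus second writing means: store the same
        # data on the destination selected for this block address.
        target = self.servers[self.mirror_target(origin, block)]
        target.mirror[(origin, block)] = data

    def read(self, origin, block):
        server = self.servers[origin]
        if not server.failed:
            return server.primary[block]
        # Failure reading means: recompute the destination from the address
        # and read the copy held in its second data storage means.
        target = self.servers[self.mirror_target(origin, block)]
        return target.mirror[(origin, block)]
```

With four servers, nine blocks written to one server leave three mirrored blocks on each of the other three. If that server is then marked as failed, reads are spread across the three survivors instead of falling on a single mirror. Restricting `destinations` to members of the same group or of another group would correspond to the variants in claims 4 and 5.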