JP3412665B2

JP3412665B2 - Disk array device and control method therefor

Info

Publication number: JP3412665B2
Application number: JP14325696A
Authority: JP
Inventors: 誠水上; 康暁田中; 茂太郎岩津; 伸芳井沢; 啓章白水; 隆河野
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1996-06-05
Filing date: 1996-06-05
Publication date: 2003-06-03
Anticipated expiration: 2016-06-05
Also published as: JPH09325866A

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a disk array device, and a control method therefor, capable of minimizing the degradation in access performance that occurs while the data of a failed disk drive is being restored.

[0002]

[Prior Art] As shown in FIG. 1, a disk array device 2 has a plurality of disk drives 4A, 4B, 4C, and 4D. FIG. 1 shows the basic configuration of the disk array device 2. The disk drives 4A, 4B, 4C, and 4D are controlled by a disk array controller 3 connected to them via an internal bus 5. The disk array controller 3 is also connected to a host computer 1 via an external bus 6, and the disk drives 4A, 4B, 4C, and 4D are accessed in response to read/write commands from the host computer 1.

[0003] Each of the disk drives 4A etc. normally holds, as internal data, information such as whether it is functioning normally or whether an abnormal state has been detected. The host computer 1 checks this stored information to detect abnormalities in the disk drives 4A etc. Specifically, when the host computer 1 receives a read/write error (R/W error) from a disk drive 4A etc. while reading or writing data, it checks the information stored in that disk drive and, depending on the symptom, either retries the read or write or determines that the disk drive has failed.

[0004] To detect data errors caused by a failure of a disk drive 4A etc., a technique of providing a parity disk has conventionally been used. In this technique, for example, the data at the same byte position in the same physical sector of each of the disk drives 4A etc. is combined by an exclusive-OR operation, and the parity data generated from the result is recorded together with the data.
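As a minimal sketch of the parity generation described above, the following Python computes a parity block as the byte-wise exclusive OR of the blocks held at the same position on each data drive. The helper name `xor_parity` and the sample sector contents are illustrative assumptions and do not appear in the patent.

```python
from functools import reduce


def xor_parity(blocks: list[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized blocks (hypothetical helper)."""
    if not blocks:
        raise ValueError("at least one block is required")
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    # XOR the bytes found at the same offset in every block.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Made-up sector contents for three data drives.
sectors = [b"\x0f\xaa", b"\xf0\x55", b"\x3c\x00"]
parity = xor_parity(sectors)

# With even parity, XOR-ing the data blocks together with the parity block gives zeros.
check = xor_parity(sectors + [parity])
```

A nonzero byte in `check` would indicate that one of the blocks no longer matches the recorded parity.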

[0005] However, with this kind of method it is not possible to restore data when a disk drive 4A etc. fails. The disk array controller 3 of the conventional disk array device 2 is therefore provided with a processing function that, when a disk drive 4A etc. fails, uses the parity data to restore the data of the failed disk drive 4A onto the new replacement disk drive 4A etc.

[0006] This data restoration function is normally executed on the new disk 4A in order from the first sector, and is executed as background processing. Sectors restored by this background processing have no effect at all on read/write commands from the host computer 1; that is, normal high-speed read/write processing is possible for them.

[0007]

[Problems to Be Solved by the Invention] With the data restoration function of the disk array device described above, however, when a read/write command is received from the host computer while data restoration is in progress, any sector that has not yet been restored is restored as foreground processing. As a result, the access performance seen from the host computer drops significantly.

[0008] That is, in the case of a data write command, the data is written to the not-yet-restored sector of the new disk drive and the parity data is updated based on that data, so nearly normal access performance is maintained. In the case of a read command, however, which is far more likely to be executed than a write command, the data is read from the desired sector only after that sector has been restored onto the new disk drive. A read command for data that has not been restored therefore significantly degrades access performance.

[0009] The main objects of the present invention are as follows. A first object of the present invention is to provide a disk array device and a control method therefor that can minimize the degradation in access performance that occurs during data restoration of a failed disk drive.

[0010] A second object of the present invention is to provide a disk array device and a control method therefor that can minimize the effect of a disk drive failure in multimedia service systems, such as video on demand, in which a file that has just been accessed is very likely to be accessed again.

[0011] Other objects of the present invention will become apparent from the specification, the drawings, and in particular the recitations of the claims.

[0012]

[Means for Solving the Problems] The problems described above are solved, and the above objects achieved, by the present invention adopting the novel characteristic configurations and techniques listed below.

[0013] That is, a first feature of the device of the present invention is a disk array device having a plurality of disk drives that record data and a disk array controller connected to the disk drives via an internal bus, wherein the disk array controller has: a redundant data generation function unit that, when data is stored in the disk drives, divides the data into pieces of a predetermined data length and generates redundant data for the group of divided data; a data restoration function unit that restores, by referring to the redundant data, divided data that can no longer be reproduced because of a disk drive failure; an access probability table that records the access probability of the files recorded on the plurality of disk drives; and a repair table in which a flag is set for each file to be restored. When data is written, the redundant data generation function unit generates divided data of the predetermined length and redundant data for that divided data, and the data group is distributed across and recorded on the plurality of disk drives; when the flag of a recorded file in the repair table is reset, data is read and written. Each time a file recorded on the plurality of disk drives is accessed, the access probability of that file is recorded in the access probability table. When an abnormal state of a disk drive is detected and the failed disk drive is switched over in response, background processing is performed in which the flag in the repair table is set for each file to be restored, the data restoration function unit restores the file, the restored data of the file is recorded on the new disk drive, and the flag of the restored file in the repair table is reset. When a read command for a file is received during this background processing, then, if the flag of the file in the repair table is set and the access probability of the file recorded in the access probability table is equal to or greater than a predetermined threshold, the data restoration function unit restores the file, the restored data of the file is recorded on the new disk drive, and the flag of the file in the repair table is reset; whereas, if the flag of the file in the repair table is set and the access probability recorded in the access probability table is less than the predetermined threshold, the data restoration function unit restores the file and the read is carried out without recording the restored data of the file on the new disk drive.
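The read handling in this first feature can be sketched as follows. This is a toy model under stated assumptions, not the patented implementation: the class name `Controller`, its fields, the returned labels, and the sample file names and probabilities are all hypothetical, and actual data reconstruction is left out so that only the threshold decision is shown.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Toy model of the repair table and access probability table (hypothetical)."""
    threshold: float
    needs_repair: dict[str, bool] = field(default_factory=dict)  # repair table flags
    access_prob: dict[str, float] = field(default_factory=dict)  # access probability table
    new_drive: set[str] = field(default_factory=set)             # files recorded on the new drive

    def read(self, name: str) -> str:
        """Serve a read and return a label for the path taken."""
        if not self.needs_repair.get(name, False):
            return "normal read"
        # Flag is set, so the file's data must be restored from the redundant data.
        if self.access_prob.get(name, 0.0) >= self.threshold:
            self.new_drive.add(name)         # record the restored data
            self.needs_repair[name] = False  # reset the flag
            return "restored and recorded"
        return "restored without recording"  # flag stays set


ctl = Controller(
    threshold=0.5,
    needs_repair={"hot.mpg": True, "cold.mpg": True},
    access_prob={"hot.mpg": 0.9, "cold.mpg": 0.1},
)
first = ctl.read("hot.mpg")
second = ctl.read("hot.mpg")
cold = ctl.read("cold.mpg")
```

In this sketch a frequently accessed file pays the write cost once and is served normally afterwards, while a rarely accessed file is left for the background processing to record.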

[0014] A second feature of the device of the present invention is a disk array device in which the data length into which the disk array controller divides data in the first feature of the device is in units of bits, bytes, or words.

[0015] A third feature of the device of the present invention is a disk array device in which the disk drives of the first or second feature of the device are provided with a spare disk drive to be swapped in and used when a failure occurs.

[0016] A fourth feature of the device of the present invention is a disk array device in which each disk drive of the first, second, or third feature of the device holds, as internal data, information that allows a computer to detect an abnormal state of that disk drive when reading or writing data.

[0017] A fifth feature of the device of the present invention is a disk array device in which the disk drives of the first, second, third, or fourth feature of the device are the external storage of a host computer that provides a multimedia service such as video on demand.

[0018] A first feature of the method of the present invention is a control method for a disk array device in which a plurality of disk drives are controlled by a disk array controller connected via an internal bus, wherein: when data is written, divided data of a predetermined length and redundant data are generated and the data group is distributed across and recorded on the plurality of disk drives, and when the flag of a recorded file is reset, data is read and written; each time a file recorded on the plurality of disk drives is accessed, the access probability of that file is recorded; when an abnormal state of a disk drive is detected and the failed disk drive is switched over in response, background processing is performed in which the flag is set for each file to be restored, the file is restored, the restored data of the file is recorded on the new disk drive, and the flag of the restored file is reset; and when a read command for a file is received during the background processing, then, if the file is unrepaired and its access probability is equal to or greater than a predetermined threshold, the file is restored, the restored data of the file is recorded on the new disk drive, and the flag of the file is reset, whereas, if the file is unrepaired and its access probability is less than the predetermined threshold, the file is restored and the read is carried out without recording the restored data of the file on the new disk drive.

[0019] A second feature of the method of the present invention is a control method for a disk array device in which the access probability in the first feature of the method is obtained from the number of accesses to the file per unit time or from the time elapsed since the file was accessed.
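The two ways of estimating access probability named in this feature could be sketched as below. The patent does not say how either quantity is turned into a probability, so the class `AccessTable`, the sliding-window rate, and the decay function 1/(1 + elapsed) are all assumptions chosen only for illustration.

```python
from collections import defaultdict


class AccessTable:
    """Hypothetical access probability table keyed by file name."""

    def __init__(self) -> None:
        self._times: dict[str, list[float]] = defaultdict(list)

    def record(self, name: str, now: float) -> None:
        """Note one access to `name` at time `now`."""
        self._times[name].append(now)

    def rate(self, name: str, now: float, window: float) -> float:
        """Accesses per unit time over the last `window` time units."""
        recent = [t for t in self._times[name] if now - window < t <= now]
        return len(recent) / window

    def recency(self, name: str, now: float) -> float:
        """Score that decays with time since the last access (assumed 1/(1+dt))."""
        times = self._times[name]
        if not times:
            return 0.0
        return 1.0 / (1.0 + (now - max(times)))


table = AccessTable()
for t in (1.0, 2.0, 9.0):
    table.record("movie.mpg", t)

rate_now = table.rate("movie.mpg", now=10.0, window=5.0)
recency_now = table.recency("movie.mpg", now=10.0)
```

Which estimator suits a given system depends on its access pattern, as the description notes later when it discusses calculation methods.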

[0020] A third feature of the method of the present invention is a control method for a disk array device in which the threshold for the access probability of the file in the second feature of the method is obtained as: number of accesses per average waiting time until the file is restored = 2(Kw - 1) / {TR(KE - 1)}, where Kw = (data read time with data repair that includes writing) / (data read time when all disk drives are normal), KE = (data read time with data repair that does not include writing) / (data read time when all disk drives are normal), and TR = time required to repair the new disk drive.
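The threshold expression in this feature can be evaluated directly. The sketch below is only a worked calculation: the function name `access_threshold`, the input values, and the guard that rejects TR <= 0 or KE <= 1 are assumptions added for illustration.

```python
def access_threshold(k_w: float, k_e: float, t_r: float) -> float:
    """Evaluate 2(Kw - 1) / (TR (KE - 1)) from the third method feature."""
    if t_r <= 0 or k_e <= 1:
        # Guard added for this sketch; the patent does not state these limits.
        raise ValueError("expects TR > 0 and KE > 1")
    return 2.0 * (k_w - 1.0) / (t_r * (k_e - 1.0))


# Made-up inputs: a repairing read that also writes takes 3x a normal read,
# a repairing read that does not write takes 2x, and the rebuild takes 100 time units.
threshold = access_threshold(k_w=3.0, k_e=2.0, t_r=100.0)
```

With these inputs the expression gives 2 x 2 / (100 x 1) = 0.04 accesses per time unit.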

[0021] A fourth feature of the method of the present invention is a control method for a disk array device in which the data length into which the disk array controller divides data in the first, second, or third feature of the method is in units of bits, bytes, or words.

[0022] A fifth feature of the method of the present invention is a control method for a disk array device in which the disk drives of the first, second, third, or fourth feature of the method are provided with a spare disk drive to be swapped in and used when a failure occurs.

[0023] A sixth feature of the method of the present invention is a control method for a disk array device in which each disk drive of the first, second, third, fourth, or fifth feature of the method holds, as internal data, information that allows a computer to detect an abnormal state of that disk drive when reading or writing data.

[0024] A seventh feature of the method of the present invention is a control method for a disk array device in which the disk drives of the first, second, third, fourth, fifth, or sixth feature of the method are the external storage of a host computer that provides a multimedia service such as video on demand.

[0025]

[Embodiments of the Invention] Embodiments of the present invention are described below, with reference to the accompanying drawings, on the basis of a device example and method examples.

[0026] (Device Example) FIG. 1 shows the basic configuration of the disk array device 2 of this device example, FIG. 2 shows the flow of data read/write processing in normal operation, and FIG. 3 shows the flow of data read/write processing when a failure has occurred.

[0027] As described in the Prior Art section, the disk array device 2 has a plurality of disk drives 4A, 4B, 4C, and 4D. The disk drives 4A, 4B, 4C, and 4D are controlled by the disk array controller 3 connected to them via the internal bus 5. The disk array controller 3 is also connected to the host computer 1 via the external bus 6, and the disk drives 4A, 4B, 4C, and 4D are accessed in response to read/write commands from the host computer 1.

[0028] In the disk array device 2 of this device example, when the host computer 1 issues a data write command to the disk array controller 3, the disk array controller 3 divides the data transferred from the host computer 1 into, for example, three pieces as shown in FIG. 2, and generates, for example, even-parity data for the divided data, producing four pieces of divided data in total. The data generated in this way is divided among and recorded on the four disk drives 4A, 4B, 4C, and 4D.

[0029] When the recorded data is read, the three pieces of divided data, excluding the parity data, are normally read from the three disk drives 4A, 4B, and 4C, combined, and transferred to the host computer 1. If, for example, one disk drive 4C fails, then, as shown in FIG. 3, the parity data and the divided data are read from the three disk drives 4A, 4B, and 4D that have not failed, an even-parity operation is performed on that data to restore the divided data that can no longer be reproduced because of the failure of the disk drive 4C, the restored data is written to a new disk drive (not shown), and the written data is read back and transferred to the host computer 1.
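The write and degraded-read flows of FIGS. 2 and 3 can be illustrated with a short sketch. It assumes byte-level striping into three equal chunks with zero padding, which the patent does not specify, and the function names `stripe_with_parity` and `degraded_read` are hypothetical.

```python
from functools import reduce
from operator import xor
from typing import Optional


def xor_chunks(chunks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized chunks (even parity)."""
    return bytes(reduce(xor, column) for column in zip(*chunks))


def stripe_with_parity(data: bytes, n_data: int = 3) -> list[bytes]:
    """Split data into n_data equal chunks and append one parity chunk."""
    chunk_len = -(-len(data) // n_data)               # ceiling division
    padded = data.ljust(chunk_len * n_data, b"\x00")  # zero padding is an assumption
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(n_data)]
    return chunks + [xor_chunks(chunks)]


def degraded_read(stripes: list[Optional[bytes]], length: int) -> bytes:
    """Rebuild the payload when at most one stripe (marked None) is missing."""
    missing = [i for i, s in enumerate(stripes) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can recover only one missing stripe")
    filled = list(stripes)
    if missing:
        survivors = [s for s in stripes if s is not None]
        filled[missing[0]] = xor_chunks(survivors)    # XOR of survivors yields the lost chunk
    return b"".join(filled[:-1])[:length]             # drop parity and padding


payload = b"disk array"
stripes = stripe_with_parity(payload)   # drives 4A, 4B, 4C hold data; 4D holds parity
after_failure = list(stripes)
after_failure[2] = None                 # drive 4C has failed
recovered = degraded_read(after_failure, len(payload))
```

The same XOR recovers a lost parity chunk as well, which is why a failure of the parity drive does not prevent a normal read of the data chunks.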

[0030] If the disk drive 4D, which records the parity data, fails, the parity data is restored in the same way. Because the data can still be read by the normal procedure, however, there is no degradation in the access performance seen from the host computer 1.

[0031] FIG. 4 shows the configuration of the disk array controller 3 in the disk array device 2 of this device example. The disk array controller 3 has: a redundant data generation function unit 31 that divides the data transferred from the host computer 1 and generates parity data; a data restoration function unit 32 that detects a failed disk drive among the disk drives 4A to 4D and performs a parity operation to restore the divided data that can no longer be reproduced because of the failure of the disk drive 4A, 4B, 4C, or 4D; an access probability table 33 that records the access probability of the files recorded in the disk array device 2; a repair table 34 that indicates whether each file has been restored by the disk array controller 3; an internal bus interface 36 for accessing the disk drives 4A, 4B, 4C, and 4D; and an external bus interface 35 for connecting to the host computer 1. The controller is configured as follows. When data is written, the redundant data generation function unit generates divided data of a predetermined length and redundant data for that divided data, and the data group is distributed across and recorded on the plurality of disk drives; when the flag of a recorded file in the repair table is reset, data is read and written. Each time a file recorded on the plurality of disk drives is accessed, the access probability of that file is recorded in the access probability table. When an abnormal state of a disk drive is detected and the failed disk drive is switched over in response, background processing is performed in which the flag in the repair table is set for each file to be restored, the data restoration function unit restores the file, the restored data of the file is recorded on the new disk drive, and the flag of the restored file in the repair table is reset. When a read command for a file is received during the background processing, then, if the flag of the file in the repair table is set and the access probability of the file recorded in the access probability table is equal to or greater than a predetermined threshold, the data restoration function unit restores the file, the restored data of the file is recorded on the new disk drive, and the flag of the file in the repair table is reset; whereas, if the flag of the file in the repair table is set and the access probability recorded in the access probability table is less than the predetermined threshold, the data restoration function unit restores the file and the read is carried out without recording the restored data of the file on the new disk drive.

[0032] In the normal state, in which none of the disk drives 4A, 4B, 4C, and 4D has failed, the redundant data generation function unit 31 and the data restoration function unit 32 in the disk array controller 3 record and reproduce data according to the processing flow shown in FIG. 2. When one of the disk drives 4A, 4B, 4C, and 4D fails, data equivalent to that of the failed disk drive 4C is restored onto a disk drive connected in place of the failed disk drive 4C, or onto a hot-standby disk drive that has been started up in advance.

[0033] (First Method Example) Next, a method example of the present invention carried out using the device example above is described. FIG. 5 is a flowchart of the data restoration procedure in the control method for the disk array device 2 of the first method example.

[0034] First, it is determined whether a failure of the disk drive 4C has been detected and, in response, the failed disk drive 4C has been replaced manually or switched to a hot-standby disk drive (ST1). If the failed disk drive 4C has been switched, the disk array controller 3 identifies the disk drive 4C to be restored (ST2) and sets the flags of all files registered in the repair table 34 (ST3). A file whose flag is set has not been restored, and the flag is reset only once the file has been restored.

[0035] Next, the disk array controller 3 refers to the access probability table 33 (ST4) and extracts, from the files to be restored, the file with the highest access probability (ST5). It then uses the data restoration function unit 32 to restore the divided data of that file that can no longer be reproduced, and records the restored divided data on the new disk drive (ST6).

[0036] Finally, when restoration of the file is complete, the disk array controller 3 resets the flag of the corresponding file in the repair table 34 (ST7). A file that has been restored in this way and whose flag in the repair table 34 has been reset is handled by the normal read/write processing shown in FIG. 2 on subsequent accesses.

[0037] Furthermore, if it is determined that files to be restored remain (ST8), the processing of ST4 to ST7 is performed again. That is, the disk array controller 3 continues to extract, from among the files whose flags are set in the repair table 34, the file with the highest access probability and to restore it, and ends file restoration when no files remain to be restored. All of this processing is executed as background processing.
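Steps ST3 to ST8 of the first method example amount to a loop that repeatedly picks the flagged file with the highest access probability. The sketch below models only that ordering: the function name `background_restore`, the `restore` callback standing in for the data restoration function unit 32, and the sample file names and probabilities are hypothetical.

```python
from typing import Callable


def background_restore(
    needs_repair: dict[str, bool],
    access_prob: dict[str, float],
    restore: Callable[[str], None],
) -> list[str]:
    """Restore flagged files, most likely to be accessed first (ST4 to ST8)."""
    order: list[str] = []
    while True:
        # Re-scan the repair table on each pass so that flags reset elsewhere
        # (for example by a foreground read) are respected.
        pending = [name for name, flagged in needs_repair.items() if flagged]
        if not pending:                   # ST8: nothing left to restore
            return order
        # ST4, ST5: consult the access probability table and pick the top file.
        target = max(pending, key=lambda name: access_prob.get(name, 0.0))
        restore(target)                   # ST6: rebuild and record on the new drive
        needs_repair[target] = False      # ST7: reset the flag
        order.append(target)


# ST3: once the drive has been switched, every registered file is flagged.
flags = {name: True for name in ("a.mpg", "b.mpg", "c.mpg")}
probs = {"a.mpg": 0.2, "b.mpg": 0.7, "c.mpg": 0.1}
restored_order = background_restore(flags, probs, restore=lambda name: None)
```

In a real controller this loop would run at background priority and yield to host commands; that scheduling is outside the scope of the sketch.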

[0038] On the other hand, if it is determined that the failed disk drive 4C has not been replaced (ST1), the restored divided data is not saved, so the disk array controller 3 simply combines the restored divided data with the reproducible divided data and transfers the result to the host computer 1. Various methods of calculating the access probability are conceivable, for example obtaining it from the number of accesses to the file per unit time or from the time elapsed since the file was accessed; the method should be chosen to suit the characteristics of the system.

[0039] Next, the processing procedure when a read command is received from the host computer 1 while file restoration is in progress in the background is described. FIG. 6 shows the read procedure during file restoration in the control method for the disk array device 2 of the first method example, and FIG. 7 shows the write procedure during file restoration in the same control method.

[0040] When a read command is received from the host computer 1 while files are being restored in the background, the disk array controller 3 refers to the repair table 34, as shown in FIG. 6, to check whether the desired file has been restored (ST10). If the desired file has been restored, it is read as shown in FIG. 2 (ST13).

[0041] If the desired file has not been restored, it is restored in the foreground (ST11), the flag of the restored file in the repair table 34 is reset, and the desired file is read (ST12, ST13). In other words, the data restoration function unit 32 restores the data that became unreadable because of the failure of the disk drive 4C, writes that data to the new disk drive, and then executes normal read processing.

[0042] When a write command is received from the host computer 1 during file restoration, the data is written to the new disk drive as usual (ST20), as shown in FIG. 7, and the flag of the written file in the repair table 34 is reset (ST21).
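The read and write handling of FIGS. 6 and 7 can be sketched with a toy model. The class `RebuildingArray`, its fields, and the sample files are hypothetical; actual reconstruction from parity is replaced by a counter so that only the flag handling (ST10 to ST13, ST20, ST21) is shown.

```python
class RebuildingArray:
    """Toy model of host access while a drive is being rebuilt (hypothetical)."""

    def __init__(self, files: dict[str, bytes]) -> None:
        self.files = dict(files)                            # logical file contents
        self.needs_repair = {name: True for name in files}  # repair table: set = unrestored
        self.foreground_restores = 0

    def read(self, name: str) -> bytes:
        if self.needs_repair[name]:          # ST10: check the repair table
            self.foreground_restores += 1    # ST11: restore in the foreground
            self.needs_repair[name] = False  # ST12: reset the flag
        return self.files[name]              # ST13: normal read

    def write(self, name: str, data: bytes) -> None:
        self.files[name] = data              # ST20: write to the new drive as usual
        self.needs_repair[name] = False      # ST21: reset the flag


array = RebuildingArray({"a.mpg": b"old-a", "b.mpg": b"old-b"})
array.read("a.mpg")             # first read triggers a foreground restore
array.read("a.mpg")             # second read takes the normal path
array.write("b.mpg", b"new-b")  # write needs no restore of the old data
```

Only the first read of an unrestored file pays the restoration cost in this model; later reads, and any read that follows a write, take the normal path.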

[0043] In this way, the disk array controller 3 restores files in descending order of access probability and makes each restored file immediately available for normal access. If the host computer 1 issues a read/write command for a file while files are being restored, the controller reads the file after restoring it, or writes the file, and then immediately resets the flag in the repair table 34 so that normal access is possible. As a result, even while the disk array device 2 is recovering the data of the disk drive 4C, the probability that access performance degrades is extremely low.

【００４４】(Second Method Example) Next, a second method example of the present invention is described. FIG. 8 is a flowchart of the data restoration procedure in the control method of the disk array device 2 according to the second method example.

【００４５】In this method example, as in the first method example, it is first determined whether the failed disk drive 4C has been replaced manually or switched over to a hot-standby disk drive (ST1').

【００４６】When the failed disk drive 4C has been switched, the disk array controller 3 identifies the disk drive 4C to be restored (ST2') and sets the flags of all files registered in the repair table 34 (ST3'). As in the first method example, a set flag indicates that the file has not yet been restored; the flag is reset only once the file has been restored.

【００４７】Next, the disk array controller 3 selects a file to be restored, for example at random or in descending order of access probability by referring to the access probability table 33 (ST4'), uses the data restoration function unit 32 to restore the divided data of that file that can no longer be reproduced (ST5'), and writes the restored divided data to the new disk drive 4C (ST6').

【００４８】Finally, when restoration of the file is complete, the disk array controller 3 resets the flag of that file in the repair table 34 (ST7'). Subsequent accesses to a file that has been restored in this way, and whose flag in the repair table 34 has been reset, are handled by the normal read/write processing shown in FIG. 2.

【００４９】If it is determined that files to be restored remain (ST8'), the processing of ST4' to ST7' is repeated. That is, the disk array controller 3 continues to select files to be restored from among those whose flags are set in the repair table 34, restores them in the same way, and ends the file restoration process when no files remain to be restored.
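The background loop of ST4' to ST8' can be sketched as below, under the assumption that files are picked in descending order of access probability (the text also allows random selection). The function name and the `restore` and `write_new` callables are hypothetical placeholders for the data restoration function unit 32 and the write to the new disk drive.

```python
def background_restore(repair_flag, access_prob, restore, write_new):
    """Restore every flagged file, most probable first; return the order used.

    repair_flag: dict of file name -> True while the file still needs restoring
    access_prob: dict of file name -> access probability (table 33)
    restore:     callable(name) -> restored data (stand-in for unit 32)
    write_new:   callable(name, data) that stores data on the new drive
    """
    order = []
    # ST8': repeat while any file still has its flag set.
    while any(repair_flag.values()):
        pending = [name for name, flagged in repair_flag.items() if flagged]
        # ST4': pick the pending file with the highest access probability.
        target = max(pending, key=lambda name: access_prob.get(name, 0.0))
        data = restore(target)          # ST5': rebuild the lost divided data
        write_new(target, data)         # ST6': write it to the new drive
        repair_flag[target] = False     # ST7': reset the flag
        order.append(target)
    return order
```

Because the flag is reset immediately after each file is written, a request handler running concurrently can treat that file as normal from that point on.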

【００５０】All of this processing is executed in the background. As in the first method example described above, the method of calculating the access probability should be chosen to suit the characteristics of the system.

【００５１】Next, the procedure followed when a read/write command (R/W command) is received from the host computer 1 during file restoration is described. FIG. 9 is a flowchart of the read/write procedure during file restoration in the control method of the disk array device 2 according to the second method example.

【００５２】While files are being restored in the background, it is determined whether a read/write command (R/W command) has been received from the host computer 1 (ST9'). When an R/W command has been received, the disk array controller 3 refers to the repair table 34, as shown in FIG. 9, and checks whether the desired file has been restored (ST10', ST11').

【００５３】If the file has been restored, the normal read/write processing shown in FIG. 2 is executed (ST12'). If it has not been restored, it is further determined whether the command is a read command (ST13'). For a read command, the access probability table is consulted (ST14'), and it is determined whether the access probability of the file is greater than or equal to a predetermined threshold (ST15').

【００５４】If the access probability is greater than or equal to the predetermined threshold, the file is restored in the foreground (ST16'), the restored file data is written to the new disk drive (ST17'), the flag of the restored file in the repair table 34 is reset (ST18'), and the normal file read process is executed (ST19').

【００５５】On the other hand, if the access probability of the file is below the predetermined threshold, the file is restored (ST20') and the file read process is executed (ST21').

【００５６】That is, the restored file is transferred to the host computer 1 without its data being written to the new disk drive and without its flag in the repair table 34 being reset. In this case the restored data is not recorded, so if the file is accessed again, the same file must be restored again.

【００５７】If it is determined at ST13' that a write command has been received from the host computer 1 during file restoration, the data is written to the new disk drive as usual (ST22') and the flag of the written file in the repair table 34 is reset (ST23').

【００５８】In this way, the disk array controller 3 decides whether to record the restored data of a file being read according to that file's access probability. For files with a high access probability it records the restored data, which prevents the read time from increasing when the file is accessed again.

【００５９】For files with a low access probability, the restored data is not recorded and recording is left to the background process, which keeps the increase in read time relatively small. Consequently, even while the disk array device 2 is recovering data, the degradation of access performance is extremely small.
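The request handling of FIG. 9 (ST10' to ST23') can be sketched as follows. This is only an illustration of the branching described above; the function name `handle_request`, the layout of the `state` dictionary, and the `restore` callable are assumptions made for the example.

```python
def handle_request(op, name, state, threshold, data=None):
    """Handle a read or write during restoration (second method example).

    state is a hypothetical dict with:
      "flag":      file name -> True while unrestored (repair table 34)
      "prob":      file name -> access probability (table 33)
      "new_drive": file name -> data stored on the replacement drive
      "restore":   callable(name) -> restored data (stand-in for unit 32)
    """
    flag, new_drive = state["flag"], state["new_drive"]

    if not flag.get(name, False):
        # ST10'/ST11' -> ST12': file already restored, normal processing.
        if op == "write":
            new_drive[name] = data
            return None
        return new_drive[name]

    if op == "write":
        # ST13' -> ST22'/ST23': write as usual and reset the flag.
        new_drive[name] = data
        flag[name] = False
        return None

    restored = state["restore"](name)
    if state["prob"].get(name, 0.0) >= threshold:
        # ST14'/ST15' -> ST16'..ST19': persist the restored data.
        new_drive[name] = restored
        flag[name] = False
        return new_drive[name]

    # ST20'/ST21': return the data without persisting; flag stays set,
    # so the background process (or a later read) restores it again.
    return restored
```

The only difference between the two read branches is whether the restored data is written back, which is the trade-off the threshold controls.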

【００６０】Next, for the control method of the disk array device 2 according to this method example, the access time when data is written in the foreground process and the access time when data is written in the background process are described.

【００６１】(When data is written in the foreground process) FIG. 10 shows the access time when data is written in the foreground process using the disk array device 2 of this device example. When a file is read while the new disk drive is being repaired and the restored data of that file is written to the new disk drive, the file read time is as shown in FIG. 10.

【００６２】Here, let Kw be the factor by which the data read time increases on the first read, when the restored data is also written. Because files are restored in random order, assume that the average waiting time until a given file is repaired is 1/2 of the time TR needed to repair the new disk drive, and that the desired file is accessed N times during this period. The average read time of a file whose restored data is written to the new disk drive is then {Kw + (N - 1)}/N times the read time when the disk drives are in the normal state.
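As a small numeric illustration of the expression {Kw + (N - 1)}/N, the sketch below computes the average read-time ratio for a file whose first read pays the write-back cost and whose remaining N - 1 reads take the normal time. The function name is an assumption made for the example.

```python
def avg_read_ratio_with_writeback(kw, n):
    """Average read time relative to normal when the first of n reads costs kw.

    kw: ratio of the first read (restore + write-back) to a normal read
    n:  number of reads of the file within the waiting period (n >= 1)
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # One read costs kw, the other n - 1 reads cost 1 each.
    return (kw + (n - 1)) / n
```

With a single read the ratio equals Kw, and it falls toward 1 as the file is read more often, which is why write-back pays off only for frequently read files.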

【００６３】(When data is written in the background process) FIG. 11 shows the access time when data is written in the background process using the disk array device 2 of this device example. When a file is read while the new disk drive is being repaired and the restored data of that file is not written to the new disk drive in the foreground process, the file read time is as shown in FIG. 11.

【００６４】Because the restored data is not written in the foreground process, data restoration is performed on every read. If KE denotes the factor by which this restoration increases the read time, the average read time is KE times the read time when the disk drives are in the normal state. The access probability threshold is therefore obtained from the condition that the two average read times above are equal, and is expressed by the following equation.

【００６５】Access probability threshold = (value of N at which the two average read times are equal) / (average waiting time until the file is repaired) = 2(Kw - 1) / {TR(KE - 1)}
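The threshold follows from equating the two average read-time ratios and dividing the resulting N by the average waiting time TR/2. The steps, written out in LaTeX with the same symbols (the subscripted notation is only a typesetting choice), are:

```latex
\begin{align*}
\frac{K_w + (N-1)}{N} &= K_E
  && \text{(equal average read times)} \\
K_w - 1 &= N\,(K_E - 1) \\
N &= \frac{K_w - 1}{K_E - 1} \\
\text{threshold} &= \frac{N}{T_R/2}
  = \frac{2\,(K_w - 1)}{T_R\,(K_E - 1)}
\end{align*}
```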

【００６６】Specifically, if Kw = 2.0 and KE = 1.3 as shown in FIGS. 10 and 11, the access probability threshold is 2/(0.3TR). Restored data is written for files expected to be read at least 7 times before the new disk drive is repaired, that is, at least 3.5 times within the average waiting time until the file is repaired. Restored data is not written for files expected to be read fewer than 7 times before the new disk drive is repaired, that is, fewer than 3.5 times within the average waiting time until the file is repaired. Doing so minimizes the degradation of access performance while files are being restored.
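The numbers in this example can be checked with a short sketch. The function names are assumptions; the formulas are the ones given in the text. Note that 2/0.3 is about 6.67 reads per TR (about 3.33 per TR/2), which the text rounds up to 7 and 3.5.

```python
def break_even_reads(kw, ke):
    """Number of reads N at which write-back and no-write-back cost the same."""
    if ke <= 1.0:
        raise ValueError("ke must be greater than 1")
    return (kw - 1.0) / (ke - 1.0)


def access_probability_threshold(kw, ke, tr):
    """Threshold in reads per unit time: N divided by the mean wait TR/2."""
    return break_even_reads(kw, ke) / (tr / 2.0)


# Values taken from the example in the text; TR is set to 1 so the
# threshold reads as "accesses per repair time".
n = break_even_reads(2.0, 1.3)
per_tr = access_probability_threshold(2.0, 1.3, 1.0)
print(f"break-even reads per TR/2: {n:.2f}, per TR: {per_tr:.2f}")
```

At the break-even N, the write-back average {Kw + (N - 1)}/N equals KE, so files read more often than this benefit from having their restored data written in the foreground.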

【００６７】Representative device examples and method examples of the present invention have been described above, but the present invention is not necessarily limited to the means of those device examples or the techniques of those method examples. It may be modified as appropriate and implemented within a range that achieves the object of the present invention and provides the effects described below.

【００６８】[0068]

[Effects of the Invention] As described above, according to the present invention, the data of a failed disk drive is restored efficiently on the basis of file access probabilities, so the degradation of access performance that tends to occur during data restoration can be kept to a minimum.

【００６９】The invention is particularly well suited to minimizing the effect of disk drive failures in multimedia service systems such as video on demand, in which a file that has just been accessed is very likely to be accessed again.

[Brief description of drawings]

FIG. 1 is a block diagram showing the basic configuration of a disk array device.

FIG. 2 is an explanatory diagram showing the flow of normal data read/write processing by the disk array device of the device example of the present invention.

FIG. 3 is an explanatory diagram showing the flow of data read/write processing by the same device when a disk drive has failed.

FIG. 4 is a block diagram showing the configuration of the disk array controller 3 in the disk array device 2 of this device example.

FIG. 5 is a flowchart showing the data restoration procedure in the control method of the disk array device according to the first method example.

FIG. 6 is a flowchart showing the read procedure during file restoration in the same control method.

FIG. 7 is a flowchart showing the write procedure during file restoration in the same control method.

FIG. 8 is a flowchart showing the data restoration procedure in the control method of the disk array device according to the second method example.

FIG. 9 is a flowchart showing the read/write procedure during file restoration in the same control method.

FIG. 10 is a graph showing the access time when data is written in the foreground process using the disk array device of this device example.

FIG. 11 is a graph showing the access time when data is written in the background process using the same device.

[Explanation of symbols]

1 ... Host computer
2 ... Disk array device
3 ... Disk array controller
4A, 4B, 4C, 4D ... Disk drives
5 ... Internal bus
6 ... External bus
31 ... Redundant data generation function unit
32 ... Data restoration function unit
33 ... Access probability table
34 ... Repair table
35 ... External bus interface
36 ... Internal bus interface

Continuation of the front page

(72) Inventor: Shigetaro Iwazu, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo
(72) Inventor: Nobuyoshi Izawa, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo
(72) Inventor: Keisho Shiramizu, c/o NTT Intelligent Technology Co., Ltd., 2-9-1 Furo-cho, Naka-ku, Yokohama-shi, Kanagawa
(72) Inventor: Takashi Kono, c/o NTT Intelligent Technology Co., Ltd., 2-9-1 Furo-cho, Naka-ku, Yokohama-shi, Kanagawa

(56) References: JP-A-5-314674 (JP, A); JP-A-6-208488 (JP, A); JP-A-8-137627 (JP, A); JP-A-6-266508 (JP, A)

(58) Fields searched (Int. Cl.7, DB name): G06F 3/06 - 3/08; G06F 12/00 - 12/12; G11B 20/10 - 20/16

Claims

(57) [Claims]

1. A disk array device comprising a plurality of disk drives for recording data and a disk array controller connected to the disk drives via an internal bus, wherein the disk array controller comprises: a redundant data generation function unit that, when data is stored in the disk drives, divides the data into pieces of a predetermined data length and generates redundant data for the resulting group of divided data; a data restoration function unit that restores divided data that can no longer be reproduced because of a failure of a disk drive by referring to the redundant data; an access probability table that records the access probabilities of files recorded on the plurality of disk drives; and a repair table in which a flag is set for each file to be restored; and wherein the disk array controller: when writing data, causes the redundant data generation function unit to generate divided data of the predetermined length and redundant data for the divided data, and records the resulting data group in a distributed manner on the plurality of disk drives; reads and writes data when the flag of the recorded file in the repair table is reset; records the access probability of the corresponding file in the access probability table each time a file recorded on the plurality of disk drives is accessed; when an abnormal state of a disk drive is detected and the failed disk drive is switched accordingly, performs a background process of setting the flag in the repair table for each file to be restored, restoring the file with the data restoration function unit, recording the restored data of the file on a new disk drive, and resetting the flag of the restored file in the repair table; when a file read command is received during the background process, and the flag of the file in the repair table is set and the access probability of the file recorded in the access probability table is greater than or equal to a predetermined threshold, restores the file with the data restoration function unit, records the restored data of the file on the new disk drive, and resets the flag of the file in the repair table; and, when the flag of the file in the repair table is set and the access probability recorded in the access probability table is less than the predetermined threshold, restores the file with the data restoration function unit and performs the read process without recording the restored data of the file on the new disk drive.

2. The disk array device according to claim 1, wherein the data length divided by the disk array controller is in units of bits, bytes or words.

3. The disk array device according to claim 1, wherein the disk drives include a spare disk drive that is switched in and used when a failure occurs.

4. The disk array device according to claim 1, 2 or 3, wherein the disk drive stores, as internal data, information that allows a computer to detect an abnormal state of the disk drive when reading or writing data.

5. The disk array device according to claim 1, 2, 3 or 4, wherein the disk drive is an external storage device of a host computer that provides a multimedia service such as video on demand.

6. A method for controlling a disk array device, wherein a disk array controller that is connected via an internal bus to a plurality of disk drives and controls them: when writing data, generates divided data of a predetermined length and redundant data and records the resulting data group in a distributed manner on the plurality of disk drives; reads and writes data when the flag of the recorded file is reset; records the access probability of the file each time a file recorded on the plurality of disk drives is accessed; when an abnormal state of a disk drive is detected and the failed disk drive is switched accordingly, performs a background process of setting the flag for each file to be restored, restoring the file, recording the restored data of the file on a new disk drive, and resetting the flag of the restored file; when a file read command is received during the background process, and the file is unrepaired and its access probability is greater than or equal to a predetermined threshold, restores the file, records the restored data of the file on the new disk drive, and resets the flag of the file; and, when the file is unrepaired and its access probability is less than the predetermined threshold, restores the file and performs the read process without recording the restored data of the file on the new disk drive.

7. The method of controlling a disk array device according to claim 6, wherein the access probability is obtained from the number of times the file is accessed per unit time or from the elapsed time after the file is accessed.

8. The method for controlling a disk array device according to claim 7, wherein the threshold for the access probability of the file is given by: number of accesses per average waiting time until the file is restored = 2(Kw - 1) / {TR(KE - 1)}, where Kw = (data read time with data restoration and writing) / (data read time when all disk drives are normal), KE = (data read time with data restoration and without writing) / (data read time when all disk drives are normal), and TR is the time required to repair the new disk drive.

9. The method for controlling a disk array device according to claim 6, wherein the data length divided by the disk array controller is in units of bits, bytes or words.

10. The method for controlling a disk array device according to claim 6, 7, 8 or 9, wherein the disk drives include a spare disk drive that is switched in and used when a failure occurs.

11. The method for controlling a disk array device according to claim 9 or 10, wherein the disk drive stores, as internal data, information that allows a computer to detect an abnormal state of the disk drive when reading or writing data.

12. The method for controlling a disk array device according to claim 7, wherein the disk drive is an external storage device of a host computer that provides a multimedia service such as video on demand.