JP2001075741A

JP2001075741A - Disk control system and data maintenance method

Info

Publication number: JP2001075741A
Application number: JP24885999A
Authority: JP
Inventors: Kazunori Sekido; 一紀関戸
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1999-09-02
Filing date: 1999-09-02
Publication date: 2001-03-23

Abstract

PROBLEM TO BE SOLVED: To improve both write performance to a disk device and data integrity. SOLUTION: Write data is accumulated in a log memory 171 built into a RAID booster card (RAID BOOSTER) 17, and once data for one parity group has been assembled, it is written to a RAID disk array in a single batch. The write data is stored in the log memory 171 of the card 17, and a copy of it is also stored in a main memory 13 of the computer system. When a fault is detected in the log memory 171 or the card 17, the copy of the write data stored in the main memory 13 is promptly written to the disk array.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a disk control system and a data maintenance method used in a computer system, and more particularly to a disk control system having a function of writing the write data accumulated in a log memory to a disk device in a single batch, and to a data maintenance method for that system.

[0002]

[Prior art] In recent years, PC servers equipped with RAID (Redundant Array of Independent Disks) technology have become mainstream as a way to make disks faster and more reliable. RAID comes in several types, such as 0, 1, 10, and 5; among them, RAID 5 is widely used because of its good price-to-capacity ratio and excellent fault tolerance. RAID 5 consists of a disk array in which parity groups, each containing data and parity, are laid out across a plurality of disk devices. A RAID 5 disk array contains N+1 disk devices (N being 2 or more), that is, a plurality of disk devices for data storage to which one disk device for parity storage is added, and it distributes the parity across the N+1 disk devices stripe by stripe so that accesses do not concentrate on a single parity disk.

[0003] However, when data is updated in RAID 5, the RAID controller performs the following processing.

[0004] (1) Read the old data and the old parity data.
(2) Add the new data and calculate the new parity.
(3) Write the new data and the new parity data.

Of these, reading the old data, reading the old parity data, and writing the new parity data are operations that would otherwise be unnecessary, and they are called the RAID 5 write penalty. In short, RAID 5 sacrifices write performance in exchange for improved reliability. As a result, PC servers using RAID 5 have had poor write performance (particularly for random writes), and systems had to be built with this in mind. The latest RAID controllers have mechanisms that reduce this write penalty, but no great improvement can be expected for random writes.
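
The three-step update above can be sketched in a few lines. This is a minimal illustration that assumes XOR parity, the conventional RAID 5 parity function (the text itself does not name the function); `xor_bytes` and `small_write` are hypothetical names, and the "disks" are simply an in-memory list.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(units: list, data_idx: int, parity_idx: int, new_data: bytes) -> dict:
    """Update one stripe unit in place and count the disk I/Os it costs."""
    old_data = units[data_idx]        # (1) read old data
    old_parity = units[parity_idx]    # (1) read old parity
    # (2) new parity = old parity XOR old data XOR new data (assumed XOR parity)
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    units[data_idx] = new_data        # (3) write new data
    units[parity_idx] = new_parity    # (3) write new parity
    return {"reads": 2, "writes": 2}


# Four data units plus one parity unit (layout assumed for the example).
data = [bytes([i] * 4) for i in (1, 2, 3, 4)]
units = data + [reduce(xor_bytes, data)]
io_cost = small_write(units, data_idx=1, parity_idx=4, new_data=b"\xff" * 4)
```

In this sketch one logical write costs four disk I/Os; the two reads and the parity write are the penalty described above.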

[0005]

[Problem to be solved by the invention] For this reason, log-structured file system technology has recently been considered as a technique for greatly improving RAID write performance. Log-structured file system technology improves performance by exploiting the disk characteristic that sequential access is far faster than random access. The performance drop under random access comes from disk seeks and rotational latency, and it is something one would like to eliminate if possible. Log-structured file system technology therefore uses a log memory to accumulate write data, converts the write data of many small blocks into the data of one large block, and writes it out in one operation ("collective writing").

[0006] However, when log-structured file system technology is used, write data is written to the disk by "collective writing", so the point at which write completion is reported to the application differs from the point at which the data is actually written to the disk. Write data that should already have been recorded on the disk therefore exists only in the log memory, and sufficient care must be taken over the integrity of that data.

[0007] The present invention has been made in view of these circumstances, and its object is to provide a disk control system and a data maintenance method that can improve both write performance to a disk device and data integrity.

[0008]

[Means for solving the problem] To solve the above problem, the present invention provides a disk control system for controlling a disk device provided in a computer system, characterized by comprising: control means that has a log memory for accumulating write data and that writes the write data accumulated in the log memory to the disk device in a single batch; and means for holding a copy of the write data accumulated in the log memory, wherein the copy of the write data is used to ensure the integrity of the write data in the log memory.

[0009] In this disk control system, the write data is duplicated by holding it not only in the log memory but also, as a copy, in the memory of the computer system or the like. Therefore, even if an error occurs that prevents normal data input/output to or from the log memory, the original write data can be restored from the copy, for example by automatically writing the copy of the write data to the disk device when the error is detected. This makes it possible to improve both write performance to the disk device and data integrity. In particular, applying the system to the control of a disk array in which parity groups containing data and parity are laid out across a plurality of disk devices makes it possible to achieve improved write performance and improved reliability together at a high level.

[0010] An error in the log memory is preferably detected by providing the control means with a status memory that holds error status information indicating a log memory error, and by checking that status memory periodically. In a typical computer system, when a memory error or the like occurs, a reset signal is issued automatically and the computer system itself is often halted. By using the polling-based error detection described above, however, a log memory error can be confined within the control means, and data recovery using the copy of the write data can be carried out efficiently.

[0011] The present invention also provides a disk control system for controlling a disk device provided in a computer system, characterized by comprising: control means that has a log memory for accumulating write data and that writes the write data accumulated in the log memory to the disk device in a single batch; error detection means for detecting an error in the disk device; and means for saving the write data accumulated in the log memory in response to detection of an error by the error detection means, wherein the saved write data is used to ensure the integrity of the write data accumulated in the log memory.

[0012] In this computer system, if the batch write is not performed because of an error in the disk device, the write data accumulated in the log memory is saved, so that the contents of write data that has not been written out to the disk device and remains only in the log memory are backed up automatically. The contents of the write data can therefore be restored even if a disk failure occurs, which improves the integrity of write data in a system where the point at which write completion is reported to the application differs from the point at which the data is actually written to the disk.

[0013]

[Embodiments of the invention] An embodiment of the present invention will be described below with reference to the drawings.

[0014] FIG. 1 shows the configuration of a computer system according to an embodiment of the present invention. This computer system is used as a server (PC server) and can be equipped with a plurality of CPUs 11. Each CPU 11 is connected to a bridge 12 via a processor bus 1, as shown. The bridge 12 is a bridge LSI that connects the processor bus 1 and a PCI bus 2 bidirectionally, and it also incorporates a memory controller for controlling a main memory 13. The operating system, the application programs to be executed, drivers, and the like are loaded into the main memory 13.

[0015] As shown, a SCSI controller 14, a RAID controller 16, and a RAID booster card (RAID BOOSTER) 17 are connected to the PCI bus 2. The disk device (HDD) 15 controlled by the SCSI controller 14 is a system disk used to store the operating system and the like. The disk array 18 controlled by the RAID controller 16 is used to store various user data and the like.

[0016] Under the control of the RAID controller 16, the disk array 18 functions as a RAID 5 disk array. The disk array 18 consists of N+1 disk devices (here five, DISK0 to DISK4), that is, N disk devices for data storage plus one disk device for parity storage. These N+1 disk devices are grouped and used as a single logical drive.

[0017] As shown, stripes (parity groups), each consisting of data (DATA) and its parity (PARITY), are allocated to the grouped disk devices, and the parity position moves among the N+1 disk devices from stripe to stripe. Specifically, in stripe S0, the parity (PARITY) of the group of data (DATA) in the stripe units allocated at the same position on DISK0 to DISK3 is recorded in the corresponding stripe unit of DISK4, whereas in the next stripe S1, the parity (PARITY) for the data of stripe S1 is recorded in the corresponding stripe unit of DISK3. Distributing the parity across the N+1 disk devices stripe by stripe in this way prevents accesses from concentrating on a single parity disk.
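
The rotating parity position can be expressed as a small function. The text fixes only two points (S0 on DISK4, S1 on DISK3); the sketch below assumes the rotation continues one disk to the left per stripe and wraps around, and the names `parity_disk` and `data_disks` are hypothetical.

```python
def parity_disk(stripe: int, n_disks: int = 5) -> int:
    """Index of the disk holding parity for the given stripe.

    Matches the text for S0 (DISK4) and S1 (DISK3); the continued
    leftward rotation and the wrap-around are assumptions.
    """
    return (n_disks - 1 - stripe) % n_disks


def data_disks(stripe: int, n_disks: int = 5) -> list:
    """Indices of the disks holding data units for the given stripe."""
    p = parity_disk(stripe, n_disks)
    return [d for d in range(n_disks) if d != p]
```

Under this assumed rotation, every disk holds parity for exactly one stripe in each run of five consecutive stripes, which is what spreads the parity traffic evenly.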

[0018] The RAID booster card (RAID BOOSTER) 17 improves write performance to the RAID 5 disk array 18 by using log-structured file system technology, and is implemented as a PCI expansion card that can be installed in a PCI slot. The principle of write control by the RAID booster card (RAID BOOSTER) 17 will now be described with reference to FIG. 2.

[0019] In the write method using log-structured file system technology, the data write position is not determined from the logical address of the host's write request. Instead, write data is accumulated in the order in which the host requested the writes to form one large data block for writing, and that large data block is written in one operation ("collective writing") into the free area of the disk array 18, in order from the top. Random accesses are thereby converted into sequential accesses. By making the unit of "collective writing" the data capacity of one parity group and aligning the write start position with the head of a parity group, the RAID controller 16 can calculate the parity from the collectively written data alone. The reading of old data, reading of old parity data, and writing of new parity data described earlier (the write penalty) therefore no longer occur, and write performance improves.
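
A full-stripe write of this kind can be sketched as follows. The example assumes XOR parity (conventional for RAID 5, though not stated in the text), and `full_stripe_write` is a hypothetical name; the point is only that the parity comes from the new data, with no reads of old data or old parity.

```python
from functools import reduce


def full_stripe_write(stripe_units: list) -> dict:
    """Build one parity group from new data only; nothing is read back from disk."""
    size = len(stripe_units[0])
    if any(len(u) != size for u in stripe_units):
        raise ValueError("all stripe units must have the same size")
    # Parity is the XOR of the new stripe units (assumed parity function).
    parity = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), stripe_units)
    return {
        "units": list(stripe_units) + [parity],
        "reads": 0,
        "writes": len(stripe_units) + 1,
    }


group = full_stripe_write([b"\x01\x01", b"\x02\x02", b"\x04\x04", b"\x08\x08"])
```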

[0020] FIG. 2 shows an example in which the block size of the write data sent from the application program via the file system is 2 KB, the data size of one stripe unit is 64 KB, and the data size of one stripe (parity group) is 256 KB (= 64 KB × 4). When 256 KB of data has accumulated in the RAID booster card (RAID BOOSTER) 17, it is written to the disk array 18 in a single operation.
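
The sizes in this example imply how many application writes are gathered before each batch write. The arithmetic below uses only the figures given in the text; the constant names are illustrative.

```python
KB = 1024
BLOCK_SIZE = 2 * KB          # write block size from the application
STRIPE_UNIT_SIZE = 64 * KB   # one stripe unit
DATA_UNITS = 4               # data stripe units per parity group (64 KB x 4)

STRIPE_DATA_SIZE = STRIPE_UNIT_SIZE * DATA_UNITS      # data per parity group
BLOCKS_PER_UNIT = STRIPE_UNIT_SIZE // BLOCK_SIZE      # blocks that fill one stripe unit
BLOCKS_PER_STRIPE = STRIPE_DATA_SIZE // BLOCK_SIZE    # blocks that trigger one batch write
```

With these figures, 32 blocks fill one stripe unit and 128 blocks of 2 KB are accumulated for each 256 KB batch write.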

[0021] As shown in FIG. 1, the RAID booster card (RAID BOOSTER) 17 carries a log memory 171, an address mapping table 172, an ECC status register 173, a DMA engine 174, an MPU 175, a battery 176, and so on.

[0022] The log memory 171 accumulates write data from applications; when write data for one parity group has accumulated in the log memory 171, it is written to the disk array 18 in a single batch. The address mapping table 172 is used to manage the correspondence between the logical address of write data and its actual write position on the disk array 18. That is, the address mapping table 172 manages the correspondence between the logical address specified in a write request and the physical address to which the data was actually written.
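
A toy version of such a table is sketched below. The class `AddressMap`, its methods, and the idea of handing out physical blocks from a simple counter are assumptions for illustration; the text specifies only that the table maps the requested logical address to the physical address actually written.

```python
class AddressMap:
    """Toy logical-to-physical map for sequentially written blocks."""

    def __init__(self):
        self.table = {}       # logical block address -> physical block address
        self.next_free = 0    # next free physical block (assumed to grow from 0)

    def record_batch(self, logical_addrs: list) -> list:
        """Assign consecutive physical blocks to a batch, in request order."""
        placed = []
        for lba in logical_addrs:
            self.table[lba] = self.next_free   # a rewrite supersedes the old entry
            placed.append(self.next_free)
            self.next_free += 1
        return placed

    def lookup(self, lba: int):
        """Physical block for a logical address, or None if never written."""
        return self.table.get(lba)


amap = AddressMap()
amap.record_batch([900, 17, 42, 17])   # scattered logical addresses, sequential placement
```

Because placement follows request order rather than logical address, a second write to logical block 17 lands in a new physical block and the table simply points to the newer one. How superseded blocks are reclaimed is outside the scope of this sketch.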

[0023] The log memory 171 and the address mapping table 172 are implemented in memory mounted on the RAID booster card (RAID BOOSTER) 17, and the contents of that memory are backed up by the battery 176.

[0024] The ECC status register 173 is an error status register that holds error status information indicating an error in the log memory 171, and it can be read by the CPU 11. The DMA engine 174 is used to transfer data by DMA between the log memory 171 and the main memory 13 and the like. The MPU 175 is used to control each unit in the RAID booster card (RAID BOOSTER) 17.

[0025] FIG. 3 shows the relationship between the hardware and software used to perform disk control with the RAID booster card (RAID BOOSTER) 17.

[0026] The disk control system of this embodiment consists of the RAID booster card (RAID BOOSTER) 17 and control software. The control software consists of a filter driver 23 that performs the log-structuring process and a card driver 25 that controls the RAID booster card (RAID BOOSTER) 17.

[0027] The filter driver 23 sits between the OS file system 22 and the RAID controller driver 24, and write data from the application program 21 is captured automatically by the filter driver 23. The captured write data is accumulated in the log memory 171 through the card driver 25. When data for one parity group has accumulated in the log memory 171, the write data is transferred to the RAID controller 16 through the RAID controller driver 24 under the control of the filter driver 23.

[0028] (Improving the integrity of write data) Next, the data integrity function of this embodiment will be described.

[0029] As described above, the key to the performance improvement provided by the RAID booster card (RAID BOOSTER) 17 is that write data from applications is accumulated and then written to the disk array 18 in one operation ("collective writing"). Write data that should already have been recorded on the disk array 18 therefore exists only in the log memory 171, and sufficient care must be taken over the integrity of that data. This embodiment therefore takes the following measures.

[0030] (1) Battery backup of the memory
A battery module 176 is built into the RAID booster card 17, so that even in the event of a failure such as a power outage, the memory contents can be retained on battery backup for 72 hours (under normal use).

[0031] (2) Data protection by ECC
The memory on the RAID booster card 17 is protected by ECC, so a 1-bit memory error can be corrected without losing data.

[0032] (3) Duplication of write data
With the above measures alone, a failure of the RAID booster card 17 itself, including a 2-bit memory error, would cause the write data stored in the log memory 171 to be lost. The following processing is therefore applied to write data to further improve data integrity.

[0033] (a) Write data is stored not only in the log memory 171 of the RAID booster card 17; a copy is also created in the main memory 13 of the computer system. The copy of the write data is recorded in a work area reserved in the main memory 13 as a log memory copy area.

[0034] (b) An ordinary PCI board generates a reset signal when a failure occurs and stops the server as well (fail-stop). The RAID booster card 17, by contrast, is controlled so that even if a failure occurs, its effects are confined to the card and do not spread to the computer itself (fault isolation), allowing the software in the computer to keep running. To this end, the ECC status register 173 described above is provided so that software can monitor the state of the log memory 171.

[0035] (c) The control software on the computer side constantly checks the state of the RAID booster card 17 using the ECC status register 173, and when it detects a failure of the log memory 171 or of the card, it promptly writes the copy of the write data held in the main memory 13 out to the disk array 18.

[0036] With this measure, even if the RAID booster card 17 fails, no write data is lost as long as the control software on the computer side is running at that time.
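
The monitoring in (c) amounts to a polling check followed by a flush of the duplicated data. The sketch below is illustrative only: `FakeCard`, `read_ecc_status`, and `poll_once` are hypothetical stand-ins, the status register is reduced to a single flag, and the disk array is a plain list.

```python
class FakeCard:
    """Stand-in for the booster card; exposes only an error-status flag."""

    def __init__(self):
        self.ecc_error = False   # hypothetical: True once an uncorrectable error is latched

    def read_ecc_status(self) -> bool:
        return self.ecc_error


def poll_once(card, copy_area: list, disk_array: list) -> bool:
    """Check the status register; on error, flush the main-memory copy to disk.

    Returns True if a failure was detected and the copy was written out.
    """
    if not card.read_ecc_status():
        return False
    disk_array.extend(copy_area)   # write the duplicated data to the disk array
    copy_area.clear()              # the copy is now on disk
    return True


card, copy_area, disk_array = FakeCard(), [b"blk0", b"blk1"], []
first = poll_once(card, copy_area, disk_array)    # card healthy: nothing happens
card.ecc_error = True                             # simulate a card failure
second = poll_once(card, copy_area, disk_array)   # failure seen: copy is flushed
```

Because the failure is observed by reading a register rather than through a reset signal, the host software keeps running and is able to perform the flush.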

[0037] (4) Dump file on disk failure
In this embodiment, write data is written to the disk array 18 by "collective writing", so the point at which write completion is reported to the application differs from the point at which the data is actually written to the disk array 18. Consequently, if a disk failure occurs while writing to the disk array 18, data that should have been written to the disk array 18 remains in the log memory 171.

[0038] Therefore, in the event of a disk failure, the accumulated write data is saved as a dump file on the system disk 15 or the like, and when the disk failure has been recovered, the dump file is automatically written to the disk array 18, which ensures the consistency of the data as seen by the application.
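
The save-and-replay idea can be sketched as follows. The dump format (Python `pickle`), the file name, and the helper names `save_dump` and `replay_dump` are assumptions made for illustration; the text says only that the data is saved as a dump file and written back automatically after recovery.

```python
import os
import pickle
import tempfile


def save_dump(pending: list, dump_path: str) -> None:
    """Persist write data that could not reach the disk array."""
    with open(dump_path, "wb") as f:
        pickle.dump(pending, f)


def replay_dump(dump_path: str, disk_array: list) -> int:
    """After recovery, write the dumped data to the array and delete the dump."""
    if not os.path.exists(dump_path):
        return 0
    with open(dump_path, "rb") as f:
        pending = pickle.load(f)
    disk_array.extend(pending)
    os.remove(dump_path)
    return len(pending)


dump_path = os.path.join(tempfile.mkdtemp(), "log_memory.dump")
save_dump([b"blk0", b"blk1"], dump_path)        # disk array failed: keep data on the system disk
disk_array = []
restored = replay_dump(dump_path, disk_array)   # disk array recovered: replay the dump
```

Deleting the dump only after the replay keeps the operation safe to retry if the replay itself is interrupted.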

[0039] (Duplication of write data) Next, the principle of processing with duplicated write data will be described with reference to FIGS. 4 to 6. To simplify the explanation, it is assumed here that the disk array 18 consists of N+1 disks (N = 3) and that the size of one stripe unit equals the block size of the write data from the application.

[0040] As shown in FIG. 4, write data from the application is recorded in the log memory 171 of the RAID booster card (RAID BOOSTER) 17, and a copy is also created in the log memory copy area 131 of the main memory 13. As the system operates, write data accumulates in the log memory 171, and when write data for one parity group has accumulated there, it is written collectively to the disk array 18, as shown in FIG. 5.

[0041] If a failure of the RAID booster card 17, including an error in the log memory 171, occurs before write data for one parity group has been assembled in the log memory 171, the write data recorded in the log memory 171 up to that point can no longer be used. In that case, as shown in FIG. 6, the copy data recorded in the log memory copy area 131 is used to write to the disk array 18. At this time, arbitrary data is appended to the copy data as dummy data to create write data for one parity group, which is then written by "collective writing". Thus, even when a failure of the RAID booster card 17, including an error in the log memory 171, occurs, the RAID controller 16 can calculate the parity from the collectively written data alone.
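
The padding step can be sketched as below. The function name `pad_to_parity_group` and the choice of zero bytes as the dummy data are assumptions; the text allows arbitrary dummy data.

```python
def pad_to_parity_group(copy_blocks: list, blocks_per_group: int, block_size: int) -> list:
    """Append dummy blocks so the copy data fills exactly one parity group."""
    if len(copy_blocks) > blocks_per_group:
        raise ValueError("copy data exceeds one parity group")
    dummy = bytes(block_size)   # zero fill, chosen here for simplicity
    return list(copy_blocks) + [dummy] * (blocks_per_group - len(copy_blocks))


# N = 3 data units per group and one block per stripe unit, as in the simplified example.
group = pad_to_parity_group([b"AAAA", b"BBBB"], blocks_per_group=3, block_size=4)
```

Padding to a whole parity group keeps the recovery write a full-stripe write, so the parity can still be computed without reading anything back from the array.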

[0042] (Data write processing) Next, the procedure of data write processing in this embodiment will be described with reference to the flowchart of FIG. 7.

[0043] On obtaining write data from the application program 21 via the OS file system 22, the filter driver 23 first starts the card driver 25 and requests it to execute a DMA transfer that writes the write data to the log memory 171 (step S101). Next, the filter driver 23 copies the write data obtained from the application program 21 to the log memory copy area 131, which is allocated in, for example, the work area of the filter driver 23, thereby creating a copy of the write data (step S102). The filter driver 23 then waits a predetermined time for a DMA completion notification to be returned from the card driver 25 (step S103), and determines whether the RAID booster card (RAID BOOSTER) 17, including the log memory 171, is normal (step S104).

[0044] Here, if, for example, the DMA write does not finish normally because of a failure of the log memory 171 or of the RAID booster card 17 itself, and no DMA completion notification is returned even after the predetermined time has elapsed, it is determined that some failure preventing data from being written to the log memory 171 (a log memory error) has occurred. Even if the DMA completion notification is returned within the predetermined time, the ECC status register 173 is consulted, and if an error status indicating a 2-bit error is set, it is likewise determined that a log memory error has occurred.

[0045] If no such log memory error has occurred and the RAID booster card (RAID BOOSTER) 17 is operating normally (YES in step S104), the filter driver 23 reports write completion to the application program 21 (step S105). The filter driver 23 then determines whether write data for one parity group has been assembled in the log memory 171 (step S106); if so, it executes the parity group batch write process, writes the write data for one parity group to the disk array 18 by "collective writing", and ends the processing (step S107). If write data for one parity group has not yet been assembled in the log memory 171, the processing ends without executing step S107.

[0046] On the other hand, if a log memory error is detected, that is, if the RAID booster card (RAID BOOSTER) 17 is not operating normally (NO in step S104), the filter driver 23 creates write data for one parity group by adding dummy data to the copy data that already existed in the log memory copy area 131 before step S102 was executed (step S108). The filter driver 23 then executes the parity group batch write process, writes the write data for one parity group created in step S108 to the disk array 18 by "collective writing" (step S109), notifies the application program 21 that a write error has occurred, and ends the processing (step S110).

[0047] In this way, failure detection is performed every time write data is recorded in the log memory 171; when a failure is detected, the copy data is immediately written to the disk array 18 and an error is returned to the host system that requested the write, which improves the integrity of the write data.
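
The write procedure of steps S101 to S110 can be condensed into a small model. Everything here is a sketch under stated assumptions: `FilterDriver`, `WriteError`, and the `card_ok` flag (which replaces both the DMA timeout and the ECC status check) are hypothetical, memory and disk are Python lists, and clearing the copy area after a flush is an assumption.

```python
class WriteError(Exception):
    """Reported to the application when the log memory write fails (S110)."""


class FilterDriver:
    def __init__(self, blocks_per_group: int):
        self.blocks_per_group = blocks_per_group
        self.log_memory = []    # stands in for log memory 171
        self.copy_area = []     # stands in for log memory copy area 131
        self.disk_array = []    # each entry is one batch-written parity group
        self.card_ok = True     # hypothetical flag: DMA completed and no 2-bit error

    def write(self, block: bytes) -> None:
        if self.card_ok:
            self.log_memory.append(block)         # S101: DMA into the log memory
        already_copied = list(self.copy_area)     # copy data present before S102
        self.copy_area.append(block)              # S102: duplicate in main memory
        if not self.card_ok:                      # S103/S104: timeout or ECC error
            if already_copied:
                dummy = bytes(len(block))
                padding = [dummy] * (self.blocks_per_group - len(already_copied))
                self.disk_array.append(already_copied + padding)   # S108, S109
            self.log_memory.clear()
            self.copy_area.clear()
            raise WriteError("log memory error")  # S110
        # S105: write completion would be reported to the application here
        if len(self.log_memory) == self.blocks_per_group:   # S106
            self.disk_array.append(list(self.log_memory))   # S107
            self.log_memory.clear()
            self.copy_area.clear()


drv = FilterDriver(blocks_per_group=3)
for blk in (b"a1", b"a2", b"a3"):   # the third write triggers a batch write
    drv.write(blk)
drv.write(b"b1")
drv.card_ok = False                 # simulate a card failure
try:
    drv.write(b"b2")
    failed = False
except WriteError:
    failed = True
```

In this model the failing write itself is rejected with an error, while the earlier, already acknowledged write `b1` is saved from the copy area, padded to a full parity group.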

[0048] (Data read processing) Next, the procedure of data read processing in this embodiment will be described with reference to the flowchart of FIG. 8.

[0049] Upon receiving a data read request from the application program 21 via the OS file system 22, the filter driver 23 first determines whether the requested data exists in the log memory 171 (step S201). This determination is made by checking whether any of the write data recorded in the log memory 171 in response to write requests matches the logical address of the read request.

[0050] If the requested data does not exist in the log memory 171 (NO in step S201), the filter driver 23 controls the RAID controller 16 via the RAID controller driver 24 and reads the corresponding data from the disk array 18 (step S206). Because data is written to the disk array 18 sequentially in the order of the write requests, the filter driver 23 determines the physical address to read from by referring to the address mapping table 172. After reading the data, the filter driver 23 notifies the application program 21 of read completion and ends the processing (step S207).
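The lookup order for a read request (log memory first, then the disk array through the address mapping table) can be illustrated with plain dictionaries. The function name and the dictionary-based representation of the log memory 171, the address mapping table 172, and the disk array 18 are assumptions made for this sketch only.

```python
def read_block(logical_addr, log_memory, address_map, disk):
    """Return (data, source) for a read request.

    log_memory:  dict, logical address -> data not yet flushed (log memory 171)
    address_map: dict, logical address -> physical address (mapping table 172)
    disk:        dict, physical address -> data (disk array 18)
    """
    if logical_addr in log_memory:               # step S201: YES
        return log_memory[logical_addr], "log"
    # Step S201: NO. Data on disk is laid out in write-request order,
    # so the logical address must be translated before reading.
    physical_addr = address_map[logical_addr]
    return disk[physical_addr], "disk"           # step S206
```

A logical address still held in the log is served from the log even if an older version exists on disk, which is why the log is checked first.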

[0051] If the requested data exists in the log memory 171 (YES in step S201), the filter driver 23 starts the card driver 25 and requests it to execute a DMA transfer for reading the requested data (step S202). The filter driver 23 then waits a predetermined time for a DMA completion notification from the card driver 25 (step S203), and determines whether the RAID booster card (RAID BOOSTER) 17, which includes the log memory 171, is normal (step S204).

[0052] For example, if a DMA read does not complete normally because of a failure in the log memory 171 or in the RAID booster card 17 itself, and no DMA completion notification is returned even after the predetermined time has elapsed, it is determined that some failure that prevents data from being read from the log memory 171 (a log memory error) has occurred. In step S204, the ECC status register 173 is also checked; if an error status indicating a 2-bit error is set, it is determined that a log memory error has occurred.
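The two failure conditions checked in step S204 (no DMA completion within the predetermined time, or a 2-bit ECC error flagged in the status register) can be modeled as below. The bit mask value and the use of a `threading.Event` to represent the DMA completion notification are assumptions for illustration; the source does not describe the register layout or the notification mechanism.

```python
import threading

ECC_TWO_BIT_ERROR = 0x2   # hypothetical bit layout for ECC status register 173


def log_memory_ok(dma_done: threading.Event, ecc_status: int, timeout_s: float) -> bool:
    """Step S204: the card is treated as healthy only if the DMA completed
    in time and no 2-bit ECC error is flagged."""
    completed = dma_done.wait(timeout_s)   # step S203: bounded wait for completion
    if not completed:
        return False                       # timeout -> log memory error
    return (ecc_status & ECC_TWO_BIT_ERROR) == 0
```

Either condition alone is enough to take the error branch, so the function returns `False` as soon as one of them holds.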

[0053] If no such log memory error has occurred and the RAID booster card (RAID BOOSTER) 17 is operating normally (YES in step S204), the filter driver 23 reports read completion to the application program 21 and ends the processing (step S205).

[0054] When a log memory error is detected, that is, when the RAID booster card (RAID BOOSTER) 17 is not operating normally (NO in step S204), the filter driver 23 creates write data for one parity group by adding dummy data to the copy data existing in the log memory copy area 131 (step S208). The filter driver 23 then executes the parity group batch write process, "collectively writes" the write data for one parity group created in step S208 to the disk array 18 (step S209), notifies the application program 21 that a write error has occurred, and ends the processing (step S210).

[0055] As described above, failure detection is also performed when data is read, and when a failure is detected the copy data is immediately written to the disk array 18, which improves the integrity of the write data.

[0056] (Dump File at Disk Failure) Next, the process of saving the contents of the log memory 171 as a dump file when a disk failure occurs will be described. This save process is performed within the "parity group batch write process" described with reference to FIGS. 7 and 8.

[0057] FIG. 9 shows the procedure of the "parity group batch write process".

[0058] The filter driver 23 first starts the RAID controller driver 24 and requests it to perform a DMA transfer for writing one parity group of data to the disk array 18 in a batch (step S301). The RAID controller 16 automatically generates parity from the data for one parity group and attempts to write to the first stripe of the free area.
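The source states only that the RAID controller generates parity automatically from one parity group of data. Assuming the byte-wise XOR parity conventionally used by RAID 5, the computation and the corresponding recovery of a lost block can be sketched as follows; the function names are illustrative.

```python
from functools import reduce


def xor_parity(data_blocks):
    """Parity block = byte-wise XOR of all equally sized data blocks."""
    length = len(data_blocks[0])
    if any(len(block) != length for block in data_blocks):
        raise ValueError("all blocks in a parity group must have equal size")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*data_blocks))


def rebuild_block(surviving_blocks, parity):
    """Recover one lost data block from the surviving blocks plus parity."""
    return xor_parity(list(surviving_blocks) + [parity])
```

Because the whole parity group is available at once in a batch write, the parity can be computed from the new data alone, without first reading old data or old parity from the disks.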

[0059] The filter driver 23 then waits a predetermined time for a DMA completion notification from the RAID controller driver 24 (step S302), and determines whether a failure has occurred in the disk array 18, for example by examining the completion status from the RAID controller driver 24 (step S303). If some failure has occurred in the disk array 18 and the write to the disk array 18 has failed (YES in step S303), the filter driver 23 saves the write data accumulated in the log memory 171 to the system disk 15 as a dump file (step S304). In this case, so that it can be confirmed whether the dump file was saved correctly to the end, an identifier such as a sequence number is embedded at the beginning and the end of the dump file, as shown in FIG. 10.
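A minimal sketch of the batch write with dump-on-failure (steps S301 to S304) is shown below. The 8-byte little-endian encoding of the sequence number, the boolean completion status, and the callback parameters `write_to_array` and `save_dump` are assumptions for illustration; the source specifies only that an identifier such as a sequence number is placed at the head and tail of the dump file.

```python
import struct

SEQ_FORMAT = "<Q"   # hypothetical: 8-byte little-endian sequence number


def build_dump(sequence_number: int, log_data: bytes) -> bytes:
    """Wrap the log memory contents with the same marker at head and tail."""
    marker = struct.pack(SEQ_FORMAT, sequence_number)
    return marker + log_data + marker


def batch_write(group, write_to_array, log_data, sequence_number, save_dump):
    """Steps S301 to S304: try the array write; if it fails, save the log
    memory contents as a dump file. Returns True when the array write succeeded."""
    succeeded = write_to_array(group)        # S301 to S303: completion status
    if not succeeded:
        save_dump(build_dump(sequence_number, log_data))   # S304
    return succeeded
```

Writing the tail marker last means that a dump interrupted partway through will be missing its tail marker, which is what makes the later validity check possible.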

[0060] (Data Restoration Using the Dump File) Next, data restoration processing using the dump file will be described with reference to the flowchart of FIG. 11.

[0061] That is, when the computer system is restarted and the filter driver 23 is loaded, the filter driver 23 first determines whether a dump file exists on the system disk 15 and, if one exists, whether its contents are correct (step S401). This check is needed because an incorrect dump file may be recorded, for example when the system goes down while the dump file is being written.

[0062] Whether the contents of the dump file are correct can be determined by checking the sequence number described above. If the sequence number is recorded at both the beginning and the end of the dump file, the dump file is judged to be correct. In this case, the filter driver 23 executes the parity group batch write process and writes the data for one parity group in the dump file to the disk array 18 (step S402). If the data in the dump file amounts to less than one parity group, the batch write process may be executed after dummy data is added.
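The validity check and restoration at restart (steps S401 and S402) can be sketched as follows. The 8-byte marker format, the rule that the head and tail markers must be identical, the zero-byte dummy data, and the `write_group` callback are assumptions for this sketch; the source requires only that a sequence number be present at both ends of the dump file.

```python
import struct

SEQ_FORMAT = "<Q"                        # hypothetical 8-byte sequence number
SEQ_SIZE = struct.calcsize(SEQ_FORMAT)


def extract_valid_dump(dump: bytes):
    """Step S401: return the payload if head and tail markers match, else None."""
    if len(dump) < 2 * SEQ_SIZE:
        return None                      # too short to hold both markers
    head, tail = dump[:SEQ_SIZE], dump[-SEQ_SIZE:]
    if head != tail:
        return None                      # dump was interrupted or corrupted
    return dump[SEQ_SIZE:len(dump) - SEQ_SIZE]


def restore(dump: bytes, group_bytes: int, write_group) -> bool:
    """Step S402: pad the payload with dummy bytes up to one parity group
    and write it to the array. Returns False if the dump is not valid."""
    payload = extract_valid_dump(dump)
    if payload is None:
        return False
    write_group(payload.ljust(group_bytes, b"\x00"))
    return True
```

A dump truncated by a crash fails the check because its last bytes no longer form the tail marker, so no partial data is written back to the array.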

[0063] As described above, in this embodiment, providing functions such as duplication of write data and creation of a dump file at disk failure improves the integrity of write data, so that sufficient write performance and sufficient reliability can be obtained. Although the functions of the RAID booster card 17 could be built into the RAID controller, implementing the RAID booster card 17 as a PCI card as in this embodiment makes it easy to improve write performance and data integrity without any change to an existing RAID controller.

[0064] In this embodiment, parity is generated by the RAID controller, but it may instead be generated by the RAID booster card 17. The backup file may be stored anywhere, as long as it is on a non-volatile recording medium. Furthermore, although the data preservation method of this embodiment is particularly effective for RAID 5 disk arrays, it can be applied to any disk subsystem to which "collective writing" using the log memory 171 is applicable, such as disk arrays of other RAID levels or a single disk device.

[0065]

[Effects of the Invention] As described above, according to the present invention, providing functions such as duplication of write data and creation of a dump file at disk failure makes it possible to improve both the write performance to a disk device and the integrity of data.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a computer system according to an embodiment of the present invention.

FIG. 2 is a diagram for explaining the principle of "collective writing" performed by the system of the embodiment.

FIG. 3 is a diagram showing the relationship between hardware and software in the system of the embodiment.

FIG. 4 is a diagram for explaining the write data copy processing executed in the system of the embodiment.

FIG. 5 is a diagram for explaining the batch write processing of log data executed in the system of the embodiment.

FIG. 6 is a diagram for explaining the batch write processing of copy data executed in the system of the embodiment.

FIG. 7 is a flowchart showing the flow of a series of processes executed by the system of the embodiment when writing data.

FIG. 8 is a flowchart showing the flow of a series of processes executed by the system of the embodiment when reading data.

FIG. 9 is a flowchart showing the procedure of the batch write processing executed in the system of the embodiment.

FIG. 10 is a diagram showing an example of a dump file created at disk failure in the system of the embodiment.

FIG. 11 is a diagram showing the operation at reboot executed in the system of the embodiment.

[Explanation of symbols]

11 ... CPU
13 ... Main memory
15 ... System disk
16 ... RAID controller
17 ... RAID booster card
18 ... Disk array
23 ... Filter driver
25 ... Card driver
131 ... Log memory copy area
171 ... Log memory
173 ... ECC status register

Continuation of front page: (51) Int.Cl.7 / Identification symbol / FI / Theme code (reference): G11B 20/18 572; G11B 20/18 572F

Claims

[Claims]

1. A disk control system for controlling a disk device provided in a computer system, comprising: a log memory for storing write data; control means for collectively writing the write data stored in the log memory to the disk device; and means for holding a copy of the write data stored in the log memory, wherein the copy of the write data is used to ensure the integrity of the write data in the log memory.

2. The disk control system according to claim 1, further comprising: error detection means for detecting an error in the log memory; and means for writing the copy of the write data to the disk device in response to error detection by the error detection means.

3. The disk control system according to claim 2, wherein the control means has a status memory that holds error status information indicating an error of the log memory, and the error detection means detects the presence or absence of an error in the log memory by periodically checking the status memory.

4. The disk control system according to claim 1, wherein the disk device comprises a disk array in which parity groups including data and parity are arranged across a plurality of disk devices, and the control means writes to the disk array in units of one parity group each time write data for one parity group has been accumulated in the log memory.

5. A disk control system for controlling a disk device provided in a computer system, comprising: a log memory for storing write data; control means for collectively writing the write data stored in the log memory to the disk device; error detection means for detecting an error of the disk device; and means for saving the write data stored in the log memory in response to error detection by the error detection means, wherein the saved write data is used to ensure the integrity of the write data stored in the log memory.

6. The disk control system according to claim 5, further comprising a data restoring unit that writes the saved write data to the disk device when the computer system is restarted.

7. The disk control system according to claim 6, further comprising a unit for judging the validity of the saved write data when the computer system is restarted, wherein the data restoring unit is executed when the validity is confirmed.

8. A data preservation method applied to a disk control system for controlling a disk device provided in a computer system, comprising: storing write data in a log memory and holding a copy of the write data; and collectively writing the write data stored in the log memory to the disk device, wherein the copy of the write data is used to ensure the integrity of the write data in the log memory.

9. The data preservation method according to claim 8, further comprising: an error detection step of detecting an error of the log memory; and a step of writing the copy of the write data to the disk device in response to detection of the error of the log memory.

10. A data preservation method applied to a disk control system for controlling a disk device provided in a computer system, comprising: a step of storing write data in a log memory; a step of collectively writing the write data stored in the log memory to the disk device; a step of detecting an error in the disk device; and a step of saving the write data stored in the log memory in response to detection of the error in the disk device, wherein the saved write data is used to ensure the integrity of the write data stored in the log memory.

11. The data preservation method according to claim 10, further comprising a step of writing the saved write data to the disk device when the computer system is restarted.