JP4314651B2 - Disk array device and data recording / reproducing method

Info

Publication number: JP4314651B2
Application number: JP24047998A
Authority: JP
Inventors: 聡油谷; 徳一伊藤; 裕之藤田; 聡米谷; 正和吉本; 聡勝尾; 潤吉川; 知久志賀; 正樹広瀬; 晃一佐藤
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1998-08-26
Filing date: 1998-08-26
Publication date: 2009-08-19
Anticipated expiration: 2018-08-26
Also published as: JP2000066845A

Description

【０００１】
【発明の属する技術分野】
本発明は、例えば動画像及び音声信号を記録再生するいわゆるビデオサーバシステム等に好適なディスクアレイ装置及びデータ記録再生方法に関する。
【０００２】
【従来の技術】
例えば複数のＨＤＤ（ハードディスクドライブ）によって並列冗長構成されたディスクアレイ装置は、一般にＲＡＩＤ（Redundant Array of Inexpensive Disks）と呼ばれている。
【０００３】
図９には、上記ＲＡＩＤと呼ばれるディスクアレイ装置の概念を示す。
【０００４】
この図９において、ディスクアレイ装置は、入力されたデータ１００を、ディスクアレイコントローラ１０１にてある所定単位長さ毎のデータ列Ｄ０〜Ｄ１５に分割（ストライピング）し、それら所定単位長さ毎のデータ列Ｄ０〜Ｄ１５）を当該各所定単位長さ毎に複数のＨＤＤ１１１〜１１８に振り分けて格納する。なお、当該所定単位長さはストライピング幅と呼ばれ、また、このストライピング幅のデータ列をデータ格納用ＨＤＤの台数分だけ集めたものはアレイブロッキングファクタ（以下、ＡＢＦと記す）と呼ばれる。
【０００５】
このストライピングと同時に、ディスクアレイコントローラ１０１は、入力データを分割するときに、複数のＨＤＤ１１１〜１１８にまたがるデータの列に対してエラー訂正符号を生成し、このエラー訂正符号をＨＤＤ１１９に格納する。このように、図９のディスクアレイ装置においては、エラー訂正符号を格納しておき、後にこのエラー訂正符号を用いたエラー訂正を行うことで、単体のＨＤＤの場合よりも高い性能と信頼性を実現している。なお、図中のＰｎ（ｎは０〜１５）は、ｎ番目のデータ列に対するエラー訂正符号を示す。この図９の例では、ＨＤＤをＨＤＤ１１１〜１１８及び１１９の９個としたが更に多数或いは少数であってもよい。
【０００６】
図１０には、一般的なディスクアレイ装置の構造を示す。
【０００７】
この図１０において、一般的なディスクアレイ装置は、データの入出力インターフェース部１３１、データキャッシュ部（キャッシュメモリ）１３２、ＣＰＵ（中央処理装置）部１３３、ストライピング・ＥＣＣ部１３４、データタイミングコントローラ１５１、各ＨＤＤ間の非同期性を吸収するための複数の専用バッファメモリ部１３５〜１３８、複数のデータストリームコントローラ１３９〜１４２、複数のＳＣＳＩプロトコルコントローラ（以下、ＳＰＣと呼ぶ）１４３〜１４６、複数のＨＤＤ部１４７〜１５０から構成される。
【０００８】
入力されたデータは、入出力インターフェイス部１３１及びデータキャッシュ部１３２を介してストライピング・ＥＣＣ部１３４に送られる。
【０００９】
ストライピング・ＥＣＣ部１３４では、入力されたデータを分割（ストライピング）する。このストライピングされたデータは、バッファメモリ１３５〜１３７及びＳＰＣ１４３〜１４５を介して、複数のＨＤＤ部１４７〜１４９に格納される。また、ストライピング・ＥＣＣ部１３４では、データを分割（ストライピング）するときに複数のＨＤＤ部１４７〜１４９にまたがるデータ列に対してエラー訂正符号を生成し、バッファメモリ１３８及びＳＰＣ１４６を介して、ＨＤＤ部１５０に格納する。
【００１０】
ＨＤＤ部１４７〜１５０へのデータの読み書きは、バッファメモリ部１３５〜１３８及びＳＰＣ１４３〜１４６を通して行われ、これらバッファメモリ部１３５〜１３８及びＳＰＣ１４３〜１４６におけるデータの読み書きの制御は、それぞれ対応するデータストリームコントローラ１３９〜１４２によりなされる。また、このデータストリームコントローラ１３９〜１４２におけるデータの読み書きのタイミングは、ストライピング・ＥＣＣ部１３４が発生する。
【００１１】
なお、ディスクアレイ装置内部のＨＤＤ部１４７〜１５０は、同時に動作すると言っても、それぞれのＨＤＤ部からのデータ転送の開始タイミングや終了タイミングは、必ずしも一致しない。このため、図示は省略するが、このタイミングの違いを吸収するためのバッファメモリ部１３５〜１３８を各ＨＤＤ部１４７〜１５０を制御するＳＰＣ１４３〜１４６の直後にも設けるようにしてもよい。
【００１２】
データタイミングコントローラ１５１は、入出力インターフェイス部１３１へのデータの読み書きのタイミングコントロールと、データキャッシュ部１３２のタイミングコントロールと、ストライピング・ＥＣＣ部１３４へのデータの読み書きのタイミングコントロールを行っている。
【００１３】
ＣＰＵ部１３３は、ＣＰＵバスを通して、入出力インターフェイス部１３１、ストライピング・ＥＣＣ部１３４、データストリームコントローラ１３９〜１４２、ＳＰＣ１４３〜１４６の動作をコントロールする。
【００１４】
図１１には、上述したように構成されるＲＡＩＤのディスクアレイ装置を、動画像及び音声蓄積用の蓄積メディアとして用いた、ビデオサーバシステムの概略構成を示す。
【００１５】
この図１１に示すビデオサーバシステムは、蓄積メディアとしてのディスクアレイ装置（以下、ＲＡＩＤ１２９と記す）に対して、複数の入出力装置１２１〜１２４が時分割多重でアクセス可能となされている。なお、以下、入出力装置１２１〜１２４のことを、ＩＯＰ（Input Output Processor）１２１〜１２４と呼ぶことにする。各ＩＯＰ１２１〜１２４は、映像及び音声データの入出力が行われ、また、上位アプリケーションからのコントロールを受ける。これらＩＯＰ１２１〜１２４は、タイムスロット生成部１２５で生成されたタイムスロットに従い、データパスを介して、時分割多重でＲＡＩＤ１２９にアクセス可能となされている。データパスとしては、内部バスであったり、ＳＣＳＩ（Small Computer System Interface）、ファイバチャンネル等のネットワーク、いわゆるＳＢＸバスなどが挙げられる。
【００１６】
【発明が解決しようとする課題】
ところで、一般に、動画像や音声などの大量のデータを一度に転送するには、いわゆるＲＡＩＤ−３構成のディスクアレイ装置が適していると言われているが、当該ディスクアレイ装置は以下に挙げる問題点があり、ビデオサーバシステム用の蓄積メディアとして求められる性能を満たすことができない。
【００１７】
第１の問題点として、書き込み時のオーバーヘッドが発生する。
【００１８】
すなわち、ディスクアレイ装置では、書き込み時にオーバーヘッドが生じることがあり、書き換えるデータが前記ＡＢＦ（アレイブロッキングファクタ）の境界内に丁度収まらない場合は、書き換える前のデータとパリティを読み出し、新しいパリティを計算して書き込む必要があるため、書き込み時にオーバーヘッドが発生することがある。
【００１９】
第２の問題点として、ＨＤＤ単体がリアルタイム性を保証していない。
【００２０】
すなわち、ＨＤＤ単体は、読み書きに対してリトライを行うことを前提にしており、リアルタイム性確保が困難である。例えば、データ読み出しの際に何らかの理由で読み出しに失敗したとしても、ディスクアレイ装置のエラー訂正機能によってある程度のリアルタイム性は確保できるが、一方で、書き込みに失敗したような場合にＨＤＤにてリトライが行われると、リアルタイム性が確保できないことになる。また、単にＨＤＤ単体のリトライを禁止したとしても、例えば時間が足りないという理由等によって、ＨＤＤへの書き込みに失敗することがある。なお、このような場合、書き込みには失敗したが、ＨＤＤ自体に欠陥があるわけではなく、したがって、次にその部分からデータを読み出した時、当該ＨＤＤからはエラーのサインが返送されてこないので、全く関係無いデータが読み出されてくることになり、その結果、データを復元できなくなるという事態に陥る。
【００２１】
第３の問題点として、障害回復時の性能低下を防ぐ機構を有していない。
【００２２】
すなわち、一般の計算機用のディスクアレイ装置は、障害回復時の性能低下を防ぐ機構を有しておらず、障害回復のデータ再構築（リビルド）時には、ディスクアレイに対する読み書きができない。しかし、ビデオサーバシステム用、特に放送用のビデオサーバシステムでは、障害回復時の機能低下が最小限になるようにしなければならない。
【００２３】
上述したように、従来のディスクアレイ装置においては、上記第１及び第２の問題点により、一定時間内に読み書きが終了する保証がない。すなわち、連続するデータの読み書きが途絶えないために、さらに時分割多重アクセスする映像／音声入出力装置の同期性やリアルタイム性を保証することができない。一般には、第１及び第２の問題点を回避するためには、キャッシュメモリ（データキャッシュ部１３２）の量を増やす必要があると言われているが、キャッシュメモリの量を増やしたとしても、必ずヒットする保証がないので、１００％リアルタイム性を保証するものではない。
【００２４】
そこで、本発明はこのような状況に鑑みてなされたものであり、書き込み時のオーバーヘッドの発生を防止し、リアルタイム性を保証し、障害回復時の性能低下を防ぐことが可能なディスクアレイ装置及びデータ記録再生方法を提供することを目的とする。
【００２５】
【課題を解決するための手段】
本発明のディスクアレイ装置は、複数のディスクドライブによって構成され、タイムスロットに従い時分割多重アクセスされるディスクアレイ装置において、上記ディスクアレイ装置に入力されたデータを一時的に保持するリングバッファ構造の第１のバッファメモリからのデータ又は上記入力されたデータを分割して上記複数のディスクドライブに順次に供給し、上記複数のディスクドライブから供給されたデータを結合するデータ分割結合手段と、上記複数のディスクドライブの動作状況及び上記タイムスロットを管理すると共に、上記第１のバッファメモリ、上記データ分割結合手段から出力される上記複数のディスクドライブからのデータを結合したデータを一時的に保持するリングバッファ構造の第２のバッファメモリの読み書き及び上記複数のディスクドライブの記録再生動作を制御する制御手段とを有し、上記制御手段は、上記ディスクドライブに対するデータ書き込みが失敗したときの上記ディスクドライブ単体での再書き込みを禁止して書き込まれるはずの箇所を管理し、当該データ書き込みが失敗したときのタイムスロットにおける上記入力されたデータを上記第１のバッファメモリに保持させ、次の空きタイムスロットを使って上記第１のバッファメモリに保持されたデータを上記ディスクドライブの対応箇所に再度書き込み制御し、データの再構築を行う際には、空きタイムスロットを使って上記ディスクドライブから上記再構築を行おうとするデータを読み出してエラー訂正後に上記第２のバッファメモリに保持させ、次の空きタイムスロットを使って上記第２のバッファメモリに保持されたデータを上記ディスクドライブに書き込み制御することにより、上述の課題を解決する。
【００２６】
また、本発明のディスクアレイ装置は、複数のディスクドライブによって構成され、タイムスロットに従い時分割多重アクセスされるディスクアレイ装置において、上記ディスクアレイ装置に入力されたデータを分割して上記複数のディスクドライブに順次に供給し、上記複数のディスクドライブから供給されたデータを結合するデータ分割結合手段と、上記データ分割結合手段から供給され上記ディスクドライブへ記録されるデータを一時的に保持する少なくとも２個１組の第１の先入れ先出しメモリと、上記ディスクドライブから再生されたデータを一時的に保持する少なくとも２個１組の第２の先入れ先出しメモリと、ディスクドライブの動作状況及び上記タイムスロットを管理すると共に、上記ディスクドライブの記録再生動作及び上記第１，第２の先入れ先出しメモリをバンク切り替え制御する制御手段とを有し、上記制御手段は、上記ディスクドライブに対するデータ書き込みが失敗したときの上記ディスクドライブ単体での再書き込みを禁止して書き込まれるはずの箇所を管理し、当該データ書き込みが失敗したときのタイムスロットにおける上記データ分割結合手段から供給されたデータを上記第１の先入れ先出しメモリに保持させ、次の空きタイムスロットを使って上記第１の先入れ先出しメモリに保持されたデータを上記ディスクドライブの対応箇所に再度書き込み制御し、データの再構築を行う際には、空きタイムスロットを使って上記ディスクドライブから上記再構築を行おうとするデータを読み出してエラー訂正後に上記第２の先入れ先出しメモリに保持させ、次の空きタイムスロットを使って上記第２の先入れ先出しメモリに保持されたデータを上記ディスクドライブに書き込み制御することにより、上述の課題を解決する。
【００２７】
【発明の実施の形態】
本発明の好ましい実施の形態について、図面を参照しながら説明する。
【００２８】
先ず、第１の実施の形態から説明する。
【００２９】
本発明の第１の実施の形態では、論理ブロックの大きさを「１セクタ（通常５１２バイト）×ＨＤＤの数」の整数倍に最適化し、ＡＢＦ（アレイブロッキングファクタ）と書き換えデータの大きさとを一致させて、書き込み時のオーバーヘッドを無くことにより、転送レートの低下を防ぐようにしている。これにより、本実施の形態では、書き込み時のオーバーヘッドを緩和するためのデータキャッシュの必要を無くしている。
【００３０】
また、本発明の第１の実施の形態では、ＨＤＤ単体のリトライ及びリアサインを禁止して、一定の時間内に読み書きが終了することを保証するようにしており、これにより連続データの読み書きを途切れることなく実行することと、複数チャンネルによる時分割多重を保証している。すなわち、本実施の形態では、書き込み時にエラーの起こった場所を例えばＣＰＵが管理し、データの再構築を後から行うことにして、データの信頼性を保つことを可能にしている。
【００３１】
さらに、本発明の第１の実施の形態では、障害回復の性能低下を避けるために、後述するリトライ用テンポラリバッファメモリとリビルド用バッファメモリを備え、通常運用時のデータの流れと障害回復時のデータの流れを分けるようにしている。これにより、書き込み時にエラーが発生したＨＤＤへのデータ修復を、システムのためのタイムスロットや、空いているタイムスロットを使って素早く回復することを可能にし、例えばＨＤＤを交換してデータの修復を行う場合もその運用効率を高めることを可能にしている。
【００３２】
以下、本発明の第１の実施の形態のディスクアレイ装置の構成及び動作を、例えば前記図１１に示したようなビデオサーバシステムの蓄積メディアとして使用する場合を例に挙げて説明する。
【００３３】
図１には、本発明の第１の実施の形態のディスクアレイ装置の概略構成を示す。
【００３４】
この図１において、入出力インターフェイス部２では、本実施の形態のディスクアレイ装置と前記図１１に示したビデオサーバシステムのＩＯＰ１２１〜１２４との間における、コマンド／ステータス及び入出力データの送受を制御する。この入出力インターフェイス部２に入力されたデータは、選択スイッチ４を介して、或いは後述するリトライ用テンポラリバッファメモリ２１及び選択スイッチ４を介して、ストライピング・ＥＣＣ部７に送られる。
【００３５】
ストライピング・ＥＣＣ部７では、入力されたデータを分割（ストライピング）する。このストライピングされたデータは、ＨＤＤコントロール部１０〜１３を介して、複数のＨＤＤ部１５〜１８に格納される。また、ストライピング・ＥＣＣ部７では、データを分割（ストライピング）するときに複数のＨＤＤ部１５〜１８にまたがるデータ列に対してエラー訂正符号を生成する。このエラー訂正符号は、ＨＤＤコントロール部１４を介して、ＨＤＤ部１９に格納される。
【００３６】
リトライ用テンポラリバッファメモリ２１とリビルド用バッファメモリ２２は、本実施の形態のディスクアレイ装置においてリトライとリビルドを効率よく行うために設けられている。リトライ用テンポラリバッファメモリ２１は、リングバッファ構造になっており、入出力インターフェイス部２からのデータが書き込まれる。一方、リビルド用バッファメモリ２２もリングバッファ構造になっており、各ＨＤＤ１５〜１８から読み出され、ＨＤＤ部１９からのエラー訂正符号によりエラー訂正されたデータが書き込まれる。これらリトライ用テンポラリバッファメモリ２１からのデータや、リビルド用バッファメモリ２２からのデータは、選択スイッチ４に送られる。詳細は後述するが、本実施の形態では、例えば各ＨＤＤ１５〜１９で読み書きにエラーが起きた場合、空いているタイムスロットを使い、リトライ用テンポラリバッファメモリ２１やリビルド用バッファメモリ２２からのデータを用いてデータの再構築（エラー時の回復動作やリビルド動作）を行うようにしている。なお、リトライ用テンポラリバッファメモリ２１とリビルド用バッファメモリ２２は、前記図１０に示したようないわゆるキャッシュメモリ（データキャッシュ部１３２）ではない。すなわち、リトライ用テンポラリバッファメモリ２１に保存されるデータは、入出力インターフェイス部２から選択スイッチ４に送られるデータのコピーであり、また、リビルド用バッファメモリ２２に保存されるデータは、ストライピング・ＥＣＣ部７から入出力インターフェイス部２に送られるデータのコピーであり、いわゆるキャッシュメモリの場合のようなキャッシュヒット等の複雑な制御を必要としない。
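[0037]
The data timing controller 3 controls the timing of data reads and writes in the input/output interface unit 2, in the striping/ECC unit 7, in the retry temporary buffer memory 21, and in the rebuild buffer memory 22. It also controls the switching timing of the selection switch 4, which selects among the data sent to the striping/ECC unit 7 from the input/output interface unit 2, from the retry temporary buffer memory 21, and from the rebuild buffer memory 22.
[0038]
Data is read from and written to the HDD units 15 to 19 through the HDD control units 10 to 14, and the striping/ECC unit 7 generates the timing with which the HDD control units 10 to 14 read and write data on the HDDs 15 to 19.
[0039]
The CPU unit 5 controls, through the CPU bus, the operations of the input/output interface unit 2, the data timing controller 3, the striping/ECC unit 7, and the HDD control units 10 to 14. The CPU unit 5 also exchanges commands and status with the IOPs 121 to 124 of FIG. 11 through the CPU bus and the input/output interface unit 2. Further, as described in detail later, when a write error occurs on any of the HDDs 15 to 19, the CPU unit 5 manages the error location on the HDD and reconstructs the data later.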
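[0040]
Next, FIG. 2 shows the configuration of the HDD control units 10 to 14.
[0041]
In FIG. 2, the input and output FIFO (first-in first-out) memories 43 and 44 are provided to temporarily store data supplied from the striping/ECC unit 7 and data to be sent to the striping/ECC unit 7, and to absorb the asynchrony of the HDDs.
[0042]
The SPC 42 controls the HDD through the SCSI bus.
[0043]
The SPC 42 and the data stream controller 41 are controlled by the CPU unit 5 of FIG. 1 via the CPU bus.
[0044]
The data stream controller 41, following the timing control from the striping/ECC unit 7, controls the timing of data reads and writes on the input FIFO memory 43 and the output FIFO memory 44, and the timing of data reads and writes on the SPC 42.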
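[0045]
Next, an example of the recovery operation on a write error using the retry temporary buffer memory 21 will be described with reference to FIG. 3 and the configuration of the video server system of FIG. 11.
[0046]
The disk array device of the present embodiment is accessed by the IOPs 121 to 124 in a time-division multiplexed manner according to the time slots generated by the time slot generation unit 125 of the video server system of FIG. 11. In the example of FIG. 3, the time slots are denoted TS1, TS2, TS3, and so on, and are shared in a time-division multiplexed manner among five users: the four IOPs 121 to 124 of FIG. 11 and one slot for the system. In the example of FIG. 3, it is also assumed that data is being written from the IOPs 121 to 124 of FIG. 11 to the HDD units of FIG. 1. Under these conditions, suppose that a write to an HDD unit fails in time slot TS3.
[0047]
At this time, the CPU unit 5 controls the HDD control unit so that the HDD alone does not retry, and manages the location on the HDD unit where the data of time slot TS3 should have been written. The CPU unit 5 also causes, via the data timing controller 3, the retry temporary buffer memory 21 to hold the input data, that is, the data of time slot TS3 whose write to the HDD unit failed, without clearing it. Processing then proceeds to the next time slot TS4, and normal data reads and writes are performed.
[0048]
Next, when there is a free time slot with no read or write request from an IOP, the CPU unit 5 controls the selection switch 4 via the data timing controller 3 to switch the data flow, sends the data from the retry temporary buffer memory 21 to the striping/ECC unit 7, and controls the HDD control unit so that the data is written again (retried) to the corresponding HDD unit. In the example of FIG. 3, the retry is performed in the system time slot TS7 as the free time slot with no read or write request from an IOP. Processing then proceeds to the next time slot TS8, and normal data reads and writes are performed.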
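The deferred retry described for FIG. 3 can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `run_slots`, the log strings, and the choice of TS2 and TS7 as system slots are hypothetical, and a `deque` stands in for the retry temporary buffer memory 21.

```python
from collections import deque


def run_slots(num_slots, failing_slots, system_slots):
    """Simulate a write-only workload and return a log of (slot, action) pairs.

    failing_slots: slot numbers whose drive write fails (hypothetical input).
    system_slots:  slot numbers with no IOP request, usable for retries.
    """
    pending = deque()  # stands in for the retry temporary buffer memory
    log = []
    for ts in range(1, num_slots + 1):
        if ts in system_slots:
            if pending:
                # Free slot: rewrite the held data to the recorded location.
                failed_ts, _data = pending.popleft()
                log.append((ts, f"retry TS{failed_ts}"))
            else:
                log.append((ts, "idle"))
            continue
        data = f"data@TS{ts}"
        if ts in failing_slots:
            # No drive-level retry: hold the data and move on to the next slot.
            pending.append((ts, data))
            log.append((ts, "write failed, held"))
        else:
            log.append((ts, "write ok"))
    return log


if __name__ == "__main__":
    for entry in run_slots(8, failing_slots={3}, system_slots={2, 7}):
        print(entry)
```

With a failure in TS3 and system slots at TS2 and TS7, the held data is rewritten in TS7 and every other slot keeps its normal timing, which is the property the text relies on.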
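[0049]
Next, an example of the rebuild operation using the rebuild buffer memory 22 will be described with reference to FIG. 4 and the configuration of the video server system of FIG. 11.
[0050]
When there is a free time slot with no read or write request from an IOP, the CPU unit 5 causes the rebuild data to be read from the HDD units via the HDD control units, controls the striping/ECC unit 7 to error-correct the rebuild data read from the HDD units, and controls the data timing controller 3 to hold the error-corrected rebuild data in the rebuild buffer memory 22. In the example of FIG. 4, the rebuild data is read from the HDDs and held in the rebuild buffer memory 22 in the system time slot TS2 as the free time slot. Processing then proceeds to the next time slot TS3, and normal data reads and writes are performed.
[0051]
Next, when the next free time slot with no read or write request from an IOP arrives, the CPU unit 5 controls the data timing controller 3 to read the rebuild data held in the rebuild buffer memory 22, controls the selection switch 4 via the data timing controller 3 to switch the data flow, sends the rebuild data from the rebuild buffer memory 22 to the striping/ECC unit 7 to have parity added, and then controls the HDD control unit to write the rebuild data to the corresponding HDD unit. In the example of FIG. 4, the rebuild data is written to the HDD unit in the system time slot TS7 as the next free time slot. Processing then proceeds to the next time slot TS8, and normal data reads and writes are performed.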
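The two-slot rebuild can be sketched as a small state machine that alternates between a read phase and a write phase on successive free slots. The class name `Rebuilder`, its fields, and the use of XOR parity are assumptions made for illustration; the text only speaks of an error correction code. The sketch also simplifies the write phase by writing just the reconstructed unit rather than passing the data back through a parity stage.

```python
from functools import reduce


def xor_units(units):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*units))


class Rebuilder:
    """Rebuild one drive, one stripe per pair of free time slots (hypothetical)."""

    def __init__(self, drives, failed_index):
        self.drives = drives        # list of per-drive lists of units; one drive holds parity
        self.failed = failed_index  # index of the drive being rebuilt
        self.buffer = None          # stands in for the rebuild buffer memory
        self.next_stripe = 0

    def on_free_slot(self):
        """Advance the rebuild by one phase; call once per free time slot."""
        if self.buffer is None:
            if self.next_stripe >= len(self.drives[self.failed]):
                return "done"
            # Phase 1: read the surviving units and reconstruct the missing one.
            survivors = [d[self.next_stripe]
                         for i, d in enumerate(self.drives) if i != self.failed]
            self.buffer = (self.next_stripe, xor_units(survivors))
            return "read"
        # Phase 2: write the buffered unit to the drive being rebuilt.
        stripe, unit = self.buffer
        self.drives[self.failed][stripe] = unit
        self.buffer = None
        self.next_stripe += 1
        return "write"
```

Splitting the work this way means each free slot performs a single kind of drive access, which matches the read-in-one-slot, write-in-the-next pattern of the FIG. 4 example.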
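[0052]
The rebuild is split across two time slots in the present embodiment because the seek time of an HDD is extremely long compared with the time the HDD actually spends reading or writing. If the rebuild were performed within a single time slot, the amount of data that could be rebuilt at one time would fall to about one tenth, which would be very inefficient.
[0053]
As described above, the first embodiment of the present invention includes the retry temporary buffer memory 21 and the rebuild buffer memory 22, so that the data flow during normal operation can be separated from the data flow during failure recovery. This makes it possible to keep the failure recovery work from affecting normal reads and writes on the disk array device.
[0054]
In this way, the first embodiment can reconstruct HDD data efficiently.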
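[0055]
Next, a second embodiment of the present invention will be described.
[0056]
In the second embodiment of the present invention, as in the first embodiment described above, the size of the logical block is optimized to an integral multiple of "1 sector (usually 512 bytes) × the number of HDDs", and the ABF (array blocking factor) is matched to the size of the rewrite data. This eliminates the overhead at the time of writing and prevents the transfer rate from decreasing. As a result, the second embodiment also eliminates the need for a data cache to mitigate the write overhead.
[0057]
In the second embodiment, as in the first embodiment, retry and reassignment by a single HDD are prohibited so that reads and writes are guaranteed to finish within a fixed time. This guarantees that continuous data is read and written without interruption and that time-division multiplexing over multiple channels is possible. That is, in the second embodiment as well, the CPU manages the location where an error occurred during writing and the data is reconstructed later, so that the reliability of the data can be maintained.
[0058]
Furthermore, in the second embodiment, to avoid performance degradation during failure recovery, multiple sets of buffers are provided for each HDD, the data flow during normal operation and the data flow during failure recovery are switched using multiplexers, and these sets of buffers are used selectively. This makes it possible to quickly repair data on an HDD where a write error occurred by using the time slot for the system or a free time slot, and also improves operational efficiency when, for example, an HDD is replaced and its data is repaired.
[0059]
FIG. 5 shows a schematic configuration of the disk array device according to the second embodiment of the present invention. The second embodiment is also described using as an example the case where the disk array device is used as the storage medium of a video server system such as that shown in FIG. 11. The following description of the second embodiment focuses on the parts that differ from the first embodiment.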
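[0060]
In FIG. 5, the input/output interface unit 52 controls the exchange of commands/status and input/output data between the disk array device of the present embodiment and the IOPs 121 to 124 of the video server system shown in FIG. 11. Data input to the input/output interface unit 52 is sent to the striping/ECC unit 57.
[0061]
As in the first embodiment, the striping/ECC unit 57 divides (stripes) the input data. The striped data is stored in the plurality of HDD units 65 to 68 via the HDD control units 60 to 63. The striping/ECC unit 57 also generates an error correction code for the data string spanning the plurality of HDD units 65 to 68. This error correction code is stored in the HDD unit 69 via the HDD control unit 64.
[0062]
The data timing controller 53 controls the timing of data reads and writes in the input/output interface unit 52 and in the striping/ECC unit 57.
[0063]
Data is read from and written to the HDD units 65 to 69 through the HDD control units 60 to 64, and the striping/ECC unit 57 generates the timing with which the HDD control units 60 to 64 read and write data on the HDDs 65 to 69.
[0064]
The CPU unit 55 controls, through the CPU bus, the operations of the input/output interface unit 52, the data timing controller 53, the striping/ECC unit 57, and the HDD control units 60 to 64. The CPU unit 55 also exchanges commands and status with the IOPs 121 to 124 of FIG. 11 through the CPU bus and the input/output interface unit 52. Further, when a write error occurs on an HDD unit, the CPU unit 55 manages the error location on the HDD unit and reconstructs the data later.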
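[0065]
Next, FIG. 6 shows the configuration of the HDD control units 60 to 64.
[0066]
In FIG. 6, each of the HDD control units 60 to 64 of the present embodiment includes a set of two or more input FIFO memories (three in the example of FIG. 6, the input FIFO memories 77 to 79) and a set of two output FIFO memories (in the example of FIG. 6, the pair of output FIFO memories 80 and 81). The input FIFOs are provided in a set of two or more and the output FIFOs in a set of two so that the rebuild can be performed efficiently, as described later. The input and output FIFOs 77 to 81 temporarily store data supplied from the striping/ECC unit 57 of FIG. 5 and data to be sent to the striping/ECC unit 57, and absorb the asynchrony of the HDDs.
[0067]
The selection switch 71 is a changeover switch that routes data from the striping/ECC unit 57 to one of the input FIFOs 77 to 79, and the selection switch 72 is a changeover switch that selects the data of either output FIFO 80 or 81 and supplies it to the striping/ECC unit 57.
[0068]
The SPC 76 controls the HDD through the SCSI bus.
[0069]
The SPC 76 and the data stream controller 75 are controlled by the CPU unit 55 of FIG. 5 via the CPU bus.
[0070]
The data stream controller 75, following the timing control from the striping/ECC unit 57, controls the timing of data reads and writes on the input FIFO memories 77 to 79 and the output FIFO memories 80 and 81, the bank switching of these FIFO memories (switching of the selection switches 71 and 72), and the timing of data reads and writes on the SPC 76.
[0071]
According to the present embodiment, with the configuration of FIGS. 5 and 6, the required memory speed is only one over the number of data storage HDDs compared with the cache memory of the data cache unit 132 in the conventional configuration of FIG. 10, the memory itself is cheaper, and the circuit configuration can be simplified.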
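[0072]
Next, an example of the recovery (system retry) operation on a write error in the configuration of the second embodiment will be described with reference to FIG. 7 and the configuration of the video server system of FIG. 11.
[0073]
In the second embodiment, as in the first, the time slots TS1, TS2, TS3, and so on are shared in a time-division multiplexed manner among five users: the four IOPs 121 to 124 of FIG. 11 and one slot for the system. In the example of FIG. 7, it is assumed that data from the IOPs 121 to 124 of FIG. 11 is written to the HDD units of FIG. 5 through, for example, the input FIFO 77 in the normal state. Under these conditions, suppose that a write to an HDD unit fails in time slot TS3.
[0074]
At this time, the CPU unit 55 controls the HDD control unit so that the HDD unit alone does not retry, and manages the location on the HDD unit where the data of time slot TS3 should have been written. The CPU unit 55 also keeps the data of the input FIFO 77 in the HDD control unit without clearing it, so that the data of time slot TS3 whose write to the HDD unit failed is held. Next, the CPU unit 55 controls the data stream controller 75 so that, from the next time slot TS4 onward, data is sent to the SPC 76 via the input FIFO 78 and written to the HDD unit.
[0075]
After that, when there is a free time slot with no read or write request from an IOP, the CPU unit 55 controls the data stream controller 75 to send the data from the input FIFO 77 to the SPC 76 and write it again (retry) to the HDD unit. In the example of FIG. 7, the retry is performed in the system time slot TS7 as the free time slot. After that, the CPU unit 55 controls the data stream controller 75 so that, from the next time slot TS8 onward, data is sent to the SPC 76 via the input FIFO 77 and written to the HDD unit.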
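The bank switching between input FIFOs during a system retry can be sketched as follows. The class `InputFifoBanks`, its method names, and the `drive_write` callback (which returns `True` on success) are hypothetical names introduced for illustration, and the sketch assumes only two banks and at most one outstanding failure.

```python
from collections import deque


class InputFifoBanks:
    """Two-bank input FIFO with deferred retry (illustrative sketch only)."""

    def __init__(self, num_banks=2):
        self.banks = [deque() for _ in range(num_banks)]
        self.active = 0    # bank used for normal writes (FIFO 77 in the example)
        self.held = None   # bank still holding data whose write failed

    def write_slot(self, data, drive_write):
        """Normal time slot: pass data through the active bank to the drive."""
        bank = self.banks[self.active]
        bank.append(data)
        if drive_write(data):
            bank.popleft()
            return True
        if self.held is not None:
            raise RuntimeError("no spare bank for a second failed write")
        # Keep the failed data in this bank and switch to the other bank.
        self.held = self.active
        self.active = (self.active + 1) % len(self.banks)
        return False

    def retry_in_free_slot(self, drive_write):
        """Free time slot: rewrite the held data, then return to the original bank."""
        if self.held is None:
            return False
        bank = self.banks[self.held]
        if drive_write(bank[0]):
            bank.popleft()
            self.active = self.held
            self.held = None
            return True
        return False
```

After a successful retry the sketch returns to the original bank, mirroring the way the example resumes writing through the input FIFO 77 from time slot TS8.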
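[0076]
Next, an example of the rebuild operation in the second embodiment will be described with reference to FIG. 8 and the configuration of the video server system of FIG. 11.
[0077]
Here, it is assumed that data is read out to the IOPs 121 to 124 through, for example, the output FIFO 80 in the normal state.
[0078]
Under these conditions, when there is a free time slot with no read or write request from an IOP, the CPU unit 55 causes the rebuild data to be read from the HDD unit via the SPC 76, and controls the data stream controller 75 to hold the rebuild data read from the HDD unit in the output FIFO 80. In the example of FIG. 8, the rebuild data is read from the HDD unit and held in the output FIFO 80 in the system time slot TS2 as the free time slot.
[0079]
Next, the CPU unit 55 controls the data stream controller 75 so that, from the next time slot TS3 onward, data read from the HDD unit is sent to the IOP via the output FIFO 81.
[0080]
Next, when the next free time slot with no read or write request from an IOP arrives, the CPU unit 55 controls the data stream controller 75 to read the rebuild data from the output FIFO 80 and send it to the striping/ECC unit 57 to have parity added, and then controls the data stream controller 75 to pass the rebuild data through the input FIFO 79 and write it to the HDD unit via the SPC 76. In the example of FIG. 8, in the system time slot TS7 as the next free time slot, the rebuild data already held in the output FIFO 80 is read out and written to the HDD unit through the input FIFO 79.
[0081]
After that, the CPU unit 55 controls the data stream controller 75 so that, from the next time slot TS8 onward, data read from the HDD unit is sent to the IOP via the output FIFO 80.
[0082]
In the second embodiment, as in the first, the rebuild is split across two time slots because the seek time of an HDD is extremely long compared with the time the HDD actually spends reading or writing. If the rebuild were performed within a single time slot, the amount of data that could be rebuilt at one time would fall to about one tenth, which would be very inefficient.
[0083]
In the second embodiment, multiple sets of input and output FIFOs are provided so that the data flow during normal operation can be separated from the data flow during failure recovery. This makes it possible to keep the failure recovery work from affecting normal reads and writes on the disk array device.
[0084]
In this way, the second embodiment can reconstruct HDD data efficiently.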
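[0085]
That is, according to the embodiments of the present invention described above, applying the disk array device to a video server system makes it possible to satisfy requirements such as the following:
(1) Large capacity
(2) A high transfer rate
(3) High reliability
(4) Good random access
(5) Simultaneous access to the same material from multiple channels
(6) No loss of data continuity
(7) Efficient failure recovery
(8) No performance degradation during failure recovery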
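[0086]
[Effects of the Invention]
As is clear from the above description, the disk array device of the present invention includes first and second buffer memories of ring buffer structure for input data and output data; data dividing and combining means that divides data, supplies it sequentially to a plurality of disk drives, and combines data supplied from the plurality of disk drives; and control means that manages the operating status of the disk drives and the processing time slots and controls reads and writes on each buffer memory and the recording and reproducing operations of the disk drives. By reconstructing data in free time slots using the data from the first and second buffer memories, the device can prevent overhead at the time of writing, guarantee real-time performance, and prevent performance degradation during failure recovery.
[0087]
The disk array device of the present invention also includes data dividing and combining means that divides data, supplies it sequentially to a plurality of disk drives, and combines data supplied from the plurality of disk drives; first and second first-in first-out memories, each a set of at least two, for the data recorded on and reproduced from the disk drives; and control means that manages the operating status of the disk drives and the processing time slots and controls the recording and reproducing operations of the disk drives and the bank switching of the first and second first-in first-out memories. By reconstructing data in free time slots using the data from the first and second first-in first-out memories, the device can prevent overhead at the time of writing, guarantee real-time performance, and prevent performance degradation during failure recovery.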
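[Brief description of the drawings]
[FIG. 1] A block circuit diagram showing a schematic configuration of the disk array device of the first embodiment of the present invention, applicable to a video server system.
[FIG. 2] A block circuit diagram showing a schematic configuration of the HDD control unit in the disk array device of the first embodiment.
[FIG. 3] A diagram used to explain the recovery (retry) operation when a write error to an HDD occurs in the disk array device of the first embodiment.
[FIG. 4] A diagram used to explain the rebuild operation in the disk array device of the first embodiment.
[FIG. 5] A block circuit diagram showing a schematic configuration of the disk array device of the second embodiment of the present invention, applicable to a video server system.
[FIG. 6] A block circuit diagram showing a schematic configuration of the HDD control unit in the disk array device of the second embodiment.
[FIG. 7] A diagram used to explain the recovery (retry) operation when a write error to an HDD occurs in the disk array device of the second embodiment.
[FIG. 8] A diagram used to explain the rebuild operation in the disk array device of the second embodiment.
[FIG. 9] A diagram used to explain the concept of a disk array device.
[FIG. 10] A block circuit diagram showing a schematic configuration of a conventional general disk array device.
[FIG. 11] A block circuit diagram showing a schematic configuration of a video server system.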
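[Explanation of reference signs]
2, 52: input/output interface unit; 3, 53: data timing controller; 4, 71, 72: selection switch; 5, 55: CPU unit; 7, 57: striping/ECC unit; 10 to 14, 60 to 64: HDD control unit; 15 to 19, 65 to 69: HDD unit; 21: retry temporary buffer memory; 22: rebuild buffer memory; 43, 77 to 79: input FIFO; 44, 80, 81: output FIFO; 41, 75: data stream controller; 42, 76: SPC
[0001]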
BACKGROUND OF THE INVENTION
The present invention relates to a disk array device and a data recording/reproducing method suitable for, for example, a so-called video server system that records and reproduces moving images and audio signals.
[0002]
[Prior art]
For example, a disk array device configured in parallel and redundantly by a plurality of HDDs (hard disk drives) is generally called RAID (Redundant Array of Inexpensive Disks).
[0003]
FIG. 9 shows the concept of the disk array device called RAID.
[0004]
In FIG. 9, the disk array controller 101 of the disk array device divides (stripes) the input data 100 into data strings D0 to D15 of a predetermined unit length, and distributes these data strings D0 to D15, one unit length at a time, across the plurality of HDDs 111 to 118 for storage. The predetermined unit length is called the striping width, and the set of striping-width data strings, one for each data storage HDD, is called the array blocking factor (hereinafter, ABF).
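A minimal sketch of this striping may help. The function names `stripe` and `abf_bytes` are hypothetical, and the round-robin assignment of units to drives is an assumption consistent with sixteen units D0 to D15 being spread over eight data HDDs.

```python
def stripe(data, num_drives, stripe_width):
    """Split data into stripe_width-byte units and deal them round-robin to drives."""
    drives = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), stripe_width):
        unit_index = offset // stripe_width
        drives[unit_index % num_drives].append(data[offset:offset + stripe_width])
    return drives


def abf_bytes(num_drives, stripe_width):
    """Size of one array blocking factor: one stripe unit per data drive."""
    return num_drives * stripe_width


if __name__ == "__main__":
    # Sixteen one-byte units over eight drives: each drive receives two units.
    layout = stripe(bytes(range(16)), num_drives=8, stripe_width=1)
    print(layout[0])
    print(abf_bytes(8, 512))
```

In this layout the first drive receives units 0 and 8, the second receives units 1 and 9, and so on.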
[0005]
Simultaneously with the striping, when the disk array controller 101 divides the input data, it generates an error correction code for the data string spanning the plurality of HDDs 111 to 118 and stores the error correction code in the HDD 119. By storing the error correction code and later performing error correction with it, the disk array device of FIG. 9 achieves higher performance and reliability than a single HDD. In the figure, Pn (n is 0 to 15) indicates the error correction code for the nth data string. The example of FIG. 9 uses nine HDDs, the HDDs 111 to 118 and the HDD 119, but a larger or smaller number may be used.
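The text does not say which error correction code is used. As an illustration only, the sketch below assumes byte-wise XOR parity, a common choice for a single dedicated parity drive; the function names `make_parity` and `recover_unit` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_units):
    """Parity unit for one stripe (assumed XOR; the source only says 'error correction code')."""
    return xor_blocks(data_units)


def recover_unit(surviving_units, parity):
    """Reconstruct the single missing data unit from the survivors and the parity."""
    return xor_blocks(list(surviving_units) + [parity])


if __name__ == "__main__":
    units = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = make_parity(units)
    # Pretend the middle unit was lost and rebuild it from the rest.
    print(recover_unit([units[0], units[2]], parity) == units[1])
```

Because XOR is its own inverse, the same helper both generates the parity and rebuilds a lost unit.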
[0006]
FIG. 10 shows the structure of a general disk array device.
[0007]
In FIG. 10, a general disk array device includes a data input/output interface unit 131, a data cache unit (cache memory) 132, a CPU (central processing unit) unit 133, a striping/ECC unit 134, a data timing controller 151, a plurality of dedicated buffer memory units 135 to 138 for absorbing the asynchrony between the HDDs, a plurality of data stream controllers 139 to 142, a plurality of SCSI protocol controllers (hereinafter referred to as SPCs) 143 to 146, and a plurality of HDD units 147 to 150.
[0008]
The input data is sent to the striping / ECC unit 134 via the input / output interface unit 131 and the data cache unit 132.
[0009]
The striping/ECC unit 134 divides (stripes) the input data. The striped data is stored in the plurality of HDD units 147 to 149 via the buffer memories 135 to 137 and the SPCs 143 to 145. When dividing (striping) the data, the striping/ECC unit 134 also generates an error correction code for the data string spanning the plurality of HDD units 147 to 149 and stores it in the HDD unit 150 via the buffer memory 138 and the SPC 146.
[0010]
Data is read from and written to the HDD units 147 to 150 through the buffer memory units 135 to 138 and the SPCs 143 to 146, and the reading and writing of data in these buffer memory units and SPCs is controlled by the corresponding data stream controllers 139 to 142. The striping/ECC unit 134 generates the data read/write timing for the data stream controllers 139 to 142.
[0011]
Even if the HDD units 147 to 150 in the disk array apparatus operate simultaneously, the start timing and end timing of data transfer from the respective HDD units do not necessarily match. For this reason, although illustration is omitted, buffer memory units 135 to 138 for absorbing the difference in timing may be provided immediately after the SPCs 143 to 146 for controlling the HDD units 147 to 150.
[0012]
The data timing controller 151 performs data read / write timing control to the input / output interface unit 131, data cache unit 132 timing control, and striping / ECC unit 134 data read / write timing control.
[0013]
The CPU 133 controls the operations of the input / output interface unit 131, the striping / ECC unit 134, the data stream controllers 139 to 142, and the SPCs 143 to 146 through the CPU bus.
[0014]
FIG. 11 shows a schematic configuration of a video server system in which the RAID disk array device configured as described above is used as a storage medium for storing moving images and audio.
[0015]
In the video server system shown in FIG. 11, a plurality of input / output devices 121 to 124 are accessible by time division multiplexing to a disk array device (hereinafter referred to as RAID 129) as a storage medium. Hereinafter, the input / output devices 121 to 124 will be referred to as IOP (Input Output Processor) 121 to 124. Each of the IOPs 121 to 124 performs input / output of video and audio data, and receives control from a host application. These IOPs 121 to 124 can access the RAID 129 by time division multiplexing via a data path according to the time slot generated by the time slot generation unit 125. Data paths include internal buses, SCSI (Small Computer System Interface), fiber channel networks, so-called SBX buses, and the like.
[0016]
[Problems to be solved by the invention]
By the way, it is generally said that a disk array device having a so-called RAID-3 configuration is suitable for transferring a large amount of data such as moving images and sounds at one time. However, the disk array device has the following problems. Therefore, the performance required as a storage medium for a video server system cannot be satisfied.
[0017]
As a first problem, overhead at the time of writing occurs.
[0018]
That is, a disk array device may incur overhead at the time of writing: when the data to be rewritten does not fit exactly within the boundaries of the ABF (array blocking factor), the data and parity as they were before rewriting must be read out and a new parity calculated and written.
[0019]
As a second problem, the HDD alone does not guarantee real-time performance.
[0020]
That is, a single HDD is premised on retrying reads and writes, which makes real-time performance difficult to ensure. For example, even if a data read fails for some reason, the error correction function of the disk array device can ensure a certain degree of real-time performance; if a write fails and the HDD retries it, however, real-time performance cannot be ensured. Moreover, even if retries by the HDD alone are simply prohibited, a write to the HDD may still fail, for example because there is not enough time. In that case the write has failed but the HDD itself is not defective. Consequently, the next time data is read from that location the HDD returns no error indication, completely unrelated data is read out, and the data can no longer be restored.
[0021]
As a third problem, it does not have a mechanism for preventing performance degradation during failure recovery.
[0022]
That is, a general disk array device for computers does not have a mechanism for preventing performance degradation at the time of failure recovery, and cannot read / write data from / to the disk array at the time of data reconstruction (rebuild) for failure recovery. However, in a video server system, in particular, a broadcast video server system, it is necessary to minimize the deterioration of the function upon failure recovery.
[0023]
As described above, in the conventional disk array device, the first and second problems mean there is no guarantee that reads and writes finish within a fixed time. That is, it is not possible to guarantee that reads and writes of continuous data are never interrupted, nor to guarantee the synchronism and real-time performance of the video/audio input/output devices that access the array by time-division multiplexing. It is generally said that avoiding the first and second problems requires increasing the amount of cache memory (data cache unit 132), but even with more cache memory there is no guarantee of a hit every time, so real-time performance is not guaranteed 100%.
[0024]
The present invention has been made in view of this situation, and its object is to provide a disk array device and a data recording/reproducing method that can prevent overhead at the time of writing, guarantee real-time performance, and prevent performance degradation during failure recovery.
[0025]
[Means for Solving the Problems]
The disk array device of the present invention is a disk array device that is composed of a plurality of disk drives and is accessed by time-division multiplexing according to time slots. It includes: data dividing and combining means that divides either the data from a first buffer memory of ring buffer structure, which temporarily holds the data input to the disk array device, or the input data itself, supplies the divided data sequentially to the plurality of disk drives, and combines the data supplied from the plurality of disk drives; and control means that manages the operating status of the plurality of disk drives and the time slots, and controls reads and writes on the first buffer memory and on a second buffer memory of ring buffer structure, which temporarily holds the combined data from the plurality of disk drives output by the data dividing and combining means, as well as the recording and reproducing operations of the plurality of disk drives. When a data write to a disk drive fails, the control means prohibits rewriting by the disk drive alone and manages the location where the data should have been written, causes the first buffer memory to hold the input data of the time slot in which the write failed, and uses the next free time slot to write the data held in the first buffer memory again to the corresponding location on the disk drive. When reconstructing data, the control means uses a free time slot to read the data to be reconstructed from the disk drives and hold it, after error correction, in the second buffer memory, and uses the next free time slot to write the data held in the second buffer memory to the disk drive. The above problems are thereby solved.
[0026]
The disk array device of the present invention is also a disk array device that is composed of a plurality of disk drives and is accessed by time-division multiplexing according to time slots. It includes: data dividing and combining means that divides the data input to the disk array device, supplies the divided data sequentially to the plurality of disk drives, and combines the data supplied from the plurality of disk drives; first first-in first-out memories, in a set of at least two, that temporarily hold the data supplied from the data dividing and combining means to be recorded on a disk drive; second first-in first-out memories, in a set of at least two, that temporarily hold the data reproduced from the disk drive; and control means that manages the operating status of the disk drives and the time slots, and controls the recording and reproducing operations of the disk drives and the bank switching of the first and second first-in first-out memories. When a data write to a disk drive fails, the control means prohibits rewriting by the disk drive alone and manages the location where the data should have been written, causes the first first-in first-out memory to hold the data supplied from the data dividing and combining means in the time slot in which the write failed, and uses the next free time slot to write the data held in the first first-in first-out memory again to the corresponding location on the disk drive. When reconstructing data, the control means uses a free time slot to read the data to be reconstructed from the disk drive and hold it, after error correction, in the second first-in first-out memory, and uses the next free time slot to write the data held in the second first-in first-out memory to the disk drive. The above problems are thereby solved.
[0027]
DETAILED DESCRIPTION OF THE INVENTION
A preferred embodiment of the present invention will be described with reference to the drawings.
[0028]
First, the first embodiment will be described.
[0029]
In the first embodiment of the present invention, the size of the logical block is optimized to be an integral multiple of “1 sector (usually 512 bytes) × the number of HDDs”, and the ABF (array blocking factor) and the size of the rewrite data are set. By matching, the overhead at the time of writing is eliminated to prevent the transfer rate from decreasing. As a result, in this embodiment, the need for a data cache for reducing overhead during writing is eliminated.
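The sizing rule can be illustrated with a short calculation. This is a sketch under assumptions: the helper names are hypothetical, and "the number of HDDs" is taken here to mean the data storage HDDs, which is how the ABF is defined earlier in the description.

```python
SECTOR_BYTES = 512  # "usually 512 bytes" per the description


def logical_block_bytes(num_data_hdds, multiple=1, sector_bytes=SECTOR_BYTES):
    """Logical block size as an integral multiple of (1 sector x number of HDDs)."""
    if num_data_hdds < 1 or multiple < 1:
        raise ValueError("num_data_hdds and multiple must be positive integers")
    return sector_bytes * num_data_hdds * multiple


def needs_read_modify_write(offset, length, abf):
    """True when a write does not start and end on ABF boundaries.

    Such a write would require reading the old data and parity first,
    which is the write overhead the sizing rule is meant to avoid.
    """
    return offset % abf != 0 or length % abf != 0


if __name__ == "__main__":
    abf = logical_block_bytes(num_data_hdds=4)        # 2048 bytes
    print(needs_read_modify_write(0, abf, abf))       # aligned full-stripe write
    print(needs_read_modify_write(512, abf, abf))     # misaligned write
```

When every write is a whole number of such blocks starting on a block boundary, the parity can be computed from the new data alone.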
[0030]
In the first embodiment of the present invention, retry and reassignment by a single HDD are also prohibited so that reads and writes are guaranteed to finish within a fixed time. This guarantees that continuous data is read and written without interruption and that time-division multiplexing over multiple channels is possible. That is, in this embodiment, the CPU, for example, manages the location where an error occurred during writing and the data is reconstructed later, so that the reliability of the data can be maintained.
[0031]
Furthermore, to avoid performance degradation during failure recovery, the first embodiment of the present invention includes a retry temporary buffer memory and a rebuild buffer memory, described later, and separates the data flow during normal operation from the data flow during failure recovery. This makes it possible to quickly repair data on an HDD where a write error occurred by using the time slot for the system or a free time slot, and also improves operational efficiency when, for example, an HDD is replaced and its data is repaired.
[0032]
The configuration and operation of the disk array device according to the first embodiment of the present invention will be described below, taking as an example the case where it is used as the storage medium of a video server system such as that shown in FIG. 11.
[0033]
FIG. 1 shows a schematic configuration of the disk array device according to the first embodiment of the present invention.
[0034]
In FIG. 1, the input/output interface unit 2 controls the exchange of commands/status and input/output data between the disk array device of the present embodiment and the IOPs 121 to 124 of the video server system shown in FIG. 11. The data input to the input/output interface unit 2 is sent to the striping/ECC unit 7 either via the selection switch 4, or via the retry temporary buffer memory 21, described later, and the selection switch 4.
[0035]
The striping / ECC unit 7 divides the input data (striping). The striped data is stored in the plurality of HDD units 15 to 18 via the HDD control units 10 to 13. Further, the striping / ECC unit 7 generates an error correction code for a data string spanning a plurality of HDD units 15 to 18 when data is divided (striped). This error correction code is stored in the HDD unit 19 via the HDD control unit 14.
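The division and error-correction-code generation described above can be sketched in Python as follows. The text does not say which error correction code the striping / ECC unit uses; byte-wise XOR parity is assumed here purely for illustration, and the function names `stripe_with_parity` and `recover_chunk` are hypothetical.

```python
from functools import reduce


def stripe_with_parity(data: bytes, num_data_hdds: int, width: int):
    """Split one ABF of data into per-drive chunks plus an XOR parity chunk.

    `data` must be exactly num_data_hdds * width bytes long.
    XOR parity is an assumption; the source only says "error correction code".
    """
    if len(data) != num_data_hdds * width:
        raise ValueError("data must be exactly one ABF long")
    chunks = [data[i * width:(i + 1) * width] for i in range(num_data_hdds)]
    parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))
    return chunks, parity


def recover_chunk(chunks, parity, lost_index: int) -> bytes:
    """Rebuild one lost chunk from the surviving chunks and the parity."""
    survivors = [c for i, c in enumerate(chunks) if i != lost_index]
    return bytes(reduce(lambda a, b: a ^ b, col)
                 for col in zip(parity, *survivors))


# Four data drives (like HDD units 15 to 18) and one parity drive (like 19).
chunks, parity = stripe_with_parity(b"video frame data", num_data_hdds=4, width=4)
rebuilt = recover_chunk(chunks, parity, lost_index=2)
```

Under this assumption, the chunk for any single failed data drive can be recomputed from the other three chunks and the parity chunk.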
[0036]
The retry temporary buffer memory 21 and the rebuild buffer memory 22 are provided so that retry and rebuild can be performed efficiently in the disk array device of the present embodiment. The retry temporary buffer memory 21 has a ring buffer structure into which data from the input / output interface unit 2 is written. The rebuild buffer memory 22 also has a ring buffer structure, into which data read from the HDDs 15 to 18 and error-corrected using the error correction code from the HDD unit 19 is written. The data from the retry temporary buffer memory 21 and the data from the rebuild buffer memory 22 are sent to the selection switch 4. Although details will be described later, in this embodiment, when for example an error occurs in reading or writing in any of the HDDs 15 to 19, data from the retry temporary buffer memory 21 or the rebuild buffer memory 22 is used in an empty time slot to reconstruct the data (recovery operation and rebuild operation at the time of error). The retry temporary buffer memory 21 and the rebuild buffer memory 22 are not so-called cache memories like the data cache unit 132 shown in FIG. 10. That is, the data stored in the retry temporary buffer memory 21 is a copy of the data sent from the input / output interface unit 2 to the selection switch 4, and the data stored in the rebuild buffer memory 22 is a copy of the data sent from the striping / ECC unit 7 to the input / output interface unit 2. Neither requires complicated control such as cache-hit handling, as a so-called cache memory does.
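A ring buffer of the kind described above can be sketched in a few lines of Python. The class name `RingBuffer`, its capacity, and the indexing by time slot number are assumptions for illustration; the text only states that the buffers have a ring buffer structure and hold plain copies of passing data.

```python
class RingBuffer:
    """Fixed-capacity ring buffer keyed by time slot (illustrative only)."""

    def __init__(self, capacity: int):
        self._slots = [None] * capacity
        self._head = 0  # next position to overwrite

    def put(self, time_slot: int, data: bytes) -> None:
        """Store a copy of the data that passed through in `time_slot`."""
        self._slots[self._head] = (time_slot, bytes(data))
        self._head = (self._head + 1) % len(self._slots)

    def get(self, time_slot: int):
        """Return the data held for `time_slot`, or None if overwritten."""
        for entry in self._slots:
            if entry is not None and entry[0] == time_slot:
                return entry[1]
        return None


# Hypothetical capacity of 4 entries; copies for TS1..TS6 pass through.
retry_buffer = RingBuffer(capacity=4)
for ts in range(1, 7):
    retry_buffer.put(ts, f"TS{ts}".encode())
still_held = retry_buffer.get(3)   # TS3 is among the last four entries
overwritten = retry_buffer.get(1)  # TS1 has been overwritten
```

Unlike a cache, nothing here decides what to keep based on hits or misses: the oldest copy is simply overwritten, so the capacity has to be large enough for failed data to survive until an empty time slot arrives.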
[0037]
The data timing controller 3 controls the data read / write timing in the input / output interface unit 2, in the striping / ECC unit 7, in the retry temporary buffer memory 21, and in the rebuild buffer memory 22. It also controls the timing at which the selection switch 4 switches among the data sent from the input / output interface unit 2 to the striping / ECC unit 7, the data sent from the retry temporary buffer memory 21 to the striping / ECC unit 7, and the data sent from the rebuild buffer memory 22 to the striping / ECC unit 7.
[0038]
Data reading / writing to the HDD units 15 to 19 is performed through the HDD control units 10 to 14, and the timing of data reading / writing to the HDDs 15 to 19 by the HDD control units 10 to 14 is generated by the striping / ECC unit 7.
[0039]
The CPU unit 5 controls the operations of the input / output interface unit 2, the data timing controller 3, the striping / ECC unit 7, and the HDD control units 10 to 14 through the CPU bus. The CPU unit 5 transmits and receives commands and statuses to and from the IOPs 121 to 124 in FIG. 11 through the CPU bus and the input / output interface unit 2. Further, as will be described in detail later, the CPU unit 5 manages the error location of the HDD and performs data reconstruction later when a write error to each of the HDDs 15 to 19 occurs.
[0040]
Next, FIG. 2 shows the configuration of the HDD control units 10-14.
[0041]
In FIG. 2, the input and output FIFO (first-in first-out) memories 43 and 44 temporarily store data supplied from the striping / ECC unit 7 and data to be sent to the striping / ECC unit 7, respectively, and are provided to absorb the asynchronous behavior of the HDD.
[0042]
The SPC 42 controls the HDD through the SCSI bus.
[0043]
The SPC 42 and the data stream controller 41 are controlled by the CPU unit 5 in FIG. 1 via the CPU bus.
[0044]
Further, the data stream controller 41 controls the timing of data reading / writing to the input FIFO memory 43 and the output FIFO memory 44 and the timing of data reading / writing to the SPC 42 according to the timing control from the striping / ECC unit 7.
[0045]
Next, an example of the recovery operation at the time of a write error using the retry temporary buffer memory 21 will be described with reference to FIG. 3, together with the configuration of the video server system of FIG. 11.
[0046]
Here, the disk array device of the present embodiment is accessed by the IOPs 121 to 124 by time-division multiplexing according to the time slots generated by the time slot generation unit 125 of the video server system of FIG. 11. In the example of FIG. 3, the time slots are denoted TS1, TS2, TS3, ..., and these time slots are used by the four IOPs 121 to 124 of FIG. 11 for time-division multiplexed access. In the example of FIG. 3, it is assumed that data is written from the IOPs 121 to 124 of FIG. 11 to the HDD units of FIG. 1. Under these conditions, assume for example that writing to an HDD unit fails in time slot TS3.
[0047]
At this time, the CPU unit 5 performs control via the HDD control unit so that the HDD does not retry on its own, and manages the location on the HDD unit where the data of time slot TS3 should have been written. The CPU unit 5 also does not clear the retry temporary buffer memory 21 via the data timing controller 3; instead, it holds the input data, that is, the data of time slot TS3 for which writing to the HDD unit failed. Thereafter, the process proceeds to the next time slot TS4, and normal data reading and writing are performed.
[0048]
Next, when there is an empty time slot without a read / write request from the IOP, the CPU unit 5 controls the selection switch 4 via the data timing controller 3 to switch the data flow, sends the data from the retry temporary buffer memory 21 to the striping / ECC unit 7, and further controls the HDD control unit to rewrite (retry) the data to the corresponding HDD unit. In the example of FIG. 3, the retry is performed in the system time slot TS7, an empty time slot without a read / write request from the IOP. Thereafter, the process proceeds to the next time slot TS8, and normal data reading and writing are performed.
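The deferred retry described for FIG. 3 can be traced with a small Python simulation. The function `run_write_schedule`, its data structures, and the assumption that the deferred retry itself succeeds are illustrative only and do not come from the specification.

```python
def run_write_schedule(slots, failing_slots):
    """Simulate deferred retry of failed writes in empty time slots.

    slots: list of (name, payload); payload is None for an empty
           (system) time slot with no IOP request.
    failing_slots: names of slots whose first write attempt fails.
    Returns (log, error_table). The retry is assumed to succeed.
    """
    pending = []       # (slot_name, payload) kept in the retry buffer
    error_table = []   # failed locations managed by the CPU
    log = []
    for name, payload in slots:
        if payload is not None:
            if name in failing_slots:
                # No drive-level retry: record the location, keep the data.
                error_table.append(name)
                pending.append((name, payload))
                log.append((name, "write failed, deferred"))
            else:
                log.append((name, "write ok"))
        elif pending:
            failed_name, _ = pending.pop(0)
            error_table.remove(failed_name)
            log.append((name, f"retry of {failed_name}"))
        else:
            log.append((name, "idle"))
    return log, error_table


# TS1..TS8 as in the FIG. 3 narrative: TS3 fails, TS7 is an empty slot.
slots = [(f"TS{i}", None if i == 7 else b"data") for i in range(1, 9)]
log, remaining = run_write_schedule(slots, {"TS3"})
```

The point of the design is visible in the log: TS4 to TS6 proceed normally after the failure, and the only slot consumed by recovery is one that had no request anyway.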
[0049]
Next, an example of the rebuild operation using the rebuild buffer memory 22 will be described with reference to FIG. 4, together with the configuration of the video server system of FIG. 11.
[0050]
Here, when there is an empty time slot without a read / write request from the IOP, the CPU unit 5 reads the rebuild data from the HDD unit via the HDD control unit, controls the striping / ECC unit 7 so that the rebuild data read from the HDD unit is error-corrected, and controls the data timing controller 3 so that the error-corrected rebuild data is held in the rebuild buffer memory 22. In the example of FIG. 4, the rebuild data is read from the HDD in the system time slot TS2, an empty time slot without a read / write request from the IOP, and held in the rebuild buffer memory 22. Thereafter, the process proceeds to the next time slot TS4, and normal data reading and writing are performed.
[0051]
Next, when there is a next free time slot without a read / write request from the IOP, the CPU unit 5 controls the data timing controller 3 to read the rebuild data held in the rebuild buffer memory 22. Furthermore, the selection switch 4 is controlled via the data timing controller 3 to switch the data flow, and the rebuild data from the rebuild buffer memory 22 is sent to the striping / ECC unit 7 to add the parity. After that, the HDD control unit is controlled to write the rebuild data to the corresponding HDD unit. In the example of FIG. 4, the rebuild data is written to the HDD unit in the system time slot TS7 as the next free time slot without a read / write request from the IOP. Thereafter, the process proceeds to the next time slot TS8, and normal data reading and writing are performed.
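The two-phase rebuild described above (read and correct in one empty slot, write back in the next) can be sketched as follows. The function `rebuild_in_two_slots` and its callback parameters are hypothetical names chosen for this illustration.

```python
def rebuild_in_two_slots(slots, read_and_correct, write_back):
    """Run one rebuild unit across two empty time slots.

    slots: iterable of (name, is_empty) pairs in time order.
    read_and_correct: callable returning error-corrected rebuild data.
    write_back: callable taking that data and writing it to the drive.
    Returns the names of the slots used for the read and write phases.
    """
    rebuild_buffer = None   # stands in for the rebuild buffer memory
    read_slot = write_slot = None
    for name, is_empty in slots:
        if not is_empty:
            continue                        # normal IOP traffic is untouched
        if read_slot is None:
            rebuild_buffer = read_and_correct()   # phase 1: read + correct
            read_slot = name
        else:
            write_back(rebuild_buffer)            # phase 2: write to the HDD
            write_slot = name
            break
    return read_slot, write_slot


# TS1..TS8 with TS2 and TS7 empty, following the FIG. 4 narrative.
slots = [(f"TS{i}", i in (2, 7)) for i in range(1, 9)]
written = []
phases = rebuild_in_two_slots(slots, lambda: b"rebuilt", written.append)
```

Between the two phases the corrected data simply waits in the buffer, which is why the rebuild buffer must hold its contents across several normal time slots.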
[0052]
In the present embodiment, rebuilding is split across two time slots because the seek time of the HDD is far longer than its actual read / write time. If the rebuild operation were performed within a single time slot, the amount of data that could be rebuilt at one time would drop to about one-tenth, and the efficiency would be very low.
[0053]
As described above, the first embodiment of the present invention includes the retry temporary buffer memory 21 and the rebuild buffer memory 22, so that the data flow during normal operation and the data flow during failure recovery can be separated. As a result, the failure recovery operation is prevented from affecting the normal reading and writing of the disk array device.
[0054]
In the first embodiment, the HDD data can be efficiently reconstructed as described above.
[0055]
Next, a second embodiment of the present invention will be described.
[0056]
In the second embodiment of the present invention, as in the first embodiment described above, the size of the logical block is optimized to be an integral multiple of “1 sector (usually 512 bytes) × the number of HDDs”, and the ABF (array blocking factor) is matched to the size of the rewritten data. This eliminates the overhead during writing and prevents a decrease in the transfer rate. As a result, the second embodiment does not need a data cache for reducing write overhead.
[0057]
Further, in the second embodiment of the present invention, as in the first embodiment described above, retry and reassignment of a single HDD are prohibited to ensure that reading and writing are completed within a certain time. This ensures continuous reading / writing of continuous data and time division multiplexing using a plurality of channels. That is, also in the second embodiment, the CPU manages the place where an error has occurred at the time of writing, and the data is reconstructed later, so that the reliability of the data can be maintained.
[0058]
Furthermore, in the second embodiment of the present invention, to avoid performance degradation during failure recovery, a plurality of sets of buffers are prepared for each HDD, and these sets of buffers are used selectively by switching between the data flow during normal operation and the data flow during failure recovery with a multiplexer. This makes it possible to quickly restore data to an HDD in which an error occurred during writing by using a time slot for the system or a free time slot. It also improves operational efficiency when, for example, an HDD is replaced and its data is restored.
[0059]
FIG. 5 shows a schematic configuration of the disk array device according to the second embodiment of the present invention. In the second embodiment, the case where the disk array device is used as a storage medium of the video server system as shown in FIG. 11 will be described as an example. Hereinafter, in the second embodiment, a description will be given centering on portions that are different from the first embodiment described above.
[0060]
In FIG. 5, an input / output interface unit 52 controls transmission / reception of commands / status and input / output data between the disk array device of the present embodiment and the IOPs 121 to 124 of the video server system shown in FIG. 11. The data input to the input / output interface unit 52 is sent to the striping / ECC unit 57.
[0061]
The striping / ECC unit 57 divides the input data (striping) as in the case of the first embodiment. The striped data is stored in the plurality of HDD units 65 to 68 via the HDD control units 60 to 63. The striping / ECC unit 57 generates an error correction code for a data string extending over the plurality of HDD units 65 to 68. This error correction code is stored in the HDD unit 69 via the HDD control unit 64.
[0062]
The data timing controller 53 performs data read / write timing control in the input / output interface unit 52 and data read / write timing control in the striping / ECC unit 57.
[0063]
Data reading / writing to the HDD units 65 to 69 is performed through the HDD control units 60 to 64, and the timing of data reading / writing to the HDDs 65 to 69 in the HDD control units 60 to 64 is generated by the striping / ECC unit 57.
[0064]
The CPU unit 55 controls operations of the input / output interface unit 52, the data timing controller 53, the striping / ECC unit 57, and the HDD control units 60 to 64 through the CPU bus. The CPU unit 55 also transmits and receives commands and statuses to and from the IOPs 121 to 124 in FIG. 11 through the CPU bus and the input / output interface unit 52. Further, when a write error to an HDD unit occurs, the CPU unit 55 manages the error location on the HDD unit and performs data reconstruction later.
[0065]
Next, FIG. 6 shows the configuration of the HDD control units 60 to 64.
[0066]
In FIG. 6, in the case of the present embodiment, each of the HDD control units 60 to 64 has two or more sets of input FIFO memories (three in the example of FIG. 6, the input FIFO memories 77 to 79) and one set of two output FIFO memories (the output FIFO memories 80 and 81 in the example of FIG. 6). Two or more sets of input FIFOs and one set of two output FIFOs are provided in this way so that rebuilding can be performed efficiently, as will be described later. The input and output FIFOs 77 to 81 temporarily store the data supplied from the striping / ECC unit 57 in FIG. 5 and the data to be sent to the striping / ECC unit 57, and are provided to absorb the asynchronous behavior of the HDD.
[0067]
The selection switch 71 is a changeover switch for distributing the data from the striping / ECC unit 57 to any one of the input FIFOs 77 to 79, and the selection switch 72 is a changeover switch for selecting the data from either of the output FIFOs 80 and 81 and sending it to the striping / ECC unit 57.
[0068]
The SPC 76 controls the HDD through the SCSI bus.
[0069]
The SPC 76 and the data stream controller 75 are controlled by the CPU unit 55 in FIG. 5 via the CPU bus.
[0070]
In accordance with the timing control from the striping / ECC unit 57, the data stream controller 75 controls the timing of reading / writing data to / from the input FIFO memories 77 to 79 and the output FIFO memories 80 and 81, the bank switching of these input FIFO memories 77 to 79 and output FIFO memories 80 and 81 (switching control of the selection switches 71 and 72), and the timing of data reading / writing to the SPC 76.
[0071]
According to the present embodiment, with the configuration shown in FIG. 5 and FIG. 6, the required memory speed is only a fraction (one over the number of data-storage HDDs) of that of the cache memory of the data cache unit 132 in the conventional configuration shown in FIG. 10. The cost of the memory itself is therefore low, and the circuit configuration can be simplified.
[0072]
Next, an example of the recovery (system retry) operation at the time of a write error in the configuration of the second embodiment will be described with reference to FIG. 7, together with the configuration of the video server system of FIG. 11.
[0073]
Also in the second embodiment, as in the first embodiment, the time slots TS1, TS2, TS3, ... are used by the four IOPs 121 to 124 of FIG. 11 for time-division multiplexed access. In the example of FIG. 7, it is assumed that in the normal state data is written from the IOPs 121 to 124 of FIG. 11 to the HDD unit of FIG. 5 through the input FIFO 77. Under these conditions, assume for example that writing to the HDD unit fails in time slot TS3.
[0074]
At this time, the CPU unit 55 controls not to retry the single HDD unit via the HDD control unit, and manages the location on the HDD unit where the time slot TS3 should have been written. At this time, the CPU unit 55 does not clear the data in the input FIFO 77 in the HDD control unit, but holds the data in the time slot TS3 when writing to the HDD unit fails. Next, the CPU unit 55 controls the data stream controller 75 to send data from the next time slot TS4 to the SPC 76 via the input FIFO 78 and control the writing to the HDD unit.
[0075]
Thereafter, when there is an empty time slot without a read / write request from the IOP, the CPU unit 55 controls the data stream controller 75 to send the data from the input FIFO 77 to the SPC 76 so that it is written again (retried) to the HDD unit. In the example of FIG. 7, the retry is performed in the system time slot TS7, an empty time slot without a read / write request from the IOP. Thereafter, the CPU unit 55 controls the data stream controller 75 so that, from the next time slot TS8, data is sent to the SPC 76 via the input FIFO 77 and written to the HDD unit.
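The bank switching of the input FIFOs during a write failure and retry can be modeled roughly as below. The class `InputFifoBanks` and its methods are hypothetical, the bank numbers merely echo the reference numerals 77 to 79, and the sketch handles only one outstanding failure at a time.

```python
class InputFifoBanks:
    """Bank-switched input FIFOs in one HDD control unit (illustrative)."""

    def __init__(self, bank_ids=(77, 78, 79)):
        self.banks = {b: None for b in bank_ids}  # bank id -> (slot, data)
        self.active = bank_ids[0]                 # bank used for normal writes
        self.held = None                          # bank kept for a later retry

    def write(self, slot: str, data: bytes, drive_ok: bool) -> str:
        """Pass one slot's data through the active bank toward the drive."""
        self.banks[self.active] = (slot, data)
        if drive_ok:
            return f"{slot}: written via FIFO {self.active}"
        # Failure: keep the data in place and move traffic to another bank.
        self.held = self.active
        self.active = next(b for b in self.banks if b != self.held)
        return f"{slot}: failed, held in FIFO {self.held}"

    def retry(self, slot: str) -> str:
        """Use an empty slot to rewrite the held data, then switch back."""
        if self.held is None:
            return f"{slot}: idle"
        failed_slot, _ = self.banks[self.held]
        bank, self.held = self.held, None
        self.active = bank   # traffic returns to the original FIFO
        return f"{slot}: retry of {failed_slot} from FIFO {bank}"


banks = InputFifoBanks()
log = [
    banks.write("TS1", b"a", True),
    banks.write("TS2", b"b", True),
    banks.write("TS3", b"c", False),   # write failure in TS3
    banks.write("TS4", b"d", True),
    banks.retry("TS7"),                # empty system slot
    banks.write("TS8", b"e", True),
]
```

Because the failed data never leaves its FIFO, no separate retry buffer is needed in this embodiment; the spare bank is what keeps normal traffic moving.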
[0076]
Next, an example of the rebuild operation in the second embodiment will be described with reference to FIG. 8, together with the configuration of the video server system of FIG. 11.
[0077]
Here, it is assumed that reading of data to each of the IOPs 121 to 124 is performed through, for example, the output FIFO 80 in a normal state.
[0078]
Under these conditions, when there is an empty time slot without a read / write request from the IOP, the CPU unit 55 reads the rebuild data from the HDD unit via the SPC 76 and further controls the data stream controller 75 so that the rebuild data read from the HDD unit is held in the output FIFO 80. In the example of FIG. 8, the rebuild data is read from the HDD unit in the system time slot TS2, an empty time slot without a read / write request from the IOP, and held in the output FIFO 80.
[0079]
Next, the CPU section 55 controls the data stream controller 75 so that the data read from the HDD section is sent to the IOP via the output FIFO 81 from the next time slot TS3.
[0080]
Next, when there is a next free time slot without a read / write request from the IOP, the CPU unit 55 controls the data stream controller 75 to read the rebuild data from the output FIFO 80 and send it to the striping / ECC unit 57, where the parity is added. The CPU unit 55 then controls the data stream controller 75 so that the rebuild data is written to the HDD unit via the input FIFO 79 and the SPC 76. In the example of FIG. 8, the rebuild data already held in the output FIFO 80 is read in the system time slot TS7, the next free time slot without a read / write request from the IOP, and is written to the HDD unit via the input FIFO 79.
[0081]
Thereafter, the CPU unit 55 controls the data stream controller 75 so that the data read from the HDD unit is sent to the IOP via the output FIFO 80 from the next time slot TS8.
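The sequence of FIFO usage described for FIG. 8 can be summarized with the following Python trace. The function `rebuild_trace` is a hypothetical helper; it assumes every non-empty slot is a normal read and that exactly one rebuild unit is processed.

```python
def rebuild_trace(slots, empty_slots):
    """Trace which FIFO each slot uses during one rebuild (FIG. 8 narrative).

    Normal reads use output FIFO 80. The first empty slot parks rebuild
    data in FIFO 80, so normal reads move to FIFO 81 until the second
    empty slot drains FIFO 80 through input FIFO 79 back to the drive.
    """
    read_bank = 80
    parked = False   # True while rebuild data sits in output FIFO 80
    done = False
    trace = []
    for slot in slots:
        if slot in empty_slots and not done:
            if not parked:
                parked, read_bank = True, 81
                trace.append((slot, "rebuild read -> output FIFO 80"))
            else:
                parked, read_bank, done = False, 80, True
                trace.append(
                    (slot, "output FIFO 80 -> ECC -> input FIFO 79 -> HDD"))
        elif slot in empty_slots:
            trace.append((slot, "idle"))
        else:
            trace.append((slot, f"normal read via output FIFO {read_bank}"))
    return trace


slots = [f"TS{i}" for i in range(1, 9)]
trace = rebuild_trace(slots, empty_slots={"TS2", "TS7"})
```

The trace shows normal reads moving to output FIFO 81 from TS3 and returning to output FIFO 80 from TS8, matching the description above.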
[0082]
In the second embodiment, as in the first embodiment, the rebuild is divided into two time slots because the seek time of the HDD is far longer than its actual read / write time. If the rebuild operation were performed within a single time slot, the amount of data that could be rebuilt at one time would drop to about one-tenth, which is very inefficient.
[0083]
In the second embodiment, preparing a plurality of sets of input and output FIFOs makes it possible to separate the data flow during normal operation from the data flow during failure recovery. As a result, the failure recovery operation is prevented from affecting the normal reading and writing of the disk array device.
[0084]
In the second embodiment, HDD data can be efficiently reconstructed as described above.
[0085]
That is, according to each embodiment of the present invention described above, applying the disk array device to the video server system makes it possible to satisfy each of the following performance requirements:
(1) Large capacity
(2) High transfer rate
(3) High reliability
(4) Good random accessibility
(5) Simultaneous access to the same material from multiple channels
(6) No loss of data continuity
(7) Efficient recovery from a failure
(8) No performance degradation during failure recovery
[0086]
[Effect of the invention]
As is apparent from the above description, the disk array device of the present invention has first and second buffer memories with a ring buffer structure for input data and output data; data dividing and combining means that divides data and sequentially supplies it to a plurality of disk drives and combines the data supplied from the plurality of disk drives; and control means that manages the operation status of the disk drives and the processing time slots and controls the reading / writing of each buffer memory and the recording / reproducing operations of the disk drives. By reconstructing data in empty time slots using the data from the first and second buffer memories, it can prevent overhead at the time of writing, guarantee real-time performance, and prevent performance degradation during failure recovery.
[0087]
Further, the disk array device of the present invention has data dividing and combining means that divides data and sequentially supplies it to a plurality of disk drives and combines the data supplied from the plurality of disk drives; first and second first-in first-out memories, each a set of at least two, for the data recorded to and reproduced from the disk drives; and control means that manages the operation status of the disk drives and the processing time slots and controls the recording / reproducing operations of the disk drives and the bank switching of the first and second first-in first-out memories. By reconstructing data in empty time slots using the data from the first and second first-in first-out memories, it can prevent overhead at the time of writing, guarantee real-time performance, and prevent performance degradation during failure recovery.
[Brief description of the drawings]
FIG. 1 is a block circuit diagram showing a schematic configuration of a disk array device according to a first embodiment of the invention applicable to a video server system.
FIG. 2 is a block circuit diagram showing a schematic configuration of an HDD control unit in the disk array device according to the first embodiment;
FIG. 3 is a diagram used for explaining a recovery (retry) operation when a write error to the HDD occurs in the disk array device according to the first embodiment;
FIG. 4 is a diagram used for explaining a rebuild operation in the disk array device according to the first embodiment;
FIG. 5 is a block circuit diagram showing a schematic configuration of a disk array device according to a second embodiment of the invention applicable to a video server system.
FIG. 6 is a block circuit diagram showing a schematic configuration of an HDD control unit in the disk array device according to the second embodiment;
FIG. 7 is a diagram used for explaining a recovery (retry) operation when an HDD write error occurs in the disk array device according to the second embodiment;
FIG. 8 is a diagram used for explaining a rebuild operation in the disk array device according to the second embodiment;
FIG. 9 is a diagram used to explain the concept of a disk array device.
FIG. 10 is a block circuit diagram showing a schematic configuration of a conventional general disk array device.
FIG. 11 is a block circuit diagram showing a schematic configuration of a video server system.
[Explanation of symbols]
2, 52 input / output interface unit; 3, 53 data timing controller; 4, 71, 72 selection switch; 5, 55 CPU unit; 7, 57 striping / ECC unit; 10-14, 60-64 HDD control unit; 15-19, 65-69 HDD unit; 21 retry temporary buffer memory; 22 rebuild buffer memory; 43, 77-79 input FIFO; 44, 80, 81 output FIFO; 41, 75 data stream controller; 42, 76 SPC

Claims

In a disk array device configured by a plurality of disk drives and accessed in a time-division multiple access according to time slots,
Data dividing and combining means for dividing the data input to the disk array device, or the data from a first buffer memory having a ring buffer structure that temporarily holds the input data, sequentially supplying the divided data to the plurality of disk drives, and combining the data supplied from the plurality of disk drives; and
Control means for managing the operation status of the plurality of disk drives and the time slots, and for controlling reading / writing of the first buffer memory and of a second buffer memory having a ring buffer structure that temporarily holds the data output from the data dividing and combining means after the data from the plurality of disk drives has been combined, and the recording / reproducing operations of the plurality of disk drives,
Wherein, when data writing to a disk drive fails, the control means prohibits rewriting by the disk drive alone, manages the location that should have been written, holds in the first buffer memory the input data of the time slot in which the data writing failed, and performs control so that the data held in the first buffer memory is written again to the corresponding location of the disk drive using the next empty time slot; and, when data is reconstructed, the control means reads the data to be reconstructed from the disk drive using an empty time slot, holds it in the second buffer memory after error correction, and performs control so that the data held in the second buffer memory is written to the disk drive using the next empty time slot.

The disk array device according to claim 1, wherein a logical block size as a unit of data read / write with respect to the data processing device is optimized to an integral multiple of sector size × number of storage means.

The disk array device according to claim 1, wherein the control means prohibits retry and reassignment in a time slot other than the empty time slot.

The disk array device according to claim 1, further comprising an input / output interface unit that is accessed by time-division multiplexing by a plurality of input / output devices of a server having the plurality of input / output devices.

In a disk array device configured by a plurality of disk drives and accessed in a time-division multiple access according to time slots,
Data division and coupling means for dividing the data input to the disk array device and sequentially supplying the divided data to the plurality of disk drives, and combining the data supplied from the plurality of disk drives;
A set of at least two first first-in first-out memories that temporarily hold data supplied from the data dividing and combining means and to be recorded in the disk drive;
A second first-in first-out memory for temporarily holding data reproduced from the disk drive; and
Control means for managing the operation status of the disk drive and the time slots, and for controlling the recording / reproducing operation of the disk drive and the bank switching of the first and second first-in first-out memories,
Wherein, when data writing to the disk drive fails, the control means prohibits rewriting by the disk drive alone, manages the location that should have been written, holds in the first first-in first-out memory the data supplied from the data dividing and combining means in the time slot in which the data writing failed, and performs control so that the data held in the first first-in first-out memory is written again to the corresponding location of the disk drive using the next empty time slot; and, when data is reconstructed, the control means reads the data to be reconstructed from the disk drive using an empty time slot, holds it in the second first-in first-out memory after error correction, and performs control so that the data held in the second first-in first-out memory is written to the disk drive using the next empty time slot.

The disk array device according to claim 5, wherein a logical block size as a unit of data read / write with respect to the disk array device is optimized to an integral multiple of sector size × number of disk drives.

The disk array device according to claim 5, wherein the control means prohibits retry and reassignment in a time slot other than the empty time slot.

The disk array device according to claim 5, further comprising an input / output interface unit that is accessed by time-division multiplexing by a plurality of input / output devices of a server having the plurality of input / output devices.

In a data recording / reproducing method of a disk array device constituted by a plurality of disk drives and time-division multiplexed access according to a time slot,
A data dividing and combining step of dividing the data input to the disk array device, or the data from a first buffer memory having a ring buffer structure that temporarily holds the input data, sequentially supplying the divided data to the plurality of disk drives, and combining the data supplied from the plurality of disk drives; and
A control step of managing the operation status of the plurality of disk drives and the time slots, and of controlling reading / writing of the first buffer memory and of a second buffer memory having a ring buffer structure that temporarily holds the data output from the data dividing and combining step after the data from the plurality of disk drives has been combined, and the recording / reproducing operations of the plurality of disk drives,
Wherein, in the control step, when data writing to a disk drive fails, rewriting by the disk drive alone is prohibited, the location that should have been written is managed, the input data of the time slot in which the data writing failed is held in the first buffer memory, and control is performed so that the data held in the first buffer memory is written again to the corresponding location of the disk drive using the next empty time slot; and, when data is reconstructed, the data to be reconstructed is read from the disk drive using an empty time slot and held in the second buffer memory after error correction, and control is performed so that the data held in the second buffer memory is written to the disk drive using the next empty time slot.