JP3845239B2 - Disk array device and failure recovery method in disk array device

Info

Publication number: JP3845239B2
Application number: JP36316399A
Authority: JP
Inventors: 和敏本尾
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1999-12-21
Filing date: 1999-12-21
Publication date: 2006-11-15
Anticipated expiration: 2019-12-21
Also published as: JP2001175423A

Description

【０００１】
【発明の属する技術分野】
本発明はディスクアレイ装置に関し、特にディスクアレイ装置における障害復旧方法に関する。
【０００２】
【従来の技術】
ディスクアレイ装置は、物理ディスクである大容量ハードディスクドライブ（ＨＤＤ）の搭載により、数十ＴＢ（テラバイト）を有する規模になってきている。従来装置からの移行性を重視して、従来と同じ容量の論理ディスクを提供している。これは、複数の物理ディスクから構成される１つのアレイランクを論理的に複数に分割する技術で実現されている。
【０００３】
【発明が解決しようとする課題】
このようなディスクアレイ装置において、物理ディスクであるハードディスクドライブの故障が発生した場合、アレイランク内で故障したハードディスクドライブに関わる全論理ディスクが障害状態になり、復旧完了時までの長時間に渡り、上位からのアクセスに対して性能低下が発生したり、信頼性が低下してしまうという問題がある。
【０００４】
さらに、最近のディスクアレイ装置は、大容量ＨＤＤを搭載しており、障害発生から復旧完了までの時間が長くなっており、今後もＨＤＤの大容量化が行われるため、ますます復旧完了までの時間が長くなる傾向にある。
【０００５】
図７は、従来のディスクアレイ装置のブロック図である。ホスト１とディスクアレイ装置９０２は、上位Ｉ／Ｆ３で接続されている。ホスト１からのＲＥＡＤ／ＷＲＩＴＥアクセス指令は、上位Ｉ／Ｆ制御部２０１で受信され、認識される。論理ディスクアクセス制御部２０２は、上位Ｉ／Ｆ制御部２０１からの指示により、ＲＥＡＤ／ＷＲＩＴＥアクセス対象の論理ディスク，アドレス，転送長を認識し、アレイ制御部２０３に対して指示を行う。
【０００６】
アレイ制御部２０３は、論理ディスクアクセス制御部２０２からの指示（アクセスの種類，アドレス，転送長）により、ＲＥＡＤ／ＷＲＩＴＥアクセス対象の論理ディスクを格納する物理ディスク２０５を制御するため、ＨＤＤ制御部２０４に指示を行う。
【０００７】
ＨＤＤ制御部２０４は、アレイ制御部２０３からの指示により、物理ディスク２０５を制御する。
【０００８】
物理ディスク２０５でデータ転送の準備が完了後、ＲＥＡＤまたはＷＲＩＴＥのデータ転送処理を行う。データ転送中、ＨＤＤ制御部２０４は、データ転送の終了を監視する。
【０００９】
データ転送終了後、ＨＤＤ制御部２０４は、アレイ制御部２０３を経由して、論理ディスクアクセス制御部２０２にデータ転送終了を報告する。
【００１０】
論理ディスクアクセス制御部２０２は、データ転送終了を受信した後、上位Ｉ／Ｆ制御部２０１に対して、データ転送終了を報告する。
【００１１】
上位Ｉ／Ｆ制御部２０１は、データ転送終了を受信して、ホスト１に対してＲＥＡＤ／ＷＲＩＴＥアクセスの終了を報告する。
【００１２】
以上の動作により、一連のＩ／Ｏ動作（ＲＥＡＤ／ＷＲＩＴＥアクセス）が行われる。
【００１３】
このようなディスクアレイ装置は、ＨＤＤが障害した時、ＨＤＤに記憶している複数の論理ディスクに影響を与える。
【００１４】
例として、論理ディスク“Ａ”，“Ｂ”，“Ｃ”，“Ｄ”が、図２のように配置されている場合を考える。この場合において、論理ディスク”Ａ”は、Ａ１〜Ａ３より成り、論理ディスク”Ｂ”は、Ｂ１〜Ｂ３より成り、論理ディスク”Ｃ”は、Ｃ１〜Ｃ３より成り、論理ディスク”Ｄ”は、Ｄ１〜Ｄ３より成る。ＨＤＤ＃１、ＨＤＤ＃２、ＨＤＤ＃３、ＨＤＤ＃４は物理ディスクであり、ＨＤＤ＃１はＡ１、Ｂ２、Ｃ３より成り、ＨＤＤ＃２はＡ２、Ｂ３、Ｄ１より成り、ＨＤＤ＃３はＡ３、Ｃ１、Ｄ２より成り、ＨＤＤ＃４はＢ１、Ｃ２、Ｄ３より成る。ＨＤＤ＃２で障害が発生した時、論理ディスク“Ａ”のデータであるＡ２，論理ディスク“Ｂ”のデータであるＢ３，論理ディスク“Ｄ”のデータであるＤ１が障害となり、論理ディスク“Ａ”，“Ｂ”，“Ｄ”が影響を受ける（図３参照）。
【００１５】
障害復旧は、障害が発生していないＨＤＤを読み出して、データを復元した後、予め用意されたスペアＨＤＤまたは交換されたＨＤＤに書き戻す。しかしながら、ＨＤＤの全データが書き戻されるまでの間、たとえ一部の論理ディスクに関わるデータが復旧完了していたとしても、復旧完了とはならず、ＨＤＤ全データが復旧した時に、復旧完了となる。
【００１６】
この結果、長時間に渡り、複数の論理ディスクが、性能低下や信頼性低下の影響を受けるという問題がある。
【００１７】
図８は、図３の構成において、復旧開始から復旧完了までの経過を書いた図であり、データＡ２が復旧しても、データＢ３，Ｄ１が復旧完了するまで、復旧処理が完了しないことを示している。
【００１８】
本発明は、全ての論理ディスクが復旧が完了する前に特定の論理ディスクを上位からアクセスすることが可能なディスクアレイ装置及びその方法を提供することを目的とする。
【００１９】
【課題を解決するための手段】
本発明によるディスクアレイ装置は、物理ディスクに障害が発生したことを検出する手段と、障害が発生した前記物理ディスクが割り当てられている論理ディスクのうちの優先論理ディスクに割り当てられている物理ディスクのうちの障害が発生していない物理ディスクから前記優先論理ディスクに係る全てのデータを読み出す第１の読出手段と、前記第１の読出手段により読み出されたデータから前記優先論理ディスクのデータのうちの障害が発生した前記物理ディスクに格納されていた全てのデータを復旧する第１の復旧手段と、前記第１の復旧手段により復旧された前記データを一時的に格納する第１の記憶手段と、障害が発生した前記物理ディスクが割り当てられている論理ディスクのうちの優先論理ディスクでない論理ディスクに割り当てられている物理ディスクのうちの障害が発生していない物理ディスクから前記優先論理ディスクでない論理ディスクに係る全てのデータを読み出す第２の読出手段と、前記第２の読出し手段により読み出されたデータから前記優先論理ディスクでない論理ディスクのデータのうちの障害が発生した前記物理ディスクに格納されていた全てのデータを復旧する第２の復旧手段と、前記第１の記憶手段に一時的に格納されたデータ及び前記第２の復旧手段により復旧されたデータを格納する第２の記憶手段と、を備えることを特徴とする。
【００２０】
また、本発明によるディスクアレイ装置は、上記のディスクアレイ装置において、ホストより指定された論理ディスクの識別子を記憶する第２の記憶手段を備え、前記読み出し手段は前記第２の記憶手段に記憶されている識別子を有する論理ディスクを前記優先論理ディスクとすることを特徴とする。
【００２１】
更に、本発明によるディスクアレイ装置は、上記のディスクアレイ装置において、前記優先論理ディスクへのアクセスがホストからあったときに、前記優先論理ディスクに割り当てられている物理ディスクのうちの障害が発生していない物理ディスクと前記第１の記憶手段にアクセスするアクセス手段を更に備えることを特徴とする。
【００２２】
更に、本発明によるディスクアレイ装置は、上記のディスクアレイ装置において、前記第１の記憶手段は不揮発性の半導体メモリであることを特徴とする。
【００２３】
本発明によるディスクアレイ装置における障害復旧方法は、物理ディスクに障害が発生したことを検出するステップと、障害が発生した前記物理ディスクが割り当てられている論理ディスクのうちの優先論理ディスクに割り当てられている物理ディスクのうちの障害が発生していない物理ディスクから前記優先論理ディスクに係る全てのデータを読み出す第１の読出ステップと、前記第１の読出ステップにより読み出されたデータから前記優先論理ディスクのデータのうちの障害が発生した前記物理ディスクに格納されていた全てのデータを復旧する第１の復旧ステップと、前記第１の復旧ステップにより復旧された前記データを第１の記憶手段に一時的に格納する第１の記憶ステップと、障害が発生した前記物理ディスクが割り当てられている論理ディスクのうちの優先論理ディスクでない論理ディスクに割り当てられている物理ディスクのうちの障害が発生していない物理ディスクから前記優先論理ディスクでない論理ディスクに係る全てのデータを読み出す第２の読出ステップと、前記第２の読出しステップにより読み出されたデータから前記優先論理ディスクでない論理ディスクのデータのうちの障害が発生した前記物理ディスクに格納されていた全てのデータを復旧する第２の復旧ステップと、前記第１の記憶手段に一時的に格納されたデータ及び前記第２の復旧ステップにより復旧されたデータを第２の記憶手段に格納する第２の記憶ステップと、を有することを特徴とする。
【００２４】
また、本発明によるディスクアレイ装置における障害復旧方法は、上記のディスクアレイ装置における障害復旧方法において、ホストより指定された論理ディスクの識別子を記憶する第２の記憶ステップを備え、前記第１の読み出しステップでは前記第２の記憶ステップで記憶された識別子を有する論理ディスクを前記優先論理ディスクとすることを特徴とする。
【００２５】
更に、本発明によるディスクアレイ装置における障害復旧方法は、上記のディスクアレイ装置における障害復旧方法において、前記優先論理ディスクへのアクセスがホストからあったときに、前記優先論理ディスクに割り当てられている物理ディスクのうちの障害が発生していない物理ディスクと前記第１の記憶手段にアクセスするアクセスステップを更に有することを特徴とする。
【００２６】
【発明の実施の形態】
本発明は、大容量の物理ディスクを使用し、一つのアレイランク内に複数の論理ディスクを有するディスクアレイ装置において、不揮発性の半導体メモリを用意し、物理ディスクに障害が発生した場合、上位装置から予め設定された論理ディスク値を参照して、この論理ディスク値で示される論理ディスクで障害により失ったデータを優先的に不揮発性の半導体メモリに復旧させる。
【００２７】
これにより、障害が発生した物理ディスク全領域の復旧が完了するまでの長時間を待たずに、特定の論理ディスクを優先的に復旧でき、性能低下及び信頼性低下の時間を短くすることができる。
【００２８】
図１は、本発明の一実施形態として、ブロック図を示している。
【００２９】
本実施形態のディスクアレイ装置２は、ホスト１と上位Ｉ／Ｆ３で接続されている。
【００３０】
本実施形態のディスクアレイ装置２は、上位Ｉ／Ｆ制御部２０１，論理ディスクアクセス制御部２０２，アレイ制御部２０３，ＨＤＤ制御部２０４，物理ディスク２０５，優先復旧ディスク指示部２０６，復旧制御部２０７，ＨＤＤ障害検出部２０８，優先ディスクメモリ部２０９から構成されている。
【００３１】
上位Ｉ／Ｆ制御部２０１は、上位装置であるホスト１との転送を制御する。論理ディスクアクセス制御部２０２は、上位Ｉ／Ｆ制御部２０１からの指示により、論理ディスクのアクセスを制御する。アレイ制御部２０３は、論理ディスクアクセス制御部２０２からの指示（アクセスの種類，アドレス，転送長）により、アクセス対象の論理ディスクを格納する物理ディスク２０５の制御を行うよう、ＨＤＤ制御部２０４に指示を行う。
【００３２】
ＨＤＤ制御部２０４は、アレイ制御部２０３からの指示により、物理ディスク２０５を制御する。
【００３３】
物理ディスク２０５は、ディスクにデータを記憶し、ＨＤＤ制御部２０４により制御される。優先復旧ディスク指示部２０６は、上位Ｉ／Ｆ制御部２０１を経由して、ホスト１から予め転送される復旧時の優先論理ディスクの値を格納する。
【００３４】
復旧制御部２０７は、ＨＤＤ障害検出部２０８からの指示により、優先復旧ディスク指示部２０６に格納されている値を参照して、アレイ制御部２０３を制御して、障害が発生したＨＤＤの内容を復旧させる。
【００３５】
ＨＤＤ障害検出部２０８は、物理ディスク２０５で障害が発生したことをＨＤＤ制御部２０４からの指示で認識する。
【００３６】
優先ディスクメモリ部２０９は、不揮発性の半導体メモリであり、ＨＤＤ障害復旧時に、優先復旧ディスク指示部２０６で格納している値の論理ディスクのデータを復旧するために使用するメモリである。
【００３７】
次に、本実施形態の動作について説明する。
【００３８】
はじめに、ホスト１から本ディスクアレイ装置２に対して、論理ディスク“Ａ”へのＲＥＡＤアクセスが発生した場合について、図１，図２を参照して説明する。
【００３９】
本ディスクアレイ装置２は、ＲＥＡＤアクセスまたはＷＲＩＴＥアクセス前に、優先復旧の論理ディスクの値が、予めホスト１から転送され、優先復旧ディスク指示部２０６に設定される。
【００４０】
優先復旧ディスクの値を設定する動作は、優先復旧ディスクの値として“Ａ”を優先復旧ディスク指示部２０６に設定する場合を例として説明する。
【００４１】
まずホスト１からディスクアレイ装置２に対して、優先復旧ディスク設定の指令を発行する。転送された指令は、上位Ｉ／Ｆ制御部２０１で受信され、指令の認識が行われる。上位Ｉ／Ｆ制御部２０１は、優先復旧ディスク設定の指令と認識し、ホスト１に対して設定するディスク値の転送を要求する。ホスト１から設定する優先復旧ディスク値“Ａ”が転送され、上位Ｉ／Ｆ制御部２０１経由で、優先復旧ディスク指示部２０６に転送される。
【００４２】
優先復旧ディスク指示部２０６は、優先復旧ディスク値“Ａ”を受信し、格納する。
【００４３】
このような動作により、優先復旧ディスク値“Ａ”の設定が行われる。
【００４４】
優先復旧ディスク値“Ａ”の設定後、ホスト１から論理ディスク“Ａ”へのＲＥＡＤアクセス指令が発行される。
【００４５】
ＲＥＡＤアクセス指令は、上位Ｉ／Ｆ制御部２０１で受信され、ＲＥＡＤ指令の認識が行われる。
【００４６】
ＲＥＡＤ指令認識後、上位Ｉ／Ｆ制御部２０１は、論理ディスクアクセス制御部２０２に対して、論理ディスクのＲＥＡＤを指示する。
【００４７】
論理ディスクアクセス制御部２０２は、上位Ｉ／Ｆ制御部２０１からの指示により、ＲＥＡＤアクセス対象の論理ディスク“Ａ”とアドレス，転送長を認識し、アレイ制御部２０３に対して指示を行う。アレイ制御部２０３は、ＲＥＡＤアクセス対象の論理ディスク“Ａ”を認識し、ＨＤＤ制御部２０４に対して、アクセス対象の物理ディスクＨＤＤ＃１，２，３にアドレス，転送長を指示する。
【００４８】
ＨＤＤ制御部２０４は、アレイ制御部２０３からの指示により、アクセス対象の物理ディスクＨＤＤ＃１，２，３をＲＥＡＤする。
【００４９】
物理ディスクＨＤＤ＃１，２，３でデータ転送の準備が完了後、データであるＡ１，Ａ２，Ａ３のデータ転送処理を行う。アレイ制御部２０３は、転送されたデータＡ１，Ａ２，Ａ３を結合させ、論理ディスクアクセス制御部２０２に論理ディスクＡのデータを転送する。
【００５０】
論理ディスクアクセス制御部２０２は、アレイ制御部２０３から転送されたデータを上位Ｉ／Ｆ制御部２０１を経由して、ホスト１に転送する。
【００５１】
物理ディスクＨＤＤ＃１，２，３からのデータ転送が終了すると、ＨＤＤ制御部２０４は、アレイ制御部２０３を経由して、論理ディスクアクセス制御部２０２にデータ転送終了を報告する。
【００５２】
論理ディスクアクセス制御部２０２は、データ転送終了を受信した後、上位Ｉ／Ｆ制御部２０１に対して、データ転送終了を報告する。
【００５３】
上位Ｉ／Ｆ制御部２０１は、データ転送終了を受信して、ホスト１に対してＲＥＡＤアクセスの終了を報告する。以上の動作により、論理ディスク“Ａ”のＲＥＡＤアクセスが行われる。
【００５４】
一方、ホスト１から本ディスクアレイ装置２に対して、論理ディスク“Ｂ”のＷＲＩＴＥアクセスが発生した場合について説明する。
【００５５】
ＷＲＩＴＥアクセス指令は、上位Ｉ／Ｆ制御部２０１で受信され、ＷＲＩＴＥ指令の認識が行われる。ＷＲＩＴＥ指令認識後、上位Ｉ／Ｆ制御部２０１は、論理ディスクアクセス制御部２０２に対して、論理ディスク“Ｂ”のＷＲＩＴＥを指示する。
【００５６】
論理ディスクアクセス制御部２０２は、上位Ｉ／Ｆ制御部２０１からの指示により、ＷＲＩＴＥアクセス対象の論理ディスク“Ｂ”とアドレス，転送長を認識し、アレイ制御部２０３に対して指示を行う。アレイ制御部２０３は、ＷＲＩＴＥアクセス対象の論理ディスク“Ｂ”を認識し、ＨＤＤ制御部２０４に対して、アクセス対象の物理ディスクＨＤＤ＃４，１，２のアドレス，転送長を指示する。
【００５７】
物理ディスクＨＤＤ＃４，１，２でデータ転送の準備が完了後、上位Ｉ／Ｆ制御部２０１，論理ディスクアクセス制御部２０２を経由して、ホスト１からデータを受信する。アレイ制御部２０３は、受信したデータを分割し、物理ディスクＨＤＤ＃４，１，２に対して、データであるＢ１，Ｂ２，Ｂ３のＷＲＩＴＥを行う。
【００５８】
データのＷＲＩＴＥ終了後、ＨＤＤ制御部２０４は、アレイ制御部２０３を経由して、論理ディスクアクセス制御部２０２にデータ転送終了を報告する。
【００５９】
論理ディスクアクセス制御部２０２は、データ転送終了を受信した後、上位Ｉ／Ｆ制御部２０１に対して、データ転送終了を報告する。
【００６０】
上位Ｉ／Ｆ制御部２０１は、データ転送終了を受信して、ホスト１に対して、論理ディスク“Ｂ”のＷＲＩＴＥアクセスの終了を報告する。
【００６１】
以上の動作により、論理ディスク“Ｂ”のＷＲＩＴＥアクセスが行われる。
【００６２】
物理ディスク２０５の中で、図３で示されるように、ＨＤＤ＃２が障害した場合の復旧動作を図１及び図４を参照して説明する。
【００６３】
ＨＤＤ制御部２０４は、物理ディスク２０５の中で、ＨＤＤ＃２に障害が発生したことを認識し、ＨＤＤ障害検出部２０８に指示する。ＨＤＤ障害検出部２０８は、ＨＤＤ制御部２０４からの指示により、物理ディスクＨＤＤ＃２に障害が発生したことを検出する。物理ディスクＨＤＤ＃２の障害を検出したＨＤＤ障害検出部２０８は、ＨＤＤの復旧を行う復旧制御部２０７に対して、復旧の指示を行う。
【００６４】
復旧制御部２０７は、優先復旧ディスク指示部２０６に設定されている値（論理ディスク“Ａ”が設定されていると仮定する。）を参照し、障害した論理ディスクの中で、優先して復旧処理する論理ディスク“Ａ”を認識する。復旧制御部２０７は、アレイ制御部２０３に対して、優先して復旧する論理ディスク“Ａ”を指定して、優先ディスクメモリ部２０９に障害で失われたデータＡ２を復旧するよう指示する。アレイ制御部２０３は、ＨＤＤ制御部２０４に対して、物理ディスクＨＤＤ＃１，３をアクセスして、データであるＡ１，Ａ３をＲＥＡＤするよう指示する。ＲＥＡＤされたデータＡ１，Ａ３を基に、アレイ制御部２０３は、障害で失ったデータＡ２を復元し、優先ディスクメモリ部２０９の不揮発性の半導体メモリにデータをＷＲＩＴＥしていく。アレイ制御部２０３は、優先ディスクメモリ部２０９に、データＡ２を復元した後から、図５のように優先ディスクメモリ部２０９を使用した論理ディスク“Ａ”のアクセスを行う。ホスト１から論理ディスク“Ａ”に対して、アクセスがあった場合、アレイ制御部２０３は、ＨＤＤ制御部２０４を経由して、物理ディスクＨＤＤ＃１，３にあるデータＡ１，Ａ３及び優先ディスクメモリ部２０９にあるデータＡ２をアクセスするよう制御する。
【００６５】
以上のような動作により、優先復旧で指定された論理ディスクは、優先ディスクメモリ部２０９の不揮発性の半導体メモリを使用して、優先的に復旧することができ、図６で示されるようにデータ復旧までの時間は、論理ディスク“Ｂ”，“Ｄ”の復旧完了まで待つ必要がなくなる。
【００６６】
続いて、予めスペアのＨＤＤが搭載されている場合、あるいは障害ＨＤＤが正常なＨＤＤに交換された場合は、優先復旧が指定されていない論理ディスクの復旧が行われる（図５参照）。
【００６７】
復旧制御部２０７は、優先復旧ディスク指示部２０６に設定されている値（論理ディスク“Ａ”を設定）以外の論理ディスク“Ｂ”，“Ｄ”を復旧するようアレイ制御部２０３を制御する。
【００６８】
アレイ制御部２０３は、ＨＤＤ制御部２０４に対して、障害のＨＤＤ＃２以外のＨＤＤであるＨＤＤ＃１，３，４をアクセスして、データであるＢ１，Ｂ２及びＤ２，Ｄ３をＲＥＡＤするよう指示する。ＲＥＡＤされたデータＢ１，Ｂ２及びＤ２，Ｄ３を基に、アレイ制御部２０３において、障害で失ったデータＢ３，Ｄ１を復元し、スペアのＨＤＤまたは交換されたＨＤＤにデータをＷＲＩＴＥしていく。アレイ制御部２０３は、スペアのＨＤＤまたは交換されたＨＤＤにデータＢ３，Ｄ１を復元した後から、通常通りＨＤＤ＃２を使用した論理ディスク“Ｂ”，“Ｄ”のアクセスを行う。以上のようにして、論理ディスク“Ｂ”，“Ｄ”の復旧が行われる。
【００６９】
また、優先ディスクメモリ部２０９に記憶しているデータＡ２は、論理ディスク“Ａ”のアクセスが低負荷な時に、オペレータの指示により、ＨＤＤ＃２に対してＷＲＩＴＥを行い、その後、通常通り、ＨＤＤ＃２を使用したアクセスが可能となる。
【００７０】
【発明の効果】
第一の効果は、ディスクアレイ装置において、アレイランクを構成する物理ディスクに障害が発生した場合、優先度の高い論理ディスクを優先して、復旧させることができることである。
【００７１】
これにより、優先度の高い論理ディスクの復旧までの時間を最短にすることができ、性能低下や信頼性低下の時間を短くできるという効果を有する。
【００７２】
特に、性能低下が許されない、あるいは重要なデータが存在する場合は、大きな効果を発揮する。
【００７３】
その理由は、障害ディスクにある複数の論理ディスクのデータの中で、優先度の高い論理ディスクに関わるデータは、不揮発性の半導体メモリである優先ディスクメモリ部に優先的に復旧させるためである。その他の論理ディスクのデータは、通常通りの復旧処理を行う。
【図面の簡単な説明】
【図１】本発明の実施形態によるディスクアレイ装置の構成とそれに接続されるホストを示すブロック図である。
【図２】アレイランクの構成例を示す図である。
【図３】図２のアレイランクの構成において、ＨＤＤ＃２に障害が発生した様子を示す図である。
【図４】本発明の実施形態による図２のアレイランクの構成において、ＨＤＤ＃２に障害が発生し、不揮発性の半導体メモリに復旧データＡ２が書き込まれた様子を示す図である。
【図５】本発明の実施形態による優先論理ディスク以外の論理ディスクの復旧の動作を説明するための図である。
【図６】本発明による復旧開始から優先度の高い論理ディスクの復旧が完了するまで、及び、その後に他の論理ディスクが復旧するまでの様子を示すタイミング図である。
【図７】従来例によるディスクアレイ装置の構成とそれに接続されるホストを示すブロック図である。
【図８】従来例による復旧開始から復旧完了までの様子を示すタイミング図である。
【符号の説明】
１ホスト
２ディスクアレイ装置
３上位Ｉ／Ｆ
２０１上位Ｉ／Ｆ制御部
２０２論理ディスクアクセス制御部
２０３アレイ制御部
２０４ＨＤＤ制御部
２０５物理ディスク
２０６優先復旧ディスク指示部
２０７復旧制御部
２０８ＨＤＤ障害検出部
２０９優先ディスクメモリ部

[0001]
[Technical field of the invention]
The present invention relates to a disk array device, and more particularly to a failure recovery method in a disk array device.
[0002]
[Prior art]
Disk array devices have grown to capacities of several tens of terabytes (TB) through the use of large-capacity hard disk drives (HDDs) as physical disks. To ease migration from earlier devices, they continue to provide logical disks of the same capacity as before. This is achieved by logically dividing one array rank, which is composed of a plurality of physical disks, into a plurality of logical disks.
[0003]
[Problems to be solved by the invention]
In such a disk array device, when a hard disk drive serving as a physical disk fails, every logical disk in the array rank that involves the failed drive enters a failure state. For the long period until recovery is complete, access from the host suffers degraded performance and reduced reliability.
[0004]
In addition, recent disk array devices are equipped with large-capacity HDDs, so the time from the occurrence of a failure to the completion of recovery has grown longer. Because HDD capacities will continue to increase, the time to complete recovery tends to grow longer still.
[0005]
FIG. 7 is a block diagram of a conventional disk array device. The host 1 and the disk array device 902 are connected by a host I/F 3. A READ/WRITE access command from the host 1 is received and recognized by the host I/F control unit 201. In response to an instruction from the host I/F control unit 201, the logical disk access control unit 202 identifies the logical disk, address, and transfer length targeted by the READ/WRITE access and instructs the array control unit 203 accordingly.
[0006]
In accordance with the instruction (access type, address, transfer length) from the logical disk access control unit 202, the array control unit 203 instructs the HDD control unit 204 to control the physical disk 205 that stores the logical disk targeted by the READ/WRITE access.
[0007]
The HDD control unit 204 controls the physical disk 205 according to an instruction from the array control unit 203.
[0008]
After preparation for data transfer is completed on the physical disk 205, READ or WRITE data transfer processing is performed. During the data transfer, the HDD control unit 204 monitors the end of the data transfer.
[0009]
After completing the data transfer, the HDD control unit 204 reports the end of the data transfer to the logical disk access control unit 202 via the array control unit 203.
[0010]
After receiving the data transfer end report, the logical disk access control unit 202 reports the end of the data transfer to the host I/F control unit 201.
[0011]
The host I/F control unit 201 receives the data transfer end report and reports the end of the READ/WRITE access to the host 1.
[0012]
With the above operation, a series of I / O operations (READ / WRITE access) are performed.
[0013]
In such a disk array device, the failure of an HDD affects every logical disk that has data stored on that HDD.
[0014]
As an example, consider a case where logical disks “A”, “B”, “C”, and “D” are arranged as shown in FIG. 2. Logical disk “A” consists of A1 to A3, logical disk “B” of B1 to B3, logical disk “C” of C1 to C3, and logical disk “D” of D1 to D3. HDD #1, HDD #2, HDD #3, and HDD #4 are physical disks: HDD #1 holds A1, B2, and C3; HDD #2 holds A2, B3, and D1; HDD #3 holds A3, C1, and D2; and HDD #4 holds B1, C2, and D3. When a failure occurs in HDD #2, data A2 of logical disk “A”, data B3 of logical disk “B”, and data D1 of logical disk “D” are lost, so logical disks “A”, “B”, and “D” are affected (see FIG. 3).
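The layout in this example can be written down as a small table to make the effect of a single HDD failure concrete. The following is only an illustrative sketch: the names `LAYOUT`, `affected_logical_disks`, and `surviving_chunks` are hypothetical and do not appear in the patent, and the first letter of each data name is assumed to identify its logical disk.

```python
# Data placement of FIG. 2 as described in the text:
# physical disk number -> data pieces stored on it.
LAYOUT = {
    1: ["A1", "B2", "C3"],
    2: ["A2", "B3", "D1"],
    3: ["A3", "C1", "D2"],
    4: ["B1", "C2", "D3"],
}


def affected_logical_disks(failed_hdd):
    """Return the logical disks that lose data when `failed_hdd` fails."""
    # Assumption: the first character of a piece name is its logical disk.
    return sorted({chunk[0] for chunk in LAYOUT[failed_hdd]})


def surviving_chunks(logical_disk, failed_hdd):
    """Map each healthy HDD to the piece of `logical_disk` it still holds."""
    return {
        hdd: chunk
        for hdd, chunks in LAYOUT.items()
        if hdd != failed_hdd
        for chunk in chunks
        if chunk[0] == logical_disk
    }


print(affected_logical_disks(2))   # logical disks touched by an HDD #2 failure
print(surviving_chunks("A", 2))    # pieces of "A" still readable
```

With this layout, logical disk “C” has no data on HDD #2, which is why only “A”, “B”, and “D” appear in the result.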
[0015]
In failure recovery, the HDDs that have not failed are read, the lost data is reconstructed, and the result is written back to a spare HDD prepared in advance or to a replacement HDD. However, until all data of the HDD has been written back, recovery is not regarded as complete, even if the data belonging to some of the logical disks has already been restored. Recovery is complete only when all data of the HDD has been restored.
[0016]
As a result, there is a problem that a plurality of logical disks are affected by performance degradation and reliability degradation for a long time.
[0017]
FIG. 8 shows the progress from the start of recovery to its completion in the configuration of FIG. 3. It shows that even after data A2 has been recovered, the recovery process is not complete until data B3 and D1 have also been recovered.
[0018]
An object of the present invention is to provide a disk array device, and a method therefor, that allow the host to access a specific logical disk before recovery of all logical disks is complete.
[0019]
[Means for Solving the Problems]
The disk array device according to the present invention comprises: means for detecting that a failure has occurred in a physical disk; first reading means for reading all data relating to a priority logical disk, among the logical disks to which the failed physical disk is allocated, from those physical disks allocated to the priority logical disk in which no failure has occurred; first recovery means for recovering, from the data read by the first reading means, all of the priority logical disk's data that was stored on the failed physical disk; first storage means for temporarily storing the data recovered by the first recovery means; second reading means for reading all data relating to a logical disk that is not the priority logical disk, among the logical disks to which the failed physical disk is allocated, from those physical disks allocated to that logical disk in which no failure has occurred; second recovery means for recovering, from the data read by the second reading means, all of that non-priority logical disk's data that was stored on the failed physical disk; and second storage means for storing the data temporarily stored in the first storage means and the data recovered by the second recovery means.
[0020]
Further, the disk array device according to the present invention is characterized in that, in the above disk array device, it comprises second storage means for storing an identifier of a logical disk designated by the host, and the reading means treats the logical disk having the identifier stored in the second storage means as the priority logical disk.
[0021]
Furthermore, the disk array device according to the present invention is characterized in that, in the above disk array device, it further comprises access means that, when the host accesses the priority logical disk, accesses the first storage means and those physical disks allocated to the priority logical disk in which no failure has occurred.
[0022]
Furthermore, the disk array device according to the present invention is characterized in that, in the above disk array device, the first storage means is a nonvolatile semiconductor memory.
[0023]
The failure recovery method in a disk array device according to the present invention comprises: a step of detecting that a failure has occurred in a physical disk; a first reading step of reading all data relating to a priority logical disk, among the logical disks to which the failed physical disk is allocated, from those physical disks allocated to the priority logical disk in which no failure has occurred; a first recovery step of recovering, from the data read in the first reading step, all of the priority logical disk's data that was stored on the failed physical disk; a first storing step of temporarily storing the data recovered in the first recovery step in first storage means; a second reading step of reading all data relating to a logical disk that is not the priority logical disk, among the logical disks to which the failed physical disk is allocated, from those physical disks allocated to that logical disk in which no failure has occurred; a second recovery step of recovering, from the data read in the second reading step, all of that non-priority logical disk's data that was stored on the failed physical disk; and a second storing step of storing, in second storage means, the data temporarily stored in the first storage means and the data recovered in the second recovery step.
[0024]
Further, the failure recovery method in a disk array device according to the present invention is characterized in that, in the above failure recovery method, it comprises a second storing step of storing an identifier of a logical disk designated by the host, and in the first reading step the logical disk having the identifier stored in the second storing step is treated as the priority logical disk.
[0025]
Furthermore, the failure recovery method in a disk array device according to the present invention is characterized in that, in the above failure recovery method, it further comprises an access step of accessing, when the host accesses the priority logical disk, the first storage means and those physical disks allocated to the priority logical disk in which no failure has occurred.
[0026]
DETAILED DESCRIPTION OF THE INVENTION
The present invention concerns a disk array device that uses large-capacity physical disks and has a plurality of logical disks in one array rank. A nonvolatile semiconductor memory is provided, and when a failure occurs in a physical disk, the device refers to a logical disk value set in advance by the host device and preferentially recovers, into the nonvolatile semiconductor memory, the data that the logical disk indicated by that value lost in the failure.
[0027]
As a result, a specific logical disk can be recovered preferentially without waiting the long time needed to recover the entire area of the failed physical disk, and the period of degraded performance and reduced reliability can be shortened.
[0028]
FIG. 1 is a block diagram of an embodiment of the present invention.
[0029]
The disk array device 2 according to the present embodiment is connected to the host 1 via a host I/F 3.
[0030]
The disk array device 2 of this embodiment comprises a host I/F control unit 201, a logical disk access control unit 202, an array control unit 203, an HDD control unit 204, a physical disk 205, a priority recovery disk instruction unit 206, a recovery control unit 207, an HDD failure detection unit 208, and a priority disk memory unit 209.
[0031]
The host I/F control unit 201 controls transfers to and from the host 1, which is the host device. The logical disk access control unit 202 controls access to logical disks according to instructions from the host I/F control unit 201. In accordance with the instruction (access type, address, transfer length) from the logical disk access control unit 202, the array control unit 203 instructs the HDD control unit 204 to control the physical disk 205 that stores the logical disk to be accessed.
[0032]
The HDD control unit 204 controls the physical disk 205 according to an instruction from the array control unit 203.
[0033]
The physical disk 205 stores data on disk and is controlled by the HDD control unit 204. The priority recovery disk instruction unit 206 stores the value of the logical disk to be prioritized during recovery, which is transferred in advance from the host 1 via the host I/F control unit 201.
[0034]
In response to an instruction from the HDD failure detection unit 208, the recovery control unit 207 refers to the value stored in the priority recovery disk instruction unit 206 and controls the array control unit 203 to restore the contents of the failed HDD.
[0035]
The HDD failure detection unit 208 recognizes that a failure has occurred in the physical disk 205 based on an instruction from the HDD control unit 204.
[0036]
The priority disk memory unit 209 is a nonvolatile semiconductor memory. During recovery from an HDD failure, it is used to recover the data of the logical disk whose value is stored in the priority recovery disk instruction unit 206.
[0037]
Next, the operation of this embodiment will be described.
[0038]
First, a case where the host 1 issues a READ access to logical disk “A” of the disk array device 2 will be described with reference to FIGS. 1 and 2.
[0039]
In the disk array device 2, before any READ or WRITE access, the value of the logical disk for priority recovery is transferred from the host 1 in advance and set in the priority recovery disk instruction unit 206.
[0040]
The operation for setting the priority recovery disk value will be described using an example in which “A” is set in the priority recovery disk instruction unit 206 as the priority recovery disk value.
[0041]
First, the host 1 issues a priority recovery disk setting command to the disk array device 2. The transferred command is received by the host I/F control unit 201, which recognizes it. Having recognized the command as a priority recovery disk setting command, the host I/F control unit 201 requests the host 1 to transfer the disk value to be set. The host 1 transfers the priority recovery disk value “A”, which is passed to the priority recovery disk instruction unit 206 via the host I/F control unit 201.
[0042]
The priority recovery disk instruction unit 206 receives and stores the priority recovery disk value “A”.
[0043]
With this operation, the priority recovery disk value “A” is set.
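The setting sequence just described (command, request for the value, transfer, store) can be sketched as follows. This is a minimal illustration under stated assumptions, not the device's actual interface: the class names, the command string `SET_PRIORITY_RECOVERY_DISK`, and the callback `request_value` are all hypothetical and chosen only to mirror units 201 and 206.

```python
class PriorityRecoveryDiskInstructionUnit:
    """Hypothetical stand-in for unit 206: holds the priority disk value."""

    def __init__(self):
        self.value = None

    def store(self, logical_disk):
        self.value = logical_disk


class HostIFControlUnit:
    """Hypothetical stand-in for unit 201: recognizes host commands."""

    def __init__(self, instruction_unit):
        self.instruction_unit = instruction_unit

    def handle_command(self, command, request_value):
        # Only the setting command is modeled in this sketch.
        if command != "SET_PRIORITY_RECOVERY_DISK":
            raise ValueError(f"unsupported command: {command}")
        # Ask the host to transfer the disk value to be set.
        value = request_value()
        # Pass the received value on to unit 206, which stores it.
        self.instruction_unit.store(value)
        return value


unit_206 = PriorityRecoveryDiskInstructionUnit()
unit_201 = HostIFControlUnit(unit_206)
unit_201.handle_command("SET_PRIORITY_RECOVERY_DISK", lambda: "A")
print(unit_206.value)
```

The value is stored before any failure occurs, so the recovery control unit can consult it without involving the host at recovery time.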
[0044]
After setting the priority recovery disk value “A”, the host 1 issues a READ access command to the logical disk “A”.
[0045]
The READ access command is received by the host I/F control unit 201, which recognizes it as a READ command.
[0046]
After recognizing the READ command, the host I/F control unit 201 instructs the logical disk access control unit 202 to READ the logical disk.
[0047]
In response to the instruction from the host I/F control unit 201, the logical disk access control unit 202 identifies logical disk “A” as the READ access target, together with the address and transfer length, and instructs the array control unit 203. The array control unit 203 identifies logical disk “A” as the READ access target and gives the HDD control unit 204 the address and transfer length for the physical disks HDD #1, #2, and #3 to be accessed.
[0048]
In accordance with the instruction from the array control unit 203, the HDD control unit 204 READs the physical disks HDD #1, #2, and #3 to be accessed.
[0049]
After the physical disks HDD #1, #2, and #3 are ready for data transfer, the data A1, A2, and A3 are transferred. The array control unit 203 combines the transferred data A1, A2, and A3 and transfers the data of logical disk “A” to the logical disk access control unit 202.
[0050]
The logical disk access control unit 202 transfers the data received from the array control unit 203 to the host 1 via the host I/F control unit 201.
[0051]
When the data transfer from the physical disks HDD #1, #2, and #3 is finished, the HDD control unit 204 reports the end of the data transfer to the logical disk access control unit 202 via the array control unit 203.
[0052]
After receiving the data transfer end report, the logical disk access control unit 202 reports the end of the data transfer to the host I/F control unit 201.
[0053]
The host I/F control unit 201 receives the data transfer end report and reports the end of the READ access to the host 1. Through the above operation, the READ access to logical disk “A” is performed.
[0054]
Next, a case where the host 1 issues a WRITE access to logical disk “B” of the disk array device 2 will be described.
[0055]
The WRITE access command is received by the host I/F control unit 201, which recognizes it as a WRITE command. After recognizing the WRITE command, the host I/F control unit 201 instructs the logical disk access control unit 202 to WRITE to logical disk “B”.
[0056]
In response to the instruction from the host I/F control unit 201, the logical disk access control unit 202 identifies logical disk “B” as the WRITE access target, together with the address and transfer length, and instructs the array control unit 203. The array control unit 203 identifies logical disk “B” as the WRITE access target and gives the HDD control unit 204 the address and transfer length for the physical disks HDD #4, #1, and #2 to be accessed.
[0057]
After the physical disks HDD #4, #1, and #2 are ready for data transfer, data is received from the host 1 via the host I/F control unit 201 and the logical disk access control unit 202. The array control unit 203 divides the received data and WRITEs the data B1, B2, and B3 to the physical disks HDD #4, #1, and #2.
[0058]
After the WRITE of the data is finished, the HDD control unit 204 reports the end of the data transfer to the logical disk access control unit 202 via the array control unit 203.
[0059]
After receiving the data transfer end report, the logical disk access control unit 202 reports the end of the data transfer to the host I/F control unit 201.
[0060]
The host I/F control unit 201 receives the data transfer end report and reports the end of the WRITE access to logical disk “B” to the host 1.
[0061]
With the above operation, the WRITE access of the logical disk “B” is performed.
[0062]
The recovery operation when HDD #2 among the physical disks 205 fails, as shown in FIG. 3, will be described with reference to FIGS. 1 and 4.
[0063]
The HDD control unit 204 recognizes that a failure has occurred in HDD #2 among the physical disks 205 and notifies the HDD failure detection unit 208. Based on the notification from the HDD control unit 204, the HDD failure detection unit 208 detects that a failure has occurred in the physical disk HDD #2. Having detected the failure of HDD #2, the HDD failure detection unit 208 issues a recovery instruction to the recovery control unit 207, which performs HDD recovery.
[0064]
The recovery control unit 207 refers to the value set in the priority recovery disk instruction unit 206 (assumed here to be logical disk “A”) and identifies logical disk “A” as the one to be recovered with priority among the failed logical disks. The recovery control unit 207 designates logical disk “A” to the array control unit 203 as the logical disk to recover with priority and instructs it to recover the data A2 lost in the failure into the priority disk memory unit 209. The array control unit 203 instructs the HDD control unit 204 to access the physical disks HDD #1 and #3 and READ the data A1 and A3. Based on the data A1 and A3 thus read, the array control unit 203 reconstructs the lost data A2 and writes it to the nonvolatile semiconductor memory of the priority disk memory unit 209. Once data A2 has been reconstructed in the priority disk memory unit 209, the array control unit 203 accesses logical disk “A” using the priority disk memory unit 209, as shown in FIG. 5. When the host 1 accesses logical disk “A”, the array control unit 203 performs control so that the data A1 and A3 on the physical disks HDD #1 and #3 are accessed via the HDD control unit 204 and the data A2 is accessed in the priority disk memory unit 209.
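The text does not state how A2 is reconstructed from A1 and A3. As one possible illustration, the sketch below assumes a RAID-5-style XOR parity relation among the three pieces (so that the XOR of any two yields the third); this redundancy scheme is an assumption, not something the patent specifies. The function names and the dictionary standing in for the priority disk memory unit 209 are likewise hypothetical.

```python
def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def recover_to_priority_memory(surviving, lost_name, priority_memory):
    """Rebuild the lost piece from the surviving ones and keep it in memory."""
    # Assumed XOR parity: the lost piece is the XOR of all surviving pieces.
    priority_memory[lost_name] = xor_blocks(*surviving.values())
    return priority_memory[lost_name]


def read_logical_disk(chunk_names, surviving, priority_memory):
    """Serve a read from healthy HDDs plus the priority disk memory."""
    parts = []
    for name in chunk_names:
        source = surviving if name in surviving else priority_memory
        parts.append(source[name])
    return parts


# Example data: A3 is chosen so that A1 ^ A2 ^ A3 == 0 (assumed parity).
a1, a2 = b"\x01\x02", b"\x10\x20"
a3 = xor_blocks(a1, a2)
healthy = {"A1": a1, "A3": a3}   # pieces on HDD #1 and HDD #3
memory_209 = {}                  # stand-in for the priority disk memory unit
recover_to_priority_memory(healthy, "A2", memory_209)
print(read_logical_disk(["A1", "A2", "A3"], healthy, memory_209))
```

Once A2 sits in the memory, reads of logical disk “A” no longer need to reconstruct it on every access, which is the behavior the paragraph above describes.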
[0065]
Through the operation described above, the logical disk designated for priority recovery can be recovered preferentially using the nonvolatile semiconductor memory of the priority disk memory unit 209. As shown in FIG. 6, its data recovery no longer has to wait until recovery of logical disks “B” and “D” is complete.
[0066]
Subsequently, if a spare HDD is installed in advance, or once the failed HDD has been replaced with a working HDD, the logical disks not designated for priority recovery are recovered (see FIG. 5).
[0067]
The recovery control unit 207 controls the array control unit 203 so as to recover logical disks “B” and “D”, that is, the affected logical disks other than the one set in the priority recovery disk instruction unit 206 (logical disk “A”).
[0068]
The array control unit 203 instructs the HDD control unit 204 to access HDD #1, #3, and #4, that is, the HDDs other than the failed HDD #2, and READ the data B1 and B2 and the data D2 and D3. Based on the data thus read, the array control unit 203 reconstructs the lost data B3 and D1 and writes them to the spare HDD or the replacement HDD. Once data B3 and D1 have been restored to the spare or replacement HDD, the array control unit 203 accesses logical disks “B” and “D” using HDD #2 as usual. In this way, logical disks “B” and “D” are recovered.
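The two-phase ordering described above (priority logical disk into the nonvolatile memory first, remaining logical disks onto a spare or replacement HDD afterwards) might be sketched as a simple plan generator. The names `recovery_plan`, `priority_memory`, and `spare_or_replacement_hdd` are hypothetical labels for illustration, and the layout table repeats the example of FIG. 2.

```python
# Data placement of FIG. 2: physical disk number -> data pieces.
LAYOUT = {
    1: ["A1", "B2", "C3"],
    2: ["A2", "B3", "D1"],
    3: ["A3", "C1", "D2"],
    4: ["B1", "C2", "D3"],
}


def recovery_plan(failed_hdd, priority_disk, spare_available):
    """Return (piece, destination) pairs in the order they are recovered."""
    lost = LAYOUT[failed_hdd]
    # Phase 1: the priority logical disk's lost data goes to the memory.
    plan = [(c, "priority_memory") for c in lost if c[0] == priority_disk]
    # Phase 2: the rest waits for a spare or replacement HDD.
    if spare_available:
        plan += [
            (c, "spare_or_replacement_hdd")
            for c in lost
            if c[0] != priority_disk
        ]
    return plan


for step in recovery_plan(failed_hdd=2, priority_disk="A", spare_available=True):
    print(step)
```

In this sketch the first phase does not depend on a spare being present, reflecting that the priority recovery targets the semiconductor memory rather than a disk.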
[0069]
Further, the data A2 stored in the priority disk memory unit 209 is written to HDD #2 at the operator's instruction when the access load on logical disk “A” is low. Thereafter, normal access using HDD #2 becomes possible.
[0070]
[Effects of the invention]
The first effect is that in a disk array device, when a failure occurs in a physical disk constituting an array rank, a logical disk having a high priority can be preferentially restored.
[0071]
As a result, it is possible to minimize the time until recovery of a logical disk having a high priority, and to shorten the time for performance degradation and reliability degradation.
[0072]
In particular, the effect is significant when performance degradation cannot be tolerated or when important data is present.
[0073]
The reason is that, among the data of the multiple logical disks on the failed disk, the data belonging to the high-priority logical disk is preferentially recovered into the priority disk memory unit, which is a nonvolatile semiconductor memory. The data of the other logical disks is recovered by the normal recovery process.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a disk array device according to an embodiment of the present invention and a host connected thereto.
FIG. 2 is a diagram illustrating a configuration example of an array rank.
FIG. 3 is a diagram showing a state in which a failure has occurred in HDD #2 in the configuration of the array rank of FIG. 2.
FIG. 4 is a diagram illustrating a state in which a failure has occurred in HDD #2 and recovery data A2 has been written in a nonvolatile semiconductor memory in the array rank configuration of FIG. 2 according to an embodiment of the present invention.
FIG. 5 is a diagram for explaining the recovery operation of logical disks other than the priority logical disk according to the embodiment of the present invention.
FIG. 6 is a timing chart showing the sequence, according to the present invention, from the start of recovery through completion of recovery of the high-priority logical disk to the subsequent recovery of the other logical disks.
FIG. 7 is a block diagram showing a configuration of a conventional disk array device and a host connected thereto.
FIG. 8 is a timing chart showing a state from the start of recovery to the completion of recovery according to a conventional example.
[Explanation of symbols]
1 Host
2 Disk array device
3 Host I/F
201 Host I/F control unit
202 Logical disk access control unit
203 Array control unit
204 HDD control unit
205 Physical disk
206 Priority recovery disk instruction unit
207 Recovery control unit
208 HDD failure detection unit
209 Priority disk memory unit

Claims

1. A disk array device comprising:
means for detecting that a physical disk has failed;
priority information storage means for holding an identifier of a priority logical disk that is to be recovered preferentially in the event of a physical disk failure;
reading means for reading, for each logical disk that has data allocated to the failed physical disk, the data of that logical disk stored on the physical disks in which no failure has occurred;
recovery means for recovering, from the data read by the reading means, the data of the logical disk that was stored on the failed physical disk; and
first and second storage means for storing the data recovered by the recovery means,
wherein, at the time of data recovery, the recovery means first recovers the data of the priority logical disk corresponding to the identifier held in the priority information storage means and stores it in the first storage means, and then sequentially recovers the data of the other logical disks and stores it in the second storage means, and
wherein, when the data recovery of the priority logical disk into the first storage means is completed, access to the priority logical disk from the host device is enabled without waiting for completion of the data recovery of the other logical disks into the second storage means.

2. The disk array device according to claim 1, further comprising access means for accessing, when the priority logical disk is accessed from a host, the physical disks in which no failure has occurred among the physical disks allocated to the priority logical disk, and the first storage means.

3. The disk array device according to claim 1, wherein the first storage means is a non-volatile semiconductor memory.

4. A failure recovery method for a disk array device, comprising:
a step of detecting that a physical disk has failed;
a first reading step of reading all data of a priority logical disk, among the logical disks to which the failed physical disk is assigned, from those physical disks assigned to the priority logical disk in which no failure has occurred;
a first recovery step of recovering, from the data read in the first reading step, all data of the priority logical disk that was stored on the failed physical disk;
a first storage step of temporarily storing the data recovered in the first recovery step in a first storage means;
a second reading step of reading all data of each logical disk that is not the priority logical disk, among the logical disks to which the failed physical disk is assigned, from the physical disks assigned to that logical disk;
a second recovery step of recovering, from the data read in the second reading step, all data of the logical disk that is not the priority logical disk that was stored on the failed physical disk;
a second storage step of storing the data recovered in the second recovery step in a second storage means; and
a third storage step of storing the data temporarily stored in the first storage means in the second storage means.

5. The failure recovery method for a disk array device according to claim 4, further comprising a fourth storage step of storing an identifier of a logical disk designated by a host, wherein the first reading step uses, as the priority logical disk, the logical disk having the identifier stored in the fourth storage step.

6. The failure recovery method for a disk array device according to claim 4, further comprising an access step of accessing, when the priority logical disk is accessed from a host, the physical disks in which no failure has occurred among the physical disks assigned to the priority logical disk, and the first storage means.