JP2008197886A

JP2008197886A - Storage device and control method therefor

Info

Publication number: JP2008197886A
Application number: JP2007031909A
Authority: JP
Inventors: Yukio Saito; 幸男齋藤
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2007-02-13
Filing date: 2007-02-13
Publication date: 2008-08-28

Abstract

PROBLEM TO BE SOLVED: To provide a highly redundant storage device and a control method therefor.
SOLUTION: Among a plurality of pools, an HDD in a pool that has lost redundancy is exchanged for a spare HDD in the same device or for an HDD in another pool, so that the pool that had lost redundancy regains it. High redundancy can thus be obtained.
COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a storage apparatus and a control method therefor.

In systems where little can be spent on maintenance, a hard disk drive (hereinafter "HDD") may be left in place for a long time after it fails, and data can be lost if another HDD fails in the meantime. A storage apparatus with even slightly higher reliability is desirable for such systems.
For this reason, two approaches have been proposed. One is a storage apparatus in which units can be replaced easily even when the drives in a unit are handled separately and freely assigned to RAID groups (RAID is described later; see, for example, Patent Document 1). The other is an information storage device that, when a storage device is destroyed, maintains the redundancy of the other storage devices in the RAID group without requiring a new, empty storage device to be installed, so that reliability does not drop (see, for example, Patent Document 2).
Patent Document 1: JP 2005-293547 A
Patent Document 2: JP-A-9-146717

However, in a storage apparatus configured with RAID 6, for example, once two HDDs belonging to one pool have failed, no further redundancy can be provided. If yet another HDD fails, all information recorded in the pool is lost, with a severe impact on the system in operation.

Therefore, an object of the present invention is to provide a highly redundant storage apparatus and a control method therefor.

To solve the above problem, the invention according to claim 1 is characterized in that, among a plurality of pools, an HDD in a pool that has lost redundancy is exchanged for a spare HDD in the same apparatus or for an HDD in another pool.

According to the invention of claim 1, because an HDD in a pool that has lost redundancy is exchanged for a spare HDD in the same apparatus or for an HDD in another pool, the pool that had lost redundancy regains it, and high redundancy can be obtained.

The invention according to claim 2 is a storage apparatus comprising a plurality of pools, each composed of a plurality of HDDs, and control means for controlling the pools, characterized in that the control means monitors all HDDs for failures and, when a failed HDD is found, replaces it with a spare HDD in the same apparatus.

According to the invention of claim 2, because the control means monitors all HDDs for failures and replaces a failed HDD with a spare HDD in the same apparatus, the pool regains redundancy even after it has once been lost, and high redundancy can be obtained.

The invention according to claim 3 is the invention according to claim 2, characterized in that, when there is no spare HDD in the same apparatus and no other failed HDD in the same pool, the control means disconnects the failed HDD and then monitors all HDDs for failures again.

According to the invention of claim 3, because the control means, when there is no spare HDD in the same apparatus and no other failed HDD in the same pool, disconnects the failed HDD and then monitors all HDDs for failures again, the pool regains redundancy even after it has once been lost, and high redundancy can be obtained.

The invention according to claim 4 is the invention according to claim 2, characterized in that, when there is no spare HDD in the same apparatus and one other failed HDD in the same pool, the control means checks whether any HDD has failed in another pool having redundancy equivalent to RAID 6 and, if none has, exchanges the failed HDD for an HDD in that other pool.

According to the invention of claim 4, because the control means, when there is no spare HDD in the same apparatus and one other failed HDD in the same pool, checks another pool having redundancy equivalent to RAID 6 for failed HDDs and, if there are none, exchanges the failed HDD for an HDD in that other pool, the pool regains redundancy even after it has once been lost, and high redundancy can be obtained.

The invention according to claim 5 is the invention according to claim 2, characterized in that, when there is no spare HDD in the same apparatus and two or more other failed HDDs in the same pool, the control means monitors all HDDs for failures again after the data has been destroyed.

According to the invention of claim 5, because the control means, when there is no spare HDD in the same apparatus and two or more other failed HDDs in the same pool, monitors all HDDs for failures again after the data has been destroyed, the HDDs that remain usable can be identified.

The invention according to claim 6 is the invention according to claim 5, characterized in that, when the other pool has higher redundancy than the pool containing the failed HDD, the control means exchanges an HDD in the other pool for the failed HDD.

According to the invention of claim 6, because the control means exchanges an HDD in the other pool for the failed HDD when the other pool has higher redundancy than the pool containing the failed HDD, the pool regains redundancy even after it has once been lost, and high redundancy can be obtained.

The invention according to claim 7 is characterized by monitoring a plurality of pools and exchanging an HDD in a pool that has lost redundancy for a spare HDD in the same apparatus or for an HDD in another pool having redundancy equivalent to RAID 6.

According to the invention of claim 7, because a plurality of pools are monitored and an HDD in a pool that has lost redundancy is exchanged for a spare HDD in the same apparatus or for an HDD in another pool having redundancy equivalent to RAID 6, the pool that had lost redundancy regains it, and high redundancy can be obtained.

According to the present invention, because an HDD in a pool that has lost redundancy, among a plurality of pools, is exchanged for a spare HDD in the same apparatus or for an HDD in another pool, the pool that had lost redundancy regains it, and high redundancy can be obtained.

An embodiment of a storage apparatus according to the present invention will now be described.
FIG. 2 is a conceptual diagram showing an embodiment of the storage apparatus according to the present invention.
In the figure, the storage apparatus comprises a control unit 21, a plurality of pools (two in the figure, but not limited to two), namely pool A (41) and pool B (42), and a spare HDD 50 (one in the figure, but not limited to one).

A microprocessor, for example, is used as the control unit 21. Pool A (41) consists of a plurality of HDDs (four in the figure, but not limited to four): HDD 31, HDD 32, HDD 33, and HDD 34. Pool B (42) likewise consists of a plurality of HDDs (four in the figure, but not limited to four): HDD 35, HDD 36, HDD 37, and HDD 38.

FIG. 5(a) is a conceptual diagram showing a state in which two of the four HDDs in a pool of the storage apparatus according to the present invention have failed, and FIG. 5(b) is a conceptual diagram showing the state of the storage apparatus of FIG. 5(a) after processing. In FIGS. 5(a) and 5(b), the spare HDD 50 is assumed to have already failed, so that there is effectively no spare HDD.

As shown in FIG. 5(a), the storage apparatus according to the present invention works as follows. One storage apparatus 11 has two pools, pool A (41) and pool B (42), each configured as RAID 6 with four or more HDDs (four in the figure). When failures of HDD 31 and HDD 32, both belonging to pool A (41), leave pool A (41) without redundancy, the apparatus automatically exchanges HDD 35 of pool B (42) with HDD 31 of pool A (41), so that redundancy is provided again. Pool A (41) and pool B (42) are both assumed to be RAID 6.
In other words, high redundancy is obtained by monitoring a plurality of pools and exchanging an HDD in a pool that has lost redundancy for an HDD in another pool that still has redundancy.
Here, a pool is a virtual storage area composed of physical disks.

In the storage apparatus according to the present invention, when pool A (41) and pool B (42), both set to RAID 6, exist in one storage apparatus 11 and two HDDs, HDD 31 and HDD 32, fail in pool A (41), one HDD, HDD 35, is disconnected from the other RAID 6 pool B (42) and incorporated into pool A (41), which has lost redundancy. Operating pool A (41) with redundancy restored in this way raises reliability.

Here, RAID is an acronym for Redundant Array of Inexpensive Disks, pronounced "raid". It is a technology that combines multiple inexpensive HDDs and manages them as a single redundant HDD. In short, it is a technology for managing HDDs, classified into six levels (RAID 0, RAID 1, RAID 2, RAID 3, RAID 4, and RAID 5) according to how data is placed on the HDDs and how it is made redundant (multiplexed). In recent years, a RAID 6 level that uses double parity has also been provided to strengthen reliability.

The embodiment described above is one example of a preferred embodiment of the present invention. The present invention is not limited to it, and various modifications can be made without departing from the gist of the invention.

Next, an example of the storage apparatus according to the present invention will be described.
In the storage apparatus 11 shown in FIG. 2, four HDDs (not limited to four; five or more are also possible), HDD 31, HDD 32, HDD 33, and HDD 34, are treated as one pool A (41), which is built and operated as a highly redundant RAID 6 configuration. Because the data can be reconstructed even if HDD 31 and HDD 32 fail at the same time, reliability is high.

However, in a system without an adequate maintenance structure, if the failed HDD 31 and HDD 32 cannot be replaced immediately, a third HDD in the same pool may fail, leaving HDD 31, HDD 32, and HDD 33 all failed, and the data may be lost.
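The failure sequence described above can be made concrete with a minimal sketch. It assumes only that a RAID 6 pool tolerates two failed HDDs, as the description states; the function name `pool_state` and the state labels are hypothetical and do not come from the patent.

```python
def pool_state(tolerance: int, failed: int) -> str:
    """Classify a pool by comparing failed HDDs with the failures it can tolerate."""
    if failed > tolerance:
        return "data lost"
    if failed == tolerance:
        return "no redundancy"  # still readable, but one more failure loses the data
    return "redundant"

# A RAID 6 pool (tolerance 2) as HDD 31, HDD 32 and then HDD 33 fail in turn.
for failed in range(4):
    print(failed, pool_state(2, failed))
```

The second failure is the point at which the invention intervenes, before a third failure can destroy the pool.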

Therefore, in an apparatus having a plurality of RAID 6 pools, pool A (41) and pool B (42), the storage apparatus according to the present invention provides further redundancy by moving HDD 35 over from the other RAID 6 pool when two HDDs of pool A (41), HDD 31 and HDD 32, have failed.
Here, a failure means that an HDD has become unusable, for example because of a sector error or a failure to respond.

In other words, as soon as two HDD failures are detected in one pool, one HDD, HDD 35, is disconnected from the other, healthy pool B (42) and moved into pool A (41), which has lost redundancy, in exchange for a failed HDD, so that redundancy is provided again. Because the failed HDD 31 is formally incorporated into pool B (42), in which no failure had occurred, full RAID 6 redundancy can be restored later by replacing the failed HDD 31 and HDD 32.
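A minimal sketch of this exchange, tracking pool membership only. The classes `Hdd` and `Pool` and the function `swap_with_donor` are hypothetical names for illustration. The data rebuild onto the transferred HDD is not modeled, and picking the first failed and the first healthy HDD is an assumption.

```python
from dataclasses import dataclass, field

@dataclass
class Hdd:
    ident: int
    failed: bool = False

@dataclass
class Pool:
    name: str
    hdds: list = field(default_factory=list)

    def failed_count(self) -> int:
        return sum(1 for h in self.hdds if h.failed)

def swap_with_donor(degraded: Pool, donor: Pool) -> None:
    """Exchange one failed HDD of `degraded` with one healthy HDD of `donor`."""
    bad = next(h for h in degraded.hdds if h.failed)
    good = next(h for h in donor.hdds if not h.failed)
    degraded.hdds[degraded.hdds.index(bad)] = good
    donor.hdds[donor.hdds.index(good)] = bad  # failed HDD now formally belongs to the donor

pool_a = Pool("A", [Hdd(31, True), Hdd(32, True), Hdd(33), Hdd(34)])
pool_b = Pool("B", [Hdd(35), Hdd(36), Hdd(37), Hdd(38)])
swap_with_donor(pool_a, pool_b)
print(pool_a.failed_count(), pool_b.failed_count())  # 1 1
```

After the exchange each pool carries one failed member, so both RAID 6 pools can absorb one more failure, where pool A previously could absorb none.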

[Configuration]
As shown in FIGS. 1(a) and 1(b), the storage apparatus according to the present invention comprises a storage apparatus 11 in which nine or more HDDs can be mounted, a control unit 21 that writes to the HDDs and monitors the state inside the apparatus, and nine or more HDDs (nine in the figure), namely HDD 31, HDD 32, HDD 33, HDD 34, HDD 35, HDD 36, HDD 37, HDD 38, and a spare HDD 50. Pool A consists of HDD 31, HDD 32, HDD 33, and HDD 34, and pool B consists of HDD 35, HDD 36, HDD 37, and HDD 38.

[Operation]
FIG. 1 shows an example of a flow to which the storage apparatus control method according to the present invention is applied. FIG. 3(a) is a conceptual diagram showing a state in which three of the four HDDs in a pool of the storage apparatus according to the present invention have failed, and FIG. 3(b) is a conceptual diagram showing the state of the storage apparatus of FIG. 3(a) after processing. FIG. 4(a) is a conceptual diagram showing a state in which one of the four HDDs in a pool has failed, and FIG. 4(b) is a conceptual diagram showing the state of the storage apparatus of FIG. 4(a) after processing. FIG. 6(a) is a conceptual diagram showing a state in which one of the four HDDs in a pool has failed, and FIG. 6(b) is a conceptual diagram showing the state of the storage apparatus of FIG. 6(a) after processing. In FIGS. 3(a), 3(b), 4(a), and 4(b), the spare HDD 50 is assumed to have already failed, so that there is effectively no spare HDD. In FIGS. 6(a) and 6(b), the spare HDD 50 is assumed to be operating normally.

In a system in which a plurality of RAID 6 pools, pool A (41) and pool B (42), exist in one storage apparatus 11 and the operating state is monitored (step S1), the control unit 21 determines whether any of HDD 31 to HDD 38 and HDD 50 has failed (step S2).

If no failure is found (step S2/NO), the control unit 21 continues to check whether any of HDD 31 to HDD 38 and HDD 50 has failed. If it determines that one of HDD 31 to HDD 38 has failed (step S2/YES), it first checks whether a spare HDD (spare: HDD 50) is available, that is, whether a healthy spare HDD exists (step S3).

If a spare HDD exists, as shown in FIGS. 6(a) and 6(b) (step S3/YES), the control unit 21 replaces the failed HDD 31 with the healthy HDD 50 (step S10) and returns to step S1, which restores the operating state. If no healthy spare HDD exists (step S3/NO), the control unit 21 goes on to check the same pool (pool A (41)) for other failed HDDs (step S4).
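Steps S2, S3, and S10 can be sketched as follows. HDDs are represented by their reference numerals, and the function name `replace_with_spare` and its return convention are assumptions made for illustration, not part of the patent.

```python
def replace_with_spare(pool: list, spares: list, failed: set):
    """Sketch of steps S2, S3 and S10: swap a failed pool member for a healthy spare.

    Returns the new pool membership, or None when no healthy spare exists (S3/NO).
    """
    bad = next((h for h in pool if h in failed), None)
    if bad is None:
        return pool  # S2/NO: nothing has failed, keep monitoring
    spare = next((s for s in spares if s not in failed), None)
    if spare is None:
        return None  # S3/NO: the caller continues with step S4
    return [spare if h == bad else h for h in pool]  # S10

print(replace_with_spare([31, 32, 33, 34], [50], {31}))      # [50, 32, 33, 34]
print(replace_with_spare([31, 32, 33, 34], [50], {31, 50}))  # None
```

The second call corresponds to FIGS. 3 to 5, where the spare HDD 50 has itself failed and the flow has to fall through to step S4.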

If two or more HDDs other than the newly failed HDD have already failed in the same pool (pool A (41)), that is, if HDD 31, HDD 32, HDD 33, and HDD 50 have failed as shown in FIG. 3(a) (step S4/two or more), the data can no longer be guaranteed: the data of the whole of pool A (41) is destroyed and operation cannot continue. In other words, when two HDDs have already failed and a further HDD fails (the third failure), services such as writing and reading data stop (see FIG. 3(b)), and the control unit 21 continues to monitor the operating state in step S1.

If no HDD other than the failed HDD has failed in the same pool (pool A (41)) (FIG. 4(a)) (step S4/zero), the control unit 21 disconnects the failed HDD 31 as shown in FIG. 4(b) (step S12), returns to step S1, and continues to monitor the operating state.

If exactly one HDD other than the failed HDD has failed in the same pool (pool A (41)), that is, if HDD 31 and HDD 32 have failed (step S4/one), the control unit 21 checks the state of the other pool B (42) in the same apparatus (step S5).
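The three-way branch at step S4 can be summarized in a short sketch. The function name `step_s4_action` and the returned strings are hypothetical labels. The step numbers follow the description, and the data-loss branch is given no step number because the description gives it none.

```python
def step_s4_action(other_failed_in_pool: int) -> str:
    """Choose the next action from the number of OTHER failed HDDs in the same pool."""
    if other_failed_in_pool == 0:
        return "S12: disconnect the failed HDD, then return to S1"
    if other_failed_in_pool == 1:
        return "S5: check the state of the other pool"
    return "data lost: service stops, then return to S1"

for n in (0, 1, 2):
    print(n, "->", step_s4_action(n))
```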

The control unit 21 determines whether the other pool (pool B (42)) is configured as RAID 6 (step S6). If it is (step S6/YES), the control unit 21 determines whether that pool (pool B (42)) contains a failed HDD (step S7). If it does not (step S7/NO), the control unit 21 disconnects one HDD, HDD 35, from that pool (step S8) and completes the exchange by incorporating it into pool A (41) (step S9).

This exchange gives pool A (41) redundancy again, and pool B (42), from which HDD 35 was disconnected, also keeps redundancy, although its reliability is lower.
If the other pool is not a RAID 6 configuration (step S6/NO), or if the other pool contains a failed HDD (step S7/YES), the process returns to step S1.
As described above, this example further improves the reliability of a storage apparatus product.
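The donor check in steps S5 to S8, including the two fallbacks to step S1, can be sketched as follows. The function name `donor_hdd` is hypothetical, and detaching the first member of the donor pool is an assumption; the description only says that one HDD (HDD 35 in the figures) is disconnected.

```python
def donor_hdd(donor_level: str, donor_members: list, failed: set):
    """Sketch of steps S5 to S8: return the HDD to detach from the donor pool,
    or None when the flow has to return to step S1."""
    if donor_level != "RAID6":                   # S6/NO
        return None
    if any(h in failed for h in donor_members):  # S7/YES
        return None
    return donor_members[0]                      # S8: detach one HDD (first one, by assumption)

print(donor_hdd("RAID6", [35, 36, 37, 38], {31, 32, 50}))  # 35
print(donor_hdd("RAID6", [35, 36, 37, 38], {31, 32, 36}))  # None
```

The HDD returned here is the one that step S9 incorporates into the degraded pool.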

The invention is also applicable to a configuration in which a spare HDD is mounted in the storage apparatus as a reserve for failures, and to storage apparatuses in which two or more RAID 6 pools are set. It is applicable whenever a RAID 6 pool consists of four or more HDDs. In a system in which RAID 6 pools and RAID 5 pools coexist, if an HDD in a RAID 5 pool fails, an HDD can likewise be disconnected from the RAID 6 pool and incorporated into the other pool, provided the RAID 6 pool still has two redundant HDDs (RAID 6 has redundancy of two HDDs, and RAID 5 has redundancy of one).
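The mixed RAID 5 and RAID 6 case can be checked with simple arithmetic. The sketch below uses the redundancy figures given above (two HDDs for RAID 6, one for RAID 5). The names `TOLERANCE` and `redundancy_after_transfer` are hypothetical, and the result describes the state after the rebuild onto the transferred HDD has finished, which is an assumption about timing.

```python
TOLERANCE = {"RAID5": 1, "RAID6": 2}  # failed HDDs each level survives

def redundancy_after_transfer(degraded, donor):
    """Each argument is a (level, failed_count) pair.

    Returns the remaining redundancy of (degraded, donor) after one healthy HDD
    moves from the donor to the degraded pool, or None when the donor does not
    still have two redundant HDDs.
    """
    (d_level, d_failed), (o_level, o_failed) = degraded, donor
    if TOLERANCE[o_level] - o_failed < 2:
        return None
    return (TOLERANCE[d_level] - (d_failed - 1), TOLERANCE[o_level] - (o_failed + 1))

print(redundancy_after_transfer(("RAID5", 1), ("RAID6", 0)))  # (1, 1)
print(redundancy_after_transfer(("RAID6", 2), ("RAID6", 0)))  # (1, 1)
```

In both cases a pool with no remaining redundancy and a pool with two redundant HDDs end up with one redundant HDD each.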

The present invention can be used in systems whose data is important but on which little can be spent for maintenance, and in systems, such as those in remote or mountainous areas, where no worker is permanently on site and a failed HDD cannot be replaced immediately.

FIG. 1 is an example of a flow to which the storage apparatus control method according to the present invention is applied.
FIG. 2 is a conceptual diagram showing an embodiment of the storage apparatus according to the present invention.
FIG. 3(a) is a conceptual diagram showing a state in which three of the four HDDs in a pool of the storage apparatus according to the present invention have failed, and FIG. 3(b) is a conceptual diagram showing the state of the storage apparatus of FIG. 3(a) after processing.
FIG. 4(a) is a conceptual diagram showing a state in which one of the four HDDs in a pool of the storage apparatus according to the present invention has failed, and FIG. 4(b) is a conceptual diagram showing the state of the storage apparatus of FIG. 4(a) after processing.
FIG. 5(a) is a conceptual diagram showing a state in which two of the four HDDs in a pool of the storage apparatus according to the present invention have failed, and FIG. 5(b) is a conceptual diagram showing the state of the storage apparatus of FIG. 5(a) after processing.
FIG. 6(a) is a conceptual diagram showing a state in which one of the four HDDs in a pool of the storage apparatus according to the present invention has failed, and FIG. 6(b) is a conceptual diagram showing the state of the storage apparatus of FIG. 6(a) after processing.

Explanation of symbols

11 Storage apparatus
21 Control unit
31-38 HDD
41 Pool A
42 Pool B
50 Spare HDD

Claims

1. A storage apparatus characterized in that, among a plurality of pools, an HDD in a pool that has lost redundancy is exchanged for a spare HDD in the same apparatus or for an HDD in another pool.

2. A storage apparatus comprising a plurality of pools, each composed of a plurality of HDDs, and control means for controlling the pools, wherein the control means monitors all HDDs for failures and, when a failed HDD is found, replaces it with a spare HDD in the same apparatus.

3. The storage apparatus according to claim 2, wherein, when there is no spare HDD in the same apparatus and no other failed HDD in the same pool, the control means disconnects the failed HDD and then monitors all HDDs for failures again.

4. The storage apparatus according to claim 2, wherein, when there is no spare HDD in the same apparatus and one other failed HDD in the same pool, the control means checks whether any HDD has failed in another pool having redundancy equivalent to RAID 6 and, if none has, exchanges the failed HDD for an HDD in that other pool.

5. The storage apparatus according to claim 2, wherein, when there is no spare HDD in the same apparatus and two or more other failed HDDs in the same pool, the control means monitors all HDDs for failures again after the data has been destroyed.

6. The storage apparatus according to claim 5, wherein, when the other pool has higher redundancy than the pool containing the failed HDD, the control means exchanges an HDD in the other pool for the failed HDD.

7. A method of controlling a storage apparatus, characterized by monitoring a plurality of pools and exchanging an HDD in a pool that has lost redundancy for a spare HDD in the same apparatus or for an HDD in another pool having redundancy equivalent to RAID 6.