JP6330824B2 - Storage system, access device, client device, method and program

Info

Publication number: JP6330824B2
Application number: JP2016008544A
Authority: JP
Inventors: 高橋　秀行
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2016-01-20
Filing date: 2016-01-20
Publication date: 2018-05-30
Anticipated expiration: 2036-01-20
Also published as: JP2017130016A

Description

The present invention relates to a storage system that stores data while eliminating duplicates of data with identical content.

In recent years, storage systems that store data while eliminating duplicates of data with identical content have become known. Hereinafter, eliminating duplicates of data with identical content is also referred to as deduplication. In particular, some storage systems divide the data to be written into a plurality of data blocks and deduplicate data blocks with identical content even when those blocks belong to different data. Such a storage system stores only those data blocks, among the blocks obtained by dividing the data to be written, that it has not yet stored. For the data to be written, it then stores references to the data blocks that constitute that data.

For example, Patent Document 1 describes an example of a technique that performs deduplication by dividing data into data blocks in real time while receiving the data to be written. This related technique divides the data input to a data buffer according to a predetermined criterion. It then repeatedly concatenates the data left undivided in the data buffer with the subsequently input data and divides the result according to the predetermined criterion.

In a storage system with such a deduplication function, a single data block can be referenced by multiple pieces of data, so high reliability against the loss of a data block is required. Such storage systems therefore give data blocks redundancy and place them in a distributed manner. Specifically, the storage system further divides a data block into fragments and adds parity according to a predetermined redundancy. The storage system then stores the fragments and the parity distributed across a plurality of storage devices.

For example, Patent Document 2 describes an example of a technique that sets the redundancy of a data block according to the number of pieces of data that reference the block. The number of pieces of data referencing a given data block is hereinafter called the reference count. Specifically, this related technique defines in advance an appropriate redundancy for each reference count. Then, at each predetermined timing, it detects, for each stored data block, any difference between the appropriate redundancy defined for the block's reference count and the block's actual redundancy, and rewrites the block with the appropriate redundancy.

Further, for example, Patent Document 3 describes an example of a technique that supports writing a data block with a different redundancy even when its content is identical to that of an existing data block. When the redundancy required for the data block to be written is higher than the redundancy of the already stored data block with identical content, this related technique stores the data block anew with the new redundancy. At a predetermined timing, it also deletes, among data blocks with identical content, the blocks with lower redundancy while keeping the block with higher redundancy. It then changes the references held by data that referred to a deleted block into references to the retained block.

Patent Document 1: JP 2014-174604 A
Patent Document 2: JP 2010-224845 A
Patent Document 3: JP 2011-159142 A

However, the related techniques described above have the following problems.

In a storage system with a deduplication function as described above, the desirable redundancy varies with the number of nodes constituting the storage. For example, with two nodes, 50% of the total data must be parity so that stored data remains accessible even if one node fails. With four nodes, 25% of the total data must be parity.
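The two percentages above can be reproduced with simple arithmetic. The following is a minimal sketch, assuming that every node holds an equal share of the stored fragments and that the system must survive the loss of a fixed number of nodes. The function name `min_parity_fraction` and the equal-share assumption are illustrative and do not appear in the patent text.

```python
from fractions import Fraction


def min_parity_fraction(num_nodes: int, tolerated_failures: int = 1) -> Fraction:
    """Smallest share of the stored data that must be parity.

    Assumption (not from the source): every node holds an equal share of
    the fragments, so losing `tolerated_failures` nodes loses that many
    shares, and at least that many shares must be parity.
    """
    if num_nodes <= tolerated_failures:
        raise ValueError("need more nodes than tolerated failures")
    return Fraction(tolerated_failures, num_nodes)


if __name__ == "__main__":
    for nodes in (2, 4):
        share = min_parity_fraction(nodes)
        print(f"{nodes} nodes -> {float(share):.0%} parity")
    # prints "2 nodes -> 50% parity" and "4 nodes -> 25% parity"
```

Under this assumption the required parity share falls as nodes are added, which is why a change in the node count can change the desirable redundancy.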

In general, the number of nodes in a storage system increases or decreases as nodes are added to relieve capacity shortages or as old hardware is replaced with new hardware, so the desirable redundancy may change. In such a storage system, it is therefore sometimes necessary to change the redundancy of data that is already stored.

However, Patent Document 1 says nothing about the redundancy of the data blocks that are deduplicated and stored.

As described above, the related technique of Patent Document 2 rewrites, at each predetermined timing, every data block whose redundancy is not appropriate, using the appropriate redundancy. Rewriting with the appropriate redundancy requires reading the existing data block. Consequently, if this related technique is applied to change the redundancy of the entire storage system, a redundancy conversion process, which reads each data block and rewrites it with the new redundancy, occurs for all data blocks at once. When a large amount of data is stored, this batch conversion of all data blocks keeps the load high for a long time. Suppose that, to limit the impact on data access performance in that situation, operational access is given priority and the speed of the conversion process is throttled. The conversion then takes longer to complete for all data blocks, and the conversion process is inefficient.

As described above, the related technique of Patent Document 3 stores a data block anew with the required redundancy when the redundancy required for the block to be written is higher than that of the already stored block with identical content. If this related technique is applied to change the redundancy of the entire storage system, the redundancy is converted progressively as data blocks are written. The conversion therefore requires no reading of pre-conversion data blocks, and no batch conversion of all data blocks occurs. This related technique can thus change the redundancy more efficiently than the related technique of Patent Document 2. However, when data is written after a redundancy change has been requested, this related technique requires a write with the required redundancy even if a data block with identical content is already stored. In other words, once a redundancy change has been requested, every data block of every piece of data is written. As a result, changing the redundancy degrades, relative to normal operation, the write performance that deduplication should improve.

The present invention has been made to solve the problems described above. That is, an object of the present invention is to provide a technique for efficiently changing redundancy, while limiting the impact on performance, in a storage system having a deduplication function.

A storage system of the present invention includes: a distributed storage unit that stores data subjected to redundancy processing distributed across a plurality of storage devices; a selection unit that divides write data requested to be written into data blocks and selects, as write targets, (i) those divided data blocks for which no data block with identical content is stored in the distributed storage unit and (ii) data blocks extracted from among those divided data blocks for which a data block with identical content is already stored in the distributed storage unit and whose redundancy does not satisfy the required value; a writing unit that applies redundancy processing based on a redundancy satisfying the required value to the data blocks selected by the selection unit and stores them in the distributed storage unit; and a conversion unit that executes, while throttling the processing speed, a conversion process that converts each data block already stored in the distributed storage unit whose redundancy does not satisfy the required value into a data block subjected to redundancy processing based on a redundancy satisfying the required value and stores it in the distributed storage unit.

An access device of the present invention includes the selection unit, the writing unit, and the conversion unit of the storage system.

Another access device of the present invention includes the writing unit and the conversion unit of the storage system.

A client device of the present invention includes the selection unit of the storage system.

In a method of the present invention, a computer device uses a distributed storage unit that stores data subjected to redundancy processing distributed across a plurality of storage devices, and: divides write data requested to be written into data blocks; selects, as write targets, (i) those divided data blocks for which no data block with identical content is stored in the distributed storage unit and (ii) data blocks extracted from among those divided data blocks for which a data block with identical content is already stored in the distributed storage unit and whose redundancy does not satisfy the required value; applies redundancy processing based on a redundancy satisfying the required value to the selected data blocks and stores them in the distributed storage unit; and executes, while throttling the processing speed, a conversion process that converts each data block already stored in the distributed storage unit whose redundancy does not satisfy the required value into a data block subjected to redundancy processing based on a redundancy satisfying the required value and stores it in the distributed storage unit.

A program of the present invention causes a computer device to execute a selection step that uses a distributed storage unit, which stores data subjected to redundancy processing distributed across a plurality of storage devices, to divide write data requested to be written into data blocks and to select, as write targets, (i) those divided data blocks for which no data block with identical content is stored in the distributed storage unit and (ii) data blocks extracted from among those divided data blocks for which a data block with identical content is already stored in the distributed storage unit and whose redundancy does not satisfy the required value.

The present invention can provide a technique for efficiently changing redundancy, while limiting the impact on performance, in a storage system having a deduplication function.

FIG. 1 is a block diagram showing the configuration of a storage system as a first embodiment of the present invention.
FIG. 2 is a diagram showing an example of the hardware configuration of the storage system as the first embodiment of the present invention.
FIG. 3 is a schematic diagram explaining data blocks and fragments in the first embodiment of the present invention.
FIG. 4 is a flowchart explaining the operation of the storage system as the first embodiment of the present invention when it receives write data.
FIG. 5 is a flowchart explaining the operation in which the storage system as the first embodiment of the present invention executes the conversion process.
FIG. 6 is a block diagram showing the configuration of a storage system as a second embodiment of the present invention.
FIG. 7 is a diagram showing an example of the hardware configuration of the storage system as the second embodiment of the present invention.
FIG. 8 is a diagram showing an example of metadata in the second embodiment of the present invention.
FIG. 9 is a diagram showing another example of metadata in the second embodiment of the present invention.
FIG. 10 is a diagram schematically explaining the process in which the selection unit selects data blocks in the second embodiment of the present invention.
FIG. 11 is a flowchart explaining the operation of the access device in the second embodiment of the present invention when it receives write data.
FIG. 12 is a flowchart explaining the operation of a storage node in the second embodiment of the present invention when it receives a fragment.
FIG. 13 is a flowchart explaining the operation in which the access device in the second embodiment of the present invention executes the conversion process.
FIG. 14 is a block diagram showing the configuration of a storage system as a third embodiment of the present invention.
FIG. 15 is a diagram showing an example of the hardware configuration of the storage system as the third embodiment of the present invention.
FIG. 16 is a flowchart explaining the operation in which the client device in the third embodiment of the present invention selects data blocks.
FIG. 17 is a flowchart explaining the operation of the access device in the third embodiment of the present invention when it receives a data block.

Embodiments of the present invention are described in detail below with reference to the drawings.

(First Embodiment)
FIG. 1 shows the functional block configuration of the storage system 1 as the first embodiment of the present invention. In FIG. 1, the storage system 1 includes a selection unit 11, a writing unit 12, a conversion unit 13, and a distributed storage unit 80.

The storage system 1 can be built from the hardware elements shown in FIG. 2. In FIG. 2, the storage system 1 includes a CPU (Central Processing Unit) 1001, a memory 1002, a plurality of storage devices 1004, and a network interface 1005. The memory 1002 consists of a RAM (Random Access Memory), a ROM (Read Only Memory), and the like. Each storage device 1004 is an auxiliary storage device such as a hard disk or flash memory. In this case, the selection unit 11, the writing unit 12, and the conversion unit 13 are implemented by the network interface 1005 and by the CPU 1001, which loads and executes a computer program stored in the memory 1002 or a storage device 1004. The distributed storage unit 80 is implemented by the plurality of storage devices 1004. The hardware configuration of the storage system 1 and of each of its functional blocks is not, however, limited to the configuration described above.

Although FIG. 2 shows three storage devices 1004, the number of storage devices 1004 in this embodiment is not limited. In the storage system 1, capacity and performance can be expanded by adding or replacing the storage devices 1004 that constitute the distributed storage unit 80.

The distributed storage unit 80 stores data that has undergone redundancy processing. Redundancy processing is, for example, processing that divides data into a plurality of pieces of divided data and adds redundant information. The redundant information is generated using error-correcting code techniques so that the original data can be restored even if some of the divided data is lost. In this case, data that has undergone redundancy processing consists of the divided data and the redundant information. The distributed storage unit 80 stores this data, that is, the divided data and the redundant information, distributed across the plurality of storage devices 1004.
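As a rough illustration of redundancy processing, the sketch below splits data into equal-size pieces and appends a single XOR parity piece, which is enough to restore any one lost piece. XOR parity is chosen here only because it is the simplest erasure code; the patent text does not name a specific error-correcting code, and the function names `add_redundancy` and `restore` are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def add_redundancy(data: bytes, pieces: int) -> List[bytes]:
    """Split `data` into `pieces` equal-size parts plus one XOR parity part."""
    size = -(-len(data) // pieces)  # ceiling division
    padded = data.ljust(size * pieces, b"\0")
    parts = [padded[i * size:(i + 1) * size] for i in range(pieces)]
    parity = reduce(xor_bytes, parts)
    return parts + [parity]


def restore(stored: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the original data when at most one stored part is missing.

    `stored` is the list returned by add_redundancy with a lost part
    replaced by None; `length` is the original data length (needed to
    strip the zero padding).
    """
    missing = [i for i, part in enumerate(stored) if part is None]
    if len(missing) > 1:
        raise ValueError("a single XOR parity can repair only one lost part")
    repaired = list(stored)
    if missing:
        present = [part for part in stored if part is not None]
        repaired[missing[0]] = reduce(xor_bytes, present)
    return b"".join(repaired[:-1])[:length]
```

In a distributed arrangement each returned part would be placed on a different storage device, so the failure of one device corresponds to one `None` entry in `stored`.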

The selection unit 11 divides write data into data blocks. The resulting data blocks may be fixed-length or variable-length. Write data is data received from an external device together with a request to write it. For example, write data may be received from an external device using a file transfer protocol such as NFS (Network File System) or CIFS (Common Internet File System). Write data is not limited to newly written data; it also includes data written as an update.

The selection unit 11 also selects, from the divided data blocks, the data blocks to be written to the distributed storage unit 80. Specifically, the selection unit 11 first selects as write targets those divided data blocks for which no data block with identical content is stored in the distributed storage unit 80.

The selection unit 11 further selects data blocks to be written to the distributed storage unit 80 from among the divided data blocks for which a data block with identical content is already stored in the distributed storage unit 80 and whose redundancy does not satisfy the required value. Here it is desirable that the selection unit 11 select only some, not all, of those data blocks.

The redundancy of a data block is information representing the amount of redundant information in the data block after redundancy processing. For example, the redundancy may be the proportion of the data block occupied by redundant information, or it may be the number of pieces of redundant information that the data block contains.

The required value is the value required as the redundancy of write data. For example, a value common to each file system implemented on the storage system 1 may be set as the required value. Such a required value may be stored in the memory 1002 or a storage device 1004. The required value is assumed to be changeable by configuration. A redundancy may be deemed to satisfy the required value when it equals the required value, or alternatively when it is equal to or greater than the required value. More generally, a redundancy satisfies the required value whenever the reliability it secures for the data block is equal to or higher than the reliability secured by a redundancy that satisfies the required value.

If a redundancy is deemed to satisfy the required value only when it equals the required value, the data blocks whose redundancy does not satisfy the required value are both those whose redundancy is below the required value and those whose redundancy exceeds it. These data blocks are converted by the writing unit 12 or the conversion unit 13, described later, into data blocks with the redundancy indicated by the required value. In this case, lowering excessively redundant data blocks to the required redundancy reduces the capacity used in the distributed storage unit 80.

If, instead, a redundancy is deemed to satisfy the required value when it is equal to or greater than the required value, only data blocks whose redundancy is below the required value fail to satisfy it; data blocks whose redundancy exceeds the required value do not. In this case, fewer data blocks need to be processed by the writing unit 12 or the conversion unit 13 described later.
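The two interpretations of "satisfies the required value" described above lead to different sets of blocks that need conversion. The following sketch makes the difference concrete, assuming redundancy is expressed as an integer count of pieces of redundant information. The names `Policy`, `satisfies`, and `blocks_to_convert` are illustrative and not taken from the patent text.

```python
from enum import Enum
from typing import Dict, List


class Policy(Enum):
    EXACT = "exact"        # satisfied only when redundancy == required value
    AT_LEAST = "at_least"  # satisfied when redundancy >= required value


def satisfies(redundancy: int, required: int, policy: Policy) -> bool:
    """Decide whether a block's redundancy satisfies the required value."""
    if policy is Policy.EXACT:
        return redundancy == required
    return redundancy >= required


def blocks_to_convert(redundancies: Dict[str, int], required: int,
                      policy: Policy) -> List[str]:
    """Return ids of stored blocks whose redundancy fails the required value."""
    return sorted(block_id for block_id, r in redundancies.items()
                  if not satisfies(r, required, policy))


if __name__ == "__main__":
    stored = {"a": 1, "b": 2, "c": 3}  # hypothetical block ids and redundancies
    print(blocks_to_convert(stored, 2, Policy.EXACT))     # ['a', 'c']
    print(blocks_to_convert(stored, 2, Policy.AT_LEAST))  # ['a']
```

With `EXACT`, the over-redundant block "c" is also converted, which frees capacity; with `AT_LEAST`, only the under-redundant block "a" is converted, which keeps the amount of conversion work smaller.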

The writing unit 12 applies redundancy processing based on a redundancy satisfying the required value to the data blocks selected by the selection unit 11 and stores them in the distributed storage unit 80. Specifically, the writing unit 12 divides a data block into a plurality of fragments, referred to as original fragments. The writing unit 12 then generates one or more pieces of redundant information from those fragments. A fragment generated as redundant information is referred to as a parity fragment, and original fragments and parity fragments are collectively called fragments. The number of original fragments into which a data block is divided and the number of parity fragments added are determined by the redundancy. For example, when the redundancy expresses the ratio of the number of parity fragments to the total number of fragments, changing the redundancy changes both the number of original fragments and the number of parity fragments. Likewise, when the redundancy expresses the number of parity fragments and the total number of fragments is fixed in advance, changing the redundancy changes both numbers. Any of various error-correcting code techniques may be applied to this redundancy processing. The writing unit 12 then stores these fragments distributed across the plurality of storage devices 1004 in the distributed storage unit 80.

FIG. 3 schematically shows the data blocks that make up write data and the fragments that make up a data block. In FIG. 3, the write data is divided into k data blocks, data block 1 to data block k, where k is a positive integer. Each data block may be fixed-length or variable-length. Data block 1 is divided into m original fragments OF1 to OFm, where m is a positive integer. In addition, n parity fragments PF1 to PFn are generated for data block 1, where n is a positive integer. That is, data block 1 after redundancy processing consists of m original fragments and n parity fragments. When the redundancy is expressed as the number of parity fragments, the redundancy of data block 1 is n. When it is expressed as the proportion of parity fragments, it is n/(n+m), where "/" denotes division.
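The relationship between m, n, and the two ways of expressing redundancy can be written down directly. The sketch below is illustrative: the class `FragmentLayout`, the helper `layout_for_parity_count`, and the example total of 12 fragments are assumptions, not values from the patent text.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class FragmentLayout:
    m: int  # number of original fragments (OF1..OFm)
    n: int  # number of parity fragments (PF1..PFn)

    @property
    def redundancy_as_count(self) -> int:
        """Redundancy expressed as the number of parity fragments."""
        return self.n

    @property
    def redundancy_as_ratio(self) -> Fraction:
        """Redundancy expressed as n / (n + m)."""
        return Fraction(self.n, self.n + self.m)


def layout_for_parity_count(total_fragments: int, parity_count: int) -> FragmentLayout:
    """Case where the total number of fragments is fixed in advance and the
    redundancy is given as a parity-fragment count."""
    if not 0 < parity_count < total_fragments:
        raise ValueError("parity count must leave at least one original fragment")
    return FragmentLayout(m=total_fragments - parity_count, n=parity_count)


if __name__ == "__main__":
    before = layout_for_parity_count(12, 3)  # m=9, n=3, ratio 1/4
    after = layout_for_parity_count(12, 4)   # m=8, n=4, ratio 1/3
    print(before, before.redundancy_as_ratio)
    print(after, after.redundancy_as_ratio)
```

With the total fixed, raising the parity count from 3 to 4 also lowers the number of original fragments from 9 to 8, which is why a change in redundancy changes both numbers.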

The conversion unit 13 performs the redundancy conversion process, while throttling the processing speed, on those data blocks stored in the distributed storage unit 80 whose redundancy does not satisfy the required value. The redundancy conversion process reads a stored data block, converts it into a data block subjected to redundancy processing based on a redundancy satisfying the required value, and stores it in the distributed storage unit 80 again. Specifically, the conversion unit 13 restores the relevant data block by reading its original fragments and parity fragments. The conversion unit 13 then divides the restored data block into new original fragments based on a redundancy satisfying the required value and generates new parity fragments. Finally, the conversion unit 13 stores the newly generated original fragments and parity fragments distributed across the plurality of storage devices 1004 in the distributed storage unit 80.
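One simple way to throttle the conversion is to cap the number of blocks re-encoded per second. The sketch below assumes that approach, treats "satisfies" as "greater than or equal to", and reduces the read, restore, re-split, and re-store steps to a caller-supplied `reencode` callback. All names and the rate-limiting method are assumptions; the patent text states only that the processing speed is suppressed.

```python
import time
from typing import Callable, Dict, Iterable


def convert_throttled(
    block_ids: Iterable[str],
    redundancy_of: Dict[str, int],
    required: int,
    reencode: Callable[[str, int], None],
    max_blocks_per_second: float,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Re-encode under-redundant blocks, pausing after each one to cap the rate.

    Returns the number of blocks converted. `sleep` is injectable so the
    pacing can be observed or stubbed out in tests.
    """
    pause = 1.0 / max_blocks_per_second
    converted = 0
    for block_id in block_ids:
        if redundancy_of[block_id] >= required:
            continue  # already satisfies the required value (assumed: >=)
        reencode(block_id, required)        # read, restore, re-split, re-store
        redundancy_of[block_id] = required  # record the new redundancy
        converted += 1
        sleep(pause)                        # leave I/O capacity for normal access
    return converted
```

Because blocks rewritten on the write path no longer appear as under-redundant, a background loop of this kind has progressively less work to do as ordinary writes proceed.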

The operation of the storage system 1 configured as described above is explained below with reference to the drawings.

First, FIG. 4 shows the operation of the storage system 1 when it receives write data from outside.

図４では、まず、選択部１１は、冗長度の要求値を取得する（ステップＡ１）。 In FIG. 4, first, the selection unit 11 acquires a required value for redundancy (step A1).

次に、選択部１１は、書き込みデータをデータブロックに分割する（ステップＡ２）。 Next, the selection unit 11 divides the write data into data blocks (step A2).

次に、選択部１１は、分割したデータブロックのうち、同一内容のデータブロックが分散記憶部８０に記憶されていないデータブロックを、書き込みの対象として選択する（ステップＡ３）。 Next, the selection unit 11 selects, as a write target, a data block in which a data block having the same content is not stored in the distributed storage unit 80 among the divided data blocks (step A3).

次に、選択部１１は、分割したデータブロックのうち、同一内容のデータブロックが分散記憶部８０に記憶済み且つその冗長度が要求値を満たさないデータブロックの中から、書き込みの対象とするデータブロックを選択する（ステップＡ４）。 Next, from among the divided data blocks for which a data block having the same content is already stored in the distributed storage unit 80 and whose redundancy does not satisfy the required value, the selection unit 11 selects data blocks to be written (step A4).

次に、書き込み部１２は、ステップＡ３およびステップＡ４で選択されたデータブロックについて、要求値を満たす冗長度に基づく冗長処理を施す（ステップＡ５）。 Next, the writing unit 12 performs a redundancy process based on the redundancy satisfying the requested value on the data block selected in Step A3 and Step A4 (Step A5).

次に、書き込み部１２は、冗長処理を施したデータブロックを、分散記憶部８０に記憶させる（ステップＡ６）。 Next, the writing unit 12 stores the data block subjected to the redundancy processing in the distributed storage unit 80 (step A6).

以上で、ストレージシステム１が、書き込みデータを外部から受信した際の動作の説明を終了する。 This is the end of the description of the operation when the storage system 1 receives write data from the outside.
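
Steps A1 to A6 can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the description: the distributed storage unit is modelled as a plain dictionary mapping a content hash to the redundancy of the stored block, blocks are split at a fixed length, "satisfies the required value" is read as "greater than or equal to", and the number of already-stored blocks re-written per write is capped by a hypothetical `max_upgrades` parameter. The actual redundancy processing (fragment and parity generation) is omitted.

```python
import hashlib


def split_blocks(data: bytes, size: int) -> list[bytes]:
    """Step A2: fixed-length split (variable-length splitting is equally allowed)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def write(store: dict, data: bytes, required: int,
          block_size: int = 4, max_upgrades: int = 1) -> list[str]:
    """Return the hashes of the blocks that were (re)written.

    `required` is the required redundancy value obtained in step A1.
    """
    written = []
    upgrades = 0
    for block in split_blocks(data, block_size):
        key = hashlib.sha256(block).hexdigest()
        current = store.get(key)
        if current is None:
            pass  # Step A3: no identical block is stored yet, so select it.
        elif current < required and upgrades < max_upgrades:
            upgrades += 1  # Step A4: stored, but redundancy is insufficient.
        else:
            continue  # Deduplicated: only a reference would be recorded.
        # Steps A5 and A6: apply redundancy processing and store the block
        # (represented here by recording the new redundancy only).
        store[key] = required
        written.append(key)
    return written
```

With `max_upgrades=1`, a second write whose required value has been raised re-stores only one of the already-stored blocks; the rest are left for the separate conversion process.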

次に、ストレージシステム１による冗長度の変換処理の動作を図５に示す。 Next, FIG. 5 shows the operation of the redundancy conversion process performed by the storage system 1.

図５では、まず、変換部１３は、分散記憶部８０に記憶されたデータブロックのうち、その冗長度が要求値を満たさないデータブロックを、変換対象として抽出する（ステップＢ１）。 In FIG. 5, first, the conversion unit 13 extracts a data block whose redundancy does not satisfy the required value from among the data blocks stored in the distributed storage unit 80 as a conversion target (step B1).

次に、変換部１３は、変換対象のデータブロックを読み込む（ステップＢ２）。 Next, the conversion unit 13 reads the data block to be converted (step B2).

次に、変換部１３は、読み込んだデータブロックに対して、要求値を満たす冗長度に基づく冗長処理を実施する（ステップＢ３）。 Next, the conversion unit 13 performs a redundancy process on the read data block based on the redundancy satisfying the requested value (step B3).

次に、変換部１３は、冗長処理を施したデータブロックを、分散記憶部８０に記憶させる（ステップＢ４）。 Next, the conversion unit 13 stores the data block subjected to the redundancy process in the distributed storage unit 80 (step B4).

このようなステップＢ１〜Ｂ４の処理を、変換部１３は、処理速度を抑制しながら実行する。 The conversion unit 13 executes such processing in steps B1 to B4 while suppressing the processing speed.
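
Steps B1 to B4 with a suppressed processing speed might look like the sketch below. The description does not state how the speed is suppressed in this embodiment; pausing for a fixed interval after each block is an assumption made for illustration, as are the function and parameter names. Stored blocks are again modelled as a dictionary from block identifier to current redundancy, and the read, re-encode, and store steps are collapsed into a single update.

```python
import time


def convert_all(redundancy_by_block: dict, required: int,
                interval_s: float = 0.0, sleep=time.sleep) -> list:
    """Re-store every block whose redundancy is below `required`."""
    # Step B1: extract the blocks whose redundancy does not satisfy the value.
    targets = [key for key, current in redundancy_by_block.items()
               if current < required]
    for key in targets:
        # Steps B2 to B4: read the block, apply redundancy processing with
        # the required redundancy, and store it again (modelled as an update).
        redundancy_by_block[key] = required
        # Throttle: pause before handling the next block.
        sleep(interval_s)
    return targets
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a longer interval lowers the impact on normal reads and writes at the cost of a later completion time.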

以上で、ストレージシステム１の動作の説明を終了する。 This is the end of the description of the operation of the storage system 1.

次に、本発明の第１の実施の形態の効果について述べる。 Next, effects of the first exemplary embodiment of the present invention will be described.

本発明の第１の実施の形態としてのストレージシステムは、重複排除の機能を有するストレージシステムにおいて、性能への影響を抑えながら、冗長度を効率的に変更することができる。 The storage system according to the first embodiment of the present invention can efficiently change the redundancy while suppressing the influence on the performance in the storage system having the deduplication function.

その理由について説明する。本実施の形態では、分散記憶部が、冗長処理が施されたデータを記憶している。そして、選択部が、書き込みデータをデータブロックに分割し、分割したデータブロックの中から、書き込みの対象とするデータブロックを選択する。ここでは、同一内容のデータブロックが分散記憶部に記憶されていないデータブロックが、書き込みの対象として選択される。さらに、同一内容のデータブロックが分散記憶部に記憶済み且つその冗長度が要求値を満たさないデータブロックの中から抽出されたデータブロックが、書き込みの対象として選択される。そして、書き込み部が、選択部によって選択されたデータブロックについて、要求値を満たす冗長度に基づく冗長処理を施して分散記憶部に記憶させるからである。また、変換部が、分散記憶部に記憶されたデータブロックのうちその冗長度が要求値を満たさないデータブロックを、要求値を満たす冗長度に基づく冗長処理を施したデータブロックに変換して記憶し直す変換処理を、処理速度を抑制しながら実行するからである。 The reason is as follows. In the present embodiment, the distributed storage unit stores data that has been subjected to redundancy processing. The selection unit divides the write data into data blocks and selects, from among the divided data blocks, the data blocks to be written. Here, data blocks for which no data block having the same content is stored in the distributed storage unit are selected as write targets. In addition, data blocks extracted from among those for which a data block having the same content is already stored in the distributed storage unit, but whose redundancy does not satisfy the required value, are selected as write targets. The writing unit then applies redundancy processing based on a redundancy that satisfies the required value to the data blocks selected by the selection unit and stores them in the distributed storage unit. Furthermore, the conversion unit executes, while suppressing the processing speed, a conversion process that converts each data block stored in the distributed storage unit whose redundancy does not satisfy the required value into a data block subjected to redundancy processing based on a redundancy that satisfies the required value, and stores it again.

このように構成されることにより、本実施の形態では、冗長度の要求値が変更された場合に、書き込みデータを構成するデータブロックのうち、新たな冗長度に基づく冗長処理が施されて書き込まれるデータブロックは、次のようなデータブロックのみとなる。すなわち、書き込まれるデータブロックは、分散記憶部に未だ記憶されていないデータブロックと、記憶済みのデータブロックと同一内容で冗長度が要求値を満たさないデータブロックから抽出されたデータブロックのみである。つまり、本実施の形態は、書き込み時に、同一内容のデータブロックが記憶済みでもその冗長度が新たな要求値を満たさないもの全てについて新たな冗長処理を施して記憶し直す場合と比べて、記憶し直すデータブロックの数を抑制することになる。その結果、本実施の形態は、冗長度の要求値が変更された後でも、書き込みの性能を著しく低下させることがない。 With this configuration, when the required value of the redundancy is changed, the only data blocks of the write data that are written with redundancy processing based on the new redundancy are the following. That is, the data blocks to be written are limited to data blocks not yet stored in the distributed storage unit and data blocks extracted from among those that have the same content as a stored data block but whose redundancy does not satisfy the required value. In other words, compared with a case where, at write time, new redundancy processing is applied to every data block whose identical content is already stored but whose redundancy does not satisfy the new required value, and every such block is stored again, the present embodiment reduces the number of data blocks that are stored again. As a result, the present embodiment does not significantly reduce the write performance even after the required value of the redundancy is changed.

また、本実施の形態は、冗長度の要求値が変更された後、書き込み時に新たな冗長処理を施さず記憶し直さなかったデータブロックについては、別途実行する変換処理により、新たな冗長度に基づく冗長処理を施して記憶し直すことになる。このとき、本実施の形態は、このような変換処理を、処理速度を抑制しながら行う。これにより、本実施の形態は、書き込み時には冗長度の変更を行わずに記憶済みのデータブロックに対して一括して冗長度の変換処理を行う場合と比べて、運用でデータにアクセスする性能に影響が出ないようにすることができる。さらに、本実施の形態は、書き込み時にも一部のデータブロックについて冗長度の変更を行っているため、別途行う変換処理の処理速度を抑制しても、全てのデータブロックについて冗長度の変更が完了するまでの時間を短縮することができる。 Further, in the present embodiment, data blocks that were not stored again with new redundancy processing at write time after the required value of the redundancy was changed are stored again, with redundancy processing based on the new redundancy, by a conversion process that is executed separately. The present embodiment performs this conversion process while suppressing the processing speed. As a result, compared with a case where the redundancy is not changed at write time and the redundancy conversion process is performed on the stored data blocks all at once, the present embodiment can avoid affecting the performance of data access during operation. Furthermore, because the present embodiment also changes the redundancy of some data blocks at write time, the time until the redundancy change is complete for all data blocks can be shortened even if the processing speed of the separately performed conversion process is suppressed.

つまり、本実施の形態は、情報の書き込み時に一部のデータブロックについては新たな冗長度での書き込みを行うことにより冗長度の変更の効率化を図るとともに、別途実行する変換処理により全てのデータブロックについての冗長度の変更を網羅する。その上、本実施の形態は、情報の書き込み時に新たな冗長度での書き込みを行うデータブロック数を抑えることで書き込みの性能を低下させず、別途実行する変換処理の速度を抑制することで読み込みの性能に影響を出さず、変換処理の完了までの時間を短縮する。 In other words, the present embodiment makes the redundancy change more efficient by writing some data blocks with the new redundancy when information is written, and covers the redundancy change for all data blocks through the separately executed conversion process. In addition, the present embodiment avoids degrading write performance by limiting the number of data blocks written with the new redundancy when information is written, avoids affecting read performance by suppressing the speed of the separately executed conversion process, and shortens the time until the conversion process is complete.

（第２の実施の形態） (Second Embodiment)
次に、本発明の第２の実施の形態について図面を参照して詳細に説明する。なお、本実施の形態の説明において参照する各図面において、本発明の第１の実施の形態と同一の構成および同様に動作するステップには同一の符号を付して本実施の形態における詳細な説明を省略する。 Next, a second embodiment of the present invention will be described in detail with reference to the drawings. In each drawing referred to in the description of the present embodiment, configurations that are the same as those of the first embodiment of the present invention, and steps that operate in the same manner, are given the same reference numerals, and their detailed description is omitted in the present embodiment.

本発明の第２の実施の形態としてのストレージシステム２の構成を図６に示す。図６において、ストレージシステム２は、アクセス装置２００と、複数のストレージノード９００とを含む。アクセス装置２００は、ストレージシステム２に対するデータの書き込みや読み込みの要求を受け付けるとともに、ストレージシステム２におけるデータを管理する装置である。ストレージノード９００は、ストレージシステム２において読み書きの対象となる書き込みデータを分散して記憶するノードである。複数のストレージノード９００によって、分散記憶部９０が構成される。ストレージシステム２は、ストレージノード９００の追加や交換により、容量や性能の拡張が可能となっている。 FIG. 6 shows the configuration of the storage system 2 as the second embodiment of the present invention. In FIG. 6, the storage system 2 includes an access device 200 and a plurality of storage nodes 900. The access device 200 is a device that receives data write and read requests to the storage system 2 and manages data in the storage system 2. The storage node 900 is a node that distributes and stores write data to be read / written in the storage system 2. A plurality of storage nodes 900 constitute a distributed storage unit 90. The storage system 2 can be expanded in capacity and performance by adding or replacing the storage node 900.

また、図６に示すように、アクセス装置２００は、選択部２１と、書き込み部２２と、変換部２３とを有する。また、ストレージノード９００は、データ格納部９１と、メタデータ格納部９２と、データ処理部９３とを有する。 As illustrated in FIG. 6, the access device 200 includes a selection unit 21, a writing unit 22, and a conversion unit 23. The storage node 900 includes a data storage unit 91, a metadata storage unit 92, and a data processing unit 93.

ここで、ストレージシステム２を構成する各装置は、図７に示すようなハードウェア要素によって構成可能である。図７において、アクセス装置２００は、ＣＰＵ２００１、メモリ２００２、記憶装置２００４、および、ネットワークインタフェース２００５を含む。この場合、選択部２１、書き込み部２２および変換部２３は、ネットワークインタフェース２００５と、メモリ２００２または記憶装置２００４に格納されるコンピュータ・プログラムを読み込んで実行するＣＰＵ２００１とによって構成される。また、ストレージノード９００は、ＣＰＵ９００１、メモリ９００２、記憶装置９００４、および、ネットワークインタフェース９００５を含む。この場合、データ格納部９１およびメタデータ格納部９２は、記憶装置９００４によって構成される。また、データ処理部９３は、ネットワークインタフェース９００５と、メモリ９００２または記憶装置９００４に格納されるコンピュータ・プログラムを読み込んで実行するＣＰＵ９００１とによって構成される。なお、図７には、３つのストレージノード９００が示されているが、本実施の形態に含まれるストレージノード９００の数は、限定されない。また、図７には、１つのアクセス装置２００が示されているが、本実施の形態に含まれるアクセス装置２００の数は、限定されない。また、ストレージシステム２を構成する各装置およびその各機能ブロックのハードウェア構成は、上述の構成に限定されない。 Here, each device constituting the storage system 2 can be configured by hardware elements as shown in FIG. In FIG. 7, the access device 200 includes a CPU 2001, a memory 2002, a storage device 2004, and a network interface 2005. In this case, the selection unit 21, the writing unit 22, and the conversion unit 23 are configured by a network interface 2005 and a CPU 2001 that reads and executes a computer program stored in the memory 2002 or the storage device 2004. The storage node 900 includes a CPU 9001, a memory 9002, a storage device 9004, and a network interface 9005. In this case, the data storage unit 91 and the metadata storage unit 92 are configured by the storage device 9004. The data processing unit 93 includes a network interface 9005 and a CPU 9001 that reads and executes a computer program stored in the memory 9002 or the storage device 9004. Although three storage nodes 900 are shown in FIG. 7, the number of storage nodes 900 included in the present embodiment is not limited. FIG. 7 shows one access device 200, but the number of access devices 200 included in the present embodiment is not limited. Further, the hardware configuration of each device and each functional block constituting the storage system 2 is not limited to the above-described configuration.

次に、ストレージノード９００に含まれる各機能ブロックについて説明する。 Next, each functional block included in the storage node 900 will be described.

データ格納部９１は、冗長処理が施されたデータブロックを構成するフラグメントを格納する。フラグメントは、前述のように、オリジナルフラグメントおよびパリティフラグメントの総称である。複数のストレージノード９００のデータ格納部９１により、冗長処理が施されたデータブロックは、分散して記憶される。 The data storage unit 91 stores fragments that constitute a data block that has been subjected to redundancy processing. As described above, "fragment" is a generic term for original fragments and parity fragments. A data block subjected to redundancy processing is stored in a distributed manner across the data storage units 91 of the plurality of storage nodes 900.

メタデータ格納部９２は、分散記憶部９０によって記憶されるデータブロックに関するメタデータを記憶する。メタデータには、データブロックの冗長度およびその内容の同一性を判定するための情報が含まれる。これらの情報は、選択部２１または変換部２３によって参照される。 The metadata storage unit 92 stores metadata regarding data blocks stored by the distributed storage unit 90. The metadata includes information for determining the redundancy of data blocks and the identity of their contents. These pieces of information are referred to by the selection unit 21 or the conversion unit 23.

また、メタデータには、書き込みデータまたは書き込みデータのグループ毎に設定された冗長度の要求値が含まれていてもよい。本発明の第１の実施の形態では、冗長度の要求値は、ファイルシステム毎に共通する値が設定されているものとして説明していた。本実施の形態では、冗長度の要求値は、書き込みデータまたは書き込みデータのグループ毎に個別に設定され得るものとする。なお、該当する書き込みデータに適用される個別の要求値がなければ、共通して設定された要求値が適用されるものとする。また、共通して設定される要求値は、アクセス装置２００のメモリ２００２または記憶装置２００４に格納されていてもよい。 Further, the metadata may include a request value for redundancy set for each write data or write data group. In the first embodiment of the present invention, it has been described that the required value of the redundancy is set to a common value for each file system. In the present embodiment, it is assumed that the required value of redundancy can be set individually for each write data or group of write data. If there is no individual required value applied to the corresponding write data, the commonly set required value is applied. In addition, the commonly set request value may be stored in the memory 2002 or the storage device 2004 of the access device 200.
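
The fallback from an individually set required value to the commonly set one can be sketched as below. The description states only that an individual value applies when one exists and that the common value applies otherwise; checking the per-write-data value before the per-group value is an assumption, and all names are hypothetical.

```python
from typing import Optional


def required_redundancy(file_id: str, group_id: Optional[str],
                        per_file: dict, per_group: dict, common: int) -> int:
    """Resolve the required redundancy value applied to one piece of write data."""
    if file_id in per_file:
        return per_file[file_id]      # value set for this write data
    if group_id is not None and group_id in per_group:
        return per_group[group_id]    # value set for its group
    return common                     # commonly set value
```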

その他、メタデータには、書き込みデータの読み書きに必要となる各種の情報が含まれる。例えば、メタデータには、書き込みデータを構成する各データブロックの論理アドレスを表す情報が含まれていてもよい。また、メタデータには、そのデータブロックを構成するフラグメントの物理アドレスを表す情報が含まれていてもよい。また、メタデータには、各データブロックの参照数を表す情報が含まれていてもよい。参照数とは、そのデータブロックの参照元となっている書き込みデータの個数である。 In addition, the metadata includes various kinds of information necessary for reading and writing the write data. For example, the metadata may include information indicating the logical address of each data block constituting the write data. Further, the metadata may include information indicating the physical addresses of the fragments that constitute the data block. Further, the metadata may include information indicating the number of references of each data block. The reference number is the number of write data that is a reference source of the data block.

メタデータ格納部９２に記憶されるメタデータの一例を図８および図９に示す。 Examples of the metadata stored in the metadata storage unit 92 are shown in FIGS. 8 and 9.

図８は、書き込みデータを構成する各データブロックの論理アドレスを含むメタデータの一例である。この例では、書き込みデータは、ファイルまたはディレクトリとして表される。また、ファイルまたはディレクトリは階層的に管理されている。ディレクトリ「Root dir fid = 1」やディレクトリ「dirA fid = 3」についてのメタデータは、その配下にあるファイルまたはディレクトリの名称およびｆｉｄを含む。ｆｉｄは、ファイルまたはディレクトリを識別する情報である。また、ファイル「fileA fid = 2」についてのメタデータは、そのファイルを構成する各データブロックのオフセットおよび論理アドレスを含む。オフセットは、そのファイルにおけるそのデータブロックの開始位置を表す。 FIG. 8 is an example of metadata including the logical address of each data block constituting write data. In this example, the write data is represented as a file or a directory, and files and directories are managed hierarchically. The metadata for the directory "Root dir fid = 1" and the directory "dirA fid = 3" includes the names and fids of the files or directories under that directory. The fid is information for identifying a file or a directory. The metadata for the file "fileA fid = 2" includes the offset and logical address of each data block constituting the file. The offset represents the starting position of the data block within the file.

図９は、データブロックの冗長度および同一性を判定するための情報を含むメタデータの一例である。この例では、各データブロックの論理アドレスに関連付けてメタデータが記憶されている。ここでは、冗長度は、データブロックに対して付加されたパリティフラグメントの個数によって表されている。また、ハッシュ値は、データブロックの内容の同一性を判定するための情報の一例である。その場合、ハッシュ値は、ＭＤ５（Message Digest Algorithm 5）やＳＨＡ−２５６（Secure Hash Algorithm 256-bit）などのように、非可逆のハッシュ関数を用いて算出される値であることが望ましい。また、参照数は、前述したように、このデータブロックを参照する書き込みデータ（ファイル）の個数を表す。また、物理アドレスは、データブロックを構成する各フラグメントの物理アドレスを“，”で区切って表している。この例では、各フラグメントの物理アドレスは、ストレージノード９００を特定する情報およびそのデータ格納部９１上での物理的な格納位置を表す情報からなる。 FIG. 9 is an example of metadata including information for determining the redundancy and identity of data blocks. In this example, metadata is stored in association with the logical address of each data block. Here, the redundancy is represented by the number of parity fragments added to the data block. The hash value is an example of information for determining the identity of the contents of the data block. In this case, the hash value is preferably a value calculated using an irreversible hash function such as MD5 (Message Digest Algorithm 5) and SHA-256 (Secure Hash Algorithm 256-bit). The reference number represents the number of write data (files) referring to the data block as described above. The physical address represents the physical address of each fragment constituting the data block separated by “,”. In this example, the physical address of each fragment includes information for specifying the storage node 900 and information indicating the physical storage position on the data storage unit 91.
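
The per-block metadata of FIG. 9 could be held in a record like the one sketched below. The field names are hypothetical, and the textual form `node:position` used for each fragment's physical address is an assumption made for illustration; the description says only that the fragment addresses are separated by "," and that each consists of information identifying a storage node and a position in its data storage unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FragmentAddress:
    node_id: str   # identifies the storage node
    position: int  # storage position within that node's data storage unit


@dataclass
class BlockMetadata:
    logical_address: int
    redundancy: int        # number of parity fragments
    hash_value: str        # used to judge identity of content
    reference_count: int   # number of files referring to this block
    fragments: list


def parse_physical_addresses(text: str) -> list:
    """Parse 'node:position' entries separated by ',' (assumed format)."""
    addresses = []
    for item in text.split(","):
        node_id, position = item.strip().split(":")
        addresses.append(FragmentAddress(node_id, int(position)))
    return addresses
```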

なお、複数のストレージノード９００のメタデータ格納部９２は、ストレージシステム２に記憶される全てのデータブロックについてのメタデータをそれぞれが記憶していてもよい。あるいは、複数のストレージノード９００のいずれかのメタデータ格納部９２が、ストレージシステム２に記憶される全てのデータブロックについてのメタデータを一括して記憶していてもよい。ただし、この場合、他のストレージノード９００により、メタデータ格納部９２がバックアップされていることが望ましい。あるいは、複数のストレージノード９００のメタデータ格納部９２は、ストレージシステム２に記憶されるデータブロックについてのメタデータを分散して記憶していてもよい。 Note that the metadata storage units 92 of the plurality of storage nodes 900 may each store metadata for all data blocks stored in the storage system 2. Alternatively, any one of the metadata storage units 92 of the plurality of storage nodes 900 may collectively store metadata for all data blocks stored in the storage system 2. However, in this case, it is desirable that the metadata storage unit 92 is backed up by another storage node 900. Alternatively, the metadata storage units 92 of the plurality of storage nodes 900 may store metadata about data blocks stored in the storage system 2 in a distributed manner.

データ処理部９３は、データ格納部９１またはメタデータ格納部９２に格納される情報を、アクセス装置２００から受信して書き込む。また、データ処理部９３は、アクセス装置２００からの要求に基づき、データ格納部９１またはメタデータ格納部９２に格納済みの情報を読み出してアクセス装置２００に送信する。 The data processing unit 93 receives, from the access device 200, information to be stored in the data storage unit 91 or the metadata storage unit 92, and writes it. In addition, in response to a request from the access device 200, the data processing unit 93 reads information already stored in the data storage unit 91 or the metadata storage unit 92 and transmits it to the access device 200.

次に、アクセス装置２００に含まれる各機能ブロックについて説明する。 Next, each functional block included in the access device 200 will be described.

選択部２１は、本発明の第１の実施の形態における選択部１１と略同様に構成されるが、書き込みの対象とするデータブロックを選択する処理の詳細が異なる。選択部２１は、同一内容のデータブロックが分散記憶部９０に記憶済みで且つその冗長度が要求値を満たしていないデータブロックの中から、所定量のデータブロックを抽出し、書き込みの対象として選択する。 The selection unit 21 is configured in substantially the same manner as the selection unit 11 in the first embodiment of the present invention, but the details of the process of selecting data blocks to be written are different. From among the data blocks for which a data block having the same content is already stored in the distributed storage unit 90 and whose redundancy does not satisfy the required value, the selection unit 21 extracts a predetermined amount of data blocks and selects them as write targets.

例えば、選択部２１は、書き込みデータを構成するデータブロックを、記憶済みのデータブロックとの内容の同一性の検証および冗長度の検証を行うことにより、次の３種類に分類する。第１の種類は、同一内容のデータブロックが分散記憶部９０に記憶されていないデータブロックである。第２の種類は、同一内容のデータブロックが分散記憶部９０に記憶済みであり、かつ、その冗長度が要求値を満たすデータブロックである。第３の種類は、同一内容のデータブロックが分散記憶部９０に記憶済みであり、かつ、その冗長度が要求値を満たさないデータブロックである。 For example, the selection unit 21 classifies the data blocks constituting the write data into the following three types by verifying the identity of the contents with the stored data blocks and verifying the redundancy. The first type is a data block in which data blocks having the same content are not stored in the distributed storage unit 90. The second type is a data block in which data blocks having the same contents have already been stored in the distributed storage unit 90 and the redundancy satisfies the required value. The third type is a data block in which data blocks having the same contents have already been stored in the distributed storage unit 90 and the redundancy does not satisfy the required value.

ここで、選択部２１は、各データブロックについての内容の同一性の検証を、次のようにして実行する。すなわち、選択部２１は、各データブロックについて同一性を判定するための情報を算出する。そして、選択部２１は、算出した同一性を判定するための情報が、メタデータ格納部９２に既に記憶されているか否かに基づいて、同一内容のデータブロックが分散記憶部９０に記憶済みであるか否かを判定する。 Here, the selection unit 21 verifies the identity of the content of each data block as follows. The selection unit 21 calculates, for each data block, information for determining identity. Then, based on whether or not the calculated information is already stored in the metadata storage unit 92, the selection unit 21 determines whether or not a data block having the same content is already stored in the distributed storage unit 90.

また、選択部２１は、各データブロックについての冗長度の検証を、次のようにして実行する。すなわち、選択部２１は、各データブロックについて、同一内容のデータブロックが記憶済みであれば、その冗長度を、メタデータ格納部９２から取得する。また、選択部２１は、そのデータブロックを含む書き込みデータに適用される冗長度の要求値を取得する。書き込みデータに適用される冗長度は、書き込みデータまたはその属するグループについて個別に設定された要求値、または、共通に設定された要求値である。そして、選択部２１は、同一内容のデータブロックの冗長度が要求値を満たすか否かを判定する。これにより、選択部２１は、書き込みデータを構成する各データブロックを、上述の３種類に分類することができる。例えば、選択部２１は、データブロックをこのように３種類に分類した情報を、メモリ１００２または記憶装置１００４に一時的に記憶することにより、以下の選択処理を行えばよい。 Moreover, the selection part 21 performs the verification of the redundancy about each data block as follows. That is, for each data block, if the data block having the same content has already been stored, the selection unit 21 acquires the redundancy from the metadata storage unit 92. Further, the selection unit 21 acquires a required value of redundancy applied to write data including the data block. The redundancy applied to the write data is a request value set individually for the write data or a group to which the write data belongs, or a request value set in common. Then, the selection unit 21 determines whether or not the redundancy of data blocks having the same content satisfies the required value. Thereby, the selection part 21 can classify each data block which comprises write data into the above-mentioned three types. For example, the selection unit 21 may perform the following selection process by temporarily storing the information obtained by classifying the data block into the three types in the memory 1002 or the storage device 1004.
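
The classification into three types can be sketched as follows. SHA-256 is used as the information for determining identity, and "satisfies" is read as "greater than or equal to"; both are choices made for this illustration. The metadata storage unit is modelled as a dictionary from hash value to stored redundancy, and the names are hypothetical.

```python
import hashlib
from enum import Enum


class Kind(Enum):
    NOT_STORED = 1          # first type: no identical block is stored
    STORED_SATISFIED = 2    # second type: stored, redundancy is sufficient
    STORED_UNSATISFIED = 3  # third type: stored, redundancy is insufficient


def classify(block: bytes, stored_redundancy: dict, required: int) -> Kind:
    """Classify one data block against the stored metadata."""
    digest = hashlib.sha256(block).hexdigest()
    current = stored_redundancy.get(digest)
    if current is None:
        return Kind.NOT_STORED
    if current >= required:
        return Kind.STORED_SATISFIED
    return Kind.STORED_UNSATISFIED
```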

そして、選択部２１は、まず、第１の種類のデータブロックを、書き込みの対象として選択する。さらに、選択部２１は、第３の種類のデータブロックから、所定量のデータブロックを、書き込みの対象として選択する。所定量とは、データブロックの個数であってもよい。また、所定量とは、第３の種類のデータブロックの個数に対する割合であってもよい。また、所定量とは、書き込みデータを構成するデータブロックの個数に対する所定の割合であってもよい。また、所定量とは、データブロックの合計サイズであってもよい。また、所定量とは、第３の種類のデータブロックの合計サイズに対する所定の割合であってもよい。また、所定量とは、書き込みデータのサイズに対する所定の割合であってもよい。その他、所定量とは、選択されるデータブロックの量を表すものであれば、その他の量であってもよい。 Then, the selection unit 21 first selects the first type of data block as a write target. Furthermore, the selection unit 21 selects a predetermined amount of data blocks from among the third type data blocks as a write target. The predetermined amount may be the number of data blocks. The predetermined amount may be a ratio to the number of the third type data blocks. The predetermined amount may be a predetermined ratio with respect to the number of data blocks constituting the write data. The predetermined amount may be a total size of data blocks. The predetermined amount may be a predetermined ratio with respect to the total size of the third type data block. Further, the predetermined amount may be a predetermined ratio with respect to the size of the write data. In addition, the predetermined amount may be another amount as long as it represents the amount of data blocks to be selected.

また、選択部２１は、所定量のデータブロックを選択する基準として、様々な基準を適用可能である。例えば、選択部２１は、書き込みデータを構成する順に並べたデータブロックの先頭から順番に、第３の種類のデータブロックを所定量まで選択してもよい。 Further, the selection unit 21 can apply various criteria as criteria for selecting a predetermined amount of data blocks. For example, the selection unit 21 may select up to a predetermined amount of the third type of data blocks in order from the top of the data blocks arranged in the order of constituting the write data.

所定量のデータブロックの選択処理を模式的に図１０に示す。図１０において、矩形は、書き込みデータが分割されたデータブロックを表す。また、灰色に塗りつぶされた矩形は、選択されたデータブロックを表す。ｋ個のデータブロックは、書き込みデータの先頭から順に１番目からｋ番目まで並んでいる。また、矩形の中の数字は、上述の分類を表している。つまり、１を囲む矩形は、同一内容のデータブロックが記憶されていない第１の種類のデータブロックを表す。また、２を囲む矩形は、同一内容のデータブロックが記憶されており、且つ、その冗長度が要求値を満たしている第２の種類のデータブロックを表している。また、３を囲む矩形は、同一内容のデータブロックが記憶されており、且つ、その冗長度が要求値を満たしていない第３の種類のデータブロックを表している。このとき、第１の種類のデータブロックは、全て選択される。また、所定量および選択基準として「先頭からＭ個」が定められているとすると、第３の種類のデータブロックは、先頭からＭ個までが選択され、Ｍ＋１個目以降は選択されない。 A selection process of a predetermined amount of data blocks is schematically shown in FIG. In FIG. 10, a rectangle represents a data block into which write data is divided. Moreover, the rectangle filled with gray represents the selected data block. The k data blocks are arranged from the first to the kth in order from the top of the write data. The numbers in the rectangles represent the above-mentioned classification. That is, a rectangle surrounding 1 represents a first type of data block in which data blocks having the same contents are not stored. The rectangle surrounding 2 represents a second type of data block in which data blocks having the same contents are stored and the redundancy satisfies the required value. The rectangle surrounding 3 represents a third type of data block in which data blocks having the same contents are stored and the redundancy does not satisfy the required value. At this time, all the first type data blocks are selected. Assuming that “M from the top” is defined as the predetermined amount and the selection criterion, the M data blocks from the top are selected for the third type of data block, and the M + 1 and subsequent data blocks are not selected.
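
The selection illustrated in FIG. 10 can be sketched as below, assuming the predetermined amount is a count M and the criterion is "from the top". The input is the list of type numbers (1, 2, or 3) of the data blocks in write-data order; the function name is hypothetical.

```python
def select_targets(kinds: list, m: int) -> list:
    """Return the indices of the data blocks selected as write targets."""
    selected = []
    taken = 0  # number of third-type blocks selected so far
    for index, kind in enumerate(kinds):
        if kind == 1:
            selected.append(index)      # every first-type block is selected
        elif kind == 3 and taken < m:
            selected.append(index)      # third type: only the first M
            taken += 1
    return selected
```

For the sequence 1, 3, 2, 3, 1, 3 with M = 2, the blocks at positions 0, 1, 3, and 4 are selected, and the last third-type block is left for the separate conversion process.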

なお、選択部２１は、書き込みデータを構成するデータブロックの全てについて内容の同一性および冗長度の検証を終えてから選択を行わなくてもよい。つまり、選択部２１は、書き込みデータを構成するデータブロックについて同一性および冗長度の検証を行いながら、任意のタイミングで上述した選択の基準により所定量のデータブロックの選択を行ってもよい。例えば、選択部２１は、書き込みデータを構成するデータブロックの先頭から順に、事前に定められた個数のデータブロックについて同一性および冗長度の検証を終える毎に、上述した選択基準によりデータブロックを選択してもよい。あるいは、選択部２１は、書き込みデータを構成するデータブロックの先頭から順に同一性および冗長度の検証を実行しながら所定期間が経過する毎に、上述した選択基準によりデータブロックを選択してもよい。そして、この場合、選択部２１は、選択処理を行う度に、選択したデータブロックを後述の書き込み部２２に通知してもよい。 Note that the selection unit 21 need not wait until the identity of content and the redundancy have been verified for all the data blocks constituting the write data before performing the selection. In other words, the selection unit 21 may select a predetermined amount of data blocks according to the above-described selection criterion at any timing while it is still verifying the identity and redundancy of the data blocks constituting the write data. For example, the selection unit 21 may select data blocks according to the above-described selection criterion each time it finishes verifying the identity and redundancy of a predetermined number of data blocks, in order from the top of the data blocks constituting the write data. Alternatively, the selection unit 21 may select data blocks according to the above-described selection criterion each time a predetermined period elapses while it verifies identity and redundancy in order from the top of the data blocks constituting the write data. In this case, the selection unit 21 may notify the writing unit 22, described later, of the selected data blocks each time it performs the selection process.

書き込み部２２は、本発明の第１の実施の形態における書き込み部１２と略同様に構成される。ただし、本実施の形態では、書き込みデータに冗長処理を施して生成したオリジナルフラグメントおよびパリティフラグメントを分散して記憶させるために、ストレージノード９００に送信する点が異なる。いずれのフラグメントをいずれの、ストレージノード９００に送信するかについては、例えば、ラウンドロビンにより決定する手法が適用可能である。その他、いずれのフラグメントをいずれのストレージノード９００に送信するかについては、各種の公知の技術を適用すればよい。 The writing unit 22 is configured in substantially the same manner as the writing unit 12 in the first embodiment of the present invention. However, the present embodiment is different in that the original fragment and the parity fragment generated by performing the redundancy process on the write data are transmitted to the storage node 900 in order to be distributed and stored. As for which fragment is transmitted to which storage node 900, for example, a method of determining by round robin is applicable. In addition, various known techniques may be applied as to which fragment is transmitted to which storage node 900.
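
Round-robin placement of fragments on storage nodes, mentioned above as one applicable method, can be sketched as follows. The function name and the `start` parameter (the node at which assignment begins) are hypothetical.

```python
def assign_round_robin(fragment_count: int, nodes: list, start: int = 0) -> list:
    """Return, for each fragment in order, the node it is sent to."""
    if not nodes:
        raise ValueError("at least one storage node is required")
    return [nodes[(start + i) % len(nodes)] for i in range(fragment_count)]
```

With three nodes and five fragments, the fragments go to the first, second, third, first, and second node in turn. When there are fewer nodes than fragments, several fragments of one block share a node, which reduces the protection the parity fragments provide against a node failure.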

なお、書き込み部２２は、選択部２１によって書き込みの対象となるデータブロックが選択される度に通知される場合、通知される度に、通知されたデータブロックに対して冗長処理を施して、ストレージノード９００に送信すればよい。 Note that, when the writing unit 22 is notified each time the selection unit 21 selects data blocks to be written, it may, on each notification, apply redundancy processing to the notified data blocks and transmit them to the storage nodes 900.

変換部２３は、本発明の第１の実施の形態における変換部１３と略同様に構成されるが、変換処理の速度を抑制するための手法の詳細が異なる。変換部２３は、該当するデータブロックそれぞれに対する変換処理を、所定のタイミング毎に実行することにより、処理速度を抑制する。 The conversion unit 23 is configured in substantially the same manner as the conversion unit 13 in the first embodiment of the present invention, but the details of the technique for suppressing the speed of the conversion process are different. The conversion unit 23 suppresses the processing speed by executing conversion processing for each corresponding data block at each predetermined timing.

また、変換部２３は、分散記憶部９０に記憶されたデータブロックのうち、変換処理の対象となるデータブロックを抽出するために、ストレージノード９００のメタデータ格納部９２を参照する。具体的には、変換部２３は、メタデータ格納部９２に格納された各データブロックの冗長度と、そのデータブロックを参照する書き込みデータに適用される冗長度の要求値とを比較する。そして、変換部２３は、冗長度が要求値を満たさないデータブロックを、冗長度の変換処理の対象として抽出すればよい。 In addition, the conversion unit 23 refers to the metadata storage unit 92 of the storage node 900 in order to extract a data block to be converted from data blocks stored in the distributed storage unit 90. Specifically, the conversion unit 23 compares the redundancy of each data block stored in the metadata storage unit 92 with the required value of the redundancy applied to the write data that refers to the data block. Then, the conversion unit 23 may extract a data block whose redundancy does not satisfy the required value as a target of conversion processing for redundancy.
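
Extraction of conversion targets from the metadata might be sketched as below. Because of deduplication, one data block can be referenced by several pieces of write data with different required values; taking the largest of those values as the target is an assumption, since the description does not address this case. Blocks with no referrer are skipped, and all names are hypothetical.

```python
def conversion_targets(block_redundancy: dict, referrers: dict,
                       required_by_file: dict, common: int) -> dict:
    """Map each block needing conversion to its target redundancy."""
    targets = {}
    for address, current in block_redundancy.items():
        # Required values of every piece of write data referring to the block.
        needs = [required_by_file.get(file_id, common)
                 for file_id in referrers.get(address, [])]
        if not needs:
            continue  # unreferenced block: nothing to compare against
        target = max(needs)
        if current < target:
            targets[address] = target
    return targets
```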

以上のように構成されたストレージシステム２の動作を、図面を参照して説明する。 The operation of the storage system 2 configured as described above will be described with reference to the drawings.

まず、書き込みデータを外部から受信した際のストレージシステム２の動作を図１１に示す。なお、ここでは、同一性を判定するための情報として、ハッシュ値を採用している。また、所定量として「Ｍ個」が定められ、「先頭から」という選択基準が定められているものとする。 First, FIG. 11 shows the operation of the storage system 2 when write data is received from the outside. Here, a hash value is adopted as information for determining identity. Further, it is assumed that “M” is defined as the predetermined amount and the selection criterion “from the top” is defined.

ここでは、まず、選択部２１は、書き込みデータについて適用される冗長度の要求値を取得する（ステップＡ１１）。 Here, first, the selection unit 21 acquires a required value of redundancy applied to write data (step A11).

前述のように、選択部２１は、書き込みデータまたはそのグループについて個別に設定された要求値または共通に設定された要求値を取得する。 As described above, the selection unit 21 acquires a request value individually set for the write data or the group or a request value set in common.

次に、選択部２１は、書き込みデータをデータブロックに分割する（ステップＡ１２）。 Next, the selection unit 21 divides the write data into data blocks (step A12).

次に、選択部２１は、各データブロックについて、ハッシュ値を算出する（ステップＡ１３）。 Next, the selection unit 21 calculates a hash value for each data block (step A13).

次に、選択部２１は、各データブロックについて、内容の同一性の検証および冗長度の検証を行う（ステップＡ１４）。 Next, the selection unit 21 verifies the identity of contents and the redundancy for each data block (step A14).

具体的には、選択部２１は、各データブロックについて算出したハッシュ値と同一のハッシュ値が、メタデータ格納部９２に格納されているか否かを判断する。また、選択部２１は、同一のハッシュ値のデータブロックの冗長度を、メタデータ格納部９２から取得する。そして、選択部２１は、同一のハッシュ値が格納されていないデータブロックを、第１の種類に分類する。また、選択部２１は、同一のハッシュ値が格納されており、その冗長度が要求値を満たしているデータブロックを、第２の種類に分類する。また、選択部２１は、同一のハッシュ値が格納されており、その冗長度が要求値を満たさないデータブロックを、第３の種類に分類する。前述のように、冗長度が要求値を満たすとは、冗長度が要求値に一致することであってもよいし、冗長度が要求値以上であることであってもよい。 Specifically, the selection unit 21 determines whether a hash value identical to the hash value calculated for each data block is stored in the metadata storage unit 92. The selection unit 21 also acquires, from the metadata storage unit 92, the redundancy of the stored data block having the same hash value. The selection unit 21 then classifies a data block for which no identical hash value is stored as the first type. It classifies a data block for which an identical hash value is stored and whose redundancy satisfies the required value as the second type. It classifies a data block for which an identical hash value is stored but whose redundancy does not satisfy the required value as the third type. As described above, "the redundancy satisfies the required value" may mean that the redundancy equals the required value, or that the redundancy is equal to or higher than the required value.

次に、選択部２１は、第１の種類に分類したデータブロックを選択する。つまり、選択部２１は、ストレージノード９００に記憶されていないデータブロックを全て選択する（ステップＡ１５）。 Next, the selection unit 21 selects the data block classified into the first type. That is, the selection unit 21 selects all the data blocks that are not stored in the storage node 900 (step A15).

次に、選択部２１は、第３の種類に分類したデータブロックから、所定量のデータブロックを選択する（ステップＡ１６）。 Next, the selection unit 21 selects a predetermined amount of data blocks from the data blocks classified into the third type (step A16).

ここでは、前述のように、選択部２１は、書き込みデータの先頭から順に、第３の種類のデータブロックを所定の個数まで選択する。 Here, as described above, the selection unit 21 selects up to a predetermined number of third-type data blocks in order from the top of the write data.
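
Steps A12 to A16 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: fixed-size blocks, SHA-256 as the hash, a plain dictionary standing in for the metadata storage unit 92, and "satisfies" read as "equal to or higher than". The names `split_blocks`, `select_blocks`, and `Selection` are hypothetical and do not appear in the specification.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Selection:
    first: list     # indexes of blocks with no stored copy (first type)
    second: list    # stored, redundancy satisfies the required value (second type)
    third: list     # stored, redundancy below the required value (third type)
    to_write: list  # indexes chosen as write targets


def split_blocks(data: bytes, block_size: int) -> list:
    """Step A12: divide write data into blocks (fixed size is an assumption)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def select_blocks(blocks, stored_redundancy, required, m):
    """Steps A13 to A16.

    stored_redundancy maps a block hash to the redundancy of the stored copy;
    it stands in for the metadata storage unit. m is the predetermined number
    of third-type blocks taken from the beginning of the write data.
    """
    first, second, third = [], [], []
    for index, block in enumerate(blocks):
        digest = hashlib.sha256(block).hexdigest()   # step A13
        if digest not in stored_redundancy:          # step A14
            first.append(index)
        elif stored_redundancy[digest] >= required:
            second.append(index)
        else:
            third.append(index)
    # Step A15: every first-type block. Step A16: the first m third-type blocks.
    to_write = sorted(first + third[:m])
    return Selection(first, second, third, to_write)
```

With four blocks of which one is new and two are stored at too low a redundancy, `m=1` selects the new block and only the earliest under-redundant block; the remaining under-redundant block is left for the separate conversion process.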

次に、書き込み部２２は、ステップＡ１５およびＡ１６で選択されたデータブロックについて、要求値を満たす冗長度に基づく冗長処理を施す。具体的には、書き込み部２２は、要求値を満たす冗長度に基づいて、各データブロックを分割してオリジナルフラグメントを生成し、パリティフラグメントを生成する（ステップＡ１７）。 Next, the writing unit 22 performs redundancy processing based on the redundancy satisfying the requested value for the data block selected in steps A15 and A16. Specifically, the writing unit 22 divides each data block based on the redundancy satisfying the required value, generates an original fragment, and generates a parity fragment (step A17).

そして、書き込み部２２は、オリジナルフラグメントおよびパリティフラグメントを含む情報を、複数のストレージノード９００に分散して送信する（ステップＡ１８）。また、このとき、書き込み部２２は、このデータブロックのハッシュ値および冗長度等を含むメタデータを、これらのフラグメントと共に送信する。 Then, the writing unit 22 distributes and transmits information including the original fragment and the parity fragment to the plurality of storage nodes 900 (step A18). At this time, the writing unit 22 transmits metadata including the hash value and redundancy of the data block together with these fragments.
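
The specification does not name a particular erasure code for step A17. As a minimal sketch only, the following Python code assumes the simplest possible case: a block is split into `k` original fragments and a single XOR parity fragment, which tolerates the loss of any one fragment. A higher redundancy would need a stronger code (for example, a Reed-Solomon code). The function names are hypothetical.

```python
def make_fragments(block: bytes, k: int):
    """Split a block into k original fragments plus one XOR parity fragment."""
    frag_len = -(-len(block) // k)                 # ceiling division
    padded = block.ljust(frag_len * k, b"\0")      # zero-pad the last fragment
    originals = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = bytearray(frag_len)
    for fragment in originals:
        for j, byte in enumerate(fragment):
            parity[j] ^= byte
    return originals, bytes(parity)


def recover(originals, parity, missing: int) -> bytes:
    """Rebuild the original fragment at index `missing` from the others.

    The entry at originals[missing] is ignored, so it may hold any placeholder.
    """
    rebuilt = bytearray(parity)
    for index, fragment in enumerate(originals):
        if index != missing:
            for j, byte in enumerate(fragment):
                rebuilt[j] ^= byte
    return bytes(rebuilt)
```

In this sketch each fragment would be sent to a different storage node, so that the block can still be restored when one node is unavailable.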

以上で、アクセス装置２００が、書き込みデータを外部から受信した際の動作の説明を終了する。 This is the end of the description of the operation when the access device 200 receives write data from the outside.

次に、ストレージノード９００が、フラグメントを受信した際の動作を図１２に示す。 Next, FIG. 12 shows an operation when the storage node 900 receives a fragment.

図１２において、まず、データ処理部９３は、フラグメントをアクセス装置２００から受信する（ステップＣ１）。 In FIG. 12, first, the data processing unit 93 receives a fragment from the access device 200 (step C1).

なお、受信されるフラグメントは、１つまたは複数である。また、受信されるフラグメントは、オリジナルフラグメントおよびパリティフラグメントのどちらか１種類で構成される場合もあるし、両方の種類で構成される場合もある。 Note that one or more fragments are received. In addition, the received fragment may be composed of either one of the original fragment and the parity fragment, or may be composed of both types.

次に、データ処理部９３は、受信したフラグメントを、データ格納部９１に格納する（ステップＣ２）。 Next, the data processing unit 93 stores the received fragment in the data storage unit 91 (step C2).

次に、データ処理部９３は、受信したフラグメントを含むデータブロックのメタデータを更新する（ステップＣ３）。 Next, the data processing unit 93 updates the metadata of the data block including the received fragment (step C3).

例えば、メタデータ格納部９２に、図８および図９に示した情報が格納されているとする。もし、受信したフラグメントを含むデータブロックが新規に書き込まれたものであれば、データ処理部９３は、新たなデータブロックに関するハッシュ値、冗長度および物理アドレスを含む情報を、図９に示したようなメタデータとして追加して記憶する。また、受信したフラグメントを含むデータブロックが、以前と同一内容で冗長度が変換されたものであれば、データ処理部９３は、該当するデータブロックの物理アドレスおよび冗長度を更新する。また、受信したフラグメントを含むデータブロックによって構成される書き込みデータが新規に書き込まれたものであれば、データ処理部９３は、新たな書き込みデータに関する情報を、図８に示したようなメタデータとして追加して記憶する。また、受信したフラグメントを含むデータブロックによって構成される書き込みデータが更新されたものであれば、データ処理部９３は、該当する書き込みデータのメタデータを更新する。 For example, assume that the information shown in FIGS. 8 and 9 is stored in the metadata storage unit 92. If the data block containing the received fragment has been newly written, the data processing unit 93 adds and stores information on the new data block, including its hash value, redundancy, and physical address, as metadata of the kind shown in FIG. 9. If the data block containing the received fragment has the same content as before but a converted redundancy, the data processing unit 93 updates the physical address and the redundancy of that data block. If the write data composed of the data block containing the received fragment has been newly written, the data processing unit 93 adds and stores information on the new write data as metadata of the kind shown in FIG. 8. If that write data has been updated, the data processing unit 93 updates the metadata of the corresponding write data.
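
The four cases of step C3 can be sketched as two "add or update" operations. This Python sketch is an assumption-laden illustration: the class `MetadataStore`, its field names, and the idea that write-data metadata is a list of block hashes are hypothetical and are not taken from FIGS. 8 and 9.

```python
class MetadataStore:
    """Hypothetical stand-in for the metadata storage unit 92."""

    def __init__(self):
        self.blocks = {}      # block hash -> {"redundancy": ..., "address": ...}
        self.write_data = {}  # write data name -> ordered list of block hashes

    def record_block(self, digest, redundancy, address):
        """Add metadata for a new block, or update a re-encoded block."""
        is_new = digest not in self.blocks
        self.blocks[digest] = {"redundancy": redundancy, "address": address}
        return "added" if is_new else "updated"

    def record_write_data(self, name, block_hashes):
        """Add metadata for new write data, or update existing write data."""
        is_new = name not in self.write_data
        self.write_data[name] = list(block_hashes)
        return "added" if is_new else "updated"
```

Keying block metadata by hash means a block re-encoded at a higher redundancy overwrites its old entry, so later writes of the same content see the new redundancy.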

以上で、ストレージノード９００は、フラグメントを受信した際の動作を終了する。 Thus, the storage node 900 ends the operation when receiving the fragment.

次に、ストレージシステム２の冗長度の変換処理を図１３に示す。 Next, the redundancy conversion processing of the storage system 2 is shown in FIG.

図１３では、まず、変換部２３は、分散記憶部９０に記憶されたデータブロックの１つについてその冗長度を、メタデータ格納部９２から取得する（ステップＢ１１）。 In FIG. 13, first, the conversion unit 23 acquires the redundancy of one of the data blocks stored in the distributed storage unit 90 from the metadata storage unit 92 (step B11).

次に、変換部２３は、そのデータブロックを含む書き込みデータについて適用される冗長度の要求値を取得する（ステップＢ１２）。 Next, the conversion unit 23 acquires a required value of redundancy applied to write data including the data block (step B12).

次に、変換部２３は、そのデータブロックの冗長度が要求値を満たすか否かを判断する（ステップＢ１３）。 Next, the conversion unit 23 determines whether or not the redundancy of the data block satisfies the required value (step B13).

ここで、冗長度が要求値を満たす場合、ストレージシステム２の動作は、ステップＢ１７に進む。 Here, when the redundancy satisfies the required value, the operation of the storage system 2 proceeds to Step B17.

一方、ステップＢ１３において、データブロックの冗長度が要求値を満たさない場合、変換部２３は、分散記憶部９０から該当するデータブロックを読み込む（ステップＢ１４）。 On the other hand, when the redundancy of the data block does not satisfy the required value in step B13, the conversion unit 23 reads the corresponding data block from the distributed storage unit 90 (step B14).

具体的には、変換部２３は、複数のストレージノード９００からオリジナルフラグメントおよびパリティフラグメントを読み込み、読み込んだフラグメントに基づいて該当するデータブロックを復元する。 Specifically, the conversion unit 23 reads the original fragment and the parity fragment from the plurality of storage nodes 900, and restores the corresponding data block based on the read fragment.

次に、変換部２３は、読み込んだデータブロックに対して、要求値を満たす冗長度に基づく冗長処理を施し、オリジナルフラグメントおよびパリティフラグメントを生成する（ステップＢ１５）。 Next, the conversion unit 23 performs redundancy processing based on the redundancy satisfying the requested value on the read data block to generate an original fragment and a parity fragment (step B15).

次に、変換部２３は、オリジナルフラグメントおよびパリティフラグメントを、複数のストレージノード９００に分散して送信する（ステップＢ１６）。 Next, the conversion unit 23 distributes the original fragment and the parity fragment to the plurality of storage nodes 900 and transmits them (step B16).

そして、変換部２３は、ステップＢ１１からの処理をまだ施していない他のデータブロックが分散記憶部９０に記憶されていれば（ステップＢ１７でＹｅｓ）、所定時間待機した後（ステップＢ１８）、ステップＢ１１からの処理を繰り返す。 Then, if another data block that has not yet undergone the processing from step B11 is stored in the distributed storage unit 90 (Yes in step B17), the conversion unit 23 waits for a predetermined time (step B18) and then repeats the processing from step B11.

一方、ステップＢ１１からの処理を施していない他のデータブロックが分散記憶部９０に記憶されていなければ（ステップＢ１７でＮｏ）、ストレージシステム２は、冗長度の変換処理を終了する。 On the other hand, if another data block that has not been processed from step B11 is not stored in the distributed storage unit 90 (No in step B17), the storage system 2 ends the redundancy conversion process.
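
The loop of steps B11 to B18 can be sketched as follows. This is a minimal Python illustration, not the specification's implementation: `convert_all`, `required_for`, and `reencode` are hypothetical names, "satisfies" is read as "equal to or higher than", and the sketch updates the metadata dictionary directly, whereas in the described system the storage node updates the metadata when it receives the new fragments.

```python
import time


def convert_all(metadata, required_for, reencode, wait_seconds=0.0, sleep=time.sleep):
    """Throttled redundancy conversion.

    metadata:      dict mapping block id -> current redundancy
    required_for:  callable returning the required value for a block id
    reencode:      callable performing steps B14 to B16 for one block
    """
    converted = []
    pending = list(metadata)
    while pending:
        block_id = pending.pop(0)              # step B11
        required = required_for(block_id)      # step B12
        if metadata[block_id] < required:      # step B13
            reencode(block_id, required)       # steps B14 to B16
            metadata[block_id] = required
            converted.append(block_id)
        if pending:                            # step B17
            sleep(wait_seconds)                # step B18
    return converted
```

Passing `sleep` as a parameter keeps the waiting time out of unit tests; in operation the default `time.sleep` spaces out the conversions so that they compete less with ordinary access.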

次に、本発明の第２の実施の形態の効果について述べる。 Next, the effect of the second exemplary embodiment of the present invention will be described.

本発明の第２の実施の形態としてのストレージシステムは、重複排除の機能を有するストレージシステムにおいて、性能への影響を抑えながら、冗長度をさらに効率的に変更することができる。 The storage system according to the second embodiment of the present invention can change the redundancy more efficiently while suppressing the influence on the performance in the storage system having the deduplication function.

その理由について説明する。本実施の形態では、ストレージノードにおいて、データ格納部が、データブロックを分割した情報を格納する。また、メタデータ格納部が、記憶済みのデータブロックの冗長度および同一性を判定するための情報を記憶する。そして、アクセス装置の選択部が、書き込みデータを構成するデータブロックから書き込みの対象とするデータブロックを選択する際に、メタデータ格納部を参照することにより、内容の同一性の検証および冗長度の検証を行う。そして、選択部は、同一内容のデータブロックが未だ記憶されていないデータブロックを、書き込みの対象として選択する。加えて、選択部は、同一内容のデータブロックが記憶済みで且つその冗長度が要求値を満たさないデータブロックの中から、所定量のデータブロックを書き込みの対象として選択するからである。そして、アクセス装置の書き込み部が、書き込みの対象として選択したデータブロックを分散して複数のストレージノードに書き込む。このとき、ストレージノードは、メタデータ格納部の冗長度および同一性を判定するための情報を更新する。また、アクセス装置の変換部は、メタデータ格納部に格納された冗長度を参照することにより、冗長度の変換処理の対象となるデータブロックを抽出する。そして、変換部が、処理対象のデータブロックのそれぞれに対する冗長度の変換処理を、所定のタイミング毎に実行するからである。 The reason is as follows. In the present embodiment, the data storage unit of each storage node stores the information into which data blocks are divided. The metadata storage unit stores, for each stored data block, its redundancy and the information for determining identity. When the selection unit of the access device selects the data blocks to be written from the data blocks constituting the write data, it refers to the metadata storage unit to verify both content identity and redundancy. The selection unit selects, as write targets, the data blocks for which no data block with the same content has yet been stored. In addition, from the data blocks for which a data block with the same content is already stored but whose redundancy does not satisfy the required value, the selection unit selects a predetermined amount of data blocks as write targets. The writing unit of the access device then distributes the selected data blocks and writes them to the plurality of storage nodes. At this time, each storage node updates the redundancy and the identity-determining information in the metadata storage unit. Furthermore, the conversion unit of the access device extracts the data blocks subject to redundancy conversion by referring to the redundancy stored in the metadata storage unit, and executes the conversion process for each of those data blocks at each predetermined timing.

このように、本実施の形態は、冗長度の要求値が変更された場合に、書き込みデータを構成するデータブロックのうち、新たな要求値を満たす冗長度に基づく冗長処理が施されて書き込まれるデータブロックの個数を、より適切に抑制する。その結果、本実施の形態は、冗長度の要求値が変更された後でも、書き込みの性能をさらに効果的に低下させない。 In this way, when the required value of redundancy is changed, the present embodiment more appropriately limits the number of data blocks, among those constituting the write data, that are written after redundancy processing based on a redundancy satisfying the new required value. As a result, the present embodiment more effectively keeps write performance from degrading even after the required value of redundancy has been changed.

また、本実施の形態は、冗長度の要求値が変更された後、書き込み時に新たな冗長処理を施さず記憶し直さなかったデータブロックについて、別途実行する変換処理の処理速度を、より適切に抑制する。これにより、本実施の形態は、書き込み時に冗長度の変更を行わずに記憶済みのデータブロックに対して一括して冗長度の変換処理を行う場合と比べて、運用でのアクセスの性能に対する影響をさらに出ないようにすることができる。さらに、本実施の形態は、書き込み時に一部のデータブロックについて冗長度を変更しておくため、別途一括して行う変換処理の速度を抑制しても、全てのデータブロックについて冗長度の変更が完了するまでの時間を短縮することができる。 In addition, for the data blocks that were not re-stored with new redundancy processing at write time after the required value of redundancy was changed, the present embodiment more appropriately limits the processing speed of the separately executed conversion process. As a result, compared with performing redundancy conversion on all stored data blocks in a batch without changing any redundancy at write time, the present embodiment further reduces the impact on access performance during operation. Furthermore, because the redundancy of some data blocks is already changed at write time, the time until the redundancy of all data blocks has been changed can be shortened even though the speed of the separately executed batch conversion process is limited.

なお、本実施の形態において、選択部が、第３の種類のデータブロックから所定量のデータブロックを選択する基準として、書き込みデータの先頭から順に第３の種類のデータブロックを所定の個数まで選択する例について説明した。これに限らず、選択部は、書き込みデータの末尾から順に第３の種類のデータブロックを所定量選択してもよい。あるいは、選択部は、第３の種類のデータブロックから、ランダムに所定量のデータブロックを選択してもよい。あるいは、選択部は、第３の種類のデータブロックから、アクセス頻度の高い順に所定量のデータブロックを選択してもよい。 In the present embodiment, an example has been described in which the selection unit, as the criterion for selecting a predetermined amount of data blocks from the third-type data blocks, selects third-type data blocks in order from the beginning of the write data up to a predetermined number. The selection is not limited to this; the selection unit may select a predetermined amount of third-type data blocks in order from the end of the write data. Alternatively, the selection unit may randomly select a predetermined amount of data blocks from the third-type data blocks. Alternatively, the selection unit may select a predetermined amount of data blocks from the third-type data blocks in descending order of access frequency.

あるいは、選択部は、複数の所定量や選択基準を組み合わせて、データブロックの選択を行ってもよい。例えば、選択部は、まず、ある選択基準にしたがって（例えば、先頭から順に）、第３の種類のデータブロックを第１の所定量（例えば、所定の個数まで）選択したとする。さらに、選択部は、選択したデータブロックが第２の所定量（例えば、所定の割合）に達していない場合に、第３の種類のデータブロックを他の選択基準にしたがって（例えば、アクセス頻度の高い順に）追加して選択するようにしてもよい。ただし、このような場合であっても、選択部は、第３の種類のデータブロックの全てを選択しないようにすることが望ましい。 Alternatively, the selection unit may select data blocks by combining a plurality of predetermined amounts and selection criteria. For example, suppose that the selection unit first selects third-type data blocks up to a first predetermined amount (for example, up to a predetermined number) according to one selection criterion (for example, in order from the beginning). If the selected data blocks have not reached a second predetermined amount (for example, a predetermined ratio), the selection unit may then additionally select third-type data blocks according to another selection criterion (for example, in descending order of access frequency). Even in such a case, however, it is desirable that the selection unit not select all of the third-type data blocks.
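
The combined criteria can be sketched as follows. This Python code is one possible reading under explicit assumptions: the first predetermined amount is a count taken from the beginning, the second is a ratio of the third-type blocks, the additional criterion is access frequency, and the "desirable" condition of not selecting every third-type block is enforced as a hard cap. The function name and parameters are hypothetical.

```python
def select_third_type(third, access_count, first_n, min_ratio):
    """Choose third-type blocks to rewrite at write time.

    third:        third-type block indexes in write-data order
    access_count: dict mapping block index -> access count (missing means 0)
    first_n:      first predetermined amount (count from the beginning)
    min_ratio:    second predetermined amount (ratio of third-type blocks)
    """
    if not third:
        return []
    limit = len(third) - 1                  # always leave at least one block
    chosen = list(third[:first_n])          # criterion 1: from the beginning
    target = min(int(len(third) * min_ratio), limit)
    remaining = sorted(
        (i for i in third if i not in chosen),
        key=lambda i: access_count.get(i, 0),
        reverse=True,                       # criterion 2: most accessed first
    )
    while len(chosen) < target and remaining:
        chosen.append(remaining.pop(0))
    return chosen[:limit]
```

Blocks left unselected here are the ones handled later by the throttled conversion process.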

また、本実施の形態において、変換部は、変換の対象となるデータブロックそれぞれに対する変換処理を所定のタイミング毎に実行することにより、変換処理の速度を抑制する例について説明した。これに限らず、変換部は、単位期間の間に変換処理を行うデータブロックの数に上限を設け、上限を超えた場合は、次の単位期間になるまで変換処理を一時的に停止することにより、処理速度を抑制してもよい。あるいは、変換部は、所定の間隔を開けて、所定数ずつのデータブロックに対して変換処理を実行するようにしてもよい。その他、変換部は、その他の手段により処理速度を抑制してもよい。 In the present embodiment, an example has been described in which the conversion unit limits the speed of the conversion process by executing the conversion process for each target data block at each predetermined timing. The conversion unit is not limited to this; it may set an upper limit on the number of data blocks converted within a unit period and, when the upper limit is exceeded, temporarily stop the conversion process until the next unit period. Alternatively, the conversion unit may execute the conversion process on a predetermined number of data blocks at a time, at predetermined intervals. The conversion unit may also limit the processing speed by other means.
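
The per-unit-period upper limit can be sketched as a small fixed-window limiter. This Python sketch is an illustration only; `PeriodLimiter` is a hypothetical name, and the injectable `clock` and `sleep` parameters are an assumption made so that the behavior can be checked without real waiting.

```python
import time


class PeriodLimiter:
    """Allow at most max_per_period conversions in each unit period."""

    def __init__(self, max_per_period, period_seconds,
                 clock=time.monotonic, sleep=time.sleep):
        self._max = max_per_period
        self._period = period_seconds
        self._clock = clock
        self._sleep = sleep
        self._period_start = clock()
        self._count = 0

    def acquire(self):
        """Call once before converting each data block."""
        now = self._clock()
        if now - self._period_start >= self._period:
            # A new unit period has begun: reset the counter.
            self._period_start = now
            self._count = 0
        if self._count >= self._max:
            # Upper limit reached: pause until the next unit period.
            self._sleep(self._period_start + self._period - now)
            self._period_start = self._clock()
            self._count = 0
        self._count += 1
```

A conversion loop would call `acquire()` before re-encoding each block, so that bursts are capped while idle periods cost nothing.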

また、本実施の形態において、メタデータ格納部が、ストレージノードに設けられる例について説明した。これに限らず、メタデータ格納部は、アクセス装置に設けられていてもよい。また、メタデータ格納部は、データの書き込みを要求するクライアント装置に設けられていてもよい。その場合、クライアント装置は、自装置がストレージノードに書き込んだ書き込みデータに関するメタデータをメタデータ格納部に格納するようにしても良い。また、その場合、クライアント装置は、他のクライアント装置とメタデータ格納部の情報を共有しないようにしてもよいし、共有するようにしてもよい。 In the present embodiment, the example in which the metadata storage unit is provided in the storage node has been described. However, the present invention is not limited to this, and the metadata storage unit may be provided in the access device. The metadata storage unit may be provided in a client device that requests data writing. In this case, the client device may store metadata related to write data written by the own device in the storage node in the metadata storage unit. In this case, the client device may not share the information in the metadata storage unit with other client devices, or may share it.

（第３の実施の形態） (Third Embodiment)
次に、本発明の第３の実施の形態について図面を参照して詳細に説明する。なお、本実施の形態の説明において参照する各図面において、本発明の第２の実施の形態と同一の構成および同様に動作するステップには同一の符号を付して本実施の形態における詳細な説明を省略する。 Next, a third embodiment of the present invention will be described in detail with reference to the drawings. In the drawings referred to in the description of the present embodiment, components identical to those of the second embodiment of the present invention, and steps that operate in the same manner, are given the same reference numerals, and their detailed description is omitted in the present embodiment.

本発明の第３の実施の形態としてのストレージシステム３の構成を図１４に示す。図１４において、ストレージシステム３は、本発明の第２の実施の形態としてのストレージシステム２に対して、アクセス装置２００に替えてアクセス装置３００を含み、さらに、クライアント装置４００を含む点が異なる。クライアント装置４００は、ストレージノード９００に対して、書き込みデータの書き込みを、アクセス装置３００を介して要求する装置である。 FIG. 14 shows the configuration of the storage system 3 as the third embodiment of the present invention. In FIG. 14, the storage system 3 differs from the storage system 2 of the second embodiment of the present invention in that it includes an access device 300 instead of the access device 200 and further includes a client device 400. The client device 400 is a device that requests the storage node 900, via the access device 300, to write the write data.

また、図１４に示すように、アクセス装置３００は、本発明の第２の実施の形態におけるアクセス装置２００に対して、書き込み部２２に替えて書き込み部３２を備える点と、選択部２１を含まない点とが異なる。クライアント装置４００は、選択部４１を備える。 As shown in FIG. 14, the access device 300 differs from the access device 200 of the second embodiment of the present invention in that it includes a writing unit 32 instead of the writing unit 22 and does not include the selection unit 21. The client device 400 includes a selection unit 41.

ここで、ストレージシステム３を構成する各装置は、図１５に示すようなハードウェア要素によって構成可能である。図１５において、クライアント装置４００は、ＣＰＵ４００１、メモリ４００２、記憶装置４００４、および、ネットワークインタフェース４００５を含む。この場合、選択部４１は、ネットワークインタフェース４００５と、メモリ４００２または記憶装置４００４に格納されるコンピュータ・プログラムを読み込んで実行するＣＰＵ４００１とによって構成される。アクセス装置３００は、本発明の第２の実施の形態におけるアクセス装置２００と同様のハードウェア要素により構成される。なお、ストレージシステム３を構成する各装置およびその各機能ブロックのハードウェア構成は、上述の構成に限定されない。 Here, each device constituting the storage system 3 can be constituted by hardware elements as shown in FIG. In FIG. 15, the client device 400 includes a CPU 4001, a memory 4002, a storage device 4004, and a network interface 4005. In this case, the selection unit 41 includes a network interface 4005 and a CPU 4001 that reads and executes a computer program stored in the memory 4002 or the storage device 4004. The access device 300 includes hardware elements similar to those of the access device 200 according to the second embodiment of the present invention. Note that the hardware configuration of each device and each functional block constituting the storage system 3 is not limited to the above-described configuration.

次に、まず、クライアント装置４００に含まれる機能ブロックについて説明する。 Next, functional blocks included in the client device 400 will be described first.

選択部４１は、本発明の第２の実施の形態における選択部２１と略同様に構成される。ただし、選択部４１は、自装置で生じた書き込みデータについて動作する点が異なる。選択部４１は、自装置における図示しない他の機能ブロックから、書き込みデータを受信して動作する。また、選択部４１は、書き込みの対象として選択したデータブロックおよびその冗長度や同一性を判定するための情報を含むメタデータを、アクセス装置３００に送信する点も異なる。なお、選択部４１は、書き込みの対象となるデータブロックを選択するために、記憶済みのデータブロックに関するメタデータを参照する。このとき、選択部４１は、アクセス装置３００に対してメタデータの参照を要求することにより、メタデータを取得すればよい。また、選択部４１は、書き込みの対象となるデータブロックを選択するために、冗長度の要求値を参照する。このとき、選択部４１は、アクセス装置３００から、冗長度の要求値を取得すればよい。 The selection unit 41 is configured in substantially the same manner as the selection unit 21 in the second embodiment of the present invention. However, the selection unit 41 is different in that it operates on write data generated in its own device. The selection unit 41 operates by receiving write data from other functional blocks (not shown) in its own device. The selection unit 41 also differs in that the data block selected as a write target and metadata including information for determining redundancy and identity thereof are transmitted to the access device 300. Note that the selection unit 41 refers to metadata about a stored data block in order to select a data block to be written. At this time, the selection unit 41 may acquire the metadata by requesting the access device 300 to refer to the metadata. Further, the selection unit 41 refers to the required value of redundancy in order to select a data block to be written. At this time, the selection unit 41 may acquire a required value for redundancy from the access device 300.

次に、アクセス装置３００に含まれる機能ブロックについて説明する。 Next, functional blocks included in the access device 300 will be described.

書き込み部３２は、本発明の第２の実施の形態における書き込み部２２と略同様に構成される。ただし、書き込み部３２は、クライアント装置４００から書き込みの対象として受信したデータブロックについて動作する点が異なる。 The writing unit 32 is configured in substantially the same manner as the writing unit 22 in the second embodiment of the present invention. However, the writing unit 32 is different in that it operates on a data block received as a writing target from the client device 400.

その他、アクセス装置３００は、クライアント装置４００からの要求に応じて、ストレージノード９００のメタデータ格納部９２に格納されるメタデータを取得し、クライアント装置４００に返信する機能を有する。また、アクセス装置３００は、クライアント装置４００からの要求に応じて、冗長度の要求値を返信する機能を有する。このとき、アクセス装置３００は、各データブロックについて共通に設定された冗長度の要求値を自装置のメモリ２００２または記憶装置２００４等から取得して、クライアント装置４００に返信してもよい。また、アクセス装置３００は、該当するデータブロックについて個別に設定された冗長度の要求値をストレージノード９００のメタデータ格納部９２から取得して、クライアント装置４００に返信してもよい。 In addition, the access device 300 has a function of acquiring metadata stored in the metadata storage unit 92 of the storage node 900 in response to a request from the client device 400 and returning it to the client device 400. Further, the access device 300 has a function of returning a request value for redundancy in response to a request from the client device 400. At this time, the access device 300 may acquire a request value for the redundancy set in common for each data block from the memory 2002 or the storage device 2004 of the own device and return it to the client device 400. Further, the access device 300 may obtain a request value for redundancy individually set for the corresponding data block from the metadata storage unit 92 of the storage node 900 and return it to the client device 400.

以上のように構成されたストレージシステム３の動作について、図面を参照して説明する。 The operation of the storage system 3 configured as described above will be described with reference to the drawings.

まず、クライアント装置４００において書き込みデータが生じた際の動作を図１６に示す。 First, FIG. 16 shows an operation when write data is generated in the client device 400.

図１６において、選択部４１は、ステップＡ１１〜Ａ１６まで、本発明の第２の実施の形態におけるアクセス装置２００の選択部２１と同様に動作する。これにより、書き込みデータが分割されたデータブロックの中から、書き込みの対象となるデータブロックが選択される。 In FIG. 16, the selection unit 41 operates in the same manner as the selection unit 21 of the access device 200 in the second embodiment of the present invention from step A11 to A16. Thereby, a data block to be written is selected from the data blocks into which the write data is divided.

次に、選択部４１は、選択したデータブロックおよびメタデータを含む情報を、アクセス装置３００に送信する（ステップＡ２７）。 Next, the selection unit 41 transmits information including the selected data block and metadata to the access device 300 (step A27).

以上で、クライアント装置４００は、書き込みデータが生じた際の動作を終了する。 Thus, the client device 400 ends the operation when write data is generated.
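
Step A27 can be illustrated by a sketch of the request the client might send. The specification does not define a message layout, so everything here is an assumption: the function name `build_write_request`, the dictionary fields, and the use of SHA-256 are hypothetical. The sketch also returns the payload size next to the full data size, to show why selecting on the client side reduces traffic to the access device.

```python
import hashlib


def build_write_request(blocks, selected_indexes, required):
    """Build a hypothetical write request holding only the selected blocks.

    blocks:           all data blocks of the write data, in order
    selected_indexes: indexes chosen in steps A15 and A16
    required:         required value of redundancy for this write data
    """
    digests = [hashlib.sha256(block).hexdigest() for block in blocks]
    payload = {
        "required_redundancy": required,
        # Hashes of every block, so the write data can reference stored blocks.
        "block_order": digests,
        # Block bodies are included only for the selected blocks.
        "blocks": {digests[i]: blocks[i] for i in selected_indexes},
    }
    sent_bytes = sum(len(body) for body in payload["blocks"].values())
    total_bytes = sum(len(block) for block in blocks)
    return payload, sent_bytes, total_bytes
```

For three 4-byte blocks with two selected, the request carries 8 of the 12 data bytes; the unselected block travels only as a hash.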

次に、アクセス装置３００において、クライアント装置４００からデータブロックを受信した際の動作を図１７に示す。 Next, FIG. 17 shows an operation when the access device 300 receives a data block from the client device 400.

図１７では、まず、書き込み部３２は、クライアント装置４００から、選択されたデータブロックを含む情報を受信する（ステップＡ２８）。 In FIG. 17, the writing unit 32 first receives information including the selected data block from the client device 400 (step A28).

以降、書き込み部３２は、ステップＡ１７〜Ａ１８まで、本発明の第２の実施の形態と同様に動作する。これにより、書き込み対象として選択された各データブロックは、要求値を満たす冗長度に基づく冗長処理が施されたオリジナルフラグメントおよびパリティフラグメントとして、複数のストレージノード９００に送信される。 Thereafter, the writing unit 32 operates in the same manner as in the second embodiment of the present invention from step A17 to A18. Thereby, each data block selected as a write target is transmitted to a plurality of storage nodes 900 as an original fragment and a parity fragment that have been subjected to redundancy processing based on the redundancy satisfying the requested value.

以上で、アクセス装置３００は、クライアント装置４００からデータブロックを受信した際の動作を終了する。 As described above, the access device 300 ends the operation when the data block is received from the client device 400.

なお、ストレージノード９００がフラグメントを受信した際の動作は、図１２を参照して説明した本発明の第２の実施の形態の動作と同様であるため、本実施の形態における説明を省略する。 The operation when the storage node 900 receives a fragment is the same as the operation of the second embodiment of the present invention described with reference to FIG. 12, so its description is omitted in the present embodiment.

また、アクセス装置３００が所定のタイミング毎に行う変換処理の動作は、図１３を参照して説明した本発明の第２の実施の形態の動作と同様であるため、本実施の形態における説明を省略する。 Likewise, the conversion process that the access device 300 performs at each predetermined timing is the same as the operation of the second embodiment of the present invention described with reference to FIG. 13, so its description is omitted in the present embodiment.

次に、本発明の第３の実施の形態の効果について述べる。 Next, effects of the third exemplary embodiment of the present invention will be described.

本発明の第３の実施の形態としてのストレージシステムは、重複排除の機能を有し、性能への影響を抑えながら冗長度を効率的に変更するストレージシステムにおいて、クライアントからの通信量を低減することができる。 The storage system according to the third embodiment of the present invention has a deduplication function, and reduces the amount of communication from the client in the storage system that efficiently changes the redundancy while suppressing the influence on the performance. be able to.

その理由について説明する。本実施の形態では、クライアント装置において選択部が、自装置で生じた書き込みデータをデータブロックに分割し、その中から、書き込みの対象とするデータブロックを、本発明の第２の実施の形態と同様に選択する。そして、選択部が、選択したデータブロックをアクセス装置に送信する。そして、アクセス装置において書き込み部が、受信したデータブロックに冗長処理を施して、ストレージノードに送信するからである。 The reason will be described. In the present embodiment, the selection unit in the client device divides the write data generated in its own device into data blocks, and the data block to be written is selected from the data blocks as in the second embodiment of the present invention. Select similarly. Then, the selection unit transmits the selected data block to the access device. This is because the writing unit in the access device performs redundancy processing on the received data block and transmits it to the storage node.

これにより、本実施の形態は、書き込みデータの書き込みを要求するクライアント装置からアクセス装置への通信量を低減することができ、ネットワークおよびアクセス装置の負荷を低減することができる。 As a result, according to the present embodiment, it is possible to reduce the amount of communication from the client device that requests writing of write data to the access device, and to reduce the load on the network and the access device.

なお、上述した本発明の各実施の形態において、冗長度は、パリティフラグメントの個数または個数の割合によって表される例を中心に説明した。これに限らず、冗長度は、例えば、パリティフラグメントの個数およびオリジナルフラグメントの個数の組み合わせによって表されていてもよい。その他、冗長度は、データブロックにおける冗長情報の量を表す情報であれば、他の情報で表されていてもよい。 In each of the embodiments of the present invention described above, the redundancy has been described mainly with an example expressed by the number of parity fragments or the ratio of the number. For example, the redundancy may be expressed by a combination of the number of parity fragments and the number of original fragments. In addition, the redundancy may be represented by other information as long as the information represents the amount of redundant information in the data block.

また、上述した本発明の各実施の形態において、ストレージシステムの各機能ブロックが、メモリまたは記憶装置に記憶されたコンピュータ・プログラムを実行するＣＰＵによって実現される例を中心に説明した。これに限らず、各機能ブロックの一部、全部、または、それらの組み合わせが専用のハードウェアにより実現されていてもよい。 Further, in each of the embodiments of the present invention described above, the description has focused on an example in which each functional block of the storage system is realized by a CPU that executes a computer program stored in a memory or a storage device. However, the present invention is not limited to this, and some, all, or a combination of each functional block may be realized by dedicated hardware.

また、上述した本発明の各実施の形態において、ストレージシステムの各機能ブロックは、さらに複数の装置に分散されて実現されてもよい。 Further, in each of the above-described embodiments of the present invention, each functional block of the storage system may be further distributed to a plurality of devices.

また、上述した本発明の各実施の形態において、各フローチャートを参照して説明したストレージシステムの動作を、本発明のコンピュータ・プログラムとしてコンピュータ装置のデータ格納部（記憶媒体）に格納しておく。そして、係るコンピュータ・プログラムを当該ＣＰＵが読み出して実行するようにしてもよい。そして、このような場合において、本発明は、係るコンピュータ・プログラムのコードあるいは記憶媒体によって構成される。 In each embodiment of the present invention described above, the operation of the storage system described with reference to each flowchart is stored in the data storage unit (storage medium) of the computer device as the computer program of the present invention. Then, the computer program may be read and executed by the CPU. In such a case, the present invention is constituted by the code of the computer program or a storage medium.

また、上述した各実施の形態は、適宜組み合わせて実施されることが可能である。 Moreover, each embodiment mentioned above can be implemented in combination as appropriate.

また、本発明は、上述した各実施の形態に限定されず、様々な態様で実施されることが可能である。 The present invention is not limited to the above-described embodiments, and can be implemented in various modes.

また、上述した各実施の形態の一部又は全部は、以下の付記のようにも記載されうるが、以下には限られない。
（付記１）
冗長処理が施されたデータを複数の記憶装置に分散して記憶する分散記憶部と、
書き込みを要求された書き込みデータをデータブロックに分割し、分割したデータブロックのうち、同一内容のデータブロックが前記分散記憶部に記憶されていないデータブロックと、同一内容のデータブロックが前記分散記憶部に記憶済みで且つその冗長度がその要求値を満たしていないデータブロックの中から抽出したデータブロックとを、書き込みの対象として選択する選択部と、
前記選択部によって選択されたデータブロックについて、前記要求値を満たす冗長度に基づく冗長処理を施して前記分散記憶部に記憶させる書き込み部と、
前記分散記憶部に記憶済みのデータブロックのうち、その冗長度が前記要求値を満たしていないデータブロックを、前記要求値を満たす冗長度に基づく冗長処理を施したデータブロックに変換して、前記分散記憶部に記憶させる変換処理を、処理速度を抑制しながら実行する変換部と、
を備えたストレージシステム。
（付記２）
前記選択部は、同一内容のデータブロックが前記分散記憶部に記憶済みで且つその冗長度が前記要求値を満たしていないデータブロックの中から所定量のデータブロックを抽出し、前記書き込みの対象として選択することを特徴とする付記１に記載のストレージシステム。
（付記３）
前記変換部は、前記変換処理を所定のタイミング毎に実行することにより、前記処理速度を抑制することを特徴とする付記１または付記２に記載のストレージシステム。
（付記４）
前記分散記憶部に記憶された各データブロックについて、その冗長度およびその内容の同一性を判定するための情報を含むメタデータを格納したメタデータ格納部をさらに備え、
前記選択部は、前記メタデータ格納部を参照することにより、前記データブロックについて、同一内容のデータブロックが前記分散記憶部に記憶済みであるか否か、および、その冗長度が前記要求値を満たすか否かを判断することを特徴とする付記１から付記３のいずれか１つに記載のストレージシステム。
（付記５）
前記選択部と、前記書き込み部と、前記変換部とを有するアクセス装置と、
前記分散記憶部と、
を備えることを特徴とする付記１から付記４のいずれか１つに記載のストレージシステム。
（付記６）
前記選択部を有するクライアント装置と、
前記書き込み部および前記変換部を有するアクセス装置と、
前記分散記憶部と、
を備えることを特徴とする付記１から付記４のいずれか１つに記載のストレージシステム。
（付記７）
付記５または付記６に記載のアクセス装置。
（付記８）
付記６に記載のクライアント装置。
（付記９）
コンピュータ装置が、
冗長処理が施されたデータを複数の記憶装置に分散して記憶する分散記憶部を用いて、
書き込みを要求された書き込みデータをデータブロックに分割し、分割したデータブロックのうち、同一内容のデータブロックが前記分散記憶部に記憶されていないデータブロックと、同一内容のデータブロックが前記分散記憶部に記憶済みで且つその冗長度がその要求値を満たしていないデータブロックの中から抽出したデータブロックとを、書き込みの対象として選択し、
選択したデータブロックについて、前記要求値を満たす冗長度に基づく冗長処理を施して前記分散記憶部に記憶させ、
前記分散記憶部に記憶済みのデータブロックのうち、その冗長度が前記要求値を満たしていないデータブロックを、前記要求値を満たす冗長度に基づく冗長処理を施したデータブロックに変換して、前記分散記憶部に記憶させる変換処理を、処理速度を抑制しながら実行する方法。
（付記１０）
冗長処理が施されたデータを複数の記憶装置に分散して記憶する分散記憶部を用いて、
書き込みを要求された書き込みデータをデータブロックに分割し、分割したデータブロックのうち、同一内容のデータブロックが前記分散記憶部に記憶されていないデータブロックと、同一内容のデータブロックが前記分散記憶部に記憶済みで且つその冗長度がその要求値を満たしていないデータブロックの中から抽出したデータブロックとを、書き込みの対象として選択する選択ステップをコンピュータ装置に実行させるプログラム。
（付記１１）
冗長処理が施されたデータを複数の記憶装置に分散して記憶する分散記憶部を用いて、
書き込みを要求された書き込みデータをデータブロックに分割し、分割したデータブロックのうち、同一内容のデータブロックが前記分散記憶部に記憶されていないデータブロックと、同一内容のデータブロックが前記分散記憶部に記憶済みで且つその冗長度がその要求値を満たしていないデータブロックの中から抽出したデータブロックとを、書き込みの対象として選択する選択ステップと、
前記選択ステップにおいて選択されたデータブロックについて、前記要求値を満たす冗長度に基づく冗長処理を施して前記分散記憶部に記憶させる書き込みステップと、
前記分散記憶部に記憶済みのデータブロックのうち、その冗長度が前記要求値を満たしていないデータブロックを、前記要求値を満たす冗長度に基づく冗長処理を施したデータブロックに変換して、前記分散記憶部に記憶させる変換処理を、処理速度を抑制しながら実行する変換ステップと、
をコンピュータ装置に実行させるプログラム。 A part or all of each of the above-described embodiments can be described as in the following supplementary notes, but is not limited thereto.
(Appendix 1)
A storage system comprising:
a distributed storage unit that stores data subjected to redundancy processing in a distributed manner across a plurality of storage devices;
a selection unit that divides write data requested to be written into data blocks and selects, as write targets, those of the divided data blocks for which no data block with the same content is stored in the distributed storage unit, together with data blocks extracted from those for which a data block with the same content is already stored in the distributed storage unit but whose redundancy does not satisfy its required value;
a writing unit that applies, to the data blocks selected by the selection unit, redundancy processing based on a redundancy satisfying the required value and stores them in the distributed storage unit; and
a conversion unit that executes, while limiting its processing speed, a conversion process that converts those data blocks stored in the distributed storage unit whose redundancy does not satisfy the required value into data blocks subjected to redundancy processing based on a redundancy satisfying the required value, and stores them in the distributed storage unit.
(Appendix 2)
The storage system according to Appendix 1, wherein the selection unit extracts a predetermined amount of data blocks from among the data blocks for which a data block with the same content is already stored in the distributed storage unit and whose redundancy does not satisfy the required value, and selects the extracted data blocks as the write targets.
(Appendix 3)
The storage system according to appendix 1 or appendix 2, wherein the conversion unit suppresses the processing speed by executing the conversion process at every predetermined timing.
(Appendix 4)
The storage system according to any one of Appendix 1 to Appendix 3, further comprising a metadata storage unit that stores, for each data block stored in the distributed storage unit, metadata including information for determining its redundancy and the identity of its content,
wherein the selection unit determines, by referring to the metadata storage unit, whether a data block with the same content as the data block is already stored in the distributed storage unit and whether its redundancy satisfies the required value.
(Appendix 5)
The storage system according to any one of Appendix 1 to Appendix 4, comprising:
an access device including the selection unit, the writing unit, and the conversion unit; and
the distributed storage unit.
(Appendix 6)
The storage system according to any one of Appendix 1 to Appendix 4, comprising:
a client device having the selection unit;
an access device having the writing unit and the conversion unit; and
the distributed storage unit.
(Appendix 7)
The access device according to appendix 5 or appendix 6.
(Appendix 8)
The client device according to Appendix 6.
(Appendix 9)
A method in which a computer device,
using a distributed storage unit that stores data subjected to redundancy processing in a distributed manner across a plurality of storage devices,
divides write data requested to be written into data blocks and selects, as write targets, from among the divided data blocks, data blocks for which no data block with the same content is stored in the distributed storage unit, together with data blocks extracted from among the data blocks for which a data block with the same content is already stored in the distributed storage unit and whose redundancy does not satisfy the required value,
performs, on the selected data blocks, redundancy processing based on a redundancy satisfying the required value and stores them in the distributed storage unit, and
executes, while suppressing the processing speed, a conversion process in which data blocks already stored in the distributed storage unit whose redundancy does not satisfy the required value are converted into data blocks subjected to redundancy processing based on a redundancy satisfying the required value and are stored in the distributed storage unit.
(Appendix 10)
A program that causes a computer device to execute,
using a distributed storage unit that stores data subjected to redundancy processing in a distributed manner across a plurality of storage devices,
a selection step of dividing write data requested to be written into data blocks and selecting, as write targets, from among the divided data blocks, data blocks for which no data block with the same content is stored in the distributed storage unit, together with data blocks extracted from among the data blocks for which a data block with the same content is already stored in the distributed storage unit and whose redundancy does not satisfy the required value.
(Appendix 11)
A program that causes a computer device to execute,
using a distributed storage unit that stores data subjected to redundancy processing in a distributed manner across a plurality of storage devices:
a selection step of dividing write data requested to be written into data blocks and selecting, as write targets, from among the divided data blocks, data blocks for which no data block with the same content is stored in the distributed storage unit, together with data blocks extracted from among the data blocks for which a data block with the same content is already stored in the distributed storage unit and whose redundancy does not satisfy the required value;
a writing step of performing, on the data blocks selected in the selection step, redundancy processing based on a redundancy satisfying the required value and storing them in the distributed storage unit; and
a conversion step of executing, while suppressing the processing speed, a conversion process in which data blocks already stored in the distributed storage unit whose redundancy does not satisfy the required value are converted into data blocks subjected to redundancy processing based on a redundancy satisfying the required value and are stored in the distributed storage unit.
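
The selection step recited in the supplementary notes above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiments. Fixed-size splitting, SHA-256 as the content-identity key, the `BlockMeta` record, the `rewrite_limit` parameter (standing in for the "predetermined amount" of Appendix 2), and the reading of "does not satisfy the required value" as `redundancy < required` are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BlockMeta:
    """Hypothetical metadata entry: redundancy of one stored data block."""
    redundancy: int


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    # Assumption: fixed-size splitting; the notes do not fix a splitting rule.
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def select_write_targets(data: bytes,
                         metadata: Dict[str, BlockMeta],
                         required: int,
                         rewrite_limit: int,
                         block_size: int = 4) -> List[bytes]:
    """Return the blocks of `data` that should be written.

    A block is selected if no block with the same content is stored, or if
    one is stored with insufficient redundancy and the rewrite budget
    (`rewrite_limit`) has not been used up.
    """
    targets: List[bytes] = []
    seen = set()   # assumption: identical blocks within one write are handled once
    rewrites = 0
    for block in split_into_blocks(data, block_size):
        key = hashlib.sha256(block).hexdigest()
        if key in seen:
            continue
        seen.add(key)
        meta = metadata.get(key)
        if meta is None:
            # No block with the same content is stored yet.
            targets.append(block)
        elif meta.redundancy < required and rewrites < rewrite_limit:
            # Stored, but under-redundant: rewrite it within the budget.
            targets.append(block)
            rewrites += 1
    return targets
```

Blocks that are already stored with sufficient redundancy are never selected, so deduplication is preserved, while a bounded number of under-redundant blocks is upgraded as a side effect of ordinary writes.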

１、２、３ストレージシステム
１１、２１、４１選択部
１２、２２、３２書き込み部
１３、２３変換部
８０、９０分散記憶部
９１データ格納部
９２メタデータ格納部
９３データ処理部
２００、３００アクセス装置
４００クライアント装置
９００ストレージノード
１００１、２００１、４００１、９００１ＣＰＵ
１００２、２００２、４００２、９００２メモリ
１００４、２００４、４００４、９００４記憶装置
１００５、２００５、４００５、９００５ネットワークインタフェース
1, 2, 3 Storage system
11, 21, 41 Selection unit
12, 22, 32 Writing unit
13, 23 Conversion unit
80, 90 Distributed storage unit
91 Data storage unit
92 Metadata storage unit
93 Data processing unit
200, 300 Access device
400 Client device
900 Storage node
1001, 2001, 4001, 9001 CPU
1002, 2002, 4002, 9002 Memory
1004, 2004, 4004, 9004 Storage device
1005, 2005, 4005, 9005 Network interface
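
The rate-suppressed conversion performed by the conversion unit (13, 23) can be sketched as follows, again only as a hedged illustration. The sketch assumes that the processing speed is suppressed by converting at most `max_blocks` blocks per tick and waiting `interval_s` seconds between ticks. The function names, the dictionary from block key to redundancy, and the in-place update that stands in for re-encoding and re-storing a block are hypothetical.

```python
import time
from typing import Callable, Dict


def convert_tick(redundancy: Dict[str, int], required: int, max_blocks: int) -> int:
    """Convert at most `max_blocks` under-redundant blocks; return the count."""
    converted = 0
    for key in sorted(redundancy):
        if converted >= max_blocks:
            break
        if redundancy[key] < required:
            # Stand-in for re-applying redundancy processing and storing again.
            redundancy[key] = required
            converted += 1
    return converted


def run_conversion(redundancy: Dict[str, int],
                   required: int,
                   max_blocks: int,
                   interval_s: float,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Run ticks until nothing is left to convert; return the number of ticks
    that converted at least one block."""
    ticks = 0
    while convert_tick(redundancy, required, max_blocks) > 0:
        ticks += 1
        sleep(interval_s)  # wait for the next predetermined timing
    return ticks
```

Capping the work per tick leaves storage bandwidth for ordinary reads and writes; `sleep` is injectable so the loop can be exercised without real delays.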

Claims

A storage system comprising:
a distributed storage unit that stores data subjected to redundancy processing in a distributed manner across a plurality of storage devices;
a selection unit that divides write data requested to be written into data blocks and selects, as write targets, from among the divided data blocks, data blocks for which no data block with the same content is stored in the distributed storage unit, together with data blocks extracted from among the data blocks for which a data block with the same content is already stored in the distributed storage unit and whose redundancy does not satisfy the required value;
a writing unit that performs, on the data blocks selected by the selection unit, redundancy processing based on a redundancy satisfying the required value and stores them in the distributed storage unit; and
a conversion unit that executes, while suppressing the processing speed by setting an upper limit on the data blocks processed per unit time, a conversion process in which data blocks already stored in the distributed storage unit whose redundancy does not satisfy the required value are converted into data blocks subjected to redundancy processing based on a redundancy satisfying the required value and are stored in the distributed storage unit.

The storage system according to claim 1, wherein the selection unit extracts a predetermined amount of data blocks from among the data blocks for which a data block with the same content is already stored in the distributed storage unit and whose redundancy does not satisfy the required value, and selects the extracted data blocks as the write targets.

The storage system according to claim 1 or 2, wherein the conversion unit suppresses the processing speed by executing the conversion processing at predetermined timings.

The storage system according to any one of claims 1 to 3, further comprising a metadata storage unit that stores, for each data block stored in the distributed storage unit, metadata including information for determining its redundancy and the identity of its content,
wherein the selection unit selects the data blocks to be written by referring to the metadata storage unit, and
the conversion unit extracts the data blocks to be subjected to the conversion process from the distributed storage unit by referring to the metadata storage unit.

The storage system according to any one of claims 1 to 4, comprising:
an access device including the selection unit, the writing unit, and the conversion unit; and
the distributed storage unit.

The storage system according to any one of claims 1 to 4, comprising:
a client device having the selection unit;
an access device having the writing unit and the conversion unit; and
the distributed storage unit.

The access device according to claim 5 or 6.

A method in which a computer device,
using a distributed storage unit that stores data subjected to redundancy processing in a distributed manner across a plurality of storage devices,
divides write data requested to be written into data blocks and selects, as write targets, from among the divided data blocks, data blocks for which no data block with the same content is stored in the distributed storage unit, together with data blocks extracted from among the data blocks for which a data block with the same content is already stored in the distributed storage unit and whose redundancy does not satisfy the required value,
performs, on the selected data blocks, redundancy processing based on a redundancy satisfying the required value and stores them in the distributed storage unit, and
executes, while suppressing the processing speed by setting an upper limit on the data blocks processed per unit time, a conversion process in which data blocks already stored in the distributed storage unit whose redundancy does not satisfy the required value are converted into data blocks subjected to redundancy processing based on a redundancy satisfying the required value and are stored in the distributed storage unit.