JP6343952B2

JP6343952B2 - Storage system

Info

Publication number: JP6343952B2
Application number: JP2014025237A
Authority: JP
Inventors: 昇平笹田
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2014-02-13
Filing date: 2014-02-13
Publication date: 2018-06-20
Anticipated expiration: 2034-02-13
Also published as: JP2015153067A

Description

The present invention relates to a storage system, and more particularly to a storage system that distributes data across a plurality of storage devices.

In recent years, with the development and spread of computers, many kinds of information have been converted into digital data. Storage devices such as magnetic tapes and magnetic disks are used to hold such data. Because the volume of data to be stored grows daily and becomes enormous, large-capacity storage systems are required. Reliability is also required while the cost spent on storage devices is kept down, and the data must be easy to retrieve later. As a result, there is demand for a storage system that can increase its storage capacity and performance automatically, eliminates duplicate storage to reduce storage cost, and offers high redundancy.

In response, content address storage systems have been developed in recent years, as shown in Patent Document 1. Such a system distributes data across a plurality of storage devices and identifies the location where the data is stored by a unique content address determined from the content of that data.

Because the content address is generated so as to be unique to the content of the data, data with identical content can be obtained by referring to the data at the same storage location. Duplicate data therefore need not be stored separately, which eliminates duplicate recording and reduces data volume. In other words, a content address storage system has a deduplication function: new data is stored only when data with the same content has not already been stored.
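The deduplication behavior described above can be illustrated with a minimal in-memory sketch. It assumes SHA-1 as the hash function and uses the hex digest directly as the content address; the class and method names are hypothetical and are not taken from the patent.

```python
import hashlib


class ContentAddressStore:
    """Toy store in which each distinct block is kept once, keyed by its hash."""

    def __init__(self):
        # content address (hex digest, an assumption) -> block bytes
        self._blocks = {}

    def write(self, block: bytes) -> str:
        address = hashlib.sha1(block).hexdigest()
        # Store the block only when no block with the same content exists yet.
        if address not in self._blocks:
            self._blocks[address] = block
        return address

    def read(self, address: str) -> bytes:
        return self._blocks[address]

    def stored_block_count(self) -> int:
        return len(self._blocks)
```

Writing the same bytes twice returns the same address and leaves a single stored copy, which is the property the text attributes to content addressing.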

The storage system also divides block data of a predetermined size into a plurality of fragment data, adds fragments that serve as redundant data, and stores these fragment data in a plurality of storage devices. When a content address is later specified, the fragment data stored at the location identified by that content address are read, and the original block data is restored from them.

Because the storage system adds fragment data that serve as redundant data, the original block data can be regenerated even when fragments are lost, as long as the number of lost fragments does not exceed the number of redundant fragments added.

Furthermore, the storage system has a distributed placement function so that, when fragment data are stored in a plurality of storage devices, many fragments are not lost at once and load does not concentrate on a particular storage device. Each storage device has data storage containers, and fragment data are stored in the respective containers to distribute the load.

In such a storage system, a duplicate determination is made at write time to check whether the same data is already stored in a storage device, but doing so against a large volume of data is not easy. To make duplicate determination efficient, the system therefore compares fingerprints (hash values) calculated from the data content rather than the actual data. Because a hash value is smaller than the actual data, duplicate determination can be made faster. Likewise, at read time, the search can be made faster by comparing the content address of the target data with the hash values already stored.

The hash values described above are calculated per unit of block data obtained by dividing (chunking) the write data into pieces of a certain size, so their number grows as the amount of data grows. Because write and read performance matters in most storage systems, a method for efficiently searching for an identical hash value is needed to speed up duplicate determination.

For duplicate determination, the storage system therefore holds a hash table: a table that uses a short hash, which is a part of the hash value of the block data, as a key and collects the information needed to access the data corresponding to that hash value. With a hash table, searches are based on the short hash, so search performance is better than when full hash values are compared. The hash table is used not only for duplicate determination but also at read time, to locate the block data for which a read was requested.

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2013-182476 (JP2013-182476)

Because the storage system is dedicated to storing data, the reliability of the stored data (data blocks and their full hash values) is of primary importance. The stored data is therefore kept in a file system with a journal function, with emphasis on reliability, so that after a failure the damaged data can be recovered from the journal information.

The hash table, on the other hand, exists for high-speed data search as described above and can be regenerated from the stored data. It is therefore data of low importance, and its configuration emphasizes performance over reliability. That is, unlike the stored data described above, it has no journal function. Consequently, when an abnormal restart such as a power failure occurs, consistency between the data actually stored and the data held in the hash table can no longer be maintained, and the hash table must be rebuilt from the stored data to restore consistency.

However, the hash table is the first information accessed when stored data is searched for at read time, so the stored data cannot be accessed while the hash table is damaged or being rebuilt. Moreover, the rebuild takes longer as data volumes have grown in recent years, which prolongs the period during which the stored data cannot be accessed. As a result, the performance of the storage system is degraded.

An object of the present invention is therefore to solve the problem described above, namely that the performance of the storage system is degraded.

A storage system according to one aspect of the present invention includes:
distributed storage processing means that generates a plurality of fragment data including divided data obtained by dividing storage target data, and stores the plurality of fragment data in a distributed manner in a plurality of storage means,
that stores index data, which associates information referring to the fragment data with summary data calculated from the content of the storage target data composed of the fragment data, separately for each storage means serving as the storage destination of the fragment data referred to by the index data,
and that further stores a summary table associating information referring to the index data with partial summary data consisting of a part of the summary data included in the index data; and
data search means that searches the summary table and the index data on the basis of the summary data corresponding to search request data, and retrieves the plurality of fragment data constituting the search request data stored in the plurality of storage means.
The distributed storage processing means stores the index data together with write status information representing the status at the time of writing the fragment data referred to by that index data.
When at least a part of the summary table is unavailable, the data search means searches the index data corresponding to a storage means other than the specific storage means, that is, other than the storage means storing the fragment data referred to by the index data that the unavailable summary table refers to, and identifies, on the basis of the summary data of the search target data, some of the fragment data constituting the search target data and the write status information of those fragment data. It then searches the index data corresponding to the specific storage means on the basis of the identified write status information and the summary data of the search target data, and identifies the other fragment data constituting the search target data.
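The two-step search performed when the summary table is unavailable can be sketched as follows. This is only an illustration under stated assumptions: the write status information is taken to be a timestamp, each node's index data is modeled as a list of `(timestamp, full_hash, location)` tuples in append (time) order, and the use of a binary search with a `tolerance` window is an assumption of this sketch rather than something stated in the text. Step 1 uses a linear scan for brevity; all names are hypothetical.

```python
from bisect import bisect_left


def find_by_timestamp(index_entries, full_hash, timestamp, tolerance=1.0):
    """Search time-ordered index entries near `timestamp` for `full_hash`."""
    timestamps = [entry[0] for entry in index_entries]
    start = bisect_left(timestamps, timestamp - tolerance)
    for ts, entry_hash, location in index_entries[start:]:
        if ts > timestamp + tolerance:
            break  # past the time window, stop scanning
        if entry_hash == full_hash:
            return location
    return None


def locate_with_fallback(full_hash, other_node_index, specific_node_index,
                         tolerance=1.0):
    """Locate fragments when the specific node's summary table is unavailable."""
    # Step 1: find a fragment of the same block on another node and
    # read its write status information (here, a timestamp).
    hit = next((e for e in other_node_index if e[1] == full_hash), None)
    if hit is None:
        return None
    timestamp, _, other_location = hit
    # Step 2: use that timestamp plus the full hash to narrow the search
    # in the index data of the node whose summary table is unavailable.
    specific_location = find_by_timestamp(
        specific_node_index, full_hash, timestamp, tolerance)
    return other_location, specific_location
```

The point of the sketch is that the index data of the affected node is searched directly, guided by information obtained from a healthy node, so the summary table of the affected node is never consulted.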

A program according to another aspect of the present invention causes a control device of a storage system to realize:
distributed storage processing means that generates a plurality of fragment data including divided data obtained by dividing storage target data, and stores the plurality of fragment data in a distributed manner in a plurality of storage means,
that stores index data, which associates information referring to the fragment data with summary data calculated from the content of the storage target data composed of the fragment data, separately for each storage means serving as the storage destination of the fragment data referred to by the index data,
and that further stores a summary table associating information referring to the index data with partial summary data consisting of a part of the summary data included in the index data; and
data search means that searches the summary table and the index data on the basis of the summary data corresponding to search request data, and retrieves the plurality of fragment data constituting the search request data stored in the plurality of storage means.
The program further causes the distributed storage processing means to store the index data together with write status information representing the status at the time of writing the fragment data referred to by that index data.
The program also causes the data search means, when at least a part of the summary table is unavailable, to search the index data corresponding to a storage means other than the specific storage means, that is, other than the storage means storing the fragment data referred to by the index data that the unavailable summary table refers to, and to identify, on the basis of the summary data of the search target data, some of the fragment data constituting the search target data and the write status information of those fragment data; and then to search the index data corresponding to the specific storage means on the basis of the identified write status information and the summary data of the search target data, and to identify the other fragment data constituting the search target data.

A data processing method according to another aspect of the present invention performs:
distributed storage processing that generates a plurality of fragment data including divided data obtained by dividing storage target data, and stores the plurality of fragment data in a distributed manner in a plurality of storage means,
that stores index data, which associates information referring to the fragment data with summary data calculated from the content of the storage target data composed of the fragment data, separately for each storage means serving as the storage destination of the fragment data referred to by the index data,
and that further stores a summary table associating information referring to the index data with partial summary data consisting of a part of the summary data included in the index data; and
data search processing that searches the summary table and the index data on the basis of the summary data corresponding to search request data, and retrieves the plurality of fragment data constituting the search request data stored in the plurality of storage means.
During the distributed storage processing, the index data is stored together with write status information representing the status at the time of writing the fragment data referred to by that index data.
During the data search processing, when at least a part of the summary table is unavailable, the index data corresponding to a storage means other than the specific storage means, that is, other than the storage means storing the fragment data referred to by the index data that the unavailable summary table refers to, is searched, and some of the fragment data constituting the search target data and the write status information of those fragment data are identified on the basis of the summary data of the search target data. The index data corresponding to the specific storage means is then searched on the basis of the identified write status information and the summary data of the search target data, and the other fragment data constituting the search target data are identified.

Configured as described above, the present invention can suppress degradation of the performance of the storage system.

FIG. 1 is a block diagram showing the configuration of an entire system including a storage system according to Embodiment 1 of the present invention.
FIG. 2 is a block diagram showing the overall configuration of the storage system disclosed in FIG. 1.
FIG. 3 is a functional block diagram showing the configuration of the storage system disclosed in FIG. 2.
FIG. 4 is an explanatory diagram showing how data is written to the storage system disclosed in FIG. 3.
FIG. 5 is an explanatory diagram showing how data is written to the storage system disclosed in FIG. 3.
FIG. 6 is an explanatory diagram showing the data written in the storage system disclosed in FIG. 3.
FIG. 7 is an explanatory diagram showing the data written in the storage system disclosed in FIG. 3.
FIG. 8 is an explanatory diagram showing how data written in the storage system disclosed in FIG. 3 is searched for.
FIG. 9 is a flowchart showing the operation of the storage system disclosed in FIG. 3.
FIG. 10 is a flowchart showing the operation of the storage system disclosed in FIG. 3.
FIG. 11 is a flowchart showing the operation of the storage system disclosed in FIG. 3.
FIG. 12 is a diagram showing the configuration of a storage system according to Supplementary Note 1 of the present invention.

<Embodiment 1>
A first embodiment of the present invention will be described with reference to FIGS. 1 to 12. FIG. 1 is a block diagram showing the configuration of the entire system. FIG. 2 is a block diagram showing an outline of the configuration of the storage system, and FIG. 3 is a functional block diagram showing the configuration of the storage system. FIGS. 4 to 8 are explanatory diagrams showing the data written to the storage system and how searches are performed. FIGS. 9 to 11 are flowcharts showing the operation of the storage system.

[Configuration]
This embodiment shows a specific example of the storage system and related items described in the supplementary notes given later. The following describes a case in which the storage system is configured by connecting a plurality of server computers. However, the storage system according to the present invention is not limited to a configuration of a plurality of computers and may be configured as a single computer.

As shown in FIG. 1, the storage system 1 according to the present invention is connected via a network N to a backup server 4 that controls backup processing. The backup server 4 acquires backup target data (storage target data) stored in a backup target device 5 connected via the network N and requests the storage system 1 to store it. The storage system 1 then stores the backup target data requested to be stored as a backup.

The storage system 1 in this embodiment is a content address storage system: it divides data and makes it redundant, stores it in a distributed manner in a plurality of storage devices, and identifies the location where the data is stored by a content address set according to the content of the stored data. The content address storage system is described in detail later.

As shown in FIG. 2, the storage system 1 in this embodiment has a configuration in which a plurality of server computers are connected. Specifically, the storage system 1 includes access nodes 2, which are server computers that control the storage and retrieval operations of the storage system 1 itself, and storage nodes 3, which are server computers equipped with storage devices that hold the data.

The access node 2 exchanges data with the backup server 4 and provides a file system service to the backup server 4. The backup target data is actually stored on disk by the storage nodes 3. Although FIG. 2 shows two access nodes 2 and four storage nodes 3, the numbers of access nodes 2 and storage nodes 3 are not limited to those shown in FIG. 2.

FIG. 3 shows the configuration of the storage system 1 in this embodiment. As shown in the figure, the access node 2 constituting the storage system 1 includes an I/O processing unit 21, a data transmission/reception unit 22, and a duplication determination unit 23, which are constructed by installing a program in the arithmetic device with which the node is equipped. The storage node 3 includes a fragment processing unit 31, a fragment search unit 32, a time stamp acquisition unit 33, and a node state monitoring unit 34, which are likewise constructed by installing a program in its arithmetic device. The storage node 3 stores a hash table 36, an index file 37, and a storage file 38 in the storage device 35 with which it is equipped.

Each of the units described above may be provided in either the access node 2 or the storage node 3. That is, the units need only be constructed in a control device with which the storage system is equipped. Each component is described in detail below.

The I/O processing unit 21 of the access node 2 has a function of sending backup data to and receiving backup data from the backup server 4 via the internal network.

The data transmission/reception unit 22 (distributed storage processing means) of the access node 2 cooperates with the duplication determination unit 23 and with the units of the storage node 3, such as the fragment processing unit 31, to distribute data that the backup server 4 has requested to be written across the storage nodes 3 using content addresses, and to store it with deduplication.

Specifically, as shown in FIG. 4, the data transmission/reception unit 22 first divides a file A, which the backup server 4 has requested to be written, into block data D (write target data) of a predetermined size (for example, 64 KB). It then calculates a hash value H, which is summary data based on the content of each block data D. For example, the hash value H is calculated from the content of the block data D using a preset hash function. Although this embodiment uses the hash value H as the summary data, the summary data may be calculated by any method based on the content of the block data.
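A minimal sketch of this chunking and hashing step is shown below. It assumes fixed-size 64 KB blocks and SHA-1 as the preset hash function; the text only says that a preset hash function is used, so the choice of SHA-1 and the function names are assumptions.

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # 64 KB, the example size given in the text


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Divide the written data into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE):
    """Return one full hash (summary data) per block, in block order."""
    # SHA-1 is an assumed stand-in for the preset hash function.
    return [hashlib.sha1(block).digest()
            for block in split_into_blocks(data, block_size)]
```

Two blocks with identical content produce identical hash values, which is what the duplicate determination described next relies on.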

The duplication determination unit 23 then uses the hash value H of the block data D of file A to check whether that block data D has already been stored. The duplicate determination process is described later.

If the duplication determination unit 23 determines that the block data D is already stored in the storage nodes 3, the data transmission/reception unit 22 uses the content address representing the storage location of the already stored block data D as the content address of the block data requested to be written. In other words, for duplicate data, no actual write is performed; the storage location of the already stored block data D is referenced, and the write is treated as complete.

If, on the other hand, the duplication determination unit 23 determines that the block data D of the write request is not a duplicate and has not yet been stored, the data transmission/reception unit 22 writes the block data D as follows. The data transmission/reception unit 22 divides the block data D into a plurality of fragment data of a predetermined size. For example, as indicated by reference signs D1 to D9 in FIG. 4, the block data D is divided into nine fragment data (divided data F1). It then generates redundant data so that the original block data D can be restored even if some of the divided fragment data are missing, and adds the redundant data to the divided fragment data F1. For example, as indicated by reference signs D10 to D12 in FIG. 4, three fragment data (redundant data F2) are added. This produces a data set of twelve fragment data consisting of nine divided data F1 and three redundant data F2. The number of redundant fragment data can be changed by configuration.
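The idea of dividing a block and adding redundant fragments can be illustrated with a deliberately simplified sketch. It splits a block into nine data fragments but adds only a single XOR parity fragment, which tolerates the loss of one fragment. This is a swapped-in, simpler technique: the 9 + 3 scheme in the text would need a more general erasure code, which the text does not specify. All names are hypothetical.

```python
def split_fragments(block: bytes, n_data: int = 9):
    """Split a block into n_data equal-length fragments, zero-padding the tail."""
    frag_len = -(-len(block) // n_data)  # ceiling division
    padded = block.ljust(frag_len * n_data, b"\0")
    return [padded[i * frag_len:(i + 1) * frag_len] for i in range(n_data)]


def xor_parity(fragments):
    """Compute one parity fragment as the byte-wise XOR of all fragments."""
    parity = bytearray(len(fragments[0]))
    for fragment in fragments:
        for i, byte in enumerate(fragment):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(fragments, parity):
    """Rebuild the single fragment marked None from the others plus parity."""
    present = [f for f in fragments if f is not None]
    return xor_parity(present + [parity])
```

With three redundant fragments, as in the example in the text, up to three lost fragments could be tolerated; the single-parity sketch only shows the principle for one.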

The data transmission/reception unit 22 then sends the generated data set of fragment data to the storage nodes 3 so that it is stored in a distributed manner.

The fragment processing unit 31 (distributed storage processing means) of the storage node 3 stores the fragment data sent from the access node 2 on disk. In doing so, the fragment processing unit 31 distributes the fragment data among a plurality of data storage containers provided in virtual nodes V01, V02, and so on, which are formed across the plurality of storage nodes 3.

The virtual nodes V01, V02, and so on are described with reference to FIG. 5. In the example of FIG. 5, the virtual node V01 spans a plurality of storage nodes 3 and is formed of twelve data storage containers (reference signs 01 to 12), each a data storage area formed in a storage node 3. Here, four virtual nodes V01, V02, V10, and V11 are assumed to be formed.

The virtual node in which the twelve fragment data generated from the block data D are stored is determined by examining a predetermined number of bits from the beginning of the hash value of the block data D. In the example of FIG. 5 there are four virtual nodes, so the storage location is determined from the value of the first two bits of the hash value of the block data. For example, when the hash value is "00001111...", the first two bits "00" determine that the data is stored in the virtual node "V00". As shown in FIG. 4, one fragment data is then stored in each of the twelve data storage containers (01 to 12).
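The selection of a virtual node from the leading bits of the hash value can be sketched as follows. The function name is hypothetical, and the "V" plus bit-string label simply mirrors the "V00" example in the text.

```python
def virtual_node_label(full_hash: bytes, prefix_bits: int = 2) -> str:
    """Return the virtual node label chosen by the first prefix_bits bits."""
    total_bits = len(full_hash) * 8
    value = int.from_bytes(full_hash, "big")
    # Keep only the leading prefix_bits bits of the hash value.
    prefix = value >> (total_bits - prefix_bits)
    return "V" + format(prefix, "0{}b".format(prefix_bits))
```

With two prefix bits there are four possible labels, matching the four virtual nodes in the example; more virtual nodes would simply use more leading bits.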

When storing fragment data in a distributed manner as described above, the fragment processing unit 31 manages the data as follows. First, as shown in FIG. 6, each of the data storage containers V00 and V01 has a storage file 38 that holds fragment data and an index file 37 that holds metadata about the fragment data, such as the full hash value. The fragment processing unit 31 appends the fragment data it receives to the storage file 38. Storage files 38 are created with file IDs assigned in ascending order starting from 1; once a certain number of fragment data have been stored in a file, the file ID is incremented and subsequent data is written to the next storage file. Specifically, as shown in FIG. 7(C), the fragment data itself is stored in the storage file 38 at each offset, which represents a storage position within the file.
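The append-only storage files with ascending file IDs can be modeled with a short in-memory sketch. The per-file limit `max_fragments_per_file` is a hypothetical parameter (the text says only "a certain number"), and the position within a file stands in for the byte offset.

```python
class StorageFiles:
    """Toy model of append-only storage files with ascending file IDs."""

    def __init__(self, max_fragments_per_file: int):
        self.max_fragments_per_file = max_fragments_per_file
        self.current_id = 1            # file IDs start at 1
        self.files = {1: []}           # file ID -> fragments in append order

    def append(self, fragment: bytes):
        """Append a fragment and return its (file ID, offset) location."""
        if len(self.files[self.current_id]) >= self.max_fragments_per_file:
            # The current file is full: move on to the next file ID.
            self.current_id += 1
            self.files[self.current_id] = []
        current = self.files[self.current_id]
        current.append(fragment)
        return self.current_id, len(current) - 1

    def read(self, file_id: int, offset: int) -> bytes:
        return self.files[file_id][offset]
```

The `(file ID, offset)` pair returned by `append` corresponds to the pointer that the index data later records for each fragment.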

After writing the fragment data to the storage file 38, the fragment processing unit 31 appends index data, which is the metadata for each fragment data, to the index file as an entry. FIG. 7(B) shows an example of the index file. The index file 37 stores, at each offset (a storage position within the index file), the index data corresponding to fragment data stored in the storage file 38. Each index data contains the full hash value of the block data D from which the fragment data was generated, the storage file ID and offset that serve as a pointer (reference information) to the fragment data, and a time stamp (write time information) based on the time at which the fragment data was written.

The index file 37 is stored in each storage node 3 that stores the fragment data referred to by the index data in that index file 37. That is, when fragment data is stored in a storage node 3, the corresponding index data is stored in the index file 37 formed in the same storage node 3. The index data is appended to the storage area of the index file 37 in order.

Furthermore, for each fragment data stored in the storage file 38, the fragment processing unit 31 stores an entry in the hash table 36 (summary table) that associates a short hash value with information referring to the index data for that fragment data. FIG. 7(A) shows an example of the hash table 36. Each entry of the hash table 36 stores the short hash value of the block data D from which the fragment data was generated, together with the index file ID and offset that serve as a pointer (reference information) to the index data referring to that fragment data. The short hash (partial summary data) is a part (for example, the first 8 bytes) of the full hash value (for example, 20 bytes) of the block data D from which the fragment data was generated.
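The relationship between the full hash value, the short hash value, and the hash table entries can be sketched as follows. SHA-1 is used here only because it yields a 20-byte digest like the example in the text; the embodiment does not name a hash function, and the names `full_hash`, `short_hash`, and `HashTable` are hypothetical.

```python
import hashlib
from collections import defaultdict

SHORT_HASH_BYTES = 8  # "first 8 bytes" in the example above


def full_hash(block: bytes) -> bytes:
    # Assumption: SHA-1, chosen only for its 20-byte digest length.
    return hashlib.sha1(block).digest()


def short_hash(full: bytes) -> bytes:
    # The short hash is simply a prefix of the full hash.
    return full[:SHORT_HASH_BYTES]


class HashTable:
    """Hypothetical summary table: short hash -> pointers into the index file."""

    def __init__(self):
        self._entries = defaultdict(list)

    def add(self, full: bytes, index_file_id: int, index_offset: int) -> None:
        self._entries[short_hash(full)].append((index_file_id, index_offset))

    def lookup(self, full: bytes):
        # Several pointers may come back because only a prefix is compared.
        return list(self._entries.get(short_hash(full), []))
```

Because only a prefix is stored, two different blocks can share a short hash, which is why a lookup returns a list of candidate pointers rather than a single one.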

The hash table 36 is likewise stored in each storage node 3 that stores the index file referred to by that hash table 36, that is, the fragment data corresponding to each index data. When fragment data is stored in a storage node 3, an entry pairing the short hash value with the reference information of the index file containing the corresponding index data is stored in the hash table 36 formed in the same storage node 3. Each entry is appended to the storage area of the hash table 36 in order.

The fragment processing unit 31 then returns the hash value of the block data D to the data transmitting/receiving unit 22 as the storage position of the block data D. The data transmitting/receiving unit 22 manages the identification information of the block data D in association with this hash value. As described later, this allows the storage position to be looked up from the hash value when a read request for the block data D arrives.

The data transmitting/receiving unit 22 issues the write instructions for the plurality of fragment data generated from the block data D to the fragment processing units 31 at the same timing. The fragment processing units 31 of the storage nodes 3 therefore also perform their writes at almost the same timing, so the fragment data generated from one block data D is written to the storage nodes 3 at substantially the same time. As a result, the time stamps in the index data referring to these fragment data have identical or close values. In addition, because index data is appended in order to the storage area of the index file 37, the index data is stored in write order, that is, in time stamp order.

The "full hash value" in each index data stored in the index file 37 is the hash value of the block data D before division, from which the fragment data referred to by that index data was generated. The full hash value in the index data of each of the 12 fragment data is therefore identical across the index files 37 of all storage nodes 3 that store those fragment data. Using this property, as described later, if the index file of any one storage node 3 that stores fragment data generated from one block data D can be accessed, the time stamp at which that fragment data was written can be obtained. The remaining fragment data can then be found easily based on that time stamp.

With the block data D stored in this way, the data transmitting/receiving unit 22 and the fragment processing unit 31 (distributed storage processing means, data search means) can access the hash value serving as the content address of the stored block data D, and the fragment data, as shown in FIG. 8, when a read request arrives from the backup server 4 or during duplication determination.

First, the hash value of the block data D to be searched is confirmed, and the hash table 36 stored in the local node 3 is searched using its short hash value to obtain the target index file ID and offset (see arrow Y1). Next, the index file 37 is accessed using the obtained information to search for index data containing the same full hash value as the data to be searched (see arrow Y2). This makes it possible to perform duplication determination, that is, to determine whether the same full hash value as that of the block data D to be searched already exists. Further, the storage file 38 can be accessed from the index data found by the search to read the target fragment data (see arrow Y3). The fragment data read from the storage nodes 3 is then combined to regenerate the block data D to be searched, and that block data D is returned to complete the read.

When a storage node 3 cannot use its own hash table 36, it accesses the fragment data constituting the stored block data D through the fragment search unit 32 and the time stamp acquisition unit 33 (data search means), as follows.

The fragment search unit 32 uses the time stamp acquisition unit 33 to acquire the time stamp of fragment data that constitutes a part of the block data D to be searched and is stored in another storage node 3. Based on the acquired time stamp, the fragment search unit 32 then obtains, from the time stamps in the index file 37 stored in its own storage node 3 (the specific storage node), a list of fragment data written at a close time.

In response to a request from the fragment search unit 32, the time stamp acquisition unit 33 remotely accesses the hash table 36 of another storage node 3 that stores fragment data of the block data D to be searched. Referring to that hash table 36, it identifies the index data in the index file 37 stored in the other storage node 3 based on the hash value of the block data D to be searched. It thereby acquires the time stamp at which the fragment data corresponding to the identified index data was written, and returns the acquired time stamp to the fragment search unit 32.

The node state monitoring unit 34 has a function of acquiring a list of storage nodes 3 that are operating normally. A normal storage node 3 is a node on which the storage service function is running and which can access data such as the hash table 36.

The functions of the fragment search unit 32, the time stamp acquisition unit 33, and the node state monitoring unit 34 are described in detail in the description of operation below.

[Operation]
Next, the operation of the storage system configured as described above, in particular the data write process, the data read process, and the data read process during hash table reconstruction, is described mainly with reference to the flowcharts of FIGS. 9 to 11.

(Write process)
First, the data write process is described with reference to FIG. 9. When the access node 2 receives a write request from the backup server 4 (step S1), the stream data to be written is divided into block data D of a certain size (step S2). A hash value is calculated for this block data D using a hash function, and duplication determination is performed with that hash value as input (step S3). The duplication determination process is described here with reference to FIG. 8.

Based on the hash value of the block data D calculated as described above, the hash table 36 of an arbitrary storage node 3 is accessed. The hash table 36 is searched for the short hash value of that hash value to check whether an entry (index data) with the same short hash value exists (see arrow Y1). If no entry is found, no data with the same content as the block data D to be stored has been written to the storage node 3, and the result is "non-duplicate". If an entry is found, the index file ID and offset corresponding to the short hash value are acquired. Because the hash table 36 compares only short hash values, more than one entry may be found.

Next, the index file 37 is accessed using the information in the hash table 36 entries that were found, and the full hash value in the index file 37 is compared with the calculated full hash value of the block data D (arrow Y2). If an entry with the same full hash value is found, data with the same content as the block data D to be written has already been written to the storage node 3, and the result is "duplicate". If no matching entry is found, the result is "non-duplicate".
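The two-stage duplication determination (arrow Y1, then arrow Y2) might be sketched as follows. The function name `is_duplicate` and the dictionary-based stand-ins for the hash table and index files are hypothetical and serve only to show the order of the two comparisons.

```python
def is_duplicate(full_hash: bytes, hash_table: dict, index_files: dict) -> bool:
    """Return True when a block with this full hash has already been stored.

    hash_table:  short hash (first 8 bytes) -> list of (index file ID, offset)
    index_files: index file ID -> list of entries, each a dict with 'full_hash'
    (both structures are illustrative stand-ins, not the actual file formats)
    """
    # Arrow Y1: cheap lookup by short hash; an empty result means "non-duplicate".
    candidates = hash_table.get(full_hash[:8], [])
    for index_file_id, offset in candidates:
        # Arrow Y2: confirm with the full hash, since short hashes can collide.
        if index_files[index_file_id][offset]["full_hash"] == full_hash:
            return True
    return False
```

A short hash hit alone is not treated as a duplicate; only a matching full hash in the index file yields "duplicate".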

If the duplication determination described above yields "duplicate" (step S3: Yes), write completion is returned (step S8). If it yields "non-duplicate" (step S3: No), the block data D is divided into fragment data, which is distributed to the storage nodes 3 (step S4). Each storage node 3 then performs the processing enclosed by broken line A to store the fragment data.

The storage node 3 saves the received fragment data in the data storage containers as shown in FIG. 4. Each virtual node (V00 and so on in FIG. 5) has 12 data storage containers (reference numerals 01 to 12). Here, the first two bits of the received hash value are checked to determine which virtual node the data is allocated to. For example, when the hash value is "1010101111...", the virtual node V10 stores the 12 fragment data. One piece of fragment data is stored in each data storage container.

The fragment data is saved, as shown in FIG. 7(C), in the storage file 38 of each data storage container shown in FIG. 6 (step S5). Storage files 38 are created with file IDs numbered in ascending order from 1, and when a fixed number of fragment data has accumulated, writing moves on to the next file.

After the fragment data has been saved in the storage file 38, the index data for that fragment data is appended to the index file 37 as an entry. As shown in FIG. 7(B), the index data consists of the reference information that points to the fragment data (storage file ID and offset), the full hash value of the block data D from which the fragment data was generated, and the time stamp at which the fragment data was written (step S6).

Finally, an entry referring to the index data for the written fragment data is written to the hash table 36. Specifically, as shown in FIG. 7(A), each entry consists of the short hash value of the block data D from which the fragment data was generated and a pointer (index file ID and offset) to the index data referring to that fragment data (step S7). Each stored block data D is then managed in association with its hash value, and the write is complete (step S8).
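The order of steps S5 to S7 within one storage node can be illustrated with the following sketch. It simplifies each container to a single storage file and a single index file held as Python lists; the names `new_node` and `store_fragment` and the dictionary layout are assumptions for illustration only.

```python
import time


def new_node() -> dict:
    # Hypothetical in-memory node: one storage file, one index file, one hash table.
    return {"storage": [], "index": [], "hash_table": {}}


def store_fragment(node: dict, fragment: bytes, full_hash: bytes, now=None) -> int:
    """Store one fragment and return the offset of its index entry."""
    timestamp = time.time() if now is None else now

    # Step S5: append the fragment itself to the storage file.
    node["storage"].append(fragment)
    storage_offset = len(node["storage"]) - 1

    # Step S6: append the index entry (pointer, full hash, time stamp).
    node["index"].append({
        "full_hash": full_hash,
        "storage_offset": storage_offset,
        "timestamp": timestamp,
    })
    index_offset = len(node["index"]) - 1

    # Step S7: register the short hash with a pointer to the index entry.
    node["hash_table"].setdefault(full_hash[:8], []).append(index_offset)
    return index_offset
```

Because every structure is only ever appended to, the index entries end up in write order, which is the property the later time stamp search relies on.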

(Read process)
Next, the data read process is described with reference to FIG. 10. When the access node 2 receives a read request from the backup server 4 (step S11), it confirms the hash value of the block data D to be read from the information it manages (step S12). The access node 2 then distributes the hash value of the block data D to be read to all storage nodes 3 (step S13).

Each storage node 3 then performs the processing enclosed by broken line B to read the fragment data. First, it searches the hash table 36 shown in FIG. 7(A) using the short hash value of the received hash value (see arrow Y1 in FIG. 8) and obtains the target index file ID and offset (step S14). Using this information, the storage node 3 accesses the index file 37 shown in FIG. 7(B) and searches for the index data entry whose full hash value matches that of the target data (step S15; see arrow Y2 in FIG. 8). From the index data found by the search, it accesses the storage file 38 shown in FIG. 7(C) (see arrow Y3 in FIG. 8) and reads the target fragment data (step S16).

After that, the storage node 3 or the access node 2 combines the fragment data read from the storage nodes 3 to regenerate the block data D to be read (step S17) and returns that block data D (step S18).

(Read process during hash table reconstruction)
Next, the data read process during hash table reconstruction is described with reference to FIG. 11. When the access node 2 receives a read request from the backup server 4 (step S21), it confirms the hash value of the block data D to be read from the information it manages (step S22). The access node 2 then distributes the hash value of the block data D to be read to all storage nodes 3 (step S23).

Each storage node 3 then performs the processing enclosed by broken line C to read the fragment data. The storage node 3 checks whether the hash table 36 stored in itself is normal (step S24). If the hash table 36 is normal (step S24: No), it executes the normal read process shown in FIG. 10 (step S25). If the hash table 36 stored in itself is being reconstructed (step S24: Yes), it reads the target fragment data by the following method. In the following, a storage node 3 whose hash table 36 is being reconstructed is called the specific storage node 3, and the remaining storage nodes 3 are called other storage nodes 3.

First, the specific storage node 3 uses the node state monitoring unit 34 to acquire a list of normal storage nodes 3 (step S26). A normal node is a node on which the storage service function is running and which can access data such as the hash table 36. From this list of normal nodes, it searches for a storage node 3 that stores fragment data generated from the same block data D.

Specifically, the specific storage node 3 can use the fragment data distribution rule applied at storage time to find another storage node 3 that stores the target fragment data. The virtual node that stores the fragment data is determined from the first two bits of the hash value of the block data D, and each virtual node holds the position information of the data storage containers in which the fragment data is stored. The specific storage node 3 therefore identifies another storage node 3 holding the fragment data from the hash value of the block data D that the target fragment data constitutes.
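One possible way to combine the distribution rule with the list of normal nodes is sketched below. The `placement` mapping from virtual node label to the storage nodes hosting its containers is a hypothetical structure assumed for illustration; the embodiment states only that each virtual node holds the position information of its containers.

```python
def peers_for_block(block_hash: bytes, placement: dict, healthy_nodes: set,
                    self_node: str) -> list:
    """List other healthy storage nodes that hold fragments of this block.

    placement: virtual node label (e.g. 'V10') -> storage node names
               (hypothetical; stands in for the container position information)
    """
    # Same rule as at write time: the first two bits of the hash pick the virtual node.
    vnode = "V" + format(block_hash[0] >> 6, "02b")
    return [node for node in placement.get(vnode, [])
            if node in healthy_nodes and node != self_node]
```

Any node in the returned list could then be asked for the time stamp of its fragment of the same block.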

Next, the specific storage node 3 accesses the hash table 36 stored in the other storage node 3 holding the target fragment data, using the short hash value of the block data D as a key. From the hash table 36 of the other storage node 3, it acquires the information (index file ID and offset) needed to access the index data, which is the metadata of the target fragment data and is stored in the other storage node 3 (step S27). From the index data obtained from the other storage node 3, it then identifies the index data of the target fragment data using the full hash value of the block data D as a key. In this way, the specific storage node 3 acquires the time stamp, stored in the other storage node 3, that indicates when the fragment data to be read was written (step S28).

Next, based on the time stamp acquired from the other storage node 3, the specific storage node 3 refers to the index file 37 stored in itself and searches for index data entries with a close time stamp value (step S29). The index data searched for in the index file 37 is the index data whose time stamp falls within about one minute before or after the time stamp acquired from the other storage node 3. In storage with no ordering rule, searching all entries by time stamp would be inefficient. In the storage system of this embodiment, however, the index file effectively has an append-only data structure, so the index data in the index file is arranged in time stamp order, which makes it easy to find index data with a close time stamp value.
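Because the index entries are in time stamp order, the window of roughly one minute on either side can be located by binary search rather than a full scan. The following sketch uses Python's standard `bisect` module; the function name `entries_near` and the representation of the index file as a sorted list of time stamps are assumptions for illustration.

```python
from bisect import bisect_left, bisect_right


def entries_near(timestamps: list, reference_ts: float,
                 window_seconds: float = 60.0) -> tuple:
    """Return (start, end) slice bounds of entries inside the time window.

    timestamps must be sorted ascending, which holds for an append-only
    index file written in time order.
    """
    start = bisect_left(timestamps, reference_ts - window_seconds)
    end = bisect_right(timestamps, reference_ts + window_seconds)
    return start, end
```

For example, with time stamps `[0, 100, 130, 190, 400]` and a reference of 130, the bounds are `(1, 4)`, so only the entries at positions 1 to 3 need their full hash values compared.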

The specific storage node 3 then compares, in order, the full hash values of the index file entries found by time stamp, searching for index data with the same hash value (step S30). If the target entry is not found, a read error is returned to the upper layer. If the target entry is found, the fragment data is read from the storage file 38 referred to by that index data (step S31). After that, the specific storage node 3 or the access node 2 combines the fragment data read from the storage nodes 3 to regenerate the block data D to be read (step S32) and returns that block data D (step S33).
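Steps S30 and S31, including the read error case, might look as follows in outline. The exception class `FragmentReadError`, the function `read_from_candidates`, and the dictionary fields are hypothetical names introduced for this sketch.

```python
class FragmentReadError(Exception):
    """Raised when no candidate index entry matches the requested full hash."""


def read_from_candidates(candidates: list, full_hash: bytes,
                         storage_files: dict) -> bytes:
    """Scan time-window candidates in order and read the matching fragment.

    candidates:    index entries (dicts with 'full_hash', 'file_id', 'offset')
    storage_files: storage file ID -> list of fragments (illustrative stand-in)
    """
    for entry in candidates:
        # Step S30: compare full hash values one entry at a time.
        if entry["full_hash"] == full_hash:
            # Step S31: follow the pointer into the storage file.
            return storage_files[entry["file_id"]][entry["offset"]]
    # No match within the window: report a read error to the caller.
    raise FragmentReadError("fragment not found near the reference time stamp")
```

The caller would pass only the entries that fall inside the time window, so the linear comparison stays short.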

As described above, the storage system of the present invention can continue operating while suppressing degradation of read performance, even while the hash table provided for high-speed search has failed and is being recovered. The reasons are as follows.

First, in the storage system of the present invention, the storage files that hold fragment data and the index files that hold fragment data metadata effectively have an append-only data structure on every storage node, so the data within each file is arranged in time stamp order. Therefore, even when information about stored data cannot be obtained from the hash table because it is being reconstructed, it is easy to infer which file holds the fragment data as long as the approximate time (time stamp) at which it was written is known.

In addition, in the present invention, the fragment data generated from the target block data is distributed and stored across the nodes immediately, so these data are very likely to be written within a short time of each other, and all of the fragment data have close time stamps. Acquiring the time stamp at which another fragment data was written therefore makes it possible to estimate when the fragment data on the local node was written.

During hash table reconstruction, the hash table is not used; instead, candidates on the local node are narrowed down using only the time stamp at which another fragment data was written. Because the metadata and other data are arranged in time stamp order as described above, the candidates can be narrowed down very accurately and the storage location of the fragment data can be found quickly.

The above illustrates the case where the hash table 36 of another storage node 3 is searched first to acquire the time stamp of the fragment data, but acquisition of the time stamp is not limited to this method. For example, the index data stored in another storage node 3 may be searched using the full hash value of the block data D to be read, and the time stamp may be identified from the index data containing that full hash value.

Also, although the index data described above includes a time stamp indicating when the referenced fragment data was stored, other data may be stored in place of the time stamp. For example, timing information consisting of a value indicating that fragment data generated from the same block data D was written at the same timing may be stored in the index data instead of the time stamp. Such timing information is information that can be distinguished from other timings. Any information may replace the time stamp as long as it distinguishes the situation when fragment data generated from the same block data D was written from the situation when other block data was written (write status information).
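As one conceivable form of such write status information, a monotonically increasing write identifier shared by all fragments of one block could take the place of the time stamp. This particular choice, and the names `next_write_id` and `entries_with_write_id`, are assumptions made for this sketch and are not prescribed by the embodiment.

```python
from bisect import bisect_left, bisect_right
from itertools import count

# Hypothetical counter: one identifier is issued per block and attached to
# every fragment of that block when it is written.
_write_ids = count(1)


def next_write_id() -> int:
    return next(_write_ids)


def entries_with_write_id(write_ids: list, wanted: int) -> tuple:
    """Return (start, end) bounds of index entries carrying the wanted identifier.

    write_ids must be sorted ascending, which holds when identifiers only
    increase and the index file is append-only.
    """
    return bisect_left(write_ids, wanted), bisect_right(write_ids, wanted)
```

Under this assumption the search on the specific storage node becomes an exact match on the identifier instead of a time window, while the final full hash comparison stays the same.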

Also, although the above illustrates the case where the index file 37 and the hash table 36 are stored in the storage node that stores the fragment data they refer to, the present invention is not limited to this configuration. For example, the index file 37 and the hash table 36 may be stored in any storage device. However, each index file and hash table 36 is preferably stored separately for each storage node that stores the fragment data referred to.

<Appendix>
Part or all of the above embodiment can also be described as in the following supplementary notes. The outline of the configurations of the storage system 100 (see FIG. 12), the program, and the data processing method according to the present invention is described below. However, the present invention is not limited to the following configurations.

(Appendix 1)
A storage system 100 comprising:
distributed storage processing means 111 that generates a plurality of fragment data including divided data obtained by dividing storage target data into a plurality of pieces, and stores the plurality of fragment data in a distributed manner in a plurality of storage means 120,
stores index data, which associates information referring to the fragment data with summary data calculated from the data content of the storage target data made up of that fragment data, separately for each storage means serving as the storage destination of the fragment data referenced by the index data, and
further stores a summary table that associates information referring to the index data with partial summary data consisting of a part of the summary data included in that index data; and
data search means 112 that searches the summary table and the index data based on the summary data corresponding to search request data, and retrieves the plurality of fragment data that constitute the search request data and are stored in the plurality of storage means,
wherein the distributed storage processing means 111 stores, in the index data, write status information representing the circumstances in which the fragment data referenced by the index data was written, and
wherein, when at least a part of the summary table is unavailable, the data search means 112 searches the index data corresponding to a storage means other than the specific storage means that stores the fragment data referenced by the index data that the unavailable summary table references, identifies, based on the summary data of the search target data, a part of the fragment data constituting the search target data and the write status information of that fragment data, and searches the index data corresponding to the specific storage means based on the identified write status information and the summary data of the search target data to identify the other fragment data constituting the search target data.

According to the storage system of the above invention, when data is written, a plurality of fragment data are first generated from the storage target data and stored in a distributed manner in a plurality of storage means. At this time, index data that associates information referring to the fragment data, the write status information of the fragment data, and the summary data of the storage target data is stored separately for each storage means. In addition, a summary table that associates information referring to the index data with partial summary data consisting of a part of the summary data of the storage target data is also stored.
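The write path described above can be sketched as follows. This is only an illustrative model under stated assumptions: SHA-256 stands in for the summary data, a time stamp stands in for the write status information, the block is split into plain slices with no redundancy fragments, and the names `Node`, `IndexEntry`, `PREFIX_LEN`, and `store_block` are hypothetical.

```python
import hashlib
import time
from dataclasses import dataclass, field

PREFIX_LEN = 4  # assumed number of digest bytes used as partial summary data


@dataclass
class IndexEntry:
    digest: bytes        # summary data of the whole block
    write_status: float  # write status information (here, a time stamp)
    location: int        # reference to the fragment in the node's store


@dataclass
class Node:
    fragments: list = field(default_factory=list)      # append-only fragment store
    index: list = field(default_factory=list)          # index data kept per node
    summary_table: dict = field(default_factory=dict)  # partial digest -> index positions


def split(block: bytes, n: int) -> list:
    """Divide a block into n consecutive slices (no parity in this sketch)."""
    size = -(-len(block) // n)  # ceiling division
    return [block[i * size:(i + 1) * size] for i in range(n)]


def store_block(block: bytes, nodes: list, now=None) -> bytes:
    """Store one fragment per node and record index and summary table entries."""
    digest = hashlib.sha256(block).digest()
    stamp = time.time() if now is None else now  # same value for every fragment
    for node, fragment in zip(nodes, split(block, len(nodes))):
        node.fragments.append(fragment)
        node.index.append(IndexEntry(digest, stamp, len(node.fragments) - 1))
        node.summary_table.setdefault(digest[:PREFIX_LEN], []).append(len(node.index) - 1)
    return digest
```

Because the same digest and the same write status value are recorded on every node, either one can later be used to match the fragments of one block across nodes.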

When data is read, the storage system searches to determine whether the target data (search target data) is already stored. First, it searches the summary table using partial summary data, which is a part of the summary data of the search target data, to identify index data. It then searches the identified index data using the summary data of the search target data to confirm whether the search target data exists. The read is then carried out by reading the fragment data whose storage locations have been identified in this way and regenerating the block data.
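The two-stage lookup described above can be sketched for a single node as follows. The dictionary layout (`summary_table`, `index`, `digest`, `location`) and the function name `find_fragment` are assumptions made for illustration, not structures taken from the embodiment.

```python
def find_fragment(node: dict, digest: bytes, prefix_len: int = 4):
    """Return the fragment location on one node, or None if no entry matches.

    node["summary_table"] maps a partial digest to positions in node["index"];
    each index entry is a dict with "digest" and "location" keys.
    """
    # Stage 1: the partial summary data narrows the search to a few candidates.
    for position in node["summary_table"].get(digest[:prefix_len], []):
        entry = node["index"][position]
        # Stage 2: the complete summary data rules out prefix collisions.
        if entry["digest"] == digest:
            return entry["location"]
    return None
```

Keeping only a part of the summary data in the table keeps the table small, at the cost of a second comparison against the index data.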

When part or all of the summary table is unavailable, the storage system searches for the data to be read as follows. First, it searches the index data corresponding to a storage means other than the specific storage means associated with the unavailable summary table for an entry whose summary data matches that of the search target data, and thereby identifies a part of the fragment data of the search target data together with its write status information. Then, using the identified write status information, it also searches the index data corresponding to the specific storage means for an entry whose summary data matches that of the search target data, and identifies the remaining fragment data. Using the write status information of a part of the fragment data in this way allows the remaining fragment data to be identified easily and quickly. As a result, even when at least a part of the summary table cannot be used, data can be searched easily and quickly, and degradation of storage system performance can be suppressed.
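The fallback search described above can be sketched as follows. The sketch assumes that at least one summary table is still usable, that an unavailable table is represented by `None`, and that fragments of one block carry exactly equal write status values; the data layout and the name `find_with_fallback` are hypothetical.

```python
def find_with_fallback(nodes: dict, digest: bytes, prefix_len: int = 4) -> dict:
    """Locate one fragment per node even if some summary tables are unavailable.

    nodes maps a node name to {"summary_table": dict or None, "index": list};
    each index entry is a dict with "digest", "write_status" and "location".
    """
    found, status = {}, None

    # Step 1: use the nodes whose summary table is available.
    for name, node in nodes.items():
        table = node["summary_table"]
        if table is None:
            continue
        for position in table.get(digest[:prefix_len], []):
            entry = node["index"][position]
            if entry["digest"] == digest:
                found[name] = entry["location"]
                status = entry["write_status"]
                break

    if status is None:
        return found  # nothing identified through the available tables

    # Step 2: on the remaining nodes, scan the index data directly, using the
    # write status learned in step 1 to pick out candidate entries.
    for name, node in nodes.items():
        if node["summary_table"] is not None:
            continue
        for entry in node["index"]:
            if entry["write_status"] == status and entry["digest"] == digest:
                found[name] = entry["location"]
                break
    return found
```

With time stamps rather than exact batch values, the equality test in step 2 would be replaced by a comparison within a tolerance.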

(Appendix 2)
The storage system according to appendix 1, wherein
the distributed storage processing means stores the summary table separately for each storage means serving as the storage destination of the fragment data referenced by the index data that the summary table references, and
the data search means searches the available summary table corresponding to the other storage means based on partial summary data that is a part of the summary data of the search target data, searches the index data corresponding to the other storage means, identifies a part of the fragment data constituting the search target data and the write status information of that fragment data, and searches the index data corresponding to the specific storage means based on the identified write status information and the summary data of the search target data to identify the other fragment data constituting the search target data.

(Appendix 3)
The storage system according to appendix 2, wherein
the distributed storage processing means stores the index data in the storage means serving as the storage destination of the fragment data referenced by the index data, and stores the summary table in the storage means serving as the storage destination of the fragment data referenced by the index data that the summary table references, and
the data search means, when the summary table stored in the specific storage means is unavailable, searches the available summary table stored in the other storage means based on partial summary data that is a part of the summary data of the search target data, searches the index data stored in the other storage means, identifies a part of the fragment data constituting the search target data and the write status information of that fragment data, and searches the index data stored in the specific storage means whose summary table is unavailable, based on the identified write status information and the summary data of the search target data, to identify the other fragment data constituting the search target data.

According to the storage system configured as above, when a part of the summary table is unavailable, only the available summary tables are used at first to search for index data associated with summary data matching that of the search target data, and a part of the fragment data of the search target data and its write status information are identified. Then, using the identified write status information, the remaining fragment data is also identified from the index data associated with the unavailable summary table. Thus, even when a part of the summary table cannot be used, data can be searched easily and quickly, and degradation of storage system performance can be suppressed.

(Appendix 4)
The storage system according to any one of appendices 1 to 3, wherein
the distributed storage processing means stores the plurality of fragment data generated from one piece of storage target data in a distributed manner in the plurality of storage means at the same timing, and stores, in the index data as the write status information, write timing information representing a value based on the write timing of the fragment data referenced by the index data, and
the data search means searches the index data stored in the specific storage means based on the write timing information, which is the write status information of the part of the fragment data constituting the search target data identified from the index data corresponding to the other storage means, and on the summary data of the search target data, to identify the other fragment data constituting the search target data.

(Appendix 5)
The storage system according to any one of appendices 1 to 4, wherein
the distributed storage processing means stores, in the index data as the write status information, write time information representing a value based on the write time of the fragment data referenced by the index data, and
the data search means searches the index data stored in the specific storage means based on the write time information, which is the write status information of the part of the fragment data constituting the search target data identified from the index data corresponding to the other storage means, and on the summary data of the search target data, to identify the other fragment data constituting the search target data.

(Appendix 6)
The storage system according to appendix 5, wherein
the data search means searches the index data corresponding to the specific storage means and identifies the other fragment data constituting the search target data from the index data associated with write time information that falls within a predetermined time before or after the identified write time information.

(Appendix 7)
The storage system according to any one of appendices 4 to 6, wherein
the distributed storage processing means, when storing the plurality of fragment data in a distributed manner in the plurality of storage means, appends them in order to the storage area in each storage means.

According to the storage system configured as above, information representing the timing or time at which the fragment data was written is stored in the index data during distributed storage. The storage system first searches the index data corresponding to the available summary table and identifies a part of the fragment data of the search target data together with its write status information, namely the write timing or time. From the identified write timing or time, it then identifies the remaining fragment data from the index data associated with the unavailable summary table as well. At this point, fragment data associated with information whose write timing or time is close to the identified one can be identified. Furthermore, if the fragment data is stored in the storage area of each storage means in the order in which it was written, searching for fragment data on the basis of the write timing or time becomes even easier. As a result, even when part or all of the summary table cannot be used, data can be searched easily and quickly, and degradation of storage system performance can be suppressed.
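The benefit of append-order storage can be sketched as follows: if index entries are kept in write order, their time stamps are sorted, so the entries within a predetermined time of the identified write time form one contiguous range that a binary search can locate. The parallel-list layout, the `window` parameter, and the name `search_window` are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def search_window(times: list, entries: list, digest: bytes,
                  stamp: float, window: float):
    """Search only the index entries written within +/- window of stamp.

    times[i] is the write time of entries[i]; both lists are in append order,
    so times is sorted. Each entry is a (digest, location) tuple.
    """
    low = bisect_left(times, stamp - window)    # first entry inside the window
    high = bisect_right(times, stamp + window)  # one past the last entry inside
    for entry_digest, location in entries[low:high]:
        if entry_digest == digest:
            return location
    return None
```

Only the entries inside the window are compared against the summary data, so the cost of the search no longer grows with the full size of the index data on the node whose summary table is unavailable.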

(Appendix 8)
A program for causing a control device of a storage system to realize:
distributed storage processing means that generates a plurality of fragment data including divided data obtained by dividing storage target data into a plurality of pieces, and stores the plurality of fragment data in a distributed manner in a plurality of storage means,
stores index data, which associates information referring to the fragment data with summary data calculated from the data content of the storage target data made up of that fragment data, separately for each storage means serving as the storage destination of the fragment data referenced by the index data, and
further stores a summary table that associates information referring to the index data with partial summary data consisting of a part of the summary data included in that index data; and
data search means that searches the summary table and the index data based on the summary data corresponding to search request data, and retrieves the plurality of fragment data that constitute the search request data and are stored in the plurality of storage means,
wherein the distributed storage processing means stores, in the index data, write status information representing the circumstances in which the fragment data referenced by the index data was written, and
wherein, when at least a part of the summary table is unavailable, the data search means searches the index data corresponding to a storage means other than the specific storage means that stores the fragment data referenced by the index data that the unavailable summary table references, identifies, based on the summary data of the search target data, a part of the fragment data constituting the search target data and the write status information of that fragment data, and searches the index data corresponding to the specific storage means based on the identified write status information and the summary data of the search target data to identify the other fragment data constituting the search target data.

(Appendix 8.1)
The program according to appendix 8, wherein
the distributed storage processing means stores the summary table separately for each storage means serving as the storage destination of the fragment data referenced by the index data that the summary table references, and
the data search means searches the available summary table corresponding to the other storage means based on partial summary data that is a part of the summary data of the search target data, searches the index data corresponding to the other storage means, identifies a part of the fragment data constituting the search target data and the write status information of that fragment data, and searches the index data corresponding to the specific storage means based on the identified write status information and the summary data of the search target data to identify the other fragment data constituting the search target data.

(Appendix 8.2)
The program according to appendix 8.1, wherein
the distributed storage processing means stores the index data in the storage means serving as the storage destination of the fragment data referenced by the index data, and stores the summary table in the storage means serving as the storage destination of the fragment data referenced by the index data that the summary table references, and
the data search means, when the summary table stored in the specific storage means is unavailable, searches the available summary table stored in the other storage means based on partial summary data that is a part of the summary data of the search target data, searches the index data stored in the other storage means, identifies a part of the fragment data constituting the search target data and the write status information of that fragment data, and searches the index data stored in the specific storage means whose summary table is unavailable, based on the identified write status information and the summary data of the search target data, to identify the other fragment data constituting the search target data.

(Appendix 8.3)
The program according to any one of appendices 8 to 8.2, wherein
the distributed storage processing means stores the plurality of fragment data generated from one piece of storage target data in a distributed manner in the plurality of storage means at the same timing, and stores, in the index data as the write status information, write timing information representing a value based on the write timing of the fragment data referenced by the index data, and
the data search means searches the index data stored in the specific storage means based on the write timing information, which is the write status information of the part of the fragment data constituting the search target data identified from the index data corresponding to the other storage means, and on the summary data of the search target data, to identify the other fragment data constituting the search target data.

(Appendix 8.4)
The program according to any one of appendices 8 to 8.3, wherein
the distributed storage processing means stores, in the index data as the write status information, write time information representing a value based on the write time of the fragment data referenced by the index data, and
the data search means searches the index data stored in the specific storage means based on the write time information, which is the write status information of the part of the fragment data constituting the search target data identified from the index data corresponding to the other storage means, and on the summary data of the search target data, to identify the other fragment data constituting the search target data.

(Appendix 9)
A data processing method comprising:
performing distributed storage processing that generates a plurality of fragment data including divided data obtained by dividing storage target data into a plurality of pieces, and stores the plurality of fragment data in a distributed manner in a plurality of storage means,
stores index data, which associates information referring to the fragment data with summary data calculated from the data content of the storage target data made up of that fragment data, separately for each storage means serving as the storage destination of the fragment data referenced by the index data, and
further stores a summary table that associates information referring to the index data with partial summary data consisting of a part of the summary data included in that index data; and
performing data search processing that searches the summary table and the index data based on the summary data corresponding to search request data, and retrieves the plurality of fragment data that constitute the search request data and are stored in the plurality of storage means,
wherein, in the distributed storage processing, write status information representing the circumstances in which the fragment data referenced by the index data was written is stored in the index data, and
wherein, in the data search processing, when at least a part of the summary table is unavailable, the index data corresponding to a storage means other than the specific storage means that stores the fragment data referenced by the index data that the unavailable summary table references is searched, a part of the fragment data constituting the search target data and the write status information of that fragment data are identified based on the summary data of the search target data, and the index data corresponding to the specific storage means is searched based on the identified write status information and the summary data of the search target data to identify the other fragment data constituting the search target data.

（付記１０）
付記９に記載のデータ処理方法であって、
前記分散記憶処理の際に、前記要約テーブルを、当該要約テーブルにて参照される前記インデックスデータにてさらに参照される前記フラグメントデータの記憶先となる前記記憶手段毎に区別して記憶し、
前記データ検索処理の際に、前記検索対象データの前記要約データの一部である部分要約データに基づいて、前記別の記憶手段に対応する利用可能な前記要約テーブルを探索して、当該別の記憶手段に対応する前記インデックスデータを探索し、前記検索対象データを構成する一部の前記フラグメントデータ及び当該フラグメントデータの前記書き込み状況情報を特定し、この特定した書き込み状況情報及び検索対象データの前記要約データに基づいて前記特定の記憶手段に対応する前記インデックスデータを探索して、前記検索対象データを構成する他の前記フラグメントデータを特定する、
データ処理方法。 (Appendix 10)
A data processing method according to attachment 9, wherein
During the distributed storage process, the summary table is stored separately for each storage unit that is a storage destination of the fragment data that is further referred to by the index data referenced by the summary table,
During the data search process, based on partial summary data that is a part of the summary data of the search target data, the available summary table corresponding to the other storage means is searched for, The index data corresponding to the storage unit is searched, the fragment data constituting the search target data and the write status information of the fragment data are specified, and the specified write status information and the search target data Search the index data corresponding to the specific storage means based on the summary data, and specify the other fragment data constituting the search target data,
Data processing method.

（付記１０．１）
付記１０に記載のデータ処理方法であって、
前記分散記憶処理の際に、前記インデックスデータを、当該インデックスデータにて参照される前記フラグメントデータの記憶先となる前記記憶手段に記憶し、前記要約テーブルを、当該要約テーブルにて参照される前記インデックスデータにてさらに参照される前記フラグメントデータの記憶先となる前記記憶手段に記憶し、
前記データ検索処理の際に、前記特定の記憶手段に記憶された前記要約テーブルが利用不可である場合に、前記検索対象データの前記要約データの一部である部分要約データに基づいて、前記別の記憶手段に記憶されている利用可能な前記要約テーブルを探索して、当該別の記憶手段に記憶されている前記インデックスデータを探索し、前記検索対象データを構成する一部の前記フラグメントデータ及び当該フラグメントデータの前記書き込み状況情報を特定し、この特定した書き込み状況情報及び検索対象データの前記要約データに基づいて、前記要約テーブルが利用不可である前記特定の記憶手段に記憶されている前記インデックスデータを探索して、前記検索対象データを構成する他の前記フラグメントデータを特定する、
データ処理方法。 (Appendix 10.1)
A data processing method according to attachment 10, wherein
In the distributed storage processing, the index data is stored in the storage unit that is a storage destination of the fragment data referenced by the index data, and the summary table is referenced by the summary table. Storing it in the storage means as a storage destination of the fragment data further referred to by index data;
When the summary table stored in the specific storage means is unavailable during the data search process, the different summary data is based on partial summary data that is part of the summary data of the search target data. The available summary table stored in the storage means is searched, the index data stored in the other storage means is searched, a part of the fragment data constituting the search target data, and The index stored in the specific storage means that identifies the write status information of the fragment data, and based on the specified write status information and the summary data of the search target data, the summary table is unavailable Search the data to identify the other fragment data constituting the search target data.
Data processing method.

（付記１０．２）
付記９乃至１０．１のいずれかに記載のデータ処理方法であって、
前記分散記憶処理の際に、１つの記憶対象データから生成された前記複数のフラグメントデータを同一のタイミングで前記複数の記憶手段にそれぞれ分散して記憶すると共に、前記インデックスデータに、当該インデックスデータにて参照される前記フラグメントデータの書き込みタイミングに基づく値を表す書き込みタイミング情報を、前記書き込み状況情報として含めて記憶し、
前記データ検索処理の際に、前記別の記憶手段に対応する前記インデックスデータから特定した前記検索対象データを構成する一部のフラグメントデータの前記書き込み状況情報である前記書き込みタイミング情報、及び、検索対象データの前記要約データに基づいて、前記特定の前記記憶手段に記憶されている前記インデックスデータを探索して、前記検索対象データを構成する他の前記フラグメントデータを特定する、
データ処理方法。 (Appendix 10.2)
A data processing method according to any one of appendices 9 to 10.1,
In the distributed storage process, the plurality of fragment data generated from one storage target data is distributed and stored in the plurality of storage units at the same timing, and the index data is stored in the index data. Write timing information representing a value based on the write timing of the fragment data referred to is included and stored as the write status information,
In the data search process, the write timing information which is the write status information of a part of fragment data constituting the search target data specified from the index data corresponding to the other storage means, and the search target Searching the index data stored in the specific storage means based on the summary data of the data, and specifying the other fragment data constituting the search target data,
Data processing method.

（付記１０．３）
付記９乃至１０．２のいずれかに記載のデータ処理方法であって、
前記分散記憶処理の際に、前記インデックスデータに、当該インデックスデータにて参照される前記フラグメントデータの書き込み時刻に基づく値を表す書き込み時刻情報を、前記書き込み状況情報として含めて記憶し、
前記データ検索処理の際に、前記別の記憶手段に対応する前記インデックスデータから特定した前記検索対象データを構成する一部のフラグメントデータの前記書き込み状況情報である前記書き込み時刻情報、及び、検索対象データの前記要約データに基づいて、前記特定の前記記憶手段に記憶されている前記インデックスデータを探索して、前記検索対象データを構成する他の前記フラグメントデータを特定する、
データ処理方法。 (Appendix 10.3)
A data processing method according to any one of appendices 9 to 10.2,
During the distributed storage process, the index data includes write time information representing a value based on the write time of the fragment data referred to by the index data as the write status information.
In the data search process, the write time information which is the write status information of a part of fragment data constituting the search target data specified from the index data corresponding to the other storage means, and the search target Searching the index data stored in the specific storage means based on the summary data of the data, and specifying the other fragment data constituting the search target data,
Data processing method.

(Appendix 10.4)
A data processing method according to Appendix 10.3,
The data search means searches the index data corresponding to the specific storage means, and specifies the other fragment data constituting the search target data from the index data associated with write time information that falls within a predetermined time before or after the specified write time information,
Data processing method.
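
The time-window search of Appendices 10.3 and 10.4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the names `IndexEntry`, `find_by_time_window`, and `window`, and the use of a hex string as the summary data, are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class IndexEntry:
    digest: str        # summary data of the storage target data (assumed: a hash string)
    write_time: float  # write time information stored with the index entry
    offset: int        # assumed reference to the fragment inside the node's storage file


def find_by_time_window(index: List[IndexEntry], digest: str,
                        reference_time: float, window: float) -> Optional[IndexEntry]:
    """Search one node's index data without using its summary table.

    Only entries whose write time lies within `window` of `reference_time`
    (the write time learned from another node) are compared by digest.
    """
    for entry in index:
        if abs(entry.write_time - reference_time) > window:
            continue  # written too early or too late to belong to the same data
        if entry.digest == digest:
            return entry
    return None


# Index data of a node whose summary table is assumed to be unavailable.
node_index = [
    IndexEntry("aaa", 100.0, 0),
    IndexEntry("bbb", 205.2, 4096),
    IndexEntry("ccc", 300.0, 8192),
]
# A peer node reported that its fragment of "bbb" was written at time 205.0.
hit = find_by_time_window(node_index, "bbb", reference_time=205.0, window=1.0)
```

The window narrows the candidates before any digest comparison, which is the point of storing the write time in the index data; the value of `window` would depend on how closely the nodes' write times agree.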

(Appendix 10.5)
A data processing method according to any one of Appendices 10.2 to 10.4,
In the distributed storage process, when the plurality of fragment data are distributed and stored in the plurality of storage means, each fragment data is appended in order to the storage area in the corresponding storage means,
Storage system.
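
The sequential appending of Appendix 10.5 can be sketched as follows. This is an illustrative sketch only: the class `AppendOnlyStore` and its methods are hypothetical, and the use of binary search over the write times is an assumption made for the example (it relies on entries being appended in non-decreasing time order), not something stated in the text.

```python
import bisect
from typing import List, Tuple


class AppendOnlyStore:
    """One storage means whose storage area is only ever appended to."""

    def __init__(self) -> None:
        self.data = bytearray()                        # stands in for the storage file
        self.times: List[float] = []                   # write time of each entry, in append order
        self.entries: List[Tuple[str, int, int]] = []  # (digest, offset, length)

    def append(self, digest: str, fragment: bytes, write_time: float) -> int:
        """Append a fragment at the end of the storage area and return its offset."""
        if self.times and write_time < self.times[-1]:
            raise ValueError("write_time must not go backwards")
        offset = len(self.data)
        self.data.extend(fragment)
        self.times.append(write_time)
        self.entries.append((digest, offset, len(fragment)))
        return offset

    def entries_in_window(self, reference_time: float, window: float) -> List[Tuple[str, int, int]]:
        """Return the entries written within `window` of `reference_time`."""
        lo = bisect.bisect_left(self.times, reference_time - window)
        hi = bisect.bisect_right(self.times, reference_time + window)
        return self.entries[lo:hi]


store = AppendOnlyStore()
store.append("a", b"xx", 1.0)
store.append("b", b"yyy", 2.0)
store.append("c", b"z", 5.0)
nearby = store.entries_in_window(reference_time=2.0, window=1.0)
```

Because entries are written in order, fragments written at about the same time sit next to each other, so a search guided by write timing only has to look at a contiguous slice of the index.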

Note that the above-described program is stored in a storage device or recorded on a computer-readable recording medium. For example, the recording medium is a portable medium such as a flexible disk, an optical disk, a magneto-optical disk, or a semiconductor memory.

Although the present invention has been described above with reference to the embodiments and the like, the present invention is not limited to those embodiments. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within its scope.

Description of Symbols
1 Storage system
2 Access node
21 I/O processing unit
22 Data transmission/reception unit
23 Duplication determination unit
3 Storage node
31 Fragment processing unit
32 Fragment search unit
33 Time stamp acquisition unit
34 Node state monitoring unit
35 Disk
36 Hash table
37 Index file
38 Storage file
4 Backup system
5 Backup target device
100 Storage system
111 Distributed storage processing means
112 Data search means
120 Storage means

Claims

Distributed storage processing means that generates a plurality of fragment data including divided data obtained by dividing storage target data into a plurality of pieces, and stores the plurality of fragment data in a distributed manner in a plurality of storage means,
stores index data, which associates information referring to the fragment data with summary data calculated based on the data content of the storage target data composed of the fragment data, separately for each storage means serving as a storage destination of the fragment data referred to by the index data,
and further stores a summary table that associates information referring to the index data with partial summary data consisting of a part of the summary data included in the index data; and
Data search means that searches the summary table and the index data based on the summary data corresponding to search target data, and retrieves the plurality of fragment data constituting the search target data stored in the plurality of storage means,
The distributed storage processing means stores the index data including write status information indicating the status at the time of writing the fragment data referred to by the index data,
When at least a part of the summary table is unavailable, the data search means searches the index data corresponding to another storage means different from the specific storage means that stores the fragment data further referred to by the index data referred to by the unavailable summary table, specifies, based on the summary data of the search target data, a part of the fragment data constituting the search target data and the write status information of that fragment data, and searches the index data corresponding to the specific storage means based on the specified write status information and the summary data of the search target data to specify the other fragment data constituting the search target data,
Storage system.
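
The search path recited above, including the fallback used when a summary table is unavailable, can be sketched as follows. This is a simplified illustration under assumptions, not the claimed implementation: `IndexEntry`, `Node`, `build_summary_table`, `lookup`, `find_fragments`, the digest prefix used as partial summary data, and the integer used as write status information are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class IndexEntry:
    digest: str        # summary data of the storage target data
    write_status: int  # write status information (assumed: a write sequence number)
    fragment: bytes    # stands in for the reference to the fragment data


@dataclass
class Node:
    index: List[IndexEntry]
    # Summary table: partial summary data -> positions in `index`; None means unavailable.
    summary_table: Optional[Dict[str, List[int]]]


def build_summary_table(index: List[IndexEntry], prefix_len: int = 2) -> Dict[str, List[int]]:
    table: Dict[str, List[int]] = {}
    for pos, entry in enumerate(index):
        table.setdefault(entry.digest[:prefix_len], []).append(pos)
    return table


def lookup(node: Node, digest: str, hint: Optional[int] = None,
           prefix_len: int = 2) -> Optional[IndexEntry]:
    if node.summary_table is not None:
        # Normal path: partial summary data selects candidate index entries.
        for pos in node.summary_table.get(digest[:prefix_len], []):
            if node.index[pos].digest == digest:
                return node.index[pos]
        return None
    # Fallback path: scan the index data, guided by the peer's write status.
    for entry in node.index:
        if entry.digest == digest and (hint is None or entry.write_status == hint):
            return entry
    return None


def find_fragments(nodes: List[Node], digest: str) -> List[Optional[IndexEntry]]:
    # First pass: nodes whose summary table is available.
    results = [lookup(n, digest) if n.summary_table is not None else None for n in nodes]
    hint = next((r.write_status for r in results if r is not None), None)
    # Second pass: nodes without a summary table, using the write status found above.
    for i, node in enumerate(nodes):
        if node.summary_table is None and hint is not None:
            results[i] = lookup(node, digest, hint)
    return results
```

In this sketch a node without a summary table is searched only after another node has supplied the write status of one fragment, mirroring the order of steps in the claim.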

The storage system according to claim 1,
The distributed storage processing means stores the summary table separately for each storage means serving as a storage destination of the fragment data further referred to by the index data referred to by the summary table,
The data search means searches the available summary table corresponding to the other storage means based on partial summary data that is a part of the summary data of the search target data, searches the index data corresponding to the other storage means, specifies a part of the fragment data constituting the search target data and the write status information of that fragment data, and searches the index data corresponding to the specific storage means based on the specified write status information and the summary data of the search target data to specify the other fragment data constituting the search target data,
Storage system.

The storage system according to claim 2,
The distributed storage processing means stores the index data in the storage means serving as the storage destination of the fragment data referred to by the index data, and stores the summary table in the storage means serving as the storage destination of the fragment data further referred to by the index data referred to by the summary table,
When the summary table stored in the specific storage means is unavailable, the data search means searches the available summary table stored in the other storage means based on partial summary data that is a part of the summary data of the search target data, searches the index data stored in the other storage means, specifies a part of the fragment data constituting the search target data and the write status information of that fragment data, and, based on the specified write status information and the summary data of the search target data, searches the index data stored in the specific storage means whose summary table is unavailable to specify the other fragment data constituting the search target data,
Storage system.

The storage system according to any one of claims 1 to 3,
The distributed storage processing means stores the plurality of fragment data generated from one storage target data in a distributed manner in the plurality of storage means at the same timing, and stores, in the index data, write timing information representing a value based on the write timing of the fragment data referred to by the index data as the write status information,
The data search means searches the index data stored in the specific storage means based on the write timing information, which is the write status information of a part of the fragment data constituting the search target data specified from the index data corresponding to the other storage means, and on the summary data of the search target data, and specifies the other fragment data constituting the search target data,
Storage system.

The storage system according to any one of claims 1 to 4,
The distributed storage processing means stores, in the index data, write time information representing a value based on the write time of the fragment data referred to by the index data as the write status information,
The data search means searches the index data stored in the specific storage means based on the write time information, which is the write status information of a part of the fragment data constituting the search target data specified from the index data corresponding to the other storage means, and on the summary data of the search target data, and specifies the other fragment data constituting the search target data,
Storage system.

The storage system according to claim 5,
The data search means searches the index data corresponding to the specific storage means, and specifies the other fragment data constituting the search target data from the index data associated with write time information that falls within a predetermined time before or after the specified write time information,
Storage system.

The storage system according to any one of claims 4 to 6,
When storing the plurality of fragment data in a distributed manner in the plurality of storage means, the distributed storage processing means appends each fragment data in order to the storage area in the corresponding storage means,
Storage system.

A program that causes a control unit of a storage system to realize the following:
Distributed storage processing means that generates a plurality of fragment data including divided data obtained by dividing storage target data into a plurality of pieces, and stores the plurality of fragment data in a distributed manner in a plurality of storage means,
stores index data, which associates information referring to the fragment data with summary data calculated based on the data content of the storage target data composed of the fragment data, separately for each storage means serving as a storage destination of the fragment data referred to by the index data,
and further stores a summary table that associates information referring to the index data with partial summary data consisting of a part of the summary data included in the index data; and
Data search means that searches the summary table and the index data based on the summary data corresponding to search target data, and retrieves the plurality of fragment data constituting the search target data stored in the plurality of storage means,
The distributed storage processing means stores the index data including write status information indicating the status at the time of writing the fragment data referred to by the index data,
When at least a part of the summary table is unavailable, the data search means searches the index data corresponding to another storage means different from the specific storage means that stores the fragment data further referred to by the index data referred to by the unavailable summary table, specifies, based on the summary data of the search target data, a part of the fragment data constituting the search target data and the write status information of that fragment data, and searches the index data corresponding to the specific storage means based on the specified write status information and the summary data of the search target data to specify the other fragment data constituting the search target data.

A data processing method that performs a distributed storage process of generating a plurality of fragment data including divided data obtained by dividing storage target data into a plurality of pieces, storing the plurality of fragment data in a distributed manner in a plurality of storage means,
storing index data, which associates information referring to the fragment data with summary data calculated based on the data content of the storage target data composed of the fragment data, separately for each storage means serving as a storage destination of the fragment data referred to by the index data,
and further storing a summary table that associates information referring to the index data with partial summary data consisting of a part of the summary data included in the index data,
and performs a data search process of searching the summary table and the index data based on the summary data corresponding to search target data, and retrieving the plurality of fragment data constituting the search target data stored in the plurality of storage means,
In the distributed storage process, the index data is stored including write status information indicating the status at the time of writing the fragment data referred to by the index data,
In the data search process, when at least a part of the summary table is unavailable, the index data corresponding to another storage means different from the specific storage means that stores the fragment data further referred to by the index data referred to by the unavailable summary table is searched, a part of the fragment data constituting the search target data and the write status information of that fragment data are specified based on the summary data of the search target data, and the index data corresponding to the specific storage means is searched based on the specified write status information and the summary data of the search target data to specify the other fragment data constituting the search target data,
Data processing method.

The data processing method according to claim 9,
In the distributed storage process, the summary table is stored separately for each storage means serving as a storage destination of the fragment data further referred to by the index data referred to by the summary table,
In the data search process, the available summary table corresponding to the other storage means is searched based on partial summary data that is a part of the summary data of the search target data, the index data corresponding to the other storage means is searched, a part of the fragment data constituting the search target data and the write status information of that fragment data are specified, and the index data corresponding to the specific storage means is searched based on the specified write status information and the summary data of the search target data to specify the other fragment data constituting the search target data,
Data processing method.