JP6201385B2

JP6201385B2 - Storage apparatus and storage control method

Info

Publication number: JP6201385B2
Application number: JP2013080769A
Authority: JP
Inventors: 裕治金澤
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2013-04-08
Filing date: 2013-04-08
Publication date: 2017-09-27
Anticipated expiration: 2033-04-08
Also published as: JP2014203362A

Description

The present invention relates to a storage apparatus and a storage control method.

Conventionally, there is a deduplication storage technology in which a hash value is calculated before data is written to a nonvolatile storage device such as an HDD (Hard Disk Drive) and, if data with the same hash value is already stored, the stored data is shared instead of the new data being written.

For example, with a 512-bit hash, hash values collide only about once in 10¹⁵⁰ times, so a storage apparatus that deduplicates data can decide whether two pieces of data are the same from the hash value alone, without comparing the data itself.

Here, the data is a file or a block of, for example, 64 KB (kilobytes), and checking identity by comparing the data itself would increase the time required for writing. By judging the identity of data from the hash value, the storage apparatus can therefore shorten the data write time.

There is also a conventional technique that prevents image tampering: when an original image is converted into the JPEG format, secret information is derived from, for example, the quantized DCT coefficients obtained by applying the DCT to each 8 × 8 pixel block of the divided original image and quantizing the result, and the secret information is embedded in each block. Another conventional technique makes tampering with content or metadata difficult by applying to the input content and metadata a conversion process that makes them hard to separate.

JP 2003-289435 A
JP 2004-72184 A

In deduplication storage technology, the probability that hash values collide by accident can be made sufficiently small, but a technique for making hash values collide intentionally may be developed. If such a technique is developed and it becomes possible to create data having the same hash value as given data, a denial-of-service attack becomes possible.

For example, an attacker creates data B having the same hash value as data A, which is expected to be created in the future, and writes data B to the storage apparatus ahead of time. When data A is later written, it is replaced by data B. As a result, the attacker can cause an information processing apparatus that uses data A to perform incorrect processing.

One conceivable countermeasure is to change the hash function once data with identical hash values can be generated. In a storage apparatus, however, changing the hash function is a heavy burden because, for example, the hash values of all stored data must be recalculated.

In one aspect, an object of the present invention is to provide a storage apparatus and a storage control method that can prevent a denial-of-service attack that exploits deduplication storage technology.

In one aspect, the storage apparatus disclosed in the present application includes: a first generation unit that generates a hash value as a first value from data when the data is written to a nonvolatile storage device; a second generation unit that generates a second value using a portion of the data extracted from the data based on a number determined when the data is written to the nonvolatile storage device; and a control unit that controls duplicate storage of the data based on the first value and the second value.

According to one embodiment, a denial-of-service attack that exploits deduplication storage technology can be prevented.

FIG. 1 is a diagram illustrating the functional configuration of the storage apparatus according to the embodiment.
FIG. 2 is a diagram illustrating an example of the correspondence table.
FIG. 3 is a diagram illustrating an example of the data structure of the additional key.
FIG. 4 is a diagram illustrating an example of the hash value table.
FIG. 5 is a flowchart illustrating the processing procedure of the writing unit.
FIG. 6 is a flowchart illustrating the procedure of duplicate data processing.
FIG. 7 is a flowchart illustrating the procedure of new data processing.
FIG. 8 is a flowchart illustrating the processing procedure of the reading unit.
FIG. 9 is a diagram illustrating an example of a denial-of-service attack.
FIG. 10 is a diagram illustrating the hardware configuration of the storage apparatus.

Embodiments of the storage apparatus and the storage control method disclosed in the present application are described below in detail with reference to the drawings. The embodiment does not limit the disclosed technology.

First, the functional configuration of the storage apparatus according to the embodiment is described. FIG. 1 is a diagram illustrating the functional configuration of the storage apparatus according to the embodiment. As illustrated in FIG. 1, the storage apparatus 1 includes a volatile storage unit 10, a control unit 20, and a nonvolatile storage unit 30.

The volatile storage unit 10 is a storage unit whose stored data is lost when the power is turned off. The control unit 20 controls the storage apparatus 1 using the data stored in the volatile storage unit 10. The nonvolatile storage unit 30 stores the data that a server 2 connected over a network stores in the storage apparatus 1. The data stored in the nonvolatile storage unit 30 is not lost even when the power is turned off.

The volatile storage unit 10 includes a data unit 11, a correspondence table 12, and a hash value table 13. The data unit 11 temporarily stores part of the data that the server 2 stores in the nonvolatile storage unit 30.

The correspondence table 12 stores information indicating the correspondence between the identifier of each piece of data stored in the nonvolatile storage unit 30 and the key of that data. FIG. 2 is a diagram illustrating an example of the correspondence table 12. As illustrated in FIG. 2, the correspondence table 12 stores an identifier, a hash value, and an additional key for each piece of data.

The identifier is a value assigned to data in order to identify it. Here, the data is a 64 KB block and the identifier is a block number. When the data is a file, the identifier is the ID of the file.

The hash value is a value calculated from the data using a hash function. Examples of hash functions include MD5 (Message Digest Algorithm 5), SHA (Secure Hash Algorithm) 1, SHA256, and SHA512. For example, when the hash function is SHA512, the hash value is 64 bytes long.
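The digest lengths of the hash functions named above can be checked with a short sketch. It uses Python's standard `hashlib` module purely for illustration; the patent does not specify an implementation, and the all-zero 64 KB sample block is an assumption.

```python
import hashlib

# Digest length in bytes for each hash function named in the text.
DIGEST_SIZES = {
    name: hashlib.new(name).digest_size
    for name in ("md5", "sha1", "sha256", "sha512")
}

# Hash one 64 KB block (all zero bytes, used only as sample input).
block = bytes(64 * 1024)
digest = hashlib.sha512(block).digest()

print(DIGEST_SIZES)
print(len(digest))
```

SHA512 yields the 64-byte (512-bit) value used in the example in the text.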

The additional key is a value determined when data is written to the storage apparatus 1, and it is used together with the hash value as the key of the data. FIG. 3 is a diagram illustrating an example of the data structure of the additional key.

As illustrated in FIG. 3, the additional key is a concatenation of four pairs, each consisting of a 2-byte "position" and a 2-byte "content". The "position" is a random number generated when the data is written to the storage apparatus 1. The "content" is the 2 bytes of data located at the byte offset, counted from the start of the data, that is indicated by the "position".
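The layout of FIG. 3 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the 0-based offsets, the big-endian encoding of the position, and the cap that keeps the 2-byte read inside the data are all assumptions that the text does not specify.

```python
import secrets
import struct

NUM_PAIRS = 4     # the embodiment concatenates four (position, content) pairs
CONTENT_LEN = 2   # each "content" field is 2 bytes long


def make_additional_key(data: bytes) -> bytes:
    """Return a 16-byte key: four 2-byte positions, each followed by the
    2 bytes of `data` found at that position."""
    # Assumption: positions are 0-based offsets, capped so that the position
    # fits in 2 bytes and 2 bytes of content can always be read.
    max_position = min(len(data) - CONTENT_LEN, 0xFFFF)
    if max_position < 0:
        raise ValueError("data must be at least 2 bytes long")
    parts = []
    for _ in range(NUM_PAIRS):
        position = secrets.randbelow(max_position + 1)
        content = data[position:position + CONTENT_LEN]
        # Assumption: the position is stored as a big-endian unsigned 16-bit value.
        parts.append(struct.pack(">H", position) + content)
    return b"".join(parts)


def parse_additional_key(key: bytes) -> list[tuple[int, bytes]]:
    """Split an additional key back into (position, content) pairs."""
    pairs = []
    for i in range(0, len(key), 2 + CONTENT_LEN):
        (position,) = struct.unpack(">H", key[i:i + 2])
        pairs.append((position, key[i + 2:i + 2 + CONTENT_LEN]))
    return pairs
```

Because the positions are drawn at write time, someone preparing data in advance cannot know which bytes will be sampled.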

For example, the correspondence table 12 stores the block number "28391893" as the identifier, "af49389..." as the hash value, and "128 "aa", ..." as the additional key, in association with one another. Here, "128 "aa"" indicates that the 2 bytes of data at the 128th byte from the start of the data are "aa".

The storage apparatus 1 uses the pair of the hash value and the additional key as the key of the data. By using the additional key in addition to the hash value as the data key, the storage apparatus 1 can prevent a denial-of-service attack that exploits deduplication storage technology.

For example, even if data B, which has the same hash value as the correct data A, is written to the storage apparatus 1 before data A is written, the additional keys of data A and data B differ, so data A is still written to the storage apparatus 1.

Because the additional key is a value determined when data A is written to the storage apparatus 1, an attacker cannot identify the additional key in advance. Therefore, even if the attacker manages to generate data B having the same hash value as data A, the attacker cannot abuse deduplication storage technology to replace data A with data B.

Here, four random numbers are used so that four pieces of 2-byte data form the additional key, but the storage apparatus 1 can reduce the possibility that additional keys coincide by chance by using a larger number of random numbers. Also, instead of generating random numbers, the storage apparatus 1 can generate the additional key from another number determined when the data is written to the storage apparatus 1, such as a number based on the time of the write.

Returning to FIG. 1, the hash value table 13 stores information about the data stored in the nonvolatile storage unit 30. FIG. 4 is a diagram illustrating an example of the hash value table 13. As illustrated in FIG. 4, the hash value table 13 stores a hash value, an additional key, position information, and a reference count for each piece of data.

The position information indicates where in the nonvolatile storage unit 30 the data identified by the pair of the hash value and the additional key is stored. The reference count indicates how many different identifiers refer to the data. Because the storage apparatus 1 never stores the same data twice, one piece of data may correspond to a plurality of identifiers.

For example, the hash value table 13 stores "af49389..." as the hash value, "128 "aa", ..." as the additional key, "4324932" as the position information, and "1" as the reference count, in association with one another.
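The two tables can be modelled as plain dictionaries, as in the sketch below. The class and variable names are hypothetical. The identifier 28391893 and the position information 4324932 come from the examples in the text, but the hash value and the additional key are placeholders computed from sample data, because the text elides the real values.

```python
import hashlib
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class DataKey:
    """Key of a piece of data: the hash value plus the additional key."""
    hash_value: bytes
    additional_key: bytes


@dataclass
class HashTableEntry:
    location: int    # position information in the nonvolatile storage unit
    ref_count: int   # number of identifiers that refer to this data


# Correspondence table: identifier -> key of the data.
correspondence_table: dict[int, DataKey] = {}
# Hash value table: key of the data -> where it is stored and how often it is used.
hash_value_table: dict[DataKey, HashTableEntry] = {}

# Placeholder data and key (assumptions; the real values are elided in the text).
sample = b"aa" * 32768                             # a 64 KB block
pair = struct.pack(">H", 128) + sample[128:130]    # position 128, content b"aa"
key = DataKey(hashlib.sha512(sample).digest(), pair * 4)

hash_value_table[key] = HashTableEntry(location=4324932, ref_count=1)
correspondence_table[28391893] = key
```

Keeping the reference count in the hash value table lets several identifiers share one stored copy of the data.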

Returning to FIG. 1, the control unit 20 includes a writing unit 21 and a reading unit 22. The writing unit 21 writes data to the nonvolatile storage unit 30 based on an instruction from the server 2. When writing data, the writing unit 21 generates a hash value from the data and refers to the hash value table 13 to determine whether both the hash value and the data at the positions designated by the additional key match an existing entry.

If both the hash value and the data designated by the additional key match, the writing unit 21 performs the processing for duplicate data. If they do not both match, the writing unit 21 performs the processing that writes the new data to the nonvolatile storage unit 30.

The writing unit 21 includes a generation unit 211. The generation unit 211 generates an additional key from new data when the writing unit 21 writes the new data to the nonvolatile storage unit 30. If the identifier specified in the write request is in the correspondence table 32, the writing unit 21 updates the hash value and the additional key in the correspondence table 32. If it is not, the writing unit 21 creates a new entry in the correspondence table 32 using the identifier, the hash value, and the additional key.

The reading unit 22 reads data from the nonvolatile storage unit 30 based on an instruction from the server 2. The reading unit 22 refers to the correspondence table 12 and the hash value table 13 to obtain the position information of the data from its identifier, and reads the data from the nonvolatile storage unit 30 using the obtained position information. The reading unit 22 then transmits the data read from the nonvolatile storage unit 30 to the server 2.

The nonvolatile storage unit 30 includes a data unit 31, a correspondence table 32, and a hash value table 33. The data unit 31 stores the data written to the storage apparatus 1 by the server 2. Part of the data stored in the data unit 31 is temporarily stored in the data unit 11.

The correspondence table 32 is identical to the correspondence table 12 in the volatile storage unit 10. When the storage apparatus 1 starts up, the data stored in the correspondence table 32 is read into the correspondence table 12. When the correspondence table 12 is updated, the correspondence table 32 is updated as well.

The hash value table 33 is identical to the hash value table 13 in the volatile storage unit 10. When the storage apparatus 1 starts up, the data stored in the hash value table 33 is read into the hash value table 13. When the hash value table 13 is updated, the hash value table 33 is updated as well.

The correspondence table 32 and the hash value table 33 are updated synchronously whenever the correspondence table 12 and the hash value table 13, respectively, are updated, although the storage apparatus 1 can instead update the correspondence table 32 and the hash value table 33 together when it shuts down. The storage apparatus 1 updates them synchronously in order to guard against a failure of the apparatus.

Next, the processing procedure of the writing unit 21 is described. FIG. 5 is a flowchart illustrating the processing procedure of the writing unit 21. As illustrated in FIG. 5, when an application running on the server 2 requests a data write, the writing unit 21 calculates the hash value of the data to be written (step S1).

The writing unit 21 then searches the hash value table 13 for the calculated hash value (step S2) and determines whether data with the same hash value exists (step S3). If such data exists, the writing unit 21 reads the additional key from the hash value table and determines whether the data at the positions designated by the additional key matches (step S4). If the data designated by the additional key also matches, the writing unit 21 performs duplicate data processing (step S5).

On the other hand, if the data designated by the additional key does not match, or if no data with the same hash value exists, the writing unit 21 performs new data processing (step S6).

In this way, when writing data, the writing unit 21 checks for matching data using the additional key as well as the hash value, which allows the storage apparatus 1 to prevent a denial-of-service attack that exploits deduplication storage technology.
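The decision in steps S1 to S6 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function names are hypothetical, the hash value table is simplified to a dictionary that maps a hash value to the list of additional keys stored under it, and the additional key is assumed to use 0-based, big-endian 2-byte positions.

```python
import hashlib
import struct


def contents_match(additional_key: bytes, data: bytes) -> bool:
    """Step S4: check that `data` holds the recorded 2-byte content at every
    position recorded in the additional key."""
    for i in range(0, len(additional_key), 4):
        (position,) = struct.unpack(">H", additional_key[i:i + 2])
        if data[position:position + 2] != additional_key[i + 2:i + 4]:
            return False
    return True


def classify_write(hash_value_table: dict, data: bytes):
    """Return ("duplicate", hash, matched_key) or ("new", hash, None)."""
    hash_value = hashlib.sha512(data).digest()                   # step S1
    for additional_key in hash_value_table.get(hash_value, []):  # steps S2, S3
        if contents_match(additional_key, data):                 # step S4
            return "duplicate", hash_value, additional_key       # step S5
    return "new", hash_value, None                               # step S6
```

A matching hash value alone is therefore not enough for deduplication; the sampled bytes must agree as well.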

Next, the procedure of duplicate data processing is described. FIG. 6 is a flowchart illustrating the procedure of duplicate data processing. As illustrated in FIG. 6, the writing unit 21 increments by 1 the reference count of the entry in the hash value table 13 whose hash value and additional key match (step S11).

The writing unit 21 then searches the correspondence table 12 using the identifier specified in the write request from the server 2 (step S12) and determines whether that identifier is in the correspondence table 12 (step S13).

If the identifier is in the correspondence table 12, the data designated by the identifier has been updated, so the writing unit 21 replaces the hash value and the additional key associated with the identifier in the correspondence table 12 with the new hash value and additional key (step S14). The writing unit 21 then decrements by 1 the reference count, in the hash value table 13, of the data before the update (step S15).

The writing unit 21 then determines whether the decremented reference count is 0 (step S16). If it is 0, the data before the update is no longer referenced, so the writing unit 21 deletes the entry for that data from the hash value table 13 (step S17). The writing unit 21 then reflects the changes to the correspondence table 12 and the hash value table 13 in the nonvolatile storage unit 30 (step S18).

On the other hand, if the identifier specified in the write request is not in the correspondence table 12, data is being written under a new identifier, so the writing unit 21 creates a new entry in the correspondence table 12 using the identifier specified in the write request (step S19). The writing unit 21 then proceeds to step S18.

In this way, the duplicate data processing performed by the writing unit 21 allows the storage apparatus 1 to avoid storing the same data twice.
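Steps S11 to S19 amount to reference-count bookkeeping, which the following sketch illustrates. The function name and the dictionary layout are assumptions: `corr` stands in for the correspondence table (identifier to key), `table` for the hash value table (key to position information and reference count), and a key is a `(hash value, additional key)` tuple. Persisting the tables (step S18) is left out.

```python
def process_duplicate(corr: dict, table: dict, identifier, key) -> None:
    """Point `identifier` at the already stored data identified by `key`."""
    table[key]["refs"] += 1              # step S11
    old_key = corr.get(identifier)       # steps S12, S13
    corr[identifier] = key               # step S14 (update) or S19 (new entry)
    if old_key is not None:
        table[old_key]["refs"] -= 1      # step S15
        if table[old_key]["refs"] == 0:  # step S16
            del table[old_key]           # step S17
    # Step S18 (reflecting the tables in nonvolatile storage) is omitted here.


# Usage: identifier 1 moves from data A to data B, then identifier 2 joins B.
key_a = (b"hash-a", b"key-a")
key_b = (b"hash-b", b"key-b")
corr = {1: key_a}
table = {key_a: {"location": 10, "refs": 1},
         key_b: {"location": 20, "refs": 1}}
process_duplicate(corr, table, 1, key_b)
process_duplicate(corr, table, 2, key_b)
```

Incrementing before decrementing keeps the entry alive when an identifier is rewritten with the data it already refers to.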

Next, the procedure of new data processing is described. FIG. 7 is a flowchart illustrating the procedure of new data processing. As illustrated in FIG. 7, the generation unit 211 generates an additional key (step S30). Specifically, the generation unit 211 generates four 2-byte random numbers and, for each random number, concatenates the random number with the 2 bytes of data at the byte position it indicates, counted from the start of the data, to produce 4 bytes of data. The generation unit 211 then concatenates the four 4-byte values to produce the additional key. Next, the writing unit 21 allocates a new area in the nonvolatile storage unit 30 and writes the data (step S31).

The writing unit 21 then creates a new entry in the hash value table 13 using the hash value and the additional key, and stores the position information and the reference count of the data (step S32). The reference count is set to its initial value of 1.

The writing unit 21 then searches the correspondence table 12 using the identifier specified in the write request from the server 2 (step S33) and determines whether that identifier is in the correspondence table 12 (step S34).

If the identifier is in the correspondence table 12, the data designated by the identifier has been updated, so the writing unit 21 replaces the hash value and the additional key associated with the identifier in the correspondence table 12 with the new hash value and additional key (step S35). The writing unit 21 then decrements by 1 the reference count, in the hash value table 13, of the data before the update (step S36).

The writing unit 21 then determines whether the decremented reference count is 0 (step S37). If it is 0, the data before the update is no longer referenced, so the writing unit 21 deletes the entry for that data from the hash value table 13 (step S38). The writing unit 21 then reflects the changes to the correspondence table 12 and the hash value table 13 in the nonvolatile storage unit 30 (step S39).

On the other hand, if the identifier specified in the write request is not in the correspondence table 12, data is being written under a new identifier, so the writing unit 21 creates a new entry in the correspondence table 12 using the identifier specified in the write request (step S40). The writing unit 21 then proceeds to step S39.

In this way, the new data processing performed by the writing unit 21 allows the storage apparatus 1 to store, in the nonvolatile storage unit 30, data for which no identical data is already stored.
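Steps S30 to S40 can be sketched as below. The names and data layout are assumptions made for illustration: `store` is a list standing in for the nonvolatile data area, an index into it serves as the position information, `corr` and `table` are dictionaries standing in for the correspondence table and the hash value table, and the additional key uses 0-based, big-endian 2-byte positions. Persisting the tables (step S39) is left out.

```python
import hashlib
import secrets
import struct


def make_additional_key(data: bytes) -> bytes:
    """Step S30: four random 2-byte positions, each followed by the 2 bytes
    of data at that position (assumes len(data) >= 2)."""
    max_position = min(len(data) - 2, 0xFFFF)
    parts = []
    for _ in range(4):
        position = secrets.randbelow(max_position + 1)
        parts.append(struct.pack(">H", position) + data[position:position + 2])
    return b"".join(parts)


def process_new(corr: dict, table: dict, store: list, identifier, data: bytes):
    """Store `data` as new content and point `identifier` at it."""
    key = (hashlib.sha512(data).digest(), make_additional_key(data))  # S30
    store.append(data)                                                # S31
    table[key] = {"location": len(store) - 1, "refs": 1}              # S32
    old_key = corr.get(identifier)                                    # S33, S34
    corr[identifier] = key                                            # S35 or S40
    if old_key is not None:
        table[old_key]["refs"] -= 1                                   # S36
        if table[old_key]["refs"] == 0:                               # S37
            del table[old_key]                                        # S38
    # Step S39 (reflecting the tables in nonvolatile storage) is omitted here.
    return key
```

Overwriting an identifier with different data releases the old entry once nothing refers to it; reclaiming the old data area is outside what the text describes and is not modelled.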

Next, the processing procedure of the reading unit 22 is described. FIG. 8 is a flowchart illustrating the processing procedure of the reading unit 22. As illustrated in FIG. 8, when an application running on the server 2 requests a data read, the reading unit 22 searches the correspondence table 12 using the identifier specified in the read request and obtains the hash value and the additional key (step S51).

The reading unit 22 then searches the hash value table 13 using the obtained hash value and additional key and obtains the position information (step S52). Using the obtained position information, the reading unit 22 reads the data from the nonvolatile storage unit 30 and transmits it to the server 2 (step S53).

In this way, the reading unit 22 reads data using the additional key in addition to the hash value, so the storage apparatus 1 can send the server 2 the appropriate data from among pieces of data that share the same hash value.
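The read path in steps S51 to S53 is a two-stage lookup, sketched below with hypothetical names. `corr`, `table`, and `store` are stand-ins for the correspondence table, the hash value table, and the nonvolatile data area, and the sample keys are made-up values chosen so that two entries share one hash value.

```python
def read_data(corr: dict, table: dict, store: list, identifier) -> bytes:
    """Resolve an identifier to its stored data."""
    key = corr[identifier]               # step S51: hash value and additional key
    location = table[key]["location"]    # step S52: position information
    return store[location]               # step S53: read the data


# Two stored items share the hash value b"h" but differ in additional key.
store = [b"fake contents", b"genuine contents"]
table = {(b"h", b"key-1"): {"location": 0, "refs": 1},
         (b"h", b"key-2"): {"location": 1, "refs": 1}}
corr = {"x": (b"h", b"key-1"), "y": (b"h", b"key-2")}
```

Because the lookup uses the full key, `read_data(corr, table, store, "y")` resolves to the second item even though both entries carry the same hash value.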

As described above, in the embodiment the writing unit 21 generates random numbers when writing data to the nonvolatile storage unit 30 and generates an additional key based on the generated random numbers. The storage apparatus 1 then uses the additional key generated by the writing unit 21, together with the hash value, as the key for eliminating duplicate data. Therefore, even when data with the same hash value can be generated, the storage apparatus 1 can prevent a denial-of-service attack that exploits deduplication storage technology.

FIG. 9 is a diagram illustrating an example of a denial-of-service attack. FIG. 9 shows an example of an attack in the case where a file on a main site 7 is copied to a mirror site 8. In FIG. 9, an attacker 6 checks the data contents before the file is copied to the mirror site 8 (1). The attacker 6 then generates a fake file with the same hash value and writes it to the mirror site 8 (2).

After that, the file on the main site 7 is mirrored to the mirror site 8 (3). Conventionally, because the fake file with the same hash value has already been written to the mirror site 8, the mirrored file is judged to be identical to the fake file, and the contents of the file are replaced at the mirror site 8 (4).

With the storage apparatus 1 according to the embodiment, however, even if the attacker 6 has written a fake file to the mirror site 8, the storage apparatus 1 generates an additional key using random numbers when the file is mirrored and judges the identity of data using the additional key in addition to the hash value. By generating for the mirrored file an additional key that differs from that of the fake file, the storage apparatus 1 can prevent the mirrored file from being replaced by the fake file.
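The scenario of FIG. 9 can be simulated in a few lines. This is a toy sketch under loud assumptions: `weak_hash` is a deliberately weak stand-in (the XOR of all bytes) so that a collision is trivial to construct, the class and function names are hypothetical, and the genuine and fake files are chosen to differ at every byte so that the outcome does not depend on which positions are sampled.

```python
import secrets
import struct
from functools import reduce


def weak_hash(data: bytes) -> bytes:
    # Deliberately weak stand-in hash (XOR of all bytes), NOT a real hash function.
    return bytes([reduce(lambda a, b: a ^ b, data, 0)])


def make_key(data: bytes) -> bytes:
    parts = []
    for _ in range(4):
        position = secrets.randbelow(len(data) - 1)
        parts.append(struct.pack(">H", position) + data[position:position + 2])
    return b"".join(parts)


def key_matches(key: bytes, data: bytes) -> bool:
    for i in range(0, len(key), 4):
        (position,) = struct.unpack(">H", key[i:i + 2])
        if data[position:position + 2] != key[i + 2:i + 4]:
            return False
    return True


class DedupStore:
    def __init__(self, use_additional_key: bool):
        self.use_additional_key = use_additional_key
        self.entries = []   # list of (hash, additional key, data)
        self.ids = {}       # identifier -> index into entries

    def write(self, identifier, data: bytes) -> None:
        h = weak_hash(data)
        for index, (entry_hash, entry_key, _) in enumerate(self.entries):
            if entry_hash == h and (not self.use_additional_key
                                    or key_matches(entry_key, data)):
                self.ids[identifier] = index   # treated as duplicate data
                return
        self.entries.append((h, make_key(data), data))
        self.ids[identifier] = len(self.entries) - 1

    def read(self, identifier) -> bytes:
        return self.entries[self.ids[identifier]][2]


GENUINE = b"\x01" * 64   # file on the main site
FAKE = b"\x02" * 64      # attacker's file with the same weak hash


def mirror_after_attack(use_additional_key: bool) -> bytes:
    store = DedupStore(use_additional_key)
    store.write("attacker", FAKE)     # (2) fake file written first
    store.write("mirror", GENUINE)    # (3) genuine file mirrored afterwards
    return store.read("mirror")
```

With hash-only matching, `mirror_after_attack(False)` returns the fake contents; with the additional key enabled, `mirror_after_attack(True)` returns the genuine file, because the bytes sampled from the fake file do not match the genuine one.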

In the embodiment, the correspondence table 12 stores the hash value, the additional key, and the data identifier in association with one another, and the hash value table 13 stores the hash value, the additional key, the storage position of the data, and the reference count in association with one another. The storage apparatus 1 can therefore manage a plurality of identifiers that refer to the same data.

In the embodiment, the storage apparatus 1 generates the additional key using four random numbers, but it can also generate the additional key using more random numbers. The storage apparatus 1 can thereby lower the possibility that, when the hash value and the additional key are used as the data key, the keys of different data coincide by chance.

The storage apparatus described in the embodiment can also be realized by running a program on a CPU. The hardware configuration of a storage apparatus realized in this way is described next.

FIG. 10 is a diagram illustrating the hardware configuration of the storage apparatus. As illustrated in FIG. 10, the storage apparatus 40 includes a main memory 41, a CPU (Central Processing Unit) 42, a host interface 43, and an HDD (Hard Disk Drive) 44.

The main memory 41 is a memory that stores programs, intermediate results of program execution, and the like, and corresponds to the volatile storage unit 10 of FIG. 1. The CPU 42 is a central processing unit that reads programs from the main memory 41 and executes them, and corresponds to the control unit 20 of FIG. 1.

The host interface 43 is an interface for connecting the storage apparatus 40 to the server 2. The HDD 44 is a disk device that stores programs and data, and corresponds to the nonvolatile storage unit 30 of FIG. 1. The storage apparatus 40 can also be provided with an SSD (Solid State Drive) instead of the HDD 44.

DESCRIPTION OF SYMBOLS
1 Storage apparatus
2 Server
6 Attacker
7 Main site
8 Mirror site
10 Volatile storage unit
11 Data unit
12 Correspondence table
13 Hash value table
20 Control unit
21 Writing unit
22 Reading unit
30 Nonvolatile storage unit
31 Data unit
32 Correspondence table
33 Hash value table
40 Storage apparatus
41 Main memory
42 CPU
43 Host interface
44 HDD
211 Generation unit

Claims

1. A storage apparatus comprising:
a first generation unit that generates a hash value as a first value from data when the data is written to a nonvolatile storage device;
a second generation unit that generates a second value using a portion of the data extracted from the data based on a number determined when the data is written to the nonvolatile storage device; and
a control unit that controls duplicate storage of the data based on the first value and the second value.

2. The storage apparatus according to claim 1, further comprising:
a first table that stores the first value and the second value in association with the storage position of the data; and
a second table that stores the first value and the second value in association with an identifier for identifying the data,
wherein the control unit controls duplicate storage of the data using the first table and the second table.

3. The storage apparatus according to claim 1, wherein the second generation unit generates the second value using the number and the extracted data.

4. The storage apparatus according to claim 1, 2, or 3, wherein the number is a random number.

5. A storage control method in which a processor executes a process for a nonvolatile storage device, the process comprising:
generating a hash value as a first value from data when the data is written to the nonvolatile storage device;
generating a second value using a portion of the data extracted from the data based on a number determined when the data is written to the nonvolatile storage device; and
controlling duplicate storage of the data based on the first value and the second value.