JP6648596B2

JP6648596B2 - File system control device, storage system, file system control method, and program

Info

Publication number: JP6648596B2
Application number: JP2016063596A
Authority: JP
Inventors: 正貴的場
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2016-03-28
Filing date: 2016-03-28
Publication date: 2020-02-14
Anticipated expiration: 2036-03-28
Also published as: JP2017182145A

Description

The present invention relates to a file system control device, a storage system, a file system control method, and a program therefor.

Content addressed storage (CAS) stores data using a content address (hereinafter, CA) that is determined from the content of the data to be stored (for example, a hash value of the data).

Because identical content maps to the same CA, a CAS does not need to store data with the same content more than once. It eliminates duplicate recording and thereby reduces the required data capacity.
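The deduplication property described above can be illustrated with a minimal sketch. It assumes SHA-256 as the hash function and an in-memory dictionary as the store; the class name `ContentAddressedStore` and its methods are hypothetical and are not taken from the patent.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory CAS: a block's address is the SHA-256 of its bytes."""

    def __init__(self):
        self.blocks = {}  # content address (hex string) -> block bytes

    def put(self, data: bytes) -> str:
        ca = hashlib.sha256(data).hexdigest()
        # Identical content yields the same CA, so it is stored only once.
        self.blocks.setdefault(ca, data)
        return ca

    def get(self, ca: str) -> bytes:
        return self.blocks[ca]


store = ContentAddressedStore()
ca_a = store.put(b"hello")
ca_b = store.put(b"hello")  # duplicate content, same CA
ca_c = store.put(b"world")
```

Storing the same bytes twice returns the same address and occupies a single entry, which is the capacity saving the text refers to.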

Patent Literature 1 discloses a technique relating to a content-address storage system that identifies the location where data is stored by a unique address determined according to the content of the stored data.

Patent Literature 2 discloses a technique that does not store the actual data but instead uses metadata to track the location of all data in the system, so that many users can access the latest files in synchronization.

Patent Literature 3 discloses a technique relating to a storage system in which indirect addresses are permanently allocated within the file system structure, preventing CA recalculation from propagating to the file system root even when a block is rewritten.

Patent Literature 4 discloses a technique relating to a storage system that, in a CAS, suppresses the time and load required for data copy processing and thereby suppresses degradation of system performance.

Patent Literature 1: International Publication No. WO 2012/101983
Patent Literature 2: JP-T-2015-512071
Patent Literature 3: Japanese Patent No. 5556025
Patent Literature 4: JP 2010-198276 A

A device that stores a file system structure as a tree on a CAS has a performance problem when files are updated.

Ordinary storage stores data using the location where the data is stored as its address. A CAS, in contrast, stores data using a CA determined from the content of the data. That is, the CA used to reference a block is determined only after that block has been stored. Consequently, to determine the content of an upper block in the tree, the CAS must wait until storage of the lower blocks has completed, so storing a tree structure takes time. This behavior is a performance problem particularly in systems with high latency.

The performance problem in updating a file on a CAS is described with reference to FIG. 1. FIG. 1 is a diagram illustrating an example of a file update operation in a CAS.

FIG. 1 illustrates an example of updating "data2", one of the data blocks that make up "/dir1/file1". The file system structure 1000 in the upper part of FIG. 1 shows the original block group, and the file system structure 2000 in the lower part shows the block group during the file update. The file system structure 1000 is a tree structure; blocks on the left side of FIG. 1 are upstream and blocks on the right side are downstream. "ca00" to "ca61" are CAs (content addresses).

For example, the block with CA "ca20" stores CA "ca40", which is the block information of "dir1". The block with CA "ca50" stores CAs "ca60", "ca61", and "ca62", which are the block information of "file1". The block "data2" with CA "ca61" stores data. A block that stores data is called a data block. The CAS performs a file update that updates the block "data2" as follows.

First, the CAS duplicates "data2", modifies the duplicated block "data4" (the hatched portion at (1) in the figure is the changed data), and calculates a new CA, "ca63" ((2) in the figure). A CAS achieves deduplication by consolidating blocks with identical content into a single block that is referenced from multiple upper blocks, so modifying an existing block always requires duplicating it first.

Next, to reflect the CA "ca63" of the changed block in "file1", the CAS first duplicates the block list of "file1" at CA "ca50". The CAS then replaces the CA "ca61" in the duplicated block list with "ca63", recalculates the CA to generate "ca51", and replaces "ca50" with "ca51" ((3) in the figure). In this way, the CAS must duplicate each upper block of the tree and update and recalculate its CA in turn, repeating this until it reaches the root at the top of the tree ((4) in the figure).

The following operations become major performance problems, particularly in a CAS with high latency.

The first problem is that CA recalculation at the time of a file update propagates over a wide range (operations (3) and (4) above). A CAS uses a value calculated from the content of a block (for example, a hash value of the data) as its CA. Therefore, when a data block of a file is rewritten, the CA of that data block changes. The CA of the file that stores that block information then changes, the CA of the directory that stores that file information changes, and so on, so that CA recalculation propagates until it reaches the root of the file system. This is especially inefficient when files are updated frequently.
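The propagation described above can be sketched as follows. This is a simplified illustration under stated assumptions: SHA-256 stands in for the CA function, and each upper block is modelled as holding only the CA of its single child. The five upper levels mirror the path from the data block to the root in FIG. 1; the function names are hypothetical.

```python
import hashlib


def content_ca(payload: bytes) -> str:
    """Existing-method CA: a hash of the block content (SHA-256 assumed)."""
    return hashlib.sha256(payload).hexdigest()


def path_cas(data: bytes, upper_levels: int) -> list:
    """CAs along one path, from the data block up to the root.

    Simplification: each upper block holds only the CA of its single
    child, so the upper block's content is that CA string.
    """
    cas = [content_ca(data)]
    for _ in range(upper_levels):
        cas.append(content_ca(cas[-1].encode()))
    return cas


before = path_cas(b"data2", upper_levels=5)
after = path_cas(b"data2 with an update", upper_levels=5)

# Count how many blocks on the path received a different CA.
changed = sum(old != new for old, new in zip(before, after))
```

Changing the data block alters every CA on the path, so in this model all six blocks must be recomputed and rewritten for a single data change.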

The second problem is that duplication is required at the time of a file update (operation (1) above). In a CAS, blocks with the same content have the same CA, so a single block may be referenced from multiple upper blocks. Therefore, a block cannot simply be overwritten in place when it is rewritten; the original block must first be duplicated, and the change is then applied to the duplicate, which is stored. This duplication slows the response when a file is updated.

Patent Literature 1 discloses the general operation of a CAS; in it, CA recalculation at the time of a file update propagates over a wide range, and duplication is required at the time of a file update.

Patent Literature 2 does not mention improving performance at the time of a file update in a CAS.

Patent Literature 3 can prevent CA recalculation from propagating to the root of the file system, but does not mention improving performance at the time of a file update.

Patent Literature 4 only describes reducing the load of copy processing in a CAS; CA recalculation at the time of a file update still propagates over a wide range, and duplication is still required at the time of a file update.

Accordingly, none of the techniques described in the above literature solves the problem that, in a CAS, CA recalculation at the time of a file update propagates over a wide range and duplication is required at the time of a file update. Performance at the time of a file update therefore remains a problem.

An object of the present invention is therefore to provide a file system control device and the like that solve the above problem, namely that, in a CAS, performance at the time of a file update suffers because CA recalculation propagates over a wide range and duplication is required.

The file system control device of the present invention includes: data reproduction means for updating data and calculating a first content address of a data block based on the file path of the updated location of the data and the offset of the data block; file system reproduction means for calculating the first content address of an upper block that stores the data block updated by the data reproduction means; and file system determination means for converting the first content address into a second content address.

The file system control method of the present invention includes: updating data and calculating a first content address of a data block based on the file path of the updated location of the data and the offset of the data block; calculating the first content address of an upper block that stores the updated data block; and converting the first content address into a second content address.

The computer program of the present invention causes a computer to execute: a process of updating data and calculating a first content address of a data block based on the file path of the updated location of the data and the offset of the data block; a process of calculating the first content address of an upper block that stores the updated data block; and a process of converting the first content address into a second content address.

According to the present invention, in a CAS, CA recalculation at the time of a file update need not propagate over a wide range and no duplication is required at the time of a file update, so performance at the time of a file update is not degraded.

FIG. 1 is a diagram illustrating an example of a file update operation in a CAS.
FIG. 2 is a block diagram illustrating an example of the configuration of the storage system according to the first embodiment.
FIG. 3 is a diagram illustrating an example of an initial state in which directories, files, data blocks, and their tree structure are stored in the data storage unit.
FIG. 4 is a diagram illustrating an example of an operation in which the file system control device updates "file1" from the state of FIG. 3.
FIG. 5 is a diagram illustrating an example of an operation in which the file system determination unit finalizes data.
FIG. 6 is a block diagram illustrating an example of the configuration of the file system control device according to the second embodiment.

<First embodiment>
A first embodiment of the present invention will be described in detail with reference to the drawings.

FIG. 2 is a block diagram illustrating an example of the configuration of the storage system 10 according to the first embodiment.

The storage system 10 includes a file system control device 100 and a data storage unit 200.

The file system control device 100 includes a file system access unit 101, a file system reproduction unit 102, a data reproduction unit 103, and a file system determination unit 104.

The file system access unit 101 provides clients with a means of accessing the file system (the data storage unit 200). File system accesses fall into two types: operations on the file system structure, such as open, close, mkdir, and unlink, and operations on data, such as read and write. Because the file system structure in this embodiment is a tree structure, it is hereinafter also referred to as the tree structure. The file system access unit 101 routes operation requests of the former type to the file system reproduction unit 102 and data transfer requests of the latter type to the data reproduction unit 103.
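The routing performed by the file system access unit 101 can be sketched as a simple dispatch on the operation name. The operation names come from the text above; the function `route` and the returned unit labels are hypothetical names chosen for this illustration only.

```python
# Operations on the file system (tree) structure, as listed in the text.
STRUCTURE_OPS = {"open", "close", "mkdir", "unlink"}
# Operations on data, as listed in the text.
DATA_OPS = {"read", "write"}


def route(operation: str) -> str:
    """Return the name of the unit that should handle the operation."""
    if operation in STRUCTURE_OPS:
        return "file_system_reproduction_unit_102"
    if operation in DATA_OPS:
        return "data_reproduction_unit_103"
    raise ValueError(f"unsupported operation: {operation}")


structure_target = route("mkdir")
data_target = route("write")
```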

The file system reproduction unit 102 receives a tree structure operation request from the file system access unit 101, reads the information needed for the operation (directories, files, data blocks) from the data storage unit 200, and holds the result. The information held in the file system reproduction unit 102 is called first information.

The data reproduction unit 103 receives a data transfer request from the file system access unit 101 and performs read and write processing on the data storage unit 200. For example, when performing write processing, the data reproduction unit 103 holds the write data. The information held in the data reproduction unit 103 is called second information.

The file system determination unit 104 detects files that have not been updated for a certain period and instructs that the first information and second information (the tree structure and the write data) held by the file system reproduction unit 102 and the data reproduction unit 103, respectively, be stored in the data storage unit 200. The file system determination unit 104 can also instruct the file system reproduction unit 102 and the data reproduction unit 103 to write all of the information they hold to the data storage unit 200.

The file system access unit 101, the file system reproduction unit 102, the data reproduction unit 103, and the file system determination unit 104 are configured by, for example, hardware circuits such as logic circuits.

Alternatively, the file system access unit 101, the file system reproduction unit 102, the data reproduction unit 103, and the file system determination unit 104 may be functional units realized by a processor of the file system control device 100, which is a computer, executing a program in a memory (not shown).

The data storage unit 200 is configured by, for example, a storage device such as a disk device or a semiconductor memory.

The file system reproduction unit 102 and the data reproduction unit 103 also have a memory function for holding data internally.

The process of updating a file stored in the data storage unit 200 is described below with reference to FIGS. 3 to 5. For the sake of explanation, FIGS. 3 to 5 each show the data storage unit 200, the file system reproduction unit 102, and the data reproduction unit 103 together with the blocks or block groups stored in them.

FIG. 3 is a diagram illustrating an example of an initial state in which a tree structure including directories, files, and data blocks is stored in the data storage unit 200.

The tree structure in FIG. 3 is the original file system structure and is the same as that in FIG. 1 described above, so a detailed description is omitted.

FIG. 4 is a diagram illustrating an example of an operation in which the file system control device 100 updates the block information of "file1" from the state of FIG. 3. Steps (1) to (4) below show the specific procedure performed by the file system control device 100.

(1) First, the file system access unit 101 notifies the file system reproduction unit 102 of the write (update) request for "file1" received from the client.

(2) Next, the file system reproduction unit 102 reads the tree structure information of "file1" from the data storage unit 200 and returns the CA "ca61" of the data block corresponding to the update location to the file system access unit 101. The file system reproduction unit 102 holds the tree structure information of "file1".

(3) The file system access unit 101 then notifies the data reproduction unit 103 of the write request from the client and the CA "ca61" of the original block. Without duplicating the original block "data2", the data reproduction unit 103 writes the update data to a new block "data2'" and holds the offset and size of the update data together with the CA "ca61" of the original block. The data reproduction unit 103 then calculates the CA of the block storing the update data by the new method described later, obtaining "ca61'", and returns it to the file system access unit 101.

Item (3) above corresponds to the content shown inside the frame of the data reproduction unit 103 in FIG. 4. For example, the new block corresponds to the part labeled "data2'". The CA "ca61" of the original block corresponds to "Original data", and the CA "ca61'" of the new block corresponds to "Targeted data". The offset and size of the update data correspond to "Modified". As shown in FIG. 4, specific values of the offset and size of the update data are, for example, "129:85" and "257:42", which correspond to the hatched positions in data2' on the right side. The units of the offset and size are, for example, bits or words.

The new method is, for example, a function that calculates a CA from a file ID (identification) (or full path) and the offset of a data block. The offset of the data block is, for example, the position information of "ca61'" relative to "file1". As an example in binary notation, if the file ID is "0100" and the offset of the data block is "0010", the new method yields "0110" as the CA "ca61'".
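The text gives only one input/output pair for the new method and does not name the combining function. The sketch below assumes a bitwise OR, which is one function consistent with that pair (XOR or addition would give the same result for these values); the name `new_method_ca` is hypothetical.

```python
def new_method_ca(file_id: int, block_offset: int) -> int:
    """CA derived from file ID and block offset only, not from block content.

    Assumption: bitwise OR is used to combine the two values. The source
    specifies only the example 0100, 0010 -> 0110, not the function itself.
    """
    return file_id | block_offset


ca = new_method_ca(0b0100, 0b0010)
ca_bits = format(ca, "04b")  # 4-bit binary string for comparison with the text
```

Because the block content is not an input, rewriting the data in the block leaves this CA unchanged, which is the property the embodiment relies on.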

(4) After this, the file system access unit 101 notifies the file system reproduction unit 102 of the received CA "ca61'". The file system reproduction unit 102 then recalculates, by the new method, the CA of "file1", which is the file storing the information of the rewritten block, and the CAs of the upper blocks up to the root of the file system. As shown in FIG. 4, the recalculated CAs are "ca50'", "ca40'", "ca20'", "ca10'", and "ca00'".

As described in (3) above, when updating a file, the file system control device 100 writes only the changed portions and leaves the unchanged portions as references to the original block. The file system control device 100 thereby keeps the amount written to the minimum necessary and shortens the response time.
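The block that holds only the changed portions can be sketched as a small record containing the original block's CA and the modified ranges. The sketch assumes byte units and a 512-byte block for illustration (the text says the unit may be bits or words); the class `DeltaBlock` and its methods are hypothetical names. The offsets and sizes reuse the "129:85" and "257:42" values from FIG. 4.

```python
from dataclasses import dataclass, field


@dataclass
class DeltaBlock:
    """Update data plus a reference to the unmodified original block."""
    original_ca: str                              # CA of the original block
    modified: dict = field(default_factory=dict)  # offset -> update bytes

    def write(self, offset: int, data: bytes) -> None:
        # Only the changed range is recorded; the original is not copied.
        self.modified[offset] = data

    def materialize(self, read_original) -> bytes:
        """Build the full block by overlaying the updates on the original."""
        buf = bytearray(read_original(self.original_ca))
        for offset, data in sorted(self.modified.items()):
            buf[offset:offset + len(data)] = data
        return bytes(buf)


originals = {"ca61": bytes(512)}       # assumed 512-byte original block
delta = DeltaBlock(original_ca="ca61")
delta.write(129, b"\x01" * 85)         # offset 129, size 85
delta.write(257, b"\x02" * 42)         # offset 257, size 42
full_block = delta.materialize(originals.__getitem__)
```

At write time only 127 bytes are recorded in this example. The `materialize` step, which reads the original block, can be deferred to a later point such as when the data is finalized.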

In (3) and (4) above, by calculating CAs with the new method, the file system control device 100 needs to recalculate CAs only on the first update when the same file is updated multiple times; the second and subsequent updates require no CA recalculation. To prevent CA collisions between the existing method and the new method, for example, the existing method sets the leading bit of the CA to "0" and the new method sets it to "1". Under the new method, continuing the example above, "1" is prepended to "0110", giving the CA "10110". A CA of the new method is also called a first content address, and a CA of the existing method is also called a second content address.
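The leading-bit convention can be sketched with CAs represented as bit strings. The helper names are hypothetical; the prefix values "0" and "1" and the example "0110" to "10110" come from the text above.

```python
EXISTING_METHOD_PREFIX = "0"  # second content address (content-based)
NEW_METHOD_PREFIX = "1"       # first content address (file ID + offset)


def tag_new(ca_bits: str) -> str:
    """Prepend the new-method marker bit to a CA bit string."""
    return NEW_METHOD_PREFIX + ca_bits


def tag_existing(ca_bits: str) -> str:
    """Prepend the existing-method marker bit to a CA bit string."""
    return EXISTING_METHOD_PREFIX + ca_bits


def is_new_method(tagged_ca: str) -> bool:
    """Tell the two address spaces apart by the leading bit."""
    return tagged_ca.startswith(NEW_METHOD_PREFIX)


tagged = tag_new("0110")
```

Even if both methods happen to produce the same bit pattern, the tagged addresses differ in the first bit, so the two address spaces cannot collide.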

FIG. 5 is a diagram illustrating an example of an operation in which the file system determination unit 104 finalizes data.

First, the file system determination unit 104 detects a file that has not been updated for a certain period and instructs the file system reproduction unit 102 and the data reproduction unit 103 to write the corresponding information that each holds to the data storage unit 200. At this time, the data reproduction unit 103 duplicates the unchanged portions that had been left as references to the original block and then writes the block to the data storage unit 200. The period mentioned above is set in advance by the file system administrator based on, for example, the latency of the CAS.
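Detection of files that have not been updated for the configured period can be sketched as a comparison of last-update times against the current time. The function name, the timestamp representation (seconds as floats), and the sample paths are assumptions made for this illustration.

```python
def find_stable_files(last_update: dict, now: float, quiet_period: float) -> list:
    """Return paths whose last update is at least quiet_period in the past."""
    return sorted(
        path for path, updated_at in last_update.items()
        if now - updated_at >= quiet_period
    )


# Hypothetical last-update times in seconds.
last_update = {"/dir1/file1": 100.0, "/dir1/file2": 190.0}
stable = find_stable_files(last_update, now=200.0, quiet_period=60.0)
```

Here only "/dir1/file1" qualifies, so only its held information would be written out; "/dir1/file2" was updated 10 seconds ago and stays in the new-method state.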

Next, the file system determination unit 104 recalculates the CAs by the existing method, based on the content of the blocks that the data storage unit 200 received via the file system reproduction unit 102 and the data reproduction unit 103, and updates the file system structure in the data storage unit 200. As described above, the existing method determines the address from the content of the stored data. The file system determination unit 104 recalculates the CA of an upper block by the existing method once all of its lower blocks have CAs of the existing method.
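The bottom-up rule in the preceding paragraph can be sketched as a recursive pass over a small tree. The sketch assumes SHA-256 for the existing method, a leading "0" or "1" character as the method marker, and plain dictionaries as nodes with a hypothetical `stable` flag on data blocks; none of these names come from the patent.

```python
import hashlib


def content_ca(payload: bytes) -> str:
    """Existing-method CA: marker '0' followed by a content hash."""
    return "0" + hashlib.sha256(payload).hexdigest()


def is_existing(ca: str) -> bool:
    return ca.startswith("0")


def finalize(node: dict) -> str:
    """Convert new-method CAs to existing-method CAs where allowed."""
    if "children" in node:
        for child in node["children"]:
            finalize(child)
        # An upper block is converted only when every lower block
        # already has an existing-method CA.
        if all(is_existing(c["ca"]) for c in node["children"]):
            joined = "".join(c["ca"] for c in node["children"]).encode()
            node["ca"] = content_ca(joined)
    elif node.get("stable", True):
        node["ca"] = content_ca(node["data"])
    return node["ca"]


file1 = {"ca": "1-file1", "children": [
    {"ca": "1-data2", "data": b"new data2", "stable": True},
    {"ca": "1-data3", "data": b"new data3", "stable": False},
]}
first_pass = finalize(file1)           # data3 is still being updated
file1["children"][1]["stable"] = True
second_pass = finalize(file1)          # all lower blocks are now convertible
```

After the first pass the file block keeps its new-method CA because one data block is not yet converted; after the second pass it receives a content-based CA.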

The above describes, as an example, the case where a file system with a hierarchical directory structure is stored on a CAS. However, applying this embodiment can also be expected to improve file update performance when a file system without a hierarchy, such as object storage, is stored on a CAS.

As described above, the storage system 10 of this embodiment introduces a new CA calculation method that does not depend on the binary representation (content) of the data, and uses it alongside the existing method. The storage system 10 normally uses the existing method, but uses the new method at the time of a file update.

The existing method calculates the CA from the content of a block. Blocks with the same content have the same CA, so the storage system 10 obtains the benefit of deduplication.

The new method, in contrast, calculates the CA from the file ID (or full path) and the offset of the data block. The new method produces a CA that is unique within the system and makes duplication at the time of a file update unnecessary. However, because the CA is unique, the new method provides no deduplication.

First, because the CA of the new method does not depend on the content of the block, no CA recalculation is needed when a data block is rewritten, and the upper blocks need not be rewritten. However, the new method cannot obtain the benefit of deduplication. To address this, the storage system 10 introduces a mechanism that asynchronously (as a post-process) rewrites CAs calculated by the new method into CAs of the existing method. The storage system 10 thereby makes file update processing more efficient while still obtaining the benefit of deduplication in the long term.

Second, because the CA of the new method is unique and a block is never referenced from more than one place, the original block need not be duplicated at the time of a file update and can be rewritten in place. The storage system 10 can thereby shorten the response time at the time of a file update. On the first rewrite of a block, however, the CA of the original block is of the existing method, so the block would have to be duplicated. To address this, the storage system 10 stores only the update data in the new block, saves the CA of the original block and information on the rewritten data (data offset and data size), leaves the unchanged portions as references to the original block, and does not duplicate them. The storage system 10 can thereby also shorten the response time of the first block rewrite. The storage system 10 duplicates the referenced portions later in a post-process.

As described above, the storage system 10 of this embodiment allows multiple CA schemes to coexist on the CAS that stores the file system, and migrates between those CA schemes when triggered by the stability of the data (for example, the data has not been updated for a long period).

In a CAS with high latency, the storage system 10 can improve file update performance without lowering the deduplication rate in the long term.

Furthermore, in a system that frequently updates the same file, the storage system 10 can be expected to improve file update performance without lowering the deduplication rate.

In this way, the storage system 10 optimizes CA recalculation and turns the duplication performed at the time of a file update into a post-process.

The storage system 10 according to this embodiment has the following effect.

Namely, it solves the problem that, in a CAS, performance at the time of a file update suffers because CA recalculation propagates over a wide range and duplication is required at the time of a file update.

The reason is that the CA is calculated by the new method from the file path and the offset of the data block, and the CA calculated by the new method is asynchronously rewritten to a CA of the existing method.
<Second embodiment>
Next, a second embodiment of the present invention will be described in detail with reference to the drawings.

FIG. 6 is a block diagram illustrating an example of the configuration of the file system control device 300 according to the second embodiment. The second embodiment corresponds to an example of the minimum configuration of the file system control device 300 of the first embodiment.

The file system control device 300 includes a data reproduction unit 301, a file system reproduction unit 302, and a file system determination unit 303.

The data reproduction unit 301 updates data and calculates a first content address of the data block based on the file path of the updated location of the data and the offset of the data block.

The file system reproduction unit 302 calculates a first content address of the upper block that stores the data block updated by the data reproduction unit 301.

The file system determination unit 303 converts the first content address into a second content address.
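The relationship between the two kinds of content address can be sketched as follows. This is an illustration under stated assumptions: SHA-256, the `path@offset` seed format, and the dictionary-based block store are hypothetical choices, and the handling of upper blocks is omitted. The first-bit flag follows the convention stated in the claims (first bit "1" for the first content address, "0" for the second).

```python
import hashlib
from typing import Dict


def _with_first_bit(digest: bytes, bit: int) -> bytes:
    """Force the most significant bit of the address to the given value."""
    head = (digest[0] | 0x80) if bit else (digest[0] & 0x7F)
    return bytes([head]) + digest[1:]


def first_ca(file_path: str, block_offset: int) -> bytes:
    """First content address: derived from file path and block offset only."""
    seed = f"{file_path}@{block_offset}".encode("utf-8")
    return _with_first_bit(hashlib.sha256(seed).digest(), 1)


def second_ca(content: bytes) -> bytes:
    """Second content address: derived from the block content."""
    return _with_first_bit(hashlib.sha256(content).digest(), 0)


def is_first_ca(ca: bytes) -> bool:
    return bool(ca[0] & 0x80)


def convert(store: Dict[bytes, bytes], ca: bytes) -> bytes:
    """Rekey a block from its first content address to its second one."""
    if not is_first_ca(ca):
        return ca  # already content-based
    content = store.pop(ca)
    new_ca = second_ca(content)
    store[new_ca] = content  # identical content lands on the same key
    return new_ca
```

Because `first_ca` does not depend on the block content, rewriting a block leaves its address unchanged. Once `convert` runs, two blocks with identical content share a single entry again.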

The file system control device 300 according to the present embodiment has the effect described below.

The reason is that the device updates the data, calculates the first content address of the data block based on the file path of the updated location of the data and the offset of the data block, calculates the first content address of the upper block that stores the updated data block, and then converts the first content address into the second content address.

The embodiments of the present invention have been described above with reference to the drawings, but the present invention is not limited to these embodiments. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention.

DESCRIPTION OF SYMBOLS
10 Storage system
100 File system control device
101 File system access unit
102 File system reproduction unit
103 Data reproduction unit
104 File system determination unit
200 Data storage unit
300 File system control device
301 Data reproduction unit
302 File system reproduction unit
303 File system determination unit
1000 File system structure
2000 File system structure

Claims

1. A file system control device comprising:
data reproduction means for updating data and calculating a first content address of a data block based on the file path of the updated location of the data and the offset of the data block;
file system reproduction means for calculating a first content address of an upper block that stores the data block updated by the data reproduction means; and
file system determination means for converting the first content address into a second content address.

2. The file system control device according to claim 1, wherein the second content address is determined based on the content of the data.

3. The file system control device according to claim 1, wherein the first bit of the second content address is set to "0", and the first bit of the first content address is set to "1".

4. A storage system comprising:
the file system control device according to any one of claims 1 to 3; and
data storage means for storing the data.

5. A file system control method comprising:
updating data and calculating a first content address of a data block based on the file path of the updated location of the data and the offset of the data block;
calculating a first content address of an upper block that stores the updated data block; and
converting the first content address into a second content address.

6. The file system control method according to claim 5, wherein the second content address is determined based on the content of the data.

7. The file system control method according to claim 5, wherein the first bit of the second content address is set to "0", and the first bit of the first content address is set to "1".

8. A program causing a computer to execute:
a process of updating data and calculating a first content address of a data block based on the file path of the updated location of the data and the offset of the data block;
a process of calculating a first content address of an upper block that stores the updated data block; and
a process of converting the first content address into a second content address.

9. The program according to claim 8, wherein the second content address is determined based on the content of the data.

10. The program according to claim 8, wherein the first bit of the second content address is set to "0", and the first bit of the first content address is set to "1".