JP2018185760A

JP2018185760A - Storage device for eliminating duplication of data

Info

Publication number: JP2018185760A
Application number: JP2017088762A
Authority: JP
Inventors: Masaki Kobayashi (小林 正樹)
Original assignee: Toshiba Corp; Toshiba Digital Solutions Corp
Current assignee: Toshiba Corp; Toshiba Digital Solutions Corp
Priority date: 2017-04-27
Filing date: 2017-04-27
Publication date: 2018-11-22

Abstract

PROBLEM TO BE SOLVED: To provide a storage device that holds the entries of a hash table used for data deduplication in a fixed amount of cache memory, allowing high-speed retrieval while preventing a shortage of cache memory. SOLUTION: A storage device divides data designated by a host computer into a plurality of chunks, generates a hash value for each chunk on the basis of the data of that chunk, and comprises a volatile memory that holds a hash table in which the identifier and the hash value of each chunk written to a storage medium are stored in association with each other. When a chunk is to be written to the storage medium, the hash table is searched for a hash value matching the hash value generated from the data of the chunk. If the hash value is not recorded, the hash value is written into the hash table. If the hash value has been recorded, the writing of the hash value is suppressed. SELECTED DRAWING: Figure 7

Description

Embodiments of the present invention relate to a storage device for eliminating duplication of data.

In recent years, storage devices have grown in capacity to store the big data used in information processing systems, and there is also demand for storage devices that use their limited storage capacity effectively. One technique attracting attention is data deduplication: when data is stored, it is divided into pieces of a fixed size, a hash value is calculated for each piece, and if the same value already exists in the data table holding the hash values, the corresponding data is not written.

Japanese Patent No. 5444506

By its operating principle, data deduplication requires searching a hash table. The hash table grows as the amount of stored data increases, so shortening the hash table search time is important for improving deduplication throughput. For example, placing the hash table in memory allows high-speed searching, but the required amount of memory grows with the amount of data, which raises cost and, depending on the amount of data, can exhaust the available capacity. Alternatively, if the body of the hash table is held in a secondary storage device such as an HDD, as with virtual memory, cost is reduced and the upper limit on the amount of data is raised, but searching takes longer.

The storage device of the embodiment includes: a chunk dividing unit that divides data designated by a host computer into a plurality of chunks; a hash generation unit that generates a hash value for each of the plurality of chunks based on the data of that chunk; a volatile memory that holds a hash table in which the identifier of each chunk written to a recording medium is stored in association with the hash value of that chunk; and a control unit that, when a first chunk is to be written to the recording medium, searches whether a hash value matching the hash value generated by the hash generation unit from the data of the first chunk is already recorded in the hash table, writes the first chunk if no matching hash value is recorded, and suppresses the write if a matching hash value is recorded.

FIG. 1 is a block diagram of a storage system according to the embodiment.
FIG. 2 is a block diagram showing the functional configuration of the disk array control device.
FIG. 3 is a diagram showing an example of the relationship between data designated by a write request from the host computer and chunks.
FIG. 4 is a diagram showing an example of the data structure of the hash table.
FIG. 5 is a diagram showing an example of the data structure of the meta table.
FIG. 6 is a diagram showing an example of the data structure of the chunk table.
FIG. 7 is a diagram showing the procedure of the data write process executed when the disk array control device receives a data write request.
FIG. 8 is a diagram showing the procedure of the data write process executed when the disk array control device receives a data read request.
FIG. 9 is a diagram showing the procedure of the aging cycle process for sorting the entries of the hash table.

Embodiments for carrying out the invention are described below.

FIG. 1 is a block diagram of the storage system according to the embodiment. The storage system of this embodiment includes a host computer 100 (hereinafter referred to as the host 100) and a storage device 200. The storage device 200 is connected to the host 100 via a network 110.

The host 100 is a physical computer such as a server or a client PC. An application program for accessing data in the storage device 200 runs on the host 100. In accordance with this application program, the host 100 uses the storage device 200 as an external storage device via the network 110. The network is, for example, a SAN (Storage Area Network) or Ethernet (registered trademark).

The storage device 200 includes a disk array control device 210 and a RAID disk array 230.

The disk array control device 210 is connected to the RAID disk array 230 via a disk interface bus 220. The interface type of the disk interface bus 220 is, for example, SCSI, FC, SAS (Serial Attached SCSI), or SATA (Serial AT Attachment).

The disk array control device 210 includes a host interface (host I/F) 211, a disk interface (disk I/F) 212, a cache memory 213, a cache controller 214, a FROM 215, a local memory 216, a CPU 217, a chipset 218, and an internal bus 219.

The disk array control device 210 is connected, through the host I/F 211 and via the network 110, to the host 100, which uses the storage device 200 as an external storage device. The interface type of the host I/F 211 is, for example, FC (Fibre Channel) or iSCSI (Internet SCSI).

The host I/F 211 controls data transfer (the data transfer protocol) with the host 100.

The disk I/F 212 is an interface for transmitting and receiving data write/read requests from the host 100 to the drives of the RAID disk array 230, together with the responses to those requests. The cache controller 214 issues data write/read requests to the cache memory 213.

The cache memory 213 is used as a cache for write/read requests from the host 100. In this embodiment, it mainly stores the hash table 213a shown in FIG. 2. The cache memory 213 may be made redundant, for example by mirroring.

The cache controller 214 performs data read/write processing on the cache memory 213 in accordance with instructions from the disk array control program running on the CPU 217.

The FROM 215 is a rewritable nonvolatile memory for storing the disk array control program executed by the CPU 217. In this embodiment, the disk array control program stored in the FROM 215 is copied to the local memory 216 in the first processing performed when the disk array control device 210 starts up.

The local memory 216 is used to store the disk array control program copied from the FROM 215. In addition, part of the local memory 216 is used as a work area by the CPU 217 during operation.

The CPU 217 controls each unit in the disk array control device 210 in accordance with the program code of the disk array control program stored in the local memory 216. In addition, the disk array control program stored in the local memory 216 runs on the CPU 217 via the chipset 218 and controls the disk array control device 210 as a whole.

The chipset 218 is a bridge circuit that couples the CPU 217 and its peripheral circuits to the internal bus 219.

The internal bus 219 is a general-purpose bus, for example a PCI Express bus. The host I/F 211, the disk I/F 212, the cache controller 214, and the chipset 218 are interconnected by the internal bus 219. The FROM 215, the local memory 216, and the CPU 217 are connected to the internal bus 219 via the chipset 218.

The RAID disk array 230 is configured using a plurality of hard disk drives (HDDs). In place of the RAID disk array 230, a RAID array composed of a plurality of HDDs and a plurality of SSDs (solid state drives), or a RAID array composed only of a plurality of SSDs, may be used. A simple set of storage drives without a RAID configuration (a drive array) may also be used. The RAID disk array 230 is managed by the disk array control program, which also controls the functions described below that use the RAID disk array 230. Furthermore, in this embodiment, the RAID disk array 230 stores the meta table 230a and the chunk table 230b shown in FIG. 2. A flash memory or the like may also be used in place of the RAID disk array 230.

In this embodiment, the disk array control device 210 is provided in the storage device 200, which includes the RAID disk array 230. However, the disk array control device 210 may be provided independently of the storage device 200. In that case, the disk array control device 210 may be built into the host 100. The functions of the disk array control device 210 may also be implemented using part of the functions of the operating system (OS) of the host 100.

The disk array control device 210 may also be provided on a card that is used by being inserted into a card slot of the host 100. Alternatively, part of the disk array control device 210 may be built into the host 100 and the remainder provided on the card.

FIG. 2 is a block diagram showing the functional configuration of the disk array control device 210 shown in FIG. 1. The disk array control device 210 includes a dividing unit 300, a duplication management unit 301, a duplication determination unit 302, a hash generation unit 303, and an access controller 304. The functions of the dividing unit 300, the duplication management unit 301, the duplication determination unit 302, and the hash generation unit 303 are described later. The access controller 304 controls the reading and writing of data from and to the RAID disk array 230. These functional elements 300 to 304 are software modules realized by the CPU 217 of the disk array control device 210 shown in FIG. 1 executing the control program. Some or all of the functional elements 300 to 304 may instead be realized as hardware modules.

The local memory 216 includes a control program area 216a, a table area 216b, and a work area 216c. The control program area 216a is used to store at least part of the control program executed by the CPU 217. As described above, this control program is stored in advance in the FROM 215, and at least part of it is loaded from the FROM 215 into the control program area 216a of the local memory 216 when the disk array control device 210 starts up.

The table area 216b is used to store at least part of the various tables stored in the disk array control device 210. The work area 216c is used to store temporary data used when the CPU 217 executes the control program.

The RAID disk array 230 stores the meta table 230a and the chunk table 230b (both described in detail later). That is, the RAID disk array 230 includes storage areas in which the meta table 230a and the chunk table 230b are respectively stored.

The cache memory 213 stores the hash table 213a (described in detail later). That is, the cache memory 213 includes a storage area in which the hash table 213a is stored. The storage capacity of the hash table 213a is assumed to be fixed.

In a typical storage system, operation starts with the storage capacity initially allocated to a logical volume (for example, the smallest unit of storage capacity). A logical volume is a storage area that the host recognizes as a logical storage drive. Storage areas (physical areas) of the RAID disk array 230 are allocated to the logical volume as needed, for example in units of 4 K (kilo) bytes. A storage system of this kind has a function for flexibly increasing the storage capacity of a logical volume according to conditions after operation begins. The same applies to the storage system of this embodiment.

FIG. 3 shows an example of the relationship between data designated by a write request from the host 100 and chunks. The dividing unit 300 first divides the data shown in FIG. 3 into blocks of a fixed length, for example 4 kilobytes (4 Kbytes). Each such 4-Kbyte block of data is called a chunk. The duplication management unit 301 and the duplication determination unit 302 search for and manage duplication on a per-chunk basis. In the example of FIG. 3, the size of data F is 4N Kbytes. In this case, data F is divided into N chunks, #0 to #N-1. The duplication management unit 301 assigns each of the chunks #0 to #N-1 a chunk number, which is a unique identification number. In this embodiment, a chunk number is represented by 8 bytes.
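The fixed-length division described above can be sketched as follows. This is a minimal illustration, not the patented implementation; the names `CHUNK_SIZE` and `split_into_chunks` are hypothetical, and the handling of a final chunk shorter than 4 Kbytes is an assumption the text does not address.

```python
CHUNK_SIZE = 4 * 1024  # 4-Kbyte fixed-length chunks, as in the embodiment


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into consecutive chunks of chunk_size bytes.

    Assumption: if the data length is not a multiple of chunk_size,
    the last chunk is simply shorter.
    """
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


# Data F of 4N Kbytes with N = 4 yields chunks #0 to #3.
chunks = split_into_chunks(bytes(4 * CHUNK_SIZE))
```

Data of 4 Kbytes or less comes back as a single chunk, which matches the behavior the description gives for small writes.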

An overview of the deduplication applied at data write time in this embodiment is given here. First, assume that the host 100 requests the disk array control device 210 of the storage device 200 to write data that includes chunk A, and that the hash value of chunk A is H(A). The duplication determination unit 302 compares the hash value H(A) of chunk A, one by one, with the hash values registered in the hash table. Based on the result of this comparison, the duplication determination unit 302 determines whether a chunk having the same content as chunk A (hereinafter also referred to as chunk A) is stored in the RAID disk array 230.

First, the hash table 213a stored in the cache memory 213 is described with reference to FIG. 4. The hash table has a set of entries associated with chunk numbers. Each entry of the hash table consists of a hash value, a reference bit, and a hit counter for the chunk to which the chunk number registered in that entry is assigned.

As described above, the hash value is a value calculated with a predetermined hash function from the data of the chunk to which the associated chunk number is assigned. The reference bit is flag information indicating that the associated chunk has been accessed: it is set to "1" when the chunk indicated by the associated chunk number is accessed within a period of the aging cycle process described later. After the aging cycle process finishes, the reference bit is set to "0" and the flag information is cleared. The hit counter is used to determine the priority of hash entries during the aging cycle process. The duplication determination unit 302 uses the hash table 213a to determine whether a chunk having the same content as a chunk to be written to the RAID disk array 230 has already been written.
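One way to picture a hash table entry is the sketch below. The class and function names are hypothetical, and the rule for updating the hit counter is deliberately left out because the text defers it to the aging cycle process.

```python
from dataclasses import dataclass


@dataclass
class HashEntry:
    chunk_number: int       # identifier of the chunk this entry describes
    hash_value: bytes       # hash computed from the chunk's data
    reference_bit: int = 0  # 1 if the chunk was accessed in the current aging period
    hit_counter: int = 0    # used to rank entries during the aging cycle process


def mark_accessed(entry: HashEntry) -> None:
    """Set the reference bit when the chunk is accessed."""
    entry.reference_bit = 1


def clear_after_aging(entry: HashEntry) -> None:
    """Clear the reference bit once an aging cycle has finished."""
    entry.reference_bit = 0


entry = HashEntry(chunk_number=1, hash_value=b"\x00" * 32)
mark_accessed(entry)
```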

If a hash value matching the hash value H(A) (that is, the hash value H(A) itself) is registered in the hash table 213a, the duplication determination unit 302 determines that chunk A is stored in the RAID disk array 230; that is, it determines that chunk A is a duplicate. If, on the other hand, the hash value H(A) is not registered in the hash table 213a, the duplication determination unit 302 determines that chunk A is not a duplicate.

If chunk A is determined not to be a duplicate, the access controller 304 issues an instruction to write chunk A to the physical area indicated by the physical address assigned to the logical address designated by the write request. At this time, the duplication management unit 301 registers hash information, including the pair of the hash value H(A) and the chunk number of chunk A, in the hash table 213a.

If, on the other hand, chunk A is determined to be a duplicate, the access controller 304 suppresses the writing of chunk A to the RAID disk array 230. In this case, the duplication management unit 301 assigns (that is, maps) the physical address at which chunk A has already been written in the RAID disk array 230 to the logical address designated by the write request from the host 100.
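The two branches of this overview can be condensed into a short sketch. It is a simplification under stated assumptions: the hash table here maps a hash value directly to a physical address (the embodiment stores a chunk number and resolves the address through separate tables), a Python list stands in for the disk array, and SHA-256 is used as the hash function. All names are hypothetical.

```python
import hashlib


def write_chunk(logical_addr, chunk, chunk_hash, hash_table, mapping, disk):
    """Write one chunk with deduplication; return True if it was physically written.

    hash_table: hash value -> physical address (simplified, see lead-in)
    mapping:    logical address -> physical address
    disk:       list standing in for the physical areas of the array
    """
    if chunk_hash in hash_table:
        # Duplicate: suppress the write and map to the existing physical address.
        mapping[logical_addr] = hash_table[chunk_hash]
        return False
    # Not a duplicate: write the chunk and register its hash.
    disk.append(chunk)
    physical_addr = len(disk) - 1
    hash_table[chunk_hash] = physical_addr
    mapping[logical_addr] = physical_addr
    return True


hash_table, mapping, disk = {}, {}, []
chunk_a = b"A" * 4096
h_a = hashlib.sha256(chunk_a).digest()
first = write_chunk(0, chunk_a, h_a, hash_table, mapping, disk)
second = write_chunk(4096, chunk_a, h_a, hash_table, mapping, disk)
```

After both calls, two logical addresses refer to one stored copy of chunk A.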

FIG. 5 shows an example of the data structure of the meta table 230a. The meta table 230a is used to manage the chunks written in the areas (4-Kbyte areas) obtained by dividing a logical volume into 4 K (kilo) byte units. The meta table 230a is a set of entries, each associated with a logical address pointing to one 4-Kbyte area of the logical volume. Each entry of the meta table 230a is used to register the chunk number of the chunk stored in the 4-Kbyte area indicated by the logical address associated with that entry. By referring to the meta table 230a, the duplication management unit 301 can therefore identify the chunk, and hence the data, stored in the area designated by a target logical address.
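As an illustration only, the meta table can be modeled as a mapping from 4-Kbyte area index to chunk number. The `MetaTable` class and its methods are hypothetical names, and keying entries by `logical_addr // 4096` is an assumption about how a logical address selects its entry.

```python
CHUNK_SIZE = 4 * 1024  # size of each logical area


class MetaTable:
    """Maps each 4-Kbyte area of a logical volume to a chunk number."""

    def __init__(self):
        self._entries = {}  # area index -> chunk number

    def register(self, logical_addr: int, chunk_number: int) -> None:
        self._entries[logical_addr // CHUNK_SIZE] = chunk_number

    def lookup(self, logical_addr: int):
        """Return the chunk number for the area containing logical_addr, or None."""
        return self._entries.get(logical_addr // CHUNK_SIZE)


meta = MetaTable()
meta.register(8192, chunk_number=7)
```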

FIG. 6 shows an example of the data structure of the chunk table 230b. The chunk table 230b has a set of entries associated with chunk numbers. Each entry of the chunk table 230b is used to register chunk information about the chunk having the chunk number associated with that entry, and consists of a chunk number, a physical address, and a duplication count. The physical address is the physical address at which the chunk with the associated chunk number is recorded. The duplication count indicates how many logical addresses the chunk with the associated chunk number is associated with. A duplication count of 1 indicates that the chunk is written only in the 4-Kbyte area pointed to by a single logical address in the logical volume; in other words, the chunk is not duplicated. A duplication count of 2 indicates that, as a result of deduplication, only one chunk is actually written to the RAID disk array 230 but that chunk is associated with two logical addresses. That is, from the viewpoint of the host 100, chunks with the same content are written in the 4-Kbyte areas pointed to by two logical addresses in the logical volume.
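A chunk table entry and its duplication count might be modeled as below. The names are hypothetical, and incrementing the count when a second logical address is mapped to the chunk is an assumption that follows from the definition of the count rather than a step quoted from the text.

```python
from dataclasses import dataclass


@dataclass
class ChunkEntry:
    chunk_number: int
    physical_addr: int  # where the single stored copy resides
    dup_count: int = 1  # number of logical addresses referring to this chunk


chunk_table = {}  # chunk number -> ChunkEntry


def register_new(chunk_number: int, physical_addr: int) -> None:
    """Register a newly written chunk with a duplication count of 1."""
    chunk_table[chunk_number] = ChunkEntry(chunk_number, physical_addr)


def add_reference(chunk_number: int) -> None:
    """Assumed behavior: one more logical address now refers to this chunk."""
    chunk_table[chunk_number].dup_count += 1


register_new(5, physical_addr=0x1000)
add_reference(5)  # a second logical address now maps to chunk 5
```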

(Operation during write processing)
Next, the write-processing operation performed in this embodiment when the disk array control device 210 of the storage device 200 receives a data write request from the host 100 is described with reference to FIG. 7.

Assume that a data write request designating data to be written has been sent from the host 100 to the storage device 200 via the network 110, and that the disk array control device 210 of the storage device 200 has received this request.

First, the dividing unit 300 of the disk array control device 210 splits the data designated by the data write request into units of, for example, 4 Kbytes. The dividing unit 300 thereby divides the designated data into a plurality of chunks, each 4 Kbytes in size (S1). In other words, the duplication management unit 301 obtains, from the designated data, the plurality of chunks that make up that data. If the size of the designated data is 4 Kbytes or less, the dividing unit 300 obtains the data itself as a single chunk. The chunk size does not have to be fixed; it may be variable.

The duplication management unit 301 sets the number of obtained chunks in a variable N (S2). The N obtained chunks are denoted as chunks C_1 to C_N.

The duplication management unit 301 then sets a variable n, which is used to designate one of the N chunks obtained in step S1, to an initial value of 1 (S3). Next, the hash generation unit 303 calculates the hash value Hn(C_n) of the n-th chunk C_n (S4). Since n = 1 initially, the hash value H1(C_1) of the first chunk C_1 is calculated. The hash value is calculated using, for example, the hash function known as SHA-256. In this embodiment, each of the hash values H1(C_1) to HN(C_N) is represented by 32 bytes.
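The hash calculation in step S4 can be reproduced with Python's standard `hashlib` module, whose SHA-256 digest is 32 bytes long, matching the size stated above. The function name `chunk_hash` is hypothetical.

```python
import hashlib


def chunk_hash(chunk: bytes) -> bytes:
    """Return the 32-byte SHA-256 digest of a chunk's data."""
    return hashlib.sha256(chunk).digest()


h1 = chunk_hash(b"\x00" * 4096)
h2 = chunk_hash(b"\x00" * 4096)  # identical content gives an identical hash
h3 = chunk_hash(b"\x01" * 4096)
```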

Next, the duplication management unit 301 executes a hash value search process that searches the hash table 213a for a hash value matching the hash value Hn(C_n) of the n-th chunk C_n calculated in step S4 (S5). Based on the result of this search, it is determined whether a hash value matching the hash value Hn(C_n) of the selected chunk C_n exists in the hash table 213a (S6).

Based on the result of this determination by the duplication determination unit 302, the duplication management unit 301 proceeds to step S7 or step S14.

First, consider the case where no hash value matching the hash value Hn(C_n) of the selected chunk C_n exists (No in S6). In this case, the duplication management unit 301 determines that no chunk having the same content as chunk C_n is stored.

In step S7, the duplication management unit 301 assigns a chunk number CNC_n to chunk C_n. Next, the access controller 304 writes chunk C_n to a free storage area in the RAID disk array 230 (S8).

The duplication management unit 301 then obtains the logical address indicating the position of the selected chunk C_n in the logical volume. This logical address is obtained by adding 4K to the logical address designated by the data write request from the host 100. The duplication management unit 301 further registers the chunk number CNC_n assigned to the selected chunk C_n in the entry of the meta table 230a associated with the obtained logical address (S9).
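The text says only that 4K is added to the logical address in the request. One plausible reading, assumed here and not confirmed by the source, is that the n-th chunk lies (n - 1) x 4 Kbytes past the requested address, so that consecutive chunks occupy consecutive 4-Kbyte areas. The function name is hypothetical.

```python
CHUNK_SIZE = 4 * 1024


def chunk_logical_address(request_addr: int, n: int) -> int:
    """Logical address of the n-th chunk (1-based) of a write request.

    Assumption: chunks are laid out back to back, so chunk n starts
    (n - 1) * 4 Kbytes after the address given in the request.
    """
    if n < 1:
        raise ValueError("n is 1-based")
    return request_addr + (n - 1) * CHUNK_SIZE


addr_first = chunk_logical_address(0x10000, 1)
addr_third = chunk_logical_address(0x10000, 3)
```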

The duplication management unit 301 also registers the physical address PC_n of the storage area in the RAID disk array 230 to which chunk C_n was written, in the entry of the chunk table 230b associated with the chunk number CNC_n of chunk C_n. In addition, the duplication management unit 301 sets the duplication count of the entry corresponding to chunk number CNC_n in the chunk table 230b to 1 (S10).

Next, the duplication management unit 301 registers the chunk number CNC_n and the hash value Hn(C_n) of chunk C_n in the hash table 213a. If the hash table 213a has few free entries at this time, the entry at the bottom of the hash table 213a (that is, the entry whose hit counter holds the smallest value) is deleted and the entry for chunk C_n is registered (details are described later). The duplication management unit 301 also sets the reference bit of the entry corresponding to chunk number CNC_n in the hash table 213a to "1" (S11).

Next, it is determined whether the variable n exceeds the number of chunks N. If the variable n does not exceed the number of chunks N (No in S12), the duplication management unit 301 increments the variable n by 1 (S13) in order to process the writing of the next chunk included in the data specified by the data write request from the host 100, and returns to step S4. If, on the other hand, the variable n exceeds the number of chunks N (Yes in S12), the duplication management unit 301 determines that all chunks included in the data specified by the data write request from the host 100 have been written. In this case, the data write processing ends.

Next, the operation when a hash value matching the hash value Hn(C_n) of the chunk C_n exists, that is, when the duplication determination unit 302 determines Yes in step S6, is described. The duplication management unit 301 determines that a chunk having the same content as the selected chunk C_n is already stored in the RAID disk array 230. Here, the chunk that is already stored in the RAID disk array 230 and has the same content as the chunk C_n is denoted as chunk C_x. In this case, the chunk number CNC_x of the chunk C_x is used as the chunk number of the chunk C_n.

In this case, to prevent multiple chunks having the same content from being stored redundantly in the RAID disk array 230, the access controller 304 suppresses the operation of writing the selected chunk C_n to the RAID disk array 230 (S14).

Subsequently, the duplication management unit 301 updates the entry information of the chunk table 230b associated with the chunk number CNC_x of the selected chunk C_n (S15). Specifically, it increments the duplication count of the entry by 1.

Next, the duplication management unit 301 registers the chunk number CNC_x of the chunk C_x, which has the same content as the chunk C_n and is already written to the RAID disk array 230, in the entry of the meta table 230a associated with the logical address to which the chunk C_n is to be written (S16).

Subsequently, the duplication management unit 301 updates the entry information of the hash table 213a associated with the chunk number CNC_x of the chunk C_n (S17). If the reference bit field holds "0", it is updated to "1" to set the flag. If "1" is already stored and the flag is set, it is left unchanged. The process then proceeds to step S12.
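The write path described above can be illustrated with a minimal Python sketch. It is not the embodiment's implementation: the class name `DedupStore`, the use of SHA-256 as the hash function, the dictionary-based tables, byte-offset logical addresses, and the initial hit counter value of 0 are all assumptions made for illustration. Eviction from a full hash table and the handling of overwritten logical addresses are omitted.

```python
import hashlib

CHUNK_SIZE = 4096  # 4 KB per chunk, as in the description


class DedupStore:
    """Toy in-memory stand-in for the tables named in the description (hypothetical)."""

    def __init__(self):
        self.meta_table = {}   # logical address -> chunk number (cf. meta table 230a)
        self.chunk_table = {}  # chunk number -> [physical address, duplication count] (cf. 230b)
        self.hash_table = {}   # hash value -> entry dict (cf. hash table 213a)
        self.disk = []         # list index plays the role of a physical address
        self.next_chunk_no = 0

    def write(self, logical_addr, data):
        # Split the request data into fixed-size chunks.
        chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
        for n, chunk in enumerate(chunks):
            addr = logical_addr + n * CHUNK_SIZE  # assumption: byte offsets
            digest = hashlib.sha256(chunk).hexdigest()  # assumption: SHA-256
            entry = self.hash_table.get(digest)
            if entry is None:
                # No matching hash: store the chunk and register it (cf. S10, S11).
                self.disk.append(chunk)
                chunk_no = self.next_chunk_no
                self.next_chunk_no += 1
                self.chunk_table[chunk_no] = [len(self.disk) - 1, 1]
                self.hash_table[digest] = {
                    "chunk_no": chunk_no, "ref_bit": 1, "hit_counter": 0}
            else:
                # Matching hash: suppress the write (cf. S14), raise the
                # duplication count (cf. S15), and set the reference bit (cf. S17).
                chunk_no = entry["chunk_no"]
                self.chunk_table[chunk_no][1] += 1
                entry["ref_bit"] = 1
            # Map the logical address to the chunk number (cf. S9 / S16).
            self.meta_table[addr] = chunk_no
```

Writing three chunks where the first and third are identical stores only two chunks physically; both logical addresses of the repeated content map to the same chunk number, whose duplication count becomes 2.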

(Operation during read processing)
Next, the read-processing operation executed when the disk array control device 210 of the storage apparatus 200 receives a data read request from the host 100 in this embodiment is described with reference to FIG. 8.

The access controller 304 reads data from the disk array 230 based on the starting logical address and the data length (number of sectors) of the data specified in the data read request received from the host 100, and stores the data in a read buffer provided in the work area 216c of the local memory 216 (S20). When reading this data, the physical address at which each chunk constituting the read data is recorded can be obtained as follows. The logical address of each chunk constituting the read data is identified by successively adding the number of sectors constituting a chunk (8 sectors = 4 KB) to the starting logical address. Referring to the meta table 230a with an identified logical address yields the chunk number of the chunk recorded at that logical address. Referring to the chunk table 230b with that chunk number then yields the physical address at which the chunk is recorded.
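The two-stage lookup described above (logical address to chunk number, then chunk number to physical address) can be sketched as follows. The function name, the dictionary layout of the two tables, and the assumption that the request is chunk-aligned and a whole number of chunks long are illustrative choices, not details taken from the embodiment.

```python
SECTORS_PER_CHUNK = 8  # 8 sectors = 4 KB per chunk, as stated in the description


def resolve_physical_addresses(start_lba, sector_count, meta_table, chunk_table):
    """Return the physical address of every chunk covered by a read request.

    meta_table:  logical address (in sectors) -> chunk number
    chunk_table: chunk number -> {"physical_addr": ..., "dup_count": ...}
    Assumes start_lba is chunk-aligned and sector_count is a multiple of 8.
    """
    physical = []
    # Step through the request one chunk (8 sectors) at a time.
    for lba in range(start_lba, start_lba + sector_count, SECTORS_PER_CHUNK):
        chunk_no = meta_table[lba]                               # meta table lookup
        physical.append(chunk_table[chunk_no]["physical_addr"])  # chunk table lookup
    return physical
```

Because deduplicated chunks share a chunk number, two different logical addresses may resolve to the same physical address.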

Next, the access controller 304 reads the data stored in the read buffer of the local memory 216 and transfers it to the host 100 (S21).

After the data is transferred, the dividing unit 300 of the disk array control device 210 divides the transferred data into a plurality of chunks (S22). The duplication management unit 301 sets the number of chunks obtained to the variable N (S23).

Subsequently, the duplication management unit 301 sets the variable n, which is used to designate one of the N chunks obtained in step S23, to an initial value of 1 (S24). Next, the hash generation unit 303 calculates the hash value Hn(C_n) of the chunk C_n (S25). Since the initial value is n = 1, the hash value H1(C_1) of the first chunk C_1 is calculated.

Subsequently, the duplication management unit 301 executes a hash value search process that searches the hash table 213a for a hash value matching the hash value Hn(C_n) of the chunk C_n calculated in step S25 (S26). Based on the result of this search, it is determined whether a hash value matching the hash value Hn(C_n) of the selected chunk C_n exists in the hash table 213a (S27).

Based on the result of this determination by the duplication determination unit 302, the duplication management unit 301 proceeds to step S28 or step S29.

If no hash value matching the hash value Hn(C_n) of the chunk C_n exists (No in S27), the duplication management unit 301 determines that the entry for the chunk number CNC_n corresponding to the chunk C_n is not held in the hash table 213a. The duplication management unit 301 then registers the hash value Hn(C_n) of the chunk C_n in the hash table 213a in association with the chunk number CNC_n (S28). At this time, if the hash table 213a has few free entries, the entry at the bottom of the hash table 213a (that is, the entry with the smallest value stored in its hit counter) is deleted and the entry for the chunk C_n is registered (details are described later). The reference bit of the corresponding entry is also set to "1".

If a hash value matching the hash value Hn(C_n) of the chunk C_n exists and the duplication determination unit 302 determines Yes in step S27, the duplication management unit 301 updates the entry of the hash table 213a whose hash value is Hn(C_n) (S29). Specifically, the duplication management unit 301 sets the reference bit of that entry to "1".

The duplication management unit 301 determines whether the variable n exceeds the number of chunks N set in step S23 (S30). If the variable n does not exceed the number of chunks N (No in S30), the duplication management unit 301 increments the variable n by 1 (S31) in order to process the next chunk included in the data specified by the host 100, and returns to step S25. If, on the other hand, the variable n exceeds the number of chunks N (Yes in S30), the duplication management unit 301 determines that all chunks have been processed.
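The read-time registration loop (S22 to S31) can be sketched as below. This is an illustration under stated assumptions: the function name `register_on_read` is hypothetical, SHA-256 stands in for the unspecified hash function, the chunk numbers are assumed to be those already obtained from the meta table while reading, and a new entry's hit counter is assumed to start at 0. Eviction from a full table is left out.

```python
import hashlib

CHUNK_SIZE = 4096  # 4 KB per chunk


def register_on_read(data, chunk_numbers, hash_table):
    """After a read, ensure every chunk of the read data has a hash-table entry.

    data:          bytes that were transferred to the host
    chunk_numbers: chunk number of each chunk, in order (assumed known from the read)
    hash_table:    hash value -> {"chunk_no", "ref_bit", "hit_counter"}
    """
    # cf. S22: divide the transferred data into chunks.
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    for chunk, chunk_no in zip(chunks, chunk_numbers):
        digest = hashlib.sha256(chunk).hexdigest()  # cf. S25 (SHA-256 is an assumption)
        entry = hash_table.get(digest)              # cf. S26, S27
        if entry is None:
            # cf. S28: register hash value and chunk number, reference bit = 1.
            hash_table[digest] = {"chunk_no": chunk_no, "ref_bit": 1, "hit_counter": 0}
        else:
            # cf. S29: the entry already exists, so only set its reference bit.
            entry["ref_bit"] = 1
```

A later write of the same content then finds its hash in the table and can be deduplicated, even if the entry had been evicted before the read.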

Thus, in this embodiment, new hash table entries are registered not only during write processing but also upon completion of read processing. As a result, deduplication can be achieved more efficiently even when entries for all chunks written to the RAID disk array 230 cannot be registered because the storage capacity of the hash table is fixed.

That is, when a certain chunk is written, a case is conceivable in which a chunk with the same content has already been written to the RAID disk array 230, yet the entry corresponding to that already-written chunk is not registered in the hash table because the hash table has a fixed length. In this case, deduplication cannot be performed because no entry is registered in the hash table. Registering entries in the hash table during read processing as well, however, makes deduplication possible. For example, the host 100 may read data from the RAID disk array 230 in order to copy the data under a different data name, or to update the read data and write it back. In such cases, registering entries in the hash table during read processing avoids the situation in which deduplication fails because no entry is registered in the hash table.

(Replacement of hash entries)
Because the storage capacity of the hash table 213a is fixed, the following describes what is done when no free entries remain. When no free entries remain, the entry with the smallest hit counter value is selected, the chunk number, hash value, reference bit, and hit counter recorded in that entry are deleted, and the entry is replaced by recording the chunk number, hash value, reference bit, and hit counter of the newly registered chunk. To make this replacement proceed smoothly, the entries recorded in the hash table 213a are sorted periodically in advance based on their hit counter values. This periodic sorting is called aging cycle processing, and its details are described with reference to FIG. 9. The cycle processing is performed at an arbitrary time interval using, for example, a timer function provided by the OS.

First, the duplication management unit 301 selects the first entry of the hash table 213a (S32). In this embodiment, for example, selection starts from the topmost entry. Subsequently, the duplication management unit 301 shifts the hit counter right by 1 bit (divides it by 2) (S33). The hit counter in this embodiment is assumed to consist of 8 bits; suppose, for example, that "10000000 (hexadecimal: 80)" is stored in the hit counter of the entry. Shifting right by 1 bit changes the hit counter value to "01000000 (hexadecimal: 40)".

Next, the duplication management unit 301 determines whether the reference bit of the entry is set to "1" (S34). For example, consider the case where the hit counter value of the entry is "01000000 (hexadecimal: 40)" after step S33. If the reference bit is "1", the most significant bit is additionally set (S35), and the hit counter value becomes "11000000 (hexadecimal: C0)". If the reference bit is "0", the hit counter value remains "01000000 (hexadecimal: 40)" and the process proceeds to S36. In this way, performing step S34 or step S35 after step S33 increases the hit counter value of a chunk that was accessed (referenced) recently, while the hit counter value of a chunk that was not accessed (referenced) recently decreases. By changing the hit counter value in this way, even a chunk with a small number of accesses receives a high priority if it was accessed recently, because its hit counter is set to a large value.

Subsequently, the duplication management unit 301 determines whether all entries have been processed (S36). If it determines that not all entries have been processed (No in S36), it selects the next entry (S37) and returns to step S33. If it determines that all entries have been processed (Yes in S36), it sorts all entries, moving each to the position that puts the hit counter values in ascending order (S38), and the aging cycle processing ends.
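One pass of the aging cycle (S32 to S38) can be sketched as follows. The function name and the list-of-dictionaries representation are assumptions for illustration. Two points are interpretations rather than statements from the text: the sort places the smallest counter last so that it becomes the "bottom" entry used for replacement, and the reference bit is left untouched because this section does not say whether it is cleared after a pass.

```python
def aging_cycle(entries):
    """Apply one aging pass to a list of hash-table entries, in place.

    Each entry is a dict with an 8-bit "hit_counter" and a "ref_bit" (0 or 1).
    """
    for e in entries:                  # cf. S32, S36, S37: visit every entry
        e["hit_counter"] >>= 1         # cf. S33: shift right by 1 bit (divide by 2)
        if e["ref_bit"] == 1:          # cf. S34: was the entry referenced?
            e["hit_counter"] |= 0x80   # cf. S35: set the most significant bit
    # cf. S38: order by hit counter; the smallest value ends up at the bottom
    # (end of the list), which is an interpretation of the sort direction.
    entries.sort(key=lambda e: e["hit_counter"], reverse=True)
    return entries
```

With a starting counter of 0x80, an entry whose reference bit is 1 ends the pass at 0xC0, while an entry whose reference bit is 0 ends at 0x40, matching the worked values in the description.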

This aging cycle processing assigns priorities to the entries in the hash table 213a. In accordance with these priorities, when no free entry space remains, the bottom entry of the hash table 213a (that is, the entry with the smallest hit counter value) is selected, the chunk number, hash value, reference bit, and hit counter recorded in that entry are deleted, and the entry is replaced by recording the chunk number, hash value, reference bit, and hit counter of the newly registered chunk.
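The replacement rule can be sketched as a small helper. The name `register_entry`, the explicit `capacity` parameter, and the list representation are hypothetical. The sketch searches for the minimum hit counter directly instead of relying on a prior sort, which selects the same victim as taking the bottom entry of a sorted table.

```python
def register_entry(entries, new_entry, capacity):
    """Add new_entry to a fixed-capacity table, evicting the coldest entry if full.

    entries:   list of dicts, each with a "hit_counter" field
    new_entry: dict to register (chunk number, hash value, reference bit, hit counter)
    capacity:  maximum number of entries the table may hold
    """
    if len(entries) >= capacity:
        # Table is full: overwrite the entry with the smallest hit counter.
        victim = min(range(len(entries)), key=lambda i: entries[i]["hit_counter"])
        entries[victim] = new_entry
    else:
        # Free space remains: simply append.
        entries.append(new_entry)
    return entries
```

Keeping the table sorted by the aging cycle, as the description does, lets the victim be taken from the bottom without scanning; the linear scan here trades that speed for a shorter example.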

Because the hash table 213a exists only in the cache memory 213, it would be lost when the apparatus stops. At shutdown, therefore, it is saved to a non-volatile medium such as an HDD or SSD, and it is restored at startup.

As described above, this embodiment holds only a fixed number of hash entries in the cache memory 213 and manages the hash entries with a hit counter weighted by the number of accesses and the access interval. Hash entries are also registered during read processing. This makes it possible to realize a deduplication function that is fast and achieves a high deduplication rate.

Although embodiments of the present invention have been described above, these embodiments are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and are included in the invention described in the claims and its equivalents.

DESCRIPTION OF SYMBOLS
100 ... Host
110 ... Network
200 ... Storage apparatus
210 ... Disk array control device
211 ... Host I/F
212 ... Disk I/F
213 ... Cache memory
213a ... Hash table
214 ... Cache controller
215 ... FROM
216 ... Local memory
216a ... Control program area
216b ... Table area
216c ... Work area
217 ... CPU
218 ... Chipset
220 ... Disk interface bus
230 ... RAID disk array
230a ... Meta table
230b ... Chunk table
300 ... Dividing unit
301 ... Duplication management unit
302 ... Duplication determination unit
303 ... Hash generation unit
304 ... Access controller

Claims

A storage device having a recording medium for recording data, the storage device comprising:
a chunk division unit that divides data designated by a data write request from a host computer into a plurality of chunks;
a hash generation unit that generates a hash value of each of the plurality of chunks based on the data of each of the plurality of chunks;
a volatile memory that holds a hash table in which an identifier of a chunk written to the recording medium and the hash value of the chunk are stored in association with each other; and
a control unit that, when writing a first chunk to the recording medium, searches whether a hash value matching the hash value generated by the hash generation unit based on the data of the first chunk is already recorded in the hash table, writes the first chunk to the recording medium if the matching hash value is not recorded in the hash table, and suppresses writing of the first chunk to the recording medium if the matching hash value is recorded in the hash table.

The storage apparatus according to claim 1, wherein, when reading data designated by a read request from the host computer from the recording medium, the control unit searches, for each of a plurality of chunks constituting the data, whether a hash value matching the hash value generated based on the chunk is already registered in the hash table, and registers the identifier and the hash value of the chunk in the hash table if the matching hash value is not recorded in the hash table.