JP4968218B2

JP4968218B2 - Disk storage device, data backup system, data relocation method, and program

Info

Publication number: JP4968218B2
Application number: JP2008232282A
Authority: JP
Inventors: 玄金原
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2008-09-10
Filing date: 2008-09-10
Publication date: 2012-07-04
Anticipated expiration: 2028-09-10
Also published as: JP2010066979A

Description

The present invention relates to a disk storage device, a data backup system, a data relocation method, and a program, and more particularly to a disk storage device, a data backup system, a data relocation method, and a program that store replication data obtained by replicating backup target data.

In recent years, as a data backup technique for business storage, backups have been taken to a backup medium such as tape from a replica created on a physical disk separate from the one holding the business data, in order to shorten business downtime during backup.

As physical disks have grown in capacity, it has also become common to create virtual physical disks called logical LUNs (Logical Unit Numbers) from a single physical disk, set up partitions on them, and have servers reference them. For example, when a system is updated, data can be migrated simply by matching the logical LUN sizes, without regard to differences in the actual physical disk capacity.

On the other hand, to reduce cost, replica physical disks tend to be installed as a small number of larger-capacity disks. In addition, RAID 5 (parity), which is cheaper than RAID 1 (mirroring), tends to be adopted as the protection scheme, so the number of logical LUNs carved out of the physical disk becomes several times the number of original (replication source) physical disks.

Patent Document 1 proposes an information processing system that, when writing data to a disk such as a magnetic disk or an optical disk, writes the data to the free sector closest to the most recently accessed sector, thereby shortening seek time.

Patent Document 2 discloses a storage device that stores the same data redundantly on a plurality of rotating storage devices. The data placement on the rotating storage devices (inner side, outer side, and so on) is determined in advance according to the form of access from the host device and the intended use of the data, and when an access request arrives, the control device performs the access in accordance with that data placement.

Patent Document 1: Japanese Patent Laid-Open No. 10-283120 (JP-A-10-283120)
Patent Document 2: Japanese Patent Laid-Open No. 9-134258 (JP-A-9-134258)

FIG. 14 shows the configuration of a disk storage device 100 in a business system that has a business server A (11) and a business server B (12) used for a different business from that of business server A. Dashed lines 21, 22, and 23 each represent a physical disk. Physical disks 21 and 22 are configured with LUNs for business server A (11) and other LUNs, and physical disk 23 is configured with a LUN for business server B (12) and other LUNs. The dash-dot line 24 indicates that the LUNs for business server A (11) on physical disks 21 and 22 are striped (RAID 0) for higher speed.

Here, physical disk 31 is the replica physical disk and is assumed to be just large enough to store the data of the three physical disks 21, 22, and 23. The backup server 40 creates backups from physical disk 31 to the tape device 41 according to a preset backup schedule.

As shown in FIG. 14, when the number of logical LUNs on physical disks 21, 22, and 23 grows, the replication destination physical disk 31 ends up holding a plurality of logical LUNs. In particular, when a plurality of logical LUNs are accelerated with the stripe (RAID 0) function of a volume manager, as indicated by the dash-dot line 24, simultaneous access to those logical LUNs greatly reduces throughput.

FIG. 15 shows the block layout on the replication destination physical disk 31; the values in the figure indicate the access time of each block. When replication is performed per physical disk, as in FIG. 15, it is difficult to create a backup schedule in which accesses do not span multiple logical LUNs. For example, when the logical LUNs for business server A are backed up, accesses to physical disk 31 are scattered over a wide range and seek time becomes long.

The present invention has been made in view of the circumstances described above, and its object is to provide a disk storage device, a data backup system, a data relocation method, and a program that can suppress the drop in throughput when a backup is created from the replica physical disk.

According to a first aspect of the present invention, there is provided a disk storage device comprising: a disk medium that stores replication data obtained by replicating data at a replication source; means for observing an access pattern to the disk medium and determining a data placement order on the disk medium according to the data access order in the observed access pattern; and means for relocating the replication data according to the data placement order.

According to a second aspect of the present invention, there is provided a data relocation method comprising: observing an access pattern to a disk medium that stores replication data obtained by replicating data at a replication source; determining a data placement order on the disk medium according to the data access order in the observed access pattern; and relocating the replication data according to the data placement order.

According to a third aspect of the present invention, there is provided a program that causes a computer to execute: a process of observing an access pattern to a disk medium that stores replication data obtained by replicating data at a replication source and determining a data placement order on the disk medium according to the data access order in the observed access pattern; and a process of relocating the replication data according to the data placement order.

According to the present invention, the cache hit rate rises and the throughput of reading data from the disk medium improves. As a result, the time required for backup can be reduced. This is because the invention focuses on the fact that backups are performed according to a predetermined backup schedule, and localizes data based on the actual access order so that little seek time is needed when accesses similar to the observed ones occur again.

[Summary of the Invention]
First, an outline of the present invention is described. FIG. 1 schematically shows the data relocation processing performed by the disk storage device according to the present invention. The left-hand diagram of FIG. 1 shows the block layout on physical disk 31 before relocation; the values in the diagram indicate the access time of each block in 30-minute units (identical to FIG. 15).

Accordingly, the disk storage device according to the present invention observes accesses to physical disk 31 and relocates the data on the physical disk so that seek time is shortened when the observed access pattern recurs. For example, the blocks accessed in the time slot from 21:00 to 21:29, labeled "21:00" in FIG. 15, are relocated so that they are localized, as shown in the right-hand diagram of FIG. 15. After relocation, the data is accessed through a pointer table or the like from which the post-relocation data address can be obtained from the pre-relocation data address. This shortens the seek time of subsequent accesses and can also improve the cache hit rate.

[First Embodiment]
Next, a first embodiment of the present invention is described in detail with reference to the drawings. FIG. 2 is a block diagram showing the configuration of one embodiment of the present invention. FIG. 2 shows a business server group 11/12, a backup server 40, and a disk storage device 100 connected to them.

The disk storage device 100 includes a business magnetic disk group 21/22/23, a replica magnetic disk group 31A, a magnetic disk control mechanism 60, and a data relocation unit 61.

The business magnetic disk group 21/22/23 is a group of disks on which logical LUNs for the business server group 11/12 are set, and corresponds to physical disks 21 to 23 in FIG. 14.

The replica magnetic disk group 31A is the group of disks that serves as the replication destination of the business magnetic disk group 21/22/23, and corresponds to physical disk 31 in FIG. 14.

The magnetic disk control mechanism 60 serves access requests from the business server group 11/12 and the backup server 40 to the business magnetic disk group 21/22/23 and the replica magnetic disk group 31A. When the replica magnetic disk group 31A is accessed, a pointer table equivalent to those used for snapshots and the like is consulted.

The data relocation unit 61 observes accesses from the backup server 40 to the magnetic disk control mechanism 60 (access information) and, based on the result, creates and updates a pointer table that indicates the data placement order on the replica magnetic disk group 31A. The data relocation unit 61 also instructs the magnetic disk control mechanism 60 to relocate data based on the created or updated pointer table. In this embodiment, data is relocated by performing a full replication, based on the pointer table, from the business magnetic disk group 21/22/23 to the physical disks in the replica magnetic disk group 31A.

FIG. 3 is a block diagram showing an example of a specific configuration of the data relocation unit 61.

The total block count storage unit 61A stores the number of blocks of the target physical disk in the replica magnetic disk group 31A, which is the replication destination.

The division count storage unit 61B stores the division count m of the physical disk; each division is a unit of the relocation processing described later. The division count is set to a value smaller than the number of blocks stored in the total block count storage unit 61A, because using a relocation unit that groups a certain number of blocks makes the sorting and relocation described later more efficient. It is of course also possible to set the division count m equal to the number of blocks of the physical disk.

The blocks-per-region storage unit 61C stores the number of blocks contained in each region obtained by dividing the physical disk by the division count. The number of blocks per region is obtained by dividing the total block count stored in the total block count storage unit 61A by the division count stored in the division count storage unit 61B.

The setting of the total block count and the division count may be omitted, and the user may instead set the number of blocks per region directly.

As shown in FIG. 10, the first memory table 61D is a table of m rows and n columns that records the number of accesses to each region in each time slot. Here m is the division count stored in the division count storage unit 61B, and n is determined by the observation period and the interval at which access times are recorded. For example, when the observation period is 24 hours and access counts are recorded in 15-minute units, n = 24 [h] × 60 [min] ÷ 15 [min] = 96, and the period is divided into time slots t1 to t96.
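The sizing of the first memory table can be illustrated with a minimal Python sketch. The function names `slots_per_period` and `new_access_table` are hypothetical and do not appear in the patent, and the requirement that the interval divide the period evenly is an assumption added here for simplicity.

```python
def slots_per_period(period_hours: int, interval_minutes: int) -> int:
    """Number of time slots n for an observation period and a recording interval."""
    total_minutes = period_hours * 60
    # Assumption: the interval divides the observation period evenly.
    if total_minutes % interval_minutes != 0:
        raise ValueError("interval must divide the observation period evenly")
    return total_minutes // interval_minutes


def new_access_table(m: int, n: int) -> list[list[int]]:
    """m rows (regions) by n columns (time slots), all access counts zero."""
    return [[0] * n for _ in range(m)]


n = slots_per_period(24, 15)   # 24 h observed in 15-minute units
table = new_access_table(8, n)  # 8 regions is an arbitrary illustrative value
```

With a 24-hour period and 15-minute units this yields the n = 96 of the worked example in the text.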

As shown in FIG. 11, the second memory table 61E is a table of m rows and 2 columns that records, for each region, the time slot with the most accesses, as determined from the first memory table 61D.

The first block number offset storage unit 61G, the second block number offset storage unit 61H, the same-time-slot region count storage unit 61I, the in-process time slot storage unit 61J, and the block counter 61K hold the variables used when the pointer table is created.

The access monitoring means 62 monitors accesses to the physical disk and records them in the first memory table 61D.

The access time slot extraction means 63 refers to the first memory table 61D and records, in the second memory table 61E, pairs consisting of the region number of each region and the time slot in which that region was accessed most.

The pointer table creation means 64 creates the data placement order (pointer table) from the second memory table 61E, using the processing variables held in the first block number offset storage unit 61G, the second block number offset storage unit 61H, the same-time-slot region count storage unit 61I, the in-process time slot storage unit 61J, and the block counter 61K (see FIG. 13).

The data relocation unit 61 and its processing means described above can be built in hardware, but they can also be implemented as a program executed by a computer installed in the disk storage device 100.

Next, the flow of the data relocation processing of this embodiment is described in detail with reference to the drawings.

[Advance Preparation]
First, as advance preparation, initial values are set and initialization is performed as follows. FIG. 4 is an example of a flowchart of the processing performed in advance preparation.

First, the number of blocks of the physical disk is set in the total block count storage unit 61A (step J1).

Next, the division count of the physical disk is set in the division count storage unit 61B (step J2).

Next, the number of blocks contained in one region is calculated and set in the blocks-per-region storage unit 61C (step J3).

Next, initialization is performed. Specifically, the first memory table 61D, the first block number offset storage unit 61G, and the same-time-slot region count storage unit 61I are set to 0, and the in-process time slot storage unit 61J is set to -1 (step J4).

[Data Relocation Processing: Sampling]
FIG. 5 is a flowchart showing the flow of the data relocation processing. First, the data relocation unit 61 executes a sampling process that records the number of accesses to individual blocks on the physical disk together with their time slots (step A1). The sampling process runs for a fixed sampling period (for example, 12 hours or 24 hours).

FIG. 6 is a flowchart showing the flow of the sampling process. First, when a READ access to the target physical disk occurs, the data relocation unit 61 detects the block number of the access destination and the time (step B1).

Next, the access monitoring means 62 in the data relocation unit 61 adds 1 to the cell of the first memory table that corresponds to the region number to which the detected block belongs and to the detected time (step B2).

The above processing continues until the sampling period has elapsed (No in step B3).

FIG. 10 shows the first memory table in which the accesses in each time slot have been recorded by the sampling. In FIG. 10, t1 to tn denote the time slots defined by the predetermined sampling interval. For example, the table shows that region number 3 received 8 accesses in time slot t3.
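Steps B1 and B2 can be sketched in Python as follows. This is only an illustration under stated assumptions: region numbers and time slots are 0-based list indices (the patent labels slots t1 to tn), the observation period is one calendar day starting at midnight, and the helper names `region_of`, `slot_of`, and `record_read` are hypothetical.

```python
from datetime import datetime


def region_of(block_no: int, blocks_per_region: int) -> int:
    """Region (row index) that a block belongs to; 0-based by assumption."""
    return block_no // blocks_per_region


def slot_of(ts: datetime, interval_minutes: int) -> int:
    """Time slot (column index) within a day that starts at midnight (assumption)."""
    return (ts.hour * 60 + ts.minute) // interval_minutes


def record_read(table, block_no, ts, blocks_per_region, interval_minutes):
    """Step B2: add 1 to the cell for the block's region and the access time slot."""
    row = region_of(block_no, blocks_per_region)
    col = slot_of(ts, interval_minutes)
    table[row][col] += 1


# 4 regions of 100 blocks each, 96 slots of 15 minutes (illustrative values).
table = [[0] * 96 for _ in range(4)]
record_read(table, 250, datetime(2008, 9, 10, 21, 10), 100, 15)
```

In this run, block 250 falls in region index 2 and 21:10 falls in slot index 84, so only that one cell is incremented.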

[Data Relocation Processing: Aggregating Access Time Slots]
Next, the access time slot extraction means 63 in the data relocation unit 61 is started. It refers to the first memory table 61D, finds the time slot with the most accesses for each region, and sets it as the access time slot of that region (step A2 in FIG. 5).

FIG. 7 is a flowchart showing the flow of the access time slot extraction. First, the access time slot extraction means 63 reads one row from the first memory table 61D and extracts the time slot with the largest number of accesses (step C1).

Next, the access time slot extraction means 63 stores the pair consisting of the region number of the row read and the extracted access time slot in the second memory table 61E (step C2). FIG. 11 shows the result of extracting, row by row, the time slot with the largest number of accesses from the first memory table 61D of FIG. 10.

The above processing continues until the access time slot has been extracted for every row of the first memory table 61D (No in step C3).
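The loop of steps C1 to C3 amounts to a per-row argmax, which might be sketched as below. The function name `peak_slots` is hypothetical. The patent does not say how ties or regions with no accesses are handled; this sketch assumes the earliest slot wins in both cases.

```python
def peak_slots(first_table: list[list[int]]) -> list[tuple[int, int]]:
    """Build the second memory table: one (region, busiest time slot) pair per row."""
    second_table = []
    for region, counts in enumerate(first_table):
        # max() returns the first maximal element, so ties resolve to the
        # earliest slot (an assumption; the source does not specify this).
        peak = max(range(len(counts)), key=counts.__getitem__)
        second_table.append((region, peak))
    return second_table


second_table = peak_slots([
    [0, 3, 1],  # region 0: busiest in slot 1
    [5, 0, 5],  # region 1: tie between slots 0 and 2, earliest chosen
    [0, 0, 0],  # region 2: never accessed, falls back to slot 0
])
```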

When the access time slot has been extracted for every row of the first memory table 61D (Yes in step C3), the pointer table creation means 64 is started.

The pointer table creation means 64 sorts the rows of the second memory table 61E in order of access time slot (step A3). FIG. 12 shows the second memory table 61E of FIG. 11 after it has been sorted by access time slot in ascending order.
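Step A3 can be shown as a one-line stable sort in Python. Keeping the original region order among rows with the same time slot is an assumption of this sketch (the source only says the rows are sorted in ascending order of time slot), and the name `sort_by_slot` is hypothetical.

```python
def sort_by_slot(second_table: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Sort (region, time slot) rows by time slot, ascending.

    sorted() is stable, so rows that share a time slot keep their
    original relative order (by region number here).
    """
    return sorted(second_table, key=lambda row: row[1])


sorted_rows = sort_by_slot([(0, 5), (1, 2), (2, 5), (3, 2)])
```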

The pointer table creation means 64 then starts creating the data placement order (pointer table) from the sorted second memory table 61E (step A4).

FIG. 8 is a flowchart showing the flow of creating the data placement order (pointer table). First, the pointer table creation means 64 reads one row from the second memory table 61E (step D1) and compares the access time slot of that row with the time slot stored in the in-process time slot storage unit 61J (step D2).

If the access time slot of the row read has changed (Yes in step D2), the pointer table creation means 64 updates the value of the in-process time slot storage unit 61J (step D3), counts the regions (rows) in the second memory table 61E that have the same access time slot, and stores the count in the same-time-slot region count storage unit 61I (step D4).

Next, the pointer table creation means 64 sets the value of the second block number offset storage unit 61H to 0 and sets the value of the same-time-slot region count storage unit 61I in the first block number offset storage unit 61G (step D5). The value of the second block number offset storage unit 61H is used when block numbers are written out, as described later.

If the access time slot of the row read has not changed (No in step D2), the pointer table creation means 64 adds 1 to the value of the second block number offset storage unit 61H (step D6).

When these preparations for writing block numbers are complete, the pointer table creation means 64 initializes the block counter 61K to 0 and writes the row read from the second memory table 61E into the data placement order (pointer table) (step D7).

FIG. 9 is a flowchart showing the flow of writing a row read from the second memory table 61E into the data placement order (pointer table). First, using the region number of the row read from the second memory table 61E, the pointer table creation means 64 records a block number in the corresponding row of the data placement order (pointer table) 61F as follows.
Row number: (region number of the row read) × (number of blocks per region) + (block counter value)
Recorded content (block number): (block counter value) × (number of regions in the same time slot) + (first block number offset) + (second block number offset)

The number of blocks per region is stored in the blocks-per-region storage unit 61C. The block counter value is stored in the block counter 61K. The number of regions in the same time slot is stored in the same-time-slot region count storage unit 61I. The first and second block number offsets are stored in the first block number offset storage unit 61G and the second block number offset storage unit 61H, respectively.
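The row number and block number formulas can be tried out with the following Python sketch of the pointer table construction. It is not the patent's exact flowchart. In particular, it assumes that the first block number offset is the cumulative number of blocks already assigned to earlier time-slot groups, which is an interpretation chosen here so that the resulting table is a valid permutation of block numbers. Region numbers are assumed to be 0-based, and the name `build_pointer_table` is hypothetical.

```python
def build_pointer_table(sorted_rows, blocks_per_region):
    """Map each original block number (list index) to its relocated block number.

    sorted_rows: (region, time slot) pairs already sorted by time slot.
    """
    pointer = [None] * (len(sorted_rows) * blocks_per_region)
    first_offset = 0  # ASSUMPTION: blocks already placed by earlier groups
    i = 0
    while i < len(sorted_rows):
        # Collect the run of rows that share one access time slot.
        slot = sorted_rows[i][1]
        j = i
        while j < len(sorted_rows) and sorted_rows[j][1] == slot:
            j += 1
        group = sorted_rows[i:j]
        same_slot_regions = len(group)
        # second_offset is 0 for the first region of the group, then 1, 2, ...
        for second_offset, (region, _slot) in enumerate(group):
            for counter in range(blocks_per_region):
                row = region * blocks_per_region + counter
                pointer[row] = (counter * same_slot_regions
                                + first_offset + second_offset)
        first_offset += same_slot_regions * blocks_per_region
        i = j
    return pointer


# Regions 1 and 3 share slot 2; regions 0 and 2 have slots of their own.
pointer = build_pointer_table([(1, 2), (3, 2), (0, 5), (2, 7)], blocks_per_region=2)
```

Under these assumptions, the blocks of regions 1 and 3 are interleaved at the start of the disk (relocated blocks 0 to 3), followed by region 0 and then region 2, so regions that are read in the same time slot end up physically adjacent.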

The pointer table creation means 64 then adds 1 to the value of the block counter 61K (step E2). The above processing continues until the value of the block counter 61K reaches the number of blocks per region (step E3).

When the blocks of the row read from the second memory table 61E have been written out, the pointer table creation means 64 checks whether all rows (all regions) of the second memory table 61E have been written out (step D8).

If not all rows (all regions) of the second memory table 61E have been written out (No in step D8), the pointer table creation means 64 returns to step D1 and continues with the next row of the second memory table 61E.

As a result, the block numbers contained in each row are written out in the order of the sorted second memory table 61E. FIG. 13 is an example of the data placement information (pointer table) created from the second memory table 61E of FIG. 12.

Finally, when the data placement order (pointer table) has been created or updated as described above, the data relocation unit 61 copies the data of all blocks from the business data physical disks to the replica physical disk (step A5). For example, relocating the blocks of physical disk 31 as shown in FIG. 13 minimizes seek time, so that the next time similar accesses occur they can be served at high speed.
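The effect of step A5 and of later reads through the pointer table can be sketched in Python with in-memory lists standing in for disks. The function names `relocate` and `read_block` and the sample pointer table are illustrative assumptions, not taken from the patent.

```python
def relocate(source_blocks, pointer):
    """Full copy: block `old` of the source is written to replica block pointer[old]."""
    replica = [None] * len(source_blocks)
    for old, new in enumerate(pointer):
        replica[new] = source_blocks[old]
    return replica


def read_block(replica, pointer, old):
    """Serve a read addressed by the pre-relocation block number."""
    return replica[pointer[old]]


source = list("abcdefgh")            # 8 blocks of illustrative data
pointer = [4, 5, 0, 2, 6, 7, 1, 3]   # hypothetical pointer table (a permutation)
replica = relocate(source, pointer)
```

A reader such as the backup server keeps using the original block numbers, and the pointer table redirects each read to the relocated position, so the relocation stays transparent to it.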

In particular, this embodiment observes the accesses to the physical disk over a predetermined sampling period, extracts from the result the time slot with the most accesses as the access time slot of each region, and builds the data relocation plan from it, which makes highly effective data relocation possible.

In this embodiment, data is relocated by copying the data of all blocks from the business data physical disks to the replica physical disk. However, if the disk storage device 100 has a separate working memory area, the data may instead be relocated using that area.

A preferred embodiment of the present invention has been described above, but the present invention is not limited to that embodiment, and further modifications, substitutions, and adjustments can be made without departing from the basic technical idea of the invention. For example, the configuration of the data relocation unit and the flowcharts described above are merely examples, and any configuration or processing that can reorder and localize data according to the actual access pattern may be adopted.

For example, in the embodiment described above, the access pattern is obtained by the access monitoring means 62, but the data access order may instead be derived from a backup schedule obtained from the backup server 40 or the like. Furthermore, if data is relocated at the time the backup schedule is updated, the drop in throughput immediately after a backup schedule update can be suppressed.

If the disk storage device 100 of the embodiment described above is used together with the backup server 40 shown in FIG. 2, FIG. 14, and elsewhere, a data backup system capable of more efficient backup is provided.

FIG. 1 is a diagram schematically showing data relocation processing by the disk storage device according to the present invention.
FIG. 2 is a block diagram showing the configuration of the disk storage device according to the first embodiment of the present invention.
FIG. 3 is a block diagram showing an example of a specific configuration of the data relocation unit in FIG. 2.
FIG. 4 is an example flowchart of the processing performed during advance preparation.
FIG. 5 is a flowchart showing the overall flow of operation of the first embodiment of the present invention.
FIG. 6 is an example flowchart of the processing performed in the sampling process of FIG. 5.
FIG. 7 is an example flowchart of the processing performed in the access time zone extraction of FIG. 5.
FIG. 8 is an example flowchart of the processing performed in creating the data arrangement order (pointer table) of FIG. 5.
FIG. 9 is an example flowchart of the processing performed when recording the n block numbers within an area in FIG. 8.
FIG. 10 is an example of the first memory table.
FIG. 11 is an example of the second memory table created from the first memory table.
FIG. 12 is a diagram showing the second memory table of FIG. 11 after sorting.
FIG. 13 is a diagram showing an example of the pointer table.
FIG. 14 is a diagram showing the configuration of a disk storage device equipped with physical disks for replicas.
FIG. 15 is a diagram for explaining how seek time increases.

Explanation of symbols

11 Business server A
12 Business server B
21, 22, 23 Physical disk (replication source)
24 RAID 0 (stripe) configuration
31 Physical disk (for replica)
31A Magnetic disk group (for replica)
40 Backup server
41 Tape device
60 Magnetic disk control mechanism
61 Data relocation unit
61A Total block count storage unit
61B Division count storage unit
61C Blocks-per-area storage unit
61D First memory table
61E Second memory table
61F Data arrangement order (pointer table)
61G First block number offset storage unit
61H Second block number offset storage unit
61I Storage unit for the number of areas in the same time zone
61J Storage unit for the time zone being processed
61K Block counter
62 Access monitoring means
63 Access time zone extraction means
64 Pointer table creation means
100 Disk storage device

Claims

1. A disk storage device comprising:
a disk medium that stores replication data, which is a copy of original data;
means for observing an access pattern to the disk medium and determining a data arrangement order in the disk medium according to the data access order in the observed access pattern; and
means for rearranging the replication data according to the data arrangement order,
wherein the means for determining the data arrangement order in the disk medium:
divides the storage area of the disk medium into areas of a predetermined size and records, for each divided area, the number of accesses in each predetermined time segment;
extracts, for each divided area, the time segment with the largest number of accesses as the access time zone of that area, and creates a table in which the divided areas are reordered by access time zone; and
refers to the table and creates a pointer table that assigns new block numbers in order to the blocks belonging to areas with the same access time zone and indicates the correspondence between block numbers before and after the rearrangement.

2. The disk storage device according to claim 1, wherein, when an access request to the disk medium is received, the access is accepted with reference to the data arrangement order.

3. A data backup system comprising:
the disk storage device according to claim 1 or 2; and
a backup server that reads data from the disk storage device and backs it up to a predetermined backup medium.

4. A data rearrangement method comprising:
observing an access pattern by dividing the storage area of a disk medium, which stores replication data obtained by copying replication source data, into areas of a predetermined size and recording, for each divided area, the number of accesses in each predetermined time segment;
determining a data arrangement order in the disk medium by extracting, for each divided area, the time segment with the largest number of accesses as the access time zone of that area, creating a table in which the divided areas are reordered by access time zone, and, with reference to the table, assigning new block numbers in order to the blocks belonging to areas with the same access time zone, thereby creating a pointer table that represents the correspondence between block numbers before and after the rearrangement; and
rearranging the replication data according to the data arrangement order.

5. A program that causes a computer to execute:
a process of observing an access pattern by dividing the storage area of a disk medium, which stores replication data obtained by copying replication source data, into areas of a predetermined size and recording, for each divided area, the number of accesses in each predetermined time segment;
a process of determining a data arrangement order in the disk medium by extracting, for each divided area, the time segment with the largest number of accesses as the access time zone of that area, creating a table in which the divided areas are reordered by access time zone, and, with reference to the table, assigning new block numbers in order to the blocks belonging to areas with the same access time zone, thereby creating a pointer table that represents the correspondence between block numbers before and after the rearrangement; and
a process of rearranging the replication data according to the data arrangement order.