JP2008140415A

JP2008140415A - Data duplication system, data transmission/reception method, and program for duplicating data in storage

Info

Publication number: JP2008140415A
Application number: JP2008004914A
Authority: JP
Inventors: Junichi Yamato; 純一大和; Yoshihide Kikuchi; 芳秀菊地; Yuji Kaneko; 裕治金子
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2008-01-11
Filing date: 2008-01-11
Publication date: 2008-06-19

Abstract

PROBLEM TO BE SOLVED: To complete data transmission/reception quickly even if data is discarded during transmission. SOLUTION: In a storage 20 that is the transmission source, at least one piece of redundant data for error correction is generated from the original data to be transmitted. The storage 20 transmits the original data and the redundant data in separate data transmission units. The storage 20 is equipped with a redundancy unit that adds redundancy to the data transmitted to another storage 21, and a communication unit that transmits the original data and the redundant data individually to the storage 21. COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to a data replication system, a data transmission/reception method, and a program for replicating data in a storage.

To keep a computer system functioning even when a disaster or the like occurs, computer systems that include both a normal (active) system and a standby system have been realized. For example, EMC Corporation has realized a system that performs mirroring between a normal-system storage and a standby-system storage. Information on this system is published at the URL "http://www.emc2.co.jp/local/ja/JP/products/product_pdfs/srdf/srdf.pdf".

In addition, Japanese Patent Laid-Open No. 2000-305856 describes a system that duplicates data between a main center (normal system) and a remote center (standby system).

In general, in a normal system, a normal-system storage is connected to a normal-system host computer (hereinafter referred to as a host) that uses the storage. The same applies to the standby system. The normal-system storage and the standby-system storage are connected via a communication network such as a dedicated line or the Internet.

In the system published at "http://www.emc2.co.jp/local/ja/JP/products/product_pdfs/srdf/srdf.pdf" and the system described in Japanese Patent Laid-Open No. 2000-305856, data is transmitted directly from the normal system to the standby system. In some cases, however, the normal system transmits data to a relay device, and the relay device transmits the data to the standby system. In general, the relay device transmits the data received from the normal system to the standby system and, on receiving a reception-complete notification from the standby system, notifies the normal system that the data transfer to the standby system is complete. After transmitting data to the relay device, the normal system starts its next process only after receiving the transfer-complete notification from the relay device.

In general, when data is transmitted and received via a communication network, the data is divided and sent one data transmission unit at a time. The data transmission unit differs from one communication protocol to another. Here, the case where a packet is the data transmission unit is described as an example. A data transmission unit such as a packet contains not only the data to be transmitted but also an error detection code for detecting errors (data corruption) that arise during transmission. Examples of error detection codes include checksum data and CRC (Cyclic Redundancy Check) data. When the device that receives the data detects an error by means of the error detection code, it discards the packet.
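The discard-on-error behavior described above can be sketched as follows. This is only an illustration, assuming a CRC-32 trailer computed with Python's standard `zlib.crc32`; the packet layout and the names `make_packet` and `receive_packet` are hypothetical and do not come from this document.

```python
import zlib
from typing import Optional


def make_packet(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload (hypothetical packet layout)."""
    crc = zlib.crc32(payload)
    return payload + crc.to_bytes(4, "big")


def receive_packet(packet: bytes) -> Optional[bytes]:
    """Return the payload, or None when the error detection code does not
    match, which models the receiving device discarding the packet."""
    if len(packet) < 4:
        return None
    payload, trailer = packet[:-4], packet[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != trailer:
        return None  # corrupted in transit: discard
    return payload
```

A discarded packet simply yields `None` here; in a real protocol the missing response is what eventually causes the source to retransmit.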

Even when the data contains no error, packets are discarded on the communication network if the network becomes congested (a state of high communication load). When a packet is discarded and no response is obtained from the destination, the source transmits the packet again. If packet discarding begins only once congestion has occurred, retransmissions by the source may raise the communication load further or lower the utilization of the communication network. To avoid these problems, a method has been adopted in recent years in which the devices that make up the communication network discard arbitrarily chosen packets once the communication load exceeds a predetermined threshold. This method is called RED (Random Early Detection).
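As a rough illustration of threshold-based early discarding, the sketch below uses a simplified form of the commonly described RED rule, in which the drop probability rises linearly between two queue-length thresholds. The parameter names and default values (`min_th`, `max_th`, `max_p`) are assumptions chosen for the example and are not taken from this document.

```python
import random
from typing import Callable


def red_drop_probability(avg_queue: float, min_th: float,
                         max_th: float, max_p: float) -> float:
    """Simplified RED: no drops below min_th, certain drop at or above
    max_th, and a linearly increasing probability in between."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


def should_drop(avg_queue: float, min_th: float = 5.0, max_th: float = 15.0,
                max_p: float = 0.1,
                rng: Callable[[], float] = random.random) -> bool:
    """Decide at random whether an arriving packet is discarded."""
    return rng() < red_drop_probability(avg_queue, min_th, max_th, max_p)
```

Passing `rng` explicitly makes the random decision reproducible in tests; a network device would use its own random source.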

Data transmission methods have also been proposed in which the device that receives data can correct errors in the data. For example, Japanese Patent Application Laid-Open No. 57-138237 describes a data transmission method in which the data to be transmitted and parity bits created from that data are transmitted separately, and the device that receives the data and the parity bits performs error correction.

In the following, mirroring and backup are distinguished as follows. "Mirroring" is defined as writing the same data to two or more storages, including a given storage, triggered by the host issuing a write command to that storage. Replication is also treated as a kind of mirroring. "Backup" is defined as copying the contents of one storage to another storage at an arbitrary time, without being triggered by a write command to the storage.

Techniques have also been realized in computer systems for interrupting a running program at an arbitrary point and resuming it later. For example, supercomputers manufactured by NEC Corporation and sold under the name SX Series save the execution state of a process at an arbitrary point (for example, the state of memory and registers) so that the program can be suspended and resumed. This function is sometimes called a checkpoint/restart function.

Transmitting data from the normal system to the standby system via a relay device has had the following problems. The relay device transmits the data received from the normal system to the standby system and, on receiving a reception-complete notification from the standby system, notifies the normal system that the transfer is complete. The normal system does not move on to its next process until it receives the transfer-complete notification from the relay device. As a result, a long time elapses between the normal system transmitting data to the relay device and the normal system being able to start its next process. In particular, when a relay device and a standby system are provided as a disaster countermeasure, the relay device is placed at a location remote from the normal system and the standby system is placed still farther away. Time is therefore spent exchanging data and notifications between the normal system and the relay device, and between the relay device and the standby system, and the start of the next process in the normal system is delayed.

For example, the round-trip time (the time from sending data until a response is obtained) for communication with a point 100 km away is on the order of milliseconds (ms), whereas the host advances each process on the order of microseconds (μs). The time spent waiting for a response to a transmission therefore delays processing.
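The order-of-magnitude claim above can be checked with a short calculation. The sketch assumes a signal speed of about 2 × 10^8 m/s (roughly two-thirds of the speed of light, a typical figure for optical fiber) and a host step of 1 μs; both values are assumptions for illustration, and real links add switching and protocol delays on top of propagation time.

```python
# Assumed propagation speed in the medium (about 2/3 of the speed of light).
SIGNAL_SPEED_M_PER_S = 2.0e8


def round_trip_time_s(distance_km: float,
                      speed_m_per_s: float = SIGNAL_SPEED_M_PER_S) -> float:
    """Propagation-only round-trip time for a one-way distance in km."""
    return 2.0 * distance_km * 1000.0 / speed_m_per_s


def host_steps_per_round_trip(distance_km: float,
                              host_step_s: float = 1e-6) -> float:
    """How many host processing steps (assumed 1 microsecond each) fit
    into a single round trip."""
    return round_trip_time_s(distance_km) / host_step_s
```

Under these assumptions a 100 km link has a round trip of about 1 ms, during which a host working in microsecond steps could have performed on the order of a thousand steps.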

Furthermore, if a data transmission unit such as a packet is discarded during transmission, the normal-system storage that is the source has to retransmit the data. Completing the data transmission then takes even longer, and the start of the next process is delayed further.

When data is backed up from the normal-system storage to the standby-system storage, the transmission time grows even more, because the amount of data to be transmitted is large in addition to the transmission distance being long.

In addition, when an abnormality occurs in the normal system, the standby system cannot always start processing immediately. For example, suppose that writing of data A and data B to the storage must be complete before a certain process X can start. When the normal-system host issues a write command for data A to the storage and mirroring is performed, data A is also reflected in the standby-system storage. Suppose an abnormality then occurs in the normal system. In this case, process X cannot start immediately, because data B has not been written to the standby-system storage. To start processing in the standby system, it is necessary either to write data B to the standby-system storage and then start process X, or to delete data A and redo the processing from the process that precedes process X. Resuming processing in the standby system has therefore taken time.

In some cases, the application program does not provide a function that allows processing to be resumed once the storage is in a predetermined data recording state. In the example above, for instance, there may be no function that allows processing to resume from process X once the host has written A and B to the storage. Hereinafter, a function that allows processing to be resumed when the storage is in a predetermined data recording state is referred to as a "resume function". Even when the application program does not provide a resume function, it is preferable that processing can be resumed quickly in the standby system if an abnormality occurs in the normal system.

An object of the present invention is therefore to enable data transmission/reception to be completed quickly even if data is discarded during transmission.

In a data transmission/reception method according to the present invention, in which data is transmitted from a source that transmits data to a destination that receives data, the source creates at least one piece of redundant data for error correction from the original data to be transmitted, and transmits the original data and the redundant data in separate data transmission units.

Preferably, the destination executes error correction processing as soon as it has received a part of the data group (the set of original data and redundant data) that is sufficient to perform error correction on part of the original data, without waiting to receive the entire data group. With such a method, the source does not need to retransmit the data even if part of the data is discarded during transmission.

For example, the source divides the original data into pieces of divided data and creates redundant data from which the original data can be restored even if one or more pieces of the divided data are lost.

For example, parity data or an ECC (Error Correcting Code) may be used as the redundant data.
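One simple way to realize divided data plus parity is a single XOR parity piece, which can restore any one lost piece. The sketch below is an illustration under that assumption only; the function names `split_with_parity` and `restore`, the zero padding, and the choice of a single parity unit are hypothetical and are not specified by this document. Recovering more than one lost piece would require a stronger code.

```python
from typing import List, Optional


def split_with_parity(data: bytes, n: int) -> List[bytes]:
    """Split data into n equal-sized pieces (zero-padded) and append one
    XOR parity piece. Each returned unit would be sent separately."""
    size = -(-len(data) // n)  # ceiling division
    padded = data.ljust(size * n, b"\x00")
    pieces = [padded[i * size:(i + 1) * size] for i in range(n)]
    parity = bytearray(size)
    for piece in pieces:
        for i, b in enumerate(piece):
            parity[i] ^= b
    return pieces + [bytes(parity)]


def restore(units: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the original data when at most one unit is missing (None)."""
    missing = [i for i, u in enumerate(units) if u is None]
    if len(missing) > 1:
        raise ValueError("single parity can recover only one lost unit")
    units = list(units)
    if missing:
        size = len(next(u for u in units if u is not None))
        rebuilt = bytearray(size)
        for u in units:
            if u is not None:
                for i, b in enumerate(u):
                    rebuilt[i] ^= b
        units[missing[0]] = bytes(rebuilt)
    # The last unit is parity; the rest are the data pieces in order.
    return b"".join(units[:-1])[:original_len]
```

For example, `split_with_parity(data, 4)` yields five units, and `restore` succeeds if any four of them arrive, so a single discarded unit does not force a retransmission.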

Duplicate data of the transmitted data may also be used as the redundant data.

The original data and the redundant data may be sent over separate communication networks. With such a method, even if a failure or the like occurs in one communication network, processing can proceed using the data received from the other communication network.
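The combination of duplicate data and separate networks can be sketched as below. The model is deliberately minimal: a network is represented by a function that either delivers a unit or returns `None`, and the names `Network`, `healthy`, `failed`, and `replicate_via` are hypothetical, introduced only for this example.

```python
from typing import Callable, List, Optional

# A network is modeled as a function that returns the unit on delivery
# or None when the unit is lost (hypothetical model for illustration).
Network = Callable[[bytes], Optional[bytes]]


def healthy(unit: bytes) -> Optional[bytes]:
    return unit


def failed(unit: bytes) -> Optional[bytes]:
    return None  # models a network fault: nothing is delivered


def replicate_via(original: bytes, net_a: Network,
                  net_b: Network) -> Optional[bytes]:
    """Send the original on net_a and a duplicate (the redundant data) on
    net_b; the destination uses whichever copy arrives."""
    duplicate = bytes(original)
    arrivals: List[Optional[bytes]] = [net_a(original), net_b(duplicate)]
    for unit in arrivals:
        if unit is not None:
            return unit
    return None
```

With `replicate_via(data, failed, healthy)` the destination still obtains the data, which mirrors the point that a fault on one network does not stall processing.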

A data replication system according to the present invention mirrors or backs up data in a first storage to a second storage via a communication network. The first storage includes data transfer processing means for controlling data transfer and redundancy means for creating at least one piece of redundant data for error correction from the original data to be transmitted, and the data transfer processing means transmits the original data and the redundant data created by the redundancy means in separate data transmission units.

Preferably, the second storage includes data restoration means for performing error correction processing using the redundant data received from the first storage, and storage processing means for storing the data restored by the data restoration means in a storage medium, and the data restoration means executes error correction processing as soon as it has received, from the first storage, a part of the data group (the set of original data and redundant data) that is sufficient to perform error correction on part of the original data, without waiting to receive the entire data group. With such a configuration, the source does not need to retransmit the data even if part of the data is discarded during transmission.
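The idea of restoring before the whole data group has arrived can be sketched as follows, assuming n data pieces plus one XOR parity unit (an assumption for illustration; this document does not fix a particular code). The class name `EarlyRestorer` and its method `on_unit` are hypothetical.

```python
from functools import reduce
from typing import Dict, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


class EarlyRestorer:
    """Collects units 0..n-1 (data pieces) and unit n (XOR parity) and
    restores the data as soon as any n of the n+1 units have arrived."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.units: Dict[int, bytes] = {}

    def on_unit(self, index: int, payload: bytes) -> Optional[bytes]:
        """Store one unit; return the restored data once it is recoverable,
        otherwise None."""
        self.units[index] = payload
        if len(self.units) < self.n:
            return None
        missing = [i for i in range(self.n) if i not in self.units]
        if missing:
            # Exactly one data piece is absent and the parity unit is
            # present: XOR of everything received equals the lost piece.
            self.units[missing[0]] = reduce(xor_bytes, self.units.values())
        return b"".join(self.units[i] for i in range(self.n))
```

The destination never waits for the last unit: if that unit was discarded in transit, the data is already restored from the others.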

For example, the redundancy means in the first storage divides the original data into pieces of divided data and creates redundant data from which the original data can be restored even if one or more pieces of the divided data are lost.

The redundancy means may use parity data or an ECC as the redundant data.

The redundancy means may create duplicate data of the original data as the redundant data.

The data transfer processing means may send the original data and the redundant data over separate communication networks. With such a configuration, even if a failure or the like occurs in one communication network, processing can proceed using the data received from the other communication network.

A program for replicating data in a storage according to the present invention causes a computer provided in a first storage, in a data replication system that mirrors or backs up data in the first storage to a second storage via a communication network, to execute a process of creating at least one piece of redundant data for error correction from the original data to be transmitted, and a process of transmitting the original data and the redundant data in separate data transmission units.

According to the present invention, at least one piece of redundant data for error correction is created from the original data to be transmitted, and the original data and the redundant data are transmitted in separate data transmission units. The standby system can therefore restore the original data from a part of the set of original data and redundant data, and no retransmission is needed even if part of the data is discarded during transmission. As a result, the data transfer can be completed quickly.

Embodiments of the present invention are described below with reference to the drawings.

Embodiment 1.
FIG. 1 is a block diagram showing a first embodiment of a data replication system according to the present invention. In the data replication system shown in FIG. 1, a storage 11 is locally connected to a host (host computer) 10 that uses the storage 11. The storage 11 is connected to a relay device 15 via a communication network (hereinafter referred to as a network) 13 such as the Internet or a dedicated line. The relay device 15 is connected to a storage 12 via a network 14.

The storages 11 and 12 are each, for example, a single magnetic disk device, optical disk device, or magneto-optical disk device. A disk array device, which is a set of individual magnetic disk devices, optical disk devices, or magneto-optical disk devices, can also be used as the storages 11 and 12. The host 10 and the storage 11 are connected by SCSI, Fibre Channel, Ethernet (registered trademark), or the like.

FIG. 2 is a block diagram showing a configuration example of the storage 11 shown in FIG. 1. The storage 12 has the same configuration as that shown in FIG. 2. As shown in FIG. 2, the storage 11 includes a storage controller 100 and a storage medium 101, which is the storage proper. The storage controller 100 includes a communication unit 102 that communicates with the host 10 and with other storages or relay devices, a processing sequencer 103 that manages the sequence of each process, an IO scheduler 104 that controls the order of processing instructions for the storage medium 101, a medium control unit 105 that controls the operation of the storage medium 101 in accordance with the processing instructions issued by the IO scheduler 104, and a buffer memory 106 that temporarily stores data going from the host 10 or the like to the storage medium 101 and data going from the storage medium 101 to the host 10 or the like. The processing sequencer 103 is realized by, for example, a CPU that operates according to a program.

FIG. 2 shows the case where only one pair of a medium control unit 105 and a storage medium 101 is included. The storages 11 and 12 may each include a plurality of storage media 101, with a medium control unit 105 provided for each storage medium. Even so, each of the storages 11 and 12 contains only one IO scheduler 104. When there are a plurality of storage media 101, the processing sequencer 103 designates the storage medium targeted by a processing instruction, and the IO scheduler 104 causes the medium control unit 105 corresponding to the designated storage medium to perform the processing.

FIG. 3 is a block diagram showing a configuration example of the relay device 15 shown in FIG. 1. The relay device 15 includes a communication unit 150 that communicates with the storages 11 and 12, a relay processing unit 151 that manages the sequence of relay processing, and a buffer memory 152 that temporarily stores data received from the storages 11 and 12. The relay processing unit 151 is realized by, for example, a CPU that operates according to a program.

As shown in FIGS. 1 and 2, the storage 11 is provided with the communication unit 102, and the storage 11 itself takes the initiative in transferring data. That is, rather than the host 10 reading data from the storage 11 and transmitting it to the relay device 15, the storage 11 transmits data directly to the relay device 15. The storage 12 likewise receives data directly from the relay device 15. The timing at which the storage 11 transmits data to the relay device 15, however, is directed by the host 10.

The relay device 15 is installed outside the range expected to be affected when a disaster such as an earthquake strikes the installation site of the storages 11 and 12. That is, a location (a distance from the storages 11 and 12) at which operation can continue even if the storages 11 and 12 are rendered inoperable by a disaster is calculated in advance, and the relay device 15 is installed at the calculated location. For example, the relay device 15 is installed where, if the storage 11 is struck by an earthquake of a presumed seismic intensity (for example, seismic intensity 6 to 7), the seismic intensity at the relay device 15 is low enough that no damage or the like occurs. Furthermore, the relay device 15 is installed so that the data transfer time between the storage 11 or the storage 12 and the relay device 15 is shorter than the data transfer time when data is transferred directly between the storage 11 and the storage 12. The data replication system thus functions as a disaster-resistant data management system.

Next, the operation of the data replication system is described with reference to the flowcharts of FIGS. 4 and 5. Here, as backup processing, the case where the storage 11, as the data transfer source, transfers data to the storage 12, as the data transfer destination, is taken as an example. FIG. 4 is a flowchart showing the operation of the processing sequencer 103 in the storage controller 100, and FIG. 5 is a flowchart showing the operation of the relay processing unit 151 in the relay device 15.

When executing backup processing, the host 10 outputs a copy command to the communication unit 102 in the storage controller 100 of the storage 11. The copy command is a command that instructs a backup. It contains information specifying the range of data to be copied, that is, the range of data subject to backup processing, and information specifying the copy-destination storage. On receiving the copy command, the communication unit 102 passes it to the processing sequencer 103 and instructs the processing sequencer 103 to start the copy processing.

On receiving the copy command, the processing sequencer 103 first sets the first block in the range of data to be copied as the block to be transferred, as shown in FIG. 4 (step S100). It then instructs the communication unit 102 to transmit a write command to the relay device that corresponds to the copy-destination storage specified by the copy command (step S101). The write command contains information indicating the block size, that is, the amount of data. The processing sequencer 103 also registers a read request for the data to be transferred with the IO scheduler 104. In response to the read request, the IO scheduler 104 instructs the medium control unit 105 to read the data to be transferred. In accordance with the read instruction, the medium control unit 105 causes the data to be transferred to be output from the storage medium 101 to the buffer memory 106 (step S102). When all the data to be transferred has been output from the storage medium 101 to the buffer memory 106, the medium control unit 105 outputs a read-complete notification to the processing sequencer 103.

The processing sequencer 103 then waits for both the ready-to-receive message from the relay device 15, which arrives via the communication unit 102, and the read-complete notification from the medium control unit 105 (step S103). Once both have arrived, it instructs the communication unit 102 to transfer the data stored in the buffer memory 106 to the relay device 15. In response to the instruction, the communication unit 102 transmits the data stored in the buffer memory 106 to the relay device 15 (step S104). The processing sequencer 103 then waits for the reception-complete message from the relay device 15, which arrives via the communication unit 102 (step S105).

When the communication unit 102 has transmitted all the data stored in the buffer memory 106 to the relay device 15 and has received a reception-complete message from the relay device 15, it outputs the reception-complete message to the processing sequencer 103. On receiving the reception-complete message, the processing sequencer 103 checks whether all the data to be copied has been transferred to the relay device 15 (step S106). If not, it sets the next block in the range of data to be copied as the block to be transferred (step S108) and returns to step S101.

If all the data to be copied has been transferred, the processing sequencer 103 instructs the communication unit 102 to output a completion notification to the host 10 (step S107) and ends the processing.
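The flow of steps S100 to S108 can be sketched as a simple loop. This is a sequential simplification under stated assumptions: in the described flow the write command (S101) and the medium read (S102) proceed concurrently and S103 waits for both, whereas the sketch performs them one after the other. `StubRelay` and `run_copy_command` are hypothetical names, and the string messages stand in for the actual notifications.

```python
from typing import List


class StubRelay:
    """Hypothetical stand-in for the relay device 15."""

    def __init__(self) -> None:
        self.received: List[bytes] = []
        self.expected_size = 0

    def write_command(self, size: int) -> str:
        self.expected_size = size
        return "ready"      # ready-to-receive message

    def send_data(self, data: bytes) -> str:
        if len(data) != self.expected_size:
            raise ValueError("size differs from the write command")
        self.received.append(data)
        return "received"   # reception-complete message


def run_copy_command(medium: List[bytes], first: int, last: int,
                     relay: StubRelay) -> str:
    """Transfer blocks first..last (inclusive) one block at a time."""
    block = first                                         # S100
    while True:
        ready = relay.write_command(len(medium[block]))   # S101
        buffer = medium[block]                            # S102: read into buffer
        if ready != "ready":                              # S103
            raise RuntimeError("relay is not ready to receive")
        ack = relay.send_data(buffer)                     # S104
        if ack != "received":                             # S105
            raise RuntimeError("relay did not confirm reception")
        if block == last:                                 # S106
            return "complete"                             # S107: notify host
        block += 1                                        # S108
```

Note that each iteration blocks on the relay's reception-complete message before the next block is set, which is why the round-trip time to the relay matters for overall backup time.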

The data amount of one block is set in the system in advance. Alternatively, the processing sequencer 103 may vary the data amount of one block according to the capacities of the buffer memory 106 of the transfer-destination storage 12 and the buffer memory 152 of the relay device 15.

The items that the processing sequencer 103 registers with the IO scheduler 104 are the type of process (read or write), an ID for identifying the process, information indicating the area in the storage medium 101 that is the target of the process, and information indicating the area of the buffer memory 106 that is the target of the process. Because the processing sequencer 103 may issue a plurality of read processing instructions or write processing instructions, the ID is used to tell the individual processes apart in such cases. In the processing shown in FIG. 4, therefore, when the processing sequencer 103 registers a read request for the data to be transferred, it registers the type of process and an ID identifying the read process, and registers the block to be transferred as the target area in the storage medium 101. When there are a plurality of storage media, the processing sequencer 103 also registers information for specifying the storage medium. The processing sequencer 103 then determines which process has completed from the ID specified at registration.
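The registered items map naturally onto a small record type. The sketch below is one possible representation; the field names, the tuple encodings of the regions, and the `PendingRequests` helper are assumptions made for illustration and are not defined in this document.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class IORequest:
    kind: str                         # "read" or "write"
    request_id: int                   # ID used to identify the process
    medium_region: Tuple[int, int]    # (start block, block count) on the medium
    buffer_region: Tuple[int, int]    # (offset, length) in the buffer memory
    medium_id: Optional[int] = None   # only needed when several media exist


class PendingRequests:
    """Tracks registered requests so a completion can be matched by ID."""

    def __init__(self) -> None:
        self._pending: Dict[int, IORequest] = {}

    def register(self, request: IORequest) -> None:
        if request.kind not in ("read", "write"):
            raise ValueError("kind must be 'read' or 'write'")
        self._pending[request.request_id] = request

    def complete(self, request_id: int) -> IORequest:
        """Return and forget the request whose completion was reported."""
        return self._pending.pop(request_id)
```

Matching completions by ID is what lets several reads and writes be outstanding at once without ambiguity about which one finished.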

ＩＯスケジューラ１０４は、処理シーケンサ１０３によって登録された処理を記録する。そして、記録されている処理をあらかじめ決められているアルゴリズムに従って選択し、取り出した処理を媒体制御部１０５に実行させる。あらかじめ決められているアルゴリズムとして、例えば、登録された順がある。また、記憶媒体１０１の磁気ヘッドや光ヘッドの現在位置から処理対象の位置までの移動距離が最も小さくなる処理を最初に選択してもよい。また、ストレージ１１が複数の記録媒体１０１を含むディスクアレイ装置であってそれぞれの記憶媒体に対応して媒体制御部１０５が設けられている場合には、ＩＯスケジューラ１０４は、処理を取り出そうとした記憶媒体１０１に対応した媒体制御部１０５が担当する記憶媒体１０１を対象とした処理を選択する。 The IO scheduler 104 records the processes registered by the processing sequencer 103. It then selects one of the recorded processes according to a predetermined algorithm and causes the medium control unit 105 to execute the selected process. One example of the predetermined algorithm is selection in the order of registration. Alternatively, the process that minimizes the moving distance from the current position of the magnetic head or optical head of the storage medium 101 to the position to be processed may be selected first. Further, when the storage 11 is a disk array device including a plurality of storage media 101 and a medium control unit 105 is provided for each storage medium, the IO scheduler 104 selects a process that targets the storage medium 101 handled by the medium control unit 105 for which the process is being taken out.
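The two selection policies mentioned above (registration order, and shortest head movement) can be sketched as follows. This is a minimal illustration only: the class and field names (`Request`, `IOScheduler`, `medium_block`, `head_position`) are hypothetical and do not appear in the specification, and the head position is modeled as a plain block number.

```python
from dataclasses import dataclass


@dataclass
class Request:
    req_id: int         # ID used by the sequencer to recognise completion
    kind: str           # "read" or "write"
    medium_block: int   # target area on the storage medium (assumed: a block number)
    buffer_offset: int  # target area in the buffer memory


class IOScheduler:
    """Hypothetical stand-in for the IO scheduler 104."""

    def __init__(self, policy="fifo"):
        self.policy = policy  # "fifo" or "nearest"
        self.pending = []

    def register(self, request):
        self.pending.append(request)

    def take_next(self, head_position=0):
        """Pick the next request to hand to the medium control unit."""
        if not self.pending:
            return None
        if self.policy == "fifo":
            chosen = self.pending[0]  # order of registration
        else:
            # Shortest movement from the current head position.
            chosen = min(self.pending,
                         key=lambda r: abs(r.medium_block - head_position))
        self.pending.remove(chosen)
        return chosen
```

With the "nearest" policy the caller is assumed to pass the position reached by the previous request as `head_position`; a real device would track this inside the medium control unit.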

媒体制御部１０５の実行中の処理が完了すると、ＩＯスケジューラ１０４は次の処理を取り出す。処理が読み出し処理であった場合には、媒体制御部１０５は、記憶媒体１０１に、記憶媒体１０１における指定された領域からデータを読み出させて、読み出したデータをバッファメモリ１０６の指定された領域に書き込ませる。処理が書き込み処理であった場合には、媒体制御部１０５は、記憶媒体１０１に、バッファメモリ１０６の指定された領域のデータを、記憶媒体１０１における指定された領域に書き込ませる。そして、媒体制御部１０５は、処理が完了したときには、処理シーケンサ１０３に処理の完了を通知する。 When the process being executed by the medium control unit 105 is completed, the IO scheduler 104 takes out the next process. If the process is a read process, the medium control unit 105 causes the storage medium 101 to read data from the specified area of the storage medium 101 and to write the read data into the specified area of the buffer memory 106. If the process is a write process, the medium control unit 105 causes the storage medium 101 to write the data in the specified area of the buffer memory 106 to the specified area of the storage medium 101. When the process is completed, the medium control unit 105 notifies the processing sequencer 103 of the completion.

通信部１０２は、外部から入力されたデータが、コマンド、受信準備完了のメッセージ、受信完了のメッセージ等の制御系メッセージであった場合には、入力されたデータを処理シーケンサ１０３に渡す。また、ストレージ１１に書き込まれるべきデータであった場合には、データを格納すべき場所を処理シーケンサ１０３に問い合わせる。そして、処理シーケンサ１０３から指定されたバッファメモリ１０６中の領域にデータを格納する。 When the data input from the outside is a control message such as a command, a reception preparation completion message, or a reception completion message, the communication unit 102 passes the input data to the processing sequencer 103. If the data is to be written to the storage 11, the communication unit 102 asks the processing sequencer 103 where the data should be stored, and then stores the data in the area of the buffer memory 106 designated by the processing sequencer 103.

また、通信部１０２は、処理シーケンサ１０３から指定されたコマンドまたは完了通知を、指定された中継装置またはホストに向けて送信する処理も行う。さらに、処理シーケンサ１０３から指定されたバッファメモリ１０６中のデータを、指定された中継装置またはホストに向けて送信する処理も行う。 The communication unit 102 also performs processing for transmitting a command or completion notification designated from the processing sequencer 103 to the designated relay device or host. Further, processing for transmitting the data in the buffer memory 106 designated by the processing sequencer 103 to the designated relay device or host is also performed.

なお、ホスト１０は、ストレージ１１から完了通知を受けたら、すなわち、中継装置１５へのデータ転送が完了したら、ストレージ１１からストレージ１２へのデータ転送が完了したと認識する。 The host 10 recognizes that the data transfer from the storage 11 to the storage 12 is completed when the completion notification is received from the storage 11, that is, when the data transfer to the relay device 15 is completed.

次に、中継装置１５の動作を説明する。中継装置１５において、ストレージ１１，１２からのデータ（コマンドまたは転送されるデータ）は、通信部１５０で受信される。通信部１５０は、ストレージ１１がステップＳ１０１において送信したwrite コマンドを受信すると、write コマンドを中継処理部１５１に渡す。 Next, the operation of the relay device 15 will be described. In the relay device 15, data (command or transferred data) from the storages 11 and 12 is received by the communication unit 150. When receiving the write command transmitted from the storage 11 in step S101, the communication unit 150 passes the write command to the relay processing unit 151.

中継処理部１５１は、図５に示すように、ストレージ１１から送られてくるデータを格納するのに必要な領域をバッファメモリ１５２に確保する（ステップＳ１２０）。中継処理部１５１は、write コマンドに含まれるデータ量の情報に基づいてデータの格納領域を確保する。そして、受信準備完了の通知をストレージ１１に送るように通信部１５０に指示する（ステップＳ１２１）。通信部１５０は、指示に応じて、ストレージ１１に受信準備完了のメッセージを送信する。また、write コマンドをストレージ１２に送るように通信部１５０に指示する（ステップＳ１２２）。通信部１５０は、指示に応じて、ストレージ１２にwrite コマンドを送信する。このwrite コマンドは、ストレージ１１から受信するデータをストレージ１２に書き込ませるためのwrite コマンドである。 As shown in FIG. 5, the relay processing unit 151 reserves an area necessary for storing data sent from the storage 11 in the buffer memory 152 (step S120). The relay processing unit 151 reserves a data storage area based on the data amount information included in the write command. Then, the communication unit 150 is instructed to send a reception preparation completion notification to the storage 11 (step S121). The communication unit 150 transmits a reception preparation completion message to the storage 11 in response to the instruction. Further, the communication unit 150 is instructed to send a write command to the storage 12 (step S122). The communication unit 150 transmits a write command to the storage 12 in response to the instruction. This write command is a write command for writing data received from the storage 11 to the storage 12.

次いで、ストレージ１１からデータが届くのを待ち（ステップＳ１２３）、データが届いて通信部１５０からデータを格納すべきバッファメモリ１５２の領域の問い合わせを受けると、ステップＳ１２０で確保した領域を通信部１５０に知らせる（ステップＳ１２４）。また、データのバッファメモリ１５２への格納の完了を待ち（ステップＳ１２５）、全てのデータがバッファメモリ１５２に格納されたことが通信部１５０から通知されると、ストレージ１１にwrite コマンドに対する完了を通知するように通信部１５０に指示する（ステップＳ１２６）。通信部１５０は、指示に応じて、ストレージ１１に受信完了のメッセージを送信する。 Next, the relay processing unit 151 waits for data to arrive from the storage 11 (step S123). When the data arrives and the communication unit 150 inquires about the area of the buffer memory 152 in which the data is to be stored, the relay processing unit 151 informs the communication unit 150 of the area secured in step S120 (step S124). It then waits for the storage of the data in the buffer memory 152 to be completed (step S125), and when the communication unit 150 notifies it that all the data has been stored in the buffer memory 152, it instructs the communication unit 150 to notify the storage 11 of the completion of the write command (step S126). In response to the instruction, the communication unit 150 transmits a reception completion message to the storage 11.

そして、ストレージ１２から準備完了のメッセージ（ステップＳ１２２において送信したwrite コマンドに対する応答）が送信されるのを待つ（ステップＳ１２７）。ストレージ１２からの準備完了のメッセージを受信したことが通信部１５０から通知されると、通信部１５０に、バッファメモリ１５２に格納されたデータをストレージ１２に向けて送信させる（ステップＳ１２８）。その後、ストレージ１２から受信完了のメッセージが送信されるのを待ち（ステップＳ１２９）、ストレージ１２からの受信完了のメッセージを受信したことが通信部１５０から通知されると、処理を終了する。 Then, it waits for a preparation completion message (response to the write command transmitted in step S122) from the storage 12 (step S127). When the communication unit 150 notifies that the preparation completion message from the storage 12 has been received, the communication unit 150 causes the data stored in the buffer memory 152 to be transmitted to the storage 12 (step S128). Thereafter, the process waits for a reception completion message transmitted from the storage 12 (step S129). When the communication unit 150 notifies the reception completion message from the storage 12, the processing is terminated.
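The order of steps S120 to S129 can be illustrated with a small simulation. It is only a sketch under simplifying assumptions: the `Endpoint` class and the message strings are hypothetical, messages are delivered instantly, and the waits in steps S127 and S129 are assumed to succeed immediately. The point it shows is that the reception completion message reaches the storage 11 (step S126) before any data is sent to the storage 12 (step S128).

```python
class Endpoint:
    """Hypothetical stand-in for a storage; records what the relay sends to it."""

    def __init__(self, name):
        self.name = name
        self.inbox = []
        self.payload = b""


def relay_store_and_forward(source, destination, block):
    """Walk through steps S120 to S129 and return the ordered message log."""
    log = []

    def send(endpoint, message, payload=b""):
        endpoint.inbox.append(message)
        endpoint.payload += payload
        log.append((endpoint.name, message))

    buffer = bytearray(len(block))            # S120: reserve the buffer area
    send(source, "ready")                     # S121: reception preparation complete
    send(destination, "write")                # S122: write command to the destination
    buffer[:] = block                         # S123 to S125: receive the whole block
    send(source, "received")                  # S126: completion reported to the source
    # S127: the "ready" reply from the destination is assumed to have arrived.
    send(destination, "data", bytes(buffer))  # S128: forward the buffered block
    # S129: the "received" reply from the destination is assumed to have arrived.
    return log
```

For example, calling `relay_store_and_forward(Endpoint("storage 11"), Endpoint("storage 12"), b"block")` returns a log in which `("storage 11", "received")` precedes `("storage 12", "data")`.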

通信部１５０は、外部からコマンドを受信すると、受信したコマンドを中継処理部１５１に渡す。また、外部から各種のメッセージを受信すると、受信したことを中継処理部１５１に通知する。さらに、データを受信した場合には、データを格納すべきバッファメモリ１５２の領域を中継処理部１５１に問い合わせ、バッファメモリ１５２中の指定された領域にデータを格納する。また、中継処理部１５１の指示に応じて、指定されたメッセージを指定されたストレージに送信する。さらに、バッファメモリ１５２中の領域と送信先のストレージとが指定された場合には、その領域中のデータを指定されたストレージに送信する。 When receiving a command from the outside, the communication unit 150 passes the received command to the relay processing unit 151. Further, when various messages are received from the outside, the relay processing unit 151 is notified of the reception. Further, when data is received, the relay processing unit 151 is inquired about the area of the buffer memory 152 where the data is to be stored, and the data is stored in the designated area in the buffer memory 152. Further, in response to an instruction from the relay processing unit 151, the designated message is transmitted to the designated storage. Further, when an area in the buffer memory 152 and a destination storage are designated, the data in the area is transmitted to the designated storage.

中継処理部１５１がデータの送信先のストレージを決定する際に、あらかじめ送信先のストレージ（この例ではストレージ１２）が固定的に決められている場合には特に選択処理を行わないが、データの転送元のストレージ（この例ではストレージ１１）に応じてデータの送信先のストレージが決まる場合には、転送元のストレージに応じて送信先のストレージを選択する処理を行う。また、送信先のストレージのデータを送信するのに先立って、データの転送元のストレージから送信先が指定されることもある。なお、バッファメモリ１５２に格納されたデータは、送信先のストレージ（この例ではストレージ１２）へのデータの送信が完了すると破棄される。 When the relay processing unit 151 determines the destination storage for the data, no selection process is performed if the destination storage (the storage 12 in this example) is fixed in advance. If the destination storage depends on the transfer source storage (the storage 11 in this example), the relay processing unit 151 selects the destination storage according to the transfer source storage. The destination may also be specified by the transfer source storage before the data is transmitted to the destination storage. The data stored in the buffer memory 152 is discarded when transmission of the data to the destination storage (the storage 12 in this example) is completed.

ストレージ１２の通信部１０２は、中継装置１５がステップＳ１２２において送信したwrite コマンドを受信すると、write コマンドを処理シーケンサ１０３に渡す。処理シーケンサ１０３は、中継装置１５から送られてくるデータを格納するのに必要な領域をバッファメモリ１０６に確保する。そして、通信部１０２に、準備完了のメッセージを中継装置１５に対して送信させる。この準備完了メッセージは、中継装置１５がステップＳ１２７において待つメッセージである。その後、中継装置１５からデータが送られてくると、ストレージ１２の通信部１０２は、データを格納すべき領域を処理シーケンサ１０３に問い合わせ、バッファメモリ１０６に格納する。データを全て格納したならば、受信完了のメッセージを中継装置１５に送信する。この受信完了メッセージは、中継装置１５がステップＳ１２９において待つメッセージである。また、ストレージ１２の処理シーケンサ１０３は、バッファメモリ１０６へのデータ格納後、ＩＯスケジューラ１０４に対して、処理の種類（この場合には書き込み）と、処理の識別ＩＤと、処理の対象となる記憶媒体１０１中の領域を示す情報と、処理の対象となるバッファメモリ１０６の領域を示す情報とを登録する。ＩＯスケジューラ１０４は、書き込み要求に応じて、媒体制御部１０５に、書き込み対象のデータの書き込み指示を行う。媒体制御部１０５は、登録内容に応じてバッファメモリ１０６から記憶媒体１０１へのデータの書き込み処理を行う。 When the communication unit 102 of the storage 12 receives the write command transmitted by the relay device 15 in step S122, the communication unit 102 passes the write command to the processing sequencer 103. The processing sequencer 103 secures an area necessary for storing the data sent from the relay device 15 in the buffer memory 106. It then causes the communication unit 102 to transmit a preparation completion message to the relay device 15. This preparation completion message is the message that the relay device 15 waits for in step S127. Thereafter, when data is sent from the relay device 15, the communication unit 102 of the storage 12 asks the processing sequencer 103 for the area in which the data is to be stored, and stores the data in the buffer memory 106. When all the data has been stored, a reception completion message is transmitted to the relay device 15. This reception completion message is the message that the relay device 15 waits for in step S129. Further, after the data has been stored in the buffer memory 106, the processing sequencer 103 of the storage 12 registers with the IO scheduler 104 the type of processing (in this case, writing), the processing identification ID, information indicating the target area in the storage medium 101, and information indicating the target area of the buffer memory 106. The IO scheduler 104 instructs the medium control unit 105 to write the data to be written in response to the write request. The medium control unit 105 writes the data from the buffer memory 106 to the storage medium 101 in accordance with the registered contents.

この実施の形態では、中継装置１５は、ストレージ１１の設置場所において地震等の災害が発生したときに、災害の影響が波及すると想定される範囲の外に設置される。また、中継装置１５は、ストレージ１１と中継装置１５との間のデータ転送時間が、ストレージ１１とストレージ１２との間で直接データを転送した場合のデータ転送時間よりも短くなるような位置に設置されている。さらに、バックアップのためにストレージ１１からストレージ１２に向けてデータを転送する際に、ストレージ１１を使用しているホスト１０は、中継装置１５へのデータ転送が完了したら、ストレージ１１からストレージ１２へのデータ転送が完了したと認識する。 In this embodiment, the relay device 15 is installed outside the range over which the influence of a disaster, such as an earthquake occurring at the installation location of the storage 11, is expected to spread. Further, the relay device 15 is installed at a position where the data transfer time between the storage 11 and the relay device 15 is shorter than the data transfer time when data is transferred directly between the storage 11 and the storage 12. In addition, when data is transferred from the storage 11 to the storage 12 for backup, the host 10 using the storage 11 recognizes that the data transfer from the storage 11 to the storage 12 is complete once the data transfer to the relay device 15 has been completed.

従って、ストレージ１１が被災しても中継装置１５は被災せず、かつ、ストレージ１１から中継装置１５へのデータ転送時間が短いので、耐障害性が向上する。また、ホスト１０は、ストレージ１２にデータが格納されるのを待たずに、次の処理を開始することができるので、データ転送に伴う処理の遅れが改善される。 Therefore, even if the storage 11 is damaged, the relay device 15 is not damaged, and the data transfer time from the storage 11 to the relay device 15 is short, so that the fault tolerance is improved. Further, since the host 10 can start the next process without waiting for the data to be stored in the storage 12, the delay in the process associated with the data transfer is improved.

実施の形態２． Embodiment 2.
第１の実施の形態では、中継装置１５は、ストレージ１１からの１ブロックのデータを全て受信してから、ストレージ１２に対するデータの送信を開始したが、ストレージ１１からの１ブロックのデータの受信の完了を待たずに、ストレージ１２に対するデータの送信を開始してもよい。図６は、ストレージ１１からのデータの受信の完了を待たずにストレージ１２に対するデータの送信を開始する制御を行う第２の実施の形態の中継処理部１５１の動作を示すフローチャートである。なお、データ複製システムの構成およびストレージ１１，１２と中継装置１５の構成は第１の実施の形態の場合と同じである（図１〜図３参照）。 In the first embodiment, the relay device 15 starts transmitting data to the storage 12 after receiving all of one block of data from the storage 11. However, the relay device 15 may start transmitting data to the storage 12 without waiting for reception of the one block of data from the storage 11 to be completed. FIG. 6 is a flowchart illustrating the operation of the relay processing unit 151 according to the second embodiment, which performs control to start transmission of data to the storage 12 without waiting for completion of reception of data from the storage 11. The configuration of the data replication system and the configurations of the storages 11 and 12 and the relay device 15 are the same as those in the first embodiment (see FIGS. 1 to 3).

中継処理部１５１は、第一の実施の形態に示す場合と同様に、ストレージ１１がステップＳ１０１において送信したwrite コマンドを受信する。すると、中継処理部１５１は、図６に示すように、ストレージ１１から送られてくるデータを格納するのに必要な領域をバッファメモリ１５２に確保する（ステップＳ１４０）。中継処理部１５１は、ストレージ１１からのwrite コマンドに含まれるデータ量の情報に基づいてデータの格納領域を確保する。そして、受信準備完了の通知をストレージ１１に送るように通信部１５０に指示する（ステップＳ１４１）。また、write コマンドをストレージ１２に送るように通信部１５０に指示する（ステップＳ１４２）。このwrite コマンドは、ストレージ１１から受信するデータをストレージ１２に書き込ませるためのwrite コマンドである。 As in the case of the first embodiment, the relay processing unit 151 receives the write command transmitted by the storage 11 in step S101. Then, the relay processing unit 151 secures an area necessary for storing data sent from the storage 11 in the buffer memory 152 as shown in FIG. 6 (step S140). The relay processing unit 151 secures a data storage area based on data amount information included in the write command from the storage 11. Then, the communication unit 150 is instructed to send a notification of reception preparation completion to the storage 11 (step S141). Further, the communication unit 150 is instructed to send a write command to the storage 12 (step S142). This write command is a write command for writing data received from the storage 11 to the storage 12.

次いで、ストレージ１１からデータが届くのを待ち（ステップＳ１４３）、データが届いて通信部１５０からデータを格納すべきバッファメモリ１５２の領域の問い合わせを受けると、ステップＳ１４０で確保した領域を通信部１５０に知らせる（ステップＳ１４４）。次いで、ストレージ１２から準備完了のメッセージが送信されるのを待ち（ステップＳ１４５）、ストレージ１２からの準備完了のメッセージを受信したことが通信部１５０から通知されると、通信部１５０に、バッファメモリ１５２に格納されたデータをストレージ１２に向けて送信させる（ステップＳ１４６）。 Next, the relay processing unit 151 waits for data to arrive from the storage 11 (step S143). When the data arrives and the communication unit 150 inquires about the area of the buffer memory 152 in which the data is to be stored, the relay processing unit 151 informs the communication unit 150 of the area secured in step S140 (step S144). It then waits for a preparation completion message to be transmitted from the storage 12 (step S145), and when the communication unit 150 notifies it that the preparation completion message from the storage 12 has been received, it causes the communication unit 150 to transmit the data stored in the buffer memory 152 to the storage 12 (step S146).

ステップＳ１４６の処理を行っているときに、ストレージ１１からのデータがバッファメモリ１５２に格納され、バッファメモリ１５２から読み出されたデータがストレージに向けて送信されるが、通信部１５０は、データの読み出し位置（読み出しアドレス）が格納位置（格納アドレス）を追い越さないように制御する。すなわち、読み出し位置が格納位置に追いついたら、バッファメモリ１５２からのデータの読み出しを中止する。 While the process of step S146 is being performed, data from the storage 11 is stored in the buffer memory 152 and data read from the buffer memory 152 is transmitted to the storage. The communication unit 150 performs control so that the data read position (read address) does not overtake the storage position (storage address). That is, when the read position catches up with the storage position, reading of data from the buffer memory 152 is suspended.
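The rule that the read position must not overtake the storage position can be sketched as a buffer with two cursors. The class below is an illustrative assumption, not part of the specification: `CutThroughBuffer`, `store`, and `read_available` are hypothetical names, and a single reserved area of fixed size is assumed.

```python
class CutThroughBuffer:
    """Hypothetical model of the reserved area in the buffer memory 152."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.write_pos = 0  # storage position: next byte to be filled from storage 11
        self.read_pos = 0   # read position: next byte to be sent to storage 12

    def store(self, chunk):
        """Store a chunk that has arrived from the transfer source."""
        end = self.write_pos + len(chunk)
        if end > len(self.data):
            raise ValueError("chunk exceeds the reserved area")
        self.data[self.write_pos:end] = chunk
        self.write_pos = end

    def read_available(self, max_bytes):
        """Return at most max_bytes, never reading past the storage position."""
        end = min(self.read_pos + max_bytes, self.write_pos)
        out = bytes(self.data[self.read_pos:end])
        self.read_pos = end
        return out  # empty when the read position has caught up

    def complete(self):
        """True once every byte of the reserved area has been forwarded."""
        return self.read_pos == len(self.data)
```

When `read_available` returns an empty result, the sender simply pauses and retries after more data has been stored, which corresponds to suspending the read from the buffer memory.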

そして、データのバッファメモリ１５２への格納が完了するのを待ち（ステップＳ１４７）、全てのデータがバッファメモリ１５２に格納されたことが通信部１５０から通知されると、ストレージ１１にwrite コマンドに対する完了を通知するように通信部１５０に指示する（ステップＳ１４８）。その後、ストレージ１２から受信完了のメッセージが送信されるのを待ち（ステップＳ１４９）、ストレージ１２からの受信完了のメッセージを受信したことが通信部１５０から通知されると、処理を終了する。 The relay processing unit 151 then waits for the storage of the data in the buffer memory 152 to be completed (step S147). When the communication unit 150 notifies it that all the data has been stored in the buffer memory 152, it instructs the communication unit 150 to notify the storage 11 of the completion of the write command (step S148). Thereafter, it waits for a reception completion message to be transmitted from the storage 12 (step S149), and when the communication unit 150 notifies it that the reception completion message from the storage 12 has been received, the processing ends.

この実施の形態では、第１の実施の形態に比べて、中継装置１５からストレージ１２へのデータの送信を早めに完了させることができる。なお、ストレージ１１，１２の動作は、第１の実施の形態のそれらの動作と同じである。 In this embodiment, data transmission from the relay device 15 to the storage 12 can be completed earlier than in the first embodiment. Note that the operations of the storages 11 and 12 are the same as those of the first embodiment.

実施の形態３． Embodiment 3.
第１および第２の実施の形態では、ストレージ１１のデータのバックアップが実現されたが、ミラーリングによってストレージ１１のデータをストレージ１２に転送するようにしてもよい。図７は、第３の実施の形態、すなわちミラーリングを行う場合のストレージ１１のストレージコントローラ１００における処理シーケンサ１０３の動作を示すフローチャートである。なお、データ複製システムの構成およびストレージ１１，１２と中継装置１５の構成は第１の実施の形態の場合と同じである（図１〜図３参照）。また、ストレージ１１において、通信部１０２、ＩＯスケジューラ１０４および媒体制御部１０５の動作は、第１の実施の形態のそれらの動作と同じである。 In the first and second embodiments, backup of the data in the storage 11 is realized. However, the data in the storage 11 may instead be transferred to the storage 12 by mirroring. FIG. 7 is a flowchart showing the operation of the processing sequencer 103 in the storage controller 100 of the storage 11 in the third embodiment, that is, when mirroring is performed. The configuration of the data replication system and the configurations of the storages 11 and 12 and the relay device 15 are the same as those in the first embodiment (see FIGS. 1 to 3). In the storage 11, the operations of the communication unit 102, the IO scheduler 104, and the medium control unit 105 are the same as those in the first embodiment.

第１および第２の実施の形態に示したバックアップは、ホスト１０がストレージ１１に対して複写コマンドを出力したことを契機に開始される。第３の実施の形態として示すミラーリングは、ホスト１０がストレージ１１に対してデータの書き込みを指示するwrite コマンドを出力したことを契機に開始される。 The backup shown in the first and second embodiments is started when the host 10 outputs a copy command to the storage 11. The mirroring shown as the third embodiment is started when the host 10 outputs a write command instructing the storage 11 to write data.

ストレージ１１の通信部１０２がホスト１０からwrite コマンドを受信すると、通信部１０２はそのwrite コマンドを処理シーケンサ１０３に渡す。すると、処理シーケンサ１０３は、図７に示すように、ホスト１０から受け取るデータを格納するのに必要な領域をバッファメモリ１０６に確保する（ステップＳ１６０）。また、準備完了の通知をホスト１０に送るように通信部１０２に指示する（ステップＳ１６１）。通信部１０２は、指示に応じて、準備完了の通知をホスト１０に送る。 When the communication unit 102 of the storage 11 receives a write command from the host 10, the communication unit 102 passes the write command to the processing sequencer 103. Then, as shown in FIG. 7, the processing sequencer 103 secures an area necessary for storing data received from the host 10 in the buffer memory 106 (step S160). Further, the communication unit 102 is instructed to send a notification of completion of preparation to the host 10 (step S161). The communication unit 102 sends a notification of preparation completion to the host 10 in response to the instruction.

そして、処理シーケンサ１０３は、ホスト１０からデータが到着するのを待ち（ステップＳ１６２）、データが届いて通信部１０２からデータを格納すべきバッファメモリ１０６の領域の問い合わせを受けると、ステップＳ１６０で確保した領域を通信部１０２に知らせる（ステップＳ１６３）。次いで、write コマンドを中継装置１５に送るように通信部１０２に指示する（ステップＳ１６４）。通信部１０２は、指示に応じて、中継装置１５にwrite コマンドを送信する。 The processing sequencer 103 then waits for data to arrive from the host 10 (step S162). When the data arrives and the communication unit 102 inquires about the area of the buffer memory 106 in which the data is to be stored, the processing sequencer 103 informs the communication unit 102 of the area secured in step S160 (step S163). Next, it instructs the communication unit 102 to send a write command to the relay device 15 (step S164). The communication unit 102 transmits the write command to the relay device 15 in response to the instruction.

次いで、処理シーケンサ１０３は、ホスト１０からのデータのバッファメモリ１０６への格納の完了を待ち（ステップＳ１６５）、全てのデータがバッファメモリ１０６に格納されたことが通信部１０２から通知されると、処理シーケンサ１０３は、ＩＯスケジューラ１０４に対して、処理の種類（この場合には書き込み）と、処理の識別ＩＤと、処理の対象となる記憶媒体１０１中の領域を示す情報と、処理の対象となるバッファメモリ１０６の領域を示す情報とを登録する。ＩＯスケジューラ１０４は、書き込み要求に応じて、媒体制御部１０５に、書き込み対象のデータの書き込み指示を行う。媒体制御部１０５は、登録内容に応じてバッファメモリ１０６から記憶媒体１０１へのデータの書き込み処理を行う（ステップＳ１６６）。 Next, the processing sequencer 103 waits for the storage of data from the host 10 to the buffer memory 106 (step S165), and when the communication unit 102 notifies that all the data has been stored in the buffer memory 106, The processing sequencer 103 instructs the IO scheduler 104 to specify the processing type (in this case, writing), the processing identification ID, information indicating the area in the storage medium 101 to be processed, and the processing target. Information indicating the area of the buffer memory 106 is registered. The IO scheduler 104 instructs the medium control unit 105 to write data to be written in response to the write request. The medium control unit 105 performs a data writing process from the buffer memory 106 to the storage medium 101 in accordance with the registered content (step S166).

そして、処理シーケンサ１０３は、中継装置１５から受信準備完了のメッセージ（ステップＳ１６４において送ったwrite コマンドに対する応答）が送信されるのを待つ（ステップＳ１６７）。中継装置１５からの受信準備完了のメッセージを受信したことが通信部１０２から通知されると、処理シーケンサ１０３は、通信部１０２に、バッファメモリ１０６に格納されたデータを中継装置１５に向けて送信させる（ステップＳ１６８）。その後、中継装置１５から受信完了のメッセージが送信されるのと、媒体制御部１０５からの書き込み完了通知とを待ち（ステップＳ１６９）、中継装置１５からの受信完了のメッセージを受信したことが通信部１０２から通知され、かつ、媒体制御部１０５からの書き込み完了通知を受けると、ホスト１０に書き込みの完了を通知し（ステップＳ１７０）、処理を終了する。ホスト１０は、ストレージ１１から完了通知を受けたら、すなわち、中継装置１５へのデータ転送および記憶媒体１０１へのデータ書き込みが完了したら、ミラーリングが完了したと認識する。 The processing sequencer 103 then waits for a reception preparation completion message (the response to the write command sent in step S164) to be transmitted from the relay device 15 (step S167). When the communication unit 102 notifies it that the reception preparation completion message from the relay device 15 has been received, the processing sequencer 103 causes the communication unit 102 to transmit the data stored in the buffer memory 106 to the relay device 15 (step S168). Thereafter, it waits for both a reception completion message from the relay device 15 and a write completion notification from the medium control unit 105 (step S169). When the communication unit 102 has notified it that the reception completion message from the relay device 15 has been received, and the write completion notification from the medium control unit 105 has also been received, it notifies the host 10 of the completion of writing (step S170) and ends the processing. When the host 10 receives the completion notification from the storage 11, that is, when the data transfer to the relay device 15 and the data writing to the storage medium 101 have both been completed, the host 10 recognizes that the mirroring is complete.
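The condition of steps S169 and S170, in which the host is notified only after both the relay's reception completion and the local write completion have occurred, can be sketched with two concurrent tasks. This is an assumption-laden illustration: `mirror_write`, `run_demo`, and the simulated delays are hypothetical, and Python's `asyncio` merely stands in for whatever event mechanism an actual storage controller would use.

```python
import asyncio


async def mirror_write(send_to_relay, write_to_medium, notify_host):
    """Notify the host only after BOTH operations have completed."""
    # Step S169: wait for the relay's reception completion and the
    # medium's write completion, in whichever order they finish.
    await asyncio.gather(send_to_relay(), write_to_medium())
    # Step S170: report completion of the write to the host.
    notify_host()


def run_demo():
    log = []

    async def relay():
        await asyncio.sleep(0.02)  # simulated network round trip (assumed value)
        log.append("relay-received")

    async def medium():
        await asyncio.sleep(0.01)  # simulated media write time (assumed value)
        log.append("medium-written")

    asyncio.run(mirror_write(relay, medium, lambda: log.append("host-notified")))
    return log
```

In the log returned by `run_demo()`, the entry `"host-notified"` is always last, regardless of which of the two operations finishes first.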

中継装置１５がストレージ１１から送られるデータを中継する動作や、ストレージ１２が中継装置１５から送られるデータを記憶媒体１０１に記憶させる動作は、第一の実施の形態と同様である。 The operation in which the relay device 15 relays data sent from the storage 11 and the operation in which the storage 12 stores data sent from the relay device 15 in the storage medium 101 are the same as in the first embodiment.

この実施の形態でも、ストレージ１１が被災しても中継装置１５は被災せず、かつ、ストレージ１１から中継装置１５へのデータ転送時間が短いので、耐障害性が向上する。また、ホスト１０がwrite コマンドを送信してから次の処理を開始するまでの時間を短縮化できる。 Also in this embodiment, even if the storage 11 is damaged, the relay device 15 is not damaged, and the data transfer time from the storage 11 to the relay device 15 is short, so that the fault tolerance is improved. In addition, the time from when the host 10 transmits the write command to when the next process is started can be shortened.

実施の形態４． Embodiment 4.
図８は、本発明によるデータ複製システムの第４の実施の形態を示すブロック図である。図８に示すデータ複製システムにおいて、ストレージ１１が、ストレージ１１を使用するホスト１０とローカルに接続されている。ストレージ１１は、ネットワーク１３を介して中継装置１５−１〜１５−ｎに接続されている。また、中継装置１５−１〜１５−ｎは、ネットワーク１４を介してストレージ１２に接続されている。なお、ストレージ１１，１２の構成は、第１の実施の形態のストレージ１１，１２の構成と同じであり、中継装置１５−１〜１５−ｎの構成は、第１の実施の形態の中継装置１５の構成と同じである。また、各中継装置１５−１〜１５−ｎがストレージ１１から受信したデータをストレージ１２に転送する動作や、ストレージ１２が受信したデータを記憶媒体１０１に記憶させる動作は、第１の実施の形態の中継装置１５やストレージ１２の動作と同様である。 FIG. 8 is a block diagram showing a fourth embodiment of the data replication system according to the present invention. In the data replication system shown in FIG. 8, the storage 11 is locally connected to the host 10 that uses the storage 11. The storage 11 is connected to the relay devices 15-1 to 15-n via the network 13. In addition, the relay devices 15-1 to 15-n are connected to the storage 12 via the network 14. The configurations of the storages 11 and 12 are the same as those of the storages 11 and 12 of the first embodiment, and the configuration of each of the relay devices 15-1 to 15-n is the same as that of the relay device 15 of the first embodiment. In addition, the operation in which each of the relay devices 15-1 to 15-n transfers the data received from the storage 11 to the storage 12, and the operation in which the storage 12 stores the received data in the storage medium 101, are the same as the operations of the relay device 15 and the storage 12 in the first embodiment.

第一の実施の形態と同様に、ストレージ１１は主体的にデータを送信する。すなわち、ホスト１０がストレージ１１からデータを読み出して中継装置１５−１〜１５−ｎにデータを送信するのではなく、ストレージ１１が直接中継装置１５にデータを送信する。ストレージ１２も、中継装置１５から直接データを受信する。また、中継装置１５−１〜１５−ｎは、ストレージ１１，１２の設置場所において地震等の災害が発生したときに、災害の影響が波及すると想定される範囲の外に設置される。さらに、ストレージ１１またはストレージ１２と中継装置１５−１〜１５−ｎとの間のデータ転送時間が、ストレージ１１とストレージ１２との間で直接データを転送した場合のデータ転送時間よりも短くなるように、中継装置１５−１〜１５−ｎが設置される。 Similar to the first embodiment, the storage 11 actively transmits data. That is, the host 10 does not read data from the storage 11 and transmits data to the relay apparatuses 15-1 to 15-n, but the storage 11 directly transmits data to the relay apparatus 15. The storage 12 also receives data directly from the relay device 15. In addition, the relay devices 15-1 to 15-n are installed outside the range in which the influence of the disaster is expected to spread when a disaster such as an earthquake occurs at the installation location of the storages 11 and 12. Furthermore, the data transfer time between the storage 11 or storage 12 and the relay devices 15-1 to 15-n is shorter than the data transfer time when data is directly transferred between the storage 11 and the storage 12. In addition, relay devices 15-1 to 15-n are installed.

次に、データ複製システムの動作を説明する。図９は、ストレージ１１のストレージコントローラ１００における処理シーケンサ１０３の動作を示すフローチャートである。ここでは、データの転送元としてのストレージ１１が、データの転送先としてのストレージ１２に向けてデータを転送して、バックアップを行う場合の例を説明する。 Next, the operation of the data replication system will be described. FIG. 9 is a flowchart showing the operation of the processing sequencer 103 in the storage controller 100 of the storage 11. Here, an example in which the storage 11 as the data transfer source transfers data to the storage 12 as the data transfer destination to perform backup will be described.

処理シーケンサ１０３は、通信部１０２を介してホスト１０から複写コマンドを受け取ると、図９に示すように、まず、複写対象のデータの範囲の先頭のブロックをデータ転送対象のブロックとして設定する（ステップＳ２００）。次いで、中継装置１５−１〜１５−ｎのうちから使用する中継装置を選択する（ステップＳ２０１）。ここでは、中継装置１５−１が選択されたとする。そして、通信部１０２に、選択した中継装置１５−１に対してwrite コマンドを送信するように指示する（ステップＳ２０２）。また、データ転送対象のデータの読み出し要求をＩＯスケジューラ１０４に登録する（ステップＳ２０３）。ステップＳ２０３において読み出し要求が登録されたときのＩＯスケジューラ１０４と媒体制御部１０５の動作は、第１の実施の形態で説明したステップＳ１０２における動作と同様である。 When the processing sequencer 103 receives a copy command from the host 10 via the communication unit 102, first, as shown in FIG. 9, the processing sequencer 103 sets the first block in the range of data to be copied as a block to be transferred (step). S200). Next, a relay device to be used is selected from the relay devices 15-1 to 15-n (step S201). Here, it is assumed that the relay device 15-1 is selected. Then, the communication unit 102 is instructed to transmit a write command to the selected relay device 15-1 (step S202). Also, a request for reading data to be transferred is registered in the IO scheduler 104 (step S203). The operations of the IO scheduler 104 and the medium control unit 105 when a read request is registered in step S203 are the same as the operations in step S102 described in the first embodiment.

そして、処理シーケンサ１０３は、通信部１０２を介して入力される中継装置１５−１からの受信準備完了のメッセージと、媒体制御部１０５からの読み出し完了通知との双方を待ち（ステップＳ２０４）。双方が入力されたら、バッファメモリ１０６に記憶されたデータを中継装置１５−１に転送するように通信部１０２に指示する（ステップＳ２０５）。そして、処理シーケンサ１０３は、通信部１０２を介して入力される中継装置１５−１からの受信完了のメッセージを待つ（ステップＳ２０６）。 Then, the processing sequencer 103 waits for both a reception preparation completion message from the relay device 15-1 input via the communication unit 102 and a read completion notification from the medium control unit 105 (step S204). When both are input, the communication unit 102 is instructed to transfer the data stored in the buffer memory 106 to the relay device 15-1 (step S205). Then, the processing sequencer 103 waits for a reception completion message from the relay device 15-1 input via the communication unit 102 (step S206).

通信部１０２は、ステップＳ２０３においてバッファメモリ１０６に記憶させたデータを中継装置１５−１に送信し、中継装置１５−１から受信完了のメッセージを受けたら、受信完了のメッセージを処理シーケンサ１０３に出力する。処理シーケンサ１０３は、受信完了のメッセージを入力すると、複写対象の全てのデータの中継装置１５への転送が完了したか否か確認する（ステップＳ２０７）。完了していなければ、複写対象のデータの範囲の次のブロックをデータ転送対象のブロックとして設定し（ステップＳ２０９）、ステップＳ２０１に戻る。 The communication unit 102 transmits the data stored in the buffer memory 106 in step S203 to the relay device 15-1, and upon receiving a reception completion message from the relay device 15-1, outputs the reception completion message to the processing sequencer 103. When the reception completion message is input, the processing sequencer 103 checks whether or not the transfer of all data to be copied to the relay device 15 has been completed (step S207). If not, it sets the next block in the range of the data to be copied as the block to be transferred (step S209), and the process returns to step S201.

複写対象の全てのデータの転送が完了していれば、処理シーケンサ１０３は、ホスト１０に対して完了通知を出力するように通信部１０２に指示し（ステップＳ２０８）、処理を終了する。 If the transfer of all data to be copied has been completed, the processing sequencer 103 instructs the communication unit 102 to output a completion notification to the host 10 (step S208), and ends the processing.

ステップＳ２０１での中継装置を選択する方法として、例えば、中継装置１５−１〜１５−ｎを順に選択したり、負荷の軽い（他のデータ転送に用いられていない）中継装置を選択したり、乱数を用いて選択したりする方法がある。 As a method of selecting a relay device in step S201, for example, the relay devices 15-1 to 15-n are sequentially selected, or a relay device with a light load (not used for other data transfer) is selected. There is a method of selecting using a random number.
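The three selection methods named for step S201 (in turn, lightest load, random) can be sketched as follows. The function names, the relay identifiers, and the representation of load as a per-relay count of transfers in progress are all hypothetical; the specification does not define how load is measured.

```python
import itertools
import random


def round_robin(relays):
    """Return an iterator that yields the relay devices in turn, indefinitely."""
    return itertools.cycle(relays)


def least_loaded(relays, load):
    """Pick the relay with the fewest transfers in progress (assumed metric)."""
    return min(relays, key=lambda r: load[r])


def random_choice(relays, rng=random):
    """Pick a relay using a random number."""
    return rng.choice(relays)
```

For example, with `relays = ["15-1", "15-2", "15-3"]`, successive `next()` calls on `round_robin(relays)` yield "15-1", "15-2", "15-3", and then "15-1" again, while `least_loaded(relays, {"15-1": 2, "15-2": 0, "15-3": 1})` returns "15-2".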

あるいは、複数の中継装置を同時に使用してデータ転送を並列に実行してもよい。図１０は、複数の中継装置を同時に使用する場合のストレージ１１の動作を示すフローチャートである。ここでは、１ブロックのデータを転送する場合を例にする。 Alternatively, data transfer may be performed in parallel using a plurality of relay devices simultaneously. FIG. 10 is a flowchart showing the operation of the storage 11 when a plurality of relay devices are used simultaneously. Here, a case where one block of data is transferred is taken as an example.

In the storage 11, when the processing sequencer 103 sets the first block in the range of the data to be copied as the block to be transferred (step S220), the storage 11 transfers data to all of the relay devices 15-1 to 15-n (step S221).

FIG. 11 is a flowchart specifically showing the process of step S221. The processing sequencer 103 transmits data to each of the relay devices 15-1 to 15-n according to the flowchart shown in FIG. 11. Here, the case where data is transmitted to the relay device 15-1 is described as an example. The processing sequencer 103 selects, from the data stored in the buffer memory 106, the data to be transmitted to the relay device 15-1 (step S240). It then instructs the communication unit 102 to send a write command to the relay device 15-1 (step S241) and registers a read request for the data to be transferred in the IO scheduler 104 (step S242). The operations of the IO scheduler 104 and the medium control unit 105 when the read request is registered in step S242 are the same as the operations in step S102 described in the first embodiment. The processing sequencer 103 waits for both a reception preparation completion message from the relay device 15-1, input via the communication unit 102, and a read completion notification from the medium control unit 105 (step S243). When both have been input, it instructs the communication unit 102 to transmit the data stored in the buffer memory 106 to the relay device 15-1 (step S244). Although transmission to the relay device 15-1 has been described as an example, the processing sequencer 103 executes steps S240 to S244 for all of the relay devices 15-1 to 15-n.

Furthermore, when the communication unit 102 notifies the processing sequencer 103 that a reception completion message has been received from one of the relay devices (step S222), the processing sequencer 103 checks whether any untransferred data remains (step S223). If so, it instructs the communication unit 102 to send a write command to the relay device that sent the reception completion message (step S224). The specific process of step S224 is the process shown in FIG. 11.

If it confirms in step S223 that no untransferred data remains, the processing sequencer 103 waits to receive a reception completion message from each relay device (step S225). When it has received reception completion messages from all of the relay devices to which data was transferred (step S226), it instructs the communication unit 102 to output a completion notification to the host 10 (step S227) and ends the processing.

図１２は、１ブロックのデータを５つに分け、３台の中継装置１５−１，１５−２，１５−３を使用する場合のデータ転送の例を示すタイミング図である。ストレージ１１は、ホスト１０からの複写コマンドに応じて、中継装置１５−１，１５−２，１５−３のそれぞれにwrite コマンドを送信する。そして、受信準備完了のメッセージを送信した中継装置１５−１，１５−２，１５−３にデータ（１），（２），（３）を転送する。 FIG. 12 is a timing chart showing an example of data transfer when one block of data is divided into five and three relay apparatuses 15-1, 15-2, and 15-3 are used. In response to the copy command from the host 10, the storage 11 transmits a write command to each of the relay devices 15-1, 15-2, and 15-3. Then, the data (1), (2), and (3) are transferred to the relay devices 15-1, 15-2, and 15-3 that have transmitted the reception preparation completion message.

In the example shown in FIG. 12, the relay devices 15-2 and 15-3 send their completion notifications first, so the storage 11 sends write commands to the relay devices 15-2 and 15-3 and then transmits data (4) and (5). When the storage 11 has confirmed receipt of completion notifications from all of the relay devices 15-1, 15-2, and 15-3, it outputs a completion notification to the host 10.
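The behavior of steps S220 to S227 and the timing of FIG. 12 can be reproduced with a small simulation. The sketch below is illustrative only: `schedule` and the per-relay transfer times are hypothetical, and each relay is assumed to need a fixed time per piece of data.

```python
import heapq

def schedule(pieces, relay_time):
    """Hand pieces to relays as each relay reports completion.

    relay_time maps a relay name to the (assumed) time it needs per piece.
    Returns the pieces sent to each relay and the time the host is notified.
    """
    pending = list(pieces)
    assignment = {relay: [] for relay in relay_time}
    events = []  # min-heap of (finish_time, relay)

    # Step S221: start one transfer on every relay.
    for relay, cost in relay_time.items():
        if pending:
            assignment[relay].append(pending.pop(0))
            heapq.heappush(events, (cost, relay))

    finish = 0
    # Steps S222 to S224: on each completion, give that relay the next piece.
    while events:
        finish, relay = heapq.heappop(events)
        if pending:
            assignment[relay].append(pending.pop(0))
            heapq.heappush(events, (finish + relay_time[relay], relay))

    # Steps S225 to S227: every relay has reported; notify the host.
    return assignment, finish
```

With `schedule([1, 2, 3, 4, 5], {"15-1": 10, "15-2": 3, "15-3": 4})`, the slow relay 15-1 carries only data (1) while 15-2 and 15-3 carry data (4) and (5), which matches the pattern described for FIG. 12.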

In this embodiment, data transfer can be performed over a plurality of paths, which makes the transfer less sensitive to the utilization of any individual path and to the processing capability of any individual relay device. In the example shown in FIG. 12, either the path to the relay device 15-1 was congested or the processing capability of the relay device 15-1 was low. Because the other relay devices 15-2 and 15-3 are also used, neither condition has a large effect on data transfer from the storage 11.

The case where different data is transmitted to each of the relay devices 15-1 to 15-n has been described above, but write commands for the same data may instead be transmitted to each of the relay devices 15-1 to 15-n. For example, suppose that data A to C are to be transmitted. The storage 11 transmits the write command for each of data A to C to every relay device 15-1 to 15-n.

この場合、ストレージ１１は、各中継装置１５−１〜１５−ｎに送信する同一のデータの組毎に識別子を設け、その識別子を各write コマンドに付加する。識別子としては、送信される各データの組の順番を表す数値を用いればよい。例えば、最初にデータＡを各中継装置に送信するのであれば、各中継装置に送信する各write コマンドに一番目を表す「１」という識別子を付加する。続いて、各中継装置にデータＢを送信するのであれば、各中継装置に送信する各write コマンドに二番目を表す「２」という識別子を付加する。他のデータのwrite コマンドにも同様に識別子を付加する。 In this case, the storage 11 provides an identifier for each identical data set to be transmitted to each relay apparatus 15-1 to 15-n, and adds the identifier to each write command. As the identifier, a numerical value indicating the order of each data set to be transmitted may be used. For example, if data A is first transmitted to each relay apparatus, an identifier “1” representing the first is added to each write command transmitted to each relay apparatus. Subsequently, if data B is transmitted to each relay device, an identifier “2” representing the second is added to each write command transmitted to each relay device. In the same way, identifiers are added to other data write commands.

各中継装置１５−１〜１５−ｎは、ストレージ１１に対してデータの転送完了を通知するときには、write コマンドに付加された識別子も通知する。従って、ストレージ１１は、同一の識別子が付加された応答を各中継装置１５−１〜１５−ｎから受信する。ストレージ１１は、応答に付加された識別子が初めて受信する識別子であれば、ホスト１０にそのデータの転送終了を通知する。その後、同一の識別子が付加された応答を受信した場合、その応答に対しては何も処理を行わない。 When each relay apparatus 15-1 to 15-n notifies the storage 11 of the completion of data transfer, it also notifies the identifier added to the write command. Accordingly, the storage 11 receives a response to which the same identifier is added from each relay device 15-1 to 15-n. If the identifier added to the response is the first identifier received, the storage 11 notifies the host 10 of the end of the data transfer. Thereafter, when a response to which the same identifier is added is received, no processing is performed on the response.

また、各中継装置１５−１〜１５−ｎは、識別子を付加したままwrite コマンドを待機系のストレージ１２に送信する。従って、ストレージ１２は同一の識別子が付加されたwrite コマンドを各中継装置１５−１〜１５−ｎから受信する。ストレージ１２は、write コマンドに付加された識別子が初めて受信する識別子であれば、そのwrite コマンドに従って処理を進める。その後、同一の識別子が付加されたwrite コマンドを受信した場合には、そのwrite コマンドを送信した中継装置に対して、転送が完了した通知のみを行う。 Each relay device 15-1 to 15-n transmits a write command to the standby storage 12 with the identifier added. Therefore, the storage 12 receives a write command to which the same identifier is added from each relay device 15-1 to 15-n. If the identifier added to the write command is an identifier received for the first time, the storage 12 proceeds with processing according to the write command. Thereafter, when a write command to which the same identifier is added is received, only a transfer completion notification is sent to the relay apparatus that has transmitted the write command.
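The identifier-based handling of duplicate write commands at the standby storage 12 can be sketched as follows. This is a minimal sketch under stated assumptions: `DuplicateFilter`, `handle_write`, the dictionary standing in for the storage medium, and the `"transfer-complete"` string are all hypothetical names chosen for illustration.

```python
class DuplicateFilter:
    """Remembers which identifiers have already been seen."""

    def __init__(self):
        self.seen = set()

    def first_time(self, identifier):
        """Return True only for the first arrival of an identifier."""
        if identifier in self.seen:
            return False
        self.seen.add(identifier)
        return True

def handle_write(medium, seen, identifier, data):
    """Apply the first copy of a write; acknowledge every copy.

    medium is a dict standing in for the storage medium (an assumption).
    """
    if seen.first_time(identifier):
        medium[identifier] = data
    # Later copies with the same identifier receive only the completion notice.
    return "transfer-complete"
```

The same filter could be used on the storage 11 side, where only the first response carrying a given identifier causes a notification to the host 10.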

このように同一データのwrite コマンドを各中継装置に送信すれば、複数のデータ転送経路のうち最も速度が速い経路を用いて処理を進めることができる。従って、ストレージ１１がミラーリングやバックアップを行う際、中継装置からの応答の待ち時間を短くすることができる。その結果、ホスト１０が次の処理を開始するまでの時間を短縮化される。 If the write command of the same data is transmitted to each relay device in this way, the process can be advanced using the fastest path among the plurality of data transfer paths. Therefore, when the storage 11 performs mirroring or backup, the waiting time for a response from the relay device can be shortened. As a result, the time until the host 10 starts the next process is shortened.

In the first to fourth embodiments, the relay device 15 may be provided with a nonvolatile storage device. The relay device 15 then records the commands and data received from the storage 11 in the nonvolatile storage device and issues the recorded commands to the storage 12 at an arbitrary timing. The nonvolatile storage device provided in the relay device 15 may be, for example, a disk array device, which is a set of single magnetic disk devices, optical disk devices, or magneto-optical disk devices. Alternatively, a battery-backed memory may be used.

The relay device 15 may issue the write command to the storage 12 when, for example, the traffic on the network 14, to which the relay device 15 and the storage 12 are connected, falls to or below a predetermined threshold. Alternatively, the write command may be issued when a predetermined time has elapsed since the command and data were recorded in the nonvolatile storage device, when a predetermined time of day is reached, or when the storage 12 requests that the write command be issued.

このように不揮発性記憶装置にストレージ１１からのコマンドおよびデータを記録すれば、コマンドやデータが喪失されることがない。そして、ストレージ１１からのwrite コマンド受信と連動せずに、任意のタイミングでストレージ１２にwrite コマンドを発行できる。その結果、ストレージ１２を常時稼働させておかなくてもよくなる。同様に、中継装置１５およびストレージ１２が接続されるネットワーク１４も常時接続状態を保つ必要がなくなり、ネットワーク１４の運用コストも低減される。 Thus, if the command and data from the storage 11 are recorded on the nonvolatile storage device, the command and data are not lost. Then, the write command can be issued to the storage 12 at an arbitrary timing without interlocking with reception of the write command from the storage 11. As a result, the storage 12 need not always be operated. Similarly, the network 14 to which the relay device 15 and the storage 12 are connected need not always be connected, and the operation cost of the network 14 is reduced.

Alternatively, the relay device 15 may normally issue the write command to the storage 12 in conjunction with receiving the write command from the storage 11, as in the first to fourth embodiments, and record commands and data in the nonvolatile storage device only when the traffic on the network 14 reaches or exceeds a predetermined threshold. In this case, the traffic on the network 14 can be leveled. In addition, a line with a low operating cost can be used as the network 14, which reduces the cost of the entire data replication system.
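The deferred issuing of write commands by the relay device 15 can be illustrated with a toy model. Everything here is an assumption made for illustration: the `Relay` class, the in-memory `deque` standing in for the nonvolatile storage device, and the choice to forward queued commands in arrival order whenever the sampled traffic is below the threshold.

```python
from collections import deque

class Relay:
    """Toy relay that journals writes and forwards them when the link is quiet."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.journal = deque()  # stands in for the nonvolatile storage device
        self.forwarded = []     # commands issued to the standby storage

    def on_write(self, command, traffic):
        """Record the command, then forward immediately if traffic allows."""
        self.journal.append(command)
        if traffic < self.threshold:
            self.flush()

    def flush(self):
        """Issue every journaled command, oldest first."""
        while self.journal:
            self.forwarded.append(self.journal.popleft())
```

When traffic stays below the threshold, each command is forwarded as soon as it arrives, which corresponds to the normal linked operation. Commands that arrive while traffic is at or above the threshold wait in the journal until a later quiet period.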

Embodiment 5.
FIG. 13 is a block diagram showing a fifth embodiment of the data replication system according to the present invention. In the data replication system shown in FIG. 13, the storage 20 is locally connected to the host 10 that uses the storage 20. The storage 20 is connected to the storage 21 via the network 13.

The storages 20 and 21 are each, for example, a single magnetic disk device, optical disk device, or magneto-optical disk device. A disk array device, which is a set of single magnetic disk devices, optical disk devices, or magneto-optical disk devices, can also be used as the storages 20 and 21. The host 10 and the storage 20 are connected by SCSI, Fibre Channel, Ethernet (registered trademark), or the like.

FIG. 14 is a block diagram showing a configuration example of the storage 20 shown in FIG. 13. The storage 21 has the same configuration. As shown in FIG. 14, the storage 20 includes a storage controller 200 and a storage medium 101, which is the storage body. The storage controller 200 includes: a communication unit 204 that communicates with the host 10 and other storages; a processing sequencer 201 that manages the sequence of each process; an IO scheduler 104 that controls the order of processing instructions for the storage medium 101; a medium control unit 105 that controls the operation of the storage medium 101 according to the processing instructions issued by the IO scheduler 104; a buffer memory 106 that temporarily stores data passing from the host 10 or the like to the storage medium 101 and from the storage medium 101 to the host 10 or the like; a redundancy unit 202 that makes data to be sent to another storage redundant; and a restoration unit 203 that restores the original data from redundant data sent from another storage. The processing sequencer 201 is realized by, for example, a CPU that operates according to a program.

Next, the operation of the data replication system of the fifth embodiment will be described with reference to the flowcharts of FIGS. 15 and 16. FIG. 15 is a flowchart showing the operation of the processing sequencer 201 in the storage controller 200, and FIG. 16 is a flowchart showing the operation of the communication unit 204.

処理シーケンサ２０１は、通信部２０４を介してホスト１０から複写コマンドを受け取ると、図１５に示すように、まず、複写対象のデータの範囲の先頭のブロックをデータ転送対象のブロックとして設定する（ステップＳ３００）。次いで、通信部２０４に、複写コマンドで指定された複写先のストレージに対して冗長化write コマンドを送信するように指示する（ステップＳ３０１）。冗長化write コマンドは、冗長化されたデータに基づく書き込みを指示するコマンドである。冗長化write コマンドには、ブロックサイズすなわちデータ量を示す情報とともに、データが冗長化されることを示す情報が含まれている。また、処理シーケンサ２０１は、データ転送対象のデータの読み出し要求をＩＯスケジューラ１０４に登録する。ＩＯスケジューラ１０４は、読み出し要求に応じて、媒体制御部１０５に、データ転送対象のデータの読み出し指示を行う。媒体制御部１０５は、読み出し指示に従って、データ転送対象のデータを記憶媒体１０１からバッファメモリ１０６に出力させる（ステップＳ３０２）。 When the processing sequencer 201 receives a copy command from the host 10 via the communication unit 204, as shown in FIG. 15, first, the first block in the range of data to be copied is set as a block to be transferred (step S300). Next, the communication unit 204 is instructed to transmit a redundant write command to the copy destination storage designated by the copy command (step S301). The redundant write command is a command for instructing writing based on redundant data. The redundancy write command includes information indicating that the data is made redundant together with information indicating the block size, that is, the data amount. Further, the processing sequencer 201 registers a read request for data to be transferred with the IO scheduler 104. In response to the read request, the IO scheduler 104 instructs the medium control unit 105 to read data to be transferred. The medium control unit 105 causes the data to be transferred to be output from the storage medium 101 to the buffer memory 106 according to the read instruction (step S302).

そして、処理シーケンサ２０１は、通信部２０４を介して入力されるストレージ２１からの受信準備完了のメッセージと、媒体制御部１０５からの読み出し完了通知との双方を待ち（ステップＳ３０３）。双方が入力されたら、バッファメモリ１０６に格納されたデータの冗長化を冗長化部２０２に行わせる（ステップＳ３０４）。冗長化は、元のデータ（以下、元データという。）から冗長データを作成することによって行う。以下の説明において、元データおよび新たに作成された冗長データの集合を冗長化されたデータ群と記す。処理シーケンサ２０１は、冗長化されたデータ群をストレージ２１に転送するように通信部２０４に指示する。通信部２０４は、この指示に応じて、冗長化されたデータ群をストレージ２１に送信する（ステップＳ３０５）。 Then, the processing sequencer 201 waits for both a reception preparation completion message from the storage 21 input via the communication unit 204 and a read completion notification from the medium control unit 105 (step S303). If both are input, the redundancy unit 202 is caused to make the data stored in the buffer memory 106 redundant (step S304). Redundancy is performed by creating redundant data from original data (hereinafter referred to as original data). In the following description, a set of original data and newly created redundant data is referred to as a redundant data group. The processing sequencer 201 instructs the communication unit 204 to transfer the redundant data group to the storage 21. In response to this instruction, the communication unit 204 transmits the redundant data group to the storage 21 (step S305).

However, the communication unit 204 transmits the original data and the redundant data to the storage 21 as separate transmission units. A "transmission unit" here is the minimum unit of data transmission in a given data transfer protocol. Specific examples include a segment in TCP or UDP, a packet in the Internet Protocol, and a frame in the physical layer of the Fibre Channel protocol or Ethernet (registered trademark). Thus, when transmitting according to the Internet Protocol, for example, the communication unit 204 transmits the original data and the redundant data in separate packets. In general, error detection data for detecting errors (data corruption) that occur during transmission is added inside a transmission unit such as a packet. The redundant data created in step S304 is transmitted as an independent transmission unit and is distinct from the error detection data added to a packet or the like.

As described later, the original data is divided, and the redundant data is created from the divided original data. In step S305, the communication unit 204 transmits each piece of the divided original data as an individual transmission unit. For example, suppose the original data is divided into N pieces and m pieces of redundant data are created from those N pieces. In this case, the communication unit 204 transmits the original data in N packets (or segments, frames, etc.) and the redundant data in m packets. After the communication unit 204 has transmitted the original data and the redundant data as separate transmission units, the processing sequencer 201 waits for a reception completion message from the storage 21 input via the communication unit 204 (step S306).

The communication unit 204 transmits the entire redundant data group (original data and redundant data) to the storage 21 and, upon receiving a reception completion message from the storage 21, outputs that message to the processing sequencer 201. When the reception completion message is input, the processing sequencer 201 checks whether the transfer of all data to be copied to the storage 21 has been completed (step S307). If not, it sets the next block in the range of the data to be copied as the block to be transferred (step S309) and returns to step S301.

複写対象の全てのデータの転送が完了していれば、処理シーケンサ２０１は、ホスト１０に対して完了通知を出力するように通信部２０４に指示し（ステップＳ３０８）、処理を終了する。 If the transfer of all data to be copied has been completed, the processing sequencer 201 instructs the communication unit 204 to output a completion notification to the host 10 (step S308), and ends the processing.

なお、１ブロックのデータ量は、あらかじめシステムに設定されている量である。また、処理シーケンサ２０１は、１ブロックのデータ量を、転送先のストレージ２１のバッファメモリ１０６の容量に応じて変化させるようにしてもよい。 Note that the data amount of one block is an amount set in the system in advance. Further, the processing sequencer 201 may change the data amount of one block in accordance with the capacity of the buffer memory 106 of the transfer destination storage 21.

When data input from outside is a control message such as a command, a reception preparation completion message, or a reception completion message, the communication unit 204 passes the input data to the processing sequencer 201. When the data is to be written to the storage 20, the communication unit 204 asks the processing sequencer 201 where the data should be stored. It then either stores the data in the area of the buffer memory 106 designated by the processing sequencer 201, or has the restoration unit 203 restore the data and then stores the restored data in the designated area of the buffer memory 106.

また、通信部２０４は、処理シーケンサ２０１から指定されたコマンドまたは完了通知を、指定されたストレージまたはホストに送信する処理も行う。さらに、処理シーケンサ２０１から指定されたバッファメモリ１０６中のデータを、指定されたストレージまたはホストに送信する処理も行う。また、冗長化部２０２から冗長化されたデータを受け取り、受け取ったデータを、指定されたストレージまたはホスト１０に送信する処理も行う。 The communication unit 204 also performs processing for transmitting a command or completion notification designated by the processing sequencer 201 to the designated storage or host. Further, processing for transmitting data in the buffer memory 106 designated by the processing sequencer 201 to the designated storage or host is also performed. In addition, a process of receiving redundant data from the redundancy unit 202 and transmitting the received data to the designated storage or host 10 is also performed.

なお、通信部２０４は、他のストレージにコマンドを送信する場合に、各コマンドを区別するためのコマンド識別子をコマンドに付加する。コマンド識別子の例として、コマンドを送信する毎に１加算した数値である発行番号（後述する発行ＩＤ）がある。また、冗長化されたデータ群を送信する際に、その送信に関連したコマンドのコマンド識別子を付加する。例えば、冗長化write コマンドを送信するときに、その冗長化write コマンドを識別するためのコマンド識別子を付加する。そして、その冗長化write コマンドに対応する冗長化されたデータ群（元データおよび冗長データ）にも、冗長化write コマンドと同じコマンド識別子を付加する。 Note that when the command is transmitted to another storage, the communication unit 204 adds a command identifier for distinguishing each command to the command. As an example of the command identifier, there is an issue number (issue ID described later) which is a numerical value obtained by adding 1 each time a command is transmitted. Further, when transmitting a redundant data group, a command identifier of a command related to the transmission is added. For example, when a redundant write command is transmitted, a command identifier for identifying the redundant write command is added. Then, the same command identifier as that of the redundant write command is added to the redundant data group (original data and redundant data) corresponding to the redundant write command.

さらに、通信部２０４は、各コマンドやデータに、送信元となるストレージの識別子も付加する。 Further, the communication unit 204 adds a storage identifier as a transmission source to each command and data.
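The tagging performed by the communication unit 204 can be sketched as a small data structure. This is a hypothetical illustration: the `Unit` and `Sender` names, the field layout, and the use of a list position as the per-piece index are assumptions, not a wire format taken from the description.

```python
from dataclasses import dataclass
from itertools import count

@dataclass(frozen=True)
class Unit:
    """One transmission unit of a redundant data group."""
    source_id: str   # identifier of the sending storage
    command_id: int  # identifier of the related redundant write command
    index: int       # position of the piece, used at restoration time
    payload: bytes

class Sender:
    """Issues command identifiers and tags outgoing pieces with them."""

    def __init__(self, source_id):
        self.source_id = source_id
        self._issue_numbers = count(1)  # incremented by 1 for every command

    def next_command_id(self):
        return next(self._issue_numbers)

    def tag(self, command_id, pieces):
        """Attach the source and command identifiers to every piece."""
        return [Unit(self.source_id, command_id, i, p) for i, p in enumerate(pieces)]
```

Because every unit carries both identifiers, a receiver handling several redundant write processes at once can tell which command each arriving piece belongs to.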

次に、冗長化部２０２の動作について説明する。冗長化部２０２は、指定されたデータをバッファ１０６から取得し、規定された冗長化方法を用いて、冗長化されたデータ群を作成する。冗長化部２０２は、指定された元データをＮ個のデータに分割する。そして、Ｎ個に分割された元データからｍ個の冗長データを作成する。Ｎ，ｍは自然数である。このＮ＋ｍ個のデータ群が、冗長化されたデータ群となる。冗長化されたデータ群には、復元時に使用するための識別情報が付加される。 Next, the operation of the redundancy unit 202 will be described. The redundancy unit 202 acquires designated data from the buffer 106 and creates a redundant data group using a prescribed redundancy method. The redundancy unit 202 divides the designated original data into N pieces of data. Then, m redundant data are created from the original data divided into N pieces. N and m are natural numbers. The N + m data groups become redundant data groups. Identification information for use at the time of restoration is added to the redundant data group.

A specific example of redundancy is described below. The redundancy unit 202 divides the original data into N pieces by, for example, dividing it into N equal parts from the beginning. With this method, a number indicating each piece's position from the beginning is used as the identification information. After dividing the original data into N pieces, the redundancy unit 202 creates parity data by a parity operation, in the same manner as RAID 3 and RAID 5 devices, and uses the parity data as the redundant data. That is, the redundant data is created so that, across the N pieces and the redundant data, the sum of the values of corresponding bits is always odd (or always even). In this case, a single piece of redundant data suffices. With redundant data created in this way, even if one of the N pieces of the original data is lost, the lost piece can be restored.
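The division and parity operation can be sketched in a few lines. This is a minimal sketch assuming even parity (bytewise XOR) and zero padding of the last piece when the data length is not a multiple of N; the padding rule and the function names are assumptions, since the description does not specify them.

```python
def split_equal(data, n):
    """Split data into n equal pieces, zero-padding the tail if needed."""
    size = -(-len(data) // n)  # ceiling division
    padded = data.ljust(size * n, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(n)]

def xor_parity(pieces):
    """Return the even-parity piece: the bytewise XOR of all pieces."""
    parity = bytearray(len(pieces[0]))
    for piece in pieces:
        for i, byte in enumerate(piece):
            parity[i] ^= byte
    return bytes(parity)
```

With even parity, XOR-ing all N pieces together with the parity piece yields all zero bytes, which is the property that later allows any single lost piece to be rebuilt.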

冗長化の方法は上記のパリティ演算のみに限定されず、冗長データの数も１個とは限らない。例えば、ダブルパリティ演算によって冗長データを作成してもよい。また、元データに基づくＥＣＣ（Error Correcting Code：誤り訂正符号）を冗長データとして用いてもよい。このように様々な冗長化の方法があるが、以下の説明では、パリティ演算によって１個の冗長データを作成した場合を例に説明する。 The redundancy method is not limited to the above parity operation, and the number of redundant data is not limited to one. For example, redundant data may be created by a double parity operation. Further, ECC (Error Correcting Code) based on the original data may be used as redundant data. There are various redundancy methods as described above. In the following description, a case where one piece of redundant data is created by parity calculation will be described as an example.

次に、データの転送先であるストレージ２１の動作を説明する。ストレージ２０からストレージ２１のストレージコントローラ２００における通信部２０４に冗長化write コマンドが送られる。通信部２０４は、冗長化write コマンドを受信すると、処理シーケンサ２０１に冗長化write コマンドを渡し、冗長化write コマンドにもとづく処理（冗長化write 処理）の開始を指示する。 Next, the operation of the storage 21 that is the data transfer destination will be described. The redundant write command is sent from the storage 20 to the communication unit 204 in the storage controller 200 of the storage 21. Upon receiving the redundant write command, the communication unit 204 passes the redundant write command to the processing sequencer 201 and instructs the start of processing based on the redundant write command (redundant write processing).

図１６は、ストレージ２１における処理シーケンサ２０１の動作を示すフローチャートである。処理シーケンサ２０１は、複数の冗長化write 処理を並行して行うことが可能であり、コマンドを発行処理したストレージの識別子と、コマンドに付加されたコマンド識別子を用いて各処理を識別する。 FIG. 16 is a flowchart showing the operation of the processing sequencer 201 in the storage 21. The processing sequencer 201 can perform a plurality of redundant write processes in parallel, and identifies each process using the identifier of the storage that issued the command and the command identifier added to the command.

冗長化write 処理では、処理シーケンサ２０１は、まず、送られてくるデータを記憶できる領域をバッファ１０６中に確保する（ステップＳ３２０）。次いで、通信部２０４に冗長化write コマンドを送信したストレージ２０に向けて受信準備の完了通知を送信させる（ステップＳ３２１）。そして、ストレージ２０からデータが届くのを待ち（ステップＳ３２２）、データが到着したことが通信部２０４から通知されたら、データ転送元のストレージ２０の識別子とデータに付加されているコマンド識別子とを指定し、データを復元部２０３に送るように通信部２０４に指示する（ステップＳ３２３）。 In the redundant write process, the process sequencer 201 first secures an area in the buffer 106 where the transmitted data can be stored (step S320). Next, the communication unit 204 is caused to transmit a reception preparation completion notification to the storage 20 that has transmitted the redundant write command (step S321). Then, it waits for data to arrive from the storage 20 (step S322). When the communication unit 204 notifies that the data has arrived, the identifier of the data transfer source storage 20 and the command identifier added to the data are designated. Then, the communication unit 204 is instructed to send the data to the restoration unit 203 (step S323).

Next, the processing sequencer 201 determines the number of pieces of data that have been sent in response to the redundant write command (step S324). Because each piece of data carries the command identifier corresponding to the redundant write command, the processing sequencer 201 can confirm whether each received piece was sent for that command. If the number of pieces is less than a predetermined number (n), the process returns to step S322; if it has reached the predetermined number (n), the process proceeds to step S325. The predetermined number is the number of pieces from which the original data can be restored, and depends on the redundancy scheme. For example, when one piece of redundant data is created by a parity operation from original data divided into N pieces, N+1 pieces are sent from the storage 20. In this case, the original data can be restored once N pieces have been received. Step S324 therefore only needs to determine whether fewer than N pieces have been received.

In step S325, the processing sequencer 201 designates the identifier of the data transfer source storage 20, the command identifier, and the area in the buffer memory 106 secured in step S320, and instructs the restoration unit 203 to restore the data. It then instructs the communication unit 204 to send a response to the data transfer source storage 20 (step S326). It also instructs the IO scheduler 104 to write the data in the area of the buffer memory 106 secured in step S320 to the area of the medium designated by the redundant write command (step S327), and ends the processing. The operations of the IO scheduler 104 and the medium control unit 105 when this instruction is issued are the same as the operations in step S102 described in the first embodiment.

For data related to the redundant write command that arrives at the communication unit 204 after the processing of step S325 has started, the processing sequencer 201 instructs the communication unit 204 to discard that data. For example, when the original data can be restored from N data items, the (N + 1)-th item to arrive is not needed. Data that reaches the storage 21 after the necessary data has arrived is therefore discarded.
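The counting in step S324 and the discarding of late data can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation described in this document: the class `RedundantWriteReceiver`, the method `on_fragment`, and the returned status strings are hypothetical names introduced here, and `needed` stands for the predetermined number n.

```python
class RedundantWriteReceiver:
    """Tracks the data items received for one redundant write command.

    Hypothetical helper for illustration; `needed` is the number of
    items (n) from which the original data can be restored.
    """

    def __init__(self, command_id, needed):
        self.command_id = command_id
        self.needed = needed
        self.fragments = {}      # item index -> payload
        self.restoring = False   # True once step S325 has started

    def on_fragment(self, command_id, index, payload):
        """Handle one arriving data item and report what to do next."""
        if command_id != self.command_id:
            return "ignored"     # item belongs to a different command
        if self.restoring:
            return "discarded"   # arrived after restoration began
        self.fragments[index] = payload
        if len(self.fragments) < self.needed:
            return "waiting"     # fewer than n items: back to step S322
        self.restoring = True
        return "restore"         # n items received: go to step S325
```

With `needed=3` and four transmitted items, the third distinct item to arrive triggers restoration and the fourth is discarded, whichever of the four happens to arrive last.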

Next, the processing of the restoration unit 203 is described. The restoration unit 203 has an internal buffer memory, in which it stores the data passed to it together with the identifier of the data transfer source storage 20 and the command identifier. When restoration is instructed with the identifier of the data transfer source storage 20, the command identifier, and the area in the buffer memory 106 that is to hold the restored data, the restoration unit 203 collects from its internal buffer the data matching that storage identifier and command identifier. It then restores the pre-redundancy data from that data group on the basis of the identification information attached to each data item.

For example, suppose that redundancy is provided, as in RAID 3 and RAID 5 devices, by creating one redundant data item by a parity operation from original data divided into N pieces. Suppose further that the k-th of the N divided pieces does not reach the storage 21, while the other N - 1 pieces and the one redundant data item (N data items in total) do arrive. In this case, the k-th piece can be reconstructed by choosing each of its bits so that the sum of the corresponding bits across the k-th piece and the N received data items is always odd (or always even). The original data before division is then restored from the reconstructed k-th piece and the other pieces that reached the storage 21. Because identification information (for example, a number indicating the position of the piece counted from the beginning) is attached to each divided piece, the pieces can be reassembled into the original data. If all N pieces of the original data reach the storage 21 and the redundant data does not, the original data is simply restored from the N pieces that arrived.
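The parity case above can be illustrated with a short Python sketch. It uses bitwise XOR, which corresponds to the "always even" variant of the parity rule. The function names `split`, `make_parity`, and `restore` are hypothetical, and so are the zero-padding of the last piece, the 0-based piece indices with index `n` standing for the parity item, and the passing of the original length to strip the padding; none of these details is specified in this document.

```python
from functools import reduce


def xor_bytes(a, b):
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split(data, n):
    """Divide data into n equal-size pieces, zero-padding the last one."""
    size = -(-len(data) // n)  # ceiling division
    padded = data.ljust(size * n, b"\x00")
    return [padded[i * size:(i + 1) * size] for i in range(n)]


def make_parity(pieces):
    """Create the single redundant item as the XOR of all pieces."""
    return reduce(xor_bytes, pieces)


def restore(received, n, length):
    """Rebuild the original data from any n of the n + 1 items.

    `received` maps an index to a payload: 0..n-1 are the divided
    pieces and n is the parity item (an assumed numbering).
    """
    if len(received) < n:
        raise ValueError("not enough data items to restore")
    pieces = dict(received)
    missing = [i for i in range(n) if i not in pieces]
    if missing:
        k = missing[0]
        # XOR of the other n - 1 pieces and the parity yields piece k.
        pieces[k] = reduce(
            xor_bytes, (pieces[i] for i in range(n + 1) if i != k)
        )
    return b"".join(pieces[i] for i in range(n))[:length]
```

When all N pieces arrive and only the parity item is lost, `restore` skips the reconstruction and simply concatenates the pieces, matching the last case described above.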

The restoration unit 203 stores the restored data in the specified area of the buffer memory 106. After the restoration processing ends, the area of the restoration unit 203's internal buffer that held the data used in the restoration is reused to store other data.

As already described, the redundancy method is not limited to creating one redundant data item by a parity operation. The transfer destination storage 21 only needs to be controlled so that it restores the data by a method matching the redundancy method. For example, if an ECC is used as the redundant data, restoration is performed by a method corresponding to that ECC.

Although the case of dividing the original data into N pieces and creating redundant data has been described here as an example, the redundancy unit 202 of the storage 20 may instead, in step S304, make a copy of the original data without dividing it and use the copy as the redundant data. In step S305, the original data may then be transmitted as one unit of data (for example, one packet, segment, or frame), and the redundant data, which is a copy of the original data, may be transmitted as another single unit. Because the original data and the redundant data are identical and undivided, the transfer destination storage 21 may, on receiving either one, write the received data to the storage medium 101 as it is, and may discard the copy that arrives later. Accordingly, the processing sequencer 201 of the storage 21 may proceed to step S325 as soon as it reaches step S324 for the first time. Furthermore, it may proceed to step S326 without having the restoration unit 203 perform restoration processing in step S325.

If the original data and the identical redundant data are transmitted in this way as two separate units of data, the storage 21 can receive the original data exactly as the transmission source intended to send it, even if one of the two is lost in transit.

The case shown above copies the original data without dividing it and uses the copy as the redundant data. Alternatively, when creating redundant data, the redundancy unit 202 of the storage 20 may divide the original data into N pieces and make a copy of each of the N pieces. In this case, the N copies of the divided pieces constitute the redundant data. In step S305, the original data may be transmitted as N units of data (for example, N packets, segments, or frames), and the redundant data may likewise be transmitted as N units. A total of 2 × N data items is thus sent to the storage 21. Once the storage 21 has received, out of these 2 × N items, one complete set from which the original data can be restored (each of the first through N-th pieces after division), it restores the original data from that set. That is, in step S324 the processing sequencer 201 of the storage 21 checks whether every piece from the first to the N-th has been received. If so, the process proceeds to step S325; if any piece is still missing, the process returns to step S322.

When copies are transmitted as redundant data in this way, the storage 21 can restore the data as soon as it has received every piece from the first to the N-th, even if some of the 2 × N data items are discarded in transit. The redundancy unit 202 of the storage 20 may divide the original data and then copy each resulting piece, or it may copy the undivided original data and then divide the original and the copy separately.
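The duplicated-piece scheme can be sketched in Python as follows. The function `reassemble` and the representation of arrivals as `(index, payload)` pairs with 1-based indices are assumptions made for illustration only; the sketch also assumes that every index lies between 1 and N.

```python
def reassemble(arrivals, n):
    """Rebuild the original data from duplicated pieces.

    `arrivals` yields (index, payload) pairs in arrival order, where
    index runs from 1 to n and each piece may arrive up to twice.
    Returns the original data once all n distinct pieces are present,
    or None if the stream ends with a piece still missing.
    """
    got = {}
    for index, payload in arrivals:
        got.setdefault(index, payload)  # a second copy is simply dropped
        if len(got) == n:
            # Every piece from the first to the n-th has been received.
            return b"".join(got[i] for i in range(1, n + 1))
    return None
```

Restoration completes as soon as the last distinct piece arrives, so any items still in flight at that point are redundant and can be discarded.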

According to the fifth embodiment, redundant data is created, and the original data and the redundant data are transmitted separately. Therefore, even if part of the data is lost in transit, the original data can be restored, or can be received exactly as the transmission source intended to send it. As a result, when part of the original data is lost in transit, there is no need to notify the transfer source storage 20 that the original data did not arrive and have the storage 20 retransmit it, so the data transfer time can be shortened.

In the data replication system shown in FIG. 13, the original data and the redundant data in a redundant data group are both transmitted and received via the network 13. The original data and the redundant data may instead be transmitted via separate networks. FIG. 17 is a block diagram showing a configuration example of a data replication system that does so. In the configuration example of FIG. 17, the storage 20 is connected to the storage 21 via the network 13 and also via the network 14. Apart from the storage 20 and the storage 21 being connected through the two networks 13 and 14, the configuration is the same as that shown in FIG. 13. The configuration of the storages 20 and 21 in FIG. 17 is the same as that shown in FIG. 14, except that the communication unit 204 is connected to both networks 13 and 14.

The operation of the storage 20 shown in FIG. 17 is the same as that of the storage 20 shown in FIG. 13. However, when the communication unit 204 transmits the redundant data group to the storage 21 in step S305 (see FIG. 15), it sends the original data and the redundant data via separate networks. For example, it sends the original data as one data transfer unit (for example, one packet) to the storage 21 via the network 13, and sends the redundant data, which is a copy of the original data, to the storage 21 via the network 14.

With this configuration, even if a failure occurs in one of the networks 13 and 14, or one of them has a low transfer rate, the original data or the identical redundant data can still be delivered via the other network. Fault tolerance is therefore further improved when a copy of the original data is used as the redundant data. Even when the original data is divided into N pieces to create redundant data, the divided original data and the redundant data may be transmitted via different networks.

Embodiment 6.
The fifth embodiment implements backup of the data in the storage 20, but the data in the storage 20 may instead be transferred to the storage 21 by mirroring. FIG. 18 is a flowchart showing the operation of the processing sequencer 201 in the storage controller 200 of the storage 20 in the sixth embodiment, that is, when mirroring is performed. The configuration of the data replication system and the configurations of the storages 20 and 21 are the same as in the fifth embodiment (see FIGS. 13 and 14). In the storage 20, the operations of the communication unit 204, the IO scheduler 104, and the medium control unit 105 are also the same as in the fifth embodiment.

The storage 20 starts mirroring when it receives a write command from the host 10. When the communication unit 204 of the storage 20 receives a write command from the host 10, it passes the command to the processing sequencer 201. As shown in FIG. 18, the processing sequencer 201 then secures in the buffer memory 106 the area needed to store the data to be received from the host 10 (step S340). It also instructs the communication unit 204 to send a ready notification to the host 10 (step S341), and the communication unit 204 sends the notification accordingly.

The processing sequencer 201 then waits for data to arrive from the host 10 (step S342). When data arrives and the communication unit 204 asks which area of the buffer memory 106 the data should be stored in, the processing sequencer 201 informs the communication unit 204 of the area secured in step S340 (step S343). Next, it instructs the communication unit 204 to send a redundant write command to the storage 21 (step S344), and the communication unit 204 sends the redundant write command to the storage 21 accordingly.

Next, the processing sequencer 201 waits until the data from the host 10 has been completely stored in the buffer memory 106 (step S345). When the communication unit 204 reports that all the data has been stored in the buffer memory 106, the processing sequencer 201 registers with the IO scheduler 104 the type of processing (writing, in this case), the identification ID of the processing, information indicating the target area in the storage medium 101, and information indicating the target area in the buffer memory 106. In response to this write request, the IO scheduler 104 instructs the medium control unit 105 to write the data. The medium control unit 105 writes the data from the buffer memory 106 to the storage medium 101 in accordance with the registered contents (step S346).

The processing sequencer 201 then waits for the storage 21 to send a ready-to-receive message, which is the response to the redundant write command sent in step S344 (step S347). When the communication unit 204 reports that the ready-to-receive message from the storage 21 has been received, the processing sequencer 201 instructs the redundancy unit 202 to make the data stored in the buffer memory 106 redundant. The redundancy unit 202 does so in the same way as in the fifth embodiment (step S348). In step S348, any of the following may be used: the original data may be divided into N pieces and redundant data created from the pieces; the original data may be left undivided and a copy of it used as the redundant data; each piece of the divided original data may be copied and the copies used as the redundant data; or the original data may be copied and the original and the copy each divided.

Subsequently, the processing sequencer 201 instructs the communication unit 204 to transmit the data group made redundant by the redundancy unit 202 to the storage 21 (step S349). In response, the communication unit 204 transmits the redundant data group (the divided original data and the redundant data) to the storage 21. The processing sequencer 201 then waits for both a reception-complete message from the storage 21 and a write-complete notification from the medium control unit 105 (step S350). When the communication unit 204 has reported receipt of the reception-complete message from the storage 21 and the medium control unit 105 has reported write completion, the processing sequencer 201 notifies the host 10 of completion (step S351) and ends the processing. The operation of the storage 21 when it receives the redundant data group from the storage 20 is the same as in the fifth embodiment.
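The completion condition of steps S350 and S351 (report to the host only after both the local write and the remote reception have finished) can be sketched with Python's `asyncio`. This is a simplified illustration, not the flowchart of FIG. 18: the function `mirror_write` and the coroutine parameters `write_local` and `send_remote` are hypothetical, and the handshake of steps S340 to S345 and S347 is omitted.

```python
import asyncio


async def mirror_write(data, write_local, send_remote):
    """Run the local write and the remote transfer concurrently.

    `write_local` and `send_remote` are caller-supplied coroutine
    functions (hypothetical stand-ins for the medium control unit
    and the communication unit).
    """
    # Corresponds to step S346: start the local write without blocking.
    local_write = asyncio.create_task(write_local(data))
    # Corresponds to steps S348-S349: the remote transfer runs meanwhile.
    remote_ack = asyncio.create_task(send_remote(data))
    # Corresponds to step S350: wait for both completion events.
    await asyncio.gather(local_write, remote_ack)
    # Corresponds to step S351: only now report completion to the host.
    return "complete"
```

Because the two operations overlap, the time until the host is notified is governed by the slower of the two rather than by their sum.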

As in the case shown in FIG. 17, the storage 20 and the storage 21 may be connected by the networks 13 and 14. In that case, when the communication unit 204 transmits the redundant data group in step S349, it may send the original data and the redundant data via separate networks.

In the data replication systems of the first to fourth embodiments, a redundant data group may be transmitted and received as described in the fifth or sixth embodiment when the storage and the relay device exchange data. In that case, the first to fourth embodiments also obtain the same effects as the fifth or sixth embodiment.

In the fifth and sixth embodiments, the data transfer processing means is realized by the processing sequencer 201 and the communication unit 204. The redundancy means is realized by the redundancy unit 202. The restoration means is realized by the restoration unit 203. The storage processing means is realized by the processing sequencer 201, the IO scheduler 104, and the medium control unit 105.

Embodiment 7.
FIG. 19 is a block diagram showing a seventh embodiment of the data replication system according to the present invention. In the data replication system shown in FIG. 19, the storage 301 is locally connected to the host (higher-level device) 300 that uses it, and the storage 302 is locally connected to the host 303 that uses it. The storage 301 is connected to the storage 302 via the network 13. The host 300 and the host 303 are also connected so that they can communicate with each other, preferably by a dedicated line, although a network other than a dedicated line (for example, the Internet) may be used.

The storages 301 and 302 are each, for example, a single magnetic disk device, optical disk device, or magneto-optical disk device. A disk array device, which is a set of such magnetic disk devices, optical disk devices, or magneto-optical disk devices, may also be used as the storage 301 or 302. The hosts 300 and 303 are connected to the storages 301 and 302 by SCSI, Fibre Channel, Ethernet (registered trademark), or the like. In the system shown in FIG. 19, the host 300 is the normal (active) host, which operates while no failure has occurred in the system, and the host 303 is the standby host, which operates when a failure occurs in the host 300.

The hosts 300 and 303 perform processing in accordance with application programs (hereinafter, applications) that they themselves hold. These applications process the data in the storages 301 and 302. For example, when the storages 301 and 302 store data such as the deposit balances of a bank's customers, the hosts 300 and 303 update the data in the storages 301 and 302 in accordance with a deposit data management application. Although it is the host that actually performs the operations, they are described below as operations of the application.

In this embodiment, the data in the normal storage 301 is mirrored to the standby storage 302 located at a remote site. However, at the moment the normal host 300 writes data to the storage 301, that data is not written to the storage medium of the standby storage 302; it is instead held in the synchronization buffer memory of the standby storage 302. The data in the synchronization buffer memory of the standby storage 302 is then written to the prescribed storage medium at a timing specified by the host 300. For example, suppose that data A and B must both have been written to the storage before the host can start processing X. When the normal host 300 writes data A to the storage 301, the standby storage 302 only holds data A in its synchronization buffer memory and does not write it to its storage medium. After the host 300 has written data B to the storage 301, the host 300 specifies the timing for writing to the storage medium of the standby storage 302, and at that timing the standby storage 302 writes data A and B from the synchronization buffer memory to the storage medium.

A timing at which the application can resume operation directly from the data in its current state is referred to as a resumable point. In other words, a resumable point is a timing at which the written data is in a state from which the application can resume processing. In the example above, the interval from when data A and B have been written to the normal storage 301 until the next data is written is a resumable point at which processing X can be started.

FIG. 20 is a block diagram showing a configuration example of the host 300. One or more applications run on the host 300; two applications 310a and 310b are shown here as an example. The applications 310a and 310b access the data in the storage 301 through the IO management unit 311. The IO management unit 311 includes a resumable point notification unit 312 for notifying the storage 301 of resumable points. Under the direction of the applications 310a and 310b, the resumable point notification unit 312 performs, at each resumable point, processing that informs the storage 301 that a resumable point has been reached (resumable point notification processing). A host monitoring unit 313 that monitors the state of the host 300 is also provided. The configuration of the host 303 is the same as that of the host 300.

The applications 310a and 310b in this embodiment have a resume function. That is, they can resume processing provided that the data recorded on the storage medium 101 of the storage is in a prescribed state.

FIG. 21 is a block diagram showing a configuration example of the storage 301 shown in FIG. 19. The storage 302 has the same configuration. As shown in FIG. 21, the storage 301 includes a storage controller 320 and a storage medium 101, which is the storage proper. The storage controller 320 includes: a communication unit 322 that communicates with the host 300 and other storages; a processing sequencer 321 that manages the sequence of each process; an IO scheduler 104 that controls the order of processing commands for the storage medium 101; a medium control unit 105 that controls the operation of the storage medium 101 in accordance with the processing commands issued by the IO scheduler 104; a buffer memory 106 that temporarily stores data going from the host 300 to the storage medium 101 and from the storage medium 101 to the host 300; and a synchronization buffer memory 323 that temporarily holds data sent from other storages. The processing sequencer 321 is realized by, for example, a CPU that operates in accordance with a program. The operations of the IO scheduler 104 and the medium control unit 105 are the same as those in the first embodiment.

A semiconductor memory may be used as the synchronization buffer memory 323, or a larger-capacity storage device such as a magnetic disk device, optical disk device, or magneto-optical disk device may be used.

When the applications 310a and 310b read data from the storage 301, the host 300 requests the data from the storage 301. The processing sequencer 321 of the storage 301 has the requested data copied from the storage medium 101 to the buffer memory 106 by way of the IO scheduler 104 and related units, and then sends the data to the host 300.

Next, an outline of the operation when the host 300 writes data to the storage 301 is given. The host 300 causes the storage 301 to write data by issuing a write command to it. When a resumable point is reached after the write command has been issued one or more times, the host 300 issues a resumable point notification command to the storage 301 to report that a resumable point has been reached.

On receiving a write command, the storage 301 writes the data to the storage medium 101 in accordance with the command. It also issues a delayed write command to the storage 302 so that the data is held in the synchronization buffer memory 323 of the standby storage 302. A delayed write command instructs the recipient to hold the data destined for the storage medium 101 in the synchronization buffer memory 323 and to write it to the storage medium 101 when a delayed data reflection command, described below, arrives.

On receiving a resumable point notification command, the storage 301 sends a delayed data reflection command (a delayed write execution request) to the storage 302. A delayed data reflection command instructs the recipient to write the data held in the synchronization buffer memory to the storage medium 101. When the standby storage 302 receives a delayed data reflection command, it writes the data stored in the synchronization buffer memory 323 to the storage medium 101. This keeps the storage medium 101 of the storage 302 in a state from which processing can always be started.
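The standby-side handling of the two commands can be sketched in Python as follows. The class `StandbyStorage`, its method names, and the modeling of the storage medium as a dictionary keyed by address are assumptions made for illustration; the actual command format is not given in this passage.

```python
class StandbyStorage:
    """Hypothetical model of the standby storage 302."""

    def __init__(self):
        self.medium = {}        # address -> data on the storage medium
        self.sync_buffer = []   # pending (address, data), in arrival order

    def delayed_write(self, address, data):
        """Hold the data in the synchronization buffer only."""
        self.sync_buffer.append((address, data))

    def reflect_delayed_data(self):
        """Write everything held in the buffer to the medium, in order."""
        for address, data in self.sync_buffer:
            self.medium[address] = data
        self.sync_buffer.clear()
```

Between two calls to `reflect_delayed_data`, the medium always reflects the state at the last resumable point, which is what allows the standby application to start from it.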

The storage 301 adds a synchronization ID and an issue ID to each delayed write command and each delayed data reflection command before transmitting it. FIG. 22 is an explanatory diagram showing an example of synchronization IDs and issue IDs. The storage 301 updates the synchronization ID each time it receives a resumable point notification command from the host 300. Accordingly, the same synchronization ID is added to all delayed write commands output between one resumable point and the next. However, the storage 301 adds the synchronization ID as it was immediately before the update to the delayed data reflection command that it issues on receiving the resumable point notification command, and adds the updated synchronization ID to the delayed write commands issued thereafter. In the example shown in FIG. 22, the updated synchronization IDs “24” and “25” are added to the delayed write commands, and the pre-update synchronization IDs “23” and “24” are added to the delayed data reflection commands output immediately before them.

The storage 301 also updates the issue ID each time it issues a delayed write command or a delayed data reflection command (that is, each time it creates and transmits one of these commands). FIG. 22 shows an example in which the issue ID increases by 1 with each delayed write command or delayed data reflection command.
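
The ID assignment described above can be illustrated with a minimal Python sketch. The names `Command`, `IdTagger`, `delayed_write` and `resumable_point` are hypothetical and do not appear in the specification, and updating the synchronization ID by adding 1 is an assumption (the text only states that the ID is updated).

```python
from dataclasses import dataclass


@dataclass
class Command:
    kind: str       # "delayed_write" or "reflect" (delayed data reflection)
    sync_id: int    # synchronization ID
    issue_id: int   # issue ID


class IdTagger:
    """Hypothetical helper that tags commands on the normal storage side."""

    def __init__(self, sync_id: int, issue_id: int):
        self.sync_id = sync_id
        self.issue_id = issue_id

    def _take_issue_id(self) -> int:
        current = self.issue_id
        self.issue_id += 1  # issue ID grows by 1 per command, as in the FIG. 22 example
        return current

    def delayed_write(self) -> Command:
        # Every delayed write between two resumable points shares one sync ID.
        return Command("delayed_write", self.sync_id, self._take_issue_id())

    def resumable_point(self) -> Command:
        # The reflection command carries the sync ID from just before the update.
        saved = self.sync_id
        self.sync_id += 1  # assumption: "update" means incrementing by 1
        return Command("reflect", saved, self._take_issue_id())
```

With a tagger started at synchronization ID 24 and issue ID 72, four delayed writes receive issue IDs 72 to 75 with synchronization ID 24, the following reflection command receives issue ID 76 with synchronization ID 24, and the next delayed write receives synchronization ID 25.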

On receiving a write command from the host 300, the storage 301 transmits a delayed write command to the storage 302 and then transmits the data corresponding to that delayed write command. The order in which the data is transmitted need not match the order in which the delayed write commands were issued. For example, suppose that the storage 301 issues a delayed write command for writing data p and q, and then issues the next delayed write command for writing data r. In this case, the order in which the storage 301 transmits the data is not limited to p, q, r; transmission of data r may start before transmission of data p and q is complete. However, when transmitting a delayed data reflection command to the storage 302, the storage 301 first completes transmission of the data for all delayed write commands issued up to that point and only then transmits the delayed data reflection command.

The storage 301 also adds to each delayed write command the location where the data specified by that command is to be written (offset address, sector number, block number, or the like) and the size of the data.

Next, the operation when the applications 310a and 310b write data to the storage 301 will be described. FIG. 23 is a flowchart showing the operation of the processing sequencer 321 in the storage controller 320.

When writing data to the storage 301, the applications 310a and 310b output a write command to the communication unit 322 in the storage controller 320 of the storage 301. On receiving the write command, the communication unit 322 passes it to the processing sequencer 321 and instructs the processing sequencer 321 to start write processing.

On receiving the write command, the processing sequencer 321 secures, in the buffer memory 106, the area needed to store the data sent from the host 300, as shown in FIG. 23 (step S400). It then instructs the communication unit 322 to send a ready notification to the host 300 (step S401). In response, the communication unit 322 transmits the ready notification to the host 300.

Next, the processing sequencer 321 waits for data to arrive from the host 300 (step S402). When the data arrives and the communication unit 322 asks which area of the buffer memory 106 the data should be stored in, the processing sequencer 321 informs the communication unit 322 of the area secured in step S400 (step S403). It also instructs the communication unit 322 to transmit a delayed write command, with the synchronization ID specified, to the storage 302 (step S404). In response, the communication unit 322 transmits the delayed write command to the storage 302. As already described, a synchronization ID and an issue ID are added to the delayed write command. The same synchronization ID is added to every delayed write command output between one resumable point and the next, and the issue ID is updated each time a delayed write command or a delayed data reflection command is output (see FIG. 22).

The processing sequencer 321 then waits for storage of the data from the host 300 in the buffer memory 106 to complete (step S405). When the communication unit 322 notifies it that all the data has been stored in the buffer memory 106, the processing sequencer 321 registers with the IO scheduler 104 the type of processing (writing, in this case), the identification ID of the processing, information indicating the target area in the storage medium 101, and information indicating the target area in the buffer memory 106. In response to this write request, the IO scheduler 104 instructs the medium control unit 105 to write the target data. The medium control unit 105 writes the data from the buffer memory 106 to the storage medium 101 in accordance with the registered contents (step S406).

The processing sequencer 321 then waits for a ready message (the response to the delayed write command) from the storage 302 (step S407). When the communication unit 322 notifies it that the ready message from the storage 302 has been received, the processing sequencer 321 causes the communication unit 322 to transmit the data stored in the buffer memory 106 to the storage 302 (step S408). It then waits for both a reception completion message from the storage 302 and a write completion notification from the medium control unit 105 (step S409). When the communication unit 322 has reported receipt of the reception completion message from the storage 302 and the write completion notification has arrived from the medium control unit 105, the processing sequencer 321 notifies the host 300 of completion (step S410) and ends the processing.
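
The key point of steps S406 to S410 is that completion is reported to the host only after both the local write and the transfer to the standby storage have finished. The following Python sketch illustrates this with `asyncio`. The function `handle_write` and its two callback parameters are hypothetical stand-ins for the medium control unit 105 and the communication unit 322, and the buffer handling of steps S400 to S405 is omitted.

```python
import asyncio


async def handle_write(write_local, send_to_standby, data: bytes) -> str:
    """Sketch of steps S406-S410 on the normal storage.

    write_local and send_to_standby are hypothetical coroutines:
    the first writes to the local storage medium, the second sends the
    delayed write command and its data and returns once the standby
    storage has reported reception complete.
    """
    local = asyncio.create_task(write_local(data))       # S406
    remote = asyncio.create_task(send_to_standby(data))  # S404, S407, S408
    await asyncio.gather(local, remote)                  # S409: wait for both
    return "complete"                                    # S410: notify the host
```

Because the two tasks run concurrently, the local write need not finish before the transfer starts, which matches the order-independence described for data transmission.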

The order in which data is transmitted in step S408 may differ from the order in which the corresponding delayed write commands were issued. Suppose that the storage 301 receives write commands from the host 300 in succession and issues delayed write commands in succession. In this case, the storage 301 may start transmitting the data for a later delayed write command (step S408) before it has finished transmitting the data for an earlier delayed write command (step S408).

Next, the processing performed when the storage 302 receives a delayed write command will be described. FIG. 24 is a flowchart showing the operation of the processing sequencer 321 in the storage controller 320 of the storage 302 when it receives a delayed write command.

The processing sequencer 321 first secures, in the synchronization buffer memory 323, the area needed for the data to be received (step S420), and instructs the communication unit 322 to send a reception ready notification to the source of the request (the storage 301 in this example). In response, the communication unit 322 sends a reception ready message to the source (step S421). Next, the processing sequencer 321 waits for the data to arrive (step S422). When the data arrives and the communication unit 322 asks where to store it, the processing sequencer 321 notifies the communication unit 322 of the area of the synchronization buffer memory 323 secured in step S420 (step S423). It then waits until the data from the source of the request has been completely stored in the synchronization buffer memory 323 (step S424). When storage is complete, it instructs the communication unit 322 to notify the source of the request that the processing based on the delayed write command is complete. In response, the communication unit 322 sends this notification to the source of the request (step S425). The processing sequencer 321 then ends the processing.

When data has the same synchronization ID as data already stored in the synchronization buffer memory 323 and the areas to be written overlap, even if only partially, the processing sequencer 321 performs control so that the overlapping portion already stored in the synchronization buffer memory 323 is not written to the storage medium 101. That is, for data written to the same area, control is performed so that the data from the later delayed write command takes effect.

For example, suppose that a delayed write command instructing that data be written to a certain area of the storage medium 101 arrives at the storage 302, followed by another delayed write command instructing a write to the same area, and that the two delayed write commands have the same synchronization ID. This means that, in the normal storage 301, the data was written once and then overwritten between two resumable points. Accordingly, of the data written to the same area, the processing sequencer 321 need not keep the data written first in the synchronization buffer memory 323 of the standby storage 302. On newly receiving a delayed write command from the storage 301, the processing sequencer 321 determines whether the synchronization buffer memory 323 contains data that will be overwritten. If it identifies such data, it deletes that data from the synchronization buffer memory 323.
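
The deletion of overwritten data can be sketched as follows in Python. This is a simplified illustration, not the specification's implementation: `BufferedWrite` and `store_delayed_write` are hypothetical names, the synchronization buffer memory is modeled as a plain list, only entries whose area is fully covered by the new write are deleted (partial overlaps are left in place), and using the issue ID to decide which write is the later one is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BufferedWrite:
    sync_id: int
    issue_id: int
    offset: int   # write location on the storage medium
    data: bytes


def store_delayed_write(buffer: list, new: BufferedWrite) -> None:
    """Add `new` to the buffer, dropping older same-sync-ID data it overwrites."""
    new_end = new.offset + len(new.data)

    def overwritten(old: BufferedWrite) -> bool:
        return (
            old.sync_id == new.sync_id          # same resumable-point interval
            and old.issue_id < new.issue_id     # assumption: lower issue ID = earlier write
            and new.offset <= old.offset
            and old.offset + len(old.data) <= new_end  # fully covered by the new write
        )

    buffer[:] = [w for w in buffer if not overwritten(w)]
    buffer.append(new)
```

Data with a different synchronization ID is never deleted here, since it belongs to a different interval between resumable points and must still be reflected separately.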

Next, the resumable point notification processing, in which the applications 310a and 310b notify the storage 301 of a resumable point, will be described. Here, the case where the application 310a performs the resumable point notification processing is taken as an example.

To give notification of a resumable point, the application 310a instructs the IO management unit 311 to do so. When the application 310a instructs the IO management unit 311 to give notification of a resumable point, the resumable point notification unit 312 issues a resumable point notification command to the storage 301.

When a resumable point notification command arrives at the storage 301, it is input to the communication unit 322 in the storage controller 320 of the storage 301. On receiving the resumable point notification command, the communication unit 322 passes it to the processing sequencer 321 and instructs it to start the resumable point notification processing.

FIG. 25 is a flowchart showing the operation of the processing sequencer 321. The processing sequencer 321 first saves the current value of its internal synchronization ID and then updates the synchronization ID (step S440). Saving the value of the synchronization ID means, for example, storing it in a register. Next, it instructs the communication unit 322 to transmit to the storage 302 a delayed data reflection command specifying the synchronization ID saved in step S440. In response, the communication unit 322 transmits the delayed data reflection command to the storage 302 (step S441). The processing sequencer 321 then waits for a message from the storage 302 indicating completion of the processing based on the delayed data reflection command (step S442). When the communication unit 322 reports that the completion message from the storage 302 has arrived, the processing sequencer 321 notifies the host 300, via the communication unit 322, that the processing based on the resumable point notification command is complete (step S443), and ends the processing.

Note that, as shown in FIG. 22, each delayed write command output after a delayed data reflection command has been transmitted carries the updated synchronization ID, not the synchronization ID saved in the register or the like.

Next, the operation of the storage 302 on receiving a delayed data reflection command will be described. When a delayed data reflection command arrives at the storage 302, it is input to the communication unit 322 of the storage controller 320 of the storage 302. On receiving the delayed data reflection command, the communication unit 322 passes it to the processing sequencer 321 and instructs it to start the delayed data reflection processing.

FIG. 26 is a flowchart showing the delayed data reflection processing performed by the processing sequencer 321 in the storage controller 320 of the storage 302. The processing sequencer 321 determines whether the synchronization ID added to the delayed data reflection command that has arrived is the value following that of the previously processed delayed data reflection command (step S460). If it is not, the processing sequencer 321 waits for the arrival of the delayed data reflection command whose synchronization ID is the value following the synchronization ID of the previously processed delayed data reflection command. For example, suppose that, in the example shown in FIG. 22, the previously processed delayed data reflection command had the synchronization ID “23”. If the delayed data reflection command with the synchronization ID “25” is then received, the processing sequencer 321 waits for the delayed data reflection command with the synchronization ID “24” to arrive. After performing steps S461 to S468 for the delayed data reflection command with the synchronization ID “24”, it performs step S461 and the subsequent steps for the delayed data reflection command with the synchronization ID “25”.
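
The ordering check of step S460 can be sketched in Python as follows. `ReflectOrderer` and `on_reflect` are hypothetical names, and the sketch assumes that the "next value" of a synchronization ID is the previous value plus 1. For brevity it advances the last processed ID as soon as a command is released, whereas in the flow described here that information is updated when processing of the command finishes.

```python
class ReflectOrderer:
    """Releases delayed data reflection commands in synchronization ID order."""

    def __init__(self, last_sync_id: int):
        self.last_sync_id = last_sync_id  # sync ID of the last processed command
        self.waiting = set()              # sync IDs that arrived too early

    def on_reflect(self, sync_id: int) -> list:
        """Record an arrival and return the sync IDs that can now be processed."""
        self.waiting.add(sync_id)
        ready = []
        # Assumption: the "next value" is last_sync_id + 1.
        while self.last_sync_id + 1 in self.waiting:
            self.last_sync_id += 1
            self.waiting.remove(self.last_sync_id)
            ready.append(self.last_sync_id)
        return ready
```

Starting from a last processed synchronization ID of 23, an arrival of 25 releases nothing, and a subsequent arrival of 24 releases 24 and then 25, in that order.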

If the synchronization ID is the value following the previous synchronization ID, the processing sequencer 321 checks whether the data for every delayed write command whose issue ID lies between the stored issue ID of the previously processed delayed data reflection command and the issue ID added to the delayed data reflection command now being processed has been recorded in the synchronization buffer memory 323 (step S461). For example, suppose that, among the commands shown in FIG. 22, the previously processed delayed data reflection command has the issue ID “71” and the delayed data reflection command now being processed has the issue ID “76”. In this case, the processing sequencer 321 checks whether the data for the delayed write commands with issue IDs “72” to “75” has all been recorded in the synchronization buffer memory 323. Data that was deleted because it was to be overwritten need not be included in this check. If any data is missing, the processing sequencer 321 waits until the data for every such delayed write command has arrived at the standby storage 302 and been recorded in the synchronization buffer memory 323.
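
The completeness check of step S461 amounts to a set difference over issue IDs. The Python sketch below is illustrative only: `missing_issue_ids` is a hypothetical function, and it assumes that issue IDs increase by 1 and that the IDs to check are those strictly between the two reflection commands, since the previous reflection command's own issue ID does not belong to a delayed write command.

```python
def missing_issue_ids(prev_reflect_id: int, this_reflect_id: int,
                      buffered_ids, overwritten_ids=()) -> list:
    """Return issue IDs whose data has not yet reached the synchronization buffer.

    buffered_ids: issue IDs of delayed writes whose data is in the buffer.
    overwritten_ids: issue IDs whose data was deleted as overwritten; these
    are excluded from the check.
    """
    expected = set(range(prev_reflect_id + 1, this_reflect_id))
    return sorted(expected - set(buffered_ids) - set(overwritten_ids))
```

For the example above, `missing_issue_ids(71, 76, {72, 73, 75})` reports that the data for issue ID 74 is still outstanding, so reflection must wait. If 74 had been deleted as overwritten, the result would be empty and reflection could proceed.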

In step S462, the processing sequencer 321 searches the synchronization buffer memory 323 for a delayed write command whose synchronization ID matches the one specified by the delayed data reflection command and whose data is not currently being written. For example, when the delayed data reflection command with the issue ID “76” shown in FIG. 22 has been received and is being processed, the processing sequencer 321 searches for a delayed write command that matches the synchronization ID “24” added to that command and is not currently being written. The result of the search is then evaluated (step S463). If no such command is found, there is no delayed write command whose data has yet to start being written to the storage medium 101, and the processing proceeds to step S465. If such a command is found, there is a delayed write command whose data has yet to start being written to the storage medium 101, and the processing proceeds to step S464.

In step S464, the processing sequencer 321 registers with the IO scheduler 104 an instruction (write request) to write the corresponding data in the synchronization buffer memory 323 to the location (offset address) on the storage medium 101 specified by the delayed write command found in the search of step S462. In response to the write request, the IO scheduler 104 instructs the medium control unit 105 to write the target data. In accordance with the write instruction, the medium control unit 105 outputs the data from the synchronization buffer memory 323 to the storage medium 101. The processing sequencer 321 then marks the found area in the synchronization buffer memory 323 as being written and returns to step S462.

In step S465, the processing sequencer 321 determines whether any data output processing from the synchronization buffer memory 323 to the storage medium 101 is in progress. If so, the processing proceeds to step S466; if all such processing has already completed, it proceeds to step S468. In step S466, the processing sequencer 321 waits until one of the data output operations to the storage medium 101 finishes, marks the area of the synchronization buffer memory 323 that was the target of the completed operation as unused (step S467), and returns to step S465. In step S468, it instructs the communication unit 322 to notify the source of the delayed data reflection command (the storage 301 in this example) that the processing based on the delayed data reflection command is complete. In response, the communication unit 322 sends the completion notification to the source of the delayed data reflection command. The processing sequencer 321 also updates the synchronization ID and issue ID that it holds as information on the previously processed delayed data reflection command to those of the command just processed, and ends the processing.
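
The overall effect of the reflection loop (search, write, free the buffer area) can be condensed into a synchronous Python sketch. This deliberately ignores the asynchronous IO scheduler and the "being written" state. `reflect` is a hypothetical function, buffer entries are modeled as dictionaries, the storage medium is modeled as a `bytearray` assumed large enough for every write, and applying writes in issue ID order is an assumption.

```python
def reflect(buffer: list, medium: bytearray, sync_id: int) -> int:
    """Write all buffered data with the given sync ID to the medium.

    Each buffer entry is a dict with keys "sync_id", "issue_id",
    "offset" and "data". Returns the number of entries written.
    """
    targets = sorted(
        (w for w in buffer if w["sync_id"] == sync_id),
        key=lambda w: w["issue_id"],  # assumption: apply in issue order
    )
    for w in targets:
        end = w["offset"] + len(w["data"])
        medium[w["offset"]:end] = w["data"]  # write to the specified location
    # Mark the written areas as unused by removing them from the buffer.
    buffer[:] = [w for w in buffer if w["sync_id"] != sync_id]
    return len(targets)
```

Entries with a different synchronization ID stay in the buffer untouched, because their delayed data reflection command has not been processed yet.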

Next, the operation when a disaster occurs will be described. When the host 300 becomes unusable because of a disaster or the like, the storage 302 is switched from the standby system to the normal system, and the host 303 takes over processing from the host 300.

The operation of the host 303 from failure detection to resumption of the applications is as follows. First, the host monitoring unit 313 in the host 303 detects an abnormality in the host 300. On detecting the abnormality, the host monitoring unit 313 issues a delayed data discard command to the storage 302, which is the standby system. It then waits for a response to the delayed data discard command and, on receiving the response, causes the applications 310a and 310b of the standby host 303 to run. Thereafter, the host 303 writes data to and reads data from the storage 302.

The host monitoring unit 313 in the host 303 communicates with the host 300 constantly or periodically. When the host monitoring unit 313 has been unable to communicate with the host 300 for a certain period of time, or when the host 300 reports an abnormality, the host monitoring unit 313 recognizes that a disaster has occurred at the host 300. So that the host 303 can reliably recognize an abnormality in the host 300, the host 303 and the host 300 are preferably connected by a dedicated line.
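
The two detection conditions described above (no communication for a certain period, or an abnormality report) can be sketched in Python as follows. `HostMonitor` and its methods are hypothetical names, and timestamps are passed in explicitly as seconds so that the sketch does not depend on a real clock or network.

```python
class HostMonitor:
    """Hypothetical model of the failure detection in the host monitoring unit."""

    def __init__(self, timeout_s: float, now: float):
        self.timeout_s = timeout_s        # the "certain period of time"
        self.last_seen = now              # time of the last successful communication
        self.reported_abnormal = False

    def on_message(self, now: float, abnormal: bool = False) -> None:
        # Called whenever communication with the monitored host succeeds.
        self.last_seen = now
        if abnormal:
            self.reported_abnormal = True

    def peer_failed(self, now: float) -> bool:
        # Failure is recognized on an abnormality report or on a timeout.
        return self.reported_abnormal or (now - self.last_seen) > self.timeout_s
```

When `peer_failed` becomes true, the standby host would go on to issue the delayed data discard command and then start the applications, as described above.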

Next, the operation when the standby storage 302 receives a delayed data discard command from the standby host 303 will be described. When the delayed data discard command arrives at the storage 302, it is input to the communication unit 322 in the storage controller 320 of the storage 302. On receiving the delayed data discard command, the communication unit 322 passes it to the processing sequencer 321 and instructs it to start the delayed data discard processing.

FIG. 27 is a flowchart showing the delayed data discard processing executed by the processing sequencer 321. The processing sequencer 321 searches the synchronization buffer memory 323 (step S480). In step S480, it searches for data that need not be moved to the storage medium 101 because the corresponding delayed data reflection command has not yet arrived. If the search finds no such data, the processing proceeds to step S483; if it finds such data, the processing proceeds to step S482 (step S481). For example, if, among the commands shown in FIG. 22, the delayed data reflection command with the issue ID “76” has not been received and the data for some of the delayed write commands with issue IDs “72” to “75” is stored in the synchronization buffer memory 323, the processing proceeds to step S482. If the delayed data reflection command with the issue ID “76” has been received and the data for each delayed write command is being moved to the storage medium 101, the processing proceeds to step S483.

In step S482, the processing sequencer 321 marks the area of the synchronization buffer memory 323 holding the data found in step S480 as unused and returns to step S480. The next pass through step S481 then proceeds to step S483. In step S483, the processing proceeds to step S484 if delayed data reflection processing is in progress, and to step S485 if it is not.

In step S484, the processing sequencer 321 waits for the delayed data reflection processing in progress to complete and then proceeds to step S485. In step S485, it instructs the communication unit 322 to notify the issuer of the delayed data discard command (the host 303 in this example) that the processing based on the delayed data discard command is complete. In response, the communication unit 322 sends the completion notification to the issuer of the delayed data discard command, and the processing sequencer 321 ends the processing.
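
The discard step can be sketched in Python as a filter over the synchronization buffer. `discard_delayed_data` is a hypothetical function, buffer entries are modeled as dictionaries, and identifying the data that belongs to a reflection in progress by its synchronization ID is an assumption made for this illustration. Waiting for that reflection to complete (step S484) is not modeled.

```python
def discard_delayed_data(buffer: list, reflecting_sync_id=None) -> int:
    """Free buffered data whose reflection command has not arrived.

    reflecting_sync_id: sync ID of a delayed data reflection currently in
    progress, or None if there is none. Data with that sync ID is kept so
    that the reflection in progress can finish.
    Returns the number of entries discarded.
    """
    kept = [w for w in buffer if w["sync_id"] == reflecting_sync_id]
    discarded = len(buffer) - len(kept)
    buffer[:] = kept  # everything else is marked unused
    return discarded
```

After this step the storage medium reflects only data up to the last resumable point whose reflection command was received, which is the state from which the standby host resumes the applications.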

In the present embodiment, when the normal host 300 outputs a write command to the storage 301, the standby storage 302 receives the written data from the storage 301 but records it in the synchronization buffer memory 323 rather than in the storage medium 101. The storage 302 moves that data from the synchronization buffer memory 323 to the storage medium 101 when it receives the delayed data reflection command that the storage 301 issues in response to a resumable point notification command, and marks the synchronization buffer memory 323 as unused when it receives a delayed data discard command. Accordingly, the storage medium 101 of the storage 302 is always kept in a state from which the processing of the applications 310a and 310b can be resumed. As a result, when an abnormality occurs in the host 300, the host 303 can continue the processing immediately.

In addition, the normal storage 301 does not transmit all the data written to the storage medium 101 between one resumable point and the next to the standby storage 302 at once; instead, it transmits the data each time it receives a write command. Because a large amount of data is never transmitted at one time, the time required for data transfer with the storage 302 is short.

Furthermore, each command that the standby storage 302 receives from the storage 301 is managed by its synchronization ID and issue ID, so the data sent by the normal storage between two resumable points may arrive in any order. Because no restriction is placed on the order in which delayed write commands and their corresponding data are transmitted, the system is easier to design.

This embodiment has described the case where both the synchronization ID and the issue ID are added to the delayed write command and the delayed data reflection command, but only the issue ID may be added. In that case, when the processing sequencer 321 of the standby storage 302 receives a delayed data reflection command, it performs the following processing instead of steps S460 and S461. The processing sequencer 321 checks whether it has received every command whose issue ID lies between the issue ID of the previously processed delayed data reflection command and the issue ID of the received delayed data reflection command. If all of those commands have been received, it performs the processing from step S462 (see FIG. 26) onward. If not, it starts the processing from step S462 onward once all of them have been received.

For example, suppose that, among the commands shown in FIG. 22 (assuming no synchronization ID is added), the delayed data reflection command with issue ID "71" is the one processed last time. If the delayed data reflection command with issue ID "76" is then received, the processing sequencer 321 checks whether the commands with issue IDs "72" to "75" have been received, and starts the processing from step S462 onward after all of them have been received. If the delayed data reflection command with issue ID "80" is received before the one with issue ID "76", the processing sequencer 321 checks whether the commands with issue IDs "72" to "79" have been received. When all commands with issue IDs "72" to "76" have been received, it processes the delayed data reflection command with issue ID "76". Then, when all commands with issue IDs "77" to "79" have been received, it processes the delayed data reflection command with issue ID "80".
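The issue-ID check in this example can be sketched as below. This is a hedged illustration only: the class `ReflectionGate` and its fields are hypothetical names, and the sketch assumes that issue IDs are consecutive integers and that a reflection command is processed as soon as every issue ID after the previously processed one, up to and including its own, has arrived.

```python
class ReflectionGate:
    """Hypothetical tracker for delayed data reflection commands keyed by issue ID."""

    def __init__(self, last_processed_id):
        self.last_processed = last_processed_id  # issue ID of the last processed reflection command
        self.received = set()                    # issue IDs of every command received so far
        self.pending = set()                     # reflection commands waiting for earlier commands
        self.processed = []                      # reflection commands processed, in order

    def receive(self, issue_id, is_reflection=False):
        self.received.add(issue_id)
        if is_reflection:
            self.pending.add(issue_id)
        self._drain()

    def _drain(self):
        # Handle pending reflection commands in ascending issue-ID order.
        for rid in sorted(self.pending):
            needed = range(self.last_processed + 1, rid + 1)
            if not all(i in self.received for i in needed):
                break  # a gap remains; later reflection commands must wait too
            self.processed.append(rid)  # stands in for step S462 onward
            self.pending.discard(rid)
            self.last_processed = rid


gate = ReflectionGate(last_processed_id=71)
gate.receive(80, is_reflection=True)   # arrives early
for issue_id in (72, 73, 74, 75):
    gate.receive(issue_id)
gate.receive(76, is_reflection=True)   # 72 to 76 now complete
for issue_id in (77, 78, 79):
    gate.receive(issue_id)             # 77 to 80 now complete
```

With the arrival order used here, the command with issue ID 76 is processed before the one with issue ID 80, matching the order described in the text.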

When a synchronization ID is also added to each command, the data to be overwritten can be identified at the time a delayed write command is received. When no synchronization ID is used, the processing sequencer 321 identifies the data to be overwritten immediately before starting the processing of step S462, and deletes that data from the synchronization buffer memory 323.

Embodiment 8.
FIG. 28 is a block diagram showing an eighth embodiment of the data replication system according to the present invention. In this embodiment, the storage takes the lead in executing data transfer, and the data in the storage of the normal system is mirrored to a remote location. In addition, the system is provided with a standby storage that can hold a snapshot of at least one generation earlier. The snapshot of one generation earlier is the information on the addresses in the storage medium 101 at which data was stored at the most recent resumable point.

In the data replication system shown in FIG. 28, the storage 400 is locally connected to the host 300 that uses it, and the storage 401 is locally connected to the host 303 that uses it. The storage 400 is connected to the storage 401 via the network 13. The host 300 and the host 303 are communicably connected so that the host 303 can monitor the state of the host 300. The hosts 300 and 303 are preferably connected by a dedicated line, but may be connected by another network (for example, the Internet). The configuration of the hosts 300 and 303 is the same as in the seventh embodiment (see FIG. 20).

The storages 400 and 401 are, for example, a single magnetic disk device, optical disk device, or magneto-optical disk device. A disk array device, which is a set of single magnetic disk devices, optical disk devices, or magneto-optical disk devices, can also be used as the storages 400 and 401. The hosts 300 and 303 are connected to the storages 400 and 401 by SCSI, Fibre Channel, Ethernet (registered trademark), or the like. In the system shown in FIG. 28, the host 300 is the normal host, which operates while no failure has occurred in the system, and the host 303 is the standby host, which operates when a failure occurs in the host 300.

In this embodiment, the applications 310a and 310b running on the host 300 perform resumable point notification processing during operation: at a point where the data is in a state from which the application can resume operation as is, they notify the storage 400 that this is a resumable point. The operations of the host 300 and the storage 400 when the applications 310a and 310b read data from the storage 400 are the same as in ordinary data read processing. In the storage 400, the operations of the IO scheduler 104 and the medium control unit 105 are the same as those of the IO scheduler 104 and the medium control unit 105 in the first embodiment.

The applications 310a and 310b in this embodiment are applications with a resume function. That is, they can resume processing provided that the data recorded in the storage medium 101 of the storage is in a predetermined state.

FIG. 29 is a block diagram showing a configuration example of the storage 400 shown in FIG. 28. The storage 401 has the same configuration. As shown in FIG. 29, the storage 400 includes a storage controller 410 and a storage medium 101, which is the storage body. The storage controller 410 includes: a communication unit 412 that communicates with the host 300 and other storages; a processing sequencer 411 that manages the sequence of each process; an IO scheduler 104 that controls the order of processing instructions for the storage medium 101; a medium control unit 105 that controls the operation of the storage medium 101 in accordance with the processing instructions issued by the IO scheduler 104; a buffer memory 106 that temporarily stores data passing from the host 300 to the storage medium 101 and from the storage medium 101 to the host 300; an LBA management unit 413 that manages logical block addresses; and an address table storage unit 414 that stores the address table used for managing logical block addresses. The processing sequencer 411 is realized by, for example, a CPU that operates according to a program.

The storages 400 and 401 manage data by the physical address space of the storage medium 101. When they accept a request such as a data write or read from another device, however, the write or read area is specified by an address in the logical address space. The logical address space at a resumable point is called the snapshot logical address space, and the logical address space at the latest point in time is called the latest logical address space. The storages 400 and 401 proceed with processing using the information on the snapshot logical address space and the information on the latest logical address space. The storages 400 and 401 may also manage a plurality of snapshot logical address spaces.

FIG. 30 shows an example of the address table stored in the address table storage unit 414. The address table 420 consists of a plurality of entries 420-0 to 420-n. Each entry 420-0 to 420-n corresponds to one logical block obtained by dividing the logical address space into fixed-length units.

FIG. 31(a) is an explanatory diagram of an entry. Each entry 420-0 to 420-n holds the physical block number in the storage that corresponds to the latest logical address space and the physical block number that corresponds to the snapshot logical address space. Each physical block number is paired with a flag that records whether the block number has changed since the most recent snapshot was created. A physical block number is a unique number assigned to each physical block obtained by dividing the address space of the storages 400 and 401 into fixed-length units. The flag for the physical block number corresponding to the snapshot logical address space may be omitted.

FIG. 31(b) shows an example of the initial state of one entry. It indicates that the data is stored at the address corresponding to the physical block number "aaa". Because the data has not been changed since the resumable point, the physical block number corresponding to the latest logical address space is also "aaa". Suppose that a write to the address corresponding to the physical block number "aaa" is requested. In that case, the write is performed at another address, and the physical block number "bbb" corresponding to that address is held as the physical block number for the latest logical address space. The flag corresponding to the latest logical address space is also updated to "changed". When a resumable point is subsequently reached, the information corresponding to the latest logical address space is copied to the information corresponding to the snapshot logical address space, as shown in FIG. 31(d), and the flag corresponding to the latest logical address space is set to "no change".
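The entry transitions described for FIG. 31 can be sketched as a small state update. This is only an illustrative model: the `Entry` dataclass, its field names, and the two helper functions are hypothetical, and the handling of the physical block released at a resumable point is not modeled because this passage does not describe it.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One address table entry (hypothetical field names)."""
    latest_block: str      # physical block number for the latest logical address space
    snapshot_block: str    # physical block number for the snapshot logical address space
    changed: bool = False  # flag paired with latest_block


def on_write(entry, free_block):
    # The first write after a resumable point redirects the latest mapping
    # to a different physical block; the snapshot mapping is left alone.
    if not entry.changed:
        entry.latest_block = free_block
        entry.changed = True
    return entry.latest_block


def on_resumable_point(entry):
    # Copy the latest mapping into the snapshot mapping and clear the flag.
    entry.snapshot_block = entry.latest_block
    entry.changed = False


entry = Entry(latest_block="aaa", snapshot_block="aaa")  # initial state
target = on_write(entry, free_block="bbb")               # write is redirected
after_write = (entry.latest_block, entry.snapshot_block, entry.changed)
on_resumable_point(entry)                                # resumable point reached
```

Until `on_resumable_point` runs, the snapshot mapping still points at "aaa", so the data as of the last resumable point remains readable.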

In this embodiment, the information corresponding to the snapshot logical address space is used as the snapshot.

As a result, until the next resumable point is reached, newly written data is written to an address different from the address at which the write was requested, so the data as of the resumable point remains. By managing writes to the standby storage in this way, the host 303 can resume processing immediately even if an abnormality occurs in the host 300.

The address table storage unit 414 also stores an unused block table, which records the block numbers of physical blocks that belong to no logical address space. A nonvolatile semiconductor memory, a magnetic disk device, an optical disk device, a magneto-optical disk device, or the like is used as the address table storage unit 414. Part of the storage area of the storage medium 101 may also be used as the address table storage unit 414.

Next, the operation when the host 300 writes data to the storages 400 and 401 is described. When writing data to the storage 400, the host 300 outputs a write command to the communication unit 412 in the storage controller 410 of the storage 400. On receiving the write command, the communication unit 412 passes it to the processing sequencer 411 and instructs the processing sequencer 411 to start write processing. The write processing executed by the processing sequencer 411 differs depending on whether a data transfer destination storage is set. The storage 401 is set as the data transfer destination of the storage 400, and no data transfer destination is set for the storage 401.

FIG. 32 is a flowchart showing the operation of the processing sequencer 411 in the storage controller 410 when a data transfer destination is set. In this embodiment, that is the operation of the processing sequencer 411 in the storage 400. When a data transfer destination is set, the processing sequencer 411 first secures the area required for the incoming data in the buffer memory 106 (step S500). It then passes the write-destination logical address specified by the write command to the LBA management unit 413, indicating that the request is for write processing, and has the LBA management unit 413 convert the logical address into a physical address (step S501). The LBA management unit 413 assigns the physical block number corresponding to the latest logical address space shown in FIG. 31, and returns the physical address corresponding to that physical block number to the processing sequencer 411. The operation of the LBA management unit 413 in step S501 is described in detail later.

Next, the processing sequencer 411 instructs the communication unit 412 to send a ready notification to the host 300 (step S502). In response, the communication unit 412 sends the ready notification to the host 300. The processing sequencer 411 then waits for data to arrive from the host 300 (step S503). When the data arrives and the communication unit 412 asks which area of the buffer memory 106 should store it, the processing sequencer 411 reports the area secured in step S500 to the communication unit 412 (step S504). If resumable point notification processing is in progress, the process proceeds to step S513; otherwise it proceeds to step S506 (step S505). Here, "resumable point notification processing is in progress" means that a resumable point has been notified by the host 300 and predetermined processing is being performed with respect to the storage 401. Specifically, the process proceeds to step S513 if the processing of steps S562 to S564 described later is being performed, and to step S506 if it is not.

In step S506, the processing sequencer 411 instructs the communication unit 412 to issue a write command to the data transfer destination storage (the storage 401 in this example). This write command specifies the write destination by logical address. In response, the communication unit 412 sends the write command to the data transfer destination storage. The processing sequencer 411 then waits for the data from the host 300 to be completely stored in the buffer memory 106 (step S507). When the communication unit 412 reports that all the data has been stored in the buffer memory 106, the processing sequencer 411 registers with the IO scheduler 104 an instruction (write request) to write the data stored in the buffer memory 106 to the area corresponding to the physical address obtained in step S501. In response to the write request, the IO scheduler 104 instructs the medium control unit 105 to write the target data. The medium control unit 105 writes the data from the buffer memory 106 to the storage medium 101 in accordance with the registered request (step S508).

The processing sequencer 411 then waits for a ready message from the storage 401 (step S509). When the communication unit 412 reports that the ready message from the storage 401 has been received, the processing sequencer 411 has the communication unit 412 send the data stored in the buffer memory 106 to the storage 401 (step S510). It then waits for both a reception-complete message from the storage 401 and a write completion notification from the medium control unit 105 (step S511). When the communication unit 412 has reported receipt of the reception-complete message from the storage 401 and the write completion notification has been received from the medium control unit 105, the processing sequencer 411 notifies the host 300 of completion (step S512) and ends the processing.
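The wait in step S511, where the host is told of completion only after both the local medium write and the transfer to the storage 401 have finished, can be sketched with two concurrent tasks. This is a rough illustration under assumptions: the coroutine names are hypothetical, and the `asyncio.sleep` calls merely stand in for the real medium write and network transfer.

```python
import asyncio


async def write_locally(log):
    await asyncio.sleep(0.01)  # stands in for the medium write of step S508
    log.append("local write done")


async def send_to_standby(log):
    await asyncio.sleep(0.02)  # stands in for steps S509 and S510 plus the reply
    log.append("standby received")


async def handle_host_write():
    log = []
    # Step S511: both conditions must hold before the host is notified.
    await asyncio.gather(write_locally(log), send_to_standby(log))
    log.append("completion sent to host")  # step S512
    return log


events = asyncio.run(handle_host_write())
```

Whichever of the two tasks finishes first, the completion notification is always the last event recorded.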

In step S513, the processing sequencer 411 waits for the data from the host 300 to be completely stored in the buffer memory 106. When the communication unit 412 reports that all the data has been stored in the buffer memory 106, the processing sequencer 411 registers with the IO scheduler 104 an instruction (write request) to write the data stored in the buffer memory 106 to the area corresponding to the physical address obtained in step S501 (step S514). It then waits until the resumable point notification processing is complete (step S515), and once it is complete, instructs the communication unit 412 to issue a write command to the data transfer destination storage (step S516). This write command specifies the write destination by logical address. After step S516, the process proceeds to step S509.

Note that the order of issuing the write command to the storage 401 and writing to the storage medium 101 is reversed between steps S506 to S508 and steps S513 to S516, for the following reason. In steps S506 to S508, the write command is issued first so that the write to the storage medium 101 can proceed during the period between sending the write command to the storage 401 and receiving its response. In steps S513 to S516, on the other hand, the storage 401 is being made to update its entries while the resumable point notification processing is awaited, so no write command can be sent to the storage 401 until that processing ends. The write to the storage medium 101 is therefore carried out while waiting for the resumable point notification processing to finish.

FIG. 33 is a flowchart showing the operation of the processing sequencer 411 in the storage controller 410 when no data transfer destination is set. In this embodiment, that is the operation of the processing sequencer 411 in the storage 401. More specifically, it shows the operation of the processing sequencer 411 of the storage 401 on receiving the write command sent in step S506 or S516.

On receiving the write command, the processing sequencer 411 first secures the area required for the incoming data in the buffer memory 106 (step S520). It then passes the write-destination logical address specified by the write command to the LBA management unit 413, indicating that the request is for write processing, and has the LBA management unit 413 convert the logical address into a physical address (step S521). This processing is the same as step S501 (see FIG. 32).

Next, the processing sequencer 411 instructs the communication unit 412 to send a ready notification to the data transfer source storage (the storage 400 in this example) (step S522). In response, the communication unit 412 sends the ready notification to the data transfer source storage. The processing sequencer 411 then waits for data to arrive from the data transfer source storage (step S523). When the data arrives and the communication unit 412 asks which area of the buffer memory 106 should store it, the processing sequencer 411 reports the area secured in step S520 to the communication unit 412 (step S524).

The processing sequencer 411 then waits for the data from the data transfer source storage to be completely stored in the buffer memory 106 (step S525). When the communication unit 412 reports that all the data has been stored in the buffer memory 106, the processing sequencer 411 registers with the IO scheduler 104 an instruction (write request) to write the data stored in the buffer memory 106 to the area corresponding to the physical address obtained in step S521. In response to the write request, the IO scheduler 104 instructs the medium control unit 105 to write the target data. The medium control unit 105 writes the data from the buffer memory 106 to the storage medium 101 in accordance with the registered request (step S526).

The processing sequencer 411 then waits for the write completion notification from the medium control unit 105 (step S527). On receiving it, the processing sequencer 411 notifies the data transfer source storage of completion (step S528) and ends the processing.

Both storages 400 and 401 start write processing on receiving a write command that specifies a logical address. However, they do not write the data to the physical address that currently corresponds to that logical address; they write it to a different physical address (the one assigned in step S501 or S521). The data as of the resumable point can therefore be retained in the storage medium 101, so even if an abnormality occurs in the normal host 300, the standby host 303 can resume processing in a short time.

Next, the conversion from logical address to physical address by the LBA management unit 413 is described. The conversion differs depending on whether write processing is indicated. When write processing is not indicated and the LBA management unit 413 is given a logical address and instructed to convert it, it calculates the logical block number from the logical address and obtains, from the address table storage unit 414, the physical block number corresponding to that logical block number in the latest logical address space. It then calculates the physical address from the physical block number and the logical address, and reports the calculated physical address to the processing sequencer 411. The logical block number is the logical address divided by the block length, with the fractional part discarded. To calculate the physical address, the product of the physical block number and the block length (call it P) and the remainder of dividing the logical address by the block length (call it Q) are obtained; the physical address is the sum of P and Q.
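The address arithmetic described above can be worked through as follows. This is a small sketch under assumptions: the block length of 4096 and the mapping from logical block 2 to physical block 7 are made-up values for illustration, and the function names are hypothetical.

```python
def logical_block_number(logical_address, block_length):
    # Logical address divided by the block length, fractional part discarded.
    return logical_address // block_length


def physical_address(physical_block_number, logical_address, block_length):
    p = physical_block_number * block_length  # P: start of the physical block
    q = logical_address % block_length        # Q: offset within the block
    return p + q


BLOCK_LENGTH = 4096   # assumed block length
latest_map = {2: 7}   # assumed: logical block 2 maps to physical block 7

addr = 2 * BLOCK_LENGTH + 100
lbn = logical_block_number(addr, BLOCK_LENGTH)
phys = physical_address(latest_map[lbn], addr, BLOCK_LENGTH)
```

With these values the logical address falls in logical block 2 at offset 100, so the physical address is 7 × 4096 + 100 = 28772.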

FIG. 34 is a flowchart showing the operation of the LBA management unit 413 when write processing is indicated, that is, its operation in steps S501 and S521. When write processing is indicated, the LBA management unit 413 calculates the logical block number from the logical address (step S540). It then obtains, from the address table storage unit 414, the entry corresponding to the logical block number calculated in step S540 (step S541). In step S541, the entry for the latest logical address space is obtained. The LBA management unit 413 then checks the flag paired with the physical block number corresponding to the latest logical address space in that entry (step S542). If the flag is "changed", the process proceeds to step S546; if it is "no change", the process proceeds to step S543.

The flag is "no change" when, as shown in FIG. 31(b), the physical block number corresponding to the latest logical address space is equal to the physical block number corresponding to the snapshot logical address space. The following description refers to FIGS. 31(b) and 31(c). In step S543, the LBA management unit 413 obtains an unused physical block number from the unused block table in the address table storage unit 414. Here, the physical block number "bbb" is assumed to have been obtained. The LBA management unit 413 also deletes the obtained physical block number from the unused block table information registered in the address table storage unit 414, and re-registers the updated unused block table information in the address table storage unit 414.

さらに、ステップＳ５４３で入手した物理ブロック番号（ｂｂｂ）に該当する領域に、ステップＳ５４１で入手したエントリ中の最新論理アドレス空間に対応する物理ブロック（ａａａ）に該当する領域のデータをコピーする旨の情報（コピー要求）をＩＯスケジューラ１０４に登録する。ＩＯスケジューラ１０４は、コピー要求に応じて、媒体制御部１０５に、対象のデータのコピー指示を行う。媒体制御部１０５は、記憶媒体１０１における指定された領域間のコピー処理を行う（ステップＳ５４４）。コピー処理が完了したら、ＬＢＡ管理部４１３は、ステップＳ５４１で入手したエントリの最新論理アドレス空間に対応する物理ブロック番号（ａａａ）をステップＳ５４３で入手した物理ブロック番号（ｂｂｂ）に変更し、変更後の物理ブロック番号とセットとなっているフラグを「変更有り」に更新する。また、アドレステーブル記憶部４１４に、変更したエントリを記憶させる（ステップＳ５４５）。そして、ステップＳ５４６に移行する。 Further, the LBA management unit 413 registers a copy request in the IO scheduler 104, that is, information instructing that the data in the area of the physical block (aaa) corresponding to the latest logical address space in the entry obtained in step S541 be copied to the area of the physical block number (bbb) obtained in step S543. In response to the copy request, the IO scheduler 104 instructs the medium control unit 105 to copy the target data. The medium control unit 105 performs the copy between the designated areas of the storage medium 101 (step S544). When the copy is complete, the LBA management unit 413 changes the physical block number (aaa) corresponding to the latest logical address space in the entry obtained in step S541 to the physical block number (bbb) obtained in step S543, and updates the flag paired with the changed physical block number to “changed”. It also stores the changed entry in the address table storage unit 414 (step S545). The process then proceeds to step S546.

ステップＳ５４６では、ＬＢＡ管理部４１３は、エントリの最新論理アドレス空間に対応する物理ブロック番号から物理アドレスを算出し、算出した物理アドレスを処理シーケンサ４１１に通知し（ステップＳ５４７）、処理を終了する。 In step S546, the LBA management unit 413 calculates a physical address from the physical block number corresponding to the latest logical address space of the entry, notifies the calculated physical address to the processing sequencer 411 (step S547), and ends the process.

ステップＳ５４２の後、ステップＳ５４３以降の処理を行うと、図３１（ｂ）に示す物理ブロック番号は、図３１（ｃ）に示すように変更される。しかし、ステップＳ５４７までの処理が終了した時点では、「ｂｂｂ」に対応する物理アドレスには、「ａａａ」に対応する物理アドレスと同一のデータが格納されている。正常系ストレージ４００が「ｂｂｂ」に対応する物理アドレスのデータを書き換えるのは、ステップＳ５０８またはステップＳ５１４においてである（図３２参照）。また、待機系ストレージ４０１が「ｂｂｂ」に対応する物理アドレスのデータを書き換えるのはステップＳ５２６においてである（図３３参照）。 When the processing from step S543 onward is performed after step S542, the physical block numbers shown in FIG. 31B are changed as shown in FIG. 31C. However, when the processing up to step S547 has finished, the physical address corresponding to “bbb” still holds the same data as the physical address corresponding to “aaa”. The normal storage 400 rewrites the data at the physical address corresponding to “bbb” in step S508 or step S514 (see FIG. 32). The standby storage 401 rewrites the data at the physical address corresponding to “bbb” in step S526 (see FIG. 33).
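The write-time address translation of steps S540 to S547 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Entry`, `translate_for_write`, and `BLOCK_SIZE`, the use of a dict as the storage medium, and the way the in-block offset is added to form the physical address are all assumptions made for illustration. The integers 10 and 11 stand in for the block numbers “aaa” and “bbb”.

```python
from dataclasses import dataclass

CHANGED, NO_CHANGE = "changed", "no change"
BLOCK_SIZE = 4  # hypothetical block size in address units


@dataclass
class Entry:
    latest_block: int    # physical block number for the latest logical address space
    latest_flag: str     # flag paired with latest_block
    snapshot_block: int  # physical block number for the snapshot logical address space


def translate_for_write(logical_address, address_table, unused_blocks, medium):
    """Return the physical address to write to, allocating a new block if needed."""
    logical_block = logical_address // BLOCK_SIZE              # S540
    entry = address_table[logical_block]                       # S541
    if entry.latest_flag == NO_CHANGE:                         # S542
        new_block = unused_blocks.pop()                        # S543: take an unused block
        medium[new_block] = medium[entry.latest_block]         # S544: copy aaa -> bbb
        entry.latest_block = new_block                         # S545: update the entry
        entry.latest_flag = CHANGED
    # S546: physical address from the block number (offset handling is an assumption)
    return entry.latest_block * BLOCK_SIZE + logical_address % BLOCK_SIZE
```

A second write to the same logical block finds the flag already set to “changed” and reuses the block allocated by the first write, so the snapshot block is never overwritten.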

次に、アプリケーション３１０ａ，３１０ｂが、ストレージ４００に再開可能ポイント通知を通知するときの動作を説明する。アプリケーション３１０ａ，３１０ｂは、ストレージ４００に再開可能ポイント通知を通知するときに、ＩＯ管理部３１１（図２０参照）に再開可能ポイントの通知を指示する。すると、ＩＯ管理部３１１において再開可能ポイント通知部３１２は、ストレージ４００に対して再開可能ポイント通知コマンドを発行する。 Next, an operation when the applications 310a and 310b notify the storage 400 of a resumable point notification will be described. When the applications 310 a and 310 b notify the storage 400 of a resumable point notification, the applications 310 a and 310 b instruct the IO management unit 311 (see FIG. 20) to notify the resumable point. Then, the resumable point notification unit 312 in the IO management unit 311 issues a resumable point notification command to the storage 400.

ストレージ４００に再開可能ポイント通知コマンドが到着すると、再開可能ポイント通知コマンドは、ストレージコントローラ４１０における通信部４１２に入力される。通信部４１２は、再開可能ポイント通知コマンドを受け取ると、再開可能ポイント通知コマンドを処理シーケンサ４１１に渡し、再開可能ポイント通知処理を開始することを処理シーケンサ４１１に指示する。 When the resumable point notification command arrives at the storage 400, the resumable point notification command is input to the communication unit 412 in the storage controller 410. Upon receiving the resumable point notification command, the communication unit 412 passes the resumable point notification command to the processing sequencer 411 and instructs the processing sequencer 411 to start the resumable point notification process.

図３５は、再開可能ポイント通知コマンドを受け取った処理シーケンサ４１１の動作を示すフローチャートである。処理シーケンサ４１１は、再開可能ポイント通知コマンドを受け取ると、図３５に示すように、まず、他ストレージに対して要求したwrite 処理のうちで完了していないものがあるか否かを調べ（ステップＳ５６０）、あった場合にはステップＳ５６１に処理を実行する。すなわち、他ストレージに対して要求したwrite 処理が全て完了するのを待つ（ステップＳ５６１）。なお、処理シーケンサ４１１は、発行した各write コマンドに対する応答が完了しているのか否かを示す一覧情報を管理する。ステップＳ５６０では、この情報に基づいて判断を行えばよい。 FIG. 35 is a flowchart showing the operation of the processing sequencer 411 on receiving a resumable point notification command. As shown in FIG. 35, the processing sequencer 411 first checks whether any of the write processing it has requested of the other storage is still incomplete (step S560) and, if so, proceeds to step S561. That is, it waits until all the write processing requested of the other storage has completed (step S561). The processing sequencer 411 manages list information indicating whether the response to each issued write command has been completed, and the determination in step S560 can be made from this information.

処理シーケンサ４１１は、他ストレージに対して要求したwrite 処理が全て完了している状態で、通信部４１２に、データ転送先のストレージ（この例ではストレージ４０１）にスナップショット作成コマンド（スナップショット作成要求）を送信するように指示する。通信部４１２は、指示に応じて、データ転送先のストレージにスナップショット作成コマンドを送信する（ステップＳ５６２）。スナップショット作成コマンドとは、最新論理アドレス空間に対応する情報を、スナップショット論理アドレス空間に対応する情報にコピーするようにし、かつ、最新論理アドレス空間に対応するフラグを「変更無し」に更新するように要求するコマンドである。処理シーケンサ４１１は、ステップＳ５６２の後、転送先ストレージからの応答を待ち（ステップＳ５６３）、応答が到着したことが通信部４１２から通知されたら、通信部４１２にホスト３００に対して完了を通知するように指示し（ステップＳ５６４）、処理を終了する。 With all the write processing requested of the other storage completed, the processing sequencer 411 instructs the communication unit 412 to transmit a snapshot creation command (snapshot creation request) to the data transfer destination storage (storage 401 in this example). In response, the communication unit 412 transmits the snapshot creation command to the data transfer destination storage (step S562). The snapshot creation command requests that the information corresponding to the latest logical address space be copied to the information corresponding to the snapshot logical address space, and that the flag corresponding to the latest logical address space be updated to “no change”. After step S562, the processing sequencer 411 waits for a response from the transfer destination storage (step S563). When the communication unit 412 reports that the response has arrived, the processing sequencer 411 instructs the communication unit 412 to notify the host 300 of completion (step S564) and ends the process.
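The ordering enforced by steps S560 to S564 can be sketched in an event-driven form. This is only an illustration under assumptions: the class `ResumablePointHandler`, its method names, and the strings recorded in `sent` are hypothetical, and the real processing sequencer would hand messages to the communication unit 412 rather than append them to a list.

```python
class ResumablePointHandler:
    """Tracks outstanding writes and defers the snapshot creation command."""

    def __init__(self):
        self.outstanding_writes = set()  # write commands not yet acknowledged
        self.waiting = False             # a resumable point notification is pending
        self.sent = []                   # messages sent, recorded for illustration

    def write_issued(self, write_id):
        self.outstanding_writes.add(write_id)

    def write_completed(self, write_id):
        self.outstanding_writes.discard(write_id)
        self._maybe_send_snapshot()

    def resumable_point_received(self):
        self.waiting = True              # S560: check for incomplete writes
        self._maybe_send_snapshot()

    def _maybe_send_snapshot(self):
        # S561: proceed only once every requested write has completed
        if self.waiting and not self.outstanding_writes:
            self.waiting = False
            self.sent.append("snapshot-create -> storage 401")  # S562

    def snapshot_response_received(self):
        # S563/S564: the response arrived, so report completion to the host
        self.sent.append("completion -> host 300")
```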

次に、データ転送先のストレージであるストレージ４０１がスナップショット作成コマンドを受信した場合の動作を説明する。ストレージ４０１にスナップショット作成コマンドが到着すると、ストレージ４０１のストレージコントローラ４１０の通信部４１２にスナップショット作成コマンドが入力される。通信部４１２は、スナップショット作成コマンドを受け取ると、処理シーケンサ４１１にスナップショット作成コマンドを渡し、スナップショット作成処理の開始を指示する。 Next, the operation when the storage 401, which is the data transfer destination storage, receives the snapshot creation command will be described. When the snapshot creation command arrives at the storage 401, the snapshot creation command is input to the communication unit 412 of the storage controller 410 of the storage 401. Upon receiving the snapshot creation command, the communication unit 412 passes the snapshot creation command to the processing sequencer 411 and instructs the start of the snapshot creation processing.

スナップショット作成処理では、処理シーケンサ４１１は、アドレステーブル記憶部４１４中のアドレステーブルの全エントリに対して以下の処理を行う。すなわち、最新論理アドレス空間に対応するフラグに「変更有り」が記録されていた場合には、そのエントリのスナップショット論理アドレス空間に対応する物理ブロック番号を、未使用ブロックとしてアドレステーブル記憶部４１４中の未使用ブロックテーブルに登録する。また、スナップショット論理アドレス空間に対応するブロック番号に最新論理アドレス空間に対応する物理ブロック番号をコピーする。このとき、フラグの値もコピーする。その後、最新論理アドレス空間に対応するフラグを「変更無し」に初期化する。 In the snapshot creation process, the process sequencer 411 performs the following process for all entries in the address table in the address table storage unit 414. That is, if “changed” is recorded in the flag corresponding to the latest logical address space, the physical block number corresponding to the snapshot logical address space of the entry is stored in the address table storage unit 414 as an unused block. Register in the unused block table. Further, the physical block number corresponding to the latest logical address space is copied to the block number corresponding to the snapshot logical address space. At this time, the flag value is also copied. Thereafter, the flag corresponding to the latest logical address space is initialized to “no change”.

例えば、図３１（ｃ）に示すエントリでは、最新論理アドレス空間に対応するフラグに「変更有り」が記録されている。この場合、スナップショット論理アドレス空間に対応する物理ブロック番号「ａａａ」を未使用ブロックとして未使用ブロックテーブルに登録する。そして、スナップショット論理アドレス空間に対応するブロック番号に、最新論理アドレス空間に対応する物理ブロック番号「ｂｂｂ」をコピーする。そして、最新論理アドレス空間に対応するフラグ「変更有り」も同様にコピーする。さらに、最新論理アドレス空間に対応するフラグを「変更無し」に初期化する。すると、エントリの状態は、図３１（ｃ）に示す状態から、図３１（ｄ）に示す状態になる。 For example, in the entry shown in FIG. 31C, “changed” is recorded in the flag corresponding to the latest logical address space. In this case, the physical block number “aaa” corresponding to the snapshot logical address space is registered in the unused block table as an unused block. Next, the physical block number “bbb” corresponding to the latest logical address space is copied to the block number corresponding to the snapshot logical address space, and the flag “changed” corresponding to the latest logical address space is copied in the same way. Finally, the flag corresponding to the latest logical address space is initialized to “no change”. The entry thus changes from the state shown in FIG. 31C to the state shown in FIG. 31D.

また、最新論理アドレス空間に対応するフラグが「変更無し」である場合、直前の再開可能ポイント以降その物理ブロック番号に対応するアドレスに格納されたデータが変更されていないことを意味する。従って、そのエントリには何も処理を行わない。 If the flag corresponding to the latest logical address space is “no change”, it means that the data stored in the address corresponding to the physical block number has not been changed since the immediately preceding resumable point. Therefore, no processing is performed on the entry.

最新論理アドレス空間に対応するフラグが「変更有り」となっている全エントリに対して処理を終了した後、処理シーケンサ４１１は、通信部４１２を用いてスナップショット作成コマンドに対する応答を送信する。 After completing the processing for all entries for which the flag corresponding to the latest logical address space is “changed”, the processing sequencer 411 transmits a response to the snapshot creation command using the communication unit 412.
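The per-entry work of the snapshot creation process can be sketched as follows. The sketch assumes a simplified entry layout (`Entry` with a flag on both the latest and snapshot sides) and a plain list as the unused block table; these names and types are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass

CHANGED, NO_CHANGE = "changed", "no change"


@dataclass
class Entry:
    latest_block: str                # physical block number, latest logical address space
    latest_flag: str                 # flag paired with latest_block
    snapshot_block: str              # physical block number, snapshot logical address space
    snapshot_flag: str = NO_CHANGE   # flag paired with snapshot_block


def create_snapshot(address_table, unused_blocks):
    """Promote every changed entry to the snapshot and release the old snapshot block."""
    for entry in address_table:
        if entry.latest_flag != CHANGED:
            continue                                   # unchanged since the last resumable point
        unused_blocks.append(entry.snapshot_block)     # old snapshot block becomes unused
        entry.snapshot_block = entry.latest_block      # copy the block number
        entry.snapshot_flag = entry.latest_flag        # copy the flag value as well
        entry.latest_flag = NO_CHANGE                  # initialize the latest-side flag
```

Applied to an entry in the state of FIG. 31C (latest “bbb”, snapshot “aaa”), the function releases “aaa” and leaves both sides pointing at “bbb”.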

次に、災害発生時の動作を説明する。災害等によってホスト３００が使用できなくなった場合には、ストレージ４０１を待機系から正常系にする。また、ホスト３０３が、ホスト３００から処理を引き継ぐ。 Next, the operation when a disaster occurs will be described. When the host 300 becomes unusable due to a disaster or the like, the storage 401 is changed from the standby system to the normal system. Further, the host 303 takes over processing from the host 300.

障害検知からアプリケーション再開までのホスト３０３の動作を説明する。まず、ホスト３０３中のホスト監視部３１３（図２０参照）がホスト３００の異常を検出する。ホスト監視部３１３は、ホスト３００の異常を検出すると、待機系であるストレージ４０１に対してスナップショット復帰コマンドを発行する。そして、スナップショット復帰コマンドに対する応答を待ち、応答を受けたら、ホスト監視部３１３は、アプリケーション３１０ａ，３１０ｂを実行させる。以後、ホスト３０３からのデータの書き込みおよび読み出しは、ストレージ４０１に対して行われる。 The operation of the host 303 from failure detection to application restart will now be described. First, the host monitoring unit 313 (see FIG. 20) in the host 303 detects an abnormality of the host 300. On detecting the abnormality, the host monitoring unit 313 issues a snapshot restoration command to the storage 401, which is the standby system. It then waits for a response to the snapshot restoration command and, on receiving the response, starts the applications 310a and 310b. From then on, the host 303 writes data to and reads data from the storage 401.

なお、ホスト３０３中のホスト監視部３１３は、ホスト３００と常時あるいは定期的に通信を行っている。ホスト監視部３１３が、一定時間ホスト３００と通信できなくなった場合、あるいは、ホスト３００から異常が報告された場合に、ホスト監視部３１３は、ホスト３００において災害が発生したと認識する。 Note that the host monitoring unit 313 in the host 303 communicates with the host 300 constantly or periodically. When the host monitoring unit 313 cannot communicate with the host 300 for a certain period of time or when an abnormality is reported from the host 300, the host monitoring unit 313 recognizes that a disaster has occurred in the host 300.
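The two detection conditions described here (no communication for a fixed period, or an abnormality reported by the host 300) can be sketched as a small class. The class name `HostMonitor`, its methods, and the explicit `now` timestamps are assumptions chosen to keep the example deterministic; the patent does not specify the timeout value or the message format.

```python
class HostMonitor:
    """Decides whether the monitored host should be treated as failed."""

    def __init__(self, timeout_s, now):
        self.timeout_s = timeout_s          # hypothetical allowed silence, in seconds
        self.last_contact = now             # time of the last successful communication
        self.abnormality_reported = False

    def on_message(self, now, abnormal=False):
        # Called whenever the monitored host communicates with the monitor.
        self.last_contact = now
        if abnormal:
            self.abnormality_reported = True

    def disaster_detected(self, now):
        silent_too_long = now - self.last_contact > self.timeout_s
        return self.abnormality_reported or silent_too_long
```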

次に、ストレージ４０１が、スナップショット復帰コマンドを受けた場合の動作を説明する。ストレージ４０１にスナップショット復帰コマンドが到着すると、ストレージ４０１のストレージコントローラ４１０における通信部４１２にスナップショット復帰コマンドが入力される。通信部４１２は、スナップショット復帰コマンドを受け取ると、処理シーケンサ４１１にスナップショット復帰コマンドを渡し、スナップショット復帰処理の開始を指示する。 Next, an operation when the storage 401 receives a snapshot restoration command will be described. When the snapshot restoration command arrives at the storage 401, the snapshot restoration command is input to the communication unit 412 in the storage controller 410 of the storage 401. Upon receiving the snapshot restoration command, the communication unit 412 passes the snapshot restoration command to the processing sequencer 411 and instructs the start of the snapshot restoration process.

スナップショット復帰処理では、処理シーケンサ４１１は、アドレステーブル記憶部４１４中のアドレステーブルの全エントリに対して以下の処理を行う。すなわち、最新論理アドレス空間に対応するフラグに「変更有り」が記録されていた場合には、その最新論理アドレス空間に対応する物理ブロック番号を未使用ブロックとしてアドレステーブル記憶部４１４の未使用ブロックテーブルに登録する。その結果、その物理ブロック番号に対応する領域は未使用状態として解放される。次いで、記憶媒体へのデータ格納状況を示す格納情報（最新論理アドレス空間に対応する物理ブロック番号およびフラグ）を、直前のスナップショット作成時の状態に戻す。すなわち、最新論理アドレス空間に対応するブロック番号にスナップショット論理アドレス空間に対応する物理ブロック番号をコピーする。また、最新論理アドレス空間に対応するフラグを「変更有り」から「変更無し」にする。 In the snapshot restoration process, the processing sequencer 411 performs the following processing on every entry of the address table in the address table storage unit 414. If “changed” is recorded in the flag corresponding to the latest logical address space, it registers the physical block number corresponding to that latest logical address space in the unused block table of the address table storage unit 414 as an unused block. As a result, the area corresponding to that physical block number is released as unused. Next, it returns the storage information indicating the data storage status of the storage medium (the physical block number and flag corresponding to the latest logical address space) to the state at the time of the previous snapshot creation. That is, it copies the physical block number corresponding to the snapshot logical address space to the block number corresponding to the latest logical address space, and changes the flag corresponding to the latest logical address space from “changed” to “no change”.

例えば、スナップショット復帰処理開始時に、あるエントリが図３１（ｃ）に示す状態であったとする。この最新論理アドレス空間に対応するフラグは「変更有り」となっている。従って、処理シーケンサ４１１は、その最新論理アドレス空間に対応する物理ブロック番号「ｂｂｂ」を未使用ブロックとして未使用ブロックテーブルに登録する。次いで、最新論理アドレス空間に対応するブロック番号「ｂｂｂ」にスナップショット論理アドレス空間に対応する物理ブロック番号をコピーする。この結果、最新論理アドレス空間に対応するブロック番号は「ａａａ」になる。また、最新論理アドレス空間に対応するフラグを「変更有り」から「変更無し」にする。この結果、格納情報（最新論理アドレス空間に対応する物理ブロック番号およびフラグ）は、直前のスナップショット作成時の状態に戻る。このように、スナップショット復帰処理を行うことにより、エントリは直前の再開可能ポイントにおける状態に復帰する。 For example, suppose that an entry is in the state shown in FIG. 31C when the snapshot restoration process starts. The flag corresponding to its latest logical address space is “changed”. The processing sequencer 411 therefore registers the physical block number “bbb” corresponding to the latest logical address space in the unused block table as an unused block. Next, it copies the physical block number corresponding to the snapshot logical address space over the block number “bbb” corresponding to the latest logical address space, so that the block number corresponding to the latest logical address space becomes “aaa”. It also changes the flag corresponding to the latest logical address space from “changed” to “no change”. As a result, the storage information (the physical block number and flag corresponding to the latest logical address space) returns to the state at the time of the previous snapshot creation. In this way, the snapshot restoration process returns the entry to its state at the immediately preceding resumable point.
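The restoration steps above can be sketched in a few lines of Python. As before, the `Entry` layout, the function name `restore_snapshot`, and the list used as the unused block table are illustrative assumptions rather than details from the patent.

```python
from dataclasses import dataclass

CHANGED, NO_CHANGE = "changed", "no change"


@dataclass
class Entry:
    latest_block: str    # physical block number, latest logical address space
    latest_flag: str     # flag paired with latest_block
    snapshot_block: str  # physical block number, snapshot logical address space


def restore_snapshot(address_table, unused_blocks):
    """Discard writes made since the last snapshot and point back at the snapshot blocks."""
    for entry in address_table:
        if entry.latest_flag != CHANGED:
            continue                                  # nothing written since the snapshot
        unused_blocks.append(entry.latest_block)      # release the newly allocated block
        entry.latest_block = entry.snapshot_block     # latest side returns to the snapshot
        entry.latest_flag = NO_CHANGE
```

Because only table entries are rewritten and no block data is copied, the restoration cost depends on the number of entries rather than on the amount of data written since the last resumable point.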

本実施の形態では、待機系のストレージ４０１はストレージ４００から受信したデータを、新たに割り当てた物理ブロック番号に対応するアドレスに格納する。そして、ストレージ４０１は、再開可能ポイントにおいてスナップショット作成コマンドを受信すると、その新たに割り当てた物理ブロックの情報をスナップショットとして保持する。また、再開可能ポイントの前に、正常系のホストに異常が発生した場合には、新たに割り当てた物理ブロック番号を未使用状態にする。新たに割り当てた物理ブロック番号を未使用の状態にすれば、ホスト３０３は、アプリケーション３１０ａ，３１０ｂの処理を再開できる。すなわち、ホスト３００に異常が生じたとしても、ホスト３０３が処理を再開するまでの時間は短くて済む。 In the present embodiment, the standby storage 401 stores the data received from the storage 400 at an address corresponding to the newly assigned physical block number. When the storage 401 receives the snapshot creation command at the resumable point, the storage 401 holds the information of the newly allocated physical block as a snapshot. In addition, when an abnormality occurs in a normal host before the resumable point, the newly assigned physical block number is set to an unused state. If the newly assigned physical block number is set to an unused state, the host 303 can resume the processing of the applications 310a and 310b. That is, even if an abnormality occurs in the host 300, the time until the host 303 resumes the process can be short.

また、正常系ストレージ４００は、再開可能ポイントと再開可能ポイントとの間で記憶媒体１０１に書き込んだデータを一度に待機系ストレージ４０１に送信するのではなく、ホスト３００からwrite コマンドを受信したタイミング毎に送信する。従って、大量のデータを一度に送信しないので、ストレージ４０１とのデータ転送時間が少なくてすむ。 Further, the normal storage 400 does not transmit the data written to the storage medium 101 between one resumable point and the next to the standby storage 401 all at once; instead, it transmits data each time it receives a write command from the host 300. Because a large amount of data is never transmitted at once, the data transfer time to the storage 401 is short.

次に、第８の実施の形態の変形例について説明する。図３２および図３５に示された処理例では、再開可能ポイント通知処理が行われていた場合にはwrite コマンドの発行は待たされていた（ステップＳ５０５参照）。また、write 処理のうちで完了していないものがある場合にはスナップショット作成コマンドの発行が待たされていた（ステップＳ５６０参照）。これに対し、本変形例では、ストレージ４００は、ホスト３００からwrite コマンドを受信した場合、再開可能ポイント通知処理の状態によらずにストレージ４０１にwrite コマンドを送信する。すなわち、図３２に示すステップＳ５０４の後、即座にステップＳ５０６以降の処理を開始する。また、ストレージ４００は、ホスト３００から再開可能ポイント通知を受けた場合には、write 処理の状況によらず即座にストレージ４０１にスナップショット作成コマンドが発行される。すなわち、再開可能ポイント通知を受けた場合、即座に図３５に示すステップＳ５６２以降の処理を開始する。 Next, a modification of the eighth embodiment will be described. In the processing examples shown in FIGS. 32 and 35, issuance of a write command was made to wait while resumable point notification processing was in progress (see step S505), and issuance of the snapshot creation command was made to wait while any write processing remained incomplete (see step S560). In this modification, by contrast, when the storage 400 receives a write command from the host 300, it transmits the write command to the storage 401 regardless of the state of the resumable point notification processing. That is, it starts the processing from step S506 onward immediately after step S504 shown in FIG. 32. Likewise, when the storage 400 receives a resumable point notification from the host 300, it immediately issues a snapshot creation command to the storage 401 regardless of the status of write processing. That is, on receiving the resumable point notification, it immediately starts the processing from step S562 onward shown in FIG. 35.

本変形例では、ストレージ４００は、ストレージ４０１に対して送信する全てのコマンドに発行ＩＤを付加する。また、待機系のストレージ４０１は、処理済発行ＩＤ情報を保持する。処理済発行ＩＤ情報は、ストレージ４００から受信した各コマンドのどのコマンドまでの処理が完了したのかを示す情報である。例えば、処理済発行ＩＤ情報の内容が「５３」であるならば、発行ＩＤ「５３」までのコマンドに対する処理が完了したことを意味する。 In this modification, the storage 400 adds an issue ID to all commands transmitted to the storage 401. The standby storage 401 holds processed issue ID information. The processed issue ID information is information indicating up to which command of each command received from the storage 400 has been processed. For example, if the content of the processed issue ID information is “53”, it means that the processing for the commands up to the issue ID “53” has been completed.

図３６は、第８の実施の形態の変形例において、write コマンドを受信したストレージ４０１の処理シーケンサ４１１の動作を示すフローチャートである。処理シーケンサ４１１は、まず、受け取るデータに必要な領域をバッファメモリ１０６に確保する（ステップＳ５８０）。また、write コマンドで指定された書き込み先の論理アドレスとwrite 処理であることを指定して、ＬＢＡ管理部４１３に論理アドレスを物理アドレスに変換させる（ステップＳ５８１）。ステップＳ５８１において、ＬＢＡ管理部４１３は、ステップＳ５２１の場合と同様にエントリに対して処理を行い、物理アドレスを処理シーケンサ４１１に渡す。 FIG. 36 is a flowchart showing the operation of the processing sequencer 411 of the storage 401 that has received the write command in the modification of the eighth embodiment. The processing sequencer 411 first secures an area necessary for received data in the buffer memory 106 (step S580). In addition, the logical address of the write destination specified by the write command and the write processing are specified, and the LBA management unit 413 converts the logical address into a physical address (step S581). In step S581, the LBA management unit 413 performs processing on the entry in the same manner as in step S521, and passes the physical address to the processing sequencer 411.

次いで、処理シーケンサ４１１は、準備完了の通知をデータ転送元のストレージ（この例ではストレージ４００）に送るように通信部４１２に指示する（ステップＳ５８２）。通信部４１２は、指示に応じて、データ転送元のストレージに準備完了の通知を送信する。そして、データ転送元のストレージからデータが届くのを待ち（ステップＳ５８３）、データが届いて通信部４１２からデータを格納すべきバッファメモリ１０６の領域の問い合わせを受けると、ステップＳ５８０で確保した領域を通信部４１２に知らせる（ステップＳ５８４）。 Next, the processing sequencer 411 instructs the communication unit 412 to send a notification of completion of preparation to the data transfer source storage (storage 400 in this example) (step S582). In response to the instruction, the communication unit 412 transmits a preparation completion notification to the data transfer source storage. Then, it waits for data to arrive from the data transfer source storage (step S583). When the data arrives and receives an inquiry from the communication unit 412 about the area of the buffer memory 106 where the data is to be stored, the area secured in step S580 is obtained. The communication unit 412 is notified (step S584).

さらに、データ転送元のストレージからのデータのバッファメモリ１０６への格納の完了を待ち（ステップＳ５８５）、全てのデータがバッファメモリ１０６に格納されたことが通信部４１２から通知されると、処理シーケンサ４１１は、処理済発行ＩＤ情報の値が、処理対象のwrite コマンドに付加された発行ＩＤの一つ前の値になるまで待つ（ステップＳ５８６）。例えば、発行ＩＤ「５４」のwrite コマンドを受信して、ステップＳ５８０からステップＳ５８５までの動作を行ったとする。このとき、発行ＩＤ「５３」までの各コマンド（write コマンドやスナップショット作成コマンド等）を受信しているとは限らない。この場合、ステップＳ５８６では、発行ＩＤ「５３」までの各コマンドを受信し、その各コマンドに応じた処理を全て完了させるまで、発行ＩＤ「５４」に対する処理を中断する。処理済発行ＩＤ情報が「５３」になったならば、発行ＩＤ「５４」に対する処理を再開し、ステップＳ５８７に移行する。 Further, the processing sequencer 411 waits for the data from the data transfer source storage to finish being stored in the buffer memory 106 (step S585). When the communication unit 412 reports that all the data has been stored in the buffer memory 106, the processing sequencer 411 waits until the value of the processed issue ID information becomes the value immediately preceding the issue ID attached to the write command being processed (step S586). For example, suppose that a write command with issue ID “54” has been received and the operations from step S580 to step S585 have been performed. At this point, the commands up to issue ID “53” (write commands, snapshot creation commands, and so on) have not necessarily all been received. In that case, in step S586 the processing for issue ID “54” is suspended until every command up to issue ID “53” has been received and the processing for each of them has been completed. When the processed issue ID information becomes “53”, the processing for issue ID “54” resumes and proceeds to step S587.

処理済発行ＩＤ情報の値が発行ＩＤの一つ前の値になったら、処理シーケンサ４１１は、ＩＯスケジューラ１０４に対して、ステップＳ５８１で変換した物理アドレスに対応した領域に、バッファメモリ１０６に格納されたデータを書き込む指示（書き込み要求）を登録する。ＩＯスケジューラ１０４は、書き込み要求に応じて、媒体制御部１０５に、書き込み対象のデータの書き込み指示を行う。媒体制御部１０５は、登録内容に応じてバッファメモリ１０６から記憶媒体１０１へのデータの書き込み処理を行う（ステップＳ５８７）。 When the value of the processed issue ID information becomes the value immediately preceding the issue ID, the processing sequencer 411 registers a write request in the IO scheduler 104, that is, an instruction to write the data stored in the buffer memory 106 to the area corresponding to the physical address obtained by the conversion in step S581. In response to the write request, the IO scheduler 104 instructs the medium control unit 105 to write the target data. The medium control unit 105 writes the data from the buffer memory 106 to the storage medium 101 according to the registered contents (step S587).

そして、媒体制御部１０５からの書き込み完了通知を待ち（ステップＳ５８８）、媒体制御部１０５からの書き込み完了通知を受けると、データ転送元のストレージに完了通知を行う（ステップＳ５８９）。また、ステップＳ５８９では、write コマンドに付加された発行ＩＤを処理済発行ＩＤ情報に反映させる。例えば、発行ＩＤ「５４」のwrite コマンドについてステップＳ５８９までの処理を行ったならば、処理済発行ＩＤ情報の情報を「５３」から「５４」に更新する。 Then, it waits for a write completion notification from the medium control unit 105 (step S588), and when it receives a write completion notification from the medium control unit 105, it notifies the data transfer source storage of the completion (step S589). In step S589, the issue ID added to the write command is reflected in the processed issue ID information. For example, if the processing up to step S589 is performed for the write command of the issue ID “54”, the information of the processed issue ID information is updated from “53” to “54”.

また、第８の実施の形態の変形例では、ストレージ４０１がスナップショット作成コマンドを受信した場合には、以下のように動作する。すなわち、ストレージ４０１にスナップショット作成コマンドが到着すると、ストレージ４０１のストレージコントローラ４１０の通信部４１２にスナップショット作成コマンドが入力される。通信部４１２は、スナップショット作成コマンドを受け取ると、処理シーケンサ４１１にスナップショット作成コマンドを渡し、スナップショット作成処理の開始を指示する。 In the modification of the eighth embodiment, when the storage 401 receives a snapshot creation command, the operation is as follows. That is, when a snapshot creation command arrives at the storage 401, the snapshot creation command is input to the communication unit 412 of the storage controller 410 of the storage 401. Upon receiving the snapshot creation command, the communication unit 412 passes the snapshot creation command to the processing sequencer 411 and instructs the start of the snapshot creation processing.

スナップショット作成処理では、処理済発行ＩＤ情報後が、処理対象の要求に付加された発行ＩＤの一つ前の値になるまで待つ。すなわち、ステップＳ５８６の場合と同様に、スナップショット作成コマンドよりも一つ前に発行されたコマンドに応じた処理を全て完了させるまで、そのスナップショット作成コマンドに応じた処理を進めずに待つ。 In the snapshot creation process, the process waits until the processed issue ID information becomes the value immediately before the issue ID added to the request to be processed. That is, as in the case of step S586, the process waits without proceeding with the process according to the snapshot creation command until all the processes according to the command issued immediately before the snapshot creation command are completed.

処理済発行ＩＤ情報の値が発行ＩＤの一つ前の値になったら、処理シーケンサ４１１は、アドレステーブル記憶部４１４中のアドレステーブルの全エントリに対して以下の処理を行う。すなわち、最新論理アドレス空間に対応するフラグに「変更有り」が記録されていた場合には、その最新論理アドレス空間に対応する物理ブロック番号を未使用ブロックとしてアドレステーブル記憶部４１４の未使用ブロックテーブルに登録する。次いで、最新論理アドレス空間に対応するブロック番号にスナップショット論理アドレス空間に対応する物理ブロック番号をコピーする。また、最新論理アドレス空間に対応するフラグを「変更有り」から「変更無し」にする。 When the value of the processed issue ID information becomes the value immediately preceding the issue ID, the processing sequencer 411 performs the following processing on every entry of the address table in the address table storage unit 414. If “changed” is recorded in the flag corresponding to the latest logical address space, it registers the physical block number corresponding to that latest logical address space in the unused block table of the address table storage unit 414 as an unused block. Next, it copies the physical block number corresponding to the snapshot logical address space to the block number corresponding to the latest logical address space. It also changes the flag corresponding to the latest logical address space from “changed” to “no change”.

以上の処理全てが終了したならば、処理シーケンサ４１１は、処理済発行ＩＤ情報の値をスナップショット作成コマンドに付加された発行ＩＤに変更する。そして、通信部４１２を用いてスナップショット作成コマンドに対する応答をストレージ４００に送信する。 If all the above processes are completed, the processing sequencer 411 changes the value of the processed issuance ID information to the issuance ID added to the snapshot creation command. Then, a response to the snapshot creation command is transmitted to the storage 400 using the communication unit 412.
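The issue-ID ordering used in this modification can be sketched as a small reordering buffer on the standby side. The class `OrderedProcessor` and its members are hypothetical names, and commands are reduced to plain strings; the sketch only shows that a command is applied once the processed issue ID information equals its own issue ID minus one, whatever order the commands arrive in.

```python
class OrderedProcessor:
    """Applies received commands strictly in issue-ID order."""

    def __init__(self, processed_id):
        self.processed_id = processed_id  # processed issue ID information
        self.pending = {}                 # issue ID -> command that arrived early
        self.applied = []                 # commands applied, in order (for illustration)

    def receive(self, issue_id, command):
        self.pending[issue_id] = command
        # S586: a command runs only when its predecessor has been fully processed
        while self.processed_id + 1 in self.pending:
            next_id = self.processed_id + 1
            self.applied.append(self.pending.pop(next_id))  # S587/S588 or snapshot work
            self.processed_id = next_id                     # S589: record the issue ID
```

For example, if issue ID 54 (a write) arrives before issue ID 53 (a snapshot creation command), the write is held back and applied only after the snapshot creation has completed, so the snapshot never includes data written after the resumable point.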

本変形例では、ストレージ４０１が発行ＩＤの順番に各コマンドの処理を進めていく点が、既に説明した第８の実施の形態と異なる。しかし、エントリに対する処理自体に相違点はない。従って、ホスト３００に異常が生じたとしても、ホスト３０３が処理を再開するまでの時間は短くて済む。また、ストレージ４００，４０１間でのデータ転送時間も短くて済む。 This modification is different from the already described eighth embodiment in that the storage 401 advances the processing of each command in the order of the issue ID. However, there is no difference in the processing for the entry itself. Therefore, even if an abnormality occurs in the host 300, the time until the host 303 resumes processing can be short. In addition, the data transfer time between the storages 400 and 401 can be shortened.

実施の形態９． Embodiment 9.
図３７は、本発明によるデータ複製システムの第９の実施の形態を示すブロック図である。図３７に示すデータ複製システムにおいて、正常系のストレージ４００が、ストレージ４００を使用する正常系のホスト６００とローカルに接続されている。また、待機系のストレージ４０１が、ストレージ４０１を使用する待機系のホスト６０１とローカルに接続されている。ストレージ４００は、ネットワーク１３を介してストレージ４０１に接続されている。また、ホスト６００は、ネットワーク６０２を介してホスト６０１に接続されている。 FIG. 37 is a block diagram showing a ninth embodiment of a data replication system according to the present invention. In the data replication system shown in FIG. 37, a normal storage 400 is locally connected to a normal host 600 that uses the storage 400. A standby storage 401 is locally connected to a standby host 601 that uses the storage 401. The storage 400 is connected to the storage 401 via the network 13. The host 600 is connected to the host 601 via the network 602.

なお、ストレージ４００，４０１の構成および動作は、第８の実施の形態におけるスナップショット機能を有するストレージ４００，４０１の構成および動作と同じである（図２９参照）。また、ストレージ４００，４０１に代えて、第７の実施の形態におけるストレージ３０１，３０２（図２１参照）を用いてもよい。 The configurations and operations of the storages 400 and 401 are the same as the configurations and operations of the storages 400 and 401 having the snapshot function in the eighth embodiment (see FIG. 29). Further, in place of the storages 400 and 401, the storages 301 and 302 (see FIG. 21) in the seventh embodiment may be used.

また、ネットワーク１３,６０２は、１つのネットワークであってもよい。ただし、ホスト６００，６０１が接続されるネットワーク６０２は、専用回線とすることが好ましい。 The networks 13 and 602 may be a single network. However, the network 602 to which the hosts 600 and 601 are connected is preferably a dedicated line.

図３８は、ホスト６００の構成例を示すブロック図である。図３８に示すように、ホスト６００において、単数あるいは複数のアプリケーションが動作する。ここでは、２つのアプリケーション６０３ａ，６０３ｂを例示する。アプリケーション６０３ａ，６０３ｂは、ＩＯ管理部３１１を用いて、ストレージ４００中のデータにアクセスを行う。また、ＩＯ管理部３１１は、ストレージ４００に再開可能ポイントを通知するための再開可能ポイント通知部３１２を有する。また、ホスト６０１にアプリケーション６０３ａ，６０３ｂの実行イメージを転送し、再開可能ポイント通知部３１２に再開可能ポイントを通知する実行イメージ転送部６０４が備えられている。 FIG. 38 is a block diagram illustrating a configuration example of the host 600. As shown in FIG. 38, one or a plurality of applications operate on the host 600. Here, two applications 603a and 603b are illustrated. The applications 603 a and 603 b use the IO management unit 311 to access data in the storage 400. Further, the IO management unit 311 includes a resumable point notifying unit 312 for notifying the storage 400 of resumable points. In addition, an execution image transfer unit 604 that transfers execution images of the applications 603a and 603b to the host 601 and notifies the resumable point notification unit 312 of the resumable point is provided.

本実施の形態におけるアプリケーション６０３ａ，６０３ｂは、再開機能を有さないアプリケーションである。すなわち、ストレージの記憶媒体１０１のデータ記録状態が所定の状態になっていれば処理を再開できるような機能を有していないアプリケーションである。 The applications 603a and 603b in the present embodiment are applications that do not have a resume function. In other words, the application does not have a function that can resume processing if the data recording state of the storage medium 101 of the storage is in a predetermined state.

図３９は、ホスト６０１の構成例を示すブロック図である。図３９に示すように、ホスト６０１には、実行イメージを保存する実行イメージ保存部６０６と、ホスト６００から送られてきた実行イメージを受け取り実行イメージ保存部６０６に保存する実行イメージ受信部６０５と、ホスト６００の状況を監視するホスト監視部６０８と、実行イメージ保存部６０６中の実行イメージを元にアプリケーションを再開させるアプリケーション再開部６０７とが備えられている。実行イメージ保存部６０６は、揮発性半導体メモリ、不揮発性半導体メモリ、磁気ディスク、光磁気ディスク、光ディスク等のデータを保存する媒体である。 FIG. 39 is a block diagram illustrating a configuration example of the host 601. As illustrated in FIG. 39, the host 601 includes an execution image storage unit 606 that stores an execution image, an execution image reception unit 605 that receives the execution image transmitted from the host 600 and stores the execution image in the execution image storage unit 606, A host monitoring unit 608 that monitors the status of the host 600 and an application resuming unit 607 that resumes an application based on the execution image in the execution image storage unit 606 are provided. The execution image storage unit 606 is a medium for storing data such as a volatile semiconductor memory, a nonvolatile semiconductor memory, a magnetic disk, a magneto-optical disk, and an optical disk.

実行イメージは、稼働系の処理実行状態を示す情報であり、各アプリケーションによって実行されるプロセスを他のホストで実行させるのに必要な情報である。実行イメージには、例えば、各プロセスの仮想アドレス空間中のデータ、プロセス管理情報であるレジスタの値、プログラムカウンタの値およびプロセスの状態等である。プロセスが使用しているファイルやプロセス間通信を復元するための情報等が含まれる。また、並列処理によって一つのプロセスを複数のスレッドで実現する場合には、各スレッドのレジスタの値、プログラムカウンタの値、プログラム状態ワードおよび各種フラグ等も実行イメージに含まれる。さらに、実行イメージは、対象プロセスが使用するカーネル中の通信バッファの内容等を含む場合もある。なお、スレッドとは、並列処理における処理単位である。 The execution image is information indicating the processing execution state of the active system; it is the information needed to have the processes executed by each application run on another host. The execution image includes, for example, the data in the virtual address space of each process and process management information such as register values, the program counter value, and the process state. It also includes information for restoring the files used by a process and its inter-process communication. When a single process is realized by a plurality of threads through parallel processing, the execution image also includes each thread's register values, program counter value, program status word, various flags, and the like. The execution image may further include the contents of communication buffers in the kernel used by the target process. A thread is a unit of processing in parallel processing.

Next, the operation will be described. First, the operation at the time of execution image transfer is described. An execution image transfer instruction is input, for example by a user, to the execution image transfer unit 604 of the normal host 600. The instruction may be input at any time. When the instruction is input, the execution image transfer unit 604 acquires the execution images of the applications 603a and 603b. Next, the execution image transfer unit 604 instructs the resumable point notification unit 312 of the IO management unit 311 to notify the storage 400 of a resumable point, and the resumable point notification unit 312 notifies the storage 400 of the resumable point. The execution image transfer unit 604 then transfers the acquired execution image to the execution image reception unit 605 of the host 601. In the host 601, when the execution image reception unit 605 receives the execution image from the execution image transfer unit 604 of the host 600, it stores the received execution image in the execution image storage unit 606.
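The order of the three steps in the transfer operation (acquire the image, notify the resumable point, transfer the image) can be sketched as follows. The function and callback names are hypothetical, and the callbacks merely stand in for the execution image transfer unit 604, the resumable point notification unit 312, and the execution image reception unit 605.

```python
from typing import Callable, List


def transfer_execution_image(
    capture_image: Callable[[], bytes],
    notify_resumable_point: Callable[[], None],
    send_to_standby_host: Callable[[bytes], None],
) -> bytes:
    """Run the transfer steps in the order given in the description."""
    image = capture_image()          # 1. acquire the applications' execution image
    notify_resumable_point()         # 2. storage is told of the resumable point
    send_to_standby_host(image)      # 3. image goes to the standby host
    return image


# Stand-ins that only record what happened (assumptions for the demo).
events: List[str] = []
stored_on_standby: List[bytes] = []


def capture() -> bytes:
    events.append("capture")
    return b"image-of-603a-603b"


def notify() -> None:
    events.append("notify")


def send(image: bytes) -> None:
    events.append("send")
    stored_on_standby.append(image)   # plays the role of storage unit 606


transfer_execution_image(capture, notify, send)
```

Notifying the resumable point before the comparatively slow image transfer is what lets the storage-side state and the host-side image refer to the same instant.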

The storage 400 receives the resumable point notification at the time when transfer of the execution image is instructed, and operates according to that notification. Therefore, the standby storage 401 creates a snapshot as of the time when execution image transfer was instructed on the normal host.

When storages 301 and 302 similar to those of the seventh embodiment (see FIG. 19) are used as the storage, the standby storage 401 maintains the state of its storage medium as it was at the time when execution image transfer was instructed on the normal host.

The execution image may be several hundred MB or more in size, so transferring it to the host 601 takes time. If the applications 603a and 603b continue processing during the transfer, memory and register values change, and the contents of the execution image change partway through the transfer. Therefore, the applications 603a and 603b stop processing while the execution image is being transferred.

Alternatively, the execution image may be saved in a memory, magnetic disk, or the like provided in the host 600 (not shown in FIG. 37), and the saved execution image may then be transmitted to the host 601. Once saving is complete, the contents of the saved execution image are preserved even if processing continues. In this case, therefore, the applications 603a and 603b of the host 600 can continue processing. Saving the execution image to memory or the like takes less time than transferring it. Accordingly, saving the execution image to memory or the like before transferring it shortens the time during which processing is stopped.
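The save-then-transfer variant can be illustrated with a toy model in which the application state is a single dictionary. This is only a sketch under that assumption; `Application`, `save_then_transfer`, and `slow_send` are hypothetical names, and the extra steps taken inside `slow_send` simulate work the application does while the transfer is still in progress.

```python
import copy
from typing import Callable, Dict, List


class Application:
    """Toy application whose whole state is one dictionary (an assumption)."""

    def __init__(self) -> None:
        self.state: Dict[str, int] = {"counter": 0}
        self.paused = False

    def step(self) -> None:
        if not self.paused:
            self.state["counter"] += 1


def save_then_transfer(app: Application,
                       send: Callable[[Dict[str, int]], None]) -> None:
    app.paused = True                   # processing stops only during the local save
    saved = copy.deepcopy(app.state)    # short local save of the execution image
    app.paused = False                  # processing may continue from here
    send(saved)                         # the slow transfer works on the saved copy


app = Application()
received: List[Dict[str, int]] = []


def slow_send(image: Dict[str, int]) -> None:
    for _ in range(3):                  # the application keeps running meanwhile
        app.step()
    received.append(image)


app.step()
app.step()
save_then_transfer(app, slow_send)
```

The image that arrives reflects the state at the moment of the save, even though the application advanced during the transfer.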

The case described here is one in which the execution image transfer unit 604 transfers the execution image in response to an instruction from the user. However, the timing at which the execution image transfer unit 604 starts the execution image transfer process is not limited to the time when the user inputs an instruction. For example, the execution image transfer process may be started at fixed time intervals. The transfer process may also be started again each time a transfer of the execution image finishes. Alternatively, the transfer process may be started when the application becomes idle, or when the idle time of the application exceeds a certain value.
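The alternative start conditions listed above can be gathered into one decision function. The following is a sketch only; `TriggerPolicy`, its fields, and `should_start_transfer` are hypothetical names, and an idle threshold of zero is used here to represent "when the application becomes idle".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TriggerPolicy:
    interval_s: Optional[float] = None        # start at fixed time intervals
    restart_on_completion: bool = False       # start again when a transfer ends
    idle_threshold_s: Optional[float] = None  # start once idle time exceeds this


def should_start_transfer(policy: TriggerPolicy, *,
                          user_requested: bool,
                          seconds_since_last_start: float,
                          previous_transfer_just_finished: bool,
                          app_idle_s: float) -> bool:
    """Return True if any enabled start condition holds."""
    if user_requested:
        return True
    if policy.interval_s is not None and seconds_since_last_start >= policy.interval_s:
        return True
    if policy.restart_on_completion and previous_transfer_just_finished:
        return True
    if policy.idle_threshold_s is not None and app_idle_s > policy.idle_threshold_s:
        return True
    return False


periodic = TriggerPolicy(interval_s=60.0)
on_idle = TriggerPolicy(idle_threshold_s=5.0)
```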

Next, the operation when a disaster occurs is described. When the host 600 becomes unusable due to a disaster or the like, the storage 401 is switched from the standby system to the normal system, and the host 601 takes over processing from the host 600.

The operation of the host 601 from failure detection to application resumption is as follows. First, the host monitoring unit 608 in the host 601 detects an abnormality of the host 600. For example, the host monitoring unit 608 communicates with the host 600 constantly or periodically, and recognizes that an abnormality has occurred when it has been unable to communicate with the host 600 for a certain period of time. Alternatively, when an abnormality is reported by the host 600, it may recognize that a disaster has occurred at the host 600.

When the host monitoring unit 608 detects an abnormality of the host 600, it issues a snapshot restoration command to the storage 401, which is the standby system. It then waits for a response to the snapshot restoration command and, upon receiving the response, instructs the application resuming unit 607 to resume the application. The application resuming unit 607 resumes the operation of the application using the execution image stored in the execution image storage unit 606. This example shows the case where storages 400 and 401 similar to those of the fifth embodiment are used as the storage. When storages 301 and 302 similar to those of the seventh embodiment (see FIG. 19) are used, the host monitoring unit 608 may output a delayed data discard command instead of the snapshot restoration command.
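The detection and takeover sequence of the host monitoring unit 608 can be sketched as a small state machine. Everything named below (`HostMonitor`, its methods, the callbacks, and the use of plain numbers as timestamps) is a hypothetical illustration; the `restore_standby_storage` callback stands in for issuing either the snapshot restoration command or the delayed data discard command and waiting for the response.

```python
from typing import Callable, List


class HostMonitor:
    """Detects loss of the normal host and drives the takeover steps in order."""

    def __init__(self, timeout_s: float,
                 restore_standby_storage: Callable[[], bool],
                 resume_application: Callable[[], None],
                 now: float = 0.0) -> None:
        self.timeout_s = timeout_s
        self.restore_standby_storage = restore_standby_storage
        self.resume_application = resume_application
        self.last_heard = now
        self.failed_over = False

    def on_heartbeat(self, now: float) -> None:
        self.last_heard = now             # normal host answered

    def on_abnormality_report(self) -> bool:
        return self._fail_over()          # normal host reported an abnormality

    def poll(self, now: float) -> bool:
        if now - self.last_heard > self.timeout_s:
            return self._fail_over()      # no communication for too long
        return False

    def _fail_over(self) -> bool:
        if self.failed_over:
            return False                  # take over only once
        if not self.restore_standby_storage():   # wait for the storage response
            raise RuntimeError("standby storage did not acknowledge the command")
        self.resume_application()         # only after the storage has responded
        self.failed_over = True
        return True


log: List[str] = []
monitor = HostMonitor(
    timeout_s=3.0,
    restore_standby_storage=lambda: log.append("restore") or True,
    resume_application=lambda: log.append("resume"),
)
monitor.on_heartbeat(1.0)
within_timeout = monitor.poll(2.0)    # 1 s of silence: no action
took_over = monitor.poll(5.0)         # 4 s of silence: takeover
```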

When resuming the processing of an application, if the execution image contains information that must be changed depending on the host on which the application runs, the application resuming unit 607 changes that information at the time of restoration.

In the ninth embodiment, the normal storage receives the resumable point notification at the time when transfer of the execution image is instructed on the normal host. Therefore, the standby storage either creates a snapshot as of the time when transfer of the execution image was instructed on the normal host, or maintains the state of its storage medium as of that time. Further, the standby host holds the execution image of the normal host as of the time when transfer of the execution image was instructed. Consequently, even if the application does not implement a resume function, the standby storage can reproduce the state of the normal storage at a given point in time and the standby host can reproduce the execution image at that same point, so processing can be resumed quickly on the standby system.

In the ninth embodiment, the execution image transfer unit 604 may transfer only the difference from the previously transferred execution image instead of transferring the entire execution image. Transferring only the portions that have changed since the previous transfer shortens the transfer time.

Computer systems that use virtual memory often employ a translation table between logical addresses and physical addresses for the addresses that are most likely to be used. This translation table is called a TLB (Translation Look-aside Buffer). The TLB manages, for each translated address, update information indicating whether the data at that address has been rewritten. This update information is generally called a dirty flag or dirty bit. For example, "Computer Architecture: A Quantitative Approach to Design, Implementation, and Evaluation" (David Patterson and John Hennessy, translated by Shinji Tomita and two others, Nikkei BP, February 1994, pp. 442-443) refers to this update information as a dirty bit. The execution image transfer unit 604 need only transmit, as the execution image, the information corresponding to dirty flags that indicate a change.
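Differential transfer based on dirty flags can be sketched in software as follows. This is an assumption-laden model: in a real system the dirty bit is set by the memory management hardware, whereas here `TrackedMemory.write` sets it explicitly, and `PAGE_SIZE`, `TrackedMemory`, and `collect_delta` are hypothetical names chosen for the illustration.

```python
from typing import Dict, List

PAGE_SIZE = 8  # tiny assumed page size to keep the demo readable


class TrackedMemory:
    """Memory pages with one dirty flag per page."""

    def __init__(self, num_pages: int) -> None:
        self.pages: List[bytes] = [bytes(PAGE_SIZE) for _ in range(num_pages)]
        self.dirty: List[bool] = [True] * num_pages   # nothing transferred yet

    def write(self, page_no: int, data: bytes) -> None:
        if len(data) != PAGE_SIZE:
            raise ValueError("data must be exactly one page")
        self.pages[page_no] = data
        self.dirty[page_no] = True

    def collect_delta(self) -> Dict[int, bytes]:
        """Return only the pages changed since the previous transfer."""
        delta = {n: self.pages[n]
                 for n, is_dirty in enumerate(self.dirty) if is_dirty}
        self.dirty = [False] * len(self.pages)        # clear flags after collecting
        return delta


memory = TrackedMemory(num_pages=4)
replica: Dict[int, bytes] = {}       # copy held on the standby host

first = memory.collect_delta()       # first transfer carries every page
replica.update(first)

memory.write(2, b"ABCDEFGH")
second = memory.collect_delta()      # later transfer carries only page 2
replica.update(second)
```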

In virtual memory, placing a page needed for program execution in the real storage device is called page-in, and removing a page from the real storage device and placing it in the virtual storage device is called page-out. When instructed to transfer the execution image, the execution image transfer unit 604 may transmit the data of the memory that is paged in to the host 601 and thereafter, each time a page-out occurs, transmit the data of the paged-out memory to the host 601. Paged-out memory data does not change until it is next paged in. Transmitting the execution image in this way therefore reduces the amount of execution image data transmitted.
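A sketch of this page-out-driven transmission, assuming the sender is notified of page-in and page-out events, might look like the following. `PagedImageSender` and its method names are hypothetical, and the `send` callback stands in for transmission to the host 601.

```python
from typing import Callable, Dict, List, Tuple


class PagedImageSender:
    """Sends resident pages on request, then each page again when it is paged out."""

    def __init__(self, send: Callable[[int, bytes], None]) -> None:
        self.send = send
        self.resident: Dict[int, bytes] = {}   # pages currently paged in

    def page_in(self, page_no: int, data: bytes) -> None:
        self.resident[page_no] = data

    def modify(self, page_no: int, data: bytes) -> None:
        self.resident[page_no] = data          # only resident pages can change

    def on_transfer_instruction(self) -> None:
        for page_no, data in sorted(self.resident.items()):
            self.send(page_no, data)

    def page_out(self, page_no: int) -> None:
        data = self.resident.pop(page_no)
        self.send(page_no, data)               # contents are now fixed until page-in


sent: List[Tuple[int, bytes]] = []
sender = PagedImageSender(lambda n, d: sent.append((n, d)))
sender.page_in(0, b"a")
sender.page_in(1, b"b")
sender.on_transfer_instruction()
sender.modify(1, b"B")
sender.page_out(1)
```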

FIGS. 40 and 41 are block diagrams illustrating configuration examples in which the seventh to ninth embodiments are applied to a client-server system. Normally, the host 300 functions as the server for the clients 500-1 to 500-n. When an abnormality occurs in the host 300 or the storage 400 due to a disaster or the like, the host 303 goes into operation. After the host 303 starts operating, each client treats the host 303 as the server and sends its processing requests to the host 303.

When the host 300 and the host 303 are directly connected as shown in FIG. 40, the host 303 monitors the state of the host 300 as described in the seventh to ninth embodiments. When the host 300 and the host 303 are not directly connected, as shown in FIG. 41, the clients 500-1 to 500-n may monitor the state of the host 300 and, when an abnormality occurs in the host 300, notify the host 303 of the abnormality.

In the seventh to ninth embodiments, data may be transmitted to the standby storage via a relay device, as in the first to fourth embodiments. In that case, the effects of the first to fourth embodiments are also obtained. In addition, when data is transmitted and received between storages, or between a storage and a relay device, a redundant data group may be transmitted and received as shown in the fifth or sixth embodiment. In that case, the same effects as those of the fifth or sixth embodiment are also obtained.
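For orientation, one simple way to form a redundant data group is to divide the original data into pieces and add a single XOR parity piece, so that any one lost piece can be rebuilt from the rest. The sketch below assumes that scheme purely for illustration; it is not taken from the fifth or sixth embodiment, and `make_group` and `restore` are hypothetical names.

```python
from functools import reduce
from typing import List, Optional, Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def make_group(original: bytes, k: int) -> Tuple[List[bytes], bytes]:
    """Split the original data into k equal pieces and compute one parity piece."""
    size = -(-len(original) // k)                      # ceiling division
    pieces = [original[i * size:(i + 1) * size].ljust(size, b"\0")
              for i in range(k)]
    parity = reduce(xor_bytes, pieces)
    return pieces, parity


def restore(pieces: List[Optional[bytes]], parity: bytes, length: int) -> bytes:
    """Rebuild the original data when at most one piece was discarded in transit."""
    missing = [i for i, p in enumerate(pieces) if p is None]
    if len(missing) > 1:
        raise ValueError("a single parity piece can recover only one lost piece")
    repaired = list(pieces)
    if missing:
        present = [p for p in pieces if p is not None]
        repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return b"".join(repaired)[:length]


original = b"replicated block"
pieces, parity = make_group(original, k=4)

received: List[Optional[bytes]] = list(pieces)
received[2] = None                                     # one piece lost in transit
recovered = restore(received, parity, len(original))
```

Because each piece and the parity piece travel as separate transmission units, the receiver can rebuild the data without waiting for a retransmission of the lost piece.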

FIG. 1 is a block diagram showing a first embodiment of the data replication system.
FIG. 2 is a block diagram showing a configuration example of the storage.
FIG. 3 is a block diagram showing a configuration example of the relay device.
FIG. 4 is a flowchart showing the operation of the processing sequencer.
FIG. 5 is a flowchart showing the operation of the relay processing unit.
FIG. 6 is a flowchart showing the operation of the relay processing unit of the second embodiment.
FIG. 7 is a flowchart showing the operation of the processing sequencer of the third embodiment.
FIG. 8 is a block diagram showing a fourth embodiment of the data replication system.
FIG. 9 is a flowchart showing the operation of the processing sequencer.
FIG. 10 is a flowchart showing the operation of the storage.
FIG. 11 is a flowchart showing the processing of step S221 in detail.
FIG. 12 is a timing diagram showing an example of data transfer.
FIG. 13 is a block diagram showing a fifth embodiment of the data replication system.
FIG. 14 is a block diagram showing a configuration example of the storage.
FIG. 15 is a flowchart showing the operation of the processing sequencer.
FIG. 16 is a flowchart showing the operation of the communication unit.
FIG. 17 is a block diagram showing another configuration example of the data replication system.
FIG. 18 is a flowchart showing the operation of the processing sequencer of the sixth embodiment.
FIG. 19 is a block diagram showing a seventh embodiment of the data replication system.
FIG. 20 is a block diagram showing a configuration example of the host.
FIG. 21 is a block diagram showing a configuration example of the storage.
FIG. 22 is an explanatory diagram showing an example of a synchronization ID and an issue ID.
FIG. 23 is a flowchart showing the operation of the processing sequencer.
FIG. 24 is a flowchart showing the operation when the processing sequencer receives a delayed write command.
FIG. 25 is a flowchart showing the operation of the processing sequencer.
FIG. 26 is a flowchart showing the operation of the delayed data reflection process.
FIG. 27 is a flowchart showing the delayed data discard process.
FIG. 28 is a block diagram showing an eighth embodiment of the data replication system.
FIG. 29 is a block diagram showing a configuration example of the storage.
FIG. 30 is an explanatory diagram showing an example of an address table.
FIG. 31 is an explanatory diagram of an entry.
FIG. 32 is a flowchart showing the operation of the processing sequencer.
FIG. 33 is a flowchart showing the operation of the processing sequencer.
FIG. 34 is a flowchart showing the operation of the LBA management unit.
FIG. 35 is a flowchart showing the operation of the processing sequencer.
FIG. 36 is a flowchart showing the operation of the processing sequencer.
FIG. 37 is a block diagram showing a ninth embodiment of the data replication system.
FIG. 38 is a block diagram showing a configuration example of the host.
FIG. 39 is a block diagram showing a configuration example of the host.
FIG. 40 is a block diagram showing a configuration example in which the present invention is applied to a client-server system.
FIG. 41 is a block diagram showing a configuration example in which the present invention is applied to a client-server system.

Explanation of symbols

10 Host
11, 12 Storage
13, 14 Network
15 Relay device
101 Storage medium
102 Communication unit
103 Processing sequencer
104 IO scheduler
105 Medium control unit
106 Buffer memory
151 Relay processing unit
150 Communication unit
152 Buffer memory

Claims

1. A data transmission/reception method for transmitting data from a transmission source that transmits data to a transmission destination that receives data,
characterized in that the transmission source creates at least one piece of redundant data for error correction from the original data to be transmitted, and transmits the original data and the redundant data in different data transmission units.

2. The data transmission/reception method according to claim 1, wherein the transmission destination executes error correction processing, before completing reception of the entire data group that is the set of the original data and the redundant data, at the stage where it has received a part of the data group with which error correction processing can be partially performed on the original data.

3. The data transmission/reception method according to claim 1 or 2, wherein the transmission source divides the original data into divided data, and creates redundant data from which the original data can be restored even if one or more pieces of the divided data are lost.

4. The data transmission / reception method according to claim 3, wherein parity data or ECC is used as redundant data.

5. The data transmission/reception method according to any one of claims 1 to 3, wherein duplicate data of the transmission data is used as redundant data.

6. The data transmission/reception method according to any one of claims 1 to 5, wherein the original data and the redundant data are sent to different communication networks.

7. A data replication system that mirrors or backs up data in a first storage to a second storage via a communication network, wherein
the first storage includes data transfer processing means for controlling data transfer, and redundancy means for creating redundant data for error correction from the original data to be transmitted, and
the data transfer processing means transmits the original data and the redundant data created by the redundancy means in separate data transmission units.

8. The data replication system according to claim 7, wherein the second storage includes data restoring means for performing error correction processing using the redundant data received from the first storage, and storage processing means for storing the data restored by the data restoring means in a storage medium, and
the data restoring means executes error correction processing, before completing reception from the first storage of the entire data group that is the set of the original data and the redundant data, at the stage where it has received a part of the data group with which error correction processing can be partially performed on the original data.

9. The data replication system according to claim 8, wherein the redundancy means in the first storage divides the original data into divided data, and creates redundant data from which the original data can be restored even if one or more pieces of the divided data are lost.

10. The data replication system according to claim 9, wherein the redundancy means uses parity data or ECC as redundant data.

11. The data replication system according to any one of claims 7 to 9, wherein the redundancy means creates duplicate data of the original data as redundant data.

12. The data replication system according to any one of claims 7 to 11, wherein the data transfer processing means sends the original data and the redundant data to different communication networks.

13. A program for replicating data in a storage, the program causing a computer provided in a first storage, in a data replication system that mirrors or backs up data in the first storage to a second storage via a communication network, to execute:
processing for creating at least one piece of redundant data for error correction from the original data to be transmitted; and
processing for transmitting the original data and the redundant data in separate data transmission units.