KR101713537B1

KR101713537B1 - Data replication method and data storage system for processing state machine based replication guaranteeing consistency of data and distributed recovery using check point data and replication log

Info

Publication number: KR101713537B1
Application number: KR1020140100361A
Authority: KR
Inventors: 차민규; 한승후; 이규재
Original assignee: 네이버 주식회사
Priority date: 2014-08-05
Filing date: 2014-08-05
Publication date: 2017-03-09
Also published as: KR20160016358A

Abstract

Disclosed are a data replication method and a data storage system that perform state machine-based replication guaranteeing data consistency, and distributed recovery using checkpoint data and a replication log. The data replication method exploits the property that a state transition of a memory store, where the state of a memory store means the set of states of the stored data, occurs deterministically through execution of the operations provided for the memory store, and that the results of those operations are likewise identical. By executing the operations on two memory stores in first-in-first-out (FIFO) order, the method can perform replication that guarantees consistency between the two memory stores.

Description

DATA REPLICATION METHOD AND DATA STORAGE SYSTEM FOR PROCESSING STATE MACHINE BASED REPLICATION GUARANTEEING CONSISTENCY OF DATA AND DISTRIBUTED RECOVERY USING CHECK POINT DATA AND REPLICATION LOG

Embodiments of the present invention relate to a data replication method and a data storage system that perform state machine-based replication guaranteeing data consistency, and distributed recovery using checkpoint data and a replication log.

Distributed storage of data across devices connected through a network is cost-effective and is used as reliable storage for large amounts of data. In such distributed data storage systems, methods have been developed that copy the same data item to multiple devices connected through the network in order to guarantee data consistency, such as data availability in the face of temporary or permanent data loss. Copying and storing the same data on multiple devices in this way is called data replication. Data replication can be used to guard against device breakdown, failure, or the risk of temporary or permanent data loss.

Such data replication can be performed asynchronously or synchronously.

Asynchronous data replication costs less than synchronous data replication but provides lower data availability. For example, when a client changes data stored in a first device, the changed data is not immediately reflected in a second device that replicates the data of the first device. If the first device then fails, there is a risk that the changed data is permanently lost.

Synchronous data replication, by contrast, provides higher availability than asynchronous data replication but costs more: because data is replicated to a medium such as a disk every time the data changes, a time delay arises in the replication flow.

Provided are a data replication method and a data storage system that exploit the property that a state transition of a memory store, where the state of a memory store means the set of states of the stored data, occurs deterministically through execution of the operations provided for the memory store, and that the results of those operations are likewise identical. By executing the operations on two memory stores in first-in-first-out (FIFO) order, replication that guarantees consistency between the two memory stores can be performed.

Provided are a data replication method and a data storage system that can minimize the time delay in the replication flow by reading from and/or writing to the replication log as a memory-mapped file, and by flushing to disk periodically and/or based on the size of the data written to the replication log.
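As a non-limiting illustration, the following minimal Python sketch writes to a fixed-size log file through a memory mapping and flushes to disk only when a size threshold or a time interval is reached. The class name `MappedLog`, the parameters `flush_bytes` and `flush_interval`, and their default values are assumptions made for this example and do not appear in the embodiments.

```python
import mmap
import os
import tempfile
import time


class MappedLog:
    """Fixed-size log file written through a memory mapping (illustrative only)."""

    def __init__(self, path, size, flush_bytes=4096, flush_interval=1.0):
        with open(path, "wb") as f:
            f.truncate(size)                  # pre-allocate the fixed-size file
        self._file = open(path, "r+b")
        self._map = mmap.mmap(self._file.fileno(), size)
        self.pos = 0                          # next write position
        self.flush_bytes = flush_bytes        # assumed size threshold
        self.flush_interval = flush_interval  # assumed period in seconds
        self._unflushed = 0
        self._last_flush = time.monotonic()
        self.flush_count = 0

    def append(self, data: bytes) -> int:
        """Copy data into the mapping; flush only when a threshold is reached."""
        if self.pos + len(data) > len(self._map):
            raise ValueError("log file is full")
        start = self.pos
        self._map[start:start + len(data)] = data
        self.pos += len(data)
        self._unflushed += len(data)
        elapsed = time.monotonic() - self._last_flush
        if self._unflushed >= self.flush_bytes or elapsed >= self.flush_interval:
            self.flush()
        return start

    def flush(self):
        self._map.flush()                     # write dirty pages back to the file
        self._unflushed = 0
        self._last_flush = time.monotonic()
        self.flush_count += 1

    def close(self):
        self.flush()
        self._map.close()
        self._file.close()


# Usage: two small writes; only the second one crosses the size threshold.
with tempfile.TemporaryDirectory() as directory:
    log = MappedLog(os.path.join(directory, "replication.log"), size=1024,
                    flush_bytes=8, flush_interval=3600)
    log.append(b"abc")
    log.append(b"defgh")
    log.close()
```

In this sketch an individual write is only a memory copy, so the cost of reaching the disk is paid once per flush rather than once per operation.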

Provided are a data replication method and a data storage system that can recover lost data by maintaining availability at the level of the replication factor: when the same log sequence number (LSN) is received from a number of storage units equal to or greater than a preset replication factor, that LSN is determined to be the commit LSN, and operations on the memory store are executed sequentially up to the determined commit LSN.

Provided are a data replication method and a data storage system that can handle restart recovery of a memory store using checkpoint data that includes the state of the memory store and the LSN corresponding to the position of the last operation executed on the memory store.

Provided is a data replication method of a data storage system including a plurality of storage units, the method comprising: receiving, at a master storage unit among the plurality of storage units, an operation requested for at least one of a master memory store included in the master storage unit and a slave memory store included in a slave storage unit, the slave storage unit being a remaining storage unit among the plurality of storage units; at the master storage unit, storing the received operation in a master replication log included in the master storage unit, further storing in the master replication log information on the position, in the master replication log, of the operation to be executed, and transferring the master replication log to a slave replication log included in the slave storage unit; and at the master storage unit and the slave storage unit, executing the same operation stored in the master replication log and the slave replication log on the master memory store and the slave memory store, based on the information on the position of the operation to be executed.

Provided is a data replication method of a master storage device, the method comprising: receiving, at the master storage device, an operation requested for at least one of a master memory store included in the master storage device and a slave memory store included in a slave storage device; at the master storage device, storing the received operation in a master replication log included in the master storage device, further storing in the master replication log information on the position, in the master replication log, of the operation to be executed, and transferring the master replication log to the slave storage device; and at the master storage device, executing the operation stored in the master replication log on the master memory store based on the information on the position of the operation to be executed, wherein the master memory store and the slave memory store have the same set of data states and produce the same operation results for the stored operations.

Provided is a data replication method of a slave storage device, the method comprising: transmitting, from the slave storage device to a master storage device, an operation requested for a slave memory store included in the slave storage device; receiving, at the slave storage device, a master replication log that includes at least one of that operation and an operation requested for a master memory store included in the master storage device, together with information on the position, in the master replication log included in the master storage device, of the operation to be executed; and at the slave storage device, executing the operation stored in a slave replication log of the slave storage device on the slave memory store based on the information on the position of the operation to be executed, wherein the master memory store and the slave memory store have the same set of data states and produce the same operation results for the stored operations.

Provided is a data storage system including a plurality of storage units, wherein each of the plurality of storage units includes a replicator and a memory store; a master replicator included in a master storage unit among the plurality of storage units determines, in a first-in-first-out (FIFO) manner, the execution order of operations to be executed on the memory stores included in the respective storage units, stores the operations in a master replication log included in the master storage unit, and further stores in the master replication log information on the position, in the master replication log, of the operation to be executed; a slave replicator of a slave storage unit among the plurality of storage units receives the master replication log and stores it in a slave replication log included in the slave storage unit; and the master replicator and the slave replicator execute the same operation stored in the master replication log and the slave replication log on the memory store included in the master storage unit and the memory store included in the slave storage unit, based on the information on the position of the operation to be executed.

Provided is a master storage device comprising: a master memory store; a master replicator that receives an operation requested for at least one of the master memory store and a slave memory store included in a slave storage device; and a client library, wherein the master replicator stores the received operation in a master replication log and further stores in the master replication log information on the position of the operation to be executed, the client library executes the operation stored in the master replication log on the master memory store based on the information on the position of the operation to be executed, and the master memory store and the slave memory store have the same set of data states and produce the same operation results for the stored operations.

Provided is a slave storage device comprising: a slave memory store; a client library that transmits an operation requested for the slave memory store to a master storage device; and a slave replicator that receives a master replication log included in the master storage device, the master replication log including at least one of that operation and an operation requested for a master memory store included in the master storage device, together with information on the position, in the master replication log, of the operation to be executed, wherein the client library executes the operation stored in a slave replication log included in the slave storage device on the slave memory store based on the information on the position of the operation to be executed, and the master memory store and the slave memory store have the same set of data states and produce the same operation results for the stored operations.

By exploiting the property that a state transition of a memory store, where the state of a memory store means the set of states of the stored data, occurs deterministically through execution of the operations provided for the memory store, and that the results of those operations are likewise identical, and by executing the operations on two memory stores in first-in-first-out (FIFO) order, replication that guarantees consistency between the two memory stores can be performed.

By reading from and/or writing to the replication log as a memory-mapped file, and by flushing to disk periodically and/or based on the size of the data written to the replication log, the time delay in the replication flow can be minimized.

When the same log sequence number (LSN) is received from a number of storage units equal to or greater than a preset replication factor, that LSN is determined to be the commit LSN, and operations on the memory store are executed sequentially up to the determined commit LSN. Availability is thereby maintained at the level of the replication factor, so that lost data can be recovered.

Restart recovery of a memory store can be handled using checkpoint data that includes the state of the memory store and the LSN corresponding to the position of the last operation executed on the memory store.

FIG. 1 is a block diagram illustrating an example of the overall configuration of a data storage system according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating an example of a replication log according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating an example of a replication scheme according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating an example of a process of transferring a replication log according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating an example of a process of determining and delivering a COMMIT LSN according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating state transitions of a replicator, including restart recovery of the replicator and restart recovery of the memory store, according to an embodiment of the present invention.
FIG. 7 is a block diagram illustrating the internal configuration of a master storage device according to an embodiment of the present invention.
FIG. 8 is a flowchart illustrating a data replication method of a master storage device according to an embodiment of the present invention.
FIG. 9 is a block diagram illustrating the internal configuration of a slave storage device according to an embodiment of the present invention.
FIG. 10 is a flowchart illustrating a data replication method of a slave storage device according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

Embodiments of the present invention relate to a high-performance replication method that executes commands in first-in-first-out order to guarantee consistency between memory stores that are defined and operated as state machines in a data storage system, and to a distributed recovery method using checkpoint data and a replication log.

A state machine is a computational model defined in computer science. In embodiments of the present invention it may refer to the memory store described above, and it may satisfy the following properties.

1. The state of a memory store means the set of states of the stored data.

2. State transitions occur deterministically through execution of the operations provided by the memory store, and the results of the operations are likewise identical.

When the same operations are executed sequentially, in the same order, on two memory stores, the data (state) of the two memory stores becomes identical by the definition of a state machine. In embodiments of the present invention, the data (state) of two memory stores can be replicated by executing the commands for the two memory stores in first-in-first-out (FIFO) order.
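As a non-limiting illustration of these two properties, the following minimal Python sketch models a memory store as a deterministic state machine and applies one FIFO sequence of operations to two stores. The class `MemoryStore` and the operation names `set`, `incr`, and `del` are hypothetical and are chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """A toy in-memory store treated as a deterministic state machine."""
    state: dict = field(default_factory=dict)

    def apply(self, op: tuple) -> object:
        """Apply one operation; the result depends only on the state and the op."""
        kind, key, *rest = op
        if kind == "set":
            self.state[key] = rest[0]
            return "OK"
        if kind == "incr":
            self.state[key] = self.state.get(key, 0) + 1
            return self.state[key]
        if kind == "del":
            return self.state.pop(key, None)
        raise ValueError(f"unknown operation: {kind}")


# The same operations, applied in the same FIFO order, on two stores.
operations = [("set", "a", 1), ("incr", "a"), ("set", "b", "x"), ("del", "b")]
master, slave = MemoryStore(), MemoryStore()
master_results = [master.apply(op) for op in operations]
slave_results = [slave.apply(op) for op in operations]
```

Because `apply` uses no input other than the current state and the operation, both stores end in the same state and return the same results, which is the property the replication relies on.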

The consistency that embodiments of the present invention provide for changes to the memory store may be sequential consistency. This may mean that the result of all operations executed in the whole system is the same as the result of executing those commands in some particular sequential order. For example, every client can immediately read the result of a change operation (for example, a write) that it performed on the memory store.

FIG. 1 is a block diagram illustrating an example of the overall configuration of a data storage system according to an embodiment of the present invention. In FIG. 1, the data storage system 100 is shown with a configuration master 110, a master storage unit 120, and a slave storage unit 130.

The data storage system 100 is a system that stores data in a distributed manner, and may include storage units 120 and 130 for preventing data loss caused by failures, such as an interruption of the power supply, in the devices that store the data.

As shown in FIG. 1, the storage units 120 and 130 may consist of a master storage unit 120 that performs the master role and a slave storage unit 130 that performs the slave role. The slave storage unit 130 may also be implemented as a plurality of storage units having the same configuration. For example, the data storage system 100 may include a plurality of storage units, one of which may be the master storage unit 120, while two or more of the remaining storage units may each be implemented as slave storage units.

In addition, each of the storage units 120 and 130 may be implemented as a single storage device. For example, the storage units 120 and 130 may be implemented as separate storage devices, each with its own power supply, that communicate through a network.

In addition, as shown in FIG. 1, the storage units 120 and 130 may include replicators (a master replicator 121 and a slave replicator 131), client libraries 122 and 132, memory stores 123 and 133, replication logs 124 and 134, and checkpoint data 125 and 135. The client libraries 122 and 132 may be implemented embedded in the memory stores 123 and 133, and the checkpoint data 125 and 135, which serve data recovery, may optionally be included in the storage units 120 and 130.

The master replicator 121 may determine, in first-in-first-out order, the operations (commands) to be executed on the memory stores 123 and 133. The determined operations may be stored in the replication log 124 of the master storage unit 120 and in the replication log 134 of the slave storage unit 130. As already described, when there are multiple storage units performing the slave role, each of them may include a slave replicator and a replication log. In this case, the operations determined by the master replicator 121 may be stored in the replication log of each slave replicator.

The client library 132 of the slave storage unit 130 may forward the operations to be executed on the memory store 133 to the master replicator 121, receive from the local replicator 131 information on the portion of the replication log 134 to be executed, read the operations to be executed from the replication log 134, and process them on the memory store 133. Here, "local" may refer to a component included in the same storage unit. For example, the "local replicator" of the client library 132 may refer to the replicator 131, and the "local memory store" of the client library 122 may refer to the memory store 123.

The configuration master 110 may monitor the status of the replicators 121 and 131 and, when a failure occurs in the replicators 121 and 131, perform processing to reconfigure the replication relationship. The configuration master 110 is described in more detail later.

As already described, the memory stores 123 and 133 are stores defined as state machines, and may keep all of their state resident in memory.

The replication logs 124 and 134 may refer to data in which the operations delivered to the replicators 121 and 131, and information on the position of the operation to be executed, are stored. The replication logs 124 and 134 are described in more detail later.

The checkpoint data 125 and 135 may refer to point-in-time data of the memory stores 123 and 133 saved as data on disk.
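As a non-limiting illustration, the following minimal Python sketch shows how checkpoint data that holds the store state together with the LSN of the last executed operation could be used for restart recovery: the state is loaded from disk and only the logged operations with a larger LSN are replayed. The JSON file format, the function names `write_checkpoint` and `recover`, and the representation of a log entry as a `(lsn, key, value)` tuple are assumptions made for this example.

```python
import json
import os
import tempfile


def write_checkpoint(path: str, state: dict, last_lsn: int) -> None:
    """Persist the store state together with the LSN of the last applied operation."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump({"lsn": last_lsn, "state": state}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # rename into place so a partial file is never read


def recover(path: str, log: list) -> tuple:
    """Load the checkpoint, then replay logged operations with a larger LSN."""
    with open(path, encoding="utf-8") as f:
        checkpoint = json.load(f)
    state, lsn = checkpoint["state"], checkpoint["lsn"]
    for entry_lsn, key, value in log:
        if entry_lsn > lsn:          # already reflected in the checkpoint otherwise
            state[key] = value
            lsn = entry_lsn
    return state, lsn


# Usage: the checkpoint was taken after LSN 1; LSNs 2 and 3 are replayed.
replication_log = [(0, "a", 0), (1, "a", 1), (2, "b", 5), (3, "a", 9)]
with tempfile.TemporaryDirectory() as directory:
    checkpoint_path = os.path.join(directory, "checkpoint.json")
    write_checkpoint(checkpoint_path, {"a": 1}, last_lsn=1)
    recovered_state, recovered_lsn = recover(checkpoint_path, replication_log)
```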

All data transmitted from the master replicator 121 to the slave replicator 131 may be stored in the replication log 134. The replication log 134 may consist of fixed-size files, each broadly divided into a portion in which data is stored and a portion in which metadata is stored.

FIG. 2 is a diagram illustrating an example of a replication log according to an embodiment of the present invention. FIG. 2 shows an example in which the replication log 200 is implemented as a set of files 230 and 240, each consisting of a data portion 210 and a metadata portion 220. The data portion 210 may be the portion in which the replicated data itself is stored, and the metadata portion 220 may include a checksum for the metadata and information on the position up to which replicated data has been stored in the replication log 200.

Because each of the files 230 and 240 included in the replication log 200 has a fixed size, the position of any message in the replication log can be expressed as a number. This number may be called the log sequence number (LSN). The name of each of the files 230 and 240 included in the replication log 200 may be defined as the LSN at which its data portion begins. Conversely, given an LSN indicating the position of a message, it is possible to determine in which file of the replication log 200, and at which position in that file, the message is stored.
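As a non-limiting illustration, the following minimal Python sketch maps an LSN back to a file and a position within that file. It assumes that the LSN counts bytes of data, that every file has a data portion of `FILE_DATA_SIZE` bytes, and that file names are the zero-padded starting LSN; the constant, the padding width, and the function name `locate` are hypothetical.

```python
FILE_DATA_SIZE = 64 * 1024 * 1024  # assumed size of each file's data portion


def locate(lsn: int, file_data_size: int = FILE_DATA_SIZE) -> tuple:
    """Return (file name, offset within the data portion) for an LSN."""
    if lsn < 0:
        raise ValueError("LSN must be non-negative")
    start_lsn = (lsn // file_data_size) * file_data_size
    # The file is named after the LSN at which its data portion begins.
    return f"{start_lsn:020d}", lsn - start_lsn


# Usage with a small data portion of 100 bytes per file.
file_name, offset = locate(250, file_data_size=100)
```

With 100 bytes of data per file, LSN 250 falls in the file whose data portion begins at LSN 200, at offset 50.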

Besides the operations requested by the client library included in the same storage unit as the replication log 200, which are stored for the purpose of transmission, the data stored in the replication log 200 may include messages of the replication protocol described later. Table 1 below shows information on the data stored in the replication log 200.

Message type | Fields | Description
REP_DATA | NID: identifier of the replicator; LENGTH: length of the operation; DATA: operation data | The value of the DATA field is the content of the operation.
REP_COMMIT | COMMIT_LSN: accepted LSN | Indicates the sequence number of the replication log for which availability has been secured in the replication protocol. Operations up to the LSN specified in COMMIT_LSN are executed on the memory store.
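As a non-limiting illustration, the following minimal Python sketch encodes and decodes the two message types. The byte layout is an assumption made for this example and is not specified by the embodiments: a 1-byte type tag, a 4-byte NID, a 4-byte LENGTH, an 8-byte COMMIT_LSN, all big-endian, with the numeric tags 1 and 2.

```python
import struct

REP_DATA, REP_COMMIT = 1, 2  # assumed numeric type tags


def encode_rep_data(nid: int, data: bytes) -> bytes:
    # type (1 byte) | NID (4 bytes) | LENGTH (4 bytes) | DATA
    return struct.pack(">BII", REP_DATA, nid, len(data)) + data


def encode_rep_commit(commit_lsn: int) -> bytes:
    # type (1 byte) | COMMIT_LSN (8 bytes)
    return struct.pack(">BQ", REP_COMMIT, commit_lsn)


def decode(buf: bytes, offset: int = 0) -> tuple:
    """Decode one message at offset; return (message, offset of the next message)."""
    (msg_type,) = struct.unpack_from(">B", buf, offset)
    if msg_type == REP_DATA:
        _, nid, length = struct.unpack_from(">BII", buf, offset)
        start = offset + struct.calcsize(">BII")
        data = buf[start:start + length]
        return {"type": "REP_DATA", "nid": nid, "data": data}, start + length
    if msg_type == REP_COMMIT:
        _, commit_lsn = struct.unpack_from(">BQ", buf, offset)
        end = offset + struct.calcsize(">BQ")
        return {"type": "REP_COMMIT", "commit_lsn": commit_lsn}, end
    raise ValueError(f"unknown message type: {msg_type}")


# Usage: one operation followed by a commit marker, stored back to back.
log_bytes = encode_rep_data(7, b"set a 1") + encode_rep_commit(16)
first, next_offset = decode(log_bytes)
second, end_offset = decode(log_bytes, next_offset)
```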

FIG. 3 is a diagram illustrating an example of a replication scheme according to an embodiment of the present invention.

Step 310 may be a process of transferring the replication log. Here, operations requested of the memory store may be stored in the replication log of each replicator via the client library. Step 310 is described in more detail below with reference to FIG. 4.

Step 320 may be a process of determining and delivering the commit LSN. As shown in FIG. 3, step 320 may include step 321, in which the master replicator collects the LSN values of the replicators and determines the commit LSN; step 322, in which the commit LSN is written to the replication log of the master replicator and delivered to each replicator through transfer of the replication log; and step 323, in which each replicator sends the LSN to be executed to its client library based on the REP_COMMIT message in the replication log. Step 320 is described in more detail below with reference to FIG. 5.
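As a non-limiting illustration of step 321, the following minimal Python sketch picks a commit LSN from the LSN values reported by the replicators. It assumes that the logs are append-only, so that a replicator reporting a given LSN also holds every smaller LSN; under that assumption the commit LSN is the highest LSN reached by at least `replication_factor` replicators. The function name and the dictionary of reports are hypothetical.

```python
from typing import Dict, Optional


def decide_commit_lsn(reported_lsns: Dict[str, int],
                      replication_factor: int) -> Optional[int]:
    """Return the highest LSN held by at least `replication_factor` replicators."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    if len(reported_lsns) < replication_factor:
        return None  # not enough replicators have reported yet
    ranked = sorted(reported_lsns.values(), reverse=True)
    # The k-th largest report is reached by at least k replicators.
    return ranked[replication_factor - 1]


# Usage: one master and two slaves report how far their logs extend.
reports = {"master": 120, "slave-1": 100, "slave-2": 80}
commit_lsn = decide_commit_lsn(reports, replication_factor=2)
```

With a replication factor of 2, LSN 100 is committed because two replicators hold the log up to that point, whereas LSN 120 exists only on the master.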

Step 330 may be a process in which operations are executed on the memory store. For example, the replicated operations may be executed on the memory store by invoking a callback function of the client library. Step 330 is described in more detail below with reference to FIG. 6. The functions described below (for example, those for system calls), including the callback function mentioned above, are described in terms of functions used in the Linux operating system, but are not limited thereto and may be replaced by similar functions of other operating systems.
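As a non-limiting illustration of step 330, the following minimal Python sketch has a client library invoke a registered callback for every logged operation up to the commit LSN it was told about. The class name `ClientLibrary`, the method `notify_commit`, and the in-memory list of `(LSN, operation)` pairs are assumptions made for this example; the sketch does not model the operating-system functions mentioned above.

```python
from typing import Callable, List, Tuple


class ClientLibrary:
    """Applies logged operations to the local store up to a commit LSN."""

    def __init__(self, log: List[Tuple[int, bytes]],
                 on_apply: Callable[[bytes], None]):
        self.log = log              # (LSN, operation) pairs in log order
        self.on_apply = on_apply    # callback registered by the memory store
        self.applied_lsn = -1       # LSN of the last operation applied

    def notify_commit(self, commit_lsn: int) -> int:
        """Run the callback for each unapplied operation with LSN <= commit_lsn."""
        count = 0
        for lsn, op in self.log:
            if self.applied_lsn < lsn <= commit_lsn:
                self.on_apply(op)
                self.applied_lsn = lsn
                count += 1
        return count


# Usage: three logged operations, of which only the first two are committed.
executed = []
library = ClientLibrary([(0, b"a"), (1, b"b"), (2, b"c")], executed.append)
applied_first = library.notify_commit(1)
applied_again = library.notify_commit(1)   # nothing new to apply
applied_last = library.notify_commit(2)
```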

FIG. 4 is a diagram illustrating an example of a process of transmitting a replication log according to an embodiment of the present invention. FIG. 4 shows an embodiment for explaining step 310 of FIG. 3. The replicators 121 and 131 may basically store the operations requested of the memory stores 123 and 133 in the replication logs 124 and 134.

The operation process 410 may be a process in which the client libraries 122 and 132 request an operation from the master replicator 121.

The first storing process 420 may be a process in which the master replicator 121 writes the requested operations to the replication log 124 in the order in which they were requested.

The reading process 430 may be a process in which the master replicator 121 reads the log associated with the written operations from the replication log 124.

The log transmission process 440 may be a process in which the master replicator 121 transmits the log it has read to the slave replicator 131.

The second storing process 450 may be a process in which the slave replicator 131 writes the received log to the replication log 134.

In this way, the operations requested by the client libraries 122 and 132 may be stored in the replication log 124 of the master replicator 121 through the operation process 410 and the first storing process 420, and the replication log 124 of the master replicator 121 may be delivered to the slave replicator 131 and stored in its replication log 134 through the reading process 430, the log transmission process 440, and the second storing process 450.

Because the replication log 124 of the master replicator 121 is no longer modified once it has been written, it may be transmitted to the slave replicator 131 independently.
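As an illustrative sketch of processes 410 to 450 (not the implementation of the embodiment), the following Python models each replication log as an append-only byte buffer whose LSN is the byte offset just past the last write. The `ReplicationLog` class, the length-prefixed `encode` framing, and the in-process "transmission" are assumptions made only for this example.

```python
class ReplicationLog:
    """Append-only byte log; the LSN is the offset just past the last byte written."""

    def __init__(self):
        self._buf = bytearray()

    @property
    def lsn(self):
        return len(self._buf)

    def append(self, data: bytes) -> int:
        self._buf += data
        return self.lsn

    def read_from(self, start: int) -> bytes:
        return bytes(self._buf[start:])


def encode(op: str) -> bytes:
    # Hypothetical framing: 4-byte big-endian length prefix plus UTF-8 payload.
    payload = op.encode("utf-8")
    return len(payload).to_bytes(4, "big") + payload


master_log, slave_log = ReplicationLog(), ReplicationLog()

# Processes 410 and 420: requested operations are written in arrival order.
for op in ["SET a 1", "SET b 2", "DEL a"]:
    master_log.append(encode(op))

# Processes 430, 440, 450: read what the slave lacks, "send" it, append it.
chunk = master_log.read_from(slave_log.lsn)
slave_log.append(chunk)
```

Because written bytes never change, the master can ship any already-written range without coordinating with new appends, which is the property the paragraph above relies on.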

The replicators 121 and 131 may operate as a service thread and a background thread, and the work of flushing the data written to the replication logs 124 and 134 to disk may be divided between the threads as in (A) and (B) below.

(A) First, the service threads of the replicators 121 and 131 may read from and write to the replication logs 124 and 134 in the form of memory-mapped files. In this case, the contents written to the replication logs 124 and 134 may reside in the buffer cache area of the storage units 120 and 130 and may not yet have been flushed to disk.

(B) The background threads of the replicators 121 and 131 may flush the contents of the replication logs 124 and 134 to disk. For example, a background thread may periodically flush all data residing in the buffer cache area of the storage units 120 and 130 to disk using a system call such as fsync (or fdatasync). The background thread may flush periodically, and may also flush again when the data written to the replication logs 124 and 134 since the last flush exceeds a preset size.

In this way, the replicators 121 and 131 may store the replication logs 124 and 134 by first reading and writing through the buffer cache and then flushing the data to disk either periodically or under a specific condition (when the data written to the replication logs 124 and 134 (for example, to the buffer cache) since the last flush exceeds a specific size), thereby minimizing the time delay imposed on the replication flow.
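The flush policy described above (periodic, or when the unflushed data exceeds a threshold) can be sketched as follows. This is a minimal illustration under stated assumptions: `FlushPolicy`, its parameters, and `demo` are hypothetical names, time is passed in explicitly instead of running a real background thread, and Python's `mmap.flush()` (which synchronizes the mapping to its file) stands in for the fsync/fdatasync call named in the text.

```python
import mmap
import os
import tempfile


class FlushPolicy:
    """Decides when the background thread should flush (hypothetical helper)."""

    def __init__(self, interval_s, max_unflushed):
        self.interval_s = interval_s        # periodic flush interval in seconds
        self.max_unflushed = max_unflushed  # size threshold in bytes
        self.last_time = 0.0
        self.flushed_lsn = 0

    def should_flush(self, now, written_lsn):
        pending = written_lsn - self.flushed_lsn
        if pending <= 0:
            return False                    # nothing new to flush
        periodic = now - self.last_time >= self.interval_s
        return periodic or pending > self.max_unflushed

    def mark_flushed(self, now, written_lsn):
        self.last_time = now
        self.flushed_lsn = written_lsn


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "replog")
        with open(path, "wb") as f:
            f.truncate(4096)                # pre-size the log file
        with open(path, "r+b") as f, mmap.mmap(f.fileno(), 4096) as log:
            policy = FlushPolicy(interval_s=1.0, max_unflushed=64)
            log[0:100] = b"x" * 100         # service thread: write lands in memory
            if policy.should_flush(now=0.1, written_lsn=100):
                log.flush()                 # background thread: push pages to disk
                policy.mark_flushed(now=0.1, written_lsn=100)
            return policy.flushed_lsn
```

In the demo the interval has not elapsed, so the flush is triggered by the size threshold alone.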

Because the data residing in the buffer cache area of the storage units 120 and 130 has not yet been flushed to disk, it may be lost in the event of a failure such as an interruption of the system's power supply. In the embodiments of the present invention, however, the replication scheme operates so that the availability of the replication logs 124 and 134 is maintained to the extent of the replication factor, so the lost portion of the data can be recovered.

FIG. 5 is a diagram illustrating an example of a process of determining and delivering a COMMIT LSN according to an embodiment of the present invention.

Process (1) may be a process in which the slave replicators 131 and 510, having received the replication log from the master replicator 121, transmit their last LSN values to the master replicator 121. An LSN value is the position of the last byte written to a replication log (the replication log of each of the slave replicators 131 and 510), and may point not only to a boundary between the messages that make up the replication log but also to an arbitrary position.

In process (2), the master replicator 121 may determine, based on the LSNs received from the slave replicators 131 and 510, the LSN up to which availability is guaranteed and operations can therefore be executed safely. The LSN determined in this way is called the 'COMMIT LSN'.

The COMMIT LSN may be determined based on the replication factor value. For example, when the replication factor value is 2, the COMMIT LSN may be set to the LSN up to which the log has been delivered to at least two replication logs. The COMMIT LSN may always increase monotonically.
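One way to read this rule is that the COMMIT LSN is the largest position that at least as many replication logs as the replication factor have reached. The sketch below illustrates that reading; the function name and signature are hypothetical, and whether the master's own log is counted among `reported_lsns` is left to the caller because the text does not say.

```python
def commit_lsn(reported_lsns, replication_factor, previous_commit=0):
    """Largest LSN reached by at least `replication_factor` logs, never decreasing."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if len(reported_lsns) < replication_factor:
        return previous_commit              # not enough copies yet
    # The k-th largest reported LSN is held by at least k logs.
    kth = sorted(reported_lsns, reverse=True)[replication_factor - 1]
    return max(previous_commit, kth)        # keep the value monotonic


example = commit_lsn([120, 80, 100], replication_factor=2)
```

With reports of 120, 80, and 100 and a replication factor of 2, two logs have reached position 100, so `example` is 100.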

Process (3) may represent the operation of the master replicator 121 after a new COMMIT LSN has been determined.

For example, in process (3), the master replicator 121 may write a REP_COMMIT message carrying the COMMIT LSN value as its data to the replication log 124. The REP_COMMIT message may be delivered to the slave replicators 131 and 510 by the replication log transmission scheme described with reference to FIG. 4. In process (3), the master replicator 121 cannot immediately transmit the newly determined COMMIT LSN value to the local client library 122. This is because that COMMIT LSN value so far exists only in the replication log 124 and may not yet have been transmitted to the slave replicators 131 and 510 widely enough to guarantee availability. For this reason, the master replicator 121 keeps information on the REP_COMMIT messages it has issued (REP_COMMIT lookup table 530), and may transmit to the client library 122 the COMMIT_LSN field value of a REP_COMMIT message located before the current COMMIT LSN.

When a REP_COMMIT message is present in the replicated content they receive, the slave replicators 131 and 510 may deliver the COMMIT_LSN field value of the REP_COMMIT message to their local client library (for example, the client library 520 is the local client library of the slave replicator 510). For the slave replicators 131 and 510, availability is satisfied because the content written to the replication log 124 of the master replicator 121 has already been delivered.
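The role of the REP_COMMIT lookup table 530 on the master can be illustrated with a small sketch. The structure below is an assumption, not the table layout of the embodiment: it records, for each REP_COMMIT message the master wrote, the log position just past that message and the COMMIT_LSN value it carries, and returns the newest value whose message lies entirely before the current COMMIT LSN.

```python
import bisect


class RepCommitTable:
    """Hypothetical REP_COMMIT lookup table kept by the master replicator."""

    def __init__(self):
        self._end_lsns = []   # position just past each REP_COMMIT message
        self._values = []     # COMMIT_LSN field carried by that message

    def record(self, end_lsn, commit_value):
        # The log is append-only, so end_lsn values arrive in increasing order.
        self._end_lsns.append(end_lsn)
        self._values.append(commit_value)

    def deliverable(self, current_commit_lsn):
        """Newest COMMIT_LSN field whose message itself is already committed."""
        i = bisect.bisect_right(self._end_lsns, current_commit_lsn)
        return self._values[i - 1] if i else None


table = RepCommitTable()
table.record(end_lsn=110, commit_value=100)   # message occupies bytes up to 110
table.record(end_lsn=230, commit_value=200)
```

Under this reading, the value 100 is handed to the local client library only once the COMMIT LSN has reached position 110, that is, once the REP_COMMIT message carrying it has itself been replicated widely enough.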

Referring again to FIG. 1, because the replication logs 124 and 134 grow continuously, a way to keep them at a reasonable size is needed. To this end, the checkpoint data 125 and 135 may include data in which the state of the memory stores 123 and 133 is saved in the form of a single file.

The checkpoint data included in any given storage unit may be generated through processes (a) to (c) below.

(a) Through a fork system call, the memory store may start in a state in which the parent process and the child process hold identical data.

(b) The parent process may work with the client library to execute the requested operations.

(c) The child process may save the entire state of the memory store as checkpoint data. In addition to the state of the memory store, the checkpoint data may include the LSN value corresponding to the position of the last operation applied to that memory store.

Replication log files that precede the LSN included in the checkpoint data may be removed. In addition, the state of the memory store may be recovered using the checkpoint data and the replication log.
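The two uses of the checkpoint LSN just described, pruning old log files and recovering state, can be sketched as below. The segment tuples, the record layout, and the dictionary-based store are assumptions for illustration only; the fork-based snapshot step itself is not shown.

```python
def removable_segments(segments, ckpt_lsn):
    """Names of log files that lie entirely before the checkpoint LSN.

    segments: list of (name, start_lsn, end_lsn) with end_lsn exclusive.
    """
    return [name for name, _start, end in segments if end <= ckpt_lsn]


def recover(checkpoint_state, ckpt_lsn, records):
    """Rebuild a store from a checkpoint plus the log written after it.

    records: list of (end_lsn, key, value) in log order, where end_lsn is
    the position just past the record.
    """
    state = dict(checkpoint_state)          # start from the saved snapshot
    for end_lsn, key, value in records:
        if end_lsn > ckpt_lsn:              # skip operations already in the snapshot
            state[key] = value
    return state


segments = [("log.0", 0, 1000), ("log.1", 1000, 2000), ("log.2", 2000, 2600)]
old = removable_segments(segments, ckpt_lsn=2000)
restored = recover({"a": 1}, 2000, [(1500, "a", 0), (2100, "b", 2), (2300, "a", 3)])
```

Here `log.0` and `log.1` end at or before the checkpoint position and can be deleted, while only the two records written after position 2000 are replayed on top of the snapshot.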

The replication scheme described above assumes a master replicator (for example, the master replicator 121 of FIG. 1) that determines the first-in-first-out order of the operations applied to the memory stores. If the master replicator fails, the entire replication flow stops and no operations are executed in the memory stores.

The configuration master (for example, the configuration master 110 of FIG. 1) may monitor the state of the replicators and, when the master replicator fails, may elect the slave replicator with the largest LSN value among the remaining slave replicators as the new master replicator. As already described, an LSN is not a value aligned to the message units of the replication log, so the election criterion may use the maximum of the COMMIT LSNs present in the respective slave replicators. When a slave replicator fails, additional work such as adjusting the replication factor may be performed.
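A minimal sketch of this election rule follows. The function name and the mapping of replicator identifiers to COMMIT LSN values are hypothetical, and the tie-break (smallest identifier wins) is an assumption, since the text does not specify one.

```python
def elect_master(commit_lsns):
    """Pick the surviving slave with the largest COMMIT LSN.

    commit_lsns: mapping of replicator id -> COMMIT LSN found in its log.
    Ties are broken by the smallest id (an assumption for determinism).
    """
    if not commit_lsns:
        raise ValueError("no surviving slave replicators")
    # max() keeps the first maximal element, so iterate ids in sorted order.
    return max(sorted(commit_lsns), key=lambda rid: commit_lsns[rid])


new_master = elect_master({"replicator-131": 900, "replicator-510": 1200})
```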

In the embodiments of the present invention, high-performance replication may be achieved by applying the following techniques to the replication scheme described above.

(1) Eliminating the overhead of memory copies by accessing the replication log as a memory-mapped file

(1-1) After receiving an operation to be replicated from the client library, the master replicator may perform the memory copy by designating the memory-mapped file region of the replication log as the buffer region.

(1-2) When receiving the replication log from the master replicator, the slave replicator may issue the read system call on the file descriptor of the TCP connection with the memory-mapped file region of the replication log designated as the buffer region.

(1-3) The client library embedded in the memory store may read the region corresponding to the COMMIT LSN to be executed. In doing so, the client library may access memory directly through the memory-mapped file produced by replication.
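Item (1-2) can be illustrated by receiving network data straight into a mapped region, with no intermediate buffer. The sketch below uses Python's `socket.recv_into` on a `memoryview` of an `mmap` object as an analogue of calling read on the TCP file descriptor with the mapped log region as the buffer; the helper name, the anonymous mapping, and the local socket pair are assumptions for the example.

```python
import mmap
import socket


def receive_into_log(sock, log, offset, nbytes):
    """Read up to nbytes from sock directly into log[offset:offset + nbytes].

    Returns the offset just past the last byte received.
    """
    view = memoryview(log)
    try:
        got = 0
        while got < nbytes:
            n = sock.recv_into(view[offset + got:offset + nbytes])
            if n == 0:                      # peer closed the connection
                break
            got += n
        return offset + got
    finally:
        view.release()                      # required before the mmap is closed


sender, receiver = socket.socketpair()      # stands in for the TCP connection
log = mmap.mmap(-1, 4096)                   # anonymous mapping stands in for the log file
sender.sendall(b"SET a 1")
end = receive_into_log(receiver, log, 0, 7)
received = log[0:end]
sender.close()
receiver.close()
log.close()
```

The received bytes land in the mapped region itself, so a reader of the same mapping, such as the client library in item (1-3), can see them without a further copy.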

(2) As explained for the replication log transmission scheme, both the process of storing the operations to be replicated in the replication log of the master replicator and the process of transmitting the master replicator's replication log to the slave replicators proceed in a streaming manner rather than a request/response manner. Consequently, no delay attributable to network response time arises, and the throughput per unit time is determined by the network bandwidth.

(3) Everything stored in the replication log consists of replication protocol messages stored exactly as they are, so no additional cost is incurred for converting received data.

(4) As explained for the disk storage scheme of the replication log, a replicator stores the replication log during the replication step without using the fsync system call: it issues only the write system call, returns once the data has been written to the system's buffer cache area, and leaves the flush to disk to the replicator's background thread. This is possible because high availability has already been secured by distributing the replication log.

The recovery scheme according to embodiments of the present invention is described below. The recovery scheme may be divided, as follows, into restart recovery of a replicator, restart recovery of a memory store, and replication recovery, which is performed by transmitting the replication log after the memory store has connected to its local replicator.

A. Restart recovery of a replicator: after verifying the consistency of the replication log, the configuration master may identify the range of LSNs available in the replication log. For example, the configuration master may identify MIN_LSN, the smallest available LSN, and MAX_LSN, the largest available LSN. The restarted replicator then holds the replicated data between MIN_LSN and MAX_LSN.

B. Restart recovery of a memory store: when checkpoint data exists, the replicator may restore the state of the memory store using the checkpoint data. The LSN value stored together with the checkpoint data may be called CKPT_LSN. CKPT_LSN may indicate the end position of the last operation applied to the state of that memory store. If the checkpoint data does not exist, for example because it has been lost, the value of CKPT_LSN may be 0.

C. Replication recovery: when the system has been newly started and the master/slave roles of the replicators have not yet been determined, the former master replicator may join the replication relationship as a slave if recovery is possible using the replication log it holds.

If the local replicator cannot be recovered from the master replicator using the replication log, the local replicator may delete its entire local replication log, receive the checkpoint data of the master replicator's memory store, and start the recovery process again. This is called recovery using remote checkpoint data.

FIG. 6 is a diagram illustrating the state transitions of a replicator, including restart recovery of the replicator and restart recovery of the memory store, according to an embodiment of the present invention.

A replicator may be in one of four states: NONE 610, LCONN 620, MASTER 630, and SLAVE 640.

The NONE 610 state may mean that restart recovery of the replicator has finished. MIN_LSN and MAX_LSN may be determined in the NONE 610 state.

The LCONN 620 state may mean that the memory store is connected to the replicator. When the memory store connects to the replicator, the CKPT_LSN value obtained during restart recovery of the memory store may be passed to the replicator.

When CKPT_LSN is smaller than MIN_LSN, recovery using remote checkpoint data may begin and the replicator may be restarted.

When CKPT_LSN is greater than or equal to MIN_LSN, the replicator may wait in the LCONN 620 state for the configuration master to assign its role. The configuration master may determine the role of the replicator using the replication log information (MIN_LSN, MAX_LSN, and CKPT_LSN) available to it.

When the configuration master passes the connection information for the master to the replicator in the LCONN 620 state, that replicator, as a slave replicator in the SLAVE 640 state, may obtain a connection to the master replicator. The slave replicator may then pass the connection information for the master to the local client library (of the memory store).

On connecting, the replicator in the LCONN 620 state may inform the master replicator of the starting log position it requires (the larger of the CKPT_LSN and MAX_LSN values). The master replicator may then transmit the replication log to the slave replicator starting from that position.

When replication is configured for the first time, the configuration master may designate a replicator in the LCONN 620 state as the master replicator in the MASTER 630 state.
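The transitions described for FIG. 6 can be summarized in a small sketch. The function names are hypothetical, the enum values simply reuse the reference numerals, and returning to NONE after remote checkpoint recovery is an assumption based on the statement that the replicator is restarted.

```python
from enum import Enum


class State(Enum):
    NONE = 610    # replicator restart recovery finished; MIN_LSN/MAX_LSN known
    LCONN = 620   # memory store connected to the replicator
    MASTER = 630
    SLAVE = 640


def on_store_connected(min_lsn, ckpt_lsn):
    """Return (next_state, needs_remote_checkpoint) when the store connects."""
    if ckpt_lsn < min_lsn:
        # The local log no longer covers the checkpoint: recover remotely, restart.
        return State.NONE, True
    return State.LCONN, False


def assign_role(state, master_exists):
    """Role chosen by the configuration master for a replicator waiting in LCONN."""
    if state is not State.LCONN:
        raise ValueError("a role can only be assigned in the LCONN state")
    return State.SLAVE if master_exists else State.MASTER


def start_lsn(ckpt_lsn, max_lsn):
    """Starting log position requested from the master, as stated in the text."""
    return max(ckpt_lsn, max_lsn)
```

For example, a replicator with MIN_LSN 500 whose memory store reports CKPT_LSN 300 cannot replay from its local log and falls back to remote checkpoint recovery, whereas CKPT_LSN 800 lets it wait in LCONN for a role.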

FIG. 7 is a block diagram illustrating the internal configuration of a master storage device according to an embodiment of the present invention, and FIG. 8 is a flowchart illustrating a data replication method of the master storage device according to an embodiment of the present invention.

As shown in FIG. 7, the master storage device 700 according to this embodiment may include a processor 710, a bus 720, a network interface 730, and a memory 740. The memory 740 may include an operating system 741, a data replication routine 742, and a master memory store 743. The processor 710 may include a master replicator 711 and a client library 712. In other embodiments, the master storage device 700 may include more components than those shown in FIG. 7. However, most conventional components need not be shown explicitly. For example, the master storage device 700 may also include other components such as a display or a transceiver.

The memory 740 is a computer-readable recording medium and may include random access memory (RAM), read-only memory (ROM), and a permanent mass storage device such as a disk drive. The memory 740 may store program code for the operating system 741 and the data replication routine 742, and the master memory store 743 may be implemented in it. Software components such as the operating system 741 and the data replication routine 742 may be loaded from a computer-readable recording medium separate from the memory 740 using a drive mechanism (not shown). Such a separate computer-readable recording medium (not shown) may include a floppy drive, a disk, a tape, a DVD/CD-ROM drive, a memory card, or the like. In other embodiments, the software components may be loaded into the memory 740 through the network interface 730 rather than from a computer-readable recording medium. For example, the data replication routine 742 may be loaded into the memory 740 based on a program installed from files that developers provide over a network.

The bus 720 may enable communication and data transfer between the components of the master storage device 700. The bus 720 may be implemented using a high-speed serial bus, a parallel bus, a storage area network (SAN), and/or other suitable communication technology.

The network interface 730 may be a computer hardware component for connecting the master storage device 700 to a computer network. The network interface 730 may connect the master storage device 700 to the computer network through a wireless or wired connection.

The processor 710 may be configured to process the instructions of a computer program by performing the basic arithmetic, logic, and input/output operations of the master storage device 700. The instructions may be provided to the processor 710 by the memory 740 or the network interface 730 via the bus 720. The processor 710 may be configured to execute program code for the master replicator 711 and the client library 712. Such program code may be stored in a recording device such as the memory 740 (for example, as the data replication routine 742).

The master replicator 711 and the client library 712 included in the processor 710 may be configured to perform steps 810 to 830 of FIG. 8.

In step 810, the master replicator 711 may receive an operation requested of at least one of the master memory store 743 and a slave memory store included in a slave storage device. For example, as described for the operation process 410 of FIG. 4, the master replicator 711 may receive the operations (commands) requested of the memory stores through the client library 712 included in the master storage device 700 and/or a client library included in at least one slave storage device.

In step 820, the master replicator 711 may store the received operation in the master replication log (the replication log included in the master storage device 700), additionally store in the master replication log information on the position of the operation to be executed, and deliver the master replication log to the slave storage device. The slave replication log included in the slave storage device may then be synchronized with the delivered master replication log.

Here, the position of the operation to be executed may correspond to the commit LSN described above. As already described, the position at which data is stored in the master replication log (or the slave replication log) may be expressed by a log sequence number (LSN).

The slave replicator of the slave storage device may transmit the LSN value of the last data stored in the slave replication log to the master replicator 711, and the master replicator 711 may determine the commit LSN based on the received LSN values and the replication factor. The commit LSN may be stored in the master replication log together with the received operations and delivered to the slave storage device. When, among the LSN values received from a plurality of slave storage devices, LSNs of the same value number at least the value of the replication factor (for example, 3), the master replicator 711 may determine that LSN value to be the value of the commit LSN.

In addition, in step 820 the master replicator 711 may read from and/or write to the master replication log by storing the received operation in the buffer cache area in the form of a memory-mapped file. Because reads from and writes to the buffer cache area are much faster than disk access, the time delay imposed on the replication flow can be minimized. The master replicator 711 may also periodically flush the data stored in the buffer cache area to disk. Moreover, the master replicator 711 may flush the data stored in the buffer cache area to disk when the amount of data stored in the master replication log since the last flush is equal to or greater than a preset size. Such flushing may be performed using a system call such as fsync (or fdatasync).

The delivery of the master replication log may also be handled using the buffer cache area. The master replicator 711 may perform the memory copy by designating the memory-mapped file region of the master replication log as the buffer cache area. The slave replicator may likewise read the master replication log by designating the mapped file region as the buffer cache area. In this case, the client library of the slave storage device may obtain the information on the position of the operation to be executed (the commit LSN) by accessing the mapped file region directly in memory.

In step 830, the client library 712 may execute the operations stored in the master replication log on the master memory store 743 based on the information on the position of the operation to be executed. The master memory store 743 and the slave memory store may have the same set of data states and may produce the same results for the stored operations.

As already described, the state of a memory store may refer to the set of states of the stored data; a state transition occurs deterministically when an operation provided by the memory store is executed, and the result of executing the operation is likewise the same. This means that when the same operations are performed on two memory stores that started from the same state, the data (state) of the two memory stores is the same, and that the data (state) of one memory store can be replicated to the other by performing the operations on both in first-in, first-out order. For example, the client library 712 of the master storage device 700 and the client library of the slave storage device may, according to the information on the position of the operation to be performed, sequentially execute the operations up to the same position in the master replication log and the slave replication log on the master memory store 743 and the slave memory store, respectively. The data of the master memory store 743 and the slave memory store can therefore become identical.
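The determinism argument can be made concrete with a small sketch. The operation set (`set`, `incr`, `del`), the class `MemoryStore`, and the convention that an LSN counts applied log entries are assumptions made for illustration only.

```python
class MemoryStore:
    """Toy deterministic key-value store (hypothetical operation set)."""

    def __init__(self):
        self.data = {}
        self.applied_lsn = 0  # number of log entries applied so far

    def apply(self, op):
        """Apply one (kind, key, value) operation and return its result."""
        kind, key, value = op
        if kind == "set":
            self.data[key] = value
            return "OK"
        if kind == "incr":
            self.data[key] = self.data.get(key, 0) + value
            return self.data[key]
        if kind == "del":
            return self.data.pop(key, None)
        raise ValueError(f"unknown operation: {kind}")


def apply_up_to(store, log, commit_lsn):
    """Apply log entries in FIFO order until the store reaches commit_lsn."""
    results = []
    while store.applied_lsn < commit_lsn:
        results.append(store.apply(log[store.applied_lsn]))
        store.applied_lsn += 1
    return results
```

Two stores that start empty and replay the same log up to the same commit LSN end with equal `data` and return equal results, which is the consistency property the paragraph describes.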

In addition, each of the plurality of storage devices, including the master storage device 700 and the slave storage device, may further include checkpoint data. The checkpoint data may include the state of the memory store of the respective storage device and the LSN (CKPT_LSN) corresponding to the last position of the operations performed on that memory store.

The state of the memory store of a failed storage device among the plurality of storage devices may then be recovered using the checkpoint data of that failed storage device. For example, when the master storage device 700 fails, the state of the master memory store 743 may be recovered using the checkpoint data held by the master storage device 700.
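A minimal sketch of checkpoint-based recovery follows. It assumes, beyond what this paragraph states, that recovery loads the checkpointed state and then replays the replication log from CKPT_LSN up to the commit LSN; the names `Checkpoint`, `take_checkpoint`, and `recover` and the two-operation command set are hypothetical.

```python
import copy
from dataclasses import dataclass


@dataclass
class Checkpoint:
    state: dict    # snapshot of the memory store
    ckpt_lsn: int  # CKPT_LSN: number of log entries reflected in the snapshot


def apply_op(state, op):
    """Apply one (kind, key, value) operation to a dict-based store."""
    kind, key, value = op
    if kind == "set":
        state[key] = value
    elif kind == "del":
        state.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {kind}")


def take_checkpoint(state, applied_lsn):
    """Snapshot the store together with the position of the last applied operation."""
    return Checkpoint(copy.deepcopy(state), applied_lsn)


def recover(ckpt, log, commit_lsn):
    """Rebuild the store from the snapshot, then replay the remaining log entries."""
    state = copy.deepcopy(ckpt.state)
    for lsn in range(ckpt.ckpt_lsn, commit_lsn):
        apply_op(state, log[lsn])
    return state
```

Because operations are deterministic, replaying from the snapshot yields the same state as replaying the whole log from the beginning, at a lower cost.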

On the other hand, when recovery using its own checkpoint data is not possible, the failed storage device may receive the checkpoint data of the master storage device 700 and recover the state of its memory store using the received checkpoint data. For example, the failed storage device may check the minimum LSN and the maximum LSN in use in its replication log. If CKPT_LSN is less than the minimum LSN, the failed storage device may determine that its checkpoint data has been lost and cannot be used. In that case, the failed storage device may recover the state of its memory store using the checkpoint data of the master storage device 700.
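The check described above reduces to a single comparison, sketched here under the assumption that LSNs are plain integers. The function names and the `"local"`/`"master"` return values are hypothetical.

```python
def checkpoint_usable(ckpt_lsn, min_lsn):
    """The local checkpoint is usable only if CKPT_LSN is not below the log's minimum LSN."""
    return ckpt_lsn >= min_lsn


def choose_checkpoint_source(ckpt_lsn, min_lsn, max_lsn):
    """Return "local" to recover from the device's own checkpoint, else "master"."""
    if min_lsn > max_lsn:
        raise ValueError("minimum LSN must not exceed maximum LSN")
    return "local" if checkpoint_usable(ckpt_lsn, min_lsn) else "master"
```

For instance, with a log covering LSNs 100 to 500, a checkpoint at LSN 120 is recovered locally, while a checkpoint at LSN 50 triggers a transfer of the master's checkpoint data.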

As another example, the failed storage device may be provided with connection information for the master storage device 700 through the configuration master and establish a connection with the master storage device 700. The failed storage device may then send the master storage device 700 information on the start log it requires (the required start LSN of the replication log). The master storage device 700 may transmit the requested replication log to the failed storage device based on this information. For example, the master storage device 700 may transmit the replication log from the start LSN onward to the failed storage device.
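The log catch-up exchange can be sketched as follows. The log is modeled as a list of `(lsn, operation)` pairs, and the choice of start LSN (one past the last locally stored LSN) is an assumption; the source only says that the failed device sends the start LSN it requires. Network transport and the configuration master lookup are omitted.

```python
def read_from(master_log, start_lsn):
    """Master side: return every log record whose LSN is at or after start_lsn."""
    return [(lsn, op) for lsn, op in master_log if lsn >= start_lsn]


def catch_up(local_log, master_log):
    """Failed-device side: request the missing tail and append it to the local log.

    Returns the start LSN that was requested (assumed: last local LSN + 1).
    """
    start_lsn = local_log[-1][0] + 1 if local_log else 0
    local_log.extend(read_from(master_log, start_lsn))
    return start_lsn
```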

When the master storage device 700 fails, one of the slave storage devices may be elected master through the configuration master described above. For example, the slave storage device whose last data stored in the replication log has the largest LSN, or the slave storage device whose commit LSN stored in the replication log is the largest, may be newly elected master.
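Both election criteria can be expressed in a few lines. The record type `SlaveStatus` and its field names are hypothetical, and tie-breaking (the first candidate with the maximum value wins) is an assumption the source does not address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaveStatus:
    name: str
    last_lsn: int    # LSN of the last data stored in the slave's replication log
    commit_lsn: int  # commit LSN stored in the slave's replication log


def elect_by_last_lsn(slaves):
    """Pick the slave whose replication log holds the most recent data."""
    return max(slaves, key=lambda s: s.last_lsn)


def elect_by_commit_lsn(slaves):
    """Pick the slave with the largest commit LSN."""
    return max(slaves, key=lambda s: s.commit_lsn)
```

The two criteria can select different devices: a slave may have received more log data than its peers while having learned of a smaller commit LSN.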

For details omitted from FIGS. 7 and 8, refer to FIGS. 1 to 6. For example, the master storage device 700 may correspond to the master storage unit 120 described with reference to FIG. 1, and the slave storage device may correspond to the slave storage unit 130, also described with reference to FIG. 1.

FIG. 9 is a block diagram illustrating the internal configuration of a slave storage device according to an embodiment of the present invention, and FIG. 10 is a flowchart illustrating a data replication method of the slave storage device according to an embodiment of the present invention.

The slave storage device 900 according to this embodiment may correspond to the slave storage device described with reference to FIGS. 7 and 8 and, as shown in FIG. 9, may include a processor 910, a bus 920, a network interface 930, and a memory 940. The memory 940 may include an operating system 941, a data replication routine 942, and a slave memory store 943. The processor 910 may include a slave replicator 911 and a client library 912. The processor 910, bus 920, network interface 930, and memory 940 may correspond to the processor 710, bus 720, network interface 730, and memory 740 described with reference to FIGS. 7 and 8, and the slave replicator 911 and client library 912 included in the processor 910 may be configured to perform steps 1010 to 1030 of FIG. 10.

In step 1010, the slave replicator 911 may transmit to the master storage device an operation requested for the slave memory store 943 included in the slave storage device 900. The master storage device may correspond to the master storage device 700 described with reference to FIGS. 7 and 8.

In step 1020, the slave replicator 911 may receive the master replication log, which includes at least one of that operation and an operation requested for the master memory store associated with the master storage device, together with information on the position, in the master replication log included in the master storage device, of the operation to be performed.

In step 1030, the client library 912 may perform the operations stored in the replication log included in the slave storage device 900 on the slave memory store 943, based on the information on the position of the operation to be performed. The master memory store and the slave memory store 943 may have the same set of data states and may produce the same results for the stored operations.

For details omitted from FIGS. 9 and 10, refer to FIGS. 1 to 8.

Thus, according to embodiments of the present invention, replication that guarantees consistency between two memory stores can be performed by executing the operations on both memory stores in first-in, first-out order. This exploits the property that a transition of the state of a memory store (the set of states of the stored data) occurs deterministically when an operation provided for the memory store is executed, and that the result of the operation is likewise the same.

In addition, the delay added to the replication flow can be minimized by reading from and/or writing to the replication log as a memory-mapped file and by flushing to disk periodically and/or based on the size of the data written to the replication log.

In addition, when the same log sequence number (LSN) is received from at least as many storage units as the value of a preset replication factor, that LSN is determined to be the commit LSN, and the operations on the memory store are performed sequentially up to the determined commit LSN. Availability is thereby maintained to the extent of the replication factor, so that lost data can be recovered.
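The commit rule can be sketched literally: count how many storage units reported each LSN and commit the largest LSN reported by at least the replication factor. The function name, the dictionary of reports, and the choice never to move the commit LSN backward are assumptions of this sketch, not details given in the source.

```python
from collections import Counter


def determine_commit_lsn(reported, replication_factor, current_commit=0):
    """Return the new commit LSN given the last LSN reported by each storage unit.

    reported: mapping of storage-unit name to the last LSN it has stored.
    An LSN qualifies when at least `replication_factor` units reported that same LSN.
    The commit LSN is kept at `current_commit` if no larger LSN qualifies (assumption).
    """
    counts = Counter(reported.values())
    qualified = [lsn for lsn, n in counts.items() if n >= replication_factor]
    return max(qualified + [current_commit])
```

With a replication factor of 2, reports of `{"s1": 10, "s2": 10, "s3": 8}` commit LSN 10, whereas three different reported LSNs leave the commit LSN unchanged under this literal reading.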

In addition, restart recovery of a memory store can be handled using checkpoint data that includes the state of the memory store and the LSN corresponding to the last position of the operations performed on the memory store.

The method according to an embodiment may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiment, and vice versa.

Although the embodiments have been described above with reference to limited embodiments and drawings, those of ordinary skill in the art can make various modifications and variations from the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

Claims

A data replication method of a data storage system including a plurality of storage units, the method comprising:
receiving, at a master storage unit among the plurality of storage units, an operation requested for at least one of a master memory store included in the master storage unit and a slave memory store included in a slave storage unit, the slave storage unit being a remaining storage unit among the plurality of storage units;
storing the received operation in a master replication log included in the master storage unit, further storing in the master replication log information on a position, in the master replication log, of an operation to be performed, and transferring the master replication log to a slave replication log included in the slave storage unit; and
performing, by the master storage unit and the slave storage unit, the same operations stored in the master replication log and the slave replication log on the master memory store and the slave memory store, respectively, based on the information on the position of the operation to be performed,
wherein storing the received operation in the master replication log included in the master storage unit comprises:
storing the received operation in a buffer cache area in the form of a memory-mapped file;
periodically flushing data stored in the buffer cache area to a disk; and
flushing the data stored in the buffer cache area to the disk when the size of the data stored in the master replication log since the last flush is greater than or equal to a preset size.

The method according to claim 1,
wherein the master memory store and the slave memory store have the same set of data states and produce the same operation result for the same operation.

The method according to claim 1,
wherein receiving the requested operation comprises:
receiving the operation requested for the slave memory store through a slave client library included in the slave storage unit, or receiving the operation requested for the master memory store through a master client library included in the master storage unit, and
wherein performing the same operations comprises:
performing the same operations on the master memory store in the master client library and performing the same operations on the slave memory store in the slave client library.

The method according to claim 1,
wherein the master replication log (or the slave replication log) indicates, through a log sequence number (LSN), the position at which data is stored in the master replication log (or the slave replication log), and
wherein further storing in the master replication log the information on the position of the operation to be performed comprises:
receiving, at a master replicator included in the master storage unit, from a slave replicator included in the slave storage unit, the last LSN at which data is stored in the slave replication log; and
determining, at the master replicator, based on the received LSN, a commit LSN that is the information on the position of the operation to be performed.

The method according to claim 4,
wherein determining the commit LSN comprises:
determining, when the same LSN is received from a number of slave replicators greater than or equal to the value of a preset replication factor, that LSN as the commit LSN.

(Deleted)

A data replication method of a data storage system including a plurality of storage units, the method comprising:
receiving, at a master storage unit among the plurality of storage units, an operation requested for at least one of a master memory store included in the master storage unit and a slave memory store included in a slave storage unit, the slave storage unit being a remaining storage unit among the plurality of storage units;
storing the received operation in a master replication log included in the master storage unit, further storing in the master replication log information on a position, in the master replication log, of an operation to be performed, and transferring the master replication log to a slave replication log included in the slave storage unit; and
performing, by the master storage unit and the slave storage unit, the same operations stored in the master replication log and the slave replication log on the master memory store and the slave memory store, respectively, based on the information on the position of the operation to be performed,
the method further comprising:
detecting a failure of the master storage unit at a configuration master further included in the data storage system; and
when the failure is detected, selecting, at the configuration master, from among the slave storage units, the slave storage unit whose last data stored in the slave replication log has the largest log sequence number (LSN) as the master storage unit.

A data replication method of a data storage system including a plurality of storage units, the method comprising:
receiving, at a master storage unit among the plurality of storage units, an operation requested for at least one of a master memory store included in the master storage unit and a slave memory store included in a slave storage unit, the slave storage unit being a remaining storage unit among the plurality of storage units;
storing the received operation in a master replication log included in the master storage unit, further storing in the master replication log information on a position, in the master replication log, of an operation to be performed, and transferring the master replication log to a slave replication log included in the slave storage unit; and
performing, by the master storage unit and the slave storage unit, the same operations stored in the master replication log and the slave replication log on the master memory store and the slave memory store, respectively, based on the information on the position of the operation to be performed,
the method further comprising:
detecting a failure of the master storage unit at a configuration master further included in the data storage system; and
when the failure is detected, selecting, at the configuration master, from among the slave storage units, the slave storage unit having the largest commit log sequence number (commit LSN) stored in the slave replication log as the master storage unit,
wherein the commit LSN is determined at the master storage unit based on the LSN of the data last stored in the slave replication log and is transferred from the master storage unit to the slave storage unit.

A data replication method of a data storage system including a plurality of storage units, the method comprising:
receiving, at a master storage unit among the plurality of storage units, an operation requested for at least one of a master memory store included in the master storage unit and a slave memory store included in a slave storage unit, the slave storage unit being a remaining storage unit among the plurality of storage units;
storing the received operation in a master replication log included in the master storage unit, further storing in the master replication log information on a position, in the master replication log, of an operation to be performed, and transferring the master replication log to a slave replication log included in the slave storage unit; and
performing, by the master storage unit and the slave storage unit, the same operations stored in the master replication log and the slave replication log on the master memory store and the slave memory store, respectively, based on the information on the position of the operation to be performed,
wherein transferring the master replication log to the slave replication log included in the slave storage unit comprises:
designating, by a master replicator included in the master storage unit, a memory-mapped file area of the master replication log as a buffer cache area and performing a memory copy;
designating, by a slave replicator included in the slave storage unit, the memory-mapped file area of the master replication log as a buffer cache area and reading in the master replication log; and
obtaining, by a slave client library included in the slave storage unit, the information on the position of the operation to be performed by directly accessing in memory the memory-mapped file area of the master replication log.

A data replication method of a data storage system including a plurality of storage units, the method comprising:
receiving, at a master storage unit among the plurality of storage units, an operation requested for at least one of a master memory store included in the master storage unit and a slave memory store included in a slave storage unit, the slave storage unit being a remaining storage unit among the plurality of storage units;
storing the received operation in a master replication log included in the master storage unit, further storing in the master replication log information on a position, in the master replication log, of an operation to be performed, and transferring the master replication log to a slave replication log included in the slave storage unit; and
performing, by the master storage unit and the slave storage unit, the same operations stored in the master replication log and the slave replication log on the master memory store and the slave memory store, respectively, based on the information on the position of the operation to be performed,
wherein each of the plurality of storage units includes checkpoint data, the checkpoint data including the state of the memory store included in the respective storage unit and a log sequence number (LSN) corresponding to the last position of the operations performed on that memory store.

The method according to claim 10, further comprising:
recovering the state of the memory store of a failed storage unit among the plurality of storage units using the checkpoint data of the failed storage unit.

The method according to claim 10, further comprising:
checking a minimum LSN and a maximum LSN of the replication log of a failed storage unit among the plurality of storage units; and
when the LSN corresponding to the last position is less than the minimum LSN, recovering the state of the memory store of the failed storage unit using the checkpoint data of the master storage unit.

The method according to claim 10, further comprising:
checking a minimum LSN and a maximum LSN of the replication log of a failed storage unit among the plurality of storage units;
providing, at a configuration master further included in the data storage system, connection information for the master storage unit to the failed storage unit;
establishing a connection from the failed storage unit to the master storage unit and transmitting information on a required start log to the master storage unit; and
transmitting, at the master storage unit, the requested replication log to the failed storage unit based on the information on the start log.

A data replication method of a master storage device, the method comprising:
receiving, at the master storage device, an operation requested for at least one of a master memory store included in the master storage device and a slave memory store included in a slave storage device;
storing the received operation in a master replication log included in the master storage device, further storing in the master replication log information on a position, in the master replication log, of an operation to be performed, and transferring the master replication log to the slave storage device; and
performing, at the master storage device, the operation stored in the master replication log on the master memory store based on the information on the position of the operation to be performed,
wherein the master memory store and the slave memory store have the same set of data states and produce the same operation result for the stored operation, and
wherein the storing of the received operation in the master replication log, the further storing of the information, and the transferring of the master replication log to the slave storage device comprise:
storing the received operation in a buffer cache area in the form of a memory-mapped file;
periodically flushing data stored in the buffer cache area to a disk; and
flushing the data stored in the buffer cache area to the disk when the size of the data stored in the master replication log since the last flush is greater than or equal to a preset size.

The method according to claim 14,
wherein receiving the requested operation comprises:
receiving the operation requested for the master memory store through a master client library included in the master storage device, or receiving the operation requested for the slave memory store through a slave client library included in the slave storage device,
wherein performing the stored operation comprises:
performing the stored operation on the master memory store in the master client library, and
wherein the stored operation is performed identically on the slave memory store in the slave client library.

A data replication method of a slave storage device, the method comprising:
transmitting, at the slave storage device, an operation requested for a slave memory store included in the slave storage device to a master storage device;
receiving, at the slave storage device, a master replication log that includes at least one of the operation and an operation requested for a master memory store included in the master storage device, together with information on a position, in the master replication log included in the master storage device, of an operation to be performed; and
performing, at the slave storage device, the operation stored in a slave replication log of the slave storage device on the slave memory store based on the information on the position of the operation to be performed,
wherein the master memory store and the slave memory store have the same set of data states and produce the same operation result for the stored operation, and
wherein the master storage device stores the requested operation in a buffer cache area in the form of a memory-mapped file, periodically flushes the data stored in the buffer cache area to a disk, and flushes the data stored in the buffer cache area to the disk when the size of the data stored in the master replication log since the last flush is greater than or equal to a preset size.

The method according to claim 16,
wherein performing the stored operation comprises:
performing the stored operation on the slave memory store in a slave client library included in the slave storage device, and
wherein the stored operation is performed identically on the master memory store in a master client library of the master storage device.

A computer-readable recording medium on which is recorded a program for executing the method according to any one of claims 1 to 5 or 7 to 17.

A data storage system comprising a plurality of storage units,
wherein each of the plurality of storage units comprises a replicator and a memory store,
wherein a master replicator included in a master storage unit among the plurality of storage units determines, in a first-in, first-out (FIFO) manner, the order of the operations to be performed on the memory stores included in the respective storage units, stores the operations in a master replication log included in the master storage unit, and further stores in the master replication log information on a position, in the master replication log, of an operation to be performed,
wherein a slave replicator of a slave storage unit among the plurality of storage units receives the master replication log and stores it in a slave replication log included in the slave storage unit,
wherein the same operations stored in the master replication log and the slave replication log are performed on the memory store included in the master storage unit and the memory store included in the slave storage unit, respectively, based on the information on the position of the operation to be performed, and
wherein the master replicator stores the operation to be performed in a buffer cache area in the form of a memory-mapped file, periodically flushes the data stored in the buffer cache area to a disk, and flushes the data stored in the buffer cache area to the disk when the size of the data stored in the master replication log since the last flush is greater than or equal to a preset size.

The data storage system according to claim 19,
wherein the master memory store and the slave memory store have the same set of data states and produce the same operation result for the same operation.

The data storage system according to claim 19,
wherein the master replicator receives, through a client library further included in each of the plurality of storage units, the operation requested for the memory store included in the respective storage unit, and
wherein the same operations are performed on the master memory store in a master client library of the master storage unit and on the slave memory store in a client library further included in the slave storage unit.

A master storage device comprising:
a master memory store;
a master replicator that receives an operation requested for at least one of the master memory store and a slave memory store included in a slave storage device; and
a client library,
wherein the master replicator stores the received operation in a master replication log and further stores in the master replication log information on a position of an operation to be performed,
wherein the client library performs the operation stored in the master replication log on the master memory store based on the information on the position of the operation to be performed,
wherein the master memory store and the slave memory store have the same set of data states and produce the same operation result for the stored operation, and
wherein the master replicator stores the received operation in a buffer cache area in the form of a memory-mapped file, periodically flushes the data stored in the buffer cache area to a disk, and flushes the data stored in the buffer cache area to the disk when the size of the data stored in the master replication log since the last flush is greater than or equal to a preset size.

A slave storage device comprising:
a slave memory store;
a client library that transfers an operation requested for the slave memory store to a master storage device; and
a slave replicator that receives a master replication log included in the master storage device, the master replication log including at least one of the operation and an operation requested for a master memory store included in the master storage device, together with information on a position, in the master replication log, of an operation to be performed,
wherein the client library performs the operation stored in a slave replication log included in the slave storage device on the slave memory store based on the information on the position of the operation to be performed,
wherein the master memory store and the slave memory store have the same set of data states and produce the same operation result for the stored operation, and
wherein the master storage device stores the requested operation in a buffer cache area in the form of a memory-mapped file, periodically flushes the data stored in the buffer cache area to a disk, and flushes the data stored in the buffer cache area to the disk when the size of the data stored in the master replication log since the last flush is greater than or equal to a preset size.