KR101979325B1 - Apparatus and method for distributed storage

Info

Publication number: KR101979325B1
Application number: KR1020180087243A
Authority: KR
Inventors: 한승진; 이창재
Original assignee: 넷마블 주식회사
Priority date: 2018-07-26
Filing date: 2018-07-26
Publication date: 2019-05-16

Abstract

Presented are a system and method for distributed storage. According to one embodiment of the present invention, the system comprises: one or more cache servers; an object storage tier; and a bridge tier. The object storage tier stores data. The bridge tier synchronizes first data stored in the object storage tier with the cache servers. The cache servers include different types of cache servers. The bridge tier selects a cache server for connection among the different types of cache servers based on a state of the object storage tier and connects the selected cache server to the object storage tier.

Description

APPARATUS AND METHOD FOR DISTRIBUTED STORAGE

The following description relates to distributed storage technology, and more particularly to technology for combining and managing cache servers and object storage devices.

A content delivery network (also called a content distribution network, CDN) is a system that stores data across a network of multiple nodes in order to deliver content efficiently. A content delivery network may be connected directly to an Internet service provider to transmit data. Because a content delivery network is a structure in which various nodes are connected flexibly, the scalability and flexibility of its storage need to be guaranteed.

A distributed storage system can be used as the storage space of a content delivery network. A distributed storage system is a system that manages various kinds of storage. By redefining storage that exists as physical hardware in software, a distributed storage system can reduce storage management costs. However, when data is downloaded directly to a client, read performance may degrade and security problems may arise.

A distributed storage system according to one embodiment includes one or more cache servers, an object storage layer, and a bridge layer. The object storage layer stores data. The bridge layer synchronizes first data stored in the object storage layer to the one or more cache servers. The one or more cache servers include different types of cache servers. The bridge layer selects, based on a state of the object storage layer, a cache server to connect from among the different types of cache servers, and connects the selected cache server to the object storage layer.

The different types of cache servers may include a commercial cache server or an open-source-based cache server.

A distributed storage system according to another embodiment includes one or more cache servers, an object storage layer, and a bridge layer. The object storage layer stores first data. The bridge layer synchronizes the first data stored in the object storage layer to the one or more cache servers. The cache server transmits the first data to a client in response to the client's request for transmission of the first data.

A distributed storage system according to another embodiment includes one or more cache servers, an object storage layer, and a bridge layer. The object storage layer stores first data. The bridge layer transmits a purge request to the cache server in accordance with a state change of the object storage layer corresponding to the storing of the first data.

The object storage layer may upload first data received from a client to at least one object storage device included in the object storage layer. The bridge layer may generate purge information corresponding to the uploaded first data and transmit a purge request including the purge information to the cache server. The cache server may perform a purge based on the purge information.

A distributed storage system according to another embodiment includes one or more cache servers, an object storage layer, and a bridge layer. The object storage layer stores first data and a first logical object store corresponding to the first data, and changes the first logical object store according to a signal received from a client. The bridge layer, in accordance with the change of the first logical object store of the object storage layer, transmits to the cache server a synchronization request corresponding to the changed first logical object store.

The bridge layer may generate mapping information of the changed first logical object store and transmit a synchronization request including the mapping information to the cache server. The cache server may synchronize, based on the mapping information, a second logical object store that is stored in the cache server and corresponds to the first logical object store.

A distributed storage apparatus according to one embodiment includes an object storage layer, a bridge layer, and an I/O interface that communicates with one or more cache servers. The object storage layer includes one or more object storage devices that store data, a monitoring device that manages a hierarchy of the one or more object storage devices, and one or more gateways that communicate with clients. The bridge layer synchronizes first data stored in the object storage layer to the one or more cache servers.

The bridge layer may further include an uploader that transmits a purge request to the cache server in accordance with a data change in the object storage layer.

The object storage layer may upload second data received from the client to at least one object storage device included in the object storage layer. The uploader may generate purge information corresponding to the uploaded second data and transmit a purge request including the purge information to the cache server.

The bridge layer may further include an agent that transmits a synchronization request to the cache server in accordance with a change of a logical object store of the object storage layer.

The object storage layer may change a first logical object store corresponding to the stored data. The agent may generate mapping information of the changed first logical object store and transmit a synchronization request including the mapping information to the cache server. The cache server may synchronize, based on the mapping information, a second logical object store that is stored in the cache server and corresponds to the first logical object store.

A distributed storage method according to one embodiment is performed by a distributed storage system that includes one or more cache servers including different types of cache servers, an object storage layer, and a bridge layer. The method includes storing data in the object storage layer and synchronizing first data stored in the object storage layer to the one or more cache servers. The synchronizing includes selecting, based on a state of the object storage layer, a cache server to connect from among the different types of cache servers, and connecting the selected cache server to the object storage layer.

A distributed storage method according to another embodiment is performed by a distributed storage system that includes one or more cache servers, an object storage layer, and a bridge layer. The method includes storing data in the object storage layer, synchronizing first data stored in the object storage layer to the one or more cache servers, and transmitting the first data stored in the cache server to a client in response to the client's request for transmission of the first data.

A distributed storage method according to another embodiment is performed by a distributed storage system that includes one or more cache servers, an object storage layer, and a bridge layer. The method includes storing data in the object storage layer, synchronizing first data stored in the object storage layer to the one or more cache servers, and transmitting a purge request to the cache server in accordance with a data change in the object storage layer.

A distributed storage method according to another embodiment is performed by a distributed storage system that includes one or more cache servers, an object storage layer, and a bridge layer. The method includes storing data in the object storage layer, synchronizing first data stored in the object storage layer to the one or more cache servers, and transmitting a synchronization request to the cache server in accordance with a change of a logical object store of the object storage layer.

A computer-readable storage medium according to one embodiment stores instructions for executing the method.

A computer program according to one embodiment is stored in a storage medium in order to execute the method in combination with a computer.

FIG. 1 is a diagram illustrating the overall configuration of a distributed storage system according to one embodiment.
FIG. 2 is a diagram showing the detailed configuration of a distributed storage system according to an example.
FIG. 3 is a diagram showing the flow of data and the like in a distributed storage system according to an example.
FIG. 4 is a flowchart showing the operation of a distributed storage method according to one embodiment.
FIG. 5 is a flowchart showing the operation of a distributed storage method according to another embodiment.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings. However, various changes may be made to the embodiments, so the scope of the patent application is not restricted or limited by these embodiments. All changes, equivalents, and substitutes for the embodiments should be understood to fall within the scope of rights.

The terms used in the embodiments are for the purpose of description only and should not be construed as limiting. A singular expression includes the plural unless the context clearly indicates otherwise. In this specification, terms such as "include" or "have" are intended to indicate that the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification exist, and should be understood as not excluding in advance the existence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the embodiments belong. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art and, unless expressly defined in this application, are not to be interpreted in an idealized or excessively formal sense.

In describing the embodiments with reference to the accompanying drawings, identical components are given identical reference numerals regardless of the drawing number, and duplicate descriptions thereof are omitted. In describing the embodiments, detailed descriptions of related known technology are omitted where they are judged likely to obscure the gist of the embodiments unnecessarily.

The present invention relates to a distributed storage system. The present invention connects a distributed storage apparatus with cache servers and provides algorithms for solving various problems arising from that connection. Hereinafter, for convenience of description, the invention is described using CEPH as an example. CEPH is merely one example of a distributed storage apparatus, and various kinds of distributed storage such as AWS, GCP, Azure, and SWIFT may be used.

FIG. 1 is a diagram illustrating the overall configuration of a distributed storage system according to one embodiment.

According to one embodiment, the distributed storage system 100 includes an object storage layer 110, a bridge layer 120, and one or more cache server layers 130. By combining the object storage layer 110 with the cache server layer 130, the distributed storage system 100 can improve its performance.

According to one embodiment, the distributed storage system 100 can organically connect the object storage layer 110 and the cache server layer 130 so that they operate like a single appliance product.

In the distributed storage system 100, the cache server layer 130 may act as a reverse proxy. This allows the distributed storage system 100 to strengthen security and improve read performance. The client 140 sends its request to the cache server layer 130 rather than directly to the object storage layer 110, and receives the desired data from the cache server layer 130. The distributed storage system 100 can therefore provide the client 140 with a consistent level of storage service regardless of the scale of the object storage layer 110. Because the cache server layer 130 operates as a reverse proxy, the original data remains preserved in the object storage layer 110 even if data in the cache server layer 130 is lost.
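The reverse-proxy behavior described above can be illustrated with a minimal Python sketch. The class and method names (`ObjectStorageLayer`, `CacheServer`, `drop_all`) are hypothetical and are not taken from the specification; the sketch only assumes that clients read through the cache server and that the origin keeps the original data.

```python
class ObjectStorageLayer:
    """Stand-in for the object storage layer (origin) that holds the original data."""

    def __init__(self):
        self._objects = {}
        self.read_count = 0  # how many reads actually reached the origin

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        self.read_count += 1
        return self._objects[key]


class CacheServer:
    """Reverse proxy: clients talk only to this object, never to the origin."""

    def __init__(self, origin):
        self._origin = origin
        self._cache = {}

    def get(self, key):
        # Serve from the cache when possible; otherwise fetch from the origin once.
        if key not in self._cache:
            self._cache[key] = self._origin.get(key)
        return self._cache[key]

    def drop_all(self):
        # Simulates losing every cached copy.
        self._cache.clear()


origin = ObjectStorageLayer()
origin.put("bucket-a/patch.bin", b"v1")
cache = CacheServer(origin)

first = cache.get("bucket-a/patch.bin")   # cache miss, read from the origin
second = cache.get("bucket-a/patch.bin")  # cache hit, origin is not contacted
cache.drop_all()                          # cached data is lost
third = cache.get("bucket-a/patch.bin")   # data is still recoverable from the origin
```

In this sketch only two of the three reads reach the origin, and losing the cached copy does not lose the data.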

The distributed storage system 100 can connect various kinds of cache servers to the cache server layer 130. Depending on the demand for and state of the storage space, the distributed storage system 100 can disconnect a cache server that was connected or connect a new cache server. The distributed storage system 100 does not depend on a specific cache server layer 130 and can freely choose a commercial cache server, an open-source cache server, or the like. In this way, the distributed storage system 100 can select and connect an arbitrary cache server as if it were a driver, which reduces dependence on any particular cache server. However, combining the object storage layer 110 with the cache server layer 130 requires data purge and logical object store synchronization algorithms suited to the combined structure.
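One way to picture the driver-like selection of cache servers is a common interface with interchangeable implementations. The following Python sketch is only an illustration under stated assumptions: the interface `CacheDriver`, the two driver classes, the `requests_per_second` field, and the threshold rule are all hypothetical, since the specification does not define how the state of the object storage layer maps to a choice.

```python
from abc import ABC, abstractmethod


class CacheDriver(ABC):
    """Common interface the bridge layer could use for every cache server type."""

    @abstractmethod
    def purge(self, path: str) -> str:
        """Return the purge command this cache server type would receive."""


class CommercialCacheDriver(CacheDriver):
    def purge(self, path: str) -> str:
        return f"commercial-purge {path}"


class OpenSourceCacheDriver(CacheDriver):
    def purge(self, path: str) -> str:
        return f"opensource-purge {path}"


def select_driver(storage_state: dict, drivers: dict) -> CacheDriver:
    # Hypothetical rule: pick the commercial cache server under heavy load,
    # otherwise the open-source one. The real criterion is not specified.
    if storage_state["requests_per_second"] >= 1000:
        return drivers["commercial"]
    return drivers["opensource"]


drivers = {
    "commercial": CommercialCacheDriver(),
    "opensource": OpenSourceCacheDriver(),
}
busy = select_driver({"requests_per_second": 5000}, drivers)
quiet = select_driver({"requests_per_second": 10}, drivers)
```

Because callers depend only on `CacheDriver`, swapping one cache server type for another does not change the calling code, which is the dependency reduction the paragraph describes.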

In addition to a predetermined TTL (time to live), the distributed storage system 100 can actively perform a purge according to the demand for and state of the storage space. The TTL may refer to the expiration time of data stored in the cache server layer 130. Usually, the cache server layer 130 sets a TTL in advance for each piece of data. When relying on the TTL alone, even if the data state of the object storage layer 110 changes, the purge is performed only after the TTL elapses, so synchronization performance may degrade. By performing a purge in response to a data state change, independently of the TTL, the distributed storage system 100 can further improve synchronization performance.
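The difference between TTL expiry and an active purge can be sketched as follows. This is a minimal illustration, not the patented implementation; `TtlCache` and its methods are hypothetical names, and the clock is injected so the example is deterministic.

```python
class TtlCache:
    """Cache whose entries expire after a TTL and can also be purged on demand."""

    def __init__(self, ttl_seconds, clock):
        self._ttl = ttl_seconds
        self._clock = clock   # callable returning the current time in seconds
        self._entries = {}    # key -> (value, expires_at)

    def put(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Passive path: the entry is dropped only once its TTL has elapsed.
            del self._entries[key]
            return None
        return value

    def purge(self, key):
        # Active path: drop the entry immediately, regardless of remaining TTL.
        return self._entries.pop(key, None) is not None


now = [0.0]
cache = TtlCache(ttl_seconds=60, clock=lambda: now[0])
cache.put("bucket-a/patch.bin", b"v1")

now[0] = 10.0
before_purge = cache.get("bucket-a/patch.bin")  # still served, TTL not reached
purged = cache.purge("bucket-a/patch.bin")      # origin changed, purge right away
after_purge = cache.get("bucket-a/patch.bin")   # stale copy is gone at t=10
```

Without the `purge` call, the stale copy would keep being served until 60 seconds had passed.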

The distributed storage system 100 can actively synchronize logical object store information to the cache server layer 130. The cache server layer 130 therefore does not need to synchronize the logical object store information separately, which further improves synchronization performance. Here, a logical object store may be referred to as a bucket or a container. For example, it is called a bucket in AWS, GCP, and CEPH, and a container in Azure and SWIFT.
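As a rough sketch of active synchronization of logical object store information, the bridge side can compute what changed and push only that to the cache server. The shape of the mapping information (bucket name mapped to an origin path) and the names `build_sync_request` and `handle_sync` are assumptions made for illustration; the specification does not define the format.

```python
def build_sync_request(previous: dict, current: dict) -> dict:
    """Compare two snapshots of bucket mappings and describe the difference."""
    changed = {
        name: path for name, path in current.items() if previous.get(name) != path
    }
    removed = sorted(name for name in previous if name not in current)
    return {"mappings": changed, "removed": removed}


class CacheServer:
    """Holds its own copy of the logical object store (bucket) mappings."""

    def __init__(self, buckets=None):
        self.buckets = dict(buckets or {})

    def handle_sync(self, request: dict) -> None:
        # Apply new or changed mappings, then drop buckets that no longer exist.
        self.buckets.update(request["mappings"])
        for name in request["removed"]:
            self.buckets.pop(name, None)


previous = {"game-a": "/origin/game-a", "old": "/origin/old"}
current = {"game-a": "/origin/game-a", "game-b": "/origin/game-b"}

cache = CacheServer(previous)
request = build_sync_request(previous, current)
cache.handle_sync(request)
```

After the request is applied, the cache server's view matches the object storage layer without the cache server having to poll for bucket changes.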

A bucket may refer to the top-level directory that can be created in S3. A bucket is unique within S3, and up to 100 buckets can be created per account. The number of objects and the capacity that can be stored in a bucket may be unlimited. To use data stored in S3, the bucket must be accessed through a URL. S3 supports a RESTful architecture: each piece of data is associated with a unique URL, and the data is retrieved by calling that URL.

CEPH is a unified storage system that supports object storage, block storage, and file storage. CEPH provides object storage through a component called radosgw. For example, S3 or Swift may be used as the interface. Object storage may be inferior to block storage in read performance, and the present invention can improve the read performance of the object storage layer 110 by using the cache server layer 130.

A distributed storage system may be referred to as software-defined storage (SDS). Simply increasing storage capacity can greatly raise storage and management costs. Manually managing various kinds of storage is expensive. Software-defined storage addresses the scale, integration, and flexibility problems of conventional data storage and can reduce management costs.

CEPH is open-source software-defined storage that provides a unified storage solution. CEPH provides a solution that guarantees integrity and offers high performance and scalability. To this end, CEPH consists of a plurality of different software daemons. A daemon is a resident program that performs tasks related to the operation of the system while running in the background. Hereinafter, a daemon may be referred to as a component.

CEPH consists of RADOS (Ceph Reliable Autonomic Distributed Object Store), OSD (Ceph Object Storage Device), MON (Ceph monitors), radosgw (RADOS Gateway, RGW), and the like.

RADOS is the base component of a CEPH storage cluster. The object is CEPH's basic storage format, and RADOS is responsible for storing these objects. RADOS maintains the consistency and reliability of stored data. To maintain consistency, RADOS performs data replication, failure detection, recovery, migration, and rebalancing across cluster nodes.

An OSD is the space in which data is stored in object form in response to a client's write request. The OSD is where the data is actually stored, and data retrieval is performed there in response to a client's read request.

MON keeps the entire cluster healthy by maintaining a map of the cluster, including the OSDs. To this end, every cluster node continuously reports its state changes to MON.

radosgw provides an API interface compatible with S3 or Swift, which are simple storage services. radosgw serves as a gateway for accessing RADOS.

FIG. 2 is a diagram showing the detailed configuration of a distributed storage system according to an example.

According to one embodiment, the object storage layer 110 may store data. The object storage layer 110 may store the data in object form. The object storage layer 110 may pass data received from the client 140 through the gateways 211, 212, and 213 to the object storage devices 221, 222, 223, and 224. For example, the object storage devices 221, 222, 223, and 224 may include OSDs.

The gateways 211, 212, and 213 may receive data from the client 140 and distribute it to the object storage devices 221, 222, 223, and 224. Each of the gateways 211, 212, and 213 may maintain a gateway log 241, 242, or 243 to record its activity information.

The gateway logs 241, 242, and 243 may include activity information of the gateways 211, 212, and 213. Each gateway log 241, 242, or 243 may correspond to one of the gateways 211, 212, and 213 and may store information about the data input through that gateway.

The monitoring device 231 keeps the entire cluster healthy by maintaining a map of each component cluster within the object storage layer 110. The gateways 211, 212, and 213 and the object storage devices 221, 222, 223, and 224 may continuously report their state changes to the monitoring device 231. For example, the monitoring device 231 may include a MON.
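The reporting relationship between the components and the monitoring device can be sketched as a simple state map. This is a toy model under assumptions: the `Monitor` class, the component identifiers, and the `"up"`/`"down"` states are hypothetical and do not reproduce the actual Ceph MON protocol or its cluster map format.

```python
class Monitor:
    """Keeps the latest reported state of every component in a single map."""

    def __init__(self):
        self.cluster_map = {}  # component id -> last reported state

    def report(self, component_id: str, state: str) -> None:
        # Components call this whenever their own state changes.
        self.cluster_map[component_id] = state

    def down_components(self) -> list:
        return sorted(cid for cid, s in self.cluster_map.items() if s != "up")

    def is_healthy(self) -> bool:
        return not self.down_components()


monitor = Monitor()
for component in ("gateway-211", "gateway-212", "osd-221", "osd-222"):
    monitor.report(component, "up")
healthy_at_start = monitor.is_healthy()

monitor.report("osd-222", "down")   # an object storage device reports a failure
unhealthy = monitor.down_components()

monitor.report("osd-222", "up")     # the device recovers and reports again
healthy_after_recovery = monitor.is_healthy()
```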

According to one embodiment, the cache server layer 130 may include different types of cache servers. Depending on the demand for and state of the storage space, the distributed storage system 100 can disconnect a cache server that was connected or connect a new cache server. There is no restriction on the type of cache server: cache servers of the same type may be included, and cache servers of different types may be included. The different types of cache servers may include a commercial cache server or an open-source-based cache server. Referring to FIG. 2, the cache server layer 130 may include a first cache server 131 with a corresponding first cache storage device 133, and a second cache server 135 with a corresponding second cache storage device 137.

The bridge layer 120 may synchronize first data stored in the object storage layer to one or more cache servers. The bridge layer 120 may select, based on the state of the object storage layer, a cache server to connect from among the different types of cache servers. The bridge layer 120 may connect the selected cache server to the object storage layer. To this end, the bridge layer 120 may include an agent 121 and an uploader 123. The bridge layer 120 may further include a database 125.

According to another embodiment, the bridge layer 120 may synchronize first data stored in the object storage layer 110 to one or more cache servers. The cache server layer 130 may transmit the first data to the client 140 in response to the client's request for transmission of the first data. In this way, the cache server layer 130 can operate as a reverse proxy.

The client 140 sends its request to the cache server layer 130 rather than directly to the object storage layer 110, and receives the desired data from the cache server layer 130. The distributed storage system 100 can therefore provide the client 140 with a consistent level of storage service regardless of the scale of the object storage layer 110. Because the cache server layer 130 operates as a reverse proxy, the original data remains preserved in the object storage layer 110 even if data in the cache server layer 130 is lost.

According to another embodiment, the bridge layer 120 may synchronize the first data stored in the object storage layer with one or more cache servers. The bridge layer 120 may send a purge request to the cache server in response to a data change in the object storage layer. In addition to relying on a predetermined time to live (TTL), the distributed storage system 100 can actively perform a purge according to the demand for and state of the storage space. By performing a purge in response to a data state change, independently of the TTL, the distributed storage system 100 can further improve synchronization performance.

The object storage layer 110 may upload second data received from the client 140. The object storage layer 110 may upload the second data to one or more of the object storage devices 221, 222, 223, and 224. The bridge layer 120 may generate purge information corresponding to the uploaded second data. The bridge layer 120 may send a purge request containing the purge information to the cache server. The cache server layer 130 may perform a purge based on the purge information.

The agent 121 may continuously collect the gateway logs 241, 242, and 243. For example, the agent 121 may periodically collect radosgw logs. Based on the collected logs, the agent 121 may generate purge information for files whose upload has completed and store it in the database 125.

The uploader 123 may monitor the file list of the object storage device. Once the agent 121 has stored information on the uploaded data in the database 125, the uploader 123 may request the cache server layer 130 to perform a purge based on the purge information.
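
The division of work between the agent 121 and the uploader 123 might look roughly like the following sketch. The log line format `"<METHOD> <bucket>/<key> <status>"` is an assumption made for illustration and is not the actual radosgw log format; `PurgeInfo`, `Database`, `agent_collect`, and `uploader_flush` are hypothetical names, and the database 125 is replaced by an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class PurgeInfo:
    bucket: str
    key: str
    sent: bool = False


@dataclass
class Database:
    """In-memory stand-in for the database 125."""
    rows: list = field(default_factory=list)


def agent_collect(log_lines, db):
    """Agent 121: record purge information for each completed upload in the log.

    Assumed (hypothetical) line format: "<METHOD> <bucket>/<key> <status>".
    """
    for line in log_lines:
        method, path, status = line.split()
        if method == "PUT" and status == "200":
            bucket, key = path.split("/", 1)
            db.rows.append(PurgeInfo(bucket, key))


def uploader_flush(db, send_purge):
    """Uploader 123: request a purge for every record not yet sent.

    send_purge is a caller-supplied callable (bucket, key) that contacts
    the cache server layer; returns the number of requests issued.
    """
    count = 0
    for info in db.rows:
        if not info.sent:
            send_purge(info.bucket, info.key)
            info.sent = True
            count += 1
    return count
```

Passing the purge transport in as a callable keeps the sketch independent of any specific cache server product.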

According to another embodiment, the bridge layer 120 may synchronize the first data stored in the object storage layer 110 with one or more cache servers. The bridge layer 120 may send a synchronization request to the cache server in response to a change in a logical object store of the object storage layer.

The object storage layer 110 may change a first logical object store corresponding to the stored data. The bridge layer 120 may generate mapping information for the changed first logical object store. The bridge layer 120 may send a synchronization request containing the mapping information to the cache server layer 130. Accordingly, the cache server layer 130 does not need to synchronize the logical object store information separately. Based on the mapping information, the cache server layer 130 may synchronize a second logical object store that is stored in the cache server and corresponds to the first logical object store.

The logical object store information may include, for example, bucket-based mapping information. When a new bucket is created, the distributed storage system 100 may actively map the URL information of the bucket to the cache server layer 130. For example, the distributed storage system 100 may map the URL information "bucket1.original.com" of the object storage layer 110 to the URL information "bucket1.cacheserver1.com" of the cache server layer 130.
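
As a minimal sketch of the bucket URL mapping in the example above, the following function rewrites a bucket host name from the origin domain to a cache server domain. The function name `map_bucket_url` and its parameters are hypothetical; only the two example host names come from the description.

```python
def map_bucket_url(origin_host, origin_domain, cache_domain):
    """Map '<bucket>.<origin_domain>' to '<bucket>.<cache_domain>'."""
    suffix = "." + origin_domain
    if not origin_host.endswith(suffix):
        raise ValueError(f"{origin_host!r} is not under {origin_domain!r}")
    bucket = origin_host[: -len(suffix)]
    if not bucket:
        raise ValueError("missing bucket name")
    return f"{bucket}.{cache_domain}"


# Example from the description.
mapped = map_bucket_url("bucket1.original.com", "original.com", "cacheserver1.com")
print(mapped)  # bucket1.cacheserver1.com
```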

The client 140 may request the object storage layer 110 to create or delete a bucket. The agent 121 may periodically collect the gateway logs 241, 242, and 243. Based on the gateway logs 241, 242, and 243, the agent 121 may determine whether a bucket has changed. The agent 121 may return a true/false value indicating whether a bucket has changed. In response to a bucket change, the agent 121 may use the API of the gateway logs 241, 242, and 243 to convert the bucket information of the object storage devices 221, 222, 223, and 224 into the URL mapping format of each cache server 131, 135 of the cache server layer 130. The agent 121 may request the cache servers 131 and 135 to synchronize the bucket information.
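
The bucket-change check and the per-cache-server conversion could be sketched as follows. This assumes the agent already holds the previous and current sets of bucket names; how those sets are derived from the gateway logs is not specified here. The function names, the request dictionary layout, and the domain names in the usage are hypothetical.

```python
def bucket_changed(previous, current):
    """Return True if the set of buckets differs between two observations."""
    return set(previous) != set(current)


def build_sync_requests(buckets, origin_domain, cache_domains):
    """Build one synchronization request per cache server.

    Each request carries a mapping from origin bucket URLs to that cache
    server's bucket URLs (a hypothetical request layout).
    """
    requests = []
    for cache_domain in cache_domains:
        mapping = {
            f"{bucket}.{origin_domain}": f"{bucket}.{cache_domain}"
            for bucket in sorted(buckets)
        }
        requests.append({"cache": cache_domain, "mapping": mapping})
    return requests


def agent_sync(previous, current, origin_domain, cache_domains):
    """Return sync requests only when a bucket was created or deleted."""
    if not bucket_changed(previous, current):
        return []
    return build_sync_requests(current, origin_domain, cache_domains)
```

Sending the full current mapping on every change, rather than a delta, keeps each request self-describing at the cost of some redundancy; this is a design choice of the sketch, not of the described system.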

FIG. 3 is a diagram illustrating the flow of data and the like in a distributed storage system according to an example.

According to one embodiment, the distributed storage device 110 may include an object storage layer 110 and a bridge layer 120. The distributed storage device 110 may include one or more object storage devices 221, one or more gateways 211 and 212, and gateway logs 231 and 232 corresponding to the respective gateways 211 and 212. The distributed storage device 110 may further include a gateway log 301 that integrates the respective gateways 211 and 212. The distributed storage device 110 may include an I/O interface for communicating with the cache server layer 130.

In steps 331 and 332, the object storage device 221 may receive data from the one or more gateways 211 and 212 and store it. Information about the data reception may be recorded in the gateway logs 231 and 232 corresponding to the respective gateways 211 and 212. The agent 121 may periodically collect the gateway logs 231 and 232. In steps 321 and 322, the agent 121 may store the gateway logs 231 and 232 in the database 125.

When the gateway logs 231 and 232 have changed due to data reception, in step 323 the uploader 123 may request the cache server layer 130 to perform a purge based on the purge information stored in the database 125. The uploader 123 may monitor the file list of the object storage device. Once the agent 121 has stored information on the uploaded data in the database 125, the uploader 123 may request the cache server layer 130 to perform a purge based on the purge information.

In step 343, the agent 121 may determine, based on the gateway logs 231, 232, and 301, whether a bucket has changed. The agent 121 may return a true/false value indicating whether a bucket has changed. In response to a bucket change, the agent 121 may use the API of the gateway logs 231, 232, and 301 to convert the bucket information of the object storage device 221 into the URL mapping format of the cache server 131 of the cache server layer 130. The agent 121 may request the cache server 131 to synchronize the bucket information.

FIG. 4 is a flowchart illustrating the operation of a distributed storage method according to one embodiment.

According to one embodiment, the distributed storage device may improve the performance of the distributed storage system by combining an object storage layer with a cache server layer. In this way, the distributed storage device can select and connect an arbitrary cache server much like a driver, which reduces dependence on any particular cache server.

In step 410, the distributed storage device stores data in the object storage layer. For example, the distributed storage device may receive a data upload request from a client. The distributed storage device may store the data received through a gateway in an object storage device. The distributed storage device may store information related to the data input in a gateway log.

In step 420, the distributed storage device synchronizes the first data stored in the object storage layer with one or more cache servers. In step 421, the distributed storage device may select, based on the state of the object storage layer, a cache server to connect from among different types of cache servers. In step 422, the distributed storage device may connect the selected cache server to the object storage layer.

In this way, the distributed storage device can disconnect a connected cache server or connect a new one according to the demand for and state of the storage space. The distributed storage device does not depend on a particular cache server layer and can freely choose, for example, a commercial cache server or an open-source cache server.
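
The driver-like selection of steps 421 and 422 can be sketched with a common interface that each cache server type implements. The specification does not state the selection criteria, so the threshold policy below is purely an assumption for illustration; the class names, the `read_demand` and `threshold` fields, and the driver labels are likewise hypothetical.

```python
from abc import ABC, abstractmethod


class CacheDriver(ABC):
    """Common interface the bridge layer uses for any type of cache server."""

    @abstractmethod
    def purge(self, url):
        ...


class RecordingDriver(CacheDriver):
    """Toy driver that records purge calls instead of contacting a server."""

    def __init__(self, label):
        self.label = label
        self.purged = []

    def purge(self, url):
        self.purged.append(url)


class Bridge:
    def __init__(self, drivers):
        self.drivers = drivers   # name -> CacheDriver
        self.active_name = None

    def select(self, storage_state):
        """Step 421: pick a driver from the storage state (assumed policy)."""
        heavy = storage_state["read_demand"] > storage_state["threshold"]
        return "commercial" if heavy else "open_source"

    def connect(self, name):
        """Step 422: connect the selected driver, releasing any previous one."""
        self.active_name = name
        return self.drivers[name]
```

Because the bridge depends only on `CacheDriver`, swapping one cache server type for another does not require changes to the rest of the system in this sketch.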

FIG. 5 is a flowchart illustrating the operation of a distributed storage method according to another embodiment.

According to one embodiment, the distributed storage device may improve the performance of the distributed storage system by combining an object storage layer with a cache server layer. The distributed storage device may use the cache server layer as a reverse proxy. This allows the distributed storage system 100 to strengthen security and improve read performance.

In step 510, the distributed storage device stores data in the object storage layer. For example, the distributed storage device may receive a data upload request from a client. The distributed storage device may store the data received through a gateway in an object storage device. The distributed storage device may store information related to the data input in a gateway log.

In step 520, the distributed storage device synchronizes the first data stored in the object storage layer with one or more cache servers. The distributed storage device may send a purge request to the cache server in response to a data change in the object storage layer. In addition to relying on a predetermined TTL, the distributed storage device can actively perform a purge according to the demand for and state of the storage space. The TTL may refer to the expiration time of data stored in the cache server layer. Typically, the cache server layer sets a TTL in advance for each piece of data. When the system relies on the TTL alone, a purge is performed only after the TTL expires even if the data state in the object storage layer has changed, so synchronization performance may degrade. By performing a purge in response to a data state change, independently of the TTL, the distributed storage device can further improve synchronization performance.
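
The difference between waiting for the TTL and purging on a data change can be shown with a small simulation. This is a sketch under stated assumptions: the origin is an in-memory dictionary, time is passed in explicitly as `now`, and `TtlCache` and its methods are hypothetical names rather than any real cache server API.

```python
class TtlCache:
    """Toy cache with a fixed TTL and an explicit purge operation."""

    def __init__(self, origin, ttl):
        self.origin = origin   # dict standing in for the object storage layer
        self.ttl = ttl         # seconds an entry stays valid
        self.entries = {}      # key -> (value, expires_at)

    def get(self, key, now):
        """Return the cached value, refetching from the origin once expired."""
        entry = self.entries.get(key)
        if entry is None or now >= entry[1]:
            value = self.origin[key]
            self.entries[key] = (value, now + self.ttl)
            return value
        return entry[0]

    def purge(self, key):
        """Drop the entry immediately, regardless of its remaining TTL."""
        self.entries.pop(key, None)


origin = {"a": "v1"}
cache = TtlCache(origin, ttl=60)
first = cache.get("a", now=0)     # fetched from the origin
origin["a"] = "v2"                # data changes in the object storage layer
stale = cache.get("a", now=10)    # TTL not expired, old value still served
cache.purge("a")                  # purge triggered by the data change
fresh = cache.get("a", now=11)    # new value served without waiting for TTL
```

Without the purge, the stale value would continue to be served until `now` reached 60 in this example.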

The distributed storage device may send a synchronization request to the cache server in response to a change in a logical object store of the object storage layer. The distributed storage device can actively synchronize the logical object store information to the cache server layer. Accordingly, the cache server layer does not need to synchronize the logical object store information separately. In this way, the distributed storage device can further improve synchronization performance.

In step 530, the distributed storage device transmits the first data stored in the cache server to the client in response to the client's request for transmission of the first data. The client may receive the desired data from the cache server layer by sending its request to the cache server layer rather than directly to the object storage layer.

Accordingly, the distributed storage device can provide the client with a consistent level of storage service regardless of the size of the object storage layer. Because the cache server layer operates as a reverse proxy, the original data can be preserved in the object storage layer even if data in the cache server layer is lost.

The method according to an embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiments, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the embodiments, and vice versa.

The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure a processing device to operate as desired or may instruct the processing device independently or collectively. The software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium, or device in order to be interpreted by the processing device or to provide instructions or data to the processing device. In addition, the software and/or data may be embodied, permanently or temporarily, in a transmitted signal wave in order to provide instructions or data to the processing device. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.

Although the embodiments have been described above with reference to a limited set of drawings, a person of ordinary skill in the art can apply various technical modifications and variations based on the above. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described system, structure, device, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.


Claims

A distributed storage system comprising:
two or more cache servers;
an object storage layer; and
a bridge layer,
wherein the object storage layer stores first data,
wherein the bridge layer synchronizes the first data stored in the object storage layer with the two or more cache servers,
wherein the two or more cache servers include different types of cache servers,
wherein the bridge layer selects a cache server to connect from among the different types of cache servers based on the state of the object storage layer and the demand for storage space, and connects the selected cache server to the object storage layer,
wherein the bridge layer transmits a purge request to the cache server in response to a state change of the object storage layer corresponding to the storage of the first data,
wherein the object storage layer may upload the first data received from a client to at least one object storage device included in the object storage layer,
wherein the bridge layer generates purge information corresponding to the uploaded first data and transmits a purge request including the purge information to the cache server,
wherein the cache server performs a purge based on the purge information,
wherein the object storage layer stores the first data and a first logical object store corresponding to the first data, and changes the first logical object store according to a signal received from the client,
wherein the bridge layer transmits, to the cache server, a synchronization request corresponding to the changed first logical object store in response to the change of the first logical object store of the object storage layer,
wherein the bridge layer generates mapping information of the changed first logical object store and transmits a synchronization request including the mapping information to the cache server, and
wherein the cache server synchronizes, based on the mapping information, a second logical object store that is stored in the cache server and corresponds to the first logical object store.

The distributed storage system according to claim 1,
wherein the different types of cache servers include a commercial cache server or an open source-based cache server.

A distributed storage system comprising:
cache servers;
an object storage layer; and
a bridge layer,
wherein the object storage layer stores first data,
wherein the bridge layer synchronizes the first data stored in the object storage layer with the cache servers,
wherein the cache servers operate as a reverse proxy and transmit the first data to a client in response to a request from the client for transmission of the first data,
wherein the cache servers include different types of cache servers,
wherein the bridge layer selects a cache server to connect from among the different types of cache servers based on the state of the object storage layer and the demand for storage space, and connects the selected cache server to the object storage layer,
wherein the bridge layer transmits a purge request to the cache server in response to a state change of the object storage layer corresponding to the storage of the first data,
wherein the object storage layer may upload the first data received from the client to at least one object storage device included in the object storage layer,
wherein the bridge layer generates purge information corresponding to the uploaded first data and transmits a purge request including the purge information to the cache server,
wherein the cache server performs a purge based on the purge information,
wherein the object storage layer stores the first data and a first logical object store corresponding to the first data, and changes the first logical object store according to a signal received from the client,
wherein the bridge layer transmits, to the cache server, a synchronization request corresponding to the changed first logical object store in response to the change of the first logical object store of the object storage layer,
wherein the bridge layer generates mapping information of the changed first logical object store and transmits a synchronization request including the mapping information to the cache server, and
wherein the cache server synchronizes, based on the mapping information, a second logical object store that is stored in the cache server and corresponds to the first logical object store.

delete