KR101104999B1

KR101104999B1 - Load balancing method and system for metadata service

Info

Publication number: KR101104999B1
Application number: KR1020100101294A
Authority: KR
Inventors: 윤희용; 박선식; 황명진; 유창훈
Original assignee: 성균관대학교산학협력단
Priority date: 2010-10-18
Filing date: 2010-10-18
Publication date: 2012-01-16

Abstract

PURPOSE: A load balancing method and system for a metadata service are provided to deliver seamless service and to minimize administrator intervention in a cloud computing environment. CONSTITUTION: An MDS watcher periodically receives a heartbeat from each MDS (metadata server). An MDS controller calculates an MDS utilization value for each MDS from the heartbeat. The MDS controller designates the MDS with the smallest MDS utilization value as a shadow MDS and copies a hotspot file to the shadow MDS. When a client queries the hotspot file, a channel controller connects the client to the shadow MDS. An update manager transmits change information for a metadata lookup table to an MLTS (metadata lookup table server).

Description

Load balancing method and system for metadata service {LOAD BALANCING METHOD AND SYSTEM FOR METADATA SERVICE}

The present invention relates to a load balancing method and system for a metadata service, and more particularly, to a cloud-computing-based load balancing method and system for a metadata service using a master metadata server.

Demand for cloud computing is growing along with the demand, in cluster-based storage systems, to make full use of already deployed storage by employing low-cost idle equipment rather than adding hardware.

In cloud computing, virtualized storage is one of the key factors. The combination of virtualization with storage technology built on the various existing cluster file systems is developing into a core technology of cloud computing, and metadata (MD) management for efficient file system I/O is becoming increasingly important. Nevertheless, problems such as growing network traffic, bottlenecks, and load balancing remain unsolved.

An object of the present invention is to provide a load balancing method and system for a metadata service that address the problems in cloud computing described above.

One aspect of the present invention relates to a method of performing load balancing for a metadata service by placing a master metadata server (MMDS) and a metadata lookup table server (MLTS) between a client and a plurality of metadata servers (MDS) and having each MDS transmit its heartbeat only to the MMDS.

To achieve the above object, the load balancing method according to the present invention includes: a first step in which the MMDS periodically receives a heartbeat from each MDS; a second step in which the MMDS calculates an MDS utilization value for each MDS using the heartbeat; a third step in which the MMDS examines the MDS utilization value of each MDS and determines an MDS whose MDS utilization value is greater than or equal to a hotspot determination reference value to be a hotspot MDS; a fourth step in which, when an MDS has been determined to be a hotspot MDS in the third step, the MMDS examines the MDS utilization value of each MDS and determines, as a shadow MDS, the MDS having the smallest MDS utilization value among the MDSs whose MDS utilization value is less than the hotspot determination reference value; a fifth step in which the MMDS copies a hotspot file present in the hotspot MDS to the shadow MDS; and a sixth step in which, when the hotspot file is queried by the client, the MMDS connects the client to the shadow MDS. The heartbeat includes information indicating the state of the hardware of the MDS and information indicating the state of the metadata service delay time of the MDS. Here, the information indicating the state of the hardware of the MDS includes the maximum throughput of the CPU, the current throughput of the CPU, the maximum throughput of the RAM, and the current throughput of the RAM, and the information indicating the state of the metadata service delay time of the MDS may include the average waiting time of metadata in the MDS queue and the number of update traffic in the MDS.

According to an embodiment of the present invention, the MLTS and each MDS each have a metadata lookup table, and the method may further include, after the sixth step, a seventh step in which an MDS whose metadata lookup table has changed transmits a metadata lookup table change message to the MLTS, and an eighth step in which the MLTS transmits the metadata lookup table change message to an MDS whose metadata lookup table has not changed. When there are two or more MDSs whose metadata lookup table has not changed, the MLTS may determine the order in which the change message is transmitted in consideration of the MDS utilization value.

Another aspect of the present invention relates to a system for performing load balancing for a metadata service, the system including a master metadata server (MMDS) and a metadata lookup table server (MLTS) placed between a client and a plurality of metadata servers (MDS).

To achieve the above object, the MMDS includes: an MDS watcher that periodically receives a heartbeat from each MDS and receives metadata lookup table change information; an MDS controller that calculates an MDS utilization value for each MDS using the heartbeat, examines the MDS utilization value of each MDS, determines an MDS whose MDS utilization value is greater than or equal to a hotspot determination reference value to be a hotspot MDS, determines, as a shadow MDS, the MDS having the smallest MDS utilization value among the MDSs whose MDS utilization value is less than the hotspot determination reference value, and then copies a hotspot file present in the hotspot MDS to the shadow MDS; a channel controller that connects the client to the shadow MDS when the hotspot file is queried by the client; and an update manager that transmits the MDS utilization value of each MDS and the metadata lookup table change information to the MLTS. The MLTS and each MDS each have a metadata lookup table, and the heartbeat includes information indicating the state of the hardware of the MDS and information indicating the state of the metadata service delay time of the MDS. Here, the information indicating the state of the hardware of the MDS includes the maximum throughput of the CPU, the current throughput of the CPU, the maximum throughput of the RAM, and the current throughput of the RAM, and the information indicating the state of the metadata service delay time of the MDS may include the average waiting time of metadata in the MDS queue and the number of update traffic in the MDS.

According to an embodiment of the present invention, the MLTS receives a metadata lookup table change message from an MDS whose metadata lookup table has changed and transmits that message to an MDS whose metadata lookup table has not changed. When there are two or more MDSs whose metadata lookup table has not changed, the MLTS may determine the order in which the change message is transmitted in consideration of the MDS utilization value.

The method and system according to the present invention minimize administrator intervention in a cloud computing environment and allow continuous service without interruption. They also make it possible to maintain and manage large amounts of data using low-cost storage servers. In particular, the MMDS proposed in the present invention helps reduce network traffic, and the MLTS works together with the MMDS to enable efficient MLT updates.

Referring to FIGS. 5 to 7, a performance evaluation across varying numbers of MDSs and clients shows that the method and system according to the present invention perform markedly better than the prior art.

FIG. 1 is a schematic diagram of a load balancing system for a metadata service according to the present invention.
FIG. 2 is a schematic diagram illustrating the configuration of the master metadata server (MMDS) included in the load balancing system for a metadata service according to the present invention.
FIG. 3 shows an algorithm for explaining load balancing according to the present invention.
FIG. 4 is a flowchart of the load balancing method according to the present invention.
FIG. 5 is a graph comparing the metadata processing time as the number of MDSs varies under the method and system according to the present invention with that under the prior art.
FIG. 6 is a graph comparing the metadata throughput as the number of MDSs varies under the method and system according to the present invention with that under the prior art.
FIG. 7 compares the standard deviation of the number of metadata queries waiting in the metadata server queues under the method and system according to the present invention with that under the prior art.

FIG. 1 is a schematic diagram of a load balancing system for a metadata service according to the present invention.

As shown in FIG. 1, the present invention performs load balancing for a metadata service by placing a master metadata server (MMDS) and a metadata lookup table server (MLTS) between a client and a plurality of metadata servers (MDS).

The MMDS periodically receives heartbeats from all MDSs. A heartbeat includes information indicating the state of the hardware of the MDS and information indicating the state of the metadata service delay time of the MDS. When a hotspot occurs, the MMDS therefore uses this information to coordinate load balancing among the MDSs.

This MMDS platform makes it possible to ensure scalability, flexibility, and stability for the metadata service and, depending on the embodiment, may use the Xen scheme for virtualization.

To represent the state of each MDS, the system according to the present invention uses queuing network theory. The number of metadata queries in the system is assumed not to affect the arrival time or the average service time of new metadata queries. If the arrival time of new metadata queries and the average service rate are assumed to be constant, Equations 1 and 2 hold.

Here, the quantity given by Equation 1 is the average number of queries entering the system, and the quantity given by Equation 2 is the average processing rate of queries in the system.

The load of the system can then be obtained from Equation 3, whose left-hand side denotes the load of the system.

Therefore, the probability that the number of queries in the system is n in the steady state can be obtained from Equation 4. Here, the first symbol denotes the probability that the number of queries in the system is n, and the second symbol is obtained from Equation 5.

In the steady state, the number of queries in the system is n. The number of metadata queries is assumed to satisfy the relationship in Equation 6, whose left-hand side denotes the average number of queries in the system.

The average time that a query must wait in the queue of each MDS, that is, the average waiting time of metadata in the MDS queue, can be obtained from Equation 7.

From this, the probability that the time T a customer (query) must spend in the system exceeds t can be obtained, given the probability that n queries are present in the system. This probability satisfies the relationship in Equation 9, which is derived from Equation 8 expressing the law of total probability.

The probability can therefore be obtained from Equation 10.
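For reference, the quantities described around Equations 1 to 10 have a well-known closed form when each MDS is modeled as a single-server M/M/1 queue with constant arrival and service rates. The block below is a textbook sketch under that assumption only. The symbols \lambda, \mu, \rho, P_n, L, W_q, and T are conventional notation chosen here, and the exact form and numbering of the equations in the specification may differ.

```latex
\begin{align*}
\lambda_n &= \lambda, \qquad \mu_n = \mu
  && \text{(constant arrival and service rates)} \\
\rho &= \frac{\lambda}{\mu}
  && \text{(system load, } \rho < 1\text{)} \\
P_n &= \rho^{n} P_0, \qquad P_0 = 1 - \rho
  && \text{(steady-state probability of } n \text{ queries)} \\
L &= \sum_{n=0}^{\infty} n P_n = \frac{\rho}{1-\rho} = \frac{\lambda}{\mu-\lambda}
  && \text{(average number of queries in the system)} \\
W_q &= \frac{\lambda}{\mu(\mu-\lambda)}
  && \text{(average waiting time in the queue)} \\
P(T > t) &= \sum_{n=0}^{\infty} P(T > t \mid n)\, P_n = e^{-(\mu-\lambda)t}
  && \text{(time in system exceeds } t\text{)}
\end{align*}
```

As a worked example under the same assumption, \lambda = 7 and \mu = 10 queries per second give \rho = 0.7, L = 7/3 \approx 2.33 queries, and W_q = 7/30 \approx 0.23 seconds.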

Using the information indicating the state of the hardware of the MDS and the information indicating the state of the metadata service delay time of the MDS, the MDS utilization value can be obtained from Equation 11. Equation 11 takes as an example the case in which the hardware comprises a CPU and RAM. In this case, the information indicating the state of the hardware includes the maximum throughput of the CPU, the current throughput of the CPU, the maximum throughput of the RAM, and the current throughput of the RAM. The information indicating the state of the metadata service delay time of the MDS may include the average waiting time of metadata in the MDS queue and the number of update traffic in the MDS. The utilization of the CPU and RAM changes with these quantities, so the term affected by metadata updates and heartbeat messages will change as well.

Here, the left-hand side of Equation 11 denotes the MDS utilization value.

In the present invention, the MMDS periodically receives a heartbeat from each MDS in order to obtain the MDS utilization value.
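The heartbeat contents listed above can be pictured as a small record. The following Python sketch is illustrative only: the class and field names are hypothetical, and the `utilization` function is a simple stand-in score, not Equation 11 of the specification, whose exact form is not reproduced in this text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Heartbeat:
    """Fields named in the description; attribute names are hypothetical."""
    mds_id: str
    cpu_max: float         # maximum CPU throughput
    cpu_now: float         # current CPU throughput
    ram_max: float         # maximum RAM throughput
    ram_now: float         # current RAM throughput
    avg_queue_wait: float  # average waiting time of metadata in the MDS queue
    update_traffic: int    # number of update traffic in the MDS


def utilization(hb: Heartbeat) -> float:
    """Stand-in score in [0, 1]; NOT Equation 11 of the patent.

    It averages the CPU and RAM load ratios and ignores the two
    latency-related fields, which the real formula takes into account.
    """
    cpu_ratio = min(hb.cpu_now / hb.cpu_max, 1.0)
    ram_ratio = min(hb.ram_now / hb.ram_max, 1.0)
    return (cpu_ratio + ram_ratio) / 2.0


hb = Heartbeat("mds-1", cpu_max=100.0, cpu_now=80.0,
               ram_max=64.0, ram_now=32.0,
               avg_queue_wait=0.02, update_traffic=5)
print(round(utilization(hb), 2))
```

Keeping the score in the range 0 to 1 makes it directly comparable with a threshold such as the example value 0.7 mentioned for hotspot detection.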

FIG. 2 is a schematic diagram illustrating the configuration of the master metadata server (MMDS) included in the load balancing system for a metadata service according to the present invention.

Referring to FIG. 2, the MMDS not only relays between the client and the MDSs but also processes metadata smoothly and enables MLT updates through the CC, UM, RC, MC, MW, XCM, XC, and RMSS components it contains. The function of each component is as follows.

The channel controller (CC) manages connections with the MDSs, creating and deleting channels. For example, when a hotspot file is queried by a client, the channel controller may enable the MMDS to connect the client to the shadow MDS.

The update manager (UM) receives MDS status reports from the MC and reports the current status of the MDSs to the MLTS. For example, the update manager may transmit the MDS utilization value of each MDS and the metadata lookup table change information to the MLTS.

The relay controller (RC) establishes a direct connection between the client and a physical server.

The MDS controller (MC) receives MDS status reports from the MW, forwards them to the controller that needs them, and manages them. For example, the MDS controller may calculate the MDS utilization value of each MDS using the heartbeat, examine the MDS utilization value of each MDS, determine an MDS whose MDS utilization value is greater than or equal to the hotspot determination reference value to be a hotspot MDS, determine, as a shadow MDS, the MDS having the smallest MDS utilization value among the MDSs whose MDS utilization value is less than the hotspot determination reference value, and then copy the hotspot file present in the hotspot MDS to the shadow MDS.

The MDS watcher (MW) monitors the status and other information of the MDSs and exchanges heartbeats. For example, the MDS watcher may periodically receive a heartbeat from each MDS and receive metadata lookup table change information.

FIG. 3 shows an algorithm for explaining load balancing according to the present invention.

Within a file system, many client access requests for a particular file occur frequently. If that file can be replicated to less loaded MDSs and processed in a distributed manner, clients can handle the file quickly. In the conventional method, when a hotspot occurs, the information about the file is updated and the update is announced to all MDSs and clients. When a client then accesses the file, it is asked to update its MLT version, and the updated information is used to find shadow nodes that can stand in for the MDS where the hotspot occurred, one of which is selected at random. Announcing the update to all MDSs and clients, however, increases network traffic. To handle this more simply, in the present invention the MMDS receives MDS utilization information from the MDSs when a hotspot occurs, and then processes it according to the algorithm shown in FIG. 3.

The MMDS periodically checks the MDS utilization values, and if the MDS utilization value of a particular MDS is greater than or equal to α, that MDS is selected as a hotspot node. Simulation with 10 MDSs confirmed that the metadata processing delay increases sharply once the MDS utilization value reaches α (for example, 0.7). The value of α may change, however, if the number of MDSs, the CPU, the RAM, or similar factors change. When a particular MDS is selected as a hotspot node, shadow MDSs are found and the hotspot file present on the hotspot node is replicated to them. The client is unaware of this process. When the client then tries to access the hotspot file, the MMDS connects the client to another MDS that holds a replica, which resolves the hotspot, and once the hotspot is released, the file on the shadow node is deleted.
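The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm of FIG. 3 itself: the function name, the dictionary input, and the tie-breaking behavior are assumptions, and the threshold 0.7 is only the example value given in the text.

```python
from typing import Dict, List, Optional, Tuple

ALPHA = 0.7  # example hotspot threshold from the text; deployment dependent


def pick_hotspots_and_shadow(
    utilization: Dict[str, float], alpha: float = ALPHA
) -> Tuple[List[str], Optional[str]]:
    """Return (hotspot MDS ids, shadow MDS id or None)."""
    # An MDS at or above the threshold is a hotspot.
    hotspots = sorted(m for m, u in utilization.items() if u >= alpha)
    if not hotspots:
        return [], None
    # The shadow is the least utilized MDS among those below the threshold.
    candidates = {m: u for m, u in utilization.items() if u < alpha}
    if not candidates:
        return hotspots, None
    shadow = min(candidates, key=candidates.get)
    return hotspots, shadow


values = {"mds-1": 0.82, "mds-2": 0.35, "mds-3": 0.10, "mds-4": 0.55}
print(pick_hotspots_and_shadow(values))  # (['mds-1'], 'mds-3')
```

Returning `None` when every MDS is at or above the threshold is a design choice of this sketch; the text does not say what happens in that case.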

FIG. 4 is a flowchart of the load balancing method according to the present invention, which incorporates the algorithm shown in FIG. 3.

The load balancing method according to the present invention includes: periodically receiving a heartbeat from each MDS (S101); calculating the MDS utilization value (S103); checking the MDS utilization value (S105); determining an MDS whose MDS utilization value is greater than or equal to the reference value to be a hotspot MDS (S107); determining the MDS with the smallest MDS utilization value to be a shadow MDS (S109); copying the hotspot file of the hotspot MDS to the shadow MDS (S111); and, when the hotspot file is queried, having the MMDS connect the client to the shadow MDS (S113).

After the MDS utilization value is determined, the load balancing method according to the present invention may further include a step in which an MDS whose metadata lookup table has changed transmits a metadata lookup table change message to the MLTS, and a step in which the MLTS transmits the metadata lookup table change message to an MDS whose metadata lookup table has not changed. When there are two or more MDSs whose metadata lookup table has not changed, the MLTS may determine the order in which the change message is transmitted in consideration of the MDS utilization value. This is described in detail below.

Table 1 illustrates the metadata lookup table according to the present invention.

Table 1
Range        Identification number    MLT version
0~1000       1                        20100225122425
1001~2000    2                        20100225122425
2001~3000    3                        20100225122425
3001~4000    4                        20100225122425

In the present invention, as shown in Table 1, a metadata lookup table (MLT) is used instead of using the hash result directly. Accordingly, when an MDS node is added to or removed from the cluster, or when a file is renamed or a directory is changed, the MLT is simply modified without changing the hash function, which reduces system-wide ripple effects such as having to modify the hash function.
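The indirection through the MLT can be illustrated with the ranges of Table 1. In the Python sketch below, the hash function (MD5 reduced to the range 0 to 4000) and all names are hypothetical choices made for illustration; the specification does not state which hash function is used.

```python
import bisect
import hashlib

# Upper bounds of the ranges in Table 1 and the matching identification numbers.
RANGE_UPPER = [1000, 2000, 3000, 4000]
MDS_IDS = [1, 2, 3, 4]
HASH_SPACE = 4001  # hash values 0..4000


def hash_path(path: str) -> int:
    """Hypothetical hash: MD5 of the path reduced to 0..4000."""
    digest = hashlib.md5(path.encode("utf-8")).hexdigest()
    return int(digest, 16) % HASH_SPACE


def lookup_mds(hash_value: int) -> int:
    """Map a hash value to the identification number whose range contains it."""
    if not 0 <= hash_value <= RANGE_UPPER[-1]:
        raise ValueError("hash value outside the table")
    return MDS_IDS[bisect.bisect_left(RANGE_UPPER, hash_value)]


print(lookup_mds(1000), lookup_mds(1001))  # 1 2
print(lookup_mds(hash_path("/home/user/report.txt")))
```

Reassigning a range to a different MDS only requires editing `MDS_IDS` (and bumping the table version); `hash_path` stays untouched, which is the point of the lookup table.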

The MLTS and every MDS each hold an MLT. In the conventional DH (Dynamic Hashing) update scheme, the MDSs update one another by communicating directly whenever an MDS node is added to or removed from the cluster or a file name or directory changes. This is not a problem among a small number of MDSs, but in a cloud computing environment numerous nodes are added and removed, and many clients rename, delete, and move files and directories. Exchanging update messages with every MDS in this way is not only inefficient but also generates heavy network traffic among the MDSs. For example, if the MLTs of 5 out of 10 MDSs must be updated, each of those 5 MDSs has to deliver an update message to every node other than itself. A total of 45 update messages are then transmitted and processed, which lowers the efficiency of the metadata service. The problem becomes more pronounced as the number of MDSs grows. Therefore, if the network traffic caused by MLT updates is regarded as an increasing factor, the total delay of the system is extended by that factor and the efficiency of the metadata service decreases.
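The figure of 45 messages follows from each of the 5 changed MDSs notifying the 9 other MDSs, that is, 5 × 9 = 45. The short Python sketch below enumerates those messages; the function name and MDS labels are hypothetical.

```python
from typing import List, Tuple


def peer_to_peer_updates(all_mds: List[str], changed: List[str]) -> List[Tuple[str, str]]:
    """List (sender, receiver) pairs when each changed MDS notifies every other MDS."""
    return [(src, dst) for src in changed for dst in all_mds if dst != src]


all_mds = [f"mds-{i}" for i in range(1, 11)]  # 10 MDSs
changed = all_mds[:5]                         # 5 of them have a changed MLT
print(len(peer_to_peer_updates(all_mds, changed)))  # 45
```

In general, c changed MDSs out of N produce c × (N − 1) messages under this scheme, so the count grows with both the cluster size and the change rate.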

The proposed update method does not notify each MDS of a change. When metadata changes on any MDS, the change message is reported only to the MLTS. Because updates are then carried out taking CPU, RAM, and W into account, the resulting update traffic arises under favorable conditions, and its impact on the system is reduced. That is, for MLT updates, the MMDS takes the utilization of each MDS into account and instructs the MLTS to issue update commands in priority order, starting with the MLT of the least loaded MDS.
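One way to picture the priority ordering is to sort the MDSs that still need the change by their utilization value, least loaded first. The Python sketch below assumes the utilization values are already available as a dictionary; the function name and the tie-break on the MDS id are illustrative choices, not details from the specification.

```python
from typing import Dict, List, Set


def update_order(utilization: Dict[str, float], changed: Set[str]) -> List[str]:
    """MDSs that still need the MLT change, least utilized first."""
    pending = [m for m in utilization if m not in changed]
    # Sort by utilization, then by id so that ties are resolved deterministically.
    return sorted(pending, key=lambda m: (utilization[m], m))


values = {"mds-1": 0.82, "mds-2": 0.35, "mds-3": 0.10, "mds-4": 0.55}
print(update_order(values, changed={"mds-2"}))  # ['mds-3', 'mds-4', 'mds-1']
```

Sending to busy servers last gives them time to drain their queues before they have to process the table update.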

The MLT contains a version information field, and this version information is used to check whether an update has been applied. Even so, periodic updates are also performed in case some node could not be updated because of sustained network traffic. If a client, directed by hashing and the MLT to an MDS, does not find the appropriate metadata there, the request is not broadcast to all MDSs. Instead, the current MDS obtains the updated MLT from the MLTS, and the desired metadata can then be found.
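The version check can be sketched as a comparison between an MDS's local table and the table held by the MLTS. The Python code below is a minimal illustration that assumes versions are integers that only increase (the example value in Table 1 looks like a timestamp); the class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Range = Tuple[int, int]


@dataclass
class MLT:
    version: int
    ranges: Dict[Range, int] = field(default_factory=dict)  # range -> MDS id


def needs_update(local: MLT, master: MLT) -> bool:
    """A local table is stale if the MLTS holds a newer version."""
    return local.version < master.version


def sync(local: MLT, master: MLT) -> bool:
    """Copy the MLTS table into the local one if it is newer; return True if updated."""
    if not needs_update(local, master):
        return False
    local.ranges = dict(master.ranges)
    local.version = master.version
    return True


master = MLT(20100225122425, {(0, 1000): 1, (1001, 2000): 2})
local = MLT(20100224000000, {(0, 1000): 1})
print(sync(local, master), local.version)  # True 20100225122425
```

The same `sync` call serves both the periodic refresh and the on-demand pull after a failed lookup, so a miss never has to be broadcast to every MDS.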

The above description merely illustrates the technical idea of the present invention, and those of ordinary skill in the art to which the present invention pertains will be able to make various modifications and variations without departing from its essential characteristics. The embodiments disclosed herein are therefore intended to explain, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present invention should be construed according to the claims below, and all technical ideas within their equivalent scope should be construed as falling within the scope of the present invention.

Claims

A method of performing load balancing for a metadata service in which a master metadata server (MMDS) and a metadata lookup table server (MLTS) are placed between a client and a plurality of metadata servers (MDS), and each MDS sends heartbeats only to the MMDS, the method comprising:
A first step in which the MMDS periodically receives a heartbeat from each MDS;
A second step in which the MMDS uses the heartbeat to calculate an MDS utilization value for each MDS;
A third step in which the MMDS examines the MDS utilization value of each MDS and determines an MDS whose MDS utilization value is greater than or equal to a hotspot determination reference value to be a hotspot MDS;
A fourth step in which, if an MDS was determined to be a hotspot MDS in the third step, the MMDS examines the MDS utilization value of each MDS and determines, as a shadow MDS, the MDS having the smallest MDS utilization value among the MDSs whose MDS utilization value is less than the hotspot determination reference value;
A fifth step in which the MMDS copies the hotspot file existing in the hotspot MDS to the shadow MDS; and
A sixth step in which, if the hotspot file is queried by the client, the MMDS connects the client to the shadow MDS,
wherein the heartbeat includes information indicating the hardware state of the MDS and information indicating the metadata service delay time of the MDS.
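The third and fourth steps of the method above can be illustrated with a short sketch. It assumes the MDS utilization values have already been computed and are passed in as plain numbers; the function name `classify`, the dictionary layout, and the example threshold are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple


def classify(utilization: Dict[str, float],
             threshold: float) -> Tuple[List[str], Optional[str]]:
    """Return (hotspot MDS ids, shadow MDS id or None)."""
    # Third step: an MDS at or above the reference value is a hotspot MDS.
    hotspots = sorted(m for m, u in utilization.items() if u >= threshold)
    if not hotspots:
        return [], None
    # Fourth step: among the MDSs below the reference value,
    # the one with the smallest utilization value becomes the shadow MDS.
    candidates = {m: u for m, u in utilization.items() if u < threshold}
    if not candidates:
        return hotspots, None
    shadow = min(candidates, key=candidates.get)
    return hotspots, shadow
```

For example, `classify({"mds1": 0.9, "mds2": 0.4, "mds3": 0.2}, 0.8)` marks `mds1` as the hotspot MDS and selects `mds3` as the shadow MDS. The case in which every MDS is at or above the reference value is not addressed by the claim; the sketch simply returns no shadow MDS.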

The method of claim 1, wherein
the information indicating the hardware state of the MDS comprises the maximum throughput of the CPU, the current throughput of the CPU, the maximum throughput of the RAM, and the current throughput of the RAM,
the information indicating the metadata service delay time of the MDS comprises the average latency of metadata in the MDS queue and the number of update traffic in the MDS, and
the MDS utilization value is calculated by Equation 1 below.
<Equation 1>
[Equation 1 and the symbols for each quantity, including the symbol for the MDS utilization value, are images in the original publication and are not reproduced in this text.]

The method of claim 2, wherein
the MLTS and each MDS each have a metadata lookup table, and
after the sixth step, an MDS in which a change of the metadata lookup table has occurred sends a metadata lookup table change message to the MLTS, and the MLTS transmits the metadata lookup table change message to the MDSs in which no change of the metadata lookup table has occurred, wherein, when there are two or more MDSs in which no change of the metadata lookup table has occurred, the order in which the metadata lookup table change message is transmitted is determined in consideration of the MDS utilization value.
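One possible reading of the ordering rule can be sketched as follows. The claim states only that the MDS utilization value is taken into consideration; sending to the least-loaded MDS first is an assumption made here for illustration, and the function name `propagation_order` is hypothetical.

```python
from typing import Dict, List


def propagation_order(utilization: Dict[str, float], changed: str) -> List[str]:
    """Order the MDSs that still need the MLT change message."""
    # Every MDS except the one whose MLT changed is a recipient.
    pending = [m for m in utilization if m != changed]
    # Assumption: least-loaded MDS first. The source does not fix the direction.
    return sorted(pending, key=lambda m: utilization[m])
```

With utilization values `{"mds1": 0.7, "mds2": 0.2, "mds3": 0.5}` and a change originating at `mds1`, this sketch sends the message to `mds2` and then to `mds3`.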

A system for performing load balancing for a metadata service, comprising a master metadata server (MMDS) and a metadata lookup table server (MLTS) disposed between a client and a plurality of metadata servers (MDS), the MMDS comprising:
An MDS watcher that periodically receives a heartbeat from each MDS and receives metadata lookup table change occurrence information;
An MDS controller that calculates the MDS utilization value of each MDS using the heartbeat, examines the MDS utilization value of each MDS, determines an MDS whose MDS utilization value is greater than or equal to a hotspot determination reference value to be a hotspot MDS, determines, as a shadow MDS, the MDS having the smallest MDS utilization value among the MDSs whose MDS utilization value is less than the hotspot determination reference value, and then copies the hotspot file existing in the hotspot MDS to the shadow MDS;
A channel controller that connects the client to the shadow MDS when the hotspot file is queried by the client; and
An update manager that transmits the MDS utilization value of each MDS and the metadata lookup table change occurrence information to the MLTS,
wherein the MLTS and each MDS each have a metadata lookup table, and
the heartbeat includes information indicating the hardware state of the MDS and information indicating the metadata service delay time of the MDS.

The system of claim 4, wherein
the information indicating the hardware state of the MDS comprises the maximum throughput of the CPU, the current throughput of the CPU, the maximum throughput of the RAM, and the current throughput of the RAM,
the information indicating the metadata service delay time of the MDS comprises the average latency of metadata in the MDS queue and the number of update traffic in the MDS, and
the MDS utilization value is calculated by Equation 1 below.
<Equation 1>
[Equation 1 and the symbols for each quantity, including the symbol for the MDS utilization value, are images in the original publication and are not reproduced in this text.]

The system of claim 5, wherein
the MLTS receives the metadata lookup table change message from an MDS in which the metadata lookup table has been changed, transmits the metadata lookup table change message to the MDSs in which the metadata lookup table has not been changed, and, when there are two or more MDSs in which the metadata lookup table has not been changed, determines the order in which the metadata lookup table change message is transmitted in consideration of the MDS utilization value.