KR101078287B1

KR101078287B1 - Method Recovering Data Server at the Applying Multiple Reproduce Dispersion File System and Metadata Storage and Save Method Thereof

Info

Publication number: KR101078287B1
Application number: KR1020070131204A
Authority: KR
Inventors: 진기성; 이상민; 김영균; 김명준
Original assignee: Electronics and Telecommunications Research Institute (한국전자통신연구원)
Priority date: 2007-12-14
Filing date: 2007-12-14
Publication date: 2011-10-31
Also published as: KR20090063733A

Abstract

The present invention discloses a method for recovering a data server in a distributed file system that supports multiple replication, together with a metadata storage and a metadata storing method suited to it. The metadata storage comprises a metadata area that stores metadata on file attributes, including location information, and a block area that stores the contents of block indexes, kept separately for each data server that stores the blocks holding the files. When a data server fails, the block index is used to support restoration of what that data server lost.

By storing file-attribute metadata and block-index metadata separately, the invention makes it easy to recover lost data when a data server fails: the index information for the blocks that held files is reconstructed, and other data servers holding the same blocks are selected as recovery sources.

Distributed file system, failure, recovery, metadata server, data server

Description

{Method Recovering Data Server at the Applying Multiple Reproduce Dispersion File System and Metadata Storage and Save Method Thereof}

The present invention relates to a method for recovering a data server in a distributed file system that supports multiple replication, and to a metadata storage and a metadata storing method suited to it. More particularly, it stores file-attribute metadata and block-index metadata separately and, when a data server fails, uses them to reconstruct the index information of the blocks that held files and to select other data servers holding the same blocks, so that lost data can be recovered easily.

The present invention derives from research conducted as part of the IT New Growth Engine Core Technology Development Program of the Ministry of Information and Communication and the Institute for Information Technology Advancement [Project No.: 2007-S-016-01, Project title: Development of a low-cost, large-scale global Internet service solution].

In conventional storage environments, most stored data was business data generated by companies or institutions. With the rapid development of Internet technology, however, storage of multimedia data such as blogs, photos, and videos is growing quickly. Large portal operators that provide Internet services in Korea and abroad generate, store, and manage several terabytes (TB) to several tens of terabytes of new data every month. Existing storage architectures have many problems with scalability and ease of management, and so have been inadequate for coping with a constantly changing service environment.

Recent fundamental advances in storage systems and file systems have come from improving scalability and performance. Specifically, some systems separate the file data I/O path from the file metadata management path, which raises the scalability and performance of a distributed storage system. With this structure, client systems can access storage devices directly, and distributing the metadata removes the bottleneck caused by frequent metadata access, which improves storage scalability.

Enterprise-class storage solutions built on this structure include IBM's StorageTank, Panasas's ActiveScale Storage Cluster, Cluster Filesystems' Lustre, and Google's Google Filesystem. Of these, Google Filesystem replicates the block data of a file across multiple data servers, which gives it higher availability.

In such a network-based distributed file system, the client file system, the metadata server, and the data servers communicate over the network to provide data I/O. To access a particular file, a client first obtains from the metadata server the location of the blocks that hold the file's actual data, then accesses the data server where each block resides and reads the block's data.

FIG. 1 is a block diagram of a multiple-replication-based distributed file system according to the prior art.
As shown in FIG. 1, the system consists of a client 100, which is a user terminal operated as an individual system; a metadata server 200, which stores metadata such as file attributes and block locations; and one or more data servers 300A, 300B, and 300C, which store the actual file data. These are connected over a network and share information.

The client 100 is a user terminal operated as an individual system and may be a personal computer (PC), a personal digital assistant (PDA), or a mobile phone.

The metadata server 200 stores the metadata of the data held on the data servers 300A, 300B, and 300C, including file attributes such as size, creation time, and owner as well as the block locations of each file, and provides this metadata to the client 100.

The data servers 300A, 300B, and 300C store the actual data blocks that correspond to the file attributes held by the metadata server 200 and provide them at the request of the client 100.

Because the same block is replicated and stored on one or more physically separate data servers 300A, 300B, and 300C, the availability of the file system is increased.

Here, the data servers 300A, 300B, and 300C store a file either split into several blocks or as a single contiguous file.

Meanwhile, the metadata server 200 may be deployed as a device separate from the data servers 300A, 300B, and 300C, or it may be implemented on the same device as a data server 300A, 300B, or 300C or the client 100.
The operation of the metadata server 200 and the data servers 300A, 300B, and 300C is described below.

For example, a client 100 that wants to read the file "example.txt" receives from the metadata server 200 the metadata for "example.txt", such as its attributes and block locations. From this metadata it identifies the data server (300A, 300B, or 300C) on which the block resides and requests the block's data from that data server. The data server then provides the client 100 with the data of that block, which is held in its own metadata repository 201.

Because the block requested by the client 100 (block 1) may exist on more than one data server (300A and 300C), as shown in FIG. 1, the client 100 can fetch it from the data server that is closest on the network (for example, 300A), which improves I/O performance through locality.

In such a multiple-replication environment, if one data server holding the desired block (for example, 300A) fails and becomes inaccessible, the same block can be obtained from another data server that is operating normally (for example, 300C), so the file system is highly available.
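The read path just described (prefer the nearest replica, fall back to another replica when a server is unreachable) can be sketched as follows. This is only an illustration under stated assumptions: the `DataServer` class, its `distance` field, and the `read_block` function are hypothetical names, not part of the patent.

```python
from dataclasses import dataclass


@dataclass
class DataServer:
    """Hypothetical stand-in for a data server such as 300A or 300C."""
    name: str
    distance: int          # assumed network distance from the client (lower is closer)
    blocks: dict           # block id -> block data
    alive: bool = True

    def read(self, block_id: int) -> bytes:
        if not self.alive:
            raise ConnectionError(f"{self.name} is unreachable")
        return self.blocks[block_id]


def read_block(block_id: int, replicas: list) -> tuple:
    """Try the replicas from nearest to farthest; return (server name, data)."""
    for server in sorted(replicas, key=lambda s: s.distance):
        try:
            return server.name, server.read(block_id)
        except ConnectionError:
            continue  # this replica is down, so fall back to the next nearest one
    raise IOError(f"no live replica holds block {block_id}")


server_a = DataServer("300A", distance=1, blocks={1: b"block-1 data"})
server_c = DataServer("300C", distance=3, blocks={1: b"block-1 data"})

first = read_block(1, [server_c, server_a])    # served by the nearest replica, 300A
server_a.alive = False                         # simulate a failure of 300A
second = read_block(1, [server_c, server_a])   # served by the surviving replica, 300C
print(first[0], second[0])
```

The fallback keeps the file readable, but it does not restore the lost replica; that is the recovery problem the rest of the description addresses.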

In addition, unlike RAID 1, which replicates at the server level, a block-level multiple-replication environment replicates blocks on a per-file basis. The number of replicas can therefore be set flexibly according to the system's operating environment or the access patterns of applications.

Here, a block is a logical unit that holds data; a file may reside in a single block or span more than one block.

Even in a distributed file system that supports multiple replication of blocks, however, replicated blocks can be lost when an exceptional situation such as a data server failure occurs.

For example, even if two of a file's three replica blocks fail, service can continue as long as one replica remains. But unless the two failed replicas are recovered on an ongoing basis, a further failure can destroy the last remaining replica. At that point only the file's metadata exists and no block holding the actual data remains, so the file cannot be recovered at all.

To guard against this, distributed file systems that support multiple replication generally recover failed blocks by one of the following methods, within the limits needed to guarantee minimum availability.

The first method loads all block information into the memory of the metadata server 200 and, when a failure occurs, collects the information on the failed blocks from memory and then recovers the blocks. As the amount of managed block information grows, however, this method consumes excessive memory, which degrades system performance, and it cannot hold more block information than the physical memory of the metadata server 200 allows. Consequently, even when the data servers' disks have ample free space, a memory shortage on the metadata server 200 can make it impossible to create new files.

The second method stores all block information in a separate database and, when a failure occurs, collects the block information from the database and then recovers the blocks.
Specifically, a dedicated database for block information is built on the metadata server 200 and is updated whenever a block changes. Because the file system then depends on the database, however, operational flexibility suffers, and it is difficult to design an efficient table structure for handling millions of records or more.

The third method keeps no separate block information. Each time a failure occurs, it searches all the metadata to collect the information on the failed blocks and then recovers the blocks. Even when only one data server has failed, the block information in all the metadata must be searched in order to collect the blocks that belonged to the failed data server, so recovery efficiency is low.

The object of the present invention is to provide a method for recovering a data server in a distributed file system supporting multiple replication, and a metadata storage and storing method suited to it, in which file-attribute metadata and block-index metadata are stored separately and, when a data server fails, are used to reconstruct the index information of the blocks that held files and to select other data servers holding the same blocks, so that lost data can be recovered easily.

To achieve the above object, the storage of a metadata server in a distributed file system supporting multiple replication according to the present invention is a storage of a metadata server that manages metadata for files stored in block units on at least one data server, and comprises: a metadata area that stores metadata on file attributes, including location information; and a block area that stores block indexes kept separately for each data server that stores blocks containing the files. When a data server fails, the block index is used to support restoration of what that data server lost.
Here, the file attributes relate to information including the file name, file size, file access permissions, and the location where the file is stored. The block area is reconstructed each time the metadata server system starts, and it manages the contents of the block index in three files: a base file that stores the block index information of the initial state; an add file whose size is 0 in the initial state and to which a new block information entry is appended each time a new block is added; and a delete file whose size is 0 in the initial state and to which the block information entry of a deleted block is appended each time a block is deleted.
The add file and the delete file allow only the addition of new block information entries and do not allow deletion of their contents. A block information entry contains a block identifier and the path of the file to which the block belongs.
According to another aspect of the present invention, there is provided a method by which a metadata server stores metadata for data stored on one or more data servers in a distributed file system supporting multiple replication, comprising: (a) dividing the metadata storage into a metadata area and a block area; (b) receiving metadata from a data server and storing file-attribute information in the metadata area; (c) creating in the block area a base file that stores the information of the blocks in the initial state; and (d) when block information is added or deleted relative to the initial state, appending that block information to the add file or the delete file.
Preferably, step (d) comprises (d-1) inserting a new block information entry at the end of the add file when a new block is added, and (d-2) inserting the block information entry of the deleted block at the end of the delete file when a block is deleted; and the method further comprises, after step (d), repeating steps (b) through (d) when a new data server is added.

(Deleted)

According to yet another aspect of the present invention, there is provided a method by which a metadata server recovers a failed data server in a distributed file system supporting multiple replication, comprising: (f) detecting the failed data server; (g) reconstructing a crash area to rebuild the block information as it was before the failure; (h) searching for and selecting other data servers that hold the same blocks as the blocks to be recovered, and requesting each of those blocks; (i) receiving each of those blocks from the other data servers; and (j) recovering the failed data server with the received blocks and announcing completion of the recovery.
In step (f), the failure is detected by detecting phenomena including a network disconnection, abnormal termination of the data server processor, or a power fault. After step (j), the method comprises (k) determining whether the recovery succeeded; (l) deleting the crash area if the recovery succeeded; and (m) returning to step (g) if the recovery failed.
In step (k), whether the recovery succeeded is determined by comparing the processing results collected from the failed data server with the reconstructed crash area. Step (g) comprises (g-1) allocating the crash area; (g-2) copying into the crash area the base file, which stores the block index information of the failed data server's initial state; (g-3) adding to the crash area the contents of the add file, which stores the block index information added since the initial state of the failed data server; and (g-4) adding to the crash area the contents of the delete file, which stores the block index information deleted since the initial state of the failed data server.
Step (h) comprises (h-1) searching the metadata for the number of identical blocks available for recovery and the location of each; (h-2) selecting, from among the data servers that hold each identical block, the other data server that will perform the re-replication; (h-3) building from the search results a list of {identical block, other data server holding that block, failed data server}; (h-4) sorting the list by the other data server that holds each block; and (h-5) sending each of the other data servers a command requesting the identical blocks.
In step (h-2), the other data server is preferably selected in consideration of system conditions including the network topology and system load.

(Deleted)

With the method for recovering a data server in a distributed file system supporting multiple replication, and the metadata storage and storing method suited to it, according to the present invention, file-attribute metadata and block-index metadata are stored separately. When a data server fails, they are used to reconstruct the index information of the blocks that held files and to select other data servers holding the same blocks, so lost data can be recovered easily.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. The embodiments are provided so that those of ordinary skill in the art can fully understand the invention; they may be modified in various forms, and the scope of the invention is not limited to the embodiments described below.

FIGS. 2 and 3 are block diagrams showing the form of the metadata storage in a distributed file system supporting multiple replication according to an embodiment of the present invention.
As shown in FIG. 2, the metadata server 200 manages the metadata repository 201 by dividing it into a metadata area 210, which manages file metadata, and a block area 220, which manages the block indexes.

The metadata area 210 is the area for managing the file namespace tree and represents the hierarchy of directories and files (block 220).
The metadata area 210 also stores and manages the metadata of each file, that is, information on file attributes such as name, size, permissions, and block location information.

The block area 220 manages the block information of all data servers (data server #1 through data server #N). So that recovery information can be collected quickly when a particular data server (for example, data server #1) fails, it stores the entries in a tree, separated by data server (block 221).

Because the metadata server 200 stores the entries separately for each data server in this way, when a particular data server fails it can obtain the required block information without searching the entire block index.
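The benefit of grouping entries by data server can be seen in a small sketch that contrasts a full metadata scan (the third prior-art method) with a direct lookup in a per-server index. The data values, server names, and function names below are made up for illustration, and the index is built here from the file metadata only to keep the example short; the description above maintains it incrementally.

```python
# Hypothetical file metadata: path -> list of (block id, servers holding a replica)
file_metadata = {
    "/a.txt": [(1, ["ds1", "ds3"])],
    "/b.txt": [(2, ["ds1", "ds2"])],
    "/c.txt": [(3, ["ds2", "ds3"])],
}


def blocks_by_full_scan(failed):
    """Walk every file's metadata to find the blocks held by the failed server."""
    return sorted(
        (block_id, path)
        for path, blocks in file_metadata.items()
        for block_id, servers in blocks
        if failed in servers
    )


def build_block_area(metadata):
    """Group [block id, file path] entries by data server."""
    area = {}
    for path, blocks in metadata.items():
        for block_id, servers in blocks:
            for server in servers:
                area.setdefault(server, []).append((block_id, path))
    return area


block_area = build_block_area(file_metadata)


def blocks_by_index(failed):
    """Read only the failed server's own entries; no other server is touched."""
    return sorted(block_area.get(failed, []))


print(blocks_by_index("ds1"))
```

Both functions return the same entries, but the indexed lookup reads only the entries of the failed server, while the scan cost grows with the total number of files in the system.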

A block index 222 exists for every data server entry and is used to track the information of the blocks stored on that individual data server.
So that changes such as block additions and deletions can be tracked in real time, the block index 222 is preferably managed as three separate files, as shown in FIG. 3: a base file 223, an add file 224, and a delete file 225 (block 222).

The base file 223 is rebuilt when the metadata server 200 system starts up or when a failed data server (for example, data server #1) is recovered from its failure; otherwise it is not changed.

As shown in FIG. 3, the base file 223 stores block information in no particular order and manages block information entries in the form [block identifier, path of the file to which the block belongs].

When the base file 223 is rebuilt, the add file 224 is reset to size 0; thereafter, information is appended to it whenever a new block is added.

Specifically, as shown in FIG. 3, each time a new block is added, a new block information entry is inserted at the end of the add file 224. The entry has the form [block identifier, path of the file to which the block belongs].

When the base file 223 is rebuilt, the delete file 225 is likewise reset to size 0; thereafter, information is appended to it whenever a block is deleted.

Specifically, as shown in FIG. 3, each time a block is deleted, a new block information entry is inserted at the end of the delete file 225. The entry has the form [block identifier, path of the file to which the block belongs].

The add file 224 and the delete file 225 allow new data to be appended but do not allow existing entries to be changed or deleted.
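The three-file layout and its append-only rule can be modelled in memory as follows. This is a sketch under assumptions: the class names `AppendOnlyLog` and `BlockIndex`, the method names, and the sample entries are hypothetical, and real files on disk are replaced by Python lists.

```python
class AppendOnlyLog:
    """Entries can be appended, but never changed or removed one by one."""

    def __init__(self):
        self._entries = []

    def append(self, block_id, path):
        self._entries.append((block_id, path))

    def entries(self):
        return tuple(self._entries)   # read-only copy of the log

    def __len__(self):
        return len(self._entries)


class BlockIndex:
    """Block index of one data server: base file plus add and delete logs."""

    def __init__(self, base_entries):
        self.base = tuple(base_entries)   # stays fixed until the next rebuild
        self.added = AppendOnlyLog()      # size 0 right after a rebuild
        self.deleted = AppendOnlyLog()    # size 0 right after a rebuild

    def block_added(self, block_id, path):
        self.added.append(block_id, path)      # entry goes to the end of the add log

    def block_deleted(self, block_id, path):
        self.deleted.append(block_id, path)    # the base file is left untouched


index = BlockIndex([(10, "/x.bin")])
index.block_added(11, "/y.bin")
index.block_deleted(10, "/x.bin")
print(index.base, index.added.entries(), index.deleted.entries())
```

Because every change is a write at the end of a log, recording an addition or deletion never requires searching or rewriting the base file.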

FIG. 4 is a flowchart showing how the metadata server 200 builds the block index 222 according to an embodiment of the present invention.
In FIG. 4, it is assumed that the network on which the metadata server 200 has started contains only data server #1, which holds block #1, block #2, and block #3, containing the files "a.txt", "b.txt", and "c.txt" respectively.

The storage space of the metadata server 200 is divided into the metadata area 210, which manages file metadata, and the block area 220, which manages the block indexes.
Specifically, after starting up, the metadata server 200 receives metadata from data server #1 on the network, stores the file-attribute information in the metadata area, and stores the base file 223, the add file 224, and the delete file 225 in the block area, as in blocks 410, 420, and 430.
Block 410 shows the base file 223, which records that the files "a.txt", "b.txt", and "c.txt" are stored in block #1, block #2, and block #3 of data server #1, respectively.

Block 420 shows the add file 224 after a new file "d.txt", which contains block #4, has been added (S420): the information for the new block (block #4, d.txt) has been appended at its end.

Block 430 shows the delete file 225 after the file "b.txt", which contains block #2, has been deleted (S430): the information for the deleted block (block #2, b.txt) has been appended at its end.

The base file 223 is not changed except when the metadata server 200 system restarts or when a failed data server (for example, data server #1) is recovered.

That is, even if information on a block to be deleted is contained in the additional file 224 or the deletion file 225, the basic file 223 is not changed.

As described above, the contents of the additional file 224 and the deletion file 225 are merged into the basic file 223 only when the metadata server 200 is restarted, or when a data server (e.g., data server #1) fails and its blocks are re-replicated from other data servers.
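The three-file layout described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `BlockIndex`, its method names, and the in-memory dictionaries and lists standing in for the basic, additional, and deletion files are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Entry = Tuple[int, str]  # (block identifier, path of the file the block belongs to)


@dataclass
class BlockIndex:
    """Hypothetical per-data-server block index: a snapshot plus two append-only logs."""
    basic: Dict[int, str] = field(default_factory=dict)    # stands in for basic file 223
    additional: List[Entry] = field(default_factory=list)  # stands in for additional file 224
    deletion: List[Entry] = field(default_factory=list)    # stands in for deletion file 225

    def add_block(self, block_id: int, path: str) -> None:
        # A new block is only appended to the additional file; the basic file is untouched.
        self.additional.append((block_id, path))

    def delete_block(self, block_id: int, path: str) -> None:
        # A deleted block is only appended to the deletion file; the basic file is untouched.
        self.deletion.append((block_id, path))

    def merge(self) -> None:
        # Assumed to run only on metadata server restart or after a data server recovery.
        for block_id, path in self.additional:
            self.basic[block_id] = path
        for block_id, _path in self.deletion:
            self.basic.pop(block_id, None)
        self.additional.clear()  # both logs return to size 0
        self.deletion.clear()


index = BlockIndex(basic={1: "a.txt", 2: "b.txt", 3: "c.txt"})  # block 410
index.add_block(4, "d.txt")     # S420
index.delete_block(2, "b.txt")  # S430
snapshot_before_merge = dict(index.basic)
index.merge()
```

Keeping the two logs append-only means that normal operation never rewrites the basic file; the cost of folding the changes in is paid only at the rare merge points.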

FIG. 5 is a flowchart illustrating a process of reconstructing the block information of a failed data server using the block index 222 according to an embodiment of the present invention.

When a failure occurs in data server #1 after it has gone through the processes of blocks 410 to 430, the metadata server 200 checks, in the block area 220, the block index 222 of data server #1 as it was immediately before the failure, as shown in block 510.

Subsequently, the metadata server 200 copies the contents of the basic file 223 as-is into the crash area of the metadata server 200, as shown in block 520 (S510).

At this point, the crash area holds the same entries as the basic file 223: "a.txt", "b.txt", and "c.txt" for block #1, block #2, and block #3, respectively.

Here, the crash area resides in a storage area of the metadata server 200, which may be memory or another storage medium.

Then, the metadata server 200 adds the index entries of the additional file 224 to the crash area, as shown in block 530 (S520).

At this point, the information on "d.txt" of block #4 has been added, so the crash area holds "a.txt", "b.txt", "c.txt", and "d.txt" for block #1, block #2, block #3, and block #4, respectively.

Next, the metadata server 200 removes the index entries of the deletion file 225 from the crash area, as shown in block 540 (S530).

At this point, the information on block #2 has been removed, so the crash area holds "a.txt", "c.txt", and "d.txt" for block #1, block #3, and block #4, respectively.

Thereafter, data server #1 can restore the block information affected by the failure using the completed crash area (S540), and can then restore the lost data using the restored block information.
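Steps S510 to S530 can be illustrated with a short sketch. The function name `build_crash_area` and the use of a dictionary for the crash area are assumptions for the example; the inputs are the entries of FIG. 4 (blocks #1 to #3 in the basic file, block #4 added, block #2 deleted).

```python
from typing import Dict, List, Tuple

Entry = Tuple[int, str]  # (block identifier, file path)


def build_crash_area(basic: Dict[int, str],
                     additional: List[Entry],
                     deletion: List[Entry]) -> Dict[int, str]:
    """Rebuild the latest block information of a failed data server (hypothetical helper)."""
    crash = dict(basic)                 # S510: copy the basic file as-is
    for block_id, path in additional:   # S520: add the entries of the additional file
        crash[block_id] = path
    for block_id, _path in deletion:    # S530: remove the entries of the deletion file
        crash.pop(block_id, None)
    return crash


basic_file = {1: "a.txt", 2: "b.txt", 3: "c.txt"}
crash_area = build_crash_area(basic_file, [(4, "d.txt")], [(2, "b.txt")])
print(crash_area)  # {1: 'a.txt', 3: 'c.txt', 4: 'd.txt'}
```

The basic file itself is left unchanged by this function, which matches the statement that the basic file is modified only on restart or recovery.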

FIG. 6 is a flowchart illustrating a method by which a failed data server (data server #1) replicates blocks in a distributed file system supporting multiple replication according to an embodiment of the present invention. FIG. 6 shows the process in which data server #1 restores its lost data by copying the same blocks from other data servers (data server #3, data server #7, data server #N), using the block information of the crash area constructed through the process of FIG. 5.


First, the metadata server 200 reconstructs the block information of the crash area in order to obtain the locations of the data servers (data server #3, data server #7, data server #N) that hold the blocks being sought (S610).

Here, step S610 is a process of selecting the normal blocks to be used for recovery and the data servers for re-replication.

The normal blocks to be used for recovery can be obtained from the file metadata. That is, because the metadata stores not only the namespace information of a file but also the number of identical blocks and the storage locations of the blocks, information on the normal blocks can be obtained by accessing the file metadata.

Here, the data servers that re-replicate the blocks (data server #3, data server #7, data server #N) are preferably selected in consideration of the network topology, the system load, and the like.

As shown in block 630, the reconstructed crash area takes the form {original block, data server storing the original block, data server for re-replication}.
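One way to build such entries is sketched below. The replica map, the load figures, the server names, and the rule of choosing the least-loaded surviving replica are all hypothetical; the text only says that topology and load should be considered. The third field is taken here to be the server that receives the copy (data server #1 in FIG. 6), which is an interpretation rather than something the text fixes.

```python
from typing import Dict, List, NamedTuple


class RecoveryEntry(NamedTuple):
    block_id: int
    source_server: str  # server holding an intact replica of the block
    target_server: str  # server that receives the copy (assumed: the failed server)


def plan_recovery(lost_blocks: List[int],
                  replica_locations: Dict[int, List[str]],
                  load: Dict[str, float],
                  failed_server: str) -> List[RecoveryEntry]:
    plan = []
    for block_id in lost_blocks:
        # Only replicas on servers other than the failed one are usable.
        candidates = [s for s in replica_locations[block_id] if s != failed_server]
        if not candidates:
            raise LookupError(f"no intact replica of block {block_id}")
        # Assumed policy: pick the least-loaded server that still has the block.
        source = min(candidates, key=lambda s: load.get(s, 0.0))
        plan.append(RecoveryEntry(block_id, source, failed_server))
    return plan


# Hypothetical replica placement for blocks #1, #3, and #4.
replicas = {1: ["ds1", "ds3", "ds7"], 3: ["ds1", "ds7", "dsN"], 4: ["ds1", "ds3", "dsN"]}
load = {"ds3": 0.2, "ds7": 0.5, "dsN": 0.1}
plan = plan_recovery([1, 3, 4], replicas, load, failed_server="ds1")
```

A topology-aware policy could replace the `min` call without changing the shape of the resulting list.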

Subsequently, a re-replication command is sent to the data servers (data server #3, data server #7, data server #N) that hold the normal blocks to be used for recovery (S620).

Thereafter, each data server that received the re-replication command (data server #3, data server #7, data server #N) transmits the corresponding blocks (block #1, block #3, block #4) to the data server to be re-replicated (data server #1) (S630).

That is, the data servers that received the re-replication command (data server #3, data server #7, data server #N) read each block from disk, as identified by the block information delivered by the metadata server 200, and transmit it to the data server that performs the re-replication (data server #1).

Each data server (data server #3, data server #7, data server #N) may perform the block re-replication in stages or all at once, in consideration of system conditions such as the load on the data server.

FIG. 7 is a flowchart illustrating a method of recovering a data server in a distributed file system supporting multiple replication according to an embodiment of the present invention. The method is described below with reference to FIG. 7.

When a data server that has failed during use of the distributed file system is detected (S705), the metadata server 200 allocates, in its storage area, a crash area in which the block index information will be constructed (S710).

Here, the crash area may reside in the memory of the metadata server 200 or in a separate storage space, and is preferably removed when the recovery is complete.

That is, by allocating the crash area, the metadata server 200 can indicate that the data server is in a failed state.

Here, the failure of the data server may be caused by a network disconnection, an abnormal termination of the data server process, a power failure, or the like.

Then, the metadata server 200 copies the block index contents of the basic file 223 into the crash area (S715).

Next, the metadata server 200 checks whether the size of the additional file 224 is greater than 0 (S720), and if so, adds the block index contents of the additional file 224 to the crash area (S725).

If the size of the additional file 224 is 0, the metadata server 200 determines that the additional file 224 contains no block index entries, skips step S725, and proceeds to step S730.

Next, the metadata server 200 checks whether the size of the deletion file 225 is greater than 0 (S730), and if so, applies the block index contents of the deletion file 225 to the crash area (S735), so that, as in step S530 of FIG. 5, the entries of the deleted blocks are removed.

If the size of the deletion file 225 is 0, the metadata server 200 determines that the deletion file 225 contains no block index entries, skips step S735, and proceeds to step S740.

Through the above process, the crash area comes to hold the latest block information of the failed data server.

Then, the metadata server 200 searches for data servers that hold normal blocks usable for recovery and selects the data servers that will re-replicate the blocks (S740).
To do so, the metadata server 200 refers to the metadata to build, for each block to be re-replicated, a list of {original block, data server storing the original block, data server for re-replication}, and selects from it the data servers to be used for recovery.
The original block is selected from among the replica blocks that remain intact on the data servers.
For example, if one of three replica blocks in total has been lost, one of the remaining two blocks is selected as the block to be used for recovery.


It is efficient to select the data server that re-replicates a block in consideration of network locality (i.e., the degree of physical proximity).

Thereafter, the failed data server sorts the crash area entries of the metadata server 200 by data server (S745).
That is, the commands to be transmitted are preferably combined into a single network command per data server and sent to each data server in one batch.
In this way, the failed data server builds the entries per data server and sends the data for each data server in one batch, so that data needs to be transmitted only as many times as there are data servers, regardless of the number of blocks.


For example, if there are 10 data servers responsible for recovery, the metadata server 200 needs to send only 10 network commands, no matter how many blocks are to be recovered.
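The effect of sorting entries by data server can be sketched as a simple grouping. The function name `batch_commands`, the tuple layout, and the synthetic server names are assumptions for the example; it spreads 1,000 hypothetical blocks over 10 source servers and produces one command per server.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Entry = Tuple[int, str, str]  # (block id, source server, target server)


def batch_commands(entries: List[Entry]) -> Dict[str, List[Tuple[int, str]]]:
    """Group recovery entries so that each source server receives a single command."""
    commands: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for block_id, source, target in entries:
        commands[source].append((block_id, target))
    return dict(commands)


# 1,000 blocks to recover, held by 10 hypothetical source servers (ds2 to ds11).
entries = [(b, f"ds{b % 10 + 2}", "ds1") for b in range(1000)]
commands = batch_commands(entries)
print(len(commands))  # 10
```

The number of network commands therefore depends on the number of servers involved, not on the number of blocks.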

Next, the failed data server transmits a block re-replication command to each of the sorted data servers (S750).

If there are multiple original blocks to replicate, the data server may send the commands sequentially to each data server that performs the re-replication, or the original blocks may be transmitted in parallel. The command transmission method can therefore be chosen so as to increase recovery efficiency in consideration of the system load at the time of the recovery.
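The choice between sequential and parallel transmission might look like the following sketch. The `dispatch` function, the `send` callback, and the `fake_send` stand-in are hypothetical; a real system would replace the callback with its own network call and would decide the `parallel` flag from measured load.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def dispatch(commands: Dict[str, List[int]],
             send: Callable[[str, List[int]], bool],
             parallel: bool) -> Dict[str, bool]:
    """Send one re-replication command per server, one at a time or concurrently."""
    if not parallel:
        # Sequential: lower peak load, longer total recovery time.
        return {server: send(server, blocks) for server, blocks in commands.items()}
    # Parallel: one worker per server, results collected once all have finished.
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        futures = {server: pool.submit(send, server, blocks)
                   for server, blocks in commands.items()}
        return {server: future.result() for server, future in futures.items()}


def fake_send(server: str, blocks: List[int]) -> bool:
    # Stand-in for the network call; reports success when there is something to copy.
    return len(blocks) > 0


commands = {"ds3": [1], "ds7": [3], "dsN": [4]}
sequential_result = dispatch(commands, fake_send, parallel=False)
parallel_result = dispatch(commands, fake_send, parallel=True)
```

Both modes return the same per-server results; only the timing and the load placed on the system differ.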

Thereafter, the data server receives the normal blocks to be used for recovery from the data servers that received the re-replication command (S755).

Then, the data server to be recovered performs the recovery using the received normal blocks (S760).

Subsequently, when the recovery is complete, the data server to be recovered notifies the metadata server 200 that the block recovery is complete (S765).

Then, the metadata server 200 determines whether the block recovery of the data server is complete (S770).

Whether the block recovery is complete can be determined by comparing the processing results collected from each data server with the block information of the crash area.
If the determination shows that the replication is complete (S775), the metadata server 200 removes the crash area and completes the recovery of the data server (S780).
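The completion check can be sketched as a set comparison. The function name `recovery_complete` and the report format (server name mapped to the block identifiers it reports as copied) are assumptions; the text only states that the collected results are compared with the crash area.

```python
from typing import Dict, Iterable, Set


def recovery_complete(crash_area: Dict[int, str],
                      reports: Dict[str, Iterable[int]]) -> bool:
    """Return True when every block listed in the crash area has been reported as copied."""
    recovered: Set[int] = set()
    for block_ids in reports.values():
        recovered.update(block_ids)
    return set(crash_area) <= recovered


crash_area = {1: "a.txt", 3: "c.txt", 4: "d.txt"}
partial = recovery_complete(crash_area, {"ds3": [1], "ds7": [3]})           # block #4 missing
full = recovery_complete(crash_area, {"ds3": [1], "ds7": [3], "dsN": [4]})  # all blocks copied
if full:
    crash_area = None  # S780: remove the crash area once recovery is confirmed
```

When the check fails, the crash area is kept so that the remaining blocks can still be identified.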


Meanwhile, in terms of the stability of the distributed file system, it is preferable that the recovered data server not be used.

The configuration of the present invention has been described above through preferred embodiments and the accompanying drawings. However, these are merely examples and are not intended to limit the scope of the present invention. Those of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible. The true scope of protection of the present invention should be determined by the technical spirit of the appended claims.

FIG. 1 is a block diagram illustrating a multiple-replication-based distributed file system according to the prior art.

FIGS. 2 and 3 are block diagrams illustrating the form of the metadata storage in a distributed file system supporting multiple replication according to the present invention.

FIG. 4 is a flowchart illustrating a method by which the metadata server constructs the block index according to the present invention.

FIG. 5 is a flowchart illustrating a process of reconstructing the block information of a failed data server using the block index according to the present invention.

FIG. 6 is a flowchart illustrating a method by which a failed data server replicates blocks in a distributed file system supporting multiple replication according to the present invention.

FIG. 7 is a flowchart illustrating a method of recovering a data server in a distributed file system supporting multiple replication according to the present invention.

<Description of main parts of the drawings>

200: metadata server
201: metadata store
210: metadata area
220, 221: block area
222: block index
223: basic file
224: additional file
225: deletion file

Claims

A storage of a metadata server in a distributed file system supporting multiple replication,

wherein the metadata server manages, in block units, metadata about files stored in at least one data server,

the storage of the metadata server comprising:

a metadata area for storing metadata related to attributes of a file, including location information; and

a block area for storing a block index, classified per data server, of the blocks in which the file is stored,

wherein, when a failure occurs in the data server, the storage supports restoration of the loss of the corresponding data server using the block index, and

wherein the block area manages the contents of the block index by dividing them into:

a basic file that is reconstructed each time the system of the metadata server is started and that stores the block index information of an initial state;

an additional file that has a size of 0 in the initial state and to which a new block information entry is appended each time a new block is added; and

a deletion file that has a size of 0 in the initial state and to which the block information entry of a deleted block is appended each time a block is deleted.

The storage of claim 1, wherein the attributes of the file

relate to information including a file name, a file size, file access permissions, and the location where the file is stored.

(Deleted)

The storage of claim 1, wherein the additional file and the deletion file

allow content to be added to them but do not allow content to be deleted from them.

The storage of claim 1, wherein each of the new block information entry and the deleted block information entry

comprises a block identifier and the path of the file to which the block belongs.

A method of storing metadata in a distributed file system supporting multiple replication,

in which a metadata server stores metadata about data stored in one or more data servers, the method comprising:

(a) dividing a metadata storage area into a metadata area and a block area;

(b) receiving the metadata from a data server and storing attribute-related information of a file in the metadata area;

(c) generating, in the block area, a basic file that stores information on the blocks of an initial state; and

(d) if information on a block added to or deleted from the initial state exists, adding that information to an additional file or a deletion file.

The method of claim 6, wherein step (d) comprises:

(d-1) when a new block is added, inserting a new block information entry at the end of the additional file; and

(d-2) when a block to be deleted exists, inserting the block information entry to be deleted at the end of the deletion file.

The method of claim 6, further comprising, after step (d),

repeating steps (b) to (d) when a new data server is added.

A method by which a metadata server recovers a data server in a distributed file system supporting multiple replication,

wherein the metadata server stores metadata about data stored in one or more data servers,

wherein the metadata server divides a metadata storage area into a metadata area and a block area, receives the metadata from the data servers, stores attribute-related information of a file in the metadata area, generates in the block area a basic file that stores information on the blocks of an initial state, and, if information on a block added to or deleted from the initial state exists, adds that information to an additional file or a deletion file, the method comprising:

(f) detecting one data server in which a failure has occurred;

(g) constructing a crash area to reconstruct the block information as it was before the failure occurred;

(h) searching for and selecting other data servers that store the same blocks as the blocks to be restored, and requesting the same blocks;

(i) receiving the same blocks from the other data servers; and

(j) recovering the one data server with the received same blocks, and giving notification that the recovery is complete.

The method of claim 9, wherein, in step (f),

the failure is detected as a network disconnection, an abnormal termination of the data server process, or a power failure.

The method of claim 9, further comprising, after step (j):

(k) determining whether the recovery is successful;

(l) if the recovery is successful, deleting the crash area; and

(m) if the recovery fails, returning to step (g) and repeating steps (g) to (j) until the recovery is successful.

The method of claim 11, wherein, in step (k),

whether the recovery is successful is determined by comparing the processing results collected from the data servers with the reconstructed crash area.

The method of claim 9, wherein step (g) comprises:

(g-1) allocating the crash area;

(g-2) copying into the crash area the basic file, in which the block index information for the initial state of the one data server is stored;

(g-3) adding to the crash area the contents of the additional file, in which the block index information added since the initial state of the one data server is stored; and

(g-4) adding to the crash area the contents of the deletion file, in which the block index information deleted since the initial state of the one data server is stored.

The method of claim 9, wherein step (h) comprises:

(h-1) retrieving, from the metadata, the number of the same blocks and at least one data server in which the same blocks are located;

(h-2) selecting, from among the retrieved one or more data servers, each other data server that is to perform re-replication;

(h-3) constructing, from the search result, a list of {same block, other data server in which the same block is stored, the one data server};

(h-4) sorting the list by other data server in which the same block is stored; and

(h-5) transmitting a command requesting the same blocks to the other data servers.

The method of claim 14, wherein, in step (h-2), the other data server

is selected in consideration of system conditions including network topology and system load.