KR100472207B1

KR100472207B1 - RAID control system for Sharing Logical Disk Units by Multiple RAID Controllers

Info

Publication number: KR100472207B1
Application number: KR10-2002-0082911A
Authority: KR
Inventors: 이상민; 박종원; 석성우; 김명준
Original assignee: 한국전자통신연구원
Priority date: 2002-12-23
Filing date: 2002-12-23
Publication date: 2005-03-10
Also published as: KR20040056308A

Abstract

The present invention relates to a distributed shared RAID control system using multiple RAID controllers. In a distributed shared RAID system that spreads the I/O processing for each logical disk unit across multiple controllers, the invention resolves I/O bottlenecks, improves performance through multiplexing, and prevents the data corruption or loss that data sharing could otherwise cause.

In a RAID control system in which multiple RAID controllers distribute and share data for a plurality of hosts and logical disk units, all data of a shared logical disk unit is divided into fixed-size units, each RAID controller takes charge of a portion of that data, and each controller operates as the lock server for its own portion.

Description

RAID control system for distributed data sharing through multiple RAID controllers {RAID control system for Sharing Logical Disk Units by Multiple RAID Controllers}

The present invention relates to a distributed shared RAID control system, and more particularly to a system that uses multiple RAID controllers to resolve I/O bottlenecks and to prevent the data corruption or loss that data sharing can cause in a distributed shared RAID system where multiple RAID controllers share logical disk units.

A RAID (Redundant Array of Inexpensive Disks) system is a disk array system that configures and manages multiple physical disks so that they operate as a single large-capacity disk.

In general, such a RAID system either attaches storage through a controller plugged into the host as an HBA (host bus adapter), or is built around several independent RAID controllers.

FIG. 1 is a block diagram of a conventional RAID system with several independent RAID controllers 120. The controllers that manage the hosts 110 and the storage devices 130 are split into several units, and instead of being attached to a single host 110, each RAID controller 120 connects to multiple storage devices through a SAN (Storage Area Network).

The RAID controllers 120 form mutually independent RAID systems, and each controller is responsible for managing the logical disk units assigned to it.

Consequently, when the hosts 110 concentrate their requests on a particular disk unit 130 in such a conventional RAID system, the other controllers receive relatively few I/O requests while the requests pile up on the one controller that manages that disk unit. That controller becomes a performance bottleneck, which can lead to controller failure and degraded performance in processing stored data.

Furthermore, if the I/O processing for each logical disk unit is distributed across multiple controllers to relieve this bottleneck, an appropriate shared-data management scheme is required to prevent the data corruption and loss that sharing can cause.

The present invention addresses the problems and needs described above. Its object is to provide a distributed shared RAID control system using multiple RAID controllers that, in a distributed shared RAID system spreading the I/O processing for each logical disk unit across multiple controllers, resolves I/O bottlenecks, improves performance through multiplexing, and prevents the data corruption or loss that data sharing can cause.

To achieve this object, the distributed shared RAID control system according to the present invention is a RAID control system in which multiple RAID controllers distribute and share data for a plurality of hosts and logical disk units, characterized in that all data of a shared logical disk unit is divided into fixed-size units, each RAID controller takes charge of a portion of that data, and each controller operates as the lock server for its own portion.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 2 is a block diagram of a distributed shared RAID system according to the present invention.

As shown in FIG. 2, the distributed shared RAID system of the present invention has a structure in which multiple RAID controllers 220 share each logical disk unit 230.

A high-speed Fibre Channel switch 240 provides the interface between the hosts 210 and the RAID controllers 220, and between the logical disk units 230 and the RAID controllers 220. An SCI (Scalable Coherent Interface) switch 250 is used to speed up block data transfer among the RAID controllers 220.

In such a distributed shared RAID system, even if the number of logical disk units 230 that one controller must manage grows, the I/O channels are multiplexed through several controllers, so the work of each RAID controller 220 can be distributed across the other controllers.

However, because data can be accessed from any controller, the risk of corrupting it is correspondingly higher, so maintaining data integrity (data coherence) is essential.

To address this, the present invention uses each RAID controller 220 as a lock server, so that each RAID controller 220 manages the access rights to the data it is responsible for.

That is, each RAID controller 220 is a lock server that maintains the lock information for its own managed data, and when an I/O request for that data arrives, it is responsible for granting access to it.

Lock servers can be operated in two ways: centralized lock management, in which a single lock server is responsible for all data, and distributed lock management, in which several lock servers are each responsible for the data assigned to them.

In the conventional centralized scheme, all requests for data access permission converge on one controller, so the lock server becomes a bottleneck.

The distributed lock server operation of the present invention is proposed to solve this concentration problem. By spreading data management responsibility over several lock servers, it effectively reduces the bottleneck of the centralized scheme.

The multiple RAID controllers 220 that share one logical disk unit 230 divide all of its data into fixed-size units, and each controller manages its share. Copies of the same data may exist on several RAID controllers 220 at the same time, and this fact is recorded in the lock information maintained by the lock server controller.

FIG. 3 shows the shared data management structure of each RAID controller 220 according to the present invention.

As shown in FIG. 3, each RAID controller 220 divides its cache into a read cache 221, a write cache 222, and a backup write cache 224 for the sake of I/O performance.

For recovery from controller failure, the backup write cache 224 for each write cache 222 is assigned to the RAID controllers 220 in round-robin fashion.

That is, controller 2 holds the backup write cache of controller 1, and controller 3 holds the backup write cache of controller 2. The backup write cache of the last controller is held by controller 1.
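The round-robin placement above can be written as a one-line mapping. The following is a minimal sketch; the function name `backup_holder` is hypothetical, and controllers are numbered from 1 as in the description.

```python
def backup_holder(controller: int, num_controllers: int) -> int:
    """Return the controller that hosts the backup write cache of `controller`.

    Controllers are numbered 1..num_controllers, as in the description.
    """
    if not 1 <= controller <= num_controllers:
        raise ValueError("controller number out of range")
    # Controller k is backed up on controller k+1; the last one wraps to 1.
    return controller % num_controllers + 1


print([backup_holder(c, 4) for c in range(1, 5)])  # [2, 3, 4, 1]
```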

Each RAID controller 220 also maintains lock information 223 for the data it manages.

The lock information 223 records whether the data is cached and which controllers cache it. When the RAID controller 220 acting as lock server receives a read or write request for its managed data from a host 210 or from another RAID controller 220, it first looks up the lock information 223 to determine whether the data is cached and by which controller, and then acts accordingly.

FIG. 4 shows how the multiple RAID controllers 220 divide the management of data for one logical disk unit 230 according to the present invention.

As shown in FIG. 4, one logical disk unit 230 is divided into a number of fixed-size segments, and the segment size can be defined by the user.

Each segment is identified by logical block address, and its data may be stored across several physical disk devices.

The RAID controllers 220 divide the segments of one logical disk unit 230 among themselves, and each controller maintains the lock information 223 for the data stored in the segments it manages.
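As an illustration, the lock server for a block can be derived from its logical block address. The description does not state how segments are assigned to controllers, so the round-robin assignment below is an assumption, as are the function names and the 0-based controller numbering.

```python
def segment_of(lba: int, segment_blocks: int) -> int:
    """Segment index of a logical block address, for a user-defined segment size."""
    return lba // segment_blocks


def lock_server_of(lba: int, segment_blocks: int, num_controllers: int) -> int:
    """Controller (0-based) acting as lock server for the block at `lba`.

    Assumption: segments are dealt out to controllers round-robin.
    """
    return segment_of(lba, segment_blocks) % num_controllers


# With 128-block segments and 3 controllers, block 500 lies in segment 3,
# which wraps around to controller 0.
print(lock_server_of(500, 128, 3))  # 0
```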

When data managed by a lock server is to be updated, the data must always be sent to that lock server so that it is stored in the lock server's write cache 222. Before that, deletion of the previous copies held by other controllers must be requested.

FIG. 5 shows the lock management hash table and the lock information structure maintained by each RAID controller 220.

As shown in FIG. 5, the lock information 223 is kept in one hash table per logical disk unit 230. The hash table consists of several hash entries, and each hash entry heads a doubly linked list of lock information records. The hash entry for a lock information record is selected by using, as the index, the remainder of dividing the physical disk block number stored in the record by the total number of hash entries.
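A minimal sketch of such a table follows. The class and method names are hypothetical, and `collections.deque` stands in for the doubly linked list attached to each hash entry.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class LockInfo:
    disk_no: int   # physical disk device number
    block_no: int  # physical disk block number


class LockTable:
    """One table per logical disk unit; each hash entry chains lock records."""

    def __init__(self, num_entries: int) -> None:
        self.entries = [deque() for _ in range(num_entries)]

    def index(self, block_no: int) -> int:
        # Remainder of the block number divided by the number of hash entries.
        return block_no % len(self.entries)

    def insert(self, info: LockInfo) -> None:
        self.entries[self.index(info.block_no)].append(info)

    def find(self, disk_no: int, block_no: int) -> Optional[LockInfo]:
        for info in self.entries[self.index(block_no)]:
            if info.disk_no == disk_no and info.block_no == block_no:
                return info
        return None
```

Blocks 5 and 13 of an 8-entry table both hash to entry 5, so `find` has to compare the disk device number and block number of each chained record.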

The number of lock information records 223 used for one logical disk unit 230 is determined by the disk cache size defined when the logical disk unit 230 is created and by the number of controllers in use.

Each lock information record 223 consists of the physical disk device number and physical disk block number where the data is stored, the number of processes waiting to reference the record, a bitmap indicating the controllers that hold a copy of the data, a flag indicating that a write to the data is in progress, and a process synchronization semaphore.

The controller bitmap identifies the controllers that have read the data from the lock server and hold it in their read caches 221. It contains one bit (share bit) per controller, and each controller is assigned a unique bit number. A bit value of 1 for a controller means that the data is present in the read cache 221 of that controller.

The bitmap is used when a write request for the data arrives: the lock server asks each controller that holds the previous data to delete it from its read cache 221.
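The share bits can be manipulated with ordinary integer bit operations. The helper names below are hypothetical, and controllers are numbered from 0 so that the controller number doubles as the bit number.

```python
def set_share(bitmap: int, controller: int) -> int:
    """Mark `controller` as holding a read-cache copy."""
    return bitmap | (1 << controller)


def clear_share(bitmap: int, controller: int) -> int:
    """Mark `controller` as no longer holding a copy."""
    return bitmap & ~(1 << controller)


def holders(bitmap: int, num_controllers: int) -> list[int]:
    """Controllers whose share bit is 1, i.e. the invalidation targets on a write."""
    return [c for c in range(num_controllers) if (bitmap >> c) & 1]


bitmap = set_share(set_share(0, 0), 2)
print(holders(bitmap, 4))  # [0, 2]
```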

The write flag blocks access to the data while a write request for it is in progress.

That is, when a write request arrives, the flag is first set to 1. All read and write requests that arrive afterwards wait on the lock information until the write in progress completes. Once it completes, the waiting requests are processed starting with the one that arrived first.

If lock information 223 exists but both the bitmap and the write flag are 0, the data is in the write cache 222.
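The record fields and the way their values are interpreted can be summarized in a short sketch. The field names and the `data_location` helper are hypothetical, and the process synchronization semaphore is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LockInfo:
    disk_no: int              # physical disk device number
    block_no: int             # physical disk block number
    waiters: int = 0          # processes waiting on this record
    share_bitmap: int = 0     # bit c set: controller c holds a read-cache copy
    write_flag: bool = False  # True while a write is in progress


def data_location(lock: Optional[LockInfo]) -> str:
    """Interpret a lock record the way the description does."""
    if lock is None:
        return "not cached"
    if lock.write_flag:
        return "write in progress"
    if lock.share_bitmap == 0:
        return "write cache of the lock server"
    return "read cache of a sharing controller"


print(data_location(LockInfo(disk_no=0, block_no=42)))
# write cache of the lock server
```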

FIG. 6 is a flowchart of how a read request delivered to the lock server controller that manages the data is processed.

When a read request arrives from a host 210, the RAID controller 220 acting as lock server first searches its lock information hash table to check whether lock information 223 exists for the data (S601).

If no such lock information 223 exists, the controller allocates a new lock information record 223, stores the data's information in it, sets its own share bit in the controller bitmap to 1, and links the record to the hash entry (S602, S603).

It then allocates a new cache buffer in the read cache 221 (S604), reads the data from disk, and stores it in the read cache 221 (S605).

If lock information 223 exists, the lock server RAID controller 220 checks the write flag to see whether a write is in progress (S606).

If a write is in progress, the request waits until the write completes (S607).

If no write is in progress, the controller examines the share bits of the controller bitmap to find the controllers that hold the data (S608).

If all share bits are 0, the data is in the controller's own write cache 222, so it searches its write cache 222 (S609).

If the controller's own share bit is 1, the data is in its own read cache 221, so it searches its read cache 221 (S610).

If the share bit of another controller 220 is 1, the lock server allocates a new cache buffer in its own read cache 221 (S611). It then sends that RAID controller 220, through the SCI switch, a data read request message containing the logical disk unit number, physical disk device number, and physical disk block number of the data to be read, together with the address of the cache buffer where the data is to be stored, and waits for the response (S612).

When the response arrives from the other controller 220, the lock server checks the response message for an error. No error means that the data was successfully copied from the other controller into the lock server's own read cache 221. In that case the lock server sets its own share bit to 1 to record that the data is now also in its read cache 221 (S613).

Once the requested data has been obtained through these steps, the RAID controller 220 acting as lock server sends the data to the host 210 (S614).
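The branching of FIG. 6 reduces to a small decision function. This is a sketch only: the function name and return labels are hypothetical, a lock record is represented as a `(write_flag, share_bitmap)` tuple, and the actual cache, disk, and SCI transfers are not modeled.

```python
from typing import Optional, Tuple


def read_source(lock: Optional[Tuple[bool, int]], self_id: int) -> str:
    """Decide where the lock server obtains data for a host read.

    `lock` is None when no lock information exists, otherwise a
    (write_flag, share_bitmap) pair.
    """
    if lock is None:
        return "disk"                # S602-S605: create record, read from disk
    write_flag, share_bitmap = lock
    if write_flag:
        return "wait"                # S607: a write is in progress
    if share_bitmap == 0:
        return "own write cache"     # S609
    if share_bitmap & (1 << self_id):
        return "own read cache"      # S610
    return "remote read cache"       # S611-S613: fetch from another controller


print(read_source((False, 0b100), self_id=0))  # remote read cache
```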

FIG. 7 is a flowchart of how another RAID controller 220 processes a read request for data that belongs to a lock server RAID controller 220.

First, the other RAID controller 220 determines which lock server RAID controller 220 manages the requested data, and allocates a new cache buffer in its own read cache 221 (S701).

It then sends the lock server RAID controller 220 for that data a lock information request message containing the logical disk unit number, physical disk device number, and physical disk block number of the data to be read, together with the address of the cache buffer where the data is to be stored, and waits for the response (S702).

When the response message arrives from the lock server RAID controller 220 (S703), the controller reads the share controller number for the data from the message (S704).

If the share controller is the requesting controller itself, the data is already in its own read cache 221, so it searches its read cache 221 (S705).

If the share controller is the lock server RAID controller 220, the controller reads the data held in the read cache 221 of the lock server RAID controller 220.

A share controller number of -1 means that no controller holds the data. The controller reads the physical disk device directly and stores the data in its read cache 221 (S706), then sends the lock server RAID controller 220 a lock information change message asking it to set the sender's share bit to 1, because the data is now in the sender's read cache 221 (S709).

If the share controller number identifies some other controller, the data is in that controller's read cache 221. The requesting RAID controller 220 sends that controller a data read request message describing the data to be read and waits for the response (S707).

When the response message arrives from that controller (S708), the requesting controller checks it for an error. If there is none, it sends a lock information change message to the lock server RAID controller 220 (S709) and keeps the data in its own read cache 221.

Finally, it sends the data now held in its read cache 221 to the host 210 (S710).
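How the requesting controller reacts to the share controller number can be sketched as follows. The function name, the return labels, and the `NOT_CACHED` constant name are hypothetical; only the value -1 comes from the description. The second element of the result says whether a lock information change message (S709) follows.

```python
from typing import Tuple

NOT_CACHED = -1  # share controller number meaning "no controller holds the data"


def handle_share_reply(share_controller: int, self_id: int,
                       lock_server_id: int) -> Tuple[str, bool]:
    """Return (data source, whether to send a lock information change message)."""
    if share_controller == self_id:
        return ("own read cache", False)            # S705
    if share_controller == lock_server_id:
        return ("lock server read cache", False)    # data supplied by the lock server
    if share_controller == NOT_CACHED:
        return ("disk", True)                       # S706, then S709
    return ("remote read cache", True)              # S707, S708, then S709 if no error


print(handle_share_reply(NOT_CACHED, self_id=1, lock_server_id=0))  # ('disk', True)
```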

FIG. 8 is a flowchart of how the lock server controller processes a lock information request message sent by another controller, in the case where the controller that received the read request is not the lock server.

On receiving a lock information request message from another RAID controller 220, the lock server RAID controller 220 first looks up the lock information 223 using the physical disk device number and physical disk block number carried in the request message (S801).

If lock information 223 exists but a write is currently in progress, the request waits until the write completes (S802).

If no write is in progress, the lock server examines the share bits. If they are all 0, the data is in the write cache 222, so the lock server RAID controller 220 retrieves the data from its write cache 222 (S803) and first writes it to disk (S804).

It then sets the share bit of the requesting RAID controller 220 to 1 (S805) and copies the data to the buffer address carried in the lock information request message (S807).

If the share bits are not all 0 and the lock server's own share bit is 1, it retrieves the data from its own read cache 221 (S806) and copies it to the buffer address carried in the lock information request message (S807).

It then builds a response message containing its own controller number and sends it to the requesting RAID controller 220 (S808, S809).

If the share bit of a controller other than the lock server is 1, the lock server stores that controller's number in the response message and sends it to the requesting RAID controller 220 (S810, S809).

If no lock information 223 exists, the lock server RAID controller 220 builds a response message with the share controller number set to -1 and sends it to the requesting RAID controller 220 (S811, S809).
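The share controller number returned in FIG. 8 can be computed as below. This is a sketch under stated assumptions: the names are hypothetical, the disk flush and buffer copy are not modeled, waiting is represented by returning `None`, and when several other controllers hold a copy the lowest-numbered one is reported, a choice the description leaves open.

```python
from dataclasses import dataclass
from typing import Optional

NOT_CACHED = -1


@dataclass
class Lock:
    share_bitmap: int = 0
    write_flag: bool = False


def share_controller_reply(lock: Optional[Lock], self_id: int,
                           requester_id: int) -> Optional[int]:
    """Share controller number the lock server puts in its response."""
    if lock is None:
        return NOT_CACHED                        # S811
    if lock.write_flag:
        return None                              # S802: caller waits and retries
    if lock.share_bitmap == 0:
        # Data is in the write cache (S803); flush to disk (S804) omitted here.
        lock.share_bitmap |= 1 << requester_id   # S805
        return self_id                           # S807, S808
    if lock.share_bitmap & (1 << self_id):
        return self_id                           # S806-S808
    # Assumption: report the lowest-numbered controller holding a copy (S810).
    lowest_bit = lock.share_bitmap & -lock.share_bitmap
    return lowest_bit.bit_length() - 1
```

For example, a record with share bitmap `0b110` queried on lock server 0 yields controller 1, and the requester then fetches the data from that controller as in FIG. 7.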

FIG. 9 is a flowchart of how a data read request message received from another RAID controller 220 is processed.

When a RAID controller 220 that is not the lock server for the data receives a read request message from another RAID controller 220, it searches its read cache 221 using the physical disk device number and physical disk block number carried in the request message (S901).

It checks whether the read cache 221 search produced an error. If there is no error, that is, if the data is in the read cache 221, it copies the data to the buffer address carried in the message (S902).

It then builds a response message that includes the error status (S903) and sends it to the requesting RAID controller 220 (S904).
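A minimal sketch of this handler follows. The names are hypothetical: the read cache is modeled as a dictionary keyed by (disk device number, block number), and a `bytearray` stands in for the requester's cache buffer that the real system reaches through the SCI switch.

```python
def handle_data_read_request(read_cache: dict, disk_no: int, block_no: int,
                             dest_buffer: bytearray) -> dict:
    """Serve a peer's data read request from the local read cache."""
    data = read_cache.get((disk_no, block_no))   # S901: search the read cache
    if data is None:
        return {"error": True}                   # S903: data not found
    dest_buffer[:len(data)] = data               # S902: copy to requester's buffer
    return {"error": False}                      # S903: success


cache = {(1, 7): b"abcd"}
buf = bytearray(4)
print(handle_data_read_request(cache, 1, 7, buf), bytes(buf))
# {'error': False} b'abcd'
```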

FIG. 10 is a flowchart of how a controller processes a write request for data that it manages itself.

On receiving a write request from a host 210, the RAID controller 220 first looks up the lock information 223 for the block using the physical disk device number and physical disk block number (S1001).

If no such lock information 223 exists, the data is not in the write cache 222. The controller therefore allocates a new lock information record 223 (S1002), sets the write flag to 1 (S1003), allocates a cache tag in the write cache 222 (S1004), and stores the write data received from the host 210 (S1005).

If the lock information 223 exists, the RAID controller 220 first checks whether the write flag is 1 (S1006).

If the write flag is 1, the request waits until that write completes (S1007).

If the write flag is 0 and all share bits are also 0, the data is in the write cache 222. The RAID controller 220 sets the write flag of the lock information 223 to 1 (S1008) and overwrites the data in the write cache 222 with the write data received from the host 210 (S1009).

If the share bits are not all 0, the controller sends a data delete request message to each controller whose bit is 1, asking it to delete the data from its read cache 221 (S1010).

After every controller whose share bit is 1 has responded that the data has been deleted from its read cache 221, the controller allocates a cache tag in the write cache (S1004) and stores the data received from the host 210 (S1005).

When the data has been stored, the controller sends a write completion response message to the host 210 (S1011), resets the write flag to 0 (S1012), and wakes the processes waiting on the lock information 223.
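The write path of FIG. 10 can be sketched in a single-threaded form. The names are hypothetical, waiting is represented by returning `"wait"` instead of blocking, `invalidate` is a caller-supplied callback standing in for the data delete request message, and clearing the share bits after invalidation is an assumption inferred from the rule that a record with an all-zero bitmap denotes data in the write cache.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Key = Tuple[int, int]  # (physical disk device number, physical disk block number)


@dataclass
class Lock:
    share_bitmap: int = 0
    write_flag: bool = False


def handle_write(locks: Dict[Key, Lock], write_cache: Dict[Key, bytes], key: Key,
                 data: bytes, invalidate: Callable[[int, Key], None]) -> str:
    lock = locks.get(key)                        # S1001
    if lock is None:
        lock = locks[key] = Lock()               # S1002
    elif lock.write_flag:
        return "wait"                            # S1006, S1007
    lock.write_flag = True                       # S1003 / S1008
    controller, bitmap = 0, lock.share_bitmap
    while bitmap:                                # S1010: invalidate stale copies
        if bitmap & 1:
            invalidate(controller, key)
        bitmap >>= 1
        controller += 1
    lock.share_bitmap = 0                        # assumption: no read copies remain
    write_cache[key] = data                      # S1004, S1005 or S1009
    lock.write_flag = False                      # S1012, after replying to the host
    return "done"                                # S1011
```

With a record whose share bitmap is `0b101`, a write invokes `invalidate` for controllers 0 and 2 before the new data is placed in the write cache.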

FIG. 11 is a flowchart of how a controller processes a write request for data managed by another controller.

When a write request arrives from a host 210 for data managed by another controller, the RAID controller 220 first allocates a temporary buffer (S1101) and stores the data received from the host 210 in it (S1102).

It then identifies the lock server RAID controller 220 that manages the data, sends that controller a data write request message, and waits for the response (S1103).

락 서버 RAID 제어기(220)로부터의 응답이 수신되면,(S1104) 임시 버퍼 할당을 취소시킨 후,(S1105) 호스트(210)로 쓰기가 완료되었다는 응답을 전송한다.(S1106)When the response from the lock server RAID controller 220 is received (S1104), the temporary buffer allocation is canceled (S1105), and a response indicating that writing is completed is transmitted to the host 210 (S1106).
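The forwarding steps S1101 to S1106 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `LockServerStub`, `handle_write_request`, `forward_write`, and the `owner_of` callback are hypothetical names, and a direct method call stands in for the message exchange between controllers.

```python
class LockServerStub:
    """Stand-in for a remote lock server (a real system exchanges messages)."""

    def __init__(self):
        self.write_cache = {}

    def handle_write_request(self, block, buffer):
        # The lock server copies the data out of the sender's temporary buffer.
        self.write_cache[block] = bytes(buffer)
        return "done"


def forward_write(block, host_data, lock_servers, owner_of):
    """Handle a host write for a block that another controller manages."""
    temp = bytearray(host_data)                        # S1101, S1102
    server = lock_servers[owner_of(block)]             # identify the lock server
    reply = server.handle_write_request(block, temp)   # S1103, returns at S1104
    if reply != "done":
        raise RuntimeError("lock server did not confirm the write")
    del temp                                           # S1105: release the buffer
    return "write-complete"                            # S1106
```

The receiving controller keeps no copy of the written data after the call returns, which matches the rule that updated data exists only in the write cache of the lock server.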

FIG. 12 is a flowchart illustrating how the lock server RAID controller 220 processes a data write request message forwarded from another RAID controller 220.

Upon receiving a write request message from another RAID controller 220, the lock server RAID controller 220 first searches for the lock information 223 corresponding to the write request message (S1201).

If the lock information 223 does not exist, the lock server allocates new lock information 223, stores the data-related information in it, sets the write flag to 1, and links it to the hash entry (S1202, S1203).

It then allocates a cache buffer in the write cache 222 to hold the data (S1204), fetches the data from the buffer address given in the write request message, and stores it in the allocated cache buffer (S1205).

If the lock information 223 exists, the write flag is checked first (S1206).

If the write flag is 1, the lock server waits until the write currently in progress completes (S1207).

If the write flag is 0 and all share bits are 0, the lock server sets the write flag to 1 (S1208) and overwrites the existing data stored in the write cache 222 with the new data (S1209).

If any share bit is 1, the lock server sends a data deletion request message to the corresponding RAID controllers 220 (S1210) and, once it has received deletion completion messages from those controllers, stores the data in the write cache 222 (S1205).

When the data write is complete, the lock server changes the write flag back to 0 (S1211) and sends a write completion response message to the other RAID controller 220 (S1212).
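The waiting behavior around the write flag (S1206 to S1208 and S1211) can be sketched with standard Python threading primitives. The specification refers to semaphores for process synchronization; the `threading.Condition` used here is a substitute chosen for brevity, and `LockInfo`, `begin_write`, `end_write`, and `remote_write` are hypothetical names. Handling of share bits is omitted.

```python
import threading


class LockInfo:
    """Lock information reduced to the write flag and its wait queue."""

    def __init__(self):
        self.write_flag = 0
        self._cond = threading.Condition()

    def begin_write(self):
        with self._cond:
            while self.write_flag == 1:   # S1207: wait for the pending write
                self._cond.wait()
            self.write_flag = 1           # S1208

    def end_write(self):
        with self._cond:
            self.write_flag = 0           # S1211
            self._cond.notify_all()       # wake the waiting requests


def remote_write(info, write_cache, block, data):
    """Serve a write request message from another controller."""
    info.begin_write()
    try:
        write_cache[block] = data         # S1209 (or S1204 and S1205)
    finally:
        info.end_write()
    return "write-complete"               # S1212
```

Because every writer passes through `begin_write`, at most one write to a given block is in progress at any time, and later requests resume only after the flag returns to 0.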

FIG. 13 is a flowchart of the destaging process according to the present invention, which stores data from the write cache 222 on disk.

This destaging process moves existing data from the write cache 222 down to disk when another RAID controller 220 requests a read of data stored in the write cache 222, or when the dirty blocks in the write cache 222 have accumulated beyond a certain level and room is needed for new data.

To this end, the RAID controller 220 scans its own write cache 222 (S1301) and checks whether the dirty blocks have reached a certain level (S1302).

If the dirty blocks have reached that level, the RAID controller 220 first looks up the lock information 223 for the existing data to be stored on disk, using its physical disk device number and physical disk block number, and sets its own share bit in that lock information 223 to 1 (S1303).

It then allocates an available cache buffer from the read cache (S1304).

If the allocated cache buffer already holds valid data, the lock information 223 for that data is first found and updated (S1305, S1306).

The cache buffer of the allocated read cache 221 tag and the cache buffer being destaged are exchanged (S1307), and the dirty data is stored on disk (S1308).

As a result, the destaged data moves from the write cache 222 to the read cache 221 and remains cached there.
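A minimal sketch of steps S1301 to S1308 follows. It rests on several assumptions that the specification does not state: caches are modeled as insertion-ordered dictionaries, the oldest entry is chosen both as the block to destage and as the read-cache victim, a single `share_bits` table stands in for the lock information of all blocks, and `dirty_limit` is at least 1. The function name `destage` and all parameter names are hypothetical.

```python
def destage(cid, write_cache, read_cache, share_bits, disk,
            dirty_limit, read_capacity):
    """Move dirty blocks from the write cache to disk and the read cache."""
    destaged = []
    while len(write_cache) >= dirty_limit:               # S1301, S1302
        block = next(iter(write_cache))                  # oldest dirty block (assumed policy)
        share_bits[block] = share_bits.get(block, 0) | (1 << cid)   # S1303
        if len(read_cache) >= read_capacity:             # S1304: reuse a buffer
            victim = next(iter(read_cache))              # buffer holds valid data
            del read_cache[victim]
            # S1305, S1306: this controller no longer shares the victim block
            share_bits[victim] = share_bits.get(victim, 0) & ~(1 << cid)
        data = write_cache.pop(block)                    # S1307: exchange buffers
        read_cache[block] = data
        disk[block] = data                               # S1308
        destaged.append(block)
    return destaged
```

After the call, each destaged block is clean, resides in the read cache, and is marked as shared by the destaging controller, so later reads can be served without touching the disk.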

The shared data management structure of the present invention described above is a static, distributed data management policy.

That is, the data stored in one logical disk device 230 is not all managed by a single controller. Instead, it is divided into segments that are assigned to the individual controllers 220, so that each controller 220 manages access rights only for the data in its own segments. Because each controller manages a fixed-size portion of the total capacity of one logical disk device 230, when data requested simultaneously by several hosts 210 is managed by different controllers, the I/O channels through which the logical disk device 230 can be accessed are multiplexed, which reduces the I/O channel bottleneck that arises in a single-controller structure.
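One way to picture a static assignment of segments to controllers is the sketch below. The round-robin rule, the function name `lock_server_for`, and the parameters `segment_blocks` and `n_controllers` are assumptions made for illustration; this passage states only that the data is divided into segment units and assigned to the controllers.

```python
def lock_server_for(block, segment_blocks, n_controllers):
    """Return the index of the controller that manages a given block.

    Assumes fixed-size segments handed out to controllers in round-robin order.
    """
    segment = block // segment_blocks
    return segment % n_controllers


# Example: 128-block segments shared by 4 controllers.
owners = [lock_server_for(b, 128, 4) for b in (0, 127, 128, 512, 640)]
```

Because the mapping depends only on the block number, every controller can compute the lock server for any block locally, without consulting a directory.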

Furthermore, in application environments where read requests for the same data block frequently arrive from several hosts 210 at several controllers 220, a single-controller structure must access the data in the read cache sequentially. In a structure where multiple controllers 220 manage shared data in a distributed manner, copies of the same data can exist in several controllers, so until a write request arrives, read requests arriving through different controllers 220 can be served concurrently from each controller's read cache 221.

In addition, even when each controller 220 stores and manages different data, a request from the host 210 for data that the receiving controller does not manage can be served from another controller's cached data without reading directly from the physical disk device.

As described above, the distributed data sharing RAID control system using multiple RAID controllers according to the present invention does not fix the controller that manages each logical disk device but shares the logical disk devices among multiple RAID controllers. It can therefore improve performance through I/O channel multiplexing and effectively prevent the data corruption and loss that data sharing could otherwise cause.

In addition, in applications with many read requests from several hosts for the same data block, the requests can be served by several controllers at the same time, which further improves read performance.

Moreover, depending on how the size of the data units distributed to, stored by, and managed by each controller is chosen, read requests from several hosts for data with little locality can also be served by several controllers simultaneously, yielding better read performance.

What has been described above is merely one embodiment for implementing the distributed data sharing RAID control system using multiple RAID controllers according to the present invention. The present invention is not limited to this embodiment; its technical spirit extends to the full range of modifications that a person of ordinary skill in the art to which the invention pertains could make without departing from the gist of the invention as claimed in the following claims.

FIG. 1 is a block diagram of a conventional RAID system having several independent RAID controllers.

FIG. 2 is a block diagram of a distributed shared RAID system according to the present invention.

FIG. 3 is a diagram showing the shared data management structure of each RAID controller according to the present invention.

FIG. 4 is a diagram of the distributed data management structure of multiple RAID controllers for one logical disk device according to the present invention.

FIG. 5 is a data structure diagram of the lock information and the lock management hash table according to the present invention.

FIG. 6 is a flowchart of read request processing in the lock server according to the present invention.

FIG. 7 is a flowchart of read request processing for data managed by another controller according to the present invention.

FIG. 8 is a flowchart of processing a lock information request message sent from another controller to the lock server according to the present invention.

FIG. 9 is a flowchart of processing a data read request message sent from another controller to the lock server according to the present invention.

FIG. 10 is a flowchart of write request processing in the lock server according to the present invention.

FIG. 11 is a flowchart of write request processing for data managed by another controller according to the present invention.

FIG. 12 is a flowchart of processing a data write request message sent from another controller to the lock server according to the present invention.

FIG. 13 is a flowchart of the destaging process that stores write cache data on disk according to the present invention.

<Description of the symbols for the main parts of the drawings>

210: host    220: RAID controller

230: logical disk device    250: SCI switch

221: read cache    222: write cache

223: lock information    224: backup write cache

Claims

A distributed data sharing RAID control system,

comprising multiple RAID controllers that distribute and share data among a plurality of hosts and logical disk devices, wherein the RAID controllers divide the data area of each shared logical disk device among themselves, and each RAID controller handles the I/O of its own partitioned area and operates as a lock server that manages the access rights to that area.

The system of claim 1, wherein the RAID controller uses an SCI switch for block data transfer with another RAID controller.

The system of claim 1, wherein each RAID controller manages its cache as a separate read cache, write cache, and backup write cache in consideration of I/O performance, designates its backup write cache in a round-robin manner for recovery from controller errors, and manages the lock information for the data that it manages itself.

The system of claim 1, wherein data is updated only at the lock server RAID controller, and the updated data exists only in the write cache of the lock server RAID controller.

The system of claim 1, wherein each RAID controller manages the lock information in one hash table per logical disk device, the hash table consists of a plurality of hash entries, and each hash entry is a linked list of lock information.

The system of claim 1, wherein the lock information stored by each RAID controller comprises the physical disk device number and physical disk block number at which the shared data is stored, the number of processes waiting to reference the lock information, a controller bitmap indicating the controllers that hold a copy of the data, a flag indicating that a write to the data is in progress, and a semaphore for process synchronization.

The system of claim 6, wherein the RAID controller assigns each controller a unique bit number in the controller bitmap, sets the share bit of a given controller to 1 when the data exists in the read cache of that controller, and, when a share bit is 1, requests the controller having that share bit to delete the data from its read cache.

The system of claim 6, wherein, to block access to the data while a write request for the data of the lock information is in progress, the RAID controller sets the write flag to 1 when a write request arrives, and all subsequent read and write requests wait on the lock information until the write in progress completes.

A control method of the distributed data sharing RAID control system according to claim 1, comprising:

when a read request for data that the controller itself manages is received from a host, searching the lock information hash table to check whether lock information for the data exists;

if the lock information does not exist, allocating new lock information, setting the controller's own share bit to 1, reading the data from disk, and storing it in the read cache;

if the lock information exists, checking the write flag to determine whether a write is in progress, and, if no write to the data is in progress, examining the share bits to identify the controller that has read the data;

if all share bits are 0, retrieving the data from the controller's own write cache and delivering it; and

if the controller's own share bit is 1, reading the data from its own read cache and delivering it, and, if the share bit of another controller is 1, sending a read request message to that controller, receiving the data, storing it in its own read cache, changing its own share bit to 1, and delivering the data to the host.

when a read request is received for data that the controller does not itself manage, identifying the lock server RAID controller for the data and allocating a new cache buffer in the controller's own read cache;

sending a lock information request message, including information about the data and the address of the cache buffer, to the lock server controller;

determining the number of the share controller for the data from the contents of the response message from the lock server controller;

if the controller itself is the share controller, retrieving the data from its own read cache and delivering it;

if the share controller is the lock server controller, reading the data stored in the read cache of the lock server, and, if the share controller is another controller, sending a read request message containing the data information to that controller, receiving the response message, sending a lock information change message to the lock server controller, and storing the data in its own read cache;

if the share controller number is -1, meaning that the data is not stored in any controller, reading the data directly from the physical disk device, storing it in the read cache, and sending a lock information change message to the lock server controller; and

transmitting the data stored in its own read cache through the above steps to the requesting host.

A control method of the distributed data sharing RAID control system using multiple RAID controllers, comprising:

upon receiving, from another RAID controller, a lock information request message for data that the controller itself manages, searching for the lock information identified by the message;

if the lock information does not exist, creating a response message with the share controller number set to -1 and transmitting the response message to the other controller;

if the lock information exists but a write is currently in progress, waiting until the write completes;

if no write is in progress and all share bits are 0, changing the share bit of the other controller to 1, retrieving the data from its own write cache, and transmitting the data to the buffer address of the other controller;

if no write is in progress and its own share bit is 1, retrieving the data from its own read cache and transmitting the data to the buffer address of the other controller; and

if no write is in progress and the share bit of a different controller is 1, storing the number of that controller in the response message and transmitting the response message to the requesting controller.

A control method of the distributed data sharing RAID control system using multiple RAID controllers, comprising:

when a write request for data that the controller itself manages is received from a host, searching for the lock information for the data;

if the lock information does not exist, allocating new lock information, setting the write flag to 1, and storing the write data from the host in the write cache;

if the lock information exists, first checking the write flag, and, if the write flag is 1, waiting until the write completes;

if the write flag is 0 and all share bits are 0, changing the write flag to 1 and overwriting the data in the write cache with the write data from the host;

if the share bits are not all 0, requesting the controllers whose share bit is 1 to delete the data from their read caches, and, once the data has been deleted, storing the write data from the host in the write cache; and

changing the write flag of the lock information to 0 when the storing of the write data through the above steps is complete.

A control method of the distributed data sharing RAID control system using multiple RAID controllers, comprising:

when a write request for data managed by another controller is received from a host, allocating a temporary buffer and storing the data from the host in it;

identifying the lock server controller for the data, transmitting a data write request message to the lock server controller, and waiting for a response; and

when the response is received from the lock server controller, releasing the temporary buffer and transmitting a write completion response to the host.

upon receiving, from another RAID controller, a write request message for data that the controller itself manages, searching for the lock information corresponding to the write request message;

if the lock information does not exist, allocating new lock information, storing the data-related information in it, setting the write flag to 1, fetching the write data from the buffer address of the other controller, and storing the write data in a buffer of the write cache;

if the lock information exists, first checking the write flag, and, if the write flag is 1, waiting until the current write completes;

if the write flag is 0 and all share bits are 0, setting the write flag to 1 and overwriting the existing data stored in the write cache with the new data;

if the share bits are not all 0, sending a deletion request message for the data to the controllers whose share bit is 1, and, when the deletion is complete, setting the write flag to 1 and storing the write data from the other controller in the write cache; and

changing the write flag back to 0 when the data write through the above steps is complete.

A control method of the distributed data sharing RAID control system using multiple RAID controllers, comprising, in destaging the write cache:

scanning the write cache to determine whether the dirty blocks have reached a certain level;

if the dirty blocks have reached that level, searching for the lock information corresponding to the existing data and setting the controller's own share bit in that lock information to 1;

allocating an available cache buffer from the read cache;

if valid data is stored in the allocated read cache buffer, finding and changing the lock information for that data; and

exchanging the cache buffer of the allocated read cache tag with the write cache buffer being destaged, and storing the dirty data on disk.