KR0123399B1 - Method of data consistency preserving in high availability system

Info

Publication number: KR0123399B1
Application number: KR1019940031731A
Authority: KR
Inventors: 강흠근; 임선배
Original assignee: 양승택; 한국전자통신연구원
Priority date: 1994-11-29
Filing date: 1994-11-29
Publication date: 1997-11-21
Also published as: KR960018971A

Abstract

This data consistency maintaining method keeps the data managed by a high availability system in a consistent state even if a computer fails. The method comprises the steps of: when an application process starts, checking whether a log file exists; if no log file exists, creating one and waiting for a data processing request; if a log file exists, checking whether it contains a log whose execution flag is not set (reset); if such a log exists, undoing the log, setting its execution flag, and waiting for a data processing request; if no such log exists, waiting for a standby request signal; and, when a data processing request arrives, creating a log, performing the operation, indicating normal completion of the operation by setting the execution flag, and waiting for the next data processing request.

Description

Method of Data Consistency Preserving in High Availability System

FIG. 1 is a structural diagram of a conventional high availability system.

FIG. 2 is a structural diagram of a high availability system to which the present invention is applied.

FIG. 3 is a structural diagram of a log.

FIG. 4 is a structural diagram of a log file.

FIG. 5 is a flowchart of an application process that uses logs.

* Explanation of symbols for main parts of the drawings

1, 2, 10, 11: monitoring process
3, 4, 12, 13: application process
5, 6, 14, 15: computer
7, 20: disk array
8, 9, 16, 18: data file
17, 19: log file
21: offset
22: length
23: data before the operation
24: data after the operation
25: execution flag
26: location where the last log will be stored
27: location where a new log will be stored
28: location of the first log whose execution flag is not set

The present invention relates to a method for maintaining the consistency of data managed by a high availability system even when a computer fails and a failover takes place.

FIG. 1 shows the structure of a conventional high availability system. Computer 1 (5) and computer 2 (6) are workstations or minicomputers running UNIX as the operating system, and the disk array is a redundant array of inexpensive disks (RAID), a storage device with fast retrieval speed and fault-tolerant capability.

Because the computers and the disk array are connected via SCSI, each computer can use the hard disks in the disk array as if they were its own internal hard disks.

The high availability system shown in FIG. 1 can provide uninterrupted service even when a failover occurs.

It does so by using monitoring processes.

In FIG. 1, monitoring process 1 (1) and monitoring process 2 (2) monitor each other at all times.

Thus, if computer 1 (5) fails, monitoring process 1 (1) stops running.

Monitoring process 2 (2), which has been watching monitoring process 1 (1), detects this immediately.

Monitoring process 2 (2) then causes application process 1 (3), which had been running on computer 1 (5), to run on computer 2 (6).

In this way, even if one of the computers (5, 6) fails, the high availability system of FIG. 1 can keep the application processes (3, 4) running, although overall performance is reduced.
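
The mutual monitoring described above can be sketched as follows. This is only an illustration: the patent does not say how one monitoring process detects that the other has stopped, so the heartbeat timestamps, the `timeout` value, and the `Monitor` class are all assumptions introduced here.

```python
class Monitor:
    """Hypothetical monitoring process that watches its peer's heartbeats."""

    def __init__(self, name, timeout=3.0):
        self.name = name
        self.timeout = timeout      # assumed detection threshold in seconds
        self.last_seen = None       # time of the peer's last heartbeat
        self.running = []           # application processes run on this computer

    def heartbeat_from_peer(self, now):
        """Record that the peer monitoring process was alive at time `now`."""
        self.last_seen = now

    def check_peer(self, now, peer_apps):
        """Take over the peer's applications if its heartbeat is overdue.

        Returns True when a failover was triggered.
        """
        if self.last_seen is None or now - self.last_seen <= self.timeout:
            return False
        for app in peer_apps:
            if app not in self.running:
                self.running.append(app)   # restart the application locally
        return True
```

For example, a `Monitor` for computer 2 that last heard from computer 1 at time 0.0 would leave things alone at time 1.0 and take over application process 1 at time 5.0.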

However, in the system of FIG. 1, the consistency of the data stored in the data files (8, 9) may be lost.

This is because a computer (5, 6) may fail while an application process (3, 4) is in the middle of modifying a data file (8, 9).
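
A small simulation can make the problem concrete. In this sketch the data file is modeled as an in-memory `bytearray`, and the `fail_after` parameter is a hypothetical hook that simulates the computer failing partway through a write; neither comes from the patent.

```python
def overwrite(data, offset, new, fail_after=None):
    """Write `new` into `data` byte by byte, optionally failing midway."""
    for i, byte in enumerate(new):
        if fail_after is not None and i == fail_after:
            raise RuntimeError("simulated computer failure")
        data[offset + i] = byte


data_file = bytearray(b"balance=0100")
try:
    # The failure strikes after two of the four bytes have been written.
    overwrite(data_file, 8, b"2599", fail_after=2)
except RuntimeError:
    pass

# data_file now holds b"balance=2500": neither the old value nor the new one.
```

After a failover, an application process restarted on the other computer has no way to tell from the data file alone that this value is the result of an interrupted write.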

Accordingly, an object of the present invention is to provide a method that uses a logging technique to maintain data consistency in a high availability system even when a failover occurs due to a computer failure.

To achieve the above object, the present invention is applied to a high availability system comprising a plurality of computers, each running a monitoring process and an application process, and a disk array, which is a redundant array of inexpensive disks and a storage device with fast retrieval speed and fault-tolerant capability, the computers and the disk array being connected via SCSI so that each computer can use the hard disks of the disk array as if they were its own internal hard disks. The invention is characterized in that every application process, each time it modifies data stored in the disk array, first creates a log of the related information (the location of the data to be modified, the value before the operation, and the value after the operation) and stores that log in the disk array before making the modification.
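
The ordering requirement, log first and modify second, can be sketched with ordinary files standing in for the data file and log file on the disk array. The text-line log format, the function names, and the use of `os.fsync` to push the log to storage before touching the data are assumptions made for this illustration; the patent does not specify them. The trailing `0` on each line stands for an execution flag that has not yet been set.

```python
import os


def append_log(log_path, offset, before, after):
    """Append one log line and force it to storage before returning."""
    line = f"{offset} {len(after)} {before.hex()} {after.hex()} 0\n"
    with open(log_path, "a") as log:
        log.write(line)
        log.flush()
        os.fsync(log.fileno())   # make sure the log reaches the disk first


def modify(data_path, log_path, offset, new):
    """Record the old and new values in the log, then modify the data file."""
    with open(data_path, "r+b") as data:
        data.seek(offset)
        before = data.read(len(new))               # value before the operation
        append_log(log_path, offset, before, new)  # 1. store the log
        data.seek(offset)
        data.write(new)                            # 2. only then modify the data
```

Because the log is on the shared disk array before the data changes, a process restarted on another computer can always find out which bytes an interrupted operation was touching.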

Hereinafter, the present invention is described in detail with reference to the accompanying drawings.

FIG. 2 is a structural diagram of a high availability system to which the present invention is applied.

The hardware characteristics are the same as those of the system of FIG. 1.

FIG. 3 shows the structure of a log.

The offset (21) indicates the byte position in the data file at which the data to be changed is located.

The length (22) indicates how many bytes of data are changed.

The pre-operation data (23) is the data value that was stored in the data file before the change.

The post-operation data (24) is the data value stored in the data file after the change.

The execution flag (25) indicates whether the operation represented by this log has finished executing.
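
The five fields of FIG. 3 map naturally onto a small record type. The sketch below is one possible encoding, not the patent's: the field widths (an 8-byte offset and a 4-byte length, little-endian), the one-byte flag, and the names `LogRecord`, `pack`, and `unpack` are all assumptions.

```python
import struct
from dataclasses import dataclass

# Assumed on-disk header: offset (21) as 8 bytes, length (22) as 4 bytes.
HEADER = struct.Struct("<QI")


@dataclass
class LogRecord:
    offset: int          # (21) byte position in the data file
    length: int          # (22) number of bytes changed
    before: bytes        # (23) data before the operation
    after: bytes         # (24) data after the operation
    done: bool = False   # (25) execution flag

    def pack(self) -> bytes:
        """Serialize the record as header, before, after, flag."""
        if not (len(self.before) == len(self.after) == self.length):
            raise ValueError("before/after must both be `length` bytes long")
        flag = b"\x01" if self.done else b"\x00"
        return HEADER.pack(self.offset, self.length) + self.before + self.after + flag

    @classmethod
    def unpack(cls, raw: bytes) -> "LogRecord":
        """Rebuild a record from the bytes produced by pack()."""
        offset, length = HEADER.unpack_from(raw)
        start = HEADER.size
        before = raw[start:start + length]
        after = raw[start + length:start + 2 * length]
        done = raw[start + 2 * length] == 1
        return cls(offset, length, before, after, done)
```

Keeping both the before and after values in the same record is what lets a restarted process both undo an interrupted operation and perform it again.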

The log file is the file in which these logs are stored, and it has the structure shown in FIG. 4.

FIG. 5 is a flowchart of application process 3 (12) and application process 4 (13) of FIG. 2.

The procedure for recovering from a computer failure is as follows.

When application process 3 (12) starts running for the first time, it first checks whether a log file exists.

Because this is its first run, there is no log file.

It therefore creates a log file and waits for a data processing request to arrive.

When a data processing request arrives, the process creates a log, performs the operation, and indicates that the operation completed normally by setting the execution flag (25).

Until a computer failure occurs, the process repeats this cycle: it receives a data processing request, creates a log, carries out the request, and sets the execution flag (25) of the log.
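
The normal cycle described above (create a log, perform the operation, set the flag) might look like the sketch below. It uses in-memory stand-ins, a `bytearray` for the data file and a list of dictionaries for the log file, and the function name `apply_request` is hypothetical.

```python
def apply_request(data: bytearray, log: list, offset: int, new: bytes) -> None:
    """Handle one data processing request in the order the description gives."""
    end = offset + len(new)
    entry = {
        "offset": offset,                    # (21)
        "length": len(new),                  # (22)
        "before": bytes(data[offset:end]),   # (23)
        "after": new,                        # (24)
        "done": False,                       # (25) not yet set
    }
    log.append(entry)            # 1. create the log
    data[offset:end] = new       # 2. perform the operation
    entry["done"] = True         # 3. set the execution flag
```

If the computer fails between steps 1 and 3, the log is left with its flag unset, which is exactly the condition the recovery procedure looks for.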

If computer 3 (14) fails while an operation is being performed, monitoring process 4 (11) starts application process 3 (12) on computer 4 (15).

Application process 3 (12), now starting on computer 4 (15), first checks whether a log file exists.

Because the log file created by application process 3 (12) while it ran on computer 3 (14) exists, the process checks whether that log file contains a log whose execution flag (25) is not set (reset).

Because application process 3 (12) on computer 3 (14) could not finish its operation, a log whose execution flag (25) is not set (reset) does exist.

Application process 3 (12) on computer 4 (15) undoes the logs whose execution flags are not set (reset).

Undoing a log means replacing, in the data file, length (22) bytes starting at the log's offset (21) with the pre-operation data (23).
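
As a minimal sketch, undoing a single log amounts to one slice assignment. The dictionary layout of `entry` and the in-memory `bytearray` data file are assumptions made for illustration.

```python
def undo(data: bytearray, entry: dict) -> None:
    """Restore `length` bytes at `offset` to the pre-operation data."""
    start = entry["offset"]
    end = start + entry["length"]
    data[start:end] = entry["before"]
```

For instance, if an interrupted write left `b"balance=950"` in the data file and the log records offset 8, length 3, and pre-operation data `b"250"`, undoing it restores `b"balance=250"`.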

Re-executing the undone operations and setting the execution flags of those logs completes the recovery from the failure of computer 3 (14).

After that, the process waits for new data processing requests and carries them out as they arrive.
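
The startup path of FIG. 5 can be summarized in a short sketch. Step labels in the comments follow those used in the claims; the `SharedDisk` class, the convention that `log is None` means no log file exists yet, and the dictionary layout of each log are assumptions introduced here.

```python
class SharedDisk:
    """Stand-in for the disk array that both computers can access."""

    def __init__(self, content: bytes):
        self.data = bytearray(content)   # data file
        self.log = None                  # log file; None means not yet created


def start_application(disk: SharedDisk) -> int:
    """Run the startup checks and return the number of logs recovered."""
    if disk.log is None:                 # s1: does a log file exist?
        disk.log = []                    # s2: create it
        return 0
    recovered = 0
    for entry in disk.log:               # s7: look for logs with the flag unset
        if entry["done"]:
            continue
        lo = entry["offset"]
        hi = lo + entry["length"]
        disk.data[lo:hi] = entry["before"]   # s8: undo
        disk.data[lo:hi] = entry["after"]    # s5: perform the operation again
        entry["done"] = True                 # s6: set the execution flag
        recovered += 1
    return recovered
```

Undoing before re-executing means the operation is always reapplied to a known pre-operation state, regardless of how far the interrupted write had progressed. After `start_application` returns, the process would go on to wait for data processing requests.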

The invention described above has the effect of maintaining data consistency in a high availability system even when a failover occurs due to a computer failure.

Claims

1. A method for maintaining data consistency in a high availability system, applied to a high availability system comprising a plurality of computers, each comprising a monitoring process and an application process, and a disk array, which is a redundant array of inexpensive disks and a storage device with fast retrieval speed and fault-tolerant capability, wherein each computer can use the hard disks of the disk array as if they were its own internal hard disks, the method comprising: a step (s0, s1) in which the application process, on starting, examines whether a log file exists; a step (s2, s3) of creating a log file and waiting for a data processing request if no log file exists in step s1; a step (s7) of checking, if a log file exists in step s1, whether the log file contains a log whose execution flag (25) is not set (reset); a step (s8, s5, s6, s3) of, if step s7 finds a log whose execution flag (25) is not set (reset), undoing the log, performing the operation again, setting the execution flag of the log, and then waiting for a data processing request; a step of waiting for a standby request signal if step s7 finds no log whose execution flag (25) is not set (reset); and a step (s3, s4, s5, s6, s3) of, when a data processing request arrives, creating a log, performing the operation, indicating that the operation completed normally by setting the execution flag (25), and waiting for a data processing request again.

2. The method of claim 1, wherein the log comprises: an offset (21) indicating the byte position in the data file at which the data to be changed is located; a length (22) indicating how many bytes of data are changed; pre-operation data (23) representing the data value stored in the data file before the change; post-operation data (24) representing the data value stored in the data file after the change; and an execution flag (25) indicating whether the operation represented by the log has finished executing.

3. The method of claim 1, wherein undoing a log comprises replacing the data in the data file, starting at the offset (21) of the log, with the pre-operation data (23).