KR102427418B1

KR102427418B1 - Apparatus and method for synthesizing backup data

Info

Publication number: KR102427418B1
Application number: KR1020190119737A
Authority: KR
Inventors: 이장선; 민영수; 김응진
Original assignee: 주식회사 데이타커맨드
Priority date: 2019-09-27
Filing date: 2019-09-27
Publication date: 2022-08-01
Also published as: KR20210037289A

Abstract

Disclosed are an apparatus and method for synthesizing backup data, and a recording medium, that minimize the storage space needed for data restoration and shorten data restoration time by generating synthetic metadata that implements virtual backup data in a metadata area and restoring the data of a recovery point based on the synthetic metadata. A backup data synthesizing apparatus according to an embodiment of the present invention synthesizes and restores the data of a recovery point from backup data that includes full backup data, generated by a full backup of backup target data, and incremental backup data, generated by incremental backup from the full backup data. The apparatus includes: a virtual backup data generation unit that synthesizes incremental backup metadata related to the incremental backup data with full backup metadata related to the full backup data to generate synthetic metadata for implementing virtual backup data; and a data restoration unit that extracts and restores the data of the recovery point from the full backup data and the incremental backup data based on the virtual backup data implemented by the synthetic metadata.

Description

Apparatus and method for synthesizing backup data

The present invention relates to an apparatus and method for synthesizing backup data, and more particularly to an apparatus and method for synthesizing backup data that minimize the storage space needed for data restoration and shorten data backup and restoration time by generating synthetic metadata that implements virtual backup data in a metadata area and restoring the data of a recovery point based on the synthetic metadata.

Every application program stored in a computer's memory and executed by a processor operates to process data. The result data produced by the application's data operations is written back to a storage medium through input/output system calls. Data stored on a storage medium can be lost for many reasons, such as a system failure; when that happens, business continuity cannot be guaranteed and data recovery can be very costly.

To minimize the work interruption caused by data loss, data management departments deploy and operate data backup systems. A data backup system backs up data using methods such as full backup and incremental backup, chosen according to the importance and nature of the data. A full backup backs up all of the data currently targeted for backup, whereas an incremental backup backs up only the data that has changed since the full backup. Because an incremental backup may have less data to back up than a full backup of the entire data set, it can save backup time.

FIG. 1 is a conceptual diagram illustrating a process of performing a full backup of backup target data. Referring to FIG. 1, a full backup is a backup method that backs up the entire backup target data 10. During a full backup, the data of the files in the backup target data 10 is transmitted to a backup server, and full backup metadata 12 and full backup data 14 are stored in a storage 20 of the backup server. The full backup metadata 12 is metadata indicating the hierarchical (directory) structure, file name, file ID, size, storage location, and the like of the files (file A to file D) of the full backup data 14. The full backup data 14 may include the data of the files (file A to file D) of the backup target data 10.

FIG. 2 is a conceptual diagram illustrating a process of incrementally backing up the data that has changed in the backup target data fully backed up in FIG. 1. Referring to FIGS. 1 and 2, an incremental backup is a backup method that, to shorten backup time, transmits to the backup server only the data that has changed since the previous backup rather than the entire backup target data 10'. The backup target data 10' illustrated in FIG. 2 is the backup target data 10 of FIG. 1 as changed after its full backup. For example, when incrementally backing up backup target data 10' in which, relative to the backup target data 10 of FIG. 1, file B has been deleted, file D has been modified, and file E has been newly created, incremental backup metadata 12' for the changed (deleted, modified, or created) files (files B, D, and E) and the data 14' of the modified or created files (files D and E) are transmitted to the backup server and stored in the storage 20.
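The change classification in the FIG. 2 example can be sketched as follows. This is only an illustrative sketch: the function name `classify_changes`, the snapshot dictionaries, and the use of a per-file fingerprint string to detect modification are assumptions made here, not details given in the specification.

```python
def classify_changes(previous, current):
    """Compare two snapshots that map file name -> content fingerprint (assumed)."""
    deleted = sorted(name for name in previous if name not in current)
    created = sorted(name for name in current if name not in previous)
    modified = sorted(
        name for name in current
        if name in previous and previous[name] != current[name]
    )
    return {"deleted": deleted, "modified": modified, "created": created}


# Snapshot at the full backup (FIG. 1) and the changed state (FIG. 2).
full_snapshot = {"file A": "a1", "file B": "b1", "file C": "c1", "file D": "d1"}
changed_snapshot = {"file A": "a1", "file C": "c1", "file D": "d2", "file E": "e1"}

changes = classify_changes(full_snapshot, changed_snapshot)

# Metadata is sent for every changed file; file data only for modified or created files.
files_with_metadata = sorted(changes["deleted"] + changes["modified"] + changes["created"])
files_with_data = sorted(changes["modified"] + changes["created"])
```

In this sketch the metadata set covers files B, D, and E, while the data set covers only files D and E, matching the example in the text.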

When data backed up in the storage 20 has to be recovered, fully backed-up data only needs to be restored to the file state at the time of the backup. An incremental backup, however, transmits only changed data, so the file state at the time of the backup can be restored only by synthesizing the incrementally backed-up data onto the previously backed-up data. FIG. 3 is a conceptual diagram illustrating a conventional process of restoring incrementally backed-up data. In general, to restore the data of a specific point in time, the full backup data 14 is restored first, and the incremental backup data 14' is then applied to it. When several incremental backups have been performed after the full backup, the data of a specific point in time can be restored by restoring the full backup data 14 and then applying the sets of incremental backup data 14' one after another in the order in which they were backed up.

For example, restoring the data of the fifth incremental backup after a full backup requires restoring the fully backed-up data and then applying the first through fifth incremental backups in order, which takes a long time. In addition, the data of the incremental backup point has to be synthesized by copying the fully backed-up data and then overwriting the copy with the incrementally backed-up data in turn. For example, if the backed-up data totals 10.3 GB, of which the full backup data is 10 GB and the incremental backup data is 0.3 GB, an additional 10 GB of storage is needed to copy the full backup data, and further storage is needed to overwrite the full backup data sequentially with the incremental backup data. Consequently, a large amount of storage space has to be allocated for data restoration, and restoration takes a long time.
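The conventional copy-then-overwrite restore described above can be sketched as follows, to make its cost visible. This is a simplified illustration under stated assumptions: backups are modeled as in-memory dictionaries, a deletion is represented by `None`, and the name `conventional_restore` is hypothetical.

```python
def conventional_restore(full_backup, incrementals, target):
    """Restore the state after `target` incremental backups (0 = full backup only)."""
    # Step 1: copy the whole full backup. This is the extra storage cost
    # (for example, 10 GB for a 10 GB full backup).
    restored = dict(full_backup)
    # Step 2: overwrite the copy with each incremental backup, oldest first.
    for changes in incrementals[:target]:
        for name, data in changes.items():
            if data is None:          # assumed marker for a deleted file
                restored.pop(name, None)
            else:                     # modified or newly created file
                restored[name] = data
    return restored


full_backup = {"file A": b"A1", "file B": b"B1", "file C": b"C1", "file D": b"D1"}
incrementals = [{"file B": None, "file D": b"D2", "file E": b"E1"}]

state_after_first_incremental = conventional_restore(full_backup, incrementals, 1)
```

Both the copy in step 1 and the number of passes in step 2 grow with the size of the full backup and the number of incremental backups, which is the problem the text identifies.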

An object of the present invention is to provide an apparatus and method for synthesizing backup data, and a recording medium, that generate synthetic metadata implementing virtual backup data (a virtual backup image) in a metadata area and restore the data of a recovery point based on that synthetic metadata.

Another object of the present invention is to provide an apparatus and method for synthesizing backup data, and a recording medium, that minimize the storage space needed for data restoration and shorten data restoration time.

Another object of the present invention is to provide an apparatus and method for synthesizing backup data, and a recording medium, that can restore the data of a desired backup point without copying the full backup data and then repeatedly overwriting it with incremental backup data.

A backup data synthesizing apparatus according to an embodiment of the present invention synthesizes and restores the data of a recovery point from backup data that includes full backup data, generated by a full backup of backup target data, and incremental backup data, generated by incremental backup from the full backup data. The apparatus includes: a virtual backup data generation unit that synthesizes incremental backup metadata related to the incremental backup data with full backup metadata related to the full backup data to generate synthetic metadata for implementing virtual backup data; and a data restoration unit that extracts and restores the data of the recovery point from the full backup data and the incremental backup data based on the virtual backup data implemented by the synthetic metadata.

The data restoration unit may restore the data without copying the full backup data, by extracting the chunks corresponding to the recovery point from a data file based on the synthetic metadata generated in the metadata area.

The synthetic metadata may include a revision field for recording, for each file, the range of backup jobs for which the file is valid. The virtual backup data generation unit may be configured to change the field values of the revision field according to the change type of the file.

The revision field may include: a file creation version field indicating the backup job in which the file was created; and a file deletion version field indicating the last backup job up to which the file is valid.

The virtual backup data generation unit may be configured to: determine which of a plurality of change types, including file creation, file deletion, and file modification, the change to the file corresponds to; when the change type is file deletion, set the file deletion version field to the value corresponding to the immediately preceding backup job, that is, the most recent of the previous backup jobs; when the change type is file creation, set the file creation version field to the value corresponding to the current backup job and set the file deletion version field to a maximum value; and when the change type is file modification, change the file deletion version field of the file before modification from the maximum value to the value corresponding to the immediately preceding backup job.

When the change type is file modification, the virtual backup data generation unit may further be configured to add a record for the modified file, set the file creation version field of the modified file to the value corresponding to the current backup job, and set the file deletion version field of the modified file to the maximum value.
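The revision-field rules for the three change types can be sketched as follows. This is a minimal sketch under stated assumptions: the record layout (`name`, `created`, `deleted`), the sentinel `MAX_VERSION`, the function names, and the convention that the full backup is job 1 and the first incremental backup is job 2 are all illustrative choices, not details fixed by the specification.

```python
MAX_VERSION = 2**31 - 1  # assumed "maximum value" meaning the record is still valid


def new_record(name, job):
    """Record created in backup job `job` and not yet deleted."""
    return {"name": name, "created": job, "deleted": MAX_VERSION}


def apply_change(records, name, change_type, job):
    """Update the record list in place for one changed file in backup job `job`."""
    if change_type not in ("create", "delete", "modify"):
        raise ValueError(f"unknown change type: {change_type}")
    if change_type in ("delete", "modify"):
        # Close the live record at the immediately preceding backup job.
        for rec in records:
            if rec["name"] == name and rec["deleted"] == MAX_VERSION:
                rec["deleted"] = job - 1
    if change_type in ("create", "modify"):
        # Add a record that is valid from the current job onward.
        records.append(new_record(name, job))


# Job 1: full backup of files A to D. Job 2: the incremental backup of FIG. 2.
records = [new_record(f"file {c}", 1) for c in "ABCD"]
apply_change(records, "file B", "delete", job=2)
apply_change(records, "file D", "modify", job=2)
apply_change(records, "file E", "create", job=2)

versions = {}
for rec in records:
    versions.setdefault(rec["name"], []).append((rec["created"], rec["deleted"]))
```

Note that a modification never rewrites the old record's data; it only narrows the old record's validity range and appends a new record, so earlier recovery points stay reachable.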

The synthetic metadata may include a chunk group ID field for assigning a chunk group ID to the group of chunks constituting the file, and a header number field for recording the header number corresponding to each chunk. The virtual backup data generation unit may be configured to: when the change type of the file is file creation or file modification, add a new record to the synthetic metadata by writing a new field value in the chunk group ID field; and when only part of the file's chunk data has changed, leave the header number field unchanged for the unchanged data, and create a new chunk for the changed data and generate a new header number in the header number field.
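The handling of partially changed chunk data can be sketched as follows. The specification does not say how changed chunks are detected, so this sketch assumes fixed chunk positions compared one by one; the function name and the way header numbers are allocated from a running counter are likewise assumptions.

```python
def headers_for_modified_file(old_headers, old_chunks, new_chunks, next_header):
    """Build the header number list for the new chunk group of a modified file.

    Returns (headers, chunks_to_store, next_free_header).
    """
    headers = []
    chunks_to_store = []
    for index, chunk in enumerate(new_chunks):
        if index < len(old_chunks) and old_chunks[index] == chunk:
            # Unchanged data: keep pointing at the existing chunk header.
            headers.append(old_headers[index])
        else:
            # Changed data: a new chunk is stored under a new header number.
            headers.append(next_header)
            chunks_to_store.append((next_header, chunk))
            next_header += 1
    return headers, chunks_to_store, next_header


old_headers = [10, 11, 12]
old_chunks = [b"aa", b"bb", b"cc"]
new_chunks = [b"aa", b"XX", b"cc"]   # only the middle chunk changed

headers, chunks_to_store, next_free = headers_for_modified_file(
    old_headers, old_chunks, new_chunks, next_header=13
)
```

Here the new chunk group reuses headers 10 and 12 and stores a single new chunk under header 13, so only the changed portion of the file consumes additional space.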

The data restoration unit may be configured to: extract the records corresponding to the recovery point from the synthetic metadata based on the revision field; and restore the data by reading the data corresponding to the extracted records from a data file.

The data restoration unit may be configured to extract, from the synthetic metadata, the records that satisfy the condition that the backup job sequence number corresponding to the recovery point is greater than or equal to the field value of the file creation version field and less than or equal to the field value of the file deletion version field.
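The extraction condition amounts to a range check on each record. A minimal sketch follows; the record layout, the sentinel `MAX_VERSION`, and the job numbering (full backup as job 1) are assumptions made for illustration.

```python
MAX_VERSION = 2**31 - 1  # assumed "maximum value" of the file deletion version field

# Synthetic metadata after the full backup (job 1) and the FIG. 2 incremental (job 2).
synthetic_metadata = [
    {"name": "file A", "created": 1, "deleted": MAX_VERSION},
    {"name": "file B", "created": 1, "deleted": 1},
    {"name": "file C", "created": 1, "deleted": MAX_VERSION},
    {"name": "file D", "created": 1, "deleted": 1},            # before modification
    {"name": "file D", "created": 2, "deleted": MAX_VERSION},  # after modification
    {"name": "file E", "created": 2, "deleted": MAX_VERSION},
]


def records_at(metadata, job):
    """Records valid at recovery point `job`: created <= job <= deleted."""
    return [rec for rec in metadata if rec["created"] <= job <= rec["deleted"]]


at_full_backup = [(r["name"], r["created"]) for r in records_at(synthetic_metadata, 1)]
at_incremental = [(r["name"], r["created"]) for r in records_at(synthetic_metadata, 2)]
```

Any recovery point is answered by one pass over the metadata, with no copying or replaying of backup data.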

The data restoration unit may be configured to look up, in the synthetic metadata, the header numbers corresponding to the chunk group IDs of the records for the recovery point, select the chunk headers according to those header numbers, and restore the data from the data file.
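The lookup from chunk group ID to header numbers to chunk data can be sketched as follows. The contents of a chunk header are not specified in this section, so the sketch assumes each header holds an offset and a length into the data file; the dictionary layout and the function name are also hypothetical.

```python
def restore_file(chunk_group_id, chunk_groups, chunk_headers, data_file):
    """Reassemble one file from the chunks named by its chunk group."""
    parts = []
    for header_number in chunk_groups[chunk_group_id]:
        offset, length = chunk_headers[header_number]  # assumed header contents
        parts.append(data_file[offset:offset + length])
    return b"".join(parts)


# Data file holding the three original chunks plus one chunk added by an incremental backup.
data_file = b"aabbccXX"
chunk_headers = {10: (0, 2), 11: (2, 2), 12: (4, 2), 13: (6, 2)}
chunk_groups = {
    1: [10, 11, 12],  # file version stored by the full backup
    2: [10, 13, 12],  # modified version: only the middle chunk is new
}

original = restore_file(1, chunk_groups, chunk_headers, data_file)
modified = restore_file(2, chunk_groups, chunk_headers, data_file)
```

Both versions are read directly from the same data file, so restoring the modified version does not require first materializing the original.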

A backup data synthesis method according to an embodiment of the present invention synthesizes and restores the data of a recovery point from backup data that includes full backup data, generated by a full backup of backup target data, and incremental backup data, generated by incremental backup from the full backup data. The method includes: generating, by a virtual backup data generation unit, synthetic metadata for implementing virtual backup data by synthesizing incremental backup metadata related to the incremental backup data with full backup metadata related to the full backup data; and extracting and restoring, by a data restoration unit, the data of the recovery point from the full backup data and the incremental backup data based on the virtual backup data implemented by the synthetic metadata.

The restoring step may include restoring the data without copying the full backup data, by extracting the chunks corresponding to the recovery point from a data file based on the synthetic metadata generated in the metadata area.

The synthetic metadata may include a revision field for recording, for each file, the range of backup jobs for which the file is valid. The step of generating the synthetic metadata may include changing the field values of the revision field according to the change type of the file.

The revision field may include: a file creation version field indicating the backup job in which the file was created; and a file deletion version field indicating the last backup job up to which the file is valid. The step of generating the synthetic metadata may include: determining which of a plurality of change types, including file creation, file deletion, and file modification, the change to the file corresponds to; when the change type is file deletion, setting the file deletion version field to the value corresponding to the immediately preceding backup job, that is, the most recent of the previous backup jobs; when the change type is file creation, setting the file creation version field to the value corresponding to the current backup job and setting the file deletion version field to a maximum value; and when the change type is file modification, changing the file deletion version field of the file before modification from the maximum value to the value corresponding to the immediately preceding backup job.

The step of generating the synthetic metadata may include, when the change type is file modification, adding a record for the modified file, setting the file creation version field of the modified file to the value corresponding to the current backup job, and setting the file deletion version field of the modified file to the maximum value.

The synthetic metadata may include a chunk group ID field for assigning a chunk group ID to the group of chunks constituting the file, and a header number field for recording the header number corresponding to each chunk. The step of generating the synthetic metadata may include: when the change type of the file is file creation or file modification, adding a new record to the synthetic metadata by writing a new field value in the chunk group ID field; and when only part of the file's chunk data has changed, leaving the header number field unchanged for the unchanged data, and creating a new chunk for the changed data and generating a new header number in the header number field.

The restoring step may include: extracting the records corresponding to the recovery point from the synthetic metadata based on the revision field; and restoring the data by reading the data corresponding to the extracted records from a data file.

The step of extracting the records may extract the records that satisfy the condition that the backup job sequence number corresponding to the recovery point is greater than or equal to the field value of the file creation version field and less than or equal to the field value of the file deletion version field.

According to an embodiment of the present invention, a computer-readable recording medium is provided on which a program for executing the backup data synthesis method is recorded.

According to an embodiment of the present invention, an apparatus and method for synthesizing backup data, and a recording medium, are provided that generate synthetic metadata implementing virtual backup data in a metadata area and restore the data of a recovery point based on the synthetic metadata.

In addition, according to an embodiment of the present invention, an apparatus and method for synthesizing backup data, and a recording medium, are provided that minimize the storage space needed for data restoration and shorten data restoration time.

In addition, according to an embodiment of the present invention, an apparatus and method for synthesizing backup data, and a recording medium, are provided that can restore the data of a desired backup point without copying the full backup data and then repeatedly overwriting it with incremental backup data.

FIG. 1 is a conceptual diagram illustrating a process of performing a full backup of backup target data.
FIG. 2 is a conceptual diagram illustrating a process of incrementally backing up the data that has changed in the backup target data fully backed up in FIG. 1.
FIG. 3 is a conceptual diagram illustrating a conventional process of restoring incrementally backed-up data.
FIG. 4 is a conceptual diagram of a backup data synthesis method according to an embodiment of the present invention.
FIG. 5 is a block diagram of a backup data synthesizing apparatus according to an embodiment of the present invention.
FIG. 6 is a conceptual diagram for explaining the function of the virtual backup data generation unit of the backup data synthesizing apparatus according to an embodiment of the present invention.
FIG. 7 is an exemplary diagram of full backup metadata and full backup data generated by the virtual backup data generation unit of the backup data synthesizing apparatus according to an embodiment of the present invention.
FIG. 8 is an exemplary diagram of incremental backup metadata and a data file generated by the virtual backup data generation unit of the backup data synthesizing apparatus according to an embodiment of the present invention.
FIG. 9 is a conceptual diagram for explaining the function of the virtual backup data generation unit of the backup data synthesizing apparatus according to an embodiment of the present invention.
FIG. 10 is an exemplary diagram of virtual backup data generated by the virtual backup data generation unit of the backup data synthesizing apparatus according to an embodiment of the present invention.
FIG. 11 is a flowchart of a backup data synthesis method according to an embodiment of the present invention.
FIG. 12 is an exemplary diagram for explaining a process of restoring data based on synthetic metadata generated according to an embodiment of the present invention.
FIG. 13 is an exemplary diagram for explaining a process of restoring the data of the full backup recovery point based on the synthetic metadata shown in FIG. 12.
FIG. 14 is an exemplary diagram for explaining a process of restoring data in which incrementally backed-up data is applied to fully backed-up data, based on the synthetic metadata shown in FIG. 12.

The advantages and features of the present invention, and the methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. The present invention is not, however, limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only so that the disclosure of the present invention is complete and so that a person of ordinary skill in the art to which the invention pertains is fully informed of the scope of the invention; the present invention is defined only by the scope of the claims. Like reference numerals refer to like elements throughout the specification.

In this specification, when a part is said to "include" a component, this means that, unless specifically stated otherwise, it may further include other components rather than excluding them. As used herein, a '~ unit' is a unit that processes at least one function or operation and may refer, for example, to software, an FPGA, or a hardware component. A function provided by a '~ unit' may be performed separately by a plurality of components or may be integrated with other additional components. A '~ unit' in this specification is not necessarily limited to software or hardware; it may be configured to reside in an addressable storage medium or to execute on one or more processors. Embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 4 is a conceptual diagram of a backup data synthesis method according to an embodiment of the present invention. In this method, before the full backup data and the incremental backup data are synthesized in the data area, the incremental backup metadata is first synthesized with the earlier metadata in the metadata area to generate synthetic metadata 30. That is, before the incremental backup data 14' is synthesized with the earlier data (for example, the full backup data 14), the incremental backup metadata 12' is synthesized with the full backup metadata 12, so that the synthetic metadata 30, which serves as virtual backup data, is produced at the metadata level.

Because metadata is information for managing the storage location, size, and the like of the actual file data, it is much smaller than the actual file data. Therefore, according to an embodiment of the present invention, there is no need to copy the actual file data in order to synthesize data: the synthetic metadata 30 (a virtual backup image) is generated in the metadata area, the storage locations of the data corresponding to the recovery point are then identified from the synthetic metadata 30, and only the data at those storage locations is extracted and transmitted. This reduces the storage space wasted on data recovery and shortens the time that recovery takes.

FIG. 5 is a block diagram of a backup data synthesis apparatus according to an embodiment of the present invention. Referring to FIG. 5, the backup data synthesis apparatus 100 may be provided in a backup server used for data recovery. To shorten data recovery time and minimize the storage space needed for recovery, the backup data synthesis apparatus 100 may include a virtual backup data generation unit 120, a data restoration unit 140, and a data storage 160.

The virtual backup data generation unit 120 may synthesize the full backup metadata (FMD) and one or more pieces of incremental backup metadata (IMD1, IMD2, ..., IMDN) in the metadata area to generate synthetic metadata, thereby creating virtual backup data for synthesizing the full backup data and the incremental backup data. The full backup metadata (FMD) may be metadata related to the full backup data. The incremental backup metadata (IMD1, IMD2, ..., IMDN) may be metadata related to the incremental backup data generated each time an incremental backup is performed, in sequence, after the full backup.

Once the virtual backup data generation unit 120 has generated the synthetic metadata, which is the virtual backup data, the data restoration unit 140 may identify from the synthetic metadata the storage locations of the data to be extracted. It may then recover the data by sequentially extracting the files (the chunks of each file) that correspond to the recovery point from the full backup data and incremental backup data stored in the data storage 160.

FIG. 6 is a conceptual diagram for explaining the function of the virtual backup data generation unit of the backup data synthesis apparatus according to an embodiment of the present invention. Referring to FIGS. 5 and 6, so that data corresponding to a specific past recovery point can be restored after multiple backup jobs have been performed, the virtual backup data generation unit 120 may record, in the revision fields (RF, RT) of the synthetic metadata, the range of backup jobs over which each file is valid. In the example of FIG. 6, the revision fields (RF, RT) indicate the version of a file: the backup job in which the file was created and the backup job up to which it remains valid.

In an embodiment, the virtual backup data generation unit 120 may express this range through two revision fields (RF, RT): a file creation version field (rev_from) and a file deletion version field (rev_to). The file creation version field (rev_from) may indicate the backup job in which the file was created, and the file deletion version field (rev_to) may indicate the last backup job for which the file is valid.

In FIG. 6, the part to the right of the revision fields (RF, RT) shows, as an aid to understanding, the range of backup job IDs over which each file is valid. For example, file c was added in backup job 3. When it was added, its revision fields (RF, RT) held the value (3, max); because it was deleted in backup job 7, they were changed to (3, 6). The revision fields may be recorded, for example, according to Rules 1 to 4 below.

Rule 1) When a file is newly created in a backup job, the file creation version field (rev_from) of that file takes the current backup job ID, and its file deletion version field (rev_to) takes the value 'max'.

Rule 2) When a file is deleted in a backup job, the file deletion version field (rev_to) of that file is changed from 'max' to the ID of the most recent of the preceding backup jobs (the immediately preceding backup job).

Rule 3) When a file is changed (modified) in a backup job, the file deletion version field (rev_to) of the file before the change (file b) is changed from 'max' to the ID of the immediately preceding backup job, and a new record is written for the changed file (file b'). The file creation version field (rev_from) of the changed file (file b') takes the current backup job ID, and its file deletion version field (rev_to) takes the value 'max'.

Rule 4) Each file is valid for backup job IDs in the closed interval [rev_from, rev_to].

In the example shown in FIG. 6, file a has not changed since it was created in backup job 1 (valid from backup job 1 to the present). File b was created in backup job 1 and modified into file b' in backup job 5 (valid from backup job 1 to backup job 4). File c was created in backup job 3 and deleted in backup job 7 (valid from backup job 3 to backup job 6). File b' was created in backup job 5 and deleted in backup job 8 (valid from backup job 5 to backup job 7). File d has not changed since it was created in backup job 5 (valid from backup job 5 to the present).

In the example shown in FIG. 6, the value 'max' recorded in the file deletion version field (rev_to) may be defined as the largest value the field can represent, to indicate that the file is still valid at present. The values recorded in the file creation version field (rev_from) and the file deletion version field (rev_to) of the revision fields (RF, RT) are of course not limited to those shown in FIG. 6 and may take various other forms.

Backup job 1 may be performed as a full backup, and the subsequent backup jobs may be performed as incremental backups. Taking recovery of the data backed up in backup job 5 as an example, the virtual backup data generation unit 120 may generate the synthetic metadata by synthesizing the full backup metadata generated in backup job 1 with the incremental backup metadata generated in each of backup jobs 2 to 5. The files valid at backup job 5 (files a, c, b', and d) can then be recovered.
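Rules 1 to 4 and the FIG. 6 example can be replayed in a few lines. The following is only a minimal sketch under stated assumptions: backup job IDs are taken to be consecutive integers (so the immediately preceding job is `job_id - 1`), 'max' is encoded as `MAX_REV`, and the names `FileRecord`, `create`, `delete`, `update`, and `valid_at` are hypothetical helpers that do not appear in the specification.

```python
from dataclasses import dataclass

MAX_REV = 2**32 - 1  # assumed encoding of 'max' (largest value the field can hold)


@dataclass
class FileRecord:
    name: str
    rev_from: int
    rev_to: int = MAX_REV


def create(records, name, job_id):
    # Rule 1: a new file is valid from the current job and is still live ('max').
    records.append(FileRecord(name, rev_from=job_id))


def delete(records, name, job_id):
    # Rule 2: close the live record at the immediately preceding job ID.
    live = next(r for r in records if r.name == name and r.rev_to == MAX_REV)
    live.rev_to = job_id - 1


def update(records, old_name, new_name, job_id):
    # Rule 3: close the old version, then write a new record for the changed file.
    delete(records, old_name, job_id)
    create(records, new_name, job_id)


def valid_at(records, job_id):
    # Rule 4: a file is valid for job IDs in [rev_from, rev_to].
    return [r.name for r in records if r.rev_from <= job_id <= r.rev_to]


# Replay of the changes described for FIG. 6.
records = []
create(records, "a", 1)
create(records, "b", 1)
create(records, "c", 3)
update(records, "b", "b'", 5)
create(records, "d", 5)
delete(records, "c", 7)
delete(records, "b'", 8)

print(valid_at(records, 5))
```

Querying the replayed table at backup job 5 selects files a, c, b', and d, which matches the recovery example above.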

FIG. 7 is an exemplary diagram of full backup metadata (syn_dir, syn_file, syn_chunk) and full backup data generated by the virtual backup data generation unit of the backup data synthesis apparatus according to an embodiment of the present invention. In an embodiment, the full backup metadata may include, in table form, directory metadata (syn_dir), file metadata (syn_file), chunk metadata (syn_chunk), and header metadata (hdr-0 to hdr-6).

The directory metadata (syn_dir) may include a directory creation version field (rev_from) and a directory deletion version field (rev_to), which are the revision fields indicating creation and deletion information for the directories (test, dir1, dir2), as well as a directory ID field (dir_uuid), a field for the ID of the parent directory containing the directory (parent_dir_uuid), and a directory name field (dir_name).

The file metadata (syn_file) may include a file creation version field (rev_from) and a file deletion version field (rev_to), which are the revision fields indicating creation and deletion information for the files (file A to file D), as well as a file ID field (file_uuid), a field for the ID of the directory containing the file (for an entry that is a directory rather than a file, the ID of that directory) (dir_uuid), a field for the ID of the parent directory containing that directory (parent_dir_uuid), a field for the ID of the chunk group that makes up the file (chunk_group_id), and a file name field (filename). A file may consist of a single chunk or of several chunks. The chunk group ID may be the ID of the single chunk making up a file, or an ID shared by all the chunks making up a file.

The chunk metadata (syn_chunk) may include information such as the storage location and size of each chunk. It may include a chunk group ID field (chunk_group_id); a field for the ID of the repository file in which the chunk of the file is stored (cr_id); a header number field (h_num) that points to one chunk within the repository file; a field for the position of the chunk in the original file (offset); a field for the size of the chunk (size); and, to indicate the valid region of the chunk data stored in the repository file, a field for the position within the header (frag_offset) and a field for the size of the chunk data within the header (frag_size). The chunk metadata (syn_chunk) may be linked to the file metadata (syn_file) through the chunk group ID field (chunk_group_id).

The data file (repository file) may include headers (hdr-0 to hdr-6) indicating where the file data is stored, and the data of the files actually stored (file A data, file B data, file C data-1, file C data-2, file D data-1, file D data-2, file D data-3). Each header may consist of an offset field indicating the location of the actual data and a size field indicating the size of the chunk data actually stored.

In the example data file shown in FIG. 7, file C consists of two chunks and file D consists of three chunks. The regions storing the two chunks of file C can be identified from header 2 (hdr-2) and header 3 (hdr-3), and the regions storing the three chunks of file D can be identified from header 4 (hdr-4), header 5 (hdr-5), and header 6 (hdr-6). Although not shown in FIG. 7, the full backup metadata may also include additional fields for recording the stat information of each file and the like.
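The way the tables of FIG. 7 link together (syn_file to syn_chunk through chunk_group_id, syn_chunk to a header through cr_id and h_num, and the header to the stored bytes) may be easier to follow as code. The sketch below is illustrative only. The field names follow the text, but the record layout, the byte contents, the chunk group ID 3 for file C, and the reading of frag_offset as an offset relative to the start of the stored chunk are all assumptions. Directory fields are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Header:
    offset: int  # where the chunk data starts in the repository file
    size: int    # size of the chunk data actually stored


@dataclass
class SynFile:
    rev_from: int
    rev_to: int
    file_uuid: str
    chunk_group_id: int
    filename: str


@dataclass
class SynChunk:
    chunk_group_id: int
    cr_id: int        # repository file ID
    h_num: int        # header number inside that repository file
    offset: int       # position of the chunk in the original file
    size: int
    frag_offset: int  # assumed: start of the valid region inside the stored chunk
    frag_size: int    # length of the valid region


def read_file(f, syn_chunk, headers, repositories):
    """Reassemble one file by following chunk_group_id -> (cr_id, h_num) -> bytes."""
    rows = sorted((c for c in syn_chunk if c.chunk_group_id == f.chunk_group_id),
                  key=lambda c: c.offset)
    out = bytearray()
    for c in rows:
        hdr = headers[c.cr_id][c.h_num]
        start = hdr.offset + c.frag_offset
        out += repositories[c.cr_id][start:start + c.frag_size]
    return bytes(out)


# Hypothetical repository file 1 holding file A, file B and the two chunks of file C.
repositories = {1: b"AAAA" + b"BBB" + b"C1" + b"C2"}
headers = {1: [Header(0, 4), Header(4, 3), Header(7, 2), Header(9, 2)]}
file_c = SynFile(1, 2**32 - 1, "uuid-c", 3, "file C")
syn_chunk = [
    SynChunk(3, 1, 2, 0, 2, 0, 2),  # file C data-1 via hdr-2
    SynChunk(3, 1, 3, 2, 2, 0, 2),  # file C data-2 via hdr-3
]

print(read_file(file_c, syn_chunk, headers, repositories))
```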

FIG. 8 is an exemplary diagram of incremental backup metadata (inc_file, inc_chunk) and a data file generated by the virtual backup data generation unit of the backup data synthesis apparatus according to an embodiment of the present invention. In an embodiment, the incremental backup metadata may include, in table form, file metadata (inc_file) and chunk metadata (inc_chunk).

The file metadata (inc_file) of the incremental backup metadata may include a backup file ID field (f_uuid) for the changed files (file B, file D, file E), a chunk group ID field (chunk_group_id) for the files being backed up, an operation ID field (operation) indicating the type of change made to the file (create, delete, update, truncate, rename, stat, and so on), a field for the changed file name (file_name), and the like.

The chunk metadata (inc_chunk) of the incremental backup metadata may include, for each chunk of the changed files, information such as the position and size of the change, and only chunks for the changed content may be recorded. It may include a chunk group ID field (chunk_group_id) indicating the group of chunks to which the changed file belongs; a field for the ID of the repository file in which the chunk of the changed file is stored (cr_id); a header number field (h_num) that points to one chunk within the repository file; a field for the position of the chunk in the original file (offset); a field for the size of the chunk (size); and, to indicate the valid region of the chunk data stored in the repository file, a field for the position within the header (fragoffset) and a field for the size of the chunk data within the header (fragsize). The chunk metadata (inc_chunk) of the incremental backup metadata may be linked to the file metadata (inc_file) through the chunk group ID field (chunk_group_id).

When, relative to the preceding data (for example, the full backup data shown in FIG. 7), file B has been deleted, file D has been modified, and file E has been created, the incremental backup records the metadata for the changed files in the file metadata (inc_file), stores the changed data in the data file, and then reflects it in the chunk metadata (inc_chunk). The change type field (operation) may record the type of change, such as file creation (create), file deletion (delete), file modification (update), file truncation (truncate), file renaming (rename), or stat.

The file metadata (inc_file) and chunk metadata (inc_chunk) illustrated in FIG. 8 record information only about the changed files, and show that file B was deleted, file E was created, and file D was modified. The created file E and the modified file D are stored in the data file (repository file) in addition to the files already stored there, and headers (hdr-7, hdr-8) storing the storage location, size, and the like of the chunks of the created and modified files may be added to the metadata. Even when a file is deleted or modified, the deleted file and the pre-modification file (file B, file D) may be kept in the data file (repository file), so that the files can still be recovered at a recovery point preceding the deletion or modification.
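The append-only behaviour of the data file described above can be sketched as follows. The `Repository` class and the byte contents are hypothetical and serve only to show that an incremental backup adds new headers (here hdr-7 and hdr-8) without removing or overwriting the data behind hdr-0 to hdr-6.

```python
class Repository:
    """Hypothetical append-only data file: chunk data plus one header per chunk."""

    def __init__(self):
        self.data = bytearray()
        self.headers = []  # headers[h_num] = (offset, size)

    def append(self, chunk: bytes) -> int:
        # New chunk data always goes at the end; earlier chunks are never rewritten.
        self.headers.append((len(self.data), len(chunk)))
        self.data += chunk
        return len(self.headers) - 1  # header number of the stored chunk

    def read(self, h_num: int) -> bytes:
        offset, size = self.headers[h_num]
        return bytes(self.data[offset:offset + size])


repo = Repository()
# Full backup of FIG. 7: seven chunks, hdr-0 to hdr-6 (contents are placeholders).
for chunk in [b"A", b"B", b"C1", b"C2", b"D1", b"D2", b"D3"]:
    repo.append(chunk)

# Incremental backup of FIG. 8: file E is new, part of file D has changed.
h_e = repo.append(b"E")       # becomes hdr-7
h_d = repo.append(b"d2-new")  # becomes hdr-8

# Data of the deleted file B and of the old file D chunk is still readable.
print(h_e, h_d, repo.read(1), repo.read(5))
```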

FIG. 9 is a conceptual diagram for explaining the function of the virtual backup data generation unit of the backup data synthesis apparatus according to an embodiment of the present invention. Referring to FIGS. 5 and 9, when an incremental backup is performed after part of the content of the chunks making up a file has changed, the virtual backup data generation unit 120 may synthesize the chunk group for the changed file by combining the unchanged chunks with the changed chunk.

In the example shown in FIG. 9, file A is divided into three chunks for the full backup, and each chunk is stored in data file 1 (repository file) (cr_id = 1). A header (hdr 1, 2, 3) is recorded for each chunk, and the chunk group ID field (chunk_group_id) of the chunks is recorded as '1'. Each header holds the information needed to access its corresponding chunk.

When an incremental backup is performed after part of the data in the second of file A's three chunks has changed since the full backup, a record for the changed file (file A') may be added to the full backup metadata. The changed file A' may be recorded with a chunk group ID field value different from that of file A before the change; in the example of FIG. 9, the chunk group ID field of the changed file A' is recorded as '2'. In addition, header 4 (hdr 4) is recorded so that the changed chunk of file A' (the second of the three chunks) can be accessed, and the header number field (h_num) of that chunk is recorded as '4'.

The virtual backup data generation unit 120 may write the synthetic metadata for synthesizing the virtual backup data based on the chunk information recorded in the metadata. For file A', the unit uses the unchanged chunks 1 and 3 as they are. Within chunk 2, it also uses as they are the unchanged data before the changed region (the 0 to 25 MB data of chunk 2) and the unchanged data after the changed region (the 75 to 100 MB data of chunk 2). For the changed region of chunk 2 (the 25 to 75 MB data of chunk 2), it generates the synthetic metadata so that the data is synthesized by referring to chunk 4 (the 0 to 50 MB data of chunk 4).

At this time, the virtual backup data generation unit 120 records in the synthetic metadata, for each header, the field for the position at which the chunk data is stored within the header (frag_offset) and the field for the size of the chunk data within the header (frag_size). The data restoration unit 140 can later restore the data of the recovery point by reading, for each chunk, as many bytes as the size field (frag_size) specifies, starting at the position given by the position field (frag_offset).
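The FIG. 9 example can be worked through numerically. The sketch below assumes that chunks 1 and 3 are also 100 MB each (the text gives sizes only for chunk 2 and chunk 4) and that frag_offset is measured from the start of the stored chunk. `Frag` and `overlay` are hypothetical names used for illustration, not terms from the specification.

```python
from dataclasses import dataclass, replace

MB = 1024 * 1024


@dataclass(frozen=True)
class Frag:
    chunk_group_id: int
    h_num: int
    offset: int       # position in the original file
    frag_offset: int  # start of the valid region inside the stored chunk
    frag_size: int    # length of the valid region


def overlay(base, new_group, mod_offset, mod_size, mod_h_num):
    """Build the chunk group of the changed file from unchanged and changed parts."""
    mod_end = mod_offset + mod_size
    out = []
    for f in base:
        start, end = f.offset, f.offset + f.frag_size
        if start < mod_offset:
            # Keep the part of this chunk that lies before the changed region.
            keep = min(end, mod_offset) - start
            out.append(replace(f, chunk_group_id=new_group, frag_size=keep))
        if end > mod_end:
            # Keep the part of this chunk that lies after the changed region.
            skip = max(start, mod_end) - start
            out.append(replace(f, chunk_group_id=new_group, offset=start + skip,
                               frag_offset=f.frag_offset + skip,
                               frag_size=end - start - skip))
    # The changed region itself is served from the newly stored chunk.
    out.append(Frag(new_group, mod_h_num, mod_offset, 0, mod_size))
    return sorted(out, key=lambda f: f.offset)


# file A: three chunks behind hdr 1, 2, 3, chunk group 1 (100 MB each is assumed).
file_a = [
    Frag(1, 1, 0 * MB, 0, 100 * MB),
    Frag(1, 2, 100 * MB, 0, 100 * MB),
    Frag(1, 3, 200 * MB, 0, 100 * MB),
]

# file A': bytes 25 to 75 MB of chunk 2 changed; the new data is behind hdr 4.
file_a_prime = overlay(file_a, new_group=2, mod_offset=125 * MB,
                       mod_size=50 * MB, mod_h_num=4)

for f in file_a_prime:
    print(f.h_num, f.offset // MB, f.frag_offset // MB, f.frag_size // MB)
```

Under these assumptions the new chunk group has five rows: all of chunk 1, the first 25 MB of chunk 2, the 50 MB of chunk 4, the last 25 MB of chunk 2 (frag_offset of 75 MB), and all of chunk 3.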

FIG. 10 is an exemplary diagram of virtual backup data generated by the virtual backup data generation unit of the backup data synthesis apparatus according to an embodiment of the present invention. Referring to FIGS. 5 and 10, the virtual backup data generation unit 120 may generate the virtual backup data (IBD) by meta synthesis (Meta Synthetic), in which the incremental backup metadata (inc_file, inc_chunk) is synthesized with the preceding metadata.

In the example shown in FIG. 10, file B was deleted after the preceding backup (the full backup) and before the first incremental backup job; file E was created after the full backup and before the first incremental backup job; and file D had part of the data in the second of its three chunks modified after the full backup and before the first incremental backup job. The virtual backup data generation unit 120 reads a record of the file metadata (inc_file) of the incremental backup metadata and confirms that file B was deleted. Because the file was deleted after the first (full) backup and before the second (incremental) backup, the unit changes the file deletion version field (rev_to) for file B in the file metadata (syn_file) of the preceding metadata from 'max' to '1'. For a deleted file, no chunk needs to be added to the data file, and the chunk data for file B must be retained in case the fully backed-up data has to be recovered later, so the deletion is not reflected in the chunk metadata (syn_chunk).

The virtual backup data generation unit 120 reads the next record of the incremental backup metadata (inc_file, inc_chunk) and confirms that file E was created. It then adds a record for file E to the file metadata (syn_file) of the preceding metadata and records a new value ('8' in the example of FIG. 10) in the chunk group ID field of the added record. Because file E was newly added at the second backup (the incremental backup) following the first full backup, the unit records the file creation version field (rev_from) as '2' and the file deletion version field (rev_to) as 'max'.

The virtual backup data generation unit 120 also adds a record to the chunk metadata (syn_chunk). In the added record it stores the chunk group ID field (chunk_group_id) value of file E, the data file field (cr_id) value, the header number field (h_num) value for the newly added chunk (header 7 in the example of FIG. 10), and the position and size of the file E data stored in the chunk (offset, size, frag_offset, frag_size).

The virtual backup data generation unit 120 then reads the next record of the incremental backup metadata (inc_file, inc_chunk) and identifies that file D was modified and where. In the file metadata (syn_file) of the preceding metadata, it changes the file deletion version field (rev_to) of the record with the chunk group ID of the pre-change file D (7 in the example of FIG. 10) from 'max' to '1'. In case the fully backed-up data, including the pre-change file D, has to be recovered later, the chunks for headers 4 to 6, which belong to the chunk group of the pre-change file D (chunk group 7), are not deleted from the chunk metadata (syn_chunk).

The virtual backup data generation unit 120 adds a record for the changed file D to the file metadata (syn_file) of the preceding metadata and records a new value ('9' in the example of FIG. 10) in the chunk group ID field of the added record. Because the changed file D was modified at the second backup (the incremental backup) following the first full backup, the unit records the file creation version field (rev_from) as '2' and the file deletion version field (rev_to) as 'max'.

The virtual backup data generation unit 120 also adds a record for the modified file D to the chunk metadata (syn_chunk). In the added record it stores the chunk group ID field (chunk_group_id) value of the modified file D, the data file field (cr_id) value, the header number field (h_num) value for the modified chunk (header 8 in the example of FIG. 10), and the position and size of the modified file D data stored in the chunk (offset, size, frag_offset, frag_size).

At this time, the virtual backup data generation unit 120 adds to the chunk metadata (syn_chunk) a new chunk, corresponding to header 8, only for the modified portion of the modified chunk among file D's three chunks. For the unmodified first chunk (header 4), the unmodified third chunk (header 6), and the unmodified portion of the second chunk (header 5), it adds no new chunk. Leaving that data as it is already stored in the existing data file, it updates only the chunk group ID and, for the second chunk, the position and size of the unmodified portions.

By synthesizing the incremental backup metadata with the preceding metadata (the full backup metadata) at the metadata level, the virtual backup data generation unit 120 may generate synthetic metadata (IBD) that includes synthetic file metadata (IBD1) and synthetic chunk metadata (IBD2). The data restoration unit 140 can later use the synthetic metadata (IBD) as a virtual backup image for data recovery: it checks the validity interval of each file through the revision fields of the synthetic metadata (IBD), extracts the data (files) corresponding to the recovery point from the data file, and restores the data.
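How the data restoration unit might turn the synthetic metadata of FIG. 10 into a read plan for a given recovery point can be sketched as below. The chunk group IDs 7, 8, and 9 and the header numbers come from the text. The chunk group IDs for file A, file B, and file C (4, 5, and 6 here), the offsets, and the tuple layout are assumptions made purely for illustration.

```python
MAX_REV = 2**32 - 1  # assumed encoding of 'max'

# Synthetic file metadata (IBD1): (filename, rev_from, rev_to, chunk_group_id)
IBD1 = [
    ("file A", 1, MAX_REV, 4),  # chunk groups 4 to 6 are hypothetical values
    ("file B", 1, 1, 5),        # deleted before backup job 2
    ("file C", 1, MAX_REV, 6),
    ("file D", 1, 1, 7),        # version before the modification
    ("file E", 2, MAX_REV, 8),  # created before backup job 2
    ("file D", 2, MAX_REV, 9),  # version after the modification
]

# Synthetic chunk metadata (IBD2): (chunk_group_id, h_num, offset in original file)
IBD2 = [
    (4, 0, 0), (5, 1, 0), (6, 2, 0), (6, 3, 10),
    (7, 4, 0), (7, 5, 10), (7, 6, 20),
    (8, 7, 0),
    # New file D: chunk 1, front of chunk 2, changed part, back of chunk 2, chunk 3.
    (9, 4, 0), (9, 5, 10), (9, 8, 12), (9, 5, 17), (9, 6, 20),
]


def restore_plan(recovery_point):
    """Return, per valid file, the header numbers to read in file order."""
    plan = {}
    for filename, rev_from, rev_to, group in IBD1:
        if rev_from <= recovery_point <= rev_to:
            rows = sorted((r for r in IBD2 if r[0] == group), key=lambda r: r[2])
            plan[filename] = [h_num for _, h_num, _ in rows]
    return plan


print(restore_plan(1))
print(restore_plan(2))
```

In this sketch, recovery point 1 reads the old file D through headers 4, 5, and 6, while recovery point 2 reads the new file D through headers 4, 5, 8, 5, and 6 and skips file B entirely. No file data is copied to build either view.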

FIG. 11 is a flowchart of a backup data synthesis method according to an embodiment of the present invention. With reference to FIGS. 5, 10, and 11, the following describes the process of generating the synthetic metadata through synthesis in the metadata area, in which, after files have changed following completion of the full backup and an incremental backup has been performed, the content recorded in the incremental backup metadata (inc_file, inc_chunk) is reflected in the full backup metadata (syn_file, syn_chunk).

First, the virtual backup data generation unit 120 reads the records of the incremental backup file metadata (inc_file) table one at a time and determines the file change type of each record (S10, S20). The change type may be defined in the operation field (operation) of the incremental backup file metadata (inc_file), for example as one of file creation (create), file deletion (delete), file modification/truncation (update/truncate), or another type (rename, stat).

If the change type of the incrementally backed-up file is file creation (create), the virtual backup data generation unit 120 inserts a new record into the previous file metadata (syn_file) (S30) and, referring to the incremental backup chunk metadata (inc_chunk), creates records for a new chunk group in the previous chunk metadata (syn_chunk) (S40).

If the change type of the incrementally backed-up file is file deletion (delete), the virtual backup data generation unit 120 changes the value of the file deletion version field (rev_to) of the corresponding record in the previous file metadata (syn_file) to the ID of the previous backup job (S50).

If the change type of the incrementally backed-up file is file modification/truncation (update/truncate), the virtual backup data generation unit 120 changes the value of the file deletion version field (rev_to) of the corresponding record in the previous file metadata (syn_file) to the ID of the previous backup job and then inserts a new record (S60). Referring to the incremental backup chunk metadata (inc_chunk) and the previous chunk metadata (syn_chunk), it then creates records for a new chunk group in the previous chunk metadata (syn_chunk) (S70).

If the change type of the incrementally backed-up file is one of the other change types (rename, stat, and the like), the virtual backup data generation unit 120 changes the value of the file deletion version field (rev_to) of the record bearing the pre-change name in the previous file metadata (syn_file) to the ID of the previous backup job and inserts a record with the changed stat (S80).
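The file-metadata part of steps S10 to S80 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the `FileRecord` class, the `REV_MAX` sentinel (the claims only speak of a "maximum value"), the dictionary layout of the inc_file records, the `new_name` key, and the choice to treat the previous backup job ID as `job_id - 1` are all hypothetical. The chunk-metadata steps (S40, S70) are omitted, and reusing the chunk group ID on rename/stat is likewise an assumption.

```python
from dataclasses import dataclass

REV_MAX = 2**31 - 1  # assumed sentinel meaning "still valid" (the "maximum value")


@dataclass
class FileRecord:  # hypothetical shape of one syn_file record
    name: str
    rev_from: int        # backup job in which this version was created
    rev_to: int          # last backup job for which this version is valid
    chunk_group_id: int


def live_record(syn_file, name):
    """Return the currently valid record for `name` (rev_to still at the maximum)."""
    for rec in syn_file:
        if rec.name == name and rec.rev_to == REV_MAX:
            return rec
    raise KeyError(name)


def synthesize(syn_file, inc_file, job_id, next_group_id):
    """Merge inc_file records into syn_file in place; return the next free group ID."""
    prev_job = job_id - 1  # assumption: job IDs are consecutive integers
    for rec in inc_file:                      # S10: read records one at a time
        op = rec["operation"]                 # S20: determine the change type
        if op == "create":                    # S30: insert a new record
            syn_file.append(FileRecord(rec["name"], job_id, REV_MAX, next_group_id))
            next_group_id += 1
        elif op == "delete":                  # S50: close the valid range
            live_record(syn_file, rec["name"]).rev_to = prev_job
        elif op in ("update", "truncate"):    # S60: close old range, insert new record
            live_record(syn_file, rec["name"]).rev_to = prev_job
            syn_file.append(FileRecord(rec["name"], job_id, REV_MAX, next_group_id))
            next_group_id += 1
        else:                                 # S80: rename, stat, and the like
            old = live_record(syn_file, rec["name"])
            old.rev_to = prev_job
            new_name = rec.get("new_name", rec["name"])
            # Assumption: the data is unchanged, so the chunk group is reused.
            syn_file.append(FileRecord(new_name, job_id, REV_MAX, old.chunk_group_id))
    return next_group_id
```

Because old records are never removed and only their rev_to value is narrowed, every earlier recovery point stays reachable from the same table.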

FIG. 12 is an exemplary diagram for explaining a process of restoring data based on synthetic metadata generated according to an embodiment of the present invention; it shows the synthetic metadata and the data file generated according to FIG. 10. FIG. 13 is an exemplary diagram for explaining a process of restoring the data of the full-backup recovery point based on the synthetic metadata shown in FIG. 12. FIG. 14 is an exemplary diagram for explaining a process of restoring data in which the incrementally backed-up data is reflected in the fully backed-up data, based on the synthetic metadata shown in FIG. 12.

In FIG. 12, records with a background color in the tables are records that were inserted or modified during the incremental backup. The data file contains a total of eight chunks, and each chunk can be accessed through its header number (the value of the h_num field of syn_chunk). The data of a recovery point can later be restored from the chunk data accessed through the header numbers.

First, the process of restoring all data of backup No. 1 (the full backup) is described with reference to FIGS. 5, 12, and 13. The data restoration unit 140 may select the records (RD1) corresponding to the recovery point based on the sequence number of the backup job corresponding to the recovery point and on the revision fields (rev_from, rev_to). In an embodiment, the data restoration unit 140 may select the records that satisfy both of the following conditions: the value of the file creation version field (rev_from), that is, the sequence number of the backup job in which the file was first created, is less than or equal to the sequence number of the backup job corresponding to the desired (input) recovery point ('1' in the example of FIG. 13); and the value of the file deletion version field (rev_to), that is, the sequence number of the backup job at which the file was deleted, is greater than or equal to that sequence number.

In FIG. 13, the records (RD1) with a background color in the tables are the records selected by the data restoration unit 140 as targets to be restored at the recovery point (the full backup point). The modified file D and file E were created after the full backup and are therefore excluded from the restoration targets. Using the chunk group ID (chunk_group_id) of the records (RD1) selected from the synthetic file metadata (syn_file), the data restoration unit 140 finds the data file field (cr_id) and the header number field (h_num) of the corresponding records in the synthetic chunk metadata (syn_chunk). Based on the data file number, the header number, and the storage location and storage size within the chunk, it then accesses the repository file, reads in order the chunk data at the locations actually stored in the repository file according to the header information, and transmits the chunk data to restore the data.

The process of restoring the data as of backup job No. 2 (the point at which the incremental backup was performed after the full backup) is described with reference to FIGS. 5, 12, and 14. The data restoration unit 140 may select the records that satisfy both of the following conditions: the value of the file creation version field (rev_from), that is, the sequence number of the backup job in which the file was first created, is less than or equal to the sequence number of the backup job corresponding to the desired (input) recovery point ('2' in the example of FIG. 14); and the value of the file deletion version field (rev_to), that is, the sequence number of the backup job at which the file was deleted, is greater than or equal to that sequence number.

In FIG. 14, the records (RD2) with a background color in the tables are the records selected by the data restoration unit 140 as targets to be restored at the recovery point (backup job No. 2). File B, which was deleted before the incremental backup job, and the pre-modification version of file D, which was modified before the incremental backup job, are excluded from the data to be recovered.
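The record selection described for FIGS. 13 and 14 reduces to a range check on the revision fields. The sketch below is illustrative only: the table contents are a hypothetical reduction of the example (an unchanged "file A" is added for contrast), the dictionary layout is an assumption, and `REV_MAX` stands in for the "maximum value" of the file deletion version field.

```python
REV_MAX = 2**31 - 1  # assumed sentinel for a file that has not been deleted

# Hypothetical syn_file contents after one full backup (job 1) and one
# incremental backup (job 2): file B deleted, file D modified, file E created.
syn_file = [
    {"name": "file A", "rev_from": 1, "rev_to": REV_MAX},
    {"name": "file B", "rev_from": 1, "rev_to": 1},
    {"name": "file D", "rev_from": 1, "rev_to": 1},        # before modification
    {"name": "file D", "rev_from": 2, "rev_to": REV_MAX},  # after modification
    {"name": "file E", "rev_from": 2, "rev_to": REV_MAX},
]


def select_records(records, job_no):
    """Keep records whose valid range [rev_from, rev_to] contains job_no."""
    return [r for r in records if r["rev_from"] <= job_no <= r["rev_to"]]


for job in (1, 2):
    names = [(r["name"], r["rev_from"]) for r in select_records(syn_file, job)]
    print(job, names)
```

With this table, recovery point 1 selects file A, file B, and the original file D, while recovery point 2 selects file A, the modified file D, and file E, matching the exclusions described above.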

Using the chunk group ID (chunk_group_id) of the records (RD2) selected from the synthetic file metadata (syn_file), the data restoration unit 140 finds the data file field (cr_id) and the header number field (h_num) of the corresponding records in the synthetic chunk metadata (syn_chunk). Based on the data file number, the header number, and the storage location and storage size within the chunk, it then accesses the repository file, reads in order the chunk data at the locations actually stored in the repository file according to the header information, and transmits the chunk data to restore the data.

File D is a file that was backed up with part of its data changed when the incremental backup was performed; only the changed data was transmitted during the incremental backup. When it is file D's turn to be restored, the data is referenced as follows. First, the chunk group ID of the chunks that make up file D (No. 9 in the example of FIG. 14) is identified through the chunk group ID field (chunk_group_id) of the synthetic metadata, and the chunks belonging to that chunk group ID are then restored one by one.

The modified file D consists of five chunks in total. Except for the third chunk, which corresponds to header No. 8, the remaining four chunks were all created in the full backup job. The first of the five chunks (shown in blue), corresponding to header No. 4, has not changed since it was created in the full backup job and is therefore restored as is.

The second chunk of file D (shown in green), corresponding to header No. 5, holds the data that precedes the changed portion of the changed chunk. This data has not changed since it was created in the full backup job, so the data identified by the chunk data location and size (frag_offset, frag_size) in the synthetic metadata is read as is, up to the point just before the changed data, and restored at the position in the file (offset) for the length given by the data size (size).

The third chunk of file D (shown in red), corresponding to header No. 8, is the chunk that was changed and backed up in the incremental backup. Referring to the header No. 8 information, the data restoration unit reads frag_size bytes from the location (frag_offset) of the changed data stored in the repository file and restores them at the position in the file (offset) for the length given by the data size (size).

The fourth chunk of file D (shown in cyan), corresponding to header No. 5, is a chunk created in the full backup. Referring to the header No. 5 information, the data restoration unit reads frag_size bytes from the data location (frag_offset) stored in the repository file and restores them, for the length given by the data size (size), at the position (offset) of the portion that follows the change. Finally, the fifth chunk of file D (shown in pink), corresponding to header No. 6, has not changed since it was created in the full backup job and is therefore restored as is.
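The five-step reassembly of file D can be sketched as below. This is a minimal model under stated assumptions, not the embodiment's code: the repository is represented as a hypothetical dictionary keyed by (cr_id, h_num), the payload bytes and sizes are invented for illustration, and frag_size is assumed to equal size (no compression or encoding between the stored fragment and the restored range).

```python
# Hypothetical repository: chunk payloads keyed by (data file number, header number).
repository = {
    (1, 4): b"AAAA",  # first chunk, from the full backup
    (1, 5): b"BBBB",  # second chunk, from the full backup
    (1, 6): b"CCCC",  # third chunk, from the full backup
    (1, 8): b"xy",    # changed bytes only, from the incremental backup
}

# Hypothetical syn_chunk records for the modified file D (one chunk group).
# The middle two bytes of the second original chunk were overwritten with "xy".
file_d_records = [
    {"cr_id": 1, "h_num": 4, "frag_offset": 0, "frag_size": 4, "offset": 0, "size": 4},
    {"cr_id": 1, "h_num": 5, "frag_offset": 0, "frag_size": 1, "offset": 4, "size": 1},
    {"cr_id": 1, "h_num": 8, "frag_offset": 0, "frag_size": 2, "offset": 5, "size": 2},
    {"cr_id": 1, "h_num": 5, "frag_offset": 3, "frag_size": 1, "offset": 7, "size": 1},
    {"cr_id": 1, "h_num": 6, "frag_offset": 0, "frag_size": 4, "offset": 8, "size": 4},
]


def restore_file(chunk_records, repo):
    """Rebuild a file by copying each stored fragment to its offset in the file."""
    total = max(r["offset"] + r["size"] for r in chunk_records)
    out = bytearray(total)
    for r in chunk_records:
        payload = repo[(r["cr_id"], r["h_num"])]
        start = r["frag_offset"]
        fragment = payload[start:start + r["frag_size"]]
        # Assumption: frag_size == size, so the fragment fills the range exactly.
        out[r["offset"]:r["offset"] + r["size"]] = fragment
    return bytes(out)


print(restore_file(file_d_records, repository))
```

Header No. 5 appears twice with different frag_offset values, which is how the unmodified parts before and after the change are reused without storing a second copy of the original chunk.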

Unlike the conventional method, which restores data by copying the full backup data and then repeatedly overwriting it with the incremental backup data, the backup data restoration method according to an embodiment of the present invention does not copy the full backup data. Instead, it generates, at the metadata level, synthetic metadata for implementing virtual backup data (a virtual backup image) and restores the data based on the synthetic metadata.

Therefore, according to an embodiment of the present invention, no storage capacity is needed for copying the full backup data for data recovery. In addition, when data is restored after several incremental backups have followed the full backup, the incremental backup data does not have to be overwritten onto the full backup data sequentially over multiple passes. As a result, the storage capacity and the time required for data recovery can be significantly reduced.

The embodiments described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the apparatuses, methods, and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions.

The processing device may run an operating system and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the description sometimes refers to a single processing device, but a person of ordinary skill in the art will understand that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements.

For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible. The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively.

The software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. The software may be distributed over computer systems connected by a network and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.

The method according to an embodiment may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in the art of computer software.

Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code, such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the embodiments, and vice versa.

Although the embodiments have been described above with reference to a limited set of embodiments and drawings, a person of ordinary skill in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents. Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

100: backup data synthesis apparatus
120: virtual backup data generation unit
140: data restoration unit
160: data store

Claims

1. A backup data synthesis apparatus that, based on incremental backup, synthesizes and restores data at a recovery point from backup data including full backup data generated by performing a full backup of backup target data and incremental backup data generated by incrementally backing up from the full backup data, the apparatus comprising:
a virtual backup data generation unit that synthesizes incremental backup metadata related to the incremental backup data with full backup metadata related to the full backup data to generate synthetic metadata for implementing virtual backup data; and
a data restoration unit that extracts and restores the data at the recovery point from the full backup data and the incremental backup data based on the virtual backup data implemented by the synthetic metadata,
wherein, when a plurality of pieces of incremental backup metadata are sequentially generated by incremental backups, the virtual backup data generation unit generates the synthetic metadata by synthesizing the plurality of pieces of incremental backup metadata with the full backup metadata in a metadata area.

2. The apparatus of claim 1,
wherein the data restoration unit restores the data by extracting chunks corresponding to the recovery point from a data file based on the synthetic metadata generated in the metadata area, without copying the full backup data for data restoration.

3. The apparatus of claim 1,
wherein the synthetic metadata includes a revision field for recording, for each file, the range of backup jobs in which the file is valid, and
the virtual backup data generation unit is configured to change a field value of the revision field according to a change type of the file.

4. The apparatus of claim 3,
wherein the revision field includes:
a file creation version field indicating in which backup job the file was created; and
a file deletion version field indicating up to which backup job the file is valid.

5. A backup data synthesis apparatus for synthesizing and restoring data at a recovery point from backup data including full backup data generated by performing a full backup of backup target data and incremental backup data generated by incrementally backing up from the full backup data, the apparatus comprising:
a virtual backup data generation unit that synthesizes incremental backup metadata related to the incremental backup data with full backup metadata related to the full backup data to generate synthetic metadata for implementing virtual backup data; and
a data restoration unit that extracts and restores the data at the recovery point from the full backup data and the incremental backup data based on the virtual backup data implemented by the synthetic metadata,
wherein the data restoration unit restores the data by extracting chunks corresponding to the recovery point from a data file based on the synthetic metadata generated in a metadata area, without copying the full backup data for data restoration,
the synthetic metadata includes a revision field for recording, for each file, the range of backup jobs in which the file is valid,
the virtual backup data generation unit is configured to change a field value of the revision field according to a change type of the file,
the revision field includes:
a file creation version field indicating in which backup job the file was created; and
a file deletion version field indicating up to which backup job the file is valid, and
the virtual backup data generation unit is configured to:
determine to which of a plurality of change types, including file creation, file deletion, and file modification, the change type of the file corresponds;
when the change type corresponds to the file deletion, set the file deletion version field to a value corresponding to a previous backup job, which is the most recent of the earlier backup jobs;
when the change type corresponds to the file creation, set the file creation version field to a value corresponding to a current backup job and set the file deletion version field to a maximum value; and
when the change type corresponds to the file modification, change the file deletion version field of the file before modification from the maximum value to the value corresponding to the previous backup job.

6. The apparatus of claim 5,
wherein the virtual backup data generation unit is configured to,
when the change type corresponds to the file modification, add a record for the modified file, set the file creation version field for the modified file to the value corresponding to the current backup job, and set the file deletion version field to the maximum value.

7. The apparatus of claim 5,
wherein the synthetic metadata includes a chunk group ID field for assigning a chunk group ID to the group of chunks constituting the file, and a header number field for recording a header number corresponding to each chunk, and
the virtual backup data generation unit is configured to:
add a new record to the synthetic metadata by writing a new field value in the chunk group ID field when the change type of the file corresponds to the file creation or the file modification; and
when part of the chunk data of the file is changed, leave the header number field unchanged for the unmodified data among the chunk data, create a new chunk for the changed data among the chunk data, and record a new header number in the header number field.

8. The apparatus of claim 7,
wherein the data restoration unit is configured to:
extract records corresponding to the recovery point from the synthetic metadata based on the revision field; and
read data corresponding to the extracted records from a data file and restore the data.

9. The apparatus of claim 8,
wherein the data restoration unit is configured to
extract, from the synthetic metadata, records satisfying a condition in which the backup job sequence number corresponding to the recovery point is greater than or equal to the field value of the file creation version field and less than or equal to the field value of the file deletion version field.

10. The apparatus of claim 8,
wherein the data restoration unit is configured to
refer to the header numbers corresponding to the chunk group ID of the records corresponding to the recovery point in the synthetic metadata, select the header of each chunk according to its header number, and restore the data from the data file.

11. A backup data synthesis method that, based on incremental backup, synthesizes and restores data at a recovery point from backup data including full backup data generated by performing a full backup of backup target data and incremental backup data generated by incrementally backing up from the full backup data, the method comprising:
generating, by a virtual backup data generation unit, synthetic metadata for implementing virtual backup data by synthesizing incremental backup metadata related to the incremental backup data with full backup metadata related to the full backup data; and
extracting and restoring, by a data restoration unit, the data at the recovery point from the full backup data and the incremental backup data based on the virtual backup data implemented by the synthetic metadata,
wherein the generating of the synthetic metadata includes,
when a plurality of pieces of incremental backup metadata are sequentially generated by incremental backups, generating, by the virtual backup data generation unit, the synthetic metadata by synthesizing the plurality of pieces of incremental backup metadata with the full backup metadata in a metadata area.

12. The method of claim 11,
wherein the restoring includes
restoring the data by extracting chunks corresponding to the recovery point from a data file based on the synthetic metadata generated in the metadata area, without copying the full backup data for data restoration.

13. The method of claim 11,
wherein the synthetic metadata includes a revision field for recording, for each file, the range of backup jobs in which the file is valid, and
the generating of the synthetic metadata includes changing a field value of the revision field according to a change type of the file.

14. The method of claim 13,
wherein the revision field includes:
a file creation version field indicating in which backup job the file was created; and
a file deletion version field indicating up to which backup job the file is valid.

15. A backup data synthesis method for synthesizing and restoring data at a recovery point from backup data including full backup data generated by performing a full backup of backup target data and incremental backup data generated by incrementally backing up from the full backup data, the method comprising:
generating, by a virtual backup data generation unit, synthetic metadata for implementing virtual backup data by synthesizing incremental backup metadata related to the incremental backup data with full backup metadata related to the full backup data; and
extracting and restoring, by a data restoration unit, the data at the recovery point from the full backup data and the incremental backup data based on the virtual backup data implemented by the synthetic metadata,
wherein the restoring includes
restoring the data by extracting chunks corresponding to the recovery point from a data file based on the synthetic metadata generated in a metadata area, without copying the full backup data for data restoration,
the synthetic metadata includes a revision field for recording, for each file, the range of backup jobs in which the file is valid,
the generating of the synthetic metadata includes changing a field value of the revision field according to a change type of the file,
the revision field includes:
a file creation version field indicating in which backup job the file was created; and
a file deletion version field indicating up to which backup job the file is valid, and
the generating of the synthetic metadata includes:
determining to which of a plurality of change types, including file creation, file deletion, and file modification, the change type of the file corresponds;
when the change type corresponds to the file deletion, setting the file deletion version field to a value corresponding to a previous backup job, which is the most recent of the earlier backup jobs;
when the change type corresponds to the file creation, setting the file creation version field to a value corresponding to a current backup job and setting the file deletion version field to a maximum value; and
when the change type corresponds to the file modification, changing the file deletion version field of the file before modification from the maximum value to the value corresponding to the previous backup job.

16. The method of claim 15,
wherein the generating of the synthetic metadata includes,
when the change type corresponds to the file modification, adding a record for the modified file, setting the file creation version field for the modified file to the value corresponding to the current backup job, and setting the file deletion version field to the maximum value.

17. The method of claim 15,
wherein the synthetic metadata includes a chunk group ID field for assigning a chunk group ID to the group of chunks constituting the file, and a header number field for recording a header number corresponding to each chunk, and
the generating of the synthetic metadata includes:
adding a new record to the synthetic metadata by writing a new field value in the chunk group ID field when the change type of the file corresponds to the file creation or the file modification; and
when part of the chunk data of the file is changed, leaving the header number field unchanged for the unmodified data among the chunk data, creating a new chunk for the changed data among the chunk data, and recording a new header number in the header number field.

18. The method of claim 17,
wherein the restoring includes:
extracting records corresponding to the recovery point from the synthetic metadata based on the revision field; and
reading data corresponding to the extracted records from a data file and restoring the data.

19. The method of claim 18,
wherein the extracting of the records includes
extracting records satisfying a condition in which the backup job sequence number corresponding to the recovery point is greater than or equal to the field value of the file creation version field and less than or equal to the field value of the file deletion version field.

20. A computer-readable recording medium on which a program for executing the backup data synthesis method according to any one of claims 11 to 19 is recorded.