KR101252375B1

KR101252375B1 - Mapping management system and method for enhancing performance of deduplication in storage apparatus

Info

Publication number: KR101252375B1
Application number: KR1020100136766A
Authority: KR
Inventors: 강수용; 원유집; 차재혁; 최종무; 윤성로; 김종화
Original assignee: 한양대학교 산학협력단
Priority date: 2010-12-28
Filing date: 2010-12-28
Publication date: 2013-04-08
Also published as: KR20120074817A

Abstract

Disclosed are a mapping management method and system for effectively removing duplicates in a storage device. The mapping management system may include: a deduplication unit that, when multiple pieces of data are duplicates, uses a mapping table to map at least one reference block corresponding to the data to the original block in which the data is stored; and a data change unit that, when a data change request modifies the data of the original block, stores the data changed according to the request in one of the reference blocks and sets the original block to refer to the reference block storing the changed data.

Description

MAPPING MANAGEMENT SYSTEM AND METHOD FOR ENHANCING PERFORMANCE OF DEDUPLICATION IN STORAGE APPARATUS

The present invention relates to a mapping management method and system for effectively removing duplicates in a storage device, and more particularly to a system and method that reduce the number of operations required when the data of a deduplicated block is changed.

Deduplication is a technique that reduces the number of write operations incurred by storing data. When data identical to the new data already exists among the stored data, the new data is not written to the storage medium. Instead, the mapping table stores mapping information indicating that the block in which the new data would have been stored refers to the original block that already holds that data.

However, with conventional deduplication, changing the data stored in an original block requires copying the data already stored in the original block to one of the blocks that referred to it and then changing the data of the original block. The mapping table must also be updated so that the block holding the copy becomes the new original block, which is inconvenient.

Accordingly, there is a need for a method of simply changing the data stored in an original block under deduplication.

The present invention provides a system and method that, when the data of an original block referenced by multiple reference blocks is changed, store the changed data in one of the reference blocks and have the original block refer to that reference block, thereby reducing the copy and change operations performed when the data of the original block is changed.

The present invention also provides a system and method that, when multiple reference blocks refer to one original block, reduce the size of the mapping table by using a single entry in the mapping table to record that the multiple reference blocks refer to the original block.

A mapping management system according to an embodiment of the present invention may include: a deduplication unit that, when multiple pieces of data are duplicates, uses a mapping table to map at least one reference block corresponding to the data to the original block in which the data is stored; and a data change unit that, when a data change request modifies the data of the original block, stores the data changed according to the request in one of the reference blocks and has the original block refer to the reference block storing the changed data.

When multiple reference blocks refer to one original block, the deduplication unit of the mapping management system according to an embodiment of the present invention may use a single entry in the mapping table to set the multiple reference blocks to refer to the one original block.

The mapping management system according to an embodiment of the present invention may further include a mapping table cleanup unit that, based on a chain block, which is an original block that refers to one of the reference blocks, changes the data of the reference block and the chain block according to what each refers to, and deletes the information about the chain block from the mapping table.

A mapping management method according to an embodiment of the present invention may include: when multiple pieces of data are duplicates, using a mapping table to map at least one reference block corresponding to the data to the original block in which the data is stored; when a data change request modifies the data of the original block, storing the data changed according to the request in one of the reference blocks; and setting the original block to refer to the reference block storing the changed data.

According to an embodiment of the present invention, when the data of an original block referenced by multiple reference blocks is changed, the changed data is stored in one of the reference blocks and the original block is made to refer to that reference block, which reduces the copy and change operations performed when the data of the original block is changed.

Further, according to an embodiment of the present invention, when multiple reference blocks refer to one original block, the size of the mapping table can be reduced by using a single entry in the mapping table to record that the multiple reference blocks refer to the original block.

FIG. 1 is a block diagram illustrating a mapping management system according to an embodiment of the present invention.
FIG. 2 is an example of a mapping table generated by the mapping table management unit according to an embodiment of the present invention.
FIG. 3 is an example of a process in which the data change unit handles a request to change the data of an original block according to an embodiment of the present invention.
FIG. 4 is an example of a process in which the mapping table cleanup unit cleans up duplicated mapping information in the mapping table according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating a mapping management method according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. The mapping management method according to an embodiment of the present invention may be performed by the mapping management system.

FIG. 1 is a block diagram illustrating a mapping management system according to an embodiment of the present invention.

Referring to FIG. 1, the mapping management system 100 according to an embodiment of the present invention may include a deduplication unit 110, a change data determination unit 120, a data change unit 130, and a mapping table cleanup unit 140.

When multiple pieces of data are duplicates, the deduplication unit 110 may use a mapping table to map at least one reference block corresponding to the data to the original block in which the data is stored.

The deduplication unit 110 may include a duplicate determination unit 111 that determines whether data is duplicated and a mapping table management unit 112 that creates and manages the mapping table.

The duplicate determination unit 111 may use a fingerprint method to determine whether new data requested to be stored duplicates existing data already stored in block units. Specifically, the duplicate determination unit 111 may compute a hash value of the existing data and a hash value of the new data and, when the two hash values are identical, determine that the new data duplicates the existing data. The duplicate determination unit 111 may compute the hash values using SHA1 (Secure Hash Algorithm 1) or MD5 (Message-Digest algorithm 5).
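The fingerprint comparison above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `DuplicateDetector` class and its in-memory dictionary index are hypothetical names introduced here, and SHA-1 from Python's standard `hashlib` is used because the text names SHA1 as one option.

```python
import hashlib


def fingerprint(block_data: bytes) -> str:
    """Return the SHA-1 digest of a block's contents as a hex string."""
    return hashlib.sha1(block_data).hexdigest()


class DuplicateDetector:
    """Hypothetical helper: remembers which block first stored each fingerprint."""

    def __init__(self):
        self._first_block_by_fingerprint = {}

    def find_duplicate(self, block_no: int, data: bytes):
        """Return the block that already holds identical data, or None if the data is new."""
        fp = fingerprint(data)
        existing = self._first_block_by_fingerprint.get(fp)
        if existing is None:
            # First time this fingerprint is seen: this block becomes the original.
            self._first_block_by_fingerprint[fp] = block_no
        return existing


detector = DuplicateDetector()
results = [detector.find_duplicate(n, d) for n, d in [(1, b"A"), (2, b"A"), (5, b"E")]]
```

As in the text, two blocks are treated as duplicates whenever their hash values match; the sketch does not compare the block contents byte by byte.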

The mapping table management unit 112 may, instead of storing in the storage device the new data that the duplicate determination unit 111 determined to be a duplicate, store in the mapping table, as mapping information, the fact that it duplicates existing data. The mapping table may set the block in which the existing data is stored as the original block, set the block in which the new data was going to be stored as a reference block, and store the original block and the reference block mapped to each other. In addition, when multiple reference blocks refer to one original block, the mapping table management unit 112 may use a single entry in the mapping table to set the multiple reference blocks to refer to the one original block. The mapping table management unit 112 may also build the mapping table as an inverted table in consideration of mapping table lookup performance.

The process by which the mapping table management unit 112 creates the mapping table is described in detail below with reference to FIG. 2.

When a change request to modify stored data is received, the change data determination unit 120 may check whether the change request modifies the data of a block that is set as an original block in the mapping table.

When the change data determination unit 120 determines that the change request modifies the data of an original block, the data change unit 130 may store the data changed according to the request in one of the reference blocks and set the original block to refer to the reference block storing the changed data.

The process by which the data change unit 130 handles a request to change the data of an original block is described in detail below with reference to FIG. 3.

The mapping table cleanup unit 140 may, based on a chain block, which is an original block that refers to one of the reference blocks, change the data of the reference block and the chain block according to what each refers to, and delete the information about the chain block from the mapping table. Specifically, the mapping table cleanup unit 140 may store the data of the chain block in a temporary block, store in the chain block the data of the reference block that the chain block referred to, store the data of the temporary block in that reference block, and delete from the mapping table the information that the chain block refers to the reference block.

The process by which the mapping table cleanup unit 140 cleans up duplicated mapping information in the mapping table is described in detail below with reference to FIG. 4.

Searching for and cleaning up duplicated mapping information involves searching and processing a large amount of data, and can therefore place a load on the storage system in which the data is stored. The mapping table cleanup unit 140 may thus be configured to operate when the storage system is idle or when the size of the mapping table exceeds a threshold, which reduces the load on the storage system. The threshold may be the maximum size of the mapping table, set before the mapping table is created. That is, when the mapping table reaches its maximum size, its size can be reduced by cleaning up the duplicated mapping information in it.

FIG. 2 is an example of a mapping table generated by the mapping table management unit according to an embodiment of the present invention.

In a storage system using five storage blocks 210, when the same data A is to be stored in blocks 1 to 4, the mapping table management unit 112 according to an embodiment of the present invention may, as shown in FIG. 2, store no data in blocks 2 to 4 and have them refer to block 1. Specifically, the mapping table management unit 112 may store in the mapping table 220 mapping information that sets blocks 2, 3, and 4 as reference blocks and block 1 as the original block they refer to.

If blocks that refer to the same data are stored as separate entries, the size of the mapping table grows. The mapping table management unit 112 may therefore map the blocks that refer to the same original block using a single entry, as shown in FIG. 2 (230).
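The two layouts of FIG. 2 can be contrasted with a small sketch. The function names and the use of a Python list and dictionary are assumptions made for illustration; the patent does not prescribe a concrete data structure beyond mentioning an inverted table.

```python
from collections import defaultdict


def per_pair_table(pairs):
    """Layout in the style of table 220: one (reference, original) entry per reference block."""
    return list(pairs)


def grouped_table(pairs):
    """Layout in the style of table 230: one entry per original block listing all its reference blocks."""
    table = defaultdict(list)
    for reference, original in pairs:
        table[original].append(reference)
    return dict(table)


# Blocks 2, 3, and 4 all refer to block 1, as in the FIG. 2 example.
pairs = [(2, 1), (3, 1), (4, 1)]
flat = per_pair_table(pairs)
grouped = grouped_table(pairs)
```

In the grouped layout the identifier of block 1 is stored once rather than three times, which is the source of the size reduction discussed in the text.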

For example, if 4 bytes are needed to store the identification information of each block stored in the mapping table, the mapping table 220 stores three pairs of mapping information in three entries, each entry holding a reference block and an original block, and therefore requires 4byte*4byte*3 of storage space according to Equation 1.

Here, cn may be the size of the storage space for storing the identification information of a reference block, bn may be the size of the storage space for storing the identification information of an original block, and e may be the number of entries in which mapping information is stored.

In contrast, the mapping table 230 stores in one entry the identification information of the three blocks that refer to the same original block, so the identification information of the original block, which is identical each time, is stored fewer times. That is, the mapping table 230 requires (4byte*3)+4byte of storage space according to Equation 2.

That is, for the same data, by using a mapping table structured like the mapping table 230, the mapping table management unit 112 can save storage space on the order of Equation 3 compared with using the mapping table 220.

Here, k may be the number of reference blocks that refer to the same original block, and n may be the size of the storage space for storing the identification information of a reference block or an original block.
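Equations 1 to 3 are not reproduced in this text, so the formulas in the following sketch are assumptions rather than the patent's own equations. It assumes that each entry of a table-220 style layout costs one reference identifier plus one original identifier, and that a table-230 style entry costs k reference identifiers plus one original identifier, which matches the (4byte*3)+4byte figure given above. The function names are hypothetical.

```python
def per_pair_table_bytes(cn: int, bn: int, e: int) -> int:
    """Assumed size of a table-220 style layout: e entries, each one reference ID plus one original ID."""
    return (cn + bn) * e


def grouped_table_bytes(n: int, k: int) -> int:
    """Size of a table-230 style entry: k reference IDs plus one shared original ID."""
    return n * k + n


# 4-byte identifiers, three reference blocks sharing one original block.
flat = per_pair_table_bytes(cn=4, bn=4, e=3)
grouped = grouped_table_bytes(n=4, k=3)
saved = flat - grouped
```

Under these assumptions the two layouts take 24 and 16 bytes, so the saving is n(k-1) = 8 bytes and grows with the number of reference blocks that share an original block.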

FIG. 3 is an example of a process in which the data change unit handles a request to change the data of an original block according to an embodiment of the present invention.

When a change request to modify stored data is received, the change data determination unit 120 according to an embodiment of the present invention may check, as shown in FIG. 3, whether the change request 311 modifies the data of a block (block 1) that is set as an original block in the mapping table (310).

The change data determination unit 120 according to an embodiment of the present invention may use a cross-reference technique in which multiple blocks are set to refer to the data held in each other's blocks.

Specifically, when the change request modifies the data of the original block (310), the data change unit 130 may select one of blocks 2 to 4, which refer to the original block, block 1 (320). For example, when block 2 (321) is selected as shown in FIG. 3, the data change unit 130 may add to the mapping table information 322 indicating that block 1 refers to block 2. The change data determination unit 120 checks whether any of blocks 2 to 4, which refer to block 1, is set as the original block of another reference block; a block that is set as the original block of another reference block cannot be selected.

Next, the data change unit 130 may store the data 331 according to the change request 311 in block 2 (330). Block 1 and block 2 are then in a cross-reference state, each referring to the data stored in the other.

After that, when another system requests the data of one of blocks 2 to 4, the mapping table management unit 112 may, according to the mapping table, provide data A stored in block 1, which blocks 2 to 4 refer to. When another system requests the data of block 1, the mapping table management unit 112 may, according to the mapping table, provide data B stored in block 2, which block 1 refers to.

The data change unit 130 according to an embodiment of the present invention can reduce the copy and change operations performed when the data of an original block is changed. Specifically, to change the data of block 1, the original block, a conventional mapping table management system must copy data A to one of blocks 2 to 4, update the mapping so that the other blocks refer to the block holding the copy, and then change the data of block 1 to B. This requires one block data copy, two mapping table entry changes, and one block data change.

In contrast, the data change unit 130 according to the present invention needs only one mapping table entry addition (322) and one block data copy (331) to provide data B when the data of block 1 is requested externally and data A when the data of blocks 2 to 4 is requested. Therefore, in the example of FIG. 3, the data change unit 130 according to the present invention can omit one mapping table entry change and one block data change compared with the conventional mapping table management system.
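The cross-reference update and the resulting read behaviour described for FIG. 3 can be sketched as follows. This is an illustrative model only: the `physical` dictionary (block number to stored data), the flat `mapping` dictionary (referring block to referenced block), and both function names are assumptions, and choosing the first eligible reference block is an arbitrary policy not specified in the text.

```python
def read_block(block_no, physical, mapping):
    """Follow one level of indirection: a mapped block is read from the block it refers to."""
    target = mapping.get(block_no, block_no)
    return physical[target]


def write_original(original, new_data, physical, mapping):
    """Cross-reference update of an original block that other blocks still refer to."""
    referrers = [ref for ref, tgt in mapping.items() if tgt == original]
    blocks_used_as_original = set(mapping.values())
    # A reference block that is itself the original block of another block cannot be selected.
    candidates = [ref for ref in referrers if ref not in blocks_used_as_original]
    if not candidates:
        raise RuntimeError("no selectable reference block")
    chosen = candidates[0]
    physical[chosen] = new_data   # one block write; the old data in `original` is untouched
    mapping[original] = chosen    # one mapping entry added (for example 1:2)
    return chosen


physical = {1: "A", 5: "E"}
mapping = {2: 1, 3: 1, 4: 1}
chosen = write_original(1, "B", physical, mapping)
reads = {n: read_block(n, physical, mapping) for n in (1, 2, 3, 4, 5)}
```

After the update, block 1 physically still holds A while block 2 holds B, and the two mapping entries 1:2 and 2:1 form the cross-reference state described in the text.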

FIG. 4 is an example of a process in which the mapping table cleanup unit cleans up duplicated mapping information in the mapping table according to an embodiment of the present invention.

The method by which the data change unit 130 changes block data reduces the number of operations, as shown in FIG. 3, but increases the number of entries in the mapping table. The mapping table cleanup unit 140 may therefore clean up the mapping table by performing a chain cleaning technique based on chain information, which describes the relationships in which the original block of one block refers to another block, and thereby reduce the number of entries added by the data change unit 130.

First, the mapping table cleanup unit 140 may search for chain information. Specifically, the mapping table cleanup unit 140 may search for blocks that are used as both a reference block and an original block and set them as chain blocks (410). For example, in the mapping table 412, blocks 1, 2, and 3 are found to be used as both reference blocks and original blocks: block 1 refers to block 2, block 2 refers to block 3, and block 3 refers to block 1. That is, as shown in the storage blocks 411, blocks 3 and 4 refer to data A stored in block 1, block 1 refers to data B stored in block 2, and block 2 refers to data C stored in block 3.
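The search for chain blocks can be sketched as a set intersection. The flat `mapping` dictionary (referring block to referenced block) and the function name are assumptions made for illustration.

```python
def find_chain_blocks(mapping):
    """Return the blocks that appear both as a referring block and as a referenced block."""
    referring = set(mapping.keys())
    referenced = set(mapping.values())
    return sorted(referring & referenced)


# Mapping table 412 from the FIG. 4 example: 1:2, 2:3, 3:1, 4:1.
chain_blocks = find_chain_blocks({1: 2, 2: 3, 3: 1, 4: 1})
```

Block 4 is not a chain block in this example because no other block refers to it.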

Next, the mapping table cleanup unit 140 may clean up the chain blocks based on what each chain block refers to.

First, the mapping table cleanup unit 140 may back up data A, which was stored in block 1, to a temporary block and store data B, which block 1 referred to, in block 1. The mapping table cleanup unit 140 may then load data A from the temporary block, store it in block 2, and delete the mapping information 1:2, which indicates that block 1 refers to block 2.

Next, the mapping table cleanup unit 140 may clean up block 2 using the mapping information 2:3, in which block 2, the original block of block 1, is set as a reference block. The mapping table cleanup unit 140 may back up data A, which was stored in block 2 in the previous step, to the temporary block and store data C, which block 2 referred to, in block 2. The mapping table cleanup unit 140 may then load data A from the temporary block, store it in block 3, and delete the mapping information 2:3, which indicates that block 2 refers to block 3.

At this point, the mapping information 3:1, in which block 3, the original block of block 2, is set as a reference block, refers to block 1, which was the first cleanup target, so the chain cleaning technique may stop. Because the location of data A, which block 4 referred to, has moved from block 1 to block 3, the mapping table cleanup unit 140 may change the entry for block 4 to the mapping information 4:3, indicating that block 4 refers to block 3.

Through this process, the mapping table cleanup unit 140 according to the present invention may reduce the number of reference targets among the storage blocks 421 to one and the number of entries in the mapping table 422 to one, as shown in FIG. 4 (420).
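The FIG. 4 walkthrough can be replayed with a short sketch. The `physical` and `mapping` dictionaries and the function name are assumptions made for illustration. Removing the closing entry 3:1 is an inference from the statement that a single entry remains, not a step spelled out in the text.

```python
def chain_clean(start, physical, mapping):
    """Unwind a reference cycle that begins at chain block `start`; return the block now holding its old data."""
    current = start
    while True:
        target = mapping[current]
        if target == start:
            # This entry points back at the first cleaned block, so the cycle is closed.
            # Inferred step: block `current` now physically holds its own data, so drop the entry.
            del mapping[current]
            break
        temp = physical[current]               # back up the chain block's data to a temporary block
        physical[current] = physical[target]   # store the data the chain block referred to
        physical[target] = temp                # move the backed-up data into the referenced block
        del mapping[current]                   # delete the entry current:target
        current = target
    # Blocks that still refer to `start` want the data that has moved to `current`.
    for ref in list(mapping):
        if mapping[ref] == start:
            mapping[ref] = current
    return current


physical = {1: "A", 2: "B", 3: "C"}
mapping = {1: 2, 2: 3, 3: 1, 4: 1}
final_holder = chain_clean(1, physical, mapping)
```

After the run, blocks 1, 2, and 3 physically hold B, C, and A, which is the data each of them previously resolved to through the mapping table, and the only remaining entry is 4:3.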

FIG. 5 is a flowchart illustrating a mapping management method according to an embodiment of the present invention.

In step S510, the duplicate determination unit 111 may determine whether new data duplicates existing data. For example, the duplicate determination unit 111 may use a fingerprint method to determine whether the new data duplicates existing data.

In step S520, the mapping table management unit 112 may, instead of storing in the storage device the new data determined to be a duplicate in step S510, store in the mapping table mapping information indicating that the new data duplicates existing data. The mapping table management unit 112 may set the number of the block in which the existing data is stored as the original block, set the block in which the new data was going to be stored as a reference block, and store the original block and the reference block mapped to each other in the mapping table. The mapping table may have a structure in which, in one entry, multiple reference blocks refer to one original block.

In step S530, when a change request to modify stored data is received, the change data determination unit 120 may check whether the change request modifies the data of a block that is set as an original block in the mapping table.

In step S540, when it is determined in step S530 that the change request modifies the data of an original block, the data change unit 130 may perform the cross-reference technique: it stores the data changed according to the request in one of the reference blocks and sets the original block to refer to the reference block storing the changed data.

In step S550, when it is determined in step S530 that the change request modifies the data of a reference block rather than an original block, the data change unit 130 may change the data of the reference block according to the request. Because the change makes the data of the reference block differ from that of the original block, the mapping information indicating that the reference block refers to the original block may be deleted.
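Step S550 can be sketched as follows, again using a hypothetical `physical` dictionary (block number to stored data) and a flat `mapping` dictionary (referring block to referenced block); the function name is also an assumption.

```python
def write_reference(reference, new_data, physical, mapping):
    """Write new data to a reference block and drop its mapping to the original block."""
    physical[reference] = new_data   # the block now stores its own data
    mapping.pop(reference, None)     # it no longer duplicates the original block


physical = {1: "A"}
mapping = {2: 1, 3: 1, 4: 1}
write_reference(3, "D", physical, mapping)
```

The other reference blocks keep their entries, so blocks 2 and 4 still resolve to data A in block 1.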

In operation S560, the mapping table organizer 140 may determine whether it is necessary to run a chain cleaning technique that cleans up the mapping table information added by the cross-reference technique of operation S540. Specifically, the mapping table organizer 140 may determine that chain cleaning is necessary when the storage system in which the data is stored is idle or when the size of the mapping table exceeds a threshold.
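The decision in operation S560 reduces to a simple predicate, sketched below. The function name, the parameter names, and the use of an entry count as the table size are hypothetical; the source does not say how idleness is detected or what the threshold value is.

```python
def needs_chain_cleaning(is_idle: bool, table_size: int, threshold: int) -> bool:
    """Return True when either trigger condition from operation S560 holds."""
    return is_idle or table_size > threshold
```

For example, `needs_chain_cleaning(False, 1200, 1000)` is true because the table has outgrown the threshold, even though the storage system is busy.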

In operation S570, the mapping table organizer 140 may search the mapping table, enlarged by the cross-reference technique of operation S540, for chain blocks, which are the blocks to be cleaned up. Specifically, the mapping table organizer 140 may identify as a chain block any block that is used as both a reference block and an original block.
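One way to express this search, assuming the mapping is held as a hypothetical dictionary `refers_to` from each referring block to the block it refers to, is to intersect the set of referring blocks with the set of referred blocks. The function name is likewise an assumption.

```python
def find_chain_blocks(refers_to):
    """Return blocks that both refer to a block and are referred to."""
    referring = set(refers_to)          # blocks acting as reference blocks
    referred = set(refers_to.values())  # blocks acting as original blocks
    return sorted(referring & referred)


# State after a cross-reference: 11 and 12 refer to 10, and 10 refers to 11.
after_cross_reference = {11: 10, 12: 10, 10: 11}
chain_blocks = find_chain_blocks(after_cross_reference)
```

With no cross-reference in the table, for example `{11: 10, 12: 10}`, the intersection is empty and there is nothing to clean.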

In operation S580, the mapping table organizer 140 may clean up the chain blocks found in operation S570 based on the target that each chain block refers to. Specifically, the mapping table organizer 140 may store the data of the chain block in a temporary block, store the data of the reference block referred to by the chain block in the chain block, store the data of the temporary block in the reference block referred to by the chain block, and delete from the mapping table the information indicating that the chain block refers to the reference block. The mapping table organizer 140 may then set the reference block that the chain block referred to as a new chain block and apply the chain cleaning technique to it, and may end chain cleaning when a chain block whose original block is the first chain block is found.
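The sketch below covers only a single cleaning step, namely the swap through a temporary block followed by deletion of the mapping entry. The names `disk`, `refers_to`, and `clean_chain_block` are hypothetical, and the traversal to subsequent chain blocks and its termination condition are deliberately not modeled.

```python
def clean_chain_block(disk, refers_to, chain):
    """Swap a chain block with the block it refers to, then drop the entry."""
    ref = refers_to[chain]
    temp = disk[chain]        # 1. chain block data -> temporary block
    disk[chain] = disk[ref]   # 2. referred block data -> chain block
    disk[ref] = temp          # 3. temporary block data -> referred block
    del refers_to[chain]      # 4. delete "chain refers to ref" from the table
    return ref                # candidate for the next chain block


disk = {10: b"A", 11: b"B"}
refers_to = {10: 11}          # chain block 10 refers to block 11
next_chain = clean_chain_block(disk, refers_to, 10)
```

After this step block 10 directly holds the data it previously reached through the mapping, so its entry is no longer needed.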

According to the present invention, when the data of an original block referenced by a plurality of reference blocks is changed, the changed data is stored in one of the reference blocks and the original block is made to refer to that reference block, which reduces the copy and update operations performed when the data of an original block changes.

In addition, when a plurality of reference blocks refer to one original block, the present invention stores the information that the plurality of reference blocks refer to the original block in a single entry of the mapping table, which reduces the size of the mapping table.

Although the present invention has been described above with reference to limited embodiments and drawings, the present invention is not limited to those embodiments, and a person of ordinary skill in the art to which the present invention pertains can make various modifications and variations from this description.

Therefore, the scope of the present invention should not be limited to the described embodiments, but should be determined by the claims below and their equivalents.

110: deduplication unit
120: change data determination unit
130: data change unit
140: mapping table organizer

Claims

A mapping management system comprising:
a deduplication unit configured to, when a plurality of data are duplicated, use a mapping table to map at least one reference block corresponding to the data to an original block in which the data is stored;
a data change unit configured to, when a data change request changes the data of the original block, store the changed data in one of the reference blocks according to the change request and set the original block to refer to the reference block storing the changed data; and
a mapping table organizer configured to, based on a chain block that is an original block referring to one of the reference blocks, change the data of the chain block and of the reference block according to the reference target, and delete information on the chain block from the mapping table.

The system of claim 1, wherein the deduplication unit, when a plurality of reference blocks refer to one original block, sets the plurality of reference blocks to refer to the one original block using a single entry in the mapping table.

(deleted)

The system of claim 1, wherein the mapping table organizer stores the data of the chain block in a temporary block, stores the data of the reference block referred to by the chain block in the chain block, stores the data of the temporary block in the reference block referred to by the chain block, and deletes from the mapping table the information indicating that the chain block refers to the reference block.

The system of claim 1, wherein the mapping table organizer operates when the storage system in which the data is stored is in an idle state.

The system of claim 1, wherein the mapping table organizer operates when the size of the mapping table exceeds a threshold.

A mapping management method comprising:
mapping, when a plurality of data are duplicated, at least one reference block corresponding to the data to an original block in which the data is stored, using a mapping table;
storing, when a data change request changes the data of the original block, the changed data in one of the reference blocks according to the change request;
setting the original block to refer to the reference block storing the changed data; and
based on a chain block that is an original block referring to one of the reference blocks, changing the data of the chain block and of the reference block according to the reference target, and deleting information on the chain block from the mapping table.

The method of claim 7, wherein the mapping comprises, when a plurality of reference blocks refer to one original block, setting the plurality of reference blocks to refer to the one original block using a single entry in the mapping table.

(deleted)

The method of claim 7, wherein the deleting of the information on the chain block comprises:
storing the data of the chain block in a temporary block;
storing the data of the reference block referred to by the chain block in the chain block;
storing the data of the temporary block in the reference block referred to by the chain block; and
deleting, from the mapping table, the information indicating that the chain block refers to the reference block.

The method of claim 7, wherein the deleting of the information on the chain block is executed when the storage system in which the data is stored is in an idle state.

The method of claim 7, wherein the deleting of the information on the chain block is executed when the size of the mapping table exceeds a threshold.