KR20200098971A

KR20200098971A - Method and apparatus for storing data based on single-level

Info

Publication number: KR20200098971A
Application number: KR1020190016770A
Authority: KR
Inventors: 최영리; Kaiyrakhmetolzhas
Original assignee: 울산과학기술원
Priority date: 2019-02-13
Filing date: 2019-02-13
Publication date: 2020-08-21
Also published as: KR102233880B1

Abstract

The present invention relates to a single-level based data storage device and a method thereof for providing higher performance and efficiency for reading and writing data. According to an embodiment of the present invention, the method of the single-level based data storage device comprises the following steps of: storing data in a first memory table included in a nonvolatile memory; storing at least a part of the data stored in the first memory table in a second memory table included in the nonvolatile memory, when the capacity of the data stored in the first memory table is greater than or equal to a predetermined value; identifying the data stored in the second memory table and performing a flush operation; and storing the data stored in the second memory table in a single-level on a disk separate from the nonvolatile memory based on the flush operation.

Description

Single-level based data storage device and method {METHOD AND APPARATUS FOR STORING DATA BASED ON SINGLE-LEVEL}

The present invention relates to a data storage device and method that use a nonvolatile memory and store data in a single level.

Recently, with growing demand for big data and data analysis, there is a need for systems with an index function that is sorted by key values indicating the location of the file in which data is stored, so that desired information can be found in a database. The index function uses an efficient structure, such as a B-tree or an LSM tree, so that the data it covers can be searched easily.

An index function using a B-tree keeps data sorted by key and, when additional data is inserted or deleted, updates the sorted data and writes it again. That is, on each insertion or deletion, the contents of the target data area are modified and then rewritten. These rewrites occur as random writes, which make it difficult for an SSD to reach its maximum performance and increase the garbage collection load.

An index function using an LSM tree (log-structured merge tree) also keeps data sorted by key, but when additional data is inserted or deleted, it records the change independently as newly sorted data. That is, the data for each additional operation is sorted and stored separately. Because the resulting I/O (input/output) consists of sequential writes, maximum write performance is achieved. However, as the number of independently sorted data sets grows, a lookup must access every separate data set, so read performance degrades.

Korean Patent Publication No. 10-2014-0070834 (published June 11, 2014)

The problem to be solved by the present invention is to provide a data storage method and apparatus that offer higher performance and efficiency for reading and writing data.

However, the problems to be solved by the present invention are not limited to those mentioned above, and may include objects that are not mentioned but can be clearly understood from the following description by those of ordinary skill in the art to which the present invention pertains.

A single-level based data storage method according to an embodiment of the present invention includes: storing data in a first memory table included in a nonvolatile memory; when the capacity of the data stored in the first memory table is greater than or equal to a predetermined value, storing at least some of the data stored in the first memory table in a second memory table included in the nonvolatile memory; identifying the data stored in the second memory table and performing a flush operation; and, based on the flush operation, storing the data stored in the second memory table in a single level on a disk separate from the nonvolatile memory.

In addition, the storing in a single level may include storing the data stored in the second memory table in a single level based on the flush operation, such that the data is retrieved based on a B-tree data structure.

In addition, the method may include performing compaction on at least some of the data stored in the single level.

In addition, the performing of the compaction may include deleting pre-update data from among the data stored in the single level, and sorting the remaining data stored in the single level based on key values included in the remaining data.

In addition, the performing of the compaction may include: identifying, for at least some of the data stored in the single level, the ratio between data stored before an update time and data stored after the update time; when the identified ratio is lower than a predetermined value, selecting at least some of the data stored before the update time as compaction candidates; and deleting at least some of the selected compaction candidates.
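The ratio-based selection described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the invention: the text does not define the ratio precisely, so the sketch assumes it is the fraction of a file's entries that are still current, that `index` maps each key to the file holding its latest version, and that a file qualifies when this fraction falls below `threshold`. All names are hypothetical.

```python
def live_ratio(file_name, pairs, index):
    """Fraction of (key, value) pairs in a file that the index still points to."""
    if not pairs:
        return 0.0
    live = sum(1 for key, _ in pairs if index.get(key) == file_name)
    return live / len(pairs)


def select_by_live_ratio(files, index, threshold):
    """Names of files whose live ratio is below the threshold (assumed rule)."""
    return sorted(
        name for name, pairs in files.items()
        if live_ratio(name, pairs, index) < threshold
    )


# Hypothetical data: key 5 was rewritten into file "Y", so "X" holds a stale copy.
files = {"X": [(5, 100), (6, 60)], "Y": [(5, 200), (7, 70)]}
index = {5: "Y", 6: "X", 7: "Y"}
candidates = select_by_live_ratio(files, index, threshold=0.75)
```

Here half of the entries in file X are stale, so X becomes a candidate, while Y, whose entries are all current, does not.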

In addition, the performing of the compaction may include: scanning a predetermined number of the nodes of the B-tree; and, when the data related to a scanned node, among the data stored in the single level, is distributed across a predetermined number or more of data table files, selecting those data table files as the compaction candidates.

In addition, the performing of the compaction may include: calculating the number of data table files required to read the data stored in the single level; and selecting, as the compaction candidates, the data table files associated with the data for which the calculated number of data table files is largest.
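The two file-count heuristics above can be sketched as follows. This is a rough illustration under assumptions that the text does not state: `nodes` maps a hypothetical B-tree node id to the keys it covers, `index` maps each key to the data table file holding its current value, and `min_files` is a tunable threshold.

```python
def files_per_node(nodes, index):
    """Map each node id to the set of files that hold its keys."""
    return {node: {index[key] for key in keys} for node, keys in nodes.items()}


def scattered_candidates(nodes, index, min_files):
    """Files touched by any node whose keys are spread over >= min_files files."""
    selected = set()
    for touched in files_per_node(nodes, index).values():
        if len(touched) >= min_files:
            selected |= touched
    return sorted(selected)


def worst_node_candidates(nodes, index):
    """Files touched by the node that needs the most files to be read."""
    per_node = files_per_node(nodes, index)
    worst = max(per_node, key=lambda node: len(per_node[node]))
    return sorted(per_node[worst])


# Hypothetical layout: leaf0 is spread over three files, leaf1 over two, leaf2 over one.
nodes = {"leaf0": [1, 3, 4], "leaf1": [5, 10], "leaf2": [20]}
index = {1: "F1", 3: "F2", 4: "F3", 5: "F1", 10: "F4", 20: "F4"}
```

With this layout, `scattered_candidates(nodes, index, 2)` selects the files of both leaf0 and leaf1, while `worst_node_candidates(nodes, index)` selects only the three files needed to read leaf0.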

A single-level based data storage device according to an embodiment of the present invention includes: a first memory table that stores input data; a second memory table that, when the capacity of the data stored in the first memory table is greater than or equal to a predetermined value, stores at least some of the data stored in the first memory table; a flush execution unit that identifies the state of the data stored in the second memory table and performs a flush operation; and a single-level storage unit that, based on the flush operation, stores at least some of the data stored in the second memory table in a single level on a disk separate from a nonvolatile memory, wherein the first memory table and the second memory table are stored in the nonvolatile memory.

In addition, the single-level storage unit may store the data stored in the second memory table in a single level based on the flush operation, such that the data is retrieved based on a B-tree data structure.

In addition, the device may include a compaction performing unit that performs compaction on at least some of the data stored in the single level.

In addition, the compaction performing unit may delete pre-update data from among the data stored in the single level, and sort the remaining data stored in the single level based on key values included in the remaining data.

In addition, the compaction performing unit may identify, for at least some of the data stored in the single level, the ratio between data stored before an update time and data stored after the update time; when the identified ratio is lower than a predetermined value, select at least some of the data stored before the update time as compaction candidates; and delete at least some of the selected compaction candidates.

In addition, the compaction performing unit may scan a predetermined number of the nodes of the B-tree and, when the data related to a scanned node, among the data stored in the single level, is distributed across a predetermined number or more of data table files, select those data table files as the compaction candidates.

In addition, the compaction performing unit may calculate the number of data table files required to read the data stored in the single level, and select, as the compaction candidates, the data table files associated with the data for which the calculated number of data table files is largest.

The single-level based data storage device and method according to an embodiment of the present invention can provide higher performance and efficiency for reading and writing data.

However, the effects obtainable from the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the following description by those of ordinary skill in the art to which the present disclosure pertains.

FIG. 1 is a conceptual diagram of a single-level data storage method according to an embodiment of the present invention.
FIG. 2 shows an example of the functional configuration of a single-level data storage device according to an embodiment of the present invention.
FIG. 3 shows the flow of each step of a single-level data storage method according to an embodiment of the present invention.
FIG. 4 conceptually shows the flow of the read step of a single-level data storage method according to an embodiment of the present invention.
FIG. 5 conceptually shows the flow of the write step of a single-level data storage method according to an embodiment of the present invention.

The advantages and features of the present invention, and methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various forms. These embodiments are provided only to make the disclosure of the present invention complete and to fully convey the scope of the invention to those of ordinary skill in the art to which the present invention pertains, and the scope of the present invention is defined only by the claims.

In describing the embodiments of the present invention, detailed descriptions of well-known functions or configurations are omitted unless actually necessary. The terms used below are defined in consideration of their functions in the embodiments of the present invention and may vary according to the intention or custom of users or operators. Their definitions should therefore be based on the content of this specification as a whole.

Since the present invention can be modified in various ways and can include various embodiments, specific embodiments are illustrated in the drawings and described in the detailed description. However, this is not intended to limit the present invention to a specific embodiment, and it should be understood to include all changes, equivalents, and substitutes within the spirit and technical scope of the present invention.

Terms including ordinal numbers, such as first and second, may be used to describe various components, but the components are not limited by these terms. The terms are used only to distinguish one component from another.

When a component is described as being "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but it should be understood that other components may exist in between.

FIG. 1 is a conceptual diagram of a single-level data storage method according to an embodiment of the present invention. This single-level data storage method may be performed by the single-level data storage device 100.

Referring to FIG. 1, the single-level data storage device 100 may include a nonvolatile memory 1 and a disk 2. The nonvolatile memory 1 may store a first memory table 11, a second memory table 13, a compaction log 15, and a B-tree index 17. The nonvolatile memory 1 may be NVRAM, and the disk 2 may be a type of storage distinct from the nonvolatile memory 1. For example, the disk 2 may be a hard disk or an SSD.

When data is input, it may first be stored in the first memory table 11. The first memory table 11 serves as an update buffer and may be, for example, a MemTable. The first memory table 11 may be a kind of skip list data structure. A skip list is a data storage scheme that links progressively sparser sequences of data and arranges them in a hierarchy. The first memory table 11 is persistent, so logging can be avoided.

All input data may be applied to the database through the first memory table 11.

When the amount of data in the first memory table 11 is greater than or equal to a predetermined value, at least some of the data stored in the first memory table 11 may be stored in the second memory table 13. For example, the capacity of the first memory table 11 may be predetermined, and when that capacity is full, the information stored in the first memory table 11 may be stored in the second memory table 13 and a flush may be scheduled for the second memory table 13. Here, flushing may mean storing the information held in the second memory table 13 in an SSTable, the on-disk version of a memtable. As this is readily understood by those skilled in the art, a detailed description is omitted.

The second memory table 13 may perform a function similar to that of the first memory table 11. However, the second memory table 13 is immutable, so it may support only read operations and not update or write operations. When a flush is performed by the flush execution unit 20 and the data that filled the second memory table 13 is deleted, a new memtable may be created.

More specifically, when the first memory table 11 becomes full, the information in the first memory table 11 is stored in the second memory table 13 so that it can be flushed, and new information can then be input to the first memory table 11. Alternatively, when the first memory table 11 becomes full, the first memory table 11 itself may be converted into the second memory table 13 to be flushed, and a new first memory table 11 may be created.
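The rotation between the two memory tables can be sketched as follows. This is a minimal in-memory illustration, not the implementation of the invention: plain dictionaries stand in for the skip-list memtables in nonvolatile memory, a list of sorted key-value lists stands in for the SSTables on disk, capacity is counted in entries, and the class and method names are hypothetical. The sketch follows the second variant, in which the full first table itself becomes the second table, and it models waiting for a pending flush as a synchronous flush.

```python
class WriteBuffer:
    def __init__(self, capacity):
        self.capacity = capacity   # assumed threshold, counted in entries
        self.active = {}           # first memory table (mutable)
        self.immutable = None      # second memory table (read-only), if any
        self.sstables = []         # single level of flushed files on "disk"

    def put(self, key, value):
        self.active[key] = value
        if len(self.active) >= self.capacity:
            self._rotate()

    def _rotate(self):
        # A pending second table must be flushed before it can be replaced;
        # here that wait is modeled as a synchronous flush.
        if self.immutable is not None:
            self.flush()
        self.immutable = self.active   # the full table becomes immutable
        self.active = {}               # a new first memory table is created

    def flush(self):
        if self.immutable is None:
            return
        # Write the second table out as one sorted file, then drop it.
        self.sstables.append(sorted(self.immutable.items()))
        self.immutable = None


buf = WriteBuffer(capacity=2)
for key, value in [(5, 100), (1, 10), (3, 30), (2, 20)]:
    buf.put(key, value)
```

After these four writes, the first two entries have been flushed as one sorted file, and the last two sit in the immutable table awaiting their own flush.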

If the second memory table 13 cannot be flushed, the input of new information, that is, the creation of a new first memory table 11 or the input of new information to the first memory table 11, may wait until the flush is performed.

When a flush is performed by the flush execution unit 20, at least some of the data stored in the second memory table 13 may be transferred from the second memory table 13 to the disk 2.

The disk 2 may contain SSTables as files. When a flush is performed, the transferred data (for example, key-value pairs, hereinafter KV pairs) may be stored in an SSTable on the disk 2. An SSTable may store one or more KV pairs. More specifically, the flushed data may be stored on the disk 2 in SSTables arranged in a single level. Here, single level means a structure in which, as illustrated, the SSTables form one layer, rather than a hierarchical multi-level structure.

Because the disk 2 has this single-level structure, the B-tree index 17 can be used to search the data stored in the single level.

More specifically, a conventional LSM tree stores data in a multi-level, that is, hierarchical, structure and searches for data by moving through the levels in sequence. A B-tree, by contrast, is designed to search data stored in a single-level structure, so data on the disk 2 can be searched using the B-tree index 17, an index in B-tree form. In this case, the data stored on the disk 2 does not need to be kept in sorted order, so write amplification is reduced and the efficiency of data retrieval can increase.

Here, the B-tree index 17 may be a data structure that holds an index for all data stored on the disk 2. The B-tree index 17 may have a form similar to that of an ordinary B-tree, except that it is stored in the nonvolatile memory 1 and can therefore be preserved persistently.

FIG. 1 shows an example in which six SSTables exist on the disk 2 and each stores one KV pair. Specifically, when six KV pairs are stored on the disk 2, the first KV pair 21-1, the second KV pair 21-2, the third KV pair 21-3, the fourth KV pair 21-4, the fifth KV pair 21-5, and the sixth KV pair 21-6 may each be stored in its own SSTable to form one layer. However, the arrangement is not limited to this, and two or more KV pairs may be stored in one SSTable. For example, although not shown, the first KV pair 21-1 and the second KV pair 21-2 may be stored in one SSTable, and the third KV pair 21-3, the fourth KV pair 21-4, the fifth KV pair 21-5, and the sixth KV pair 21-6 may be stored in another SSTable.

Meanwhile, to provide reasonable performance for scan operations over an entire given key range, the sequentiality of the KV pairs stored on the disk 2 needs to be kept at or above a predetermined value. Here, sequentiality refers to the degree to which KV pairs are stored on the disk in sorted order. Sequentiality can be kept above a certain level by performing the compaction described below.

Meanwhile, KV pairs on the disk 2 that have become obsolete need to be deleted through garbage collection. That is, when a new KV pair arrives on the disk 2 as an update, the previously stored KV pair for the same key no longer has any use. The previously stored KV pair therefore needs to be deleted to reclaim capacity on the disk 2.

This operation may be referred to as compaction, and the compaction performing unit 23 may perform it. Compaction may be performed per file, that is, per SSTable. Through compaction, obsolete KV pairs among the KV pairs of a file are deleted, and the remaining KV pairs may be sorted by key value.

More specifically, a KV pair consists of a key and one value for that key. For example, if the key is 5 and its value is 100, the pair may be written <5, 100>. In this case, a request for key 5 returns the value 100. The value for a key may be updated several times. For example, the value 100 may currently be stored for key 5, and the value may later be updated to 200. In this case, when the KV pair <5, 100> is first written to file X, the B-tree points to X for key 5. When the KV pair <5, 200> is later written to a new file Y, the B-tree points to Y for key 5. During compaction, it is possible to check which file the B-tree points to for key 5. Accordingly, when compaction is performed on file X, the earlier KV pair <5, 100> may be deleted.
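The X and Y example above can be traced with a short sketch. It is illustrative only: a plain dictionary stands in for the B-tree index, in-memory dictionaries stand in for the files on disk, and the function names are hypothetical.

```python
files = {}   # file name -> {key: value}; stands in for SSTables on disk
index = {}   # key -> file name; a plain dict stands in for the B-tree index


def write_file(name, pairs):
    """Store a new file and repoint the index at it for every key it holds."""
    files[name] = dict(pairs)
    for key in files[name]:
        index[key] = name


def get(key):
    """Read a key by following the index to exactly one file."""
    name = index.get(key)
    return None if name is None else files[name][key]


def stale_keys(name):
    """Keys in a file that the index no longer points to (obsolete pairs)."""
    return sorted(key for key in files[name] if index.get(key) != name)


write_file("X", [(5, 100)])   # the index points to X for key 5
write_file("Y", [(5, 200)])   # the update repoints the index to Y
```

After the second write, `get(5)` reads from Y, and `stale_keys("X")` reports key 5 as the pair that compaction of X may delete.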

As another example, let F1 = (<1, 10>, <5, 20>, <10, 30>) and F2 = (<3, 30>, <4, 40>, <10, 100>), and assume that for key 10 the pair <10, 100> in F2 is the updated value. Compacting F1 and F2 may then produce the new files F3 = (<1, 10>, <3, 30>, <4, 40>) and F4 = (<5, 20>, <10, 100>). In this process <10, 30> may be deleted, and once F3 and F4 have been created and stored on the disk, the existing F1 and F2 may be deleted.
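The F1 and F2 example can be reproduced with a small sketch. It is a simplified illustration, not the implementation of the invention: it assumes `index` records which file holds the current value of each key, and that output files are cut at a fixed number of entries (`max_entries`, a hypothetical parameter chosen here so that the output matches F3 and F4).

```python
def compact(files, index, max_entries):
    """Drop obsolete pairs, sort the survivors by key, and split into new files."""
    live = {}
    for name, pairs in files.items():
        for key, value in pairs:
            if index.get(key) == name:   # keep only the version the index points to
                live[key] = value
    merged = sorted(live.items())
    return [merged[i:i + max_entries] for i in range(0, len(merged), max_entries)]


old_files = {
    "F1": [(1, 10), (5, 20), (10, 30)],
    "F2": [(3, 30), (4, 40), (10, 100)],
}
# Key 10 was updated in F2, so the index points there and <10, 30> is obsolete.
index = {1: "F1", 5: "F1", 3: "F2", 4: "F2", 10: "F2"}
new_files = compact(old_files, index, max_entries=3)
```

The result contains two lists corresponding to F3 and F4. In a real system the index would also be repointed to the new files before F1 and F2 are removed, which this sketch leaves out.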

The compaction log 15 stored in the nonvolatile memory 1 may record various information related to the execution of compaction.

When an input for reading data is applied, the data may be searched using the B-tree index 17.

FIG. 2 shows an example of the functional configuration of a single-level data storage device according to an embodiment of the present invention. Terms such as "... unit" used below denote a unit that processes at least one function or operation, and may be implemented in hardware, software, or a combination of hardware and software. In the description of FIG. 2 below, content that overlaps with FIG. 1 may be omitted.

Referring to FIG. 2, the single-level data storage device 100 may include a first memory table 110, a second memory table 120, a flush execution unit 130, a single-level storage unit 140, and a merge sort performing unit 150.

The first memory table 110 may be implemented by a computing device that includes a microprocessor, and the same applies to the second memory table 120, the flush execution unit 130, the single-level storage unit 140, and the merge sort performing unit 150 described below.

The first memory table 110 is included in the nonvolatile memory 1 and may store data input to form a database. The first memory table 110 may correspond to the first memory table 11 of FIG. 1.

The second memory table 120 may store data that was stored in the first memory table 110. The second memory table 120 is an immutable memtable and may hold the data of the first memory table 110 in immutable form.

The flush execution unit 130 may perform a flush operation on the data stored in the second memory table 120. The flush execution unit 130 may flush the data stored in the second memory table 120 to the disk 2. When the flush is performed, the corresponding data may be deleted from the second memory table 120.

The single-level storage unit 140 may store the data received through the flush operation in a single level. Specifically, the single-level storage unit 140 may store at least some of the data stored in the second memory table 120 in a single level on the disk 2, which is separate from the nonvolatile memory 1.

The single-level storage unit 140 may store at least some of the data stored in the second memory table 120 in a single level so that the data can be searched based on a B-tree data structure.

The compaction execution unit 150 may perform compaction on data that has become obsolete among the data stored in the single level. Compaction may refer to the process of creating new files while performing a merge sort of existing files by key.

Before the merge sort is performed, a list of compaction candidates may be generated, and the merge sort may be performed on the data included in that list.

The compaction execution unit 150 may delete data that is no longer needed from the data stored in the single level and sort the remaining data by key. The deleted data may have been extracted as compaction candidates and placed in a list, and the merge sort may be performed on some of the data in the list. In some cases, when a read request is received, a search by key may be performed on the merge-sorted data using the B-tree index 17.

As another example, the compaction execution unit 150 may identify, among the data stored in the single level, the ratio between data stored before an update and data stored after the update. Here, when a new value is written or updated, the data from before the update includes the previous value, that is, an invalid (or obsolete) KV, and the data from after the update includes the new value, that is, a valid KV.

When the identified ratio is lower than a predetermined value, the compaction execution unit 150 may select at least some of the data from before the update as compaction candidates. In this way, the compaction execution unit 150 may keep the ratio between pre-update data and post-update data at or above the predetermined value.
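The ratio-based selection described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: it assumes the ratio is the fraction of valid KV pairs among all stored KV pairs, and the names `select_candidates`, `FileStats`, and `threshold`, as well as the policy of visiting the files with the most obsolete pairs first, are hypothetical.

```python
from typing import Dict, List, Tuple

# Per-file counts of (valid, obsolete) KV pairs. Hypothetical bookkeeping.
FileStats = Dict[str, Tuple[int, int]]


def valid_ratio(valid: int, obsolete: int) -> float:
    """Fraction of valid KV pairs; an empty store counts as fully valid."""
    total = valid + obsolete
    return 1.0 if total == 0 else valid / total


def select_candidates(stats: FileStats, threshold: float) -> List[str]:
    """Pick files to compact until the valid ratio reaches the threshold."""
    valid = sum(v for v, _ in stats.values())
    obsolete = sum(o for _, o in stats.values())
    candidates: List[str] = []
    # Assumption: visit files with the most obsolete pairs first.
    by_obsolete = sorted(stats.items(), key=lambda item: item[1][1], reverse=True)
    for name, (_, file_obsolete) in by_obsolete:
        if valid_ratio(valid, obsolete) >= threshold or file_obsolete == 0:
            break
        candidates.append(name)
        obsolete -= file_obsolete  # compaction would drop these pairs
    return candidates


stats = {"A": (8, 2), "B": (3, 7), "C": (5, 5)}
print(select_candidates(stats, threshold=0.7))  # files chosen for compaction
```

With these sample counts the overall valid fraction starts at 16/30, so files are selected until dropping their obsolete pairs would lift the fraction to the threshold.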

As yet another example, the compaction execution unit 150 may scan a predetermined number of the nodes of the B-tree. Each node of the B-tree may be associated with at least part of the data stored in the single level, and the compaction execution unit 150 may identify which data table files the data associated with a scanned node is spread across. When the degree to which the data is spread across data table files is greater than or equal to a predetermined value, the compaction execution unit 150 may select the data table files themselves, or the data stored in them, as compaction candidates.

Here, a data table file may be an SSTable file, meaning a file in which KV pairs are stored. The B-tree scan of the compaction execution unit 150 may follow a round-robin scheme; because this is straightforward for a person skilled in the art, a detailed description is omitted.
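The round-robin node scan can be illustrated with the following Python sketch. It is only a sketch under assumptions: the B-tree leaves are modeled as a flat list of dictionaries mapping each key to the name of the data table file that holds it, and `scan_round_robin`, `cursor`, `batch`, and `min_files` are hypothetical names standing in for the "predetermined number of nodes" and the "predetermined value" of the description.

```python
from typing import Dict, List, Set, Tuple

# Each leaf node maps keys to the name of the data table file holding them.
Leaf = Dict[int, str]


def scan_round_robin(
    leaves: List[Leaf], cursor: int, batch: int, min_files: int
) -> Tuple[Set[str], int]:
    """Scan `batch` leaves starting at `cursor`; return (candidates, next cursor)."""
    candidates: Set[str] = set()
    if not leaves:
        return candidates, 0
    count = min(batch, len(leaves))
    for step in range(count):
        node = leaves[(cursor + step) % len(leaves)]
        files = set(node.values())
        # A node whose keys are spread over many files is worth compacting.
        if len(files) >= min_files:
            candidates |= files
    return candidates, (cursor + count) % len(leaves)


leaves = [
    {1: "F1", 2: "F1"},
    {3: "F1", 4: "F2", 5: "F3"},
    {6: "F4", 7: "F4"},
]
first, cursor = scan_round_robin(leaves, cursor=0, batch=2, min_files=3)
second, cursor = scan_round_robin(leaves, cursor=cursor, batch=2, min_files=3)
print(sorted(first), sorted(second), cursor)
```

Each call resumes where the previous one stopped and wraps around the node list, so every node is eventually inspected without scanning the whole tree at once.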

As yet another example, the compaction execution unit 150 may calculate the number of data table files that must be accessed to read data stored in the single level, and may select as compaction candidates the data table files associated with the data for which that number is largest.

The compaction execution unit 150 may delete obsolete KV pairs through compaction and, in doing so, may collect the valid KV pairs from the files that contain the deleted KV pairs and write them to new files. For example, suppose two files F1 and F2 are compacted, where F1 = (<1, 10>, <5, 20>, <10, 30>) and F2 = (<3, 30>, <4, 40>, <10, 100>), and where <10, 100> in F2 is the updated, newer value for key 10. The compaction execution unit 150 may then create new files F3 = (<1, 10>, <3, 30>, <4, 40>) and F4 = (<5, 20>, <10, 100>). In this case, the compaction execution unit 150 may update the B-tree so that keys 1, 3, 4, 5, and 10 point to F3 and F4. The compaction execution unit 150 may then delete the existing files F1 and F2.
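The F1/F2 example above can be reproduced with a short Python sketch. It makes several simplifying assumptions that are not stated in the description: files are passed oldest first so that later files win on duplicate keys, output files hold at most `max_per_file` pairs (3 here, chosen to match the example), a dictionary plus `sorted` stands in for a streaming merge sort, and a plain dictionary `index` stands in for the B-tree. The name `compact` is hypothetical.

```python
from typing import Dict, List, Tuple

KV = Tuple[int, int]


def compact(files_oldest_first: List[List[KV]], max_per_file: int) -> List[List[KV]]:
    """Keep the newest value per key, sort by key, and split into new files."""
    latest: Dict[int, int] = {}
    for kvs in files_oldest_first:
        for key, value in kvs:
            latest[key] = value  # later files overwrite older values
    merged = sorted(latest.items())
    return [merged[i:i + max_per_file] for i in range(0, len(merged), max_per_file)]


F1 = [(1, 10), (5, 20), (10, 30)]
F2 = [(3, 30), (4, 40), (10, 100)]  # <10, 100> is the newer value for key 10
F3, F4 = compact([F1, F2], max_per_file=3)

# Stand-in for the B-tree update: every surviving key now points at F3 or F4.
index = {}
for name, kvs in (("F3", F3), ("F4", F4)):
    for key, _ in kvs:
        index[key] = name
print(F3, F4, index)
```

The obsolete pair <10, 30> does not appear in either output file, after which F1 and F2 could be deleted.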

Because compaction is performed in this way, the single-level based data storage device 100 according to an embodiment of the present invention can perform compaction selectively, providing adequate scan performance while preventing unnecessary data from accumulating on the disk 2. In other words, the single-level based data storage device 100 can perform data read and write operations more efficiently and improve utilization of the space on the disk 2.

FIG. 3 shows the flow of the steps of a single-level data storage method according to an embodiment of the present invention. The steps of the method shown in FIG. 3 may, of course, be performed in an order different from that shown in the drawing, as the case requires.

Referring to FIG. 3, data may be stored in the first memory table 110 included in the nonvolatile memory 1 (S110). The data of the first memory table 110 may be stored in the second memory table 120. The second memory table 120 is also included in the nonvolatile memory 1 and may store part of the data stored in the first memory table 110 in immutable form.

A flush operation may be performed on the data stored in the second memory table 120 (S130). For example, the flush operation may be performed when the second memory table 120 becomes full.

Based on the flush operation, data is transferred from the second memory table 120 to the disk 2, and the transferred data may be stored in a single level (S140).

Once data has been stored in the single level, compaction may be performed when updates to the stored data leave obsolete data behind (S150). A merge sort may be performed as part of compaction. The data to be merge-sorted may be determined in various ways.

For example, obsolete data among the data stored in the single level may be deleted, and the remaining data may be sorted by key. Obsolete data may be data from before an update.

As another example, the ratio between data from before an update and data from after the update (valid, or live, data) may be identified among the data stored in the single level. When the identified ratio is lower than a predetermined value, old data may be deleted so that the ratio between data from before the predetermined point in time and valid data exceeds the predetermined value, and the remaining data may then be sorted.

As yet another example, a predetermined number of the nodes of the B-tree may be scanned to identify how many data table files hold the data in the key range covered by each scanned node. When this spread is greater than or equal to a predetermined value, that is, when the data in the key range covered by a scanned node is spread over at least a certain number of data table files, those data table files themselves may be selected as compaction candidates.

As yet another example, the number of data table files needed to read data stored in the single level may be calculated, and the unit of data for which that number is largest may be selected as a compaction candidate. Here, a read is not a point query that reads the value for a single key but a range query, or scan, that reads all the values for the keys in a given range, for example keys 200 to 300. When such a read is performed, the scanned range may be divided into sub-ranges and the number of files accessed to read each sub-range recorded; the files belonging to the sub-range with the largest number of accessed files may then be selected as compaction candidates.
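The sub-range bookkeeping for a scan over keys 200 to 300 can be sketched in Python as below. The sketch assumes fixed-width sub-ranges and a dictionary `index` mapping each key to its data table file; `busiest_sub_range`, `width`, and the sample file names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


def busiest_sub_range(
    index: Dict[int, str], start: int, end: int, width: int
) -> Tuple[Tuple[int, int], Set[str]]:
    """Scan keys in [start, end] and return the sub-range touching the most files."""
    accessed: Dict[int, Set[str]] = defaultdict(set)
    for key, file_name in index.items():
        if start <= key <= end:
            # Record which file had to be opened for this sub-range.
            accessed[(key - start) // width].add(file_name)
    if not accessed:
        raise ValueError("no keys in the scanned range")
    slot = max(accessed, key=lambda s: len(accessed[s]))
    low = start + slot * width
    return (low, min(low + width - 1, end)), accessed[slot]


index = {200: "F1", 210: "F1", 240: "F2", 260: "F3", 270: "F4", 290: "F5", 300: "F1"}
sub_range, files = busiest_sub_range(index, start=200, end=300, width=50)
print(sub_range, sorted(files))
```

In this sample, the keys between 250 and 299 are spread over three files, so those files would be the compaction candidates.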

FIG. 4 conceptually illustrates the flow of the read step of a single-level data storage method according to an embodiment of the present invention.

Referring to FIG. 4, to read data, the single-level data storage device 100 may first search for the data in the first memory table 110. If the data is not found in the first memory table 110, the device may search for it in the second memory table 120.

If the data is not found in the second memory table 120 either, the device may search for it on the disk 2 based on the B-tree structure. A file containing data may be composed of blocks, and the B-tree index may hold, for each key, information about the file stored on disk, for example the offset of the block of the file in which the key is stored. Accordingly, when data is searched using the B-tree structure, that is, the B-tree index, a KV can be located more easily.

As described above, because data is stored on the disk 2 in single-level form, the search can be performed using the B-tree structure.
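The read order of FIG. 4 (first memory table, then second memory table, then the index over the disk) can be sketched in Python as follows. This is an illustrative sketch only: dictionaries stand in for the two memory tables and for the B-tree index, a list position stands in for the block offset, and the names `read`, `Location`, and the sample data are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Location = Tuple[str, int]  # (file name, offset of the record within the file)


def read(
    key: int,
    memtable: Dict[int, str],
    immutable: Dict[int, str],
    index: Dict[int, Location],
    files: Dict[str, List[Tuple[int, str]]],
) -> Optional[str]:
    """Look up a key in the order: memtable, immutable memtable, disk."""
    if key in memtable:
        return memtable[key]
    if key in immutable:
        return immutable[key]
    location = index.get(key)  # stand-in for a B-tree index lookup
    if location is None:
        return None
    file_name, offset = location
    return files[file_name][offset][1]


memtable = {7: "g"}
immutable = {5: "e-new"}
files = {"F1": [(3, "c"), (5, "e-old")]}
index = {3: ("F1", 0), 5: ("F1", 1)}
print([read(k, memtable, immutable, index, files) for k in (7, 5, 3, 9)])
```

Because the memory tables are consulted first, the newer value for key 5 shadows the older copy that is still on disk.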

FIG. 5 conceptually illustrates the flow of the write step of a single-level data storage method according to an embodiment of the present invention.

Referring to FIG. 5, all input data may be written to the first memory table 110. When the first memory table 110 is full, at least some of the data stored in it may be moved to the second memory table 120 and stored in immutable form. A flush may then be performed depending on the state of the second memory table 120, and in some cases compaction may be performed on the data that has been flushed and stored in the single level.

When a flush is performed, information about the data stored on the disk 2 in the single level, for example file and offset information, may be stored or updated in the B-tree in index form. On this basis, when compaction is performed, the B-tree can be searched to check whether each KV pair in a file targeted for compaction is still valid.
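The write path of FIG. 5 and the index-based validity check can be sketched in Python as follows. The sketch is an assumption-laden illustration rather than the claimed implementation: in-memory dictionaries stand in for the memory tables in nonvolatile memory, a list of sorted record lists stands in for the files on disk, a dictionary stands in for the B-tree, and the class `SingleLevelStore` with its `memtable_limit`, `put`, `flush`, and `is_valid` members is hypothetical.

```python
from types import MappingProxyType
from typing import Dict, List, Mapping, Optional, Tuple


class SingleLevelStore:
    def __init__(self, memtable_limit: int) -> None:
        self.limit = memtable_limit
        self.memtable: Dict[int, str] = {}
        self.immutable: Optional[Mapping[int, str]] = None
        self.files: List[List[Tuple[int, str]]] = []  # stand-in for disk files
        self.index: Dict[int, Tuple[int, int]] = {}  # key -> (file no, offset)

    def put(self, key: int, value: str) -> None:
        self.memtable[key] = value
        if len(self.memtable) >= self.limit:
            self.flush()  # make room if an immutable table is still pending
            self.immutable = MappingProxyType(self.memtable)  # read-only view
            self.memtable = {}

    def flush(self) -> None:
        """Write the immutable table as one sorted file and update the index."""
        if self.immutable is None:
            return
        records = sorted(self.immutable.items())
        file_no = len(self.files)
        self.files.append(records)
        for offset, (key, _) in enumerate(records):
            self.index[key] = (file_no, offset)
        self.immutable = None

    def is_valid(self, file_no: int, offset: int) -> bool:
        """A stored pair is valid only if the index still points at it."""
        key, _ = self.files[file_no][offset]
        return self.index.get(key) == (file_no, offset)


store = SingleLevelStore(memtable_limit=2)
for key, value in [(1, "a"), (2, "b"), (3, "c"), (2, "B")]:
    store.put(key, value)
store.flush()
print(store.files, store.index)
```

After the second flush, the index entry for key 2 points at the newer file, so the older pair in the first file is reported as not valid and could be dropped during compaction.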

The single-level data storage device and method according to an embodiment of the present invention store data in a single level so that data can be searched quickly and efficiently. They also determine compaction candidates and perform a merge sort on some of them to avoid storing unnecessary data, so that data is stored efficiently.

The single-level data storage device and method according to an embodiment of the present invention enable high-performance data writes by storing data in the first memory table 110 and the second memory table 120, and enable high-performance data reads by using a B-tree.

Combinations of the blocks of the block diagrams and the steps of the flowcharts attached to this specification may be performed by computer program instructions. Because these computer program instructions can be loaded onto a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing equipment, the instructions executed by the processor of the computer or other programmable data processing equipment create means for performing the functions described in each block of the block diagram or each step of the flowchart. These computer program instructions can also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment to implement a function in a particular manner, so that the instructions stored in the computer-usable or computer-readable memory can produce an article of manufacture containing instruction means that perform the functions described in each block of the block diagram or each step of the flowchart. The computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps is performed on the computer or other programmable data processing equipment to create a computer-executed process, and the instructions that run on the computer or other programmable data processing equipment can provide steps for performing the functions described in each block of the block diagram and each step of the flowchart.

In addition, each block or step may represent a module, segment, or portion of code that includes one or more executable instructions for performing the specified logical function(s). It should also be noted that in some alternative embodiments the functions mentioned in the blocks or steps may occur out of order. For example, two blocks or steps shown in succession may in fact be performed substantially simultaneously, or they may sometimes be performed in the reverse order, depending on the functions involved.

The above description merely illustrates the technical idea of the present invention, and a person of ordinary skill in the art to which the present invention pertains will be able to make various modifications and variations without departing from the essential characteristics of the present invention. Accordingly, the embodiments disclosed in this specification are intended to explain, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present invention should be interpreted according to the claims below, and all technical ideas within a scope equivalent to them should be construed as falling within the scope of the rights of the present invention.

1: nonvolatile memory
2: disk
100: single-level data storage device
110: first memory table
120: second memory table
130: flush execution unit
140: single-level storage unit
150: compaction execution unit

Claims

A single-level based data storage method comprising:
storing data in a first memory table included in a nonvolatile memory;
storing at least some of the data stored in the first memory table in a second memory table included in the nonvolatile memory when the amount of data stored in the first memory table is greater than or equal to a predetermined value;
identifying the data stored in the second memory table and performing a flush operation; and
storing the data stored in the second memory table in a single level on a disk separate from the nonvolatile memory based on the flush operation.

The method of claim 1,
wherein the storing in the single level comprises
storing the data stored in the second memory table in the single level based on the flush operation so that the data is searched based on a B-tree data structure.

The method of claim 2,
further comprising performing compaction on at least part of the data stored in the single level.

The method of claim 3,
wherein the performing of the compaction comprises
deleting pre-update data among the data stored in the single level, and sorting the remaining data stored in the single level based on key values included in the remaining data.

The method of claim 3,
wherein the performing of the compaction comprises:
identifying, for at least part of the data stored in the single level, a ratio between data stored before an update time and data stored after the update time;
selecting at least part of the data from before the update time as compaction candidates when the identified ratio is lower than a predetermined value; and
deleting at least some of the selected compaction candidates.

The method of claim 3,
wherein the performing of the compaction comprises:
scanning a predetermined number of the nodes of the B-tree; and
selecting the data table files as compaction candidates when the data related to the scanned nodes, among the data stored in the single level, is distributed across a predetermined number or more of data table files.

The method of claim 5,
wherein the performing of the compaction comprises:
calculating the number of data table files required to read the data stored in the single level; and
selecting, as the compaction candidate, a data table file related to the data for which the calculated number of data table files is largest.

A single-level based data storage device comprising:
a first memory table that stores input data;
a second memory table that stores at least some of the data stored in the first memory table when the amount of data stored in the first memory table is greater than or equal to a predetermined value;
a flush execution unit that identifies the data stored in the second memory table and performs a flush operation; and
a single-level storage unit that stores the data stored in the second memory table in a single level on a disk separate from a nonvolatile memory based on the flush operation,
wherein the first memory table and the second memory table are stored in the nonvolatile memory.

The device of claim 8,
wherein the single-level storage unit
stores the data stored in the second memory table in the single level based on the flush operation so that the data is searched based on a B-tree data structure.

The device of claim 9,
further comprising a compaction execution unit that performs compaction on at least part of the data stored in the single level.

The device of claim 10,
wherein the compaction execution unit
deletes pre-update data among the data stored in the single level, and sorts the remaining data stored in the single level based on key values included in the remaining data.

The device of claim 10,
wherein the compaction execution unit
identifies, for at least part of the data stored in the single level, a ratio between data stored before an update time and data stored after the update time,
selects at least part of the data from before the predetermined time point as compaction candidates when the identified ratio is lower than a predetermined value, and
deletes at least some of the selected compaction candidates.

The device of claim 10,
wherein the compaction execution unit
scans a predetermined number of the nodes of the B-tree, and
selects the data table files as compaction candidates when the data related to the scanned nodes, among the data stored in the single level, is distributed across a predetermined number or more of data table files.

The device of claim 10,
wherein the compaction execution unit
calculates the number of data table files required to read the data stored in the single level, and
selects, as a compaction candidate, a data table file related to the data for which the calculated number of data table files is largest.