KR102512571B1

KR102512571B1 - Memory system and operating method thereof

Info

Publication number: KR102512571B1
Application number: KR1020210172557A
Authority: KR
Inventors: 전형준; 남범석; 박성순; 김경표
Original assignee: 성균관대학교산학협력단; (주)글루시스
Priority date: 2021-12-06
Filing date: 2021-12-06
Publication date: 2023-03-22
Also published as: WO2023106635A1

Abstract

A memory system according to one embodiment of the present invention may include: a storage device including a plurality of sorted strings tables; a memory including a MemTable of a predetermined first size in which key and value pairs are stored in sorted order; and a control unit that uses a log-structured merge tree data structure, groups the MemTable into N groups whose key ranges do not overlap, and flushes each group to a sorted strings table stored in the storage device.

Description

Memory system and operating method thereof {MEMORY SYSTEM AND OPERATING METHOD THEREOF}

The present invention relates to a memory system and a method of operating the memory system.

Key-value databases are useful for handling unstructured data such as sensor data and social network data. Key-value databases mainly use a log-structured merge tree.

A log-structured merge tree is an indexing data structure that stores key-value pairs and locates data (values) by key. It is used in various NoSQL platforms such as Cassandra, HBase, RocksDB, and MongoDB. The log-structured merge tree keeps newly written key-value pairs sorted by key using a memory-resident SkipList. It buffers data in memory for write operations, and this buffer is called a MemTable. When the size of the memtable exceeds a predetermined capacity, the log-structured merge tree converts the key-value pairs in the memtable into an array and writes them to disk as a file. Here, a sorted strings table (SSTable) is a file written to disk, and compaction refers to the operation of writing the memtable to disk together with the process of merge-sorting the newly written sorted strings table with the existing sorted strings tables so that a single sorted key-value array is maintained.
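
The merge-sort step of compaction described above can be illustrated with a minimal sketch. It is not taken from the specification: the function name merge_sstables and the representation of a sorted strings table as a Python list of (key, value) tuples are assumptions made only for illustration, as is the rule that the newer value wins when both tables hold the same key.

```python
def merge_sstables(older, newer):
    """Merge two key-sorted lists of (key, value) pairs into one sorted list.

    Illustrative assumption: when both tables contain the same key,
    the pair from the newer table replaces the pair from the older one.
    """
    merged = []
    i = j = 0
    while i < len(older) and j < len(newer):
        if older[i][0] < newer[j][0]:
            merged.append(older[i])
            i += 1
        elif older[i][0] > newer[j][0]:
            merged.append(newer[j])
            j += 1
        else:
            # Same key in both tables: keep only the newer value.
            merged.append(newer[j])
            i += 1
            j += 1
    # At most one of the two tails is non-empty.
    merged.extend(older[i:])
    merged.extend(newer[j:])
    return merged


existing_table = [("a", "1"), ("c", "3")]
flushed_table = [("b", "2"), ("c", "30")]
single_sorted_array = merge_sstables(existing_table, flushed_table)
```

Every pair in both inputs is read and rewritten once per merge, which is why tables with widely overlapping key ranges make compaction expensive.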

In a conventional log-structured merge tree, main memory and disk differ in write performance, so main memory may run short if the compaction that writes the memtable to disk is not performed quickly. In addition, if merge sorting of the sorted strings tables is slow, several sorted strings tables with overlapping key ranges may exist, which can degrade search performance. When compaction is slow, a conventional log-structured merge tree may also enter a write stall, in which clients are blocked from writing new key-value pairs to the memtable.

When a plurality of sorted strings tables cannot be merge-sorted, their key ranges overlap one another, and a client must then read all of the sorted strings tables. In addition, a new sorted strings table created by storing a memtable with a wide key range may overlap the key ranges of the existing sorted strings tables, and because sorted strings tables with overlapping key ranges must be merge-sorted, overhead increases.

An object of the present invention is to provide a memory system and a method of operating the memory system that can improve read performance while increasing the capacity of the memtable.

However, the problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned here will be clearly understood from the description below by those of ordinary skill in the art to which the present invention pertains.

A memory system according to an embodiment of the present invention includes: a storage device including a plurality of sorted strings tables; a memory including a MemTable of a predetermined first size in which key and value pairs are stored in sorted order; and a control unit that uses a log-structured merge tree data structure, groups the memtable into N groups whose key ranges do not overlap, and flushes each group to a sorted strings table to store it in the storage device.

The first size of the memtable may be 1 GB (gigabyte) or more.

Among the plurality of groups, the control unit may flush to the sorted strings table a group whose stored key and value pairs have reached a second size or more, and may reset the stored size of that group to 0.

The second size may be the first size divided by N.

The control unit may merge or split a group based on the number of times each group has been flushed.

When a first group has been flushed at a predetermined first time point, the control unit may, at the first time point, compare a first flush count of the first group with a second flush count that is the smallest among the flush counts of the N groups, and may merge or split the corresponding groups.

When the difference between the first flush count and the second flush count exceeds a predetermined value, the control unit may split the first group into a plurality of groups having different key ranges, and may merge a second group having the second flush count with a group, among the N groups, whose key range is contiguous with that of the second group.

A memory chip according to another embodiment of the present invention includes: a storage device including a plurality of sorted strings tables; a memory including a MemTable of a predetermined first size in which key and value pairs are stored in sorted order; and a control unit that uses a log-structured merge tree data structure, groups the memtable into N groups whose key ranges do not overlap, and flushes each group to a sorted strings table to store it in the storage device.

A method of operating a memory system according to still another embodiment of the present invention includes: storing data as key and value pairs in a memtable of a predetermined first size; grouping the memtable into N groups whose key ranges do not overlap; and flushing each group to a sorted strings table of a storage device.

A computer-readable recording medium storing a computer program according to another aspect of the present invention includes instructions for causing a processor to perform a method including: storing data as key and value pairs in a memtable of a predetermined first size; grouping the memtable into N groups whose key ranges do not overlap; and flushing each group to a sorted strings table of a storage device.

A computer program stored on a computer-readable recording medium according to still another aspect of the present invention includes instructions for causing a processor to perform a method including: storing data as key and value pairs in a memtable of a predetermined first size; grouping the memtable into N groups whose key ranges do not overlap; and flushing each group to a sorted strings table of a storage device.

According to an embodiment of the present invention, the key range of the memtable is divided into several sections and the key ranges are adjusted dynamically according to the workload. This allows the capacity of the memtable in memory to be expanded while each sorted strings table keeps its existing size, so that the characteristics of DRAM and disk can both be put to good use. In addition, according to an embodiment of the present invention, the write stall problem, in which clients of the log-structured merge tree are blocked from writing new key-value pairs to the memtable, can be alleviated, and the read performance of the disk can be improved.

FIG. 1 shows a process of flushing to sorted strings tables when the capacity of the memtable is increased.
FIG. 2 is a block diagram conceptually illustrating the functions of a memory system according to an embodiment of the present invention.
FIG. 3 shows a process of grouping the key range of a memtable and flushing to sorted strings tables according to an embodiment of the present invention.
FIG. 4 shows a process of splitting and merging a grouped memtable according to flush counts according to an embodiment of the present invention.
FIG. 5 is a graph comparing the performance of a memory system according to an embodiment of the present invention with that of a conventional memory system.
FIG. 6 is a flowchart illustrating a method of operating a memory system according to an embodiment of the present invention.

Advantages and features of the present invention, and methods of achieving them, will become clear with reference to the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only to make the disclosure of the present invention complete and to fully convey the scope of the invention to those of ordinary skill in the art to which the present invention pertains, and the present invention is defined only by the scope of the claims.

In describing the embodiments of the present invention, detailed descriptions of well-known functions or configurations will be omitted where they are judged to unnecessarily obscure the gist of the present invention. The terms used below are defined in consideration of their functions in the embodiments of the present invention and may vary according to the intention or custom of a user or operator. Their definitions should therefore be based on the content of this specification as a whole.

FIG. 1 shows a process of flushing to sorted strings tables when the capacity of the memtable is increased.

Referring to FIG. 1, a memory system using a log-structured merge tree data structure may include a storage device 110 and a memory 120. The storage device 110 may be a non-volatile memory such as ROM, flash memory, a magnetic computer storage device (for example, a hard disk, a diskette drive, or magnetic tape), or an optical disk drive. When the memtable of the memory 120 is flushed, the key-value pairs buffered in it may be converted into an array and stored in the storage device 110 as a sorted strings table (SSTable).

The memory 120 may be a volatile memory such as DRAM or SRAM, and may include a memtable that buffers data as key-value pairs for client write operations. The memory 120 may also flush the data stored in the memtable to the storage device 110.

Meanwhile, RocksDB and LevelDB, which are key-value stores based on the log-structured merge tree, use a default memtable capacity of 64 MB in the memory 120. If, in order to improve write performance to the storage device 110, the memtable capacity of the memory 120 is increased, for example, to 320 MB, which is five times the 64 MB default, the amount of overlap among the sorted strings tables of the storage device 110 also grows, and the overhead of merge-sorting the overlapping sorted strings tables may increase fivefold or more.

Hereinafter, a memory system according to an embodiment of the present invention will be described.

FIG. 2 is a block diagram conceptually illustrating the functions of a memory system according to an embodiment of the present invention.

Referring to FIG. 2, the memory system 10 may include a storage device 110, a memory 120, and a control unit 130.

The memory system 10 using a log-structured merge tree data structure according to an embodiment of the present invention expands the memtable capacity of the memory 120 while grouping the memtable into a plurality of groups according to key range, thereby reducing overhead and improving read performance at the same time.

The memory system 10 may also be implemented as a single memory chip. In this case, the memory chip may include the storage device 110, the memory 120, and the control unit 130 of the memory system 10, and may perform the functions of the storage device 110, the memory 120, and the control unit 130 described below.

As described above, the storage device 110 may be a non-volatile memory such as ROM, flash memory, a magnetic computer storage device (for example, a hard disk, a diskette drive, or magnetic tape), or an optical disk drive. When the memtable of the memory 120 is flushed, the key-value pairs buffered in it may be converted into an array and stored in the storage device 110 as a sorted strings table (SSTable).

The memory 120 may be a volatile memory such as DRAM or SRAM, and may include a memtable that buffers data as key-value pairs for client write operations. The memory 120 may also flush the data stored in the memtable to the storage device 110.

Depending on the embodiment, the memtable capacity of the memory 120 may be 64 MB or more. For example, the memtable capacity may be 1 GB (gigabyte) or more. Accordingly, the memory system 10 can make the most of the memory 120 by expanding the memtable capacity to the gigabyte scale.

The control unit 130 may be a central processing unit (CPU), a graphics processing unit (GPU), a micro controller unit (MCU), or a dedicated processor on which methods according to embodiments of the present invention are performed.

In addition, the control unit 130 may perform functions such as flushing, merge sorting, and memtable grouping for the storage device 110 and the memory 120 according to at least one stored program instruction. Each of these functions may be stored in memory in the form of at least one module and executed by a processor.

According to an embodiment of the present invention, the memory system 10 expands the capacity of the memtable while dividing the memtable into a plurality of groups according to key range, and can dynamically adjust the size of each group according to that group's workload.

FIG. 3 shows a process of grouping a memtable according to key range and flushing to sorted strings tables according to an embodiment of the present invention.

Referring further to FIG. 3, the control unit 130 may group the memtable of the memory 120 into N groups whose key ranges do not overlap. The key ranges of the N groups do not overlap one another but may be contiguous. For example, the control unit 130 may group the memtable of the memory 120 into five groups 121, 122, 123, 124, and 125.
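
One way to picture N contiguous, non-overlapping key ranges is a sorted list of boundary keys that is searched with binary search. The sketch below is illustrative only and is not part of the specification: the class name KeyRangeGroups, the string keys, and the particular boundary keys are hypothetical.

```python
import bisect


class KeyRangeGroups:
    """Split a key space into N contiguous, non-overlapping ranges.

    Assumption for illustration: boundaries[i] is the smallest key that
    belongs to group i + 1, so N == len(boundaries) + 1.
    """

    def __init__(self, boundaries):
        if list(boundaries) != sorted(set(boundaries)):
            raise ValueError("boundaries must be strictly increasing")
        self.boundaries = list(boundaries)

    @property
    def group_count(self):
        return len(self.boundaries) + 1

    def group_of(self, key):
        """Return the 0-based index of the single group whose range holds key."""
        return bisect.bisect_right(self.boundaries, key)


# Four hypothetical boundary keys give five groups, matching the
# five-group example (groups 121 to 125) in the text.
groups = KeyRangeGroups(["e", "j", "o", "t"])
first_group = groups.group_of("apple")   # keys below "e"
third_group = groups.group_of("kiwi")    # keys from "j" up to, not including, "o"
```

Because every key maps to exactly one index, the ranges cannot overlap, and adjacent indices correspond to contiguous ranges.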

When buffered data concentrates in a particular group of the memory 120, that is, when a group among the N groups holds stored key and value pairs of a predetermined capacity or more, the control unit 130 may flush that group to the storage device 110, create and store a sorted strings table in the storage device 110, and then reset the stored capacity of that group to 0.

Here, the predetermined capacity at which the control unit 130 flushes a group may be the capacity of the entire memtable divided by N.

For example, when the total capacity of the memtable divided into five groups is 320 MB, the flush threshold is 320 MB / 5 = 64 MB. The control unit 130 may flush the fourth group 124, whose stored capacity is 64 MB or more, among the five memtable groups, create a fifth sorted strings table (SSTable 5), and store it in the storage device 110. The control unit 130 may then reset the stored capacity of the flushed fourth group 124 to 0.
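
The per-group flush rule can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of the embodiment: the class name GroupedMemTable, the use of a dict per group in place of a SkipList, the byte accounting by len(key) + len(value), and the in-memory list standing in for the storage device are all hypothetical.

```python
import bisect


class GroupedMemTable:
    """Memtable split into N key-range groups that are flushed independently."""

    def __init__(self, total_bytes, boundaries):
        self.boundaries = list(boundaries)
        n = len(self.boundaries) + 1
        # Second size = first size / N (e.g. 320 MB / 5 = 64 MB).
        self.flush_threshold = total_bytes // n
        self.groups = [dict() for _ in range(n)]
        self.sizes = [0] * n
        self.sstables = []  # stand-in for sorted strings tables on storage

    def put(self, key, value):
        g = bisect.bisect_right(self.boundaries, key)
        old = self.groups[g].get(key)
        if old is not None:
            # Overwriting a key: discount the bytes of the old pair.
            self.sizes[g] -= len(key) + len(old)
        self.groups[g][key] = value
        self.sizes[g] += len(key) + len(value)
        if self.sizes[g] >= self.flush_threshold:
            self.flush_group(g)

    def flush_group(self, g):
        # Only the full group is written out; the other groups keep buffering.
        self.sstables.append(sorted(self.groups[g].items()))
        self.groups[g] = {}
        self.sizes[g] = 0  # stored size of the flushed group is reset to 0


# Tiny capacities so the example is easy to follow: 40 bytes / 5 groups = 8.
table = GroupedMemTable(total_bytes=40, boundaries=["e", "j", "o", "t"])
table.put("k1", "aaa")  # group 2 now holds 5 bytes
table.put("a1", "bb")   # group 0 now holds 4 bytes
table.put("k2", "ccc")  # group 2 reaches 10 bytes and is flushed
```

Each resulting sorted strings table covers only the key range of one group, while the untouched groups continue to absorb writes.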

The memory system 10 according to an embodiment of the present invention divides the entire memtable into N groups according to key range and flushes each group according to its stored capacity. This lowers the probability that the key range of a sorted strings table newly stored in the storage device 110 overlaps the key ranges of sorted strings tables that were flushed and stored earlier, and therefore reduces the overhead of later merge-sorting stored sorted strings tables whose key ranges overlap.
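
The effect of narrower key ranges can be illustrated by counting overlaps directly. The helper names and all key ranges below are hypothetical values chosen for illustration, not measurements from the embodiment; each table is reduced to its inclusive (min_key, max_key) pair.

```python
def tables_to_probe(table_ranges, key):
    """Indices of the tables whose inclusive [min_key, max_key] range contains key."""
    return [i for i, (lo, hi) in enumerate(table_ranges) if lo <= key <= hi]


def overlapping_pairs(table_ranges):
    """Number of table pairs whose key ranges intersect."""
    count = 0
    for i in range(len(table_ranges)):
        for j in range(i + 1, len(table_ranges)):
            lo_i, hi_i = table_ranges[i]
            lo_j, hi_j = table_ranges[j]
            if lo_i <= hi_j and lo_j <= hi_i:
                count += 1
    return count


# Hypothetical ranges: flushing one large memtable as a whole tends to
# produce tables that each span almost the entire key space.
whole_memtable_flushes = [(0, 99), (1, 98), (2, 97)]

# Hypothetical ranges: flushing per key-range group produces narrow tables;
# only two flushes of the same group (60..79 and 60..78) overlap.
per_group_flushes = [(0, 19), (60, 79), (40, 59), (60, 78)]

probes_conventional = tables_to_probe(whole_memtable_flushes, 50)
probes_grouped = tables_to_probe(per_group_flushes, 50)
```

In this toy data a lookup for key 50 has to consult every table in the first case and a single table in the second, and only overlapping pairs need to be merge-sorted later.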

FIG. 4 shows a process of splitting and merging a grouped memtable according to flush counts according to an embodiment of the present invention.

Referring further to FIG. 4, the control unit 130 first groups the memtable of the memory 120 into five groups 121, 122, 123, 124, and 125 according to key range, and may count the number of flushes each time one of the groups is flushed to the storage device 110.

That is, the memory system 10 counts the number of flushes for each group of the grouped memtable in the memory 120 and then, based on the flush counts, splits a group with a heavy workload or merges groups with a light workload, thereby dynamically managing the workload of each group.

When a flush from one group to the storage device 110 occurs, the control unit 130 subtracts the flush count of the group with the fewest flushes among all groups from the current flush count of the group that was just flushed. If the result is greater than a predetermined value, the control unit 130 may split the group that was just flushed into several groups, and may merge the group with the fewest flushes with a group whose key range is contiguous with it so that the two are managed as one group.

Here, the control unit 130 may divide the flush count of the group before the split equally among the resulting groups and store it, and may store in the merged group the sum of the flush counts of the groups before the merge.

For example, referring to (a) of FIG. 4, the group that has just been flushed is the fourth group 124, and the group with the fewest flushes is the first group 121. Because 17, obtained by subtracting the flush count of 3 of the first group 121 from the flush count of 20 of the fourth group 124, is greater than the predetermined value of 10, the control unit 130 dynamically rebalances the first group 121 and the fourth group 124.

Referring to (b) of FIG. 4, the control unit 130 may merge the first group 121, which has the fewest flushes, with the second group 122, whose key range is contiguous with it, to create a sixth group 126. The flush count of the sixth group 126 may be stored as 8, the sum of the flush count of 3 of the first group 121 and the flush count of 5 of the second group 122. In addition, the control unit 130 may split the fourth group 124, which has just been flushed, into two groups, a seventh group 127 and an eighth group 128, and may distribute the flush count of 20 of the fourth group 124 equally so that each is stored with a flush count of 10.
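
The split and merge rule of FIG. 4 can be sketched as below. Several details are assumptions added for illustration and are not stated in the specification: key ranges are modelled as half-open integer intervals wide enough to halve, the hot group is split into exactly two halves at its midpoint, an odd flush count gives the extra flush to the second half, the coldest group merges with its less-flushed neighbour (never with the group being split), and the flush counts of the third and fifth groups (12 and 9) are invented because the text gives counts only for the first, second, and fourth groups.

```python
from dataclasses import dataclass


@dataclass
class Group:
    lo: int           # inclusive lower bound of the key range
    hi: int           # exclusive upper bound of the key range
    flushes: int = 0  # number of times this group has been flushed


def rebalance(groups, flushed, threshold):
    """Return the group list after the group at index `flushed` was flushed."""
    coldest = min(range(len(groups)), key=lambda i: groups[i].flushes)
    if groups[flushed].flushes - groups[coldest].flushes <= threshold:
        return list(groups)  # difference not large enough: leave groups as-is

    # Candidate merge partners: neighbours of the coldest group,
    # excluding the hot group that is about to be split.
    neighbours = [i for i in (coldest - 1, coldest + 1)
                  if 0 <= i < len(groups) and i != flushed]
    partner = (min(neighbours, key=lambda i: groups[i].flushes)
               if neighbours else None)

    result = []
    i = 0
    while i < len(groups):
        g = groups[i]
        if i == flushed:
            # Split the hot group and share its flush count between the halves.
            mid = (g.lo + g.hi) // 2
            half = g.flushes // 2
            result.append(Group(g.lo, mid, half))
            result.append(Group(mid, g.hi, g.flushes - half))
            i += 1
        elif partner is not None and i == min(coldest, partner):
            # Merge the coldest group with its contiguous partner; counts add up.
            nxt = groups[i + 1]
            result.append(Group(g.lo, nxt.hi, g.flushes + nxt.flushes))
            i += 2
        else:
            result.append(g)
            i += 1
    return result


# FIG. 4 example: flush counts 3, 5 and 20 come from the text;
# 12 and 9 and all key ranges are hypothetical.
before = [Group(0, 20, 3), Group(20, 40, 5), Group(40, 60, 12),
          Group(60, 80, 20), Group(80, 100, 9)]
after = rebalance(before, flushed=3, threshold=10)
```

With these inputs the first two groups become one group with a flush count of 8 and the fourth group becomes two groups with a flush count of 10 each, so the total number of groups stays at five.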

Accordingly, the memory system 10 can solve the problem of flushes being concentrated in a particular group of the memtable.

(a) of FIG. 5 is a graph showing, as throughput (kops/sec), the result of running the db_bench random write workload with a single thread.

For the measurement, the embodiment was implemented in the code of RocksDB, a representative key-value store that uses a log-structured merge tree, so that the memtable is divided into a plurality of groups according to key range and each group is flushed separately. Performance was measured with db_bench, a representative benchmark for key-value store performance.

Referring to (a) of FIG. 5, the random write throughput of the memory system 10 according to an embodiment of the present invention is about 1.5 times that obtained with a conventional 64 MB memtable.

(b) of FIG. 5 shows how much write amplification was reduced, determined by measuring the I/O (input/output) that actually occurred while running the db_bench random write workload with a single thread. The user data actually written was 54 GB.

Referring to (b) of FIG. 5, it can be confirmed that the memory system 10 according to an embodiment of the present invention alleviates the problem of overlapping key ranges among sorted strings tables. When a total of 54 GB of data is stored, the conventional method frequently encounters overlapping ranges among sorted strings tables, so the same key-value pair is merge-sorted five or more times and the total amount of data written to the storage device 110 is about 295 GB. The memory system 10 according to an embodiment of the present invention rewrites the same key-value pair fewer than three times, so the total amount of data written to the storage device 110 is reduced to about 140 GB.
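
The write amplification implied by these figures can be checked with simple arithmetic. The sketch below uses only the three approximate values quoted above (54 GB, 295 GB, 140 GB); the function name write_amplification is an illustrative choice.

```python
def write_amplification(storage_bytes_written, user_bytes_written):
    """Ratio of data physically written to storage to data written by the user."""
    return storage_bytes_written / user_bytes_written


USER_DATA_GB = 54       # user data written by the benchmark
CONVENTIONAL_GB = 295   # approximate total written by the conventional method
PROPOSED_GB = 140       # approximate total written by the embodiment

wa_conventional = write_amplification(CONVENTIONAL_GB, USER_DATA_GB)  # about 5.46
wa_proposed = write_amplification(PROPOSED_GB, USER_DATA_GB)          # about 2.59

# Fraction of storage writes avoided relative to the conventional method.
write_reduction = 1 - PROPOSED_GB / CONVENTIONAL_GB                   # about 0.53
```

The two ratios are consistent with the statements that each key-value pair is written five or more times in the conventional method and fewer than three times in the embodiment.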

FIG. 6 is a flowchart illustrating a method of operating a memory system according to an embodiment of the present invention.

Referring further to FIGS. 1 and 6, the control unit 130 may store data as key and value pairs in a memtable of a predetermined first size in the memory 120 (S510).

Next, the control unit 130 may group the memtable into N groups whose key ranges do not overlap one another (S520).

The control unit 130 may then flush each group to a sorted strings table of the storage device 110 (S530).

Combinations of the blocks of the block diagrams and the steps of the flowcharts attached to the present invention may be performed by computer program instructions. Because these computer program instructions may be loaded onto an encoding processor of a general-purpose computer, a special-purpose computer, or other programmable data processing equipment, the instructions executed by the encoding processor of the computer or other programmable data processing equipment create means for performing the functions described in each block of the block diagram or each step of the flowchart. Because these computer program instructions may also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment to implement a function in a particular manner, the instructions stored in the computer-usable or computer-readable memory can also produce an article of manufacture containing instruction means that perform the functions described in each block of the block diagram or each step of the flowchart. Because the computer program instructions may also be loaded onto a computer or other programmable data processing equipment, a series of operational steps may be performed on the computer or other programmable data processing equipment to create a computer-executed process, so that the instructions executed on the computer or other programmable data processing equipment can also provide steps for executing the functions described in each block of the block diagram and each step of the flowchart.

In addition, each block or step may represent a module, a segment, or a portion of code that includes one or more executable instructions for carrying out the specified logical function(s). It should also be noted that, in some alternative embodiments, the functions mentioned in the blocks or steps may occur out of order. For example, two blocks or steps shown in succession may in fact be performed substantially concurrently, or may sometimes be performed in reverse order, depending on the functions involved.

The above description merely illustrates the technical idea of the present invention, and a person of ordinary skill in the art to which the present invention pertains will be able to make various modifications and variations without departing from the essential characteristics of the present invention. Accordingly, the embodiments disclosed herein are intended to explain, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present invention should be interpreted according to the claims below, and all technical ideas within a scope equivalent thereto should be construed as falling within the scope of the present invention.

10: memory system
110: storage device
120: memory
130: control unit

Claims

A memory system comprising:
a storage device including a plurality of sorted strings tables (Sorted Strings Table);
a memory including a MemTable having a predetermined first size, in which key-value pairs are sorted and stored; and
a control unit that uses a log-structured merge tree data structure, groups the MemTable into N groups whose key ranges do not overlap, and flushes each group to a sorted strings table so that the group is stored in the storage device,
wherein the control unit flushes, among the N groups, a group whose stored key-value pairs amount to a second size or larger to the sorted strings table, and initializes the stored size of that group to 0.
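The size-triggered flush recited above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the class name `GroupedMemTable`, the `boundaries` layout, measuring size as `len(key) + len(value)`, and the in-memory `sstables` list standing in for the storage device are all assumptions.

```python
from bisect import bisect_right


class GroupedMemTable:
    """MemTable split into key-range groups, each flushed on its own."""

    def __init__(self, boundaries, second_size):
        self.boundaries = boundaries      # smallest key of each later group
        self.second_size = second_size    # per-group flush threshold
        n = len(boundaries) + 1
        self.data = [{} for _ in range(n)]
        self.sizes = [0] * n              # stored size per group
        self.flush_counts = [0] * n
        self.sstables = []                # stand-in for the storage device

    def put(self, key, value):
        g = bisect_right(self.boundaries, key)
        old = self.data[g].get(key)
        if old is not None:
            # Overwrite: do not count the replaced pair twice.
            self.sizes[g] -= len(key) + len(old)
        self.data[g][key] = value
        self.sizes[g] += len(key) + len(value)
        if self.sizes[g] >= self.second_size:
            self._flush(g)

    def _flush(self, g):
        # Write the group as one sorted table, then reset its size to 0.
        self.sstables.append(sorted(self.data[g].items()))
        self.data[g] = {}
        self.sizes[g] = 0
        self.flush_counts[g] += 1


mt = GroupedMemTable(boundaries=["m"], second_size=8)
mt.put("a", "xxx")   # group 0 holds 4 bytes
mt.put("b", "xxx")   # group 0 reaches 8 bytes and is flushed
mt.put("z", "y")     # group 1 holds 2 bytes, no flush
```

Only the group that reaches the threshold is written out; the other groups keep buffering.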

The memory system of claim 1,
wherein the first size of the MemTable is 1 GB (gigabyte) or more.

(Deleted)

The memory system of claim 1,
wherein the second size is a value obtained by dividing the first size by N.
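As a worked example of this relation, write S_1 for the first size and S_2 for the second size (symbols introduced here for illustration). The values below are assumptions: S_1 is taken as 2^30 bytes and N = 8 is a hypothetical group count.

```latex
S_2 = \frac{S_1}{N}, \qquad
S_1 = 2^{30}\ \text{bytes},\ N = 8
\;\Rightarrow\;
S_2 = \frac{2^{30}}{8} = 2^{27}\ \text{bytes} = 128\ \text{MiB}
```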

The memory system of claim 1,
wherein the control unit merges or splits the groups based on the number of flushes of each group.

The memory system of claim 5,
wherein, when a first group is flushed at a predetermined first time point, the control unit compares a first flush count, which is the number of flushes of the first group, with a second flush count, which is the smallest of the flush counts of the N groups at the first time point, and merges or splits the corresponding groups.

The memory system of claim 6,
wherein, when the difference between the first flush count and the second flush count exceeds a predetermined value, the control unit splits the first group into a plurality of groups having different key ranges, and merges a second group having the second flush count with a group, among the N groups, whose key range is contiguous with that of the second group.
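The split-and-merge rule above can be sketched as follows, under several assumptions that the claims do not fix: keys are integers, each group is a hypothetical dict with a half-open range `[lo, hi)` and a `flushes` counter, the hot group is split into two halves at its midpoint, the merged group keeps the colder group's flush count, and `threshold` is non-negative.

```python
def rebalance(groups, flushed, threshold):
    """Return a new group list after one flush of groups[flushed].

    groups is ordered by key range. If the just-flushed (first) group has
    been flushed more than `threshold` times more often than the least
    flushed (second) group, the first group is split in two and the
    second group is merged with a key-range neighbour.
    """
    n = len(groups)
    cold = min(range(n), key=lambda i: groups[i]["flushes"])
    hot = groups[flushed]
    if cold == flushed or hot["flushes"] - groups[cold]["flushes"] <= threshold:
        return groups
    # Neighbours of the cold group, excluding the group being split.
    neighbours = [i for i in (cold - 1, cold + 1) if 0 <= i < n and i != flushed]
    if not neighbours or hot["hi"] - hot["lo"] < 2:
        return groups
    left, right = sorted((cold, neighbours[0]))
    merged = {"lo": groups[left]["lo"], "hi": groups[right]["hi"],
              "flushes": groups[cold]["flushes"]}
    mid = (hot["lo"] + hot["hi"]) // 2
    halves = [{"lo": hot["lo"], "hi": mid, "flushes": hot["flushes"]},
              {"lo": mid, "hi": hot["hi"], "flushes": hot["flushes"]}]
    out = []
    for i, g in enumerate(groups):
        if i == flushed:
            out.extend(halves)      # one hot group becomes two
        elif i == left:
            out.append(merged)      # two adjacent groups become one
        elif i != right:
            out.append(g)
    return out


groups = [{"lo": 0, "hi": 100, "flushes": 9},
          {"lo": 100, "hi": 200, "flushes": 1},
          {"lo": 200, "hi": 300, "flushes": 2}]
balanced = rebalance(groups, flushed=0, threshold=5)
```

In this sketch one split is paired with one merge, so the number of groups stays at N while the frequently written key range gets finer-grained groups.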

A memory chip comprising:
a storage device including a plurality of sorted strings tables (Sorted Strings Table);
a memory including a MemTable having a predetermined first size, in which key-value pairs are sorted and stored; and
a control unit that uses a log-structured merge tree data structure, groups the MemTable into N groups whose key ranges do not overlap, and flushes each group to a sorted strings table so that the group is stored in the storage device,
wherein the control unit flushes, among the N groups, a group whose stored key-value pairs amount to a second size or larger to the sorted strings table, and initializes the stored size of that group to 0.

A method of operating a memory system, performed by a memory system that stores data in a storage device using a log-structured merge tree data structure, the method comprising:
storing the data as key-value pairs in a MemTable having a predetermined first size;
grouping the MemTable into N groups whose key ranges do not overlap; and
flushing each of the grouped groups to a sorted strings table of the storage device,
wherein the flushing comprises flushing, among the N groups, a group whose stored key-value pairs amount to a second size or larger to the sorted strings table, and initializing the stored size of that group to 0.

A computer-readable recording medium storing a computer program, the computer program including instructions for causing a processor to perform a method comprising:
storing data as key-value pairs in a MemTable having a predetermined first size;
grouping the MemTable into N groups whose key ranges do not overlap; and
flushing each of the grouped groups to a sorted strings table of a storage device,
wherein the flushing comprises flushing, among the N groups, a group whose stored key-value pairs amount to a second size or larger to the sorted strings table, and initializing the stored size of that group to 0.

A computer program stored on a computer-readable recording medium, the computer program including instructions for causing a processor to perform a method comprising:
storing data as key-value pairs in a MemTable having a predetermined first size;
grouping the MemTable into N groups whose key ranges do not overlap; and
flushing each of the grouped groups to a sorted strings table of a storage device,
wherein the flushing comprises flushing, among the N groups, a group whose stored key-value pairs amount to a second size or larger to the sorted strings table, and initializing the stored size of that group to 0.