KR102275191B1

KR102275191B1 - Storage System and Method in Windows Operating Systems for the General-Purpose Data Storage

Info

Publication number: KR102275191B1
Application number: KR1020190138327A
Authority: KR
Inventors: 권혁윤
Original assignee: 서울과학기술대학교 산학협력단
Priority date: 2019-11-01
Filing date: 2019-11-01
Publication date: 2021-07-09
Also published as: KR20210052845A

Abstract

The present invention relates to a storage system and method in a Windows operating system for general-purpose data storage. The system comprises: a WR-Store created by mapping the components of a key-value store, in which each data item is represented as a key-value pair (k, v), to the components of the Windows registry; and a control unit that performs hash-based multi-level registry indexing and uses the native Windows API to perform any one of the Put, Get, or Delete operations on data stored in the WR-Store.

Description

Storage System and Method in Windows Operating Systems for the General-Purpose Data Storage

The present invention relates to a storage system and method in a Windows operating system for general-purpose data storage and, more particularly, to a technique that provides a general-purpose data store for storing and managing data without installing additional software, by using the data storage and API that the Windows operating system natively supports.

With the advent of big data, various types of data such as text, locations, graphs, and images are stored and managed in data stores. In big data applications, however, the data types generally cannot be determined in advance.

In a relational database, data types such as integer, date, and character are defined in the database schema, and only data conforming to the defined types can be stored. Diverse data of undetermined type therefore cannot be stored in a relational database.

A key-value store addresses this problem: any data can be represented as a key-value pair, which makes data management easy. A key-value store maps an entire data item to a value (for example, in string or binary form) and assigns a unique key to that value, so all data can be stored regardless of its type. The stored data can later be used by parsing the value according to the required data type.
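The idea of storing every item as a string value and parsing it on read can be illustrated with a minimal sketch. The dictionary-based store, the helper names `put` and `get`, and the choice of JSON as the string encoding are assumptions made for illustration only; they are not part of the described system.

```python
import json

# Hypothetical in-memory stand-in for a key-value store:
# every value is kept as a string, whatever its original type.
store = {}

def put(key, value):
    """Serialize any JSON-compatible value to a string and store it under key."""
    store[key] = json.dumps(value)

def get(key):
    """Return the raw stored string; the caller parses it as needed."""
    return store[key]

put("792479", "#Scotland")              # text data
put("2149", [-115.223125, 36.232915])   # location data as a pair of floats

# The reader decides how to interpret each stored string.
tag = json.loads(get("792479"))
lon, lat = json.loads(get("2149"))
```

The store itself never needs a schema; only the code that reads a value has to know which type to parse it into.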

Representative key-value stores include Redis, Google LevelDB, Facebook RocksDB, Oracle Berkeley DB, and Memcached. They can be classified into persistent stores and in-memory stores.

Persistent stores offer data-type flexibility relative to relational databases. That is, a key-value store can be used in place of a relational database to manage diverse and undetermined data types.

In-memory stores, in turn, provide efficiency by caching data held in persistent storage. LevelDB, RocksDB, and BerkeleyDB are classified as persistent key-value stores, while Redis and Memcached are classified as in-memory key-value stores.

An in-memory key-value store can act as a cache between an application and the underlying system, so it can work together with other systems to improve performance.

As described above, conventional data storage technologies (persistent and in-memory key-value stores) such as Facebook's RocksDB, Google's LevelDB, and Oracle's BerkeleyDB are built to run independently on top of the Windows operating system and therefore require a separate installation process before they can operate.

The software installed on the operating system is large and requires considerable storage space, and because it runs independently of the operating system, interaction between the software and the operating system takes considerable time.

Accordingly, the present applicant proposes a system that forms a general-purpose data store for storing and managing data without installing additional software, by using the data storage and API that the Windows operating system natively supports.

Korean Patent Application Publication No. 10-2009-0075691 (published July 8, 2009)

An object of the present invention is to build a lightweight key-value store using only the structures and native APIs built into Windows, without installing additional libraries or applications. This is done by mapping the components of a key-value store to the components of the Windows registry, organizing the data in the key-value store so that hash-based multi-level registry indexing is possible, and performing the Put, Get, and Delete operations on the stored data through registry operations using the API.

To achieve this technical object, an embodiment of the present invention is a storage system in a Windows operating system for general-purpose data storage, comprising: a WR-Store created by mapping the components of a key-value store, in which each data item is represented as a key-value pair (k, v), to the components of the Windows registry; and a control unit that performs hash-based multi-level registry indexing and uses the native Windows API to perform any one of the Put, Get, or Delete operations on data stored in the WR-Store.

Preferably, the system further comprises a migration unit that converts the data stored in the WR-Store so that it can be recognized in other operating system environments that support key-value stores.

A storage method in a Windows operating system for general-purpose data storage according to an embodiment of the present invention, based on the system described above, comprises: step (a), in which the control unit creates the WR-Store by mapping the components of a key-value store to the components of the Windows registry; step (b), in which the control unit reorganizes the key-value data stored in the WR-Store into a hash-based multi-level index structure; and step (c), in which the control unit uses the native Windows API to perform any one of the Put, Get, or Delete operations on data stored in the WR-Store.

According to the present invention as described above, the components of a key-value store are mapped to the components of the Windows registry, and a WR-Store is provided that performs any one of the Put, Get, or Delete operations on the stored data through registry operations using the API. This makes it possible to build a lightweight key-value store using only the structures and native APIs built into Windows, without installing additional libraries or applications.

In addition, the WR-Store according to the present invention becomes far more efficient than other key-value stores as the size of the data set grows, and through migration it can also be applied to other operating systems, which gives it high extensibility.

FIG. 1 is a diagram showing the logical structure presented by the Windows registry editor.
FIG. 2 is a diagram illustrating the concept of the WR-Store of a storage system in a Windows operating system for general-purpose data storage according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating the registry path of the hash-based multi-level registry index of the WR-Store in the storage system according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating the WR-Store processing algorithm of the storage system according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating the performance of the WR-Store as WRD increases while WRL is fixed to 1 in the storage system according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating the performance of the WR-Store as WRL increases while WRD is fixed to 1 in the storage system according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating the change in performance of the WR-Store as WRD and WRL vary in the storage system according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating the processing times of 1D_4L, 2D_2L, and 4D_1L as the number of consecutive operations k varies from 1 to 1 million in the storage system according to an embodiment of the present invention.
FIG. 9 is a diagram illustrating the throughput on the synthetic data set KVData3 as k increases from 1 to 1 million in the storage system according to an embodiment of the present invention.
FIG. 10 is a diagram illustrating the throughput on synthetic data sets as the data set size increases from 10,000 to 10 million in the storage system according to an embodiment of the present invention.
FIG. 11 is a diagram illustrating the throughput on a real data set as k increases from 1 to 1 million in the storage system according to an embodiment of the present invention.
FIG. 12 is a diagram illustrating the throughput on various real data sets in the storage system according to an embodiment of the present invention.
FIG. 13 is a diagram illustrating the change in performance as p increases in the storage system according to an embodiment of the present invention.
FIG. 14 is a diagram illustrating details of data extracted from a WR-Store that stores ID-HashTag, a real data set, in the storage system according to an embodiment of the present invention.
FIG. 15 is a diagram illustrating the algorithm of the E-and-T method of the storage system according to an embodiment of the present invention.
FIG. 16 is a diagram illustrating a performance analysis of E-then-T and E-and-T in the storage system according to an embodiment of the present invention.
FIG. 17 is a block diagram illustrating a storage system in a Windows operating system for general-purpose data storage according to an embodiment of the present invention.
FIG. 18 is a flowchart illustrating a storage method in a Windows operating system for general-purpose data storage according to an embodiment of the present invention.

The specific features and advantages of the present invention will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. Terms and words used in this specification and the claims should be interpreted with meanings and concepts consistent with the technical idea of the present invention, on the principle that an inventor may appropriately define the concept of a term in order to describe the invention in the best way. Detailed descriptions of well-known functions and configurations related to the present invention are omitted where they would unnecessarily obscure the gist of the invention.

FIG. 1 shows the logical structure presented by the Windows registry editor. As shown in FIG. 1, the left pane shows the registry keys, which are organized as a tree; the path of the key shown is "Computer\HKEY_CURRENT_USER\Control Panel\Keyboard". Each key can have multiple subkeys and values, and a parent-child relationship holds between a key and its subkeys.

The right pane shows the values of each key. Each value consists of three attributes: name, type, and data. The Name attribute uniquely identifies a value within the same key, the Type attribute determines the type of the value, and the Data attribute holds the actual value, which conforms to the specified type.

Representative value types are REG_BINARY for binary data, REG_DWORD for numbers, and REG_SZ for strings. The registry can therefore manage various types of data, including numbers, strings, and binary data.
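The three attributes of a registry value and the three representative types can be modeled in a few lines. This is an illustrative sketch only: the `RegistryValue` tuple and the `make_value` helper are hypothetical names, and only the numeric type codes (1, 3, 4) are taken from the Windows SDK definitions.

```python
from typing import NamedTuple, Union

# Type codes as defined in the Windows SDK headers.
REG_SZ = 1      # null-terminated string
REG_BINARY = 3  # arbitrary binary data
REG_DWORD = 4   # 32-bit number

class RegistryValue(NamedTuple):
    """Hypothetical model of one registry value: name, type, data."""
    name: str
    type: int
    data: Union[str, bytes, int]

def make_value(name, data):
    """Pick a registry type for a Python value (illustrative mapping)."""
    if isinstance(data, bool):
        raise TypeError("bool is not mapped in this sketch")
    if isinstance(data, str):
        return RegistryValue(name, REG_SZ, data)
    if isinstance(data, (bytes, bytearray)):
        return RegistryValue(name, REG_BINARY, bytes(data))
    if isinstance(data, int) and 0 <= data <= 0xFFFFFFFF:
        return RegistryValue(name, REG_DWORD, data)
    raise TypeError(f"unsupported data for registry value: {data!r}")
```

For example, `make_value("792479", "#Scotland")` yields a value of type `REG_SZ`, which is the type the WR-Store description relies on for storing arbitrary data as strings.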

The Windows registry has the following root keys: (1) HKEY_LOCAL_MACHINE (hereinafter 'HKLM'), (2) HKEY_CLASSES_ROOT (hereinafter 'HKCR'), (3) HKEY_CURRENT_CONFIG (hereinafter 'HKCC'), (4) HKEY_USERS (hereinafter 'HKU'), and (5) HKEY_CURRENT_USER (hereinafter 'HKCU').

HKLM maintains system information, and HKCR and HKCC are symbolic links to specific subkeys of HKLM. HKU maintains information about all users of the system, and HKCU is a symbolic link to the current user's information in HKU. For security reasons, information that requires restricted privileges (for example, administrator) is generally stored in HKLM, and other information that can be accessed without such restrictions (for example, by the logged-in user) is stored in HKCU.

To store data persistently, the Windows registry is kept in special physical locations called hives, and the operating system synchronizes the in-memory data with the hives. [Table 1] lists representative APIs for manipulating the Windows registry.

[Table 1]

These APIs basically create new information in the registry, query existing information from the registry, and delete that information. Executables built using these APIs work on all versions of Windows.

The configuration of the key-value store (hereinafter 'WR-Store') of the storage system in a Windows operating system for general-purpose data storage according to an embodiment of the present invention is described below.

The WR-Store is designed by mapping the components of a key-value store to the components of the Windows registry; the mapping is performed as shown in [Table 2].

As shown in FIG. 2, the WR-Store consists of (1) a key-value data storage area and (2) an index structure area. The right pane of FIG. 2 shows the key-value data storage of the WR-Store, in which multiple values are stored under a registry key.

In the example shown in FIG. 2, the registry key "Computer\HKCU\kvs\fe\40\8a\96" holds two key-value pairs. One pair has the key "792479" and the associated value "#Scotland". The other pair has the key "2149" and the value "(-115.223125,36.232915)".

The left pane of FIG. 2 shows the index structure of the WR-Store, which uses the keys and subkeys of the Windows registry as the index. As a result, the key-value data is stored in the leaf subkey of a registry path.

For example, in FIG. 2, "fe", "40", "8a", and "96" form the path for accessing the key-value data of the keys "792479" and "2149". Because a registry key can hold multiple values, the actual key-value pair is identified by the Name attribute of the registry value. The names of the keys and subkeys on the registry path are derived from the given key.

A new database for the WR-Store is created under a specific root registry key. Because a logged-in user can access HKCU but only an administrator can access HKLM, the root registry key can be chosen according to the desired security level of the key-value store. The database is therefore created under one of two paths: (1) "Computer\HKLM\DBName" or (2) "Computer\HKCU\DBName". In FIG. 2, the root key is HKCU and the database name is "kvs".

The hash-based multi-level registry indexing procedure is described below.

To build the index structure for the WR-Store, the names of the keys and subkeys on the registry path must be determined. A simple approach is to use the key of the key-value store directly as the key in the Windows registry.

For example, if the key "2149" from FIG. 2 is used directly as the registry key, the registry path of the key "2149" becomes "Computer\HKCU\kvs\2149". In this case the WR-Store has a one-level index structure. A scalable index, however, requires a multi-level structure.

To create a multi-level structure, substrings of the key in the key-value store can be used as the keys and subkeys in the Windows registry. For example, the key "2149" is split into two substrings, "21" and "49"; "21" becomes the key of the first-level registry structure and "49" becomes the key of the second-level registry structure. The resulting registry path is "Computer\HKCU\kvs\21\49".
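The substring-based path construction can be sketched as follows. The function name `substring_path` and its parameters are hypothetical; the database path "Computer\HKCU\kvs" is the one used in the text's example.

```python
def substring_path(key: str, level_len: int,
                   db_path: str = r"Computer\HKCU\kvs") -> str:
    """Split key into chunks of level_len characters and use each chunk
    as one level of the registry path (illustrative sketch)."""
    if level_len < 1:
        raise ValueError("level_len must be at least 1")
    parts = [key[i:i + level_len] for i in range(0, len(key), level_len)]
    return "\\".join([db_path, *parts])

two_level = substring_path("2149", 2)   # two levels: "21" then "49"
one_level = substring_path("2149", 4)   # the whole key as a single level
```

With `level_len` equal to the key length the function degenerates to the one-level structure, which makes the relationship between the two schemes explicit.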

The procedure for distributing key-value data uniformly over the registry keys of the multi-level structure is described below.

A hash function is used to determine the registry path from a given key of the key-value store. A hash function distributes data uniformly regardless of the distribution of the original keys. Hash collisions are not a problem in the WR-Store, because the Windows registry can store multiple values under the same registry key and each value under a registry key can be identified by its Name attribute.

In the embodiment of the present invention, MD5 is used as the hash function and the hash result is expressed in hexadecimal. To create the multi-level structure, the hash result is split into several substrings, and each substring is used as the registry key for one level of the structure.

FIG. 3 shows the registry path of the hash-based multi-level registry index of the WR-Store. The registry path used for indexing consists of WRD hash values; each hash value i (1 ≤ i ≤ WRD) has length WRL and is used for the i-th level of the index in the Windows registry. Here WRD is defined as the registry depth and WRL as the length of each subkey on the registry path, so the total length of the registry path used for indexing is WRL × WRD.

[Table 3] shows examples of WR-Store registry paths. For example, suppose the key is "2149", MD5("2149") = "fe40a2892ab382cd", WRD is 2, WRL is 2, and the database name is "kvs". The first two characters of the hash result, "fe", become the first-level registry key, and the next two characters, "40", become the second-level registry key. The final registry path is therefore "Computer\HKCU\kvs\fe\40". If WRD and WRL change, the registry path changes as well, as shown in [Table 3], producing an entirely different index structure.
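The path derivation in this example can be sketched in a few lines. The helper names `split_digest` and `registry_path`, the default root "HKCU" and database name "kvs", and the UTF-8 encoding of the key before hashing are assumptions made for illustration. The splitting helper is exercised with the digest quoted in the text; `hashlib.md5` itself returns a 32-character hexadecimal digest, so its output for "2149" is not asserted here.

```python
import hashlib

def split_digest(hex_digest: str, wrd: int, wrl: int) -> list:
    """Return the first WRD substrings of length WRL from the hash result."""
    if wrd < 1 or wrl < 1:
        raise ValueError("WRD and WRL must be at least 1")
    if wrd * wrl > len(hex_digest):
        raise ValueError("WRL x WRD exceeds the digest length")
    return [hex_digest[i * wrl:(i + 1) * wrl] for i in range(wrd)]

def registry_path(key: str, wrd: int, wrl: int,
                  root: str = "HKCU", db_name: str = "kvs") -> str:
    """Hash the key with MD5 and build the multi-level registry path."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()  # assumed encoding
    return "\\".join(["Computer", root, db_name,
                      *split_digest(digest, wrd, wrl)])

# Using the digest quoted in the text with WRD = 2 and WRL = 2:
levels = split_digest("fe40a2892ab382cd", 2, 2)
```

Changing the parameters changes the shape of the index: with the same digest, WRD = 1 and WRL = 4 gives the single level "fe40", while WRD = 4 and WRL = 1 gives four one-character levels.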

The hash results of the keys "214910" and "214911" are completely different even though the original keys are similar. Key-value data can therefore be distributed uniformly across the Windows registry.
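The uniformity claim can be checked informally with a short experiment. This is a rough sketch, not the evaluation from the document: the key range, the bucket width of one hexadecimal character, and the helper names are all assumptions chosen for illustration.

```python
import hashlib
from collections import Counter

def md5_hex(key: str) -> str:
    """MD5 of the key as a hexadecimal string (UTF-8 encoding assumed)."""
    return hashlib.md5(key.encode("utf-8")).hexdigest()

def bucket(key: str, prefix_len: int = 1) -> str:
    """First prefix_len hex characters, i.e. the first-level subkey name."""
    return md5_hex(key)[:prefix_len]

# 16,000 consecutive numeric keys, hashed into 16 one-character buckets.
keys = [str(n) for n in range(200000, 216000)]
counts = Counter(bucket(k) for k in keys)

# Two adjacent keys from the text end up with different digests.
different = md5_hex("214910") != md5_hex("214911")
```

Although the keys are consecutive integers, every one of the 16 buckets receives a share close to the average of 1,000 keys, which is the property the index relies on.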

[Table 3]

The processing algorithms for the WR-Store are described below.

FIG. 4 shows the processing algorithms of the WR-Store. RootKey and DBName are determined when the database is created. RootKey is the root key of the registry and is determined by the security level; DBName is the database name and is a subkey of RootKey.

The Get() operation takes the key of a key-value pair (key_in) as input and returns the value of the matching pair (value_out) as its result. It first calls RegOpenKeyEx(), passing RootKey and DBName as input, and obtains the handle hKey of the specified registry key. It then calls RegQueryValueEx(), passing hKey and key_in as input, and obtains value_out, which is the final result of the Get() operation. Finally, it calls RegCloseKey() to release the allocated resources.

The Put() operation takes a key-value pair, key_in and value_in, as input and stores it as a new key-value pair. It first calls RegCreateKey(), passing RootKey and DBName. If the given DBName has not yet been created as a registry key under RootKey (that is, when data is inserted for the first time), the call creates the registry key DBName under RootKey.

Otherwise (that is, when DBName has already been created), the call opens the specified registry key. In both cases it returns the handle hKey of the registry key. Put() then calls RegSetValueEx() with hKey, key_in, REG_SZ, value_in, and sizeof(value_in) as input to store the new key-value pair in the registry. Because REG_SZ, the value type for strings, is used, data of any type can be stored as a string. Finally, it calls RegCloseKey() to release the allocated resources.

The Delete() operation takes the key of a key-value pair (key_in) as input and deletes that pair. It first calls RegOpenKeyEx(), passing RootKey and DBName as input, and obtains the handle hKey of the specified registry key. It then calls RegDeleteKeyValue(), passing RootKey, DBName, and key_in as input, to delete the key-value pair with the specified key. Finally, it calls RegCloseKey() to release the allocated resources.
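The three operations can be traced with a small in-memory model. This sketch does not call the Windows API: the class `InMemoryWRStore` is a hypothetical stand-in in which a dictionary of leaf paths plays the role of registry keys, and the comments name the API call that each step corresponds to in the description above. Combining the operations with the hash-based path (WRD levels of WRL characters, UTF-8 key encoding) is an assumption of this sketch.

```python
import hashlib

class InMemoryWRStore:
    """Hypothetical model of WR-Store: leaf registry keys hold named values."""

    def __init__(self, wrd: int = 2, wrl: int = 2):
        self.wrd, self.wrl = wrd, wrl
        # registry path (tuple of subkey names) -> {value name: string data}
        self.leaves = {}

    def _path(self, key_in: str) -> tuple:
        digest = hashlib.md5(key_in.encode("utf-8")).hexdigest()
        return tuple(digest[i * self.wrl:(i + 1) * self.wrl]
                     for i in range(self.wrd))

    def put(self, key_in: str, value_in) -> None:
        # RegCreateKey: create the key if it is missing, otherwise open it.
        leaf = self.leaves.setdefault(self._path(key_in), {})
        # RegSetValueEx with REG_SZ: the value is stored as a string.
        leaf[key_in] = str(value_in)
        # RegCloseKey: nothing to release in this model.

    def get(self, key_in: str):
        # RegOpenKeyEx: look up the key; None models "key not found".
        leaf = self.leaves.get(self._path(key_in))
        if leaf is None:
            return None
        # RegQueryValueEx: the value name is the original key.
        return leaf.get(key_in)

    def delete(self, key_in: str) -> bool:
        # RegOpenKeyEx followed by RegDeleteKeyValue.
        leaf = self.leaves.get(self._path(key_in))
        if leaf is None:
            return False
        return leaf.pop(key_in, None) is not None

kvs = InMemoryWRStore(wrd=2, wrl=2)
kvs.put("792479", "#Scotland")
kvs.put("2149", "(-115.223125,36.232915)")
```

Because the value name is the original key, two keys whose hashes share the same path prefix simply become two values under the same leaf, which mirrors why hash collisions are harmless in this design.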

The experimental data and environment for the storage system in a Windows operating system for general-purpose data storage according to an embodiment of the present invention are described below.

The performance of the WR-Store according to an embodiment of the present invention was compared with that of representative key-value stores: Facebook's RocksDB, Oracle's BerkeleyDB, and Google's LevelDB. The Get(), Put(), and Delete() operations of the WR-Store were implemented with the algorithms shown in FIG. 4, based on the native Windows API. For RocksDB, the library for Windows was built according to the instructions provided by RocksDB (https://github.com/facebook/rocksdb/wiki/Building-on-Windows).

The RocksDB source code released on the master branch on GitHub (https://github.com/facebook/rocksdb) was used; the version was 5.14. For BerkeleyDB, the library for Windows was built according to the documentation provided by Oracle (https://docs.oracle.com/database/bdb181/index.html), using the version 18.1 source code published on the official Oracle website (http://www.oracle.com/technetwork/database/database-technologies/berkeleydb/downloads/index.html). For LevelDB, the library for Windows was built from the source code released on GitHub for the Windows version of LevelDB (https://github.com/ren85/leveldb-windows).

For fairness, C++ was used for all methods, with Visual Studio 15.0 on 64-bit Windows 10. To confirm that WR-Store can run in any Windows environment, the experiments were conducted in a different environment with a different Windows version.

All experiments were performed on a system with a quad-core Intel Xeon E5-2660 2.0 GHz processor, 8 GB of main memory, and 64-bit Windows Server 2016. To minimize interference from other processes running on Windows, each experiment was performed immediately after the system was booted.

[Table 4] shows the characteristics of the data sets.

Eight data sets were used, including synthetic and real data sets.

The synthetic data sets are used to measure the scalability of the key-value stores as the data set size increases. Each key-value pair consists of a unique number as the key and a random string of variable length as the value. Four data sets of different sizes were generated: (1) KVData1, (2) KVData2, (3) KVData3, and (4) KVData4.

[Table 4]

Four real data sets were crawled from Twitter. To crawl real tweets, the Tweepy library (https://github.com/tweepy/tweepy) was used with different filtering conditions and post-processing for each data set. The data sets were designed to have different data types, such as geographic location, text, and integer.

[Table 5] shows samples of the real data sets. (1) The ID-Geo data set consists of tweet IDs and the location information of the tweets; (2) the ID-HashTag data set consists of tweet IDs and the hashtags of the tweets; (3) the ID-Tweet data set consists of tweet IDs and tweet texts; and (4) the User-Follower data set consists of user IDs and the number of followers of each user.

[Table 5]

To evaluate the performance of the key-value stores, three kinds of workloads are measured: (1) read operations, (2) delete operations, and (3) write operations. To measure each workload, the corresponding basic operation is performed on the key-value store: Get() for reads, Delete() for deletes, and Put() for writes.

To observe how performance changes as the number of consecutive operations k increases, k is varied from 1 to 1 million for each data set. For small k, the initial cost of the key-value store accounts for a large share of the processing time; for large k, it accounts for a small share. The processing time was measured in microseconds (μs) with the Windows API QueryPerformanceCounter() and converted into operations per second (OPS) for presentation.
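The conversion from counter readings to OPS can be sketched as follows. This is a minimal illustration, not the measurement code used in the experiments: the function names are hypothetical, and the tick frequency is assumed to come from QueryPerformanceFrequency(), which reports the ticks per second of the counter read by QueryPerformanceCounter(). Only the arithmetic is shown, so the sketch compiles without the Windows headers.

```cpp
#include <cassert>
#include <cstdint>

// Convert a tick delta from a high-resolution counter into microseconds.
// `frequency` is the number of ticks per second (assumed to be obtained
// from QueryPerformanceFrequency() on Windows).
double elapsed_microseconds(std::int64_t start_ticks, std::int64_t end_ticks,
                            std::int64_t frequency) {
    assert(frequency > 0 && end_ticks >= start_ticks);
    return static_cast<double>(end_ticks - start_ticks) * 1e6 /
           static_cast<double>(frequency);
}

// Convert the time taken by k consecutive operations into
// operations per second (OPS).
double operations_per_second(std::int64_t k, double elapsed_us) {
    assert(k > 0 && elapsed_us > 0.0);
    return static_cast<double>(k) * 1e6 / elapsed_us;
}
```

For instance, 1,000 operations completed in 500,000 μs correspond to 2,000 OPS.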

Hereinafter, the size of the executable file of the storage system in the Windows operating system for general-purpose data storage according to an embodiment of the present invention is described.

Because WR-Store uses the built-in structure and API of Windows, it is inherently lightweight. To confirm this, the executable file sizes of the key-value stores were measured; the results are shown in [Table 6].

The executables of the other key-value stores include a large amount of source code, as well as third-party libraries such as zlib, snappy, and lz4, in order to support a full-featured key-value store. In contrast, WR-Store uses only the built-in structure and native API of Windows.

As a result, the executable of WR-Store is very small compared with those of the other key-value stores: it is 153.73 times smaller than that of RocksDB, 72.69 times smaller than that of BerkeleyDB, and 17.77 times smaller than that of LevelDB.

[Table 6]

Hereinafter, an empirical analysis of WR-Store in the storage system in the Windows operating system for general-purpose data storage according to an embodiment of the present invention is described.

The performance of WR-Store depends on the internal structure of the Windows registry. It is affected by two parameters: (1) the depth of the registry, WRD, and (2) the length of each subkey in the registry, WRL. Each setting of WR-Store is denoted xD_yL, where x is the value of WRD and y is the value of WRL. KVData2 is used as the data set and OPS (operations per second) is measured, with the number of consecutive operations fixed at 1,000.
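To make the xD_yL notation concrete, the sketch below derives a registry subkey path with WRD levels, each WRL characters long, from the hash of a data key. It is an illustration under stated assumptions rather than the patented indexing scheme: the choice of the 32-bit FNV-1a hash, the use of hexadecimal digits as subkey characters, and the names fnv1a and index_path are all assumptions introduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// 32-bit FNV-1a hash. The hash function is an assumed choice for this sketch.
std::uint32_t fnv1a(const std::string& key) {
    std::uint32_t h = 2166136261u;   // FNV offset basis
    for (unsigned char c : key) {
        h ^= c;
        h *= 16777619u;              // FNV prime
    }
    return h;
}

// Build the subkey path for one data key under an xD_yL setting:
// `wrd` nested levels, each subkey `wrl` hexadecimal characters long,
// taken from the low-order digits of the hash.
std::string index_path(const std::string& key, int wrd, int wrl) {
    assert(wrd >= 1 && wrl >= 1 && wrd * wrl <= 8);  // 8 hex digits available
    static const char* kHex = "0123456789abcdef";
    std::uint32_t h = fnv1a(key);
    std::string path;
    for (int level = 0; level < wrd; ++level) {
        if (level > 0) path += '\\';                 // registry path separator
        for (int i = 0; i < wrl; ++i) {
            path += kHex[h & 0xFu];
            h >>= 4;
        }
    }
    return path;
}
```

Under this sketch, the setting 2D_2L yields paths of the form `xx\xx`, 1D_4L yields a single four-character subkey, and 4D_1L yields four nested one-character subkeys. In all three cases WRD × WRL is 4, so the number of distinct leaf keys is the same and only the shape of the tree differs.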

FIG. 5 shows the performance of WR-Store as WRD increases with WRL fixed at 1, and FIG. 6 shows the performance of WR-Store as WRL increases with WRD fixed at 1.

In both experiments, WR-Store performs relatively better when WRL × WRD is 2, 3, or 4 than in the other cases. WRL × WRD determines the number of values under a registry key, and it can be concluded that it is a more important factor for the performance of WR-Store than WRL or WRD individually. Therefore, the most efficient value of WRL × WRD should be found first, and appropriate values of WRL and WRD should then be chosen for that product.

Hereinafter, the performance tuning of WR-Store in the storage system in the Windows operating system for general-purpose data storage according to an embodiment of the present invention is described.

The empirical analysis indicates that, for a given data set, WR-Store has a parameter setting that yields the best performance.

FIG. 7 shows how the performance of WR-Store changes with WRD × WRL. It shows the processing time of WR-Store for WRD × WRL values of 2, 4, and 6 as the number of consecutive operations k varies from 1 to 1 million, using KVData as the data set.

As WRD × WRL increases, the number of distinct registry keys increases, so each key holds fewer values. When WRD × WRL is small, many values are stored under the same key, so processing those values after the specified registry key has been located is costly. Conversely, when WRD × WRL is large, each key holds only a few values (for example, one value per key), but many subkeys exist, so a large registry index is required and accessing it is costly. Consequently, there is an optimal value of WRD × WRL at which WR-Store performs best.

The experimental results show that the performance of WR-Store fluctuates considerably as WRD × WRL changes, and that 2D_2L gives the most stable and efficient performance.

The performance of WR-Store also varies with WRL and WRD individually. Based on the previous result, the total length of the registry path used for indexing, WRL × WRD, is fixed at 4. WRL and WRD are then varied to find their best combination. In this embodiment, three combinations of WRL and WRD were derived: (1) 1D_4L, (2) 2D_2L, and (3) 4D_1L.

FIG. 8 shows the processing times of 1D_4L, 2D_2L, and 4D_1L as the number of consecutive operations k varies from 1 to 1 million, using KVData as the data set. Although the overall performance difference is smaller than when WRD × WRL changes, there is still a meaningful performance difference when WRL and WRD change.

For read operations, 1D_4L is the most efficient: when k is 1 million, 1D_4L outperforms 2D_2L by 2.21 times and 4D_1L by 1.81 times.

For write operations, 1D_4L is slightly more efficient than the other settings: when k is 1 million, 1D_4L outperforms 2D_2L by 1.64 times and 4D_1L by 1.39 times.

For delete operations, 1D_4L is also the most efficient: when k is 1 million, 1D_4L outperforms 2D_2L by 1.55 times and 4D_1L by 1.55 times.

The two experiments confirm that the performance of WR-Store fluctuates considerably when WRL and WRD change (especially WRD × WRL). Therefore, the best parameter setting for WR-Store, namely 1D_4L, was used in the remaining experiments on the synthetic data sets.

Performance was also evaluated on the four real data sets to derive the best parameters. [Table 7a] and [Table 7b] show the throughput measured on the real data sets as WRL and WRD change.

Here, the average throughput is computed simply by assigning equal weights to the read, write, and delete operations, and the best parameter setting for each data set is shown in bold. For the remaining experiments on the real data sets, the most suitable parameter setting for each data set, as shown in [Table 7a] and [Table 7b], is used.
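The selection rule described above, an equal-weight average over the three operations followed by picking the setting with the highest average, can be sketched as follows. The struct and function names are hypothetical, and no values from [Table 7a] or [Table 7b] are reproduced here; the caller supplies the measured throughputs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Measured throughput (OPS) of one parameter setting such as "1D_4L".
struct SettingResult {
    std::string setting;
    double read_ops;
    double write_ops;
    double delete_ops;
};

// Equal weights for read, write, and delete operations.
double average_throughput(const SettingResult& r) {
    return (r.read_ops + r.write_ops + r.delete_ops) / 3.0;
}

// Return the name of the setting with the highest average throughput.
// On a tie, the setting listed first wins.
std::string best_setting(const std::vector<SettingResult>& results) {
    assert(!results.empty());
    const SettingResult* best = &results.front();
    for (const SettingResult& r : results) {
        if (average_throughput(r) > average_throughput(*best)) best = &r;
    }
    return best->setting;
}
```

A workload dominated by one kind of operation could use unequal weights instead; the equal weighting here simply mirrors the description above.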

[Table 7a]

[Table 7b]

Hereinafter, the experimental results on the synthetic data sets for the storage system in the Windows operating system for general-purpose data storage according to an embodiment of the present invention are described.

FIG. 9 shows the throughput on the synthetic data set KVData3 as k increases from 1 to 1 million.

As the number of consecutive operations k increases, the throughput results show that WR-Store delivers constant performance regardless of k, whereas RocksDB, BerkeleyDB, and LevelDB do not. Consequently, WR-Store is particularly efficient for small k, that is, from 1 to 1,000. In this range, WR-Store outperforms RocksDB by 59.25 to 9479.95 times for read operations, by 1.59 to 6182.61 times for write operations, and by 2.36 to 5978.49 times for delete operations.

WR-Store also outperforms BerkeleyDB by 2.09 to 10.24 times for read operations, by 1.32 to 38.72 times for write operations, and by 15.28 to 54.14 times for delete operations. It outperforms LevelDB by 25.29 to 1091.79 times for read operations, by 1.78 to 695.29 times for write operations, and by 3.76 to 642.65 times for delete operations. This is because the other key-value stores incur a high initial cost before they can perform operations, so they are considerably less efficient than WR-Store when k is small. WR-Store has the advantage of requiring no initial cost, because the Windows registry is always ready.

As k grows, there are some cases in which the other key-value stores are more efficient than WR-Store. In particular, when k is 1 million, RocksDB outperforms WR-Store by 5.03 times for read operations, by 14.02 times for write operations, and by 9.10 times for delete operations.

LevelDB also outperforms WR-Store in some cases when k is large: when k is 100,000, LevelDB is more efficient than WR-Store by 9.16 times for write operations and by 4.89 times for delete operations. For read operations, however, WR-Store consistently outperforms LevelDB for all values of k. For write and delete operations, WR-Store outperforms LevelDB again when k becomes larger (that is, 1 million).

WR-Store is also comparable to BerkeleyDB even for large k. When k is 1 million, WR-Store is 44.34 % more efficient than BerkeleyDB for delete operations, while BerkeleyDB is more efficient than WR-Store by 83.09 % for read operations and by 5.31 % for delete operations.

It is also important to check the performance of loading additional data into WR-Store when data is already stored. With the data set KVData3 already stored, the performance of writing additional data was measured as its size increased from 1 to 1 million. The results show that the performance of WR-Store remains very constant even as the size of the newly inserted data increases, which means that WR-Store is stable under additional data loads.

FIG. 10 shows the throughput on the synthetic data sets as the data set size increases from 10,000 (KVData1) to 10 million (KVData4).

In this experiment, the number of consecutive operations was fixed at 10,000 while the data set size was increased. The most important finding is the scalability of WR-Store: its performance stays fairly constant as the data set grows, whereas the performance of the other key-value stores degrades substantially.

Measured as the performance degradation on KVData4 relative to KVData1, the performance of WR-Store degrades by 1.05 times for read operations, 2.30 times for delete operations, and 16.25 times for write operations.

The corresponding degradations are 52.11 times, 55.02 times, and 49.63 times for RocksDB; 4365.17 times, 230.23 times, and 1.48 times for BerkeleyDB; and 239.80 times, 725.28 times, and 766.18 times for LevelDB. BerkeleyDB thus scales for write operations, but its performance degradation for read and delete operations is far more severe.

Hereinafter, the experimental results on the real data sets for the storage system in the Windows operating system for general-purpose data storage according to an embodiment of the present invention are described.

FIG. 11 shows the throughput on a real data set as k increases from 1 to 1 million. ID-Geo was used as the data set for this experiment on throughput as the number of consecutive operations k increases.

The overall trend is similar to the results for the synthetic data set shown in FIG. 9: the performance of WR-Store is fairly constant, whereas the other key-value stores incur a high initial cost. WR-Store is therefore much more efficient for small k, and RocksDB and LevelDB outperform WR-Store only for some large values of k.

FIG. 12 shows the throughput on the various real data sets; for this experiment, the number of consecutive operations was fixed at 10,000.

For read and delete operations, WR-Store outperforms the other key-value stores on the larger data sets (for example, ID-Geo and ID-Tweet). For write operations, BerkeleyDB and LevelDB show fairly constant performance, but WR-Store still performs comparably.

Hereinafter, a stress test under an intensive registry workload for the storage system in the Windows operating system for general-purpose data storage according to an embodiment of the present invention is described.

Windows applications constantly access the registry to store and retrieve data. Because WR-Store according to an embodiment of the present invention is based on the registry, it is important to observe its performance under an intensive registry workload.

First, the registry accesses of other processes while WR-Store is running were analyzed. All registry operations occurring in the Windows operating system were collected with Process Monitor (https://docs.microsoft.com/en-us/sysinternals/downloads/procmon), a tool provided by Microsoft that monitors all I/O events in Windows, including registry accesses.

[Table 8] summarizes the registry operations collected by Process Monitor: more than 3.8 million registry operations were collected in 626.7 seconds. This shows that Windows processes run alongside WR-Store and actively access the registry. Across all processes, the average time interval between registry operations is 12.96 milliseconds per process.

[Table 9] lists, in order, the top 10 processes accessing the registry. For these top 10 processes, the average time interval between registry operations is 1.95 milliseconds.

[Table 8]

[Table 9]

The performance of WR-Store was then measured under an intensive registry workload. Each synthetic process was written to access the registry once every 1.95 milliseconds, which equals the average access frequency of the top 10 processes accessing the registry.
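A synthetic process of this kind reduces to a paced loop. The sketch below is a portable illustration only: the function name run_paced is hypothetical, and the registry access itself is passed in as a callback, since the source does not specify which registry call each synthetic process issues.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Invoke `access` `count` times, one call per `interval`, and return the
// number of calls made. Sleeping until absolute deadlines keeps the pacing
// from drifting when an individual call runs long.
int run_paced(std::chrono::microseconds interval, int count,
              const std::function<void()>& access) {
    assert(count >= 0 && interval.count() > 0);
    auto next = std::chrono::steady_clock::now();
    int calls = 0;
    for (int i = 0; i < count; ++i) {
        access();                             // placeholder for one registry access
        ++calls;
        next += interval;
        std::this_thread::sleep_until(next);  // wait for the next slot
    }
    return calls;
}
```

Calling `run_paced(std::chrono::microseconds(1950), n, access)` approximates the 1.95 millisecond interval described above; running p such processes concurrently gives the workload used in the stress test.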

The number of concurrently running processes p was then varied from 1 to 1,000. FIG. 13 shows how the performance changes as p increases. ID-Geo, a real data set, was used, and the number of consecutive operations was fixed at 10,000. The results show that even when 1,000 processes that access the registry intensively are running concurrently, the performance of WR-Store is sustained, just like that of the other key-value stores, which do not depend on the registry.

Hereinafter, an efficient method for ETL (Extract-Transform-Load) from WR-Store in the storage system in the Windows operating system for general-purpose data storage according to an embodiment of the present invention is described.

Because WR-Store is based on the Windows registry, it is portable only to environments running a Windows operating system. For environments without a Windows operating system, a method for migrating the data in WR-Store is required.

A key-value store handles data in a very simple format consisting of key-value pairs. WR-Store, however, stores data in a multi-level structure based on the Windows registry. The data must therefore be extracted from the multi-level structure and transformed into key-value pairs.

First, the data can be extracted from the registry path of WR-Store into a file, and the contents of the file can then be transformed into key-value pairs. (Hereinafter, this method is referred to as "E-then-T".) The E-then-T method relies on the built-in export function provided by the Windows registry: running the Windows command "regedit /E" extracts the data stored under a specified registry path into a file.

FIG. 14 shows the details of the data extracted from a WR-Store that stores the real data set ID-HashTag. The file shows the multi-level registry paths, and each registry path contains its key-value pairs. The key part and the value part can then be identified by parsing the contents of this file with a pattern-matching method such as regular expressions.
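The transformation phase of E-then-T can be sketched with std::regex as follows. This is a simplified illustration under several assumptions: the exported file has already been read into a narrow string (regedit typically writes UTF-16, so decoding is assumed to have happened), values are strings stored as `"name"="data"` lines, and neither names nor data contain escaped quotes. The function name transform_export is hypothetical.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using KeyValue = std::pair<std::string, std::string>;

// Scan the text of an exported .reg file line by line and collect every
// line of the form "name"="data" as a key-value pair. Header lines and
// bracketed registry paths such as [HKEY_...\...] do not match and are skipped.
std::vector<KeyValue> transform_export(const std::string& exported_text) {
    static const std::regex value_line(R"re("([^"]*)"="([^"]*)")re");
    std::vector<KeyValue> pairs;
    std::istringstream in(exported_text);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // CRLF input
        std::smatch m;
        if (std::regex_match(line, m, value_line)) {
            pairs.emplace_back(m[1].str(), m[2].str());
        }
    }
    return pairs;
}
```

Because the pattern must match a whole line, the registry path lines are ignored and only the pairs stored under them are returned, which is the flat key-value form that another store can load.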

The E-then-T method requires two separate phases, and the second phase (transformation) must be performed on the result of the first phase (extraction). A more efficient ETL method is therefore designed that transforms the data directly while extracting it; it is referred to as extract-and-transform (E-and-T).

The E-and-T method accesses the Windows registry directly through the native API. With this API, the entire structure under the registry path that stores the data of WR-Store can be traversed. Whenever a key and value are found during the traversal, they are converted directly into a key-value pair.

FIG. 15 shows the algorithm of the E-and-T method. First, information about the specified registry path is obtained to determine the number of subkeys and the number of key-value pairs. If subkeys exist, the algorithm is called recursively to examine each subkey. Otherwise, the list of key-value pairs stored in WR-Store is obtained by extracting the key-value pairs one by one.
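The recursion described above can be sketched on an in-memory stand-in for the registry, so that the example runs without the Windows headers. The RegistryNode type and the function name are hypothetical. In a real implementation, the subkey and value counts would presumably come from a call such as RegQueryInfoKey(), and the enumeration from RegEnumKeyEx() and RegEnumValue(); that mapping is an assumption, as this section does not name the calls.

```cpp
#include <string>
#include <utility>
#include <vector>

using KeyValue = std::pair<std::string, std::string>;

// In-memory stand-in for one registry key: its name, the values stored
// directly under it, and its child subkeys.
struct RegistryNode {
    std::string name;
    std::vector<KeyValue> values;
    std::vector<RegistryNode> subkeys;
};

// Extract-and-transform in a single pass: recurse while subkeys exist,
// otherwise emit the values of the current key as key-value pairs.
void extract_and_transform(const RegistryNode& node, std::vector<KeyValue>& out) {
    if (!node.subkeys.empty()) {
        for (const RegistryNode& child : node.subkeys) {
            extract_and_transform(child, out);
        }
    } else {
        for (const KeyValue& kv : node.values) {
            out.push_back(kv);
        }
    }
}
```

No intermediate file is written and no pattern matching is needed, which is the property that distinguishes E-and-T from E-then-T.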

Several experiments were performed to show the performance of the proposed E-and-T method. FIG. 16 shows a performance analysis of the two ETL methods proposed for WR-Store, E-then-T and E-and-T. The four real data sets were used, and the number of consecutive operations was fixed at 10,000.

The elapsed time of the E-then-T method consists of (1) the time of the extraction process and (2) the time of the transformation process. Each component was measured separately and compared with the elapsed time of the E-and-T method. FIG. 16(a) shows the elapsed times of extraction, transformation, E-then-T, and E-and-T; E-and-T is clearly more efficient than E-then-T. Depending on the data set, E-and-T outperforms E-then-T by 1.24 to 3.85 times.

E-and-T requires only slightly more time than the extraction process alone, whereas E-then-T spends considerable time on the transformation process, which involves pattern matching. The improvement of E-and-T over E-then-T is larger for large data sets such as ID-Geo and ID-Tweet, for which it is 3.30 times and 3.85 times, respectively.

FIG. 16(b) shows the share that E-and-T and E-then-T take of the entire ETL process, which starts with extracting the data stored in WR-Store and ends with loading it into another key-value store.

E-then-T accounts for 11.78 % to 84.93 % of the entire ETL process, whereas E-and-T accounts for 3.89 % to 63.04 %. When the data is loaded into BerkeleyDB and LevelDB, E-and-T accounts for 11.21 % and 3.89 % of the entire ETL process, respectively.

Hereinafter, work related to the storage system in the Windows operating system for general-purpose data storage according to an embodiment of the present invention is described.

WR-Store differs structurally from existing key-value stores: because it uses the internal structure of Windows, built on the Windows registry, as its index structure, it does not need the additional data structures that existing key-value stores require.

The LSM-tree is widely used as the index structure of key-value stores (e.g., LevelDB, RocksDB, HBase, and Cassandra), and much work has been done to improve it.

bLSM is an LSM-tree that combines the advantages of the B-tree and log-structured approaches and uses Bloom filters to improve index performance. VT-tree extends the LSM-tree to handle sequential and file-system workloads efficiently, and LSM-trie gives the LSM-tree a trie structure for more efficient compaction. WiscKey is a persistent LSM-tree-based key-value store optimized for SSD storage. FlameDB proposes a grouped level structure to reduce the compaction overhead of the LSM-tree and adopts an improved Bloom filter to reduce Bloom-filter false positives.

There are also various key-value stores based on hash tables. FAWN-KV uses an in-memory hash table and a log-structured data store, and provides replication and consistency in addition to the high performance of flash storage. FlashStore uses a variant of the cuckoo hash table and compacts key signatures to store key-value pairs on flash storage efficiently. SkimpyStash uses an in-memory hash table with linear chaining to index key-value pairs on flash storage with a very low RAM footprint.

SILT combines three basic key-value stores: a log store, a hash store, and a sorted store. Mercury reduces DRAM accesses with a chained hash table built on linked lists.

Other work is based on different data structures. Masstree is an in-memory structure that combines B+-trees and tries; it provides fast random access and supports range queries. NVMKV is a key-value store that uses advanced FTL capabilities. ForestDB uses the HB+-trie, which combines a disk-based trie structure with the B+-tree and is efficient for indexing and retrieving keys of varying lengths. Tucana uses an extension of the Bε-tree to provide asymptotic properties for insert operations.

Hereinafter, the capabilities of the WR-Store of the storage system in the Windows operating system for general-purpose data storage according to an embodiment of the present invention are described.

[Table 10] lists some limits of the Windows registry. Registry keys are used for indexing in WR-Store, and the limits of 255 characters per key name and a key depth of 512 levels are sufficient to build a scalable index.

Because hash results are expressed in hexadecimal, up to 16^255 combinations can be formed at a single registry level, and up to 512 registry levels can be created. In addition, the size limit of a value name is 16,383 characters, so the maximum length of a key in WR-Store is 16,383. Since keys are typically written as decimal numbers, up to 10^16383 keys can be handled, which is enough to store large-scale key-value pairs.
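The capacity figures above follow directly from the registry limits. The short sketch below only reproduces that arithmetic; the constant names are introduced here for illustration and are not part of the embodiment.

```python
# Registry limits quoted in the description (constant names are illustrative).
HEX_SYMBOLS = 16               # one hexadecimal digit per character of a hashed key name
MAX_KEY_NAME_CHARS = 255       # limit on the length of a registry key name
MAX_KEY_DEPTH = 512            # limit on the nesting depth of registry keys
MAX_VALUE_NAME_CHARS = 16_383  # limit on the length of a value name

# Distinct hexadecimal key names available at a single registry level.
names_per_level = HEX_SYMBOLS ** MAX_KEY_NAME_CHARS

# Distinct decimal keys that fit in a value name of maximum length.
decimal_keys = 10 ** MAX_VALUE_NAME_CHARS

# 16^255 is the same number as 2^1020, i.e. a 1021-bit integer.
assert names_per_level == 2 ** 1020
print("bits in 16^255:", names_per_level.bit_length())
print("decimal digits in 16^255:", len(str(names_per_level)))
print("10^16383 exceeds 16^255:", decimal_keys > names_per_level)
```

Even a single level therefore offers far more names than any practical data set needs, which is why the depth and key-name length can be treated as tuning parameters rather than hard constraints.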

[Table 10]

Although the size limit of a Windows registry value is specified as the available memory, storing values smaller than 2 MB is recommended for efficiency. Typical recent sources of key-value data, such as social networking services and search engines, handle data items much smaller than 2 MB.

For example, Twitter allows 280 bytes (i.e., less than 1 KB) per tweet, Facebook allows 63,206 bytes (i.e., about 61 KB) per post, and Google crawls web pages up to a maximum size of 500 KB (https://developers.google.com/search/reference/robots_txt).

Larger values can be stored by working together with the file system: a large value is stored as a file in the file system, and only the file path is kept in the Windows registry.
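A minimal sketch of this spill-to-file idea is given below. It rests on several assumptions that are not in the description: a plain dictionary stands in for the registry values, a one-byte tag distinguishes inline values from file paths, and file names are derived from a SHA-256 hash of the key. The class name `SpillingStore` is hypothetical.

```python
import hashlib
import os
import tempfile
from typing import Dict

SIZE_THRESHOLD = 2 * 1024 * 1024  # the 2 MB guideline from the description

class SpillingStore:
    """Keeps small values inline and large values as files (illustrative only)."""

    def __init__(self, directory: str, threshold: int = SIZE_THRESHOLD) -> None:
        self._dir = directory
        self._threshold = threshold
        self._values: Dict[str, bytes] = {}  # stand-in for registry values

    def put(self, key: str, value: bytes) -> None:
        if len(value) < self._threshold:
            self._values[key] = b"v" + value  # tag "v": value stored inline
            return
        # Assumed naming scheme: hash of the key, so any key maps to a safe file name.
        name = hashlib.sha256(key.encode("utf-8")).hexdigest()
        path = os.path.join(self._dir, name)
        with open(path, "wb") as handle:
            handle.write(value)
        self._values[key] = b"p" + path.encode("utf-8")  # tag "p": path only

    def get(self, key: str) -> bytes:
        stored = self._values[key]
        tag, body = stored[:1], stored[1:]
        if tag == b"v":
            return body
        with open(body.decode("utf-8"), "rb") as handle:
            return handle.read()

    def is_spilled(self, key: str) -> bool:
        return self._values[key][:1] == b"p"

# Demo with a tiny threshold so the second value is written to a file.
with tempfile.TemporaryDirectory() as demo_dir:
    store = SpillingStore(demo_dir, threshold=8)
    store.put("small", b"abc")
    store.put("large", b"0123456789abcdef")
    assert store.get("small") == b"abc"
    assert store.get("large") == b"0123456789abcdef"
```

A real implementation would also need to remove the file when the key is deleted or overwritten; that bookkeeping is omitted here.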

In terms of migration, every version of Windows supports the Windows registry, so the data stored in WR-Store can easily be migrated to another system as long as the target system runs a Windows operating system.

Using commands built into Windows, the data stored in one WR-Store can easily be exported, the exported data can be saved, and it can then be imported into another WR-Store on a different system. For environments that do not support the registry, the ETL (Extract-Transform-Load) method described above can be used to migrate the data in WR-Store to another environment that supports an existing key-value store.
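The description does not name the built-in commands. Assuming they correspond to the `export` and `import` subcommands of `reg.exe`, the round trip could be scripted along the following lines; the registry path `HKCU\Software\WRStoreDemo` and the file name are hypothetical.

```python
import subprocess
import sys
from typing import List

def export_command(key_path: str, out_file: str) -> List[str]:
    """Command line that dumps a registry subtree to a .reg file (/y overwrites)."""
    return ["reg", "export", key_path, out_file, "/y"]

def import_command(in_file: str) -> List[str]:
    """Command line that loads a .reg file into the local registry."""
    return ["reg", "import", in_file]

def run(command: List[str]) -> None:
    """Execute a reg.exe command; refuses to run on non-Windows systems."""
    if sys.platform != "win32":
        raise RuntimeError("reg.exe is only available on Windows")
    subprocess.run(command, check=True)

# Hypothetical location of the WR-Store subtree and of the exported file.
DEMO_KEY = r"HKCU\Software\WRStoreDemo"
DEMO_FILE = "wrstore_backup.reg"

print(" ".join(export_command(DEMO_KEY, DEMO_FILE)))
print(" ".join(import_command(DEMO_FILE)))
```

On the source system one would call `run(export_command(...))`, copy the resulting file, and call `run(import_command(...))` on the target system.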

In terms of compatibility, the Windows registry and its related APIs work on every version of Windows, so WR-Store can be used on all of them (e.g., Windows 7/8/8.1/10 and Windows Server 2008/2012/2016). An executable compiled in one environment runs in another Windows environment without modification, so compatibility is very good.

In terms of stability, the Windows registry is accessed constantly by Windows applications. System processes such as explorer.exe and services.exe are always running, even when the user launches no application at all, and they access the Windows registry through the built-in APIs provided by Windows, which are listed in [Table 1].

The WR-Store according to an embodiment of the present invention was developed on the same APIs; as shown in FIG. 4, all the algorithms for the basic operations of the key-value store (i.e., Get, Put, and Delete) are based on these built-in APIs.

Consequently, WR-Store is best understood as one of the ordinary Windows applications that use the Windows registry, and it remains stable even in the extreme case where intensive updates to the Windows registry occur concurrently.

Hereinafter, the detailed configuration of the storage system (S) in the Windows operating system for general-purpose data storage according to an embodiment of the present invention is described with reference to FIG. 17.

Specifically, the storage system (S) in the Windows operating system for general-purpose data storage according to an embodiment of the present invention comprises a WR-Store (100), which is generated by mapping the components of a key-value store, in which each data item is represented as a key-value pair (k, v), to the components of the Windows registry; and a control unit (200), which performs a hash-based multi-level registry indexing function and performs any one of Put, Get, or Delete on the data stored in the WR-Store (100) using Windows native APIs.

The storage system (S) further comprises a migration unit (300) that converts the data stored in the WR-Store (100) so that it can be recognized in other operating system environments that support a key-value store. The migration is performed by ETL (Extract-Transform-Load).

To elaborate, the key-value store according to an embodiment of the present invention represents each data item as a key-value pair (k, v), where the key k uniquely identifies the data item among all data items and the value v holds the actual content of the data item. The store supports the Put, Get, and Delete operations.

Here, the Get operation takes a key as input and returns the value associated with that key from the key-value store; the Put operation takes a key-value pair and stores it in the key-value store; and the Delete operation takes a key as input and deletes the data item identified by that key.
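The three operations can be illustrated with Python's standard `winreg` module, mapping the key k to a registry value name and the value v to the value data. This is a sketch under assumptions, not the embodiment's algorithm: it uses a single flat registry key (`Software\WRStoreDemo`, a hypothetical path) without the hash-based index, and it falls back to a plain dictionary on non-Windows systems so that the example runs anywhere.

```python
import sys
from typing import Dict, Optional

BASE_KEY = r"Software\WRStoreDemo"  # hypothetical location under HKEY_CURRENT_USER

if sys.platform == "win32":
    import winreg

    def put(k: str, v: str) -> None:
        """Store (k, v) as a string value named k; creates the key if needed."""
        with winreg.CreateKeyEx(winreg.HKEY_CURRENT_USER, BASE_KEY, 0,
                                winreg.KEY_SET_VALUE) as handle:
            winreg.SetValueEx(handle, k, 0, winreg.REG_SZ, v)

    def get(k: str) -> Optional[str]:
        """Return the data of the value named k, or None if it does not exist."""
        try:
            with winreg.OpenKey(winreg.HKEY_CURRENT_USER, BASE_KEY, 0,
                                winreg.KEY_QUERY_VALUE) as handle:
                data, _value_type = winreg.QueryValueEx(handle, k)
                return data
        except FileNotFoundError:
            return None

    def delete(k: str) -> bool:
        """Remove the value named k; returns False if it was not present."""
        try:
            with winreg.OpenKey(winreg.HKEY_CURRENT_USER, BASE_KEY, 0,
                                winreg.KEY_SET_VALUE) as handle:
                winreg.DeleteValue(handle, k)
                return True
        except FileNotFoundError:
            return False
else:
    # Fallback for non-Windows systems: a dictionary with the same interface.
    _fallback: Dict[str, str] = {}

    def put(k: str, v: str) -> None:
        _fallback[k] = v

    def get(k: str) -> Optional[str]:
        return _fallback.get(k)

    def delete(k: str) -> bool:
        return _fallback.pop(k, None) is not None

put("user:42", "alice")
print(get("user:42"))
print(delete("user:42"), get("user:42"))
```

Because Put simply overwrites an existing value of the same name, an update needs no separate operation in this sketch.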

Hereinafter, the procedure for configuring the key-value store of the storage system in the Windows operating system for general-purpose data storage according to an embodiment of the present invention is described with reference to FIG. 18.

First, the control unit (200) generates the WR-Store (100) by mapping the components of the key-value store to the components of the Windows registry (S1702).

Next, the control unit (200) reorganizes the key-value data stored in the WR-Store (100) into a hash-based multi-level index structure so that the data is distributed evenly and can be accessed efficiently (S1704).

Then, the control unit (200) performs any one of Put, Get, or Delete on the data stored in the WR-Store (100) using Windows native APIs (S1706).

Next, the control unit (200) migrates the data stored in the WR-Store (100) to a key-value store on another operating system by applying ETL (Extract-Transform-Load) (S1708). This extends the applicability of WR-Store beyond environments that run a Windows operating system to other environments that support an existing key-value store.

Finally, the control unit (200) performs an empirical analysis to verify the performance of the WR-Store (100) and tunes its performance accordingly (S1710).
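The hash-based multi-level index of step S1704 and the tuning parameters of step S1710 (the registry depth WRD and the subkey length WRL) can be sketched as below. How the hash is cut into levels is an assumption made here for illustration: the key is hashed with SHA-256, and the first WRD chunks of WRL hexadecimal characters each become nested subkey names. The function name `index_path` is hypothetical.

```python
import hashlib
from typing import List

def index_path(key: str, wrd: int, wrl: int) -> List[str]:
    """Return the nested subkey names under which `key` would be stored.

    wrd: number of index levels (registry depth).
    wrl: number of hexadecimal characters per subkey name.
    """
    if wrd < 1 or wrl < 1:
        raise ValueError("wrd and wrl must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()  # 64 hex characters
    if wrd * wrl > len(digest):
        raise ValueError("wrd * wrl exceeds the length of the hash digest")
    return [digest[level * wrl:(level + 1) * wrl] for level in range(wrd)]

def fanout(wrd: int, wrl: int) -> int:
    """Number of distinct leaf subkeys the index can address."""
    return 16 ** (wrd * wrl)

# Example: a three-level index with two hex characters per level.
path = index_path("user:42", wrd=3, wrl=2)
print("\\".join(path))
print("addressable leaves:", fanout(3, 2))
```

Larger WRD or WRL values spread the pairs over more leaf subkeys at the cost of a deeper or wider tree, which is the trade-off the empirical analysis is meant to settle.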

The storage system in the Windows operating system for general-purpose data storage according to the embodiment of the present invention described above has the following advantages.

First, WR-Store is a lightweight key-value store because it uses structures and native APIs built into Windows and requires no additional libraries or applications to be installed. According to the experimental results, WR-Store reduces the size of the executable by a factor of 17.77 to 153.73 compared with other key-value stores.

Second, extensive experiments on synthetic and real data sets show that WR-Store is comparable to, or considerably more efficient than, state-of-the-art systems (i.e., RocksDB, BerkeleyDB, and LevelDB). In particular, WR-Store outperforms the other key-value stores even when the number of consecutive operations is small, and its advantage over them grows as the data set becomes larger, which confirms the scalability of WR-Store. The experiments also show that the performance of WR-Store is maintained under an intensive registry workload in which 1,000 processes access the registry concurrently.

In summary, the advantages of WR-Store are that it is (1) lightweight, (2) efficient, and (3) scalable. The proposed technique can be used with minimal effort in any environment that runs a Windows operating system. For environments without a Windows operating system, the ETL (Extract-Transform-Load) method according to the embodiment of the present invention can be used to migrate the data easily to another environment that supports an existing key-value store.

Although the present invention has been described and illustrated above with reference to a preferred embodiment that exemplifies its technical idea, the invention is not limited to the exact configuration and operation shown and described. Those skilled in the art will appreciate that numerous changes and modifications can be made to the invention without departing from the scope of its technical idea. Accordingly, all such appropriate changes, modifications, and equivalents should be regarded as falling within the scope of the present invention.

S: storage system in the Windows operating system for general-purpose data storage
100: WR-Store
200: control unit
300: migration unit

Claims

A storage system in a Windows operating system for general-purpose data storage, comprising:
a WR-Store generated by mapping the components of a key-value store, which represents each data item as a key-value pair (k, v), to the components of the Windows registry; and
a control unit that performs a hash-based multi-level registry indexing function and performs any one of Put, Get, or Delete on the data stored in the WR-Store using Windows native APIs,
wherein the control unit performs an empirical analysis of the WR-Store based on the depth WRD of the registry and the length WRL of each subkey of the registry, and adjusts the performance of the generated WR-Store based on the result of the empirical analysis.

The storage system in a Windows operating system for general-purpose data storage according to claim 1,
further comprising a migration unit that converts the data stored in the WR-Store so that it can be recognized in other operating system environments that support a key-value store.

A storage method in a Windows operating system for general-purpose data storage, comprising:
(a) generating, by a control unit, a WR-Store by mapping the components of a key-value store to the components of the Windows registry;
(b) reorganizing, by the control unit, the key-value data stored in the WR-Store into a hash-based multi-level index structure;
(c) performing, by the control unit, any one of Put, Get, or Delete on the data stored in the WR-Store using Windows native APIs; and
(d) after step (c), performing, by the control unit, an empirical analysis of the WR-Store based on the depth WRD of the registry and the length WRL of each subkey of the registry, and adjusting the performance of the WR-Store based on the result of the empirical analysis.