KR20220138269A

KR20220138269A - Data storage device dynamically determining whether data is hot or cold and operation method of the same

Info

Publication number: KR20220138269A
Application number: KR1020210044225A
Authority: KR
Inventors: 박동철; 심다은; 하현지; 이혜인
Original assignee: 숙명여자대학교산학협력단
Priority date: 2021-04-05
Filing date: 2021-04-05
Publication date: 2022-10-12

Abstract

A data storage device that dynamically determines whether data is hot or cold is disclosed. According to various embodiments of the present disclosure, the data storage device may include a host interface that communicates with a host device, a flash memory, and at least one processor. The processor selects, based on a stack distance of workload data, a bloom filter in which to record an indicator for the workload data from among a plurality of bloom filters; writes the indicator for the workload data to the selected bloom filter based on at least one hash function; obtains a hot/cold decision value for the workload data based on the indicators recorded in each of the plurality of bloom filters and a recency-based weight for each of the plurality of bloom filters; and determines whether the workload data is hot data or cold data based on the decision value.

Description

Data storage device that dynamically determines whether data is hot/cold and its operation method

The present invention relates to a data storage device that dynamically determines whether data is hot or cold, and more particularly, to a data storage device that makes this determination dynamically using a plurality of bloom filters.

Flash memory has the advantages of low power consumption, high performance, and high durability. As the price of flash memory has fallen and its capacity has grown, the solid state drive (SSD), a storage device based on NAND flash memory, has been replacing the conventional hard disk drive. An SSD consists of multiple NAND flash chips, a DRAM buffer, and a controller, and its multi-channel, multi-way structure allows read/write operations on multiple chips at the same time.

An SSD stores information using non-volatile NAND flash memory. Because it uses random access, an SSD can input and output data at high speed with no seek time, and it has significantly lower mechanical delay and failure rates. In addition, data on an SSD is not damaged by external shock, and an SSD generates little heat and noise, consumes little power, and can be made small and lightweight.

The core technology of an SSD is the flash translation layer (FTL). Among the core functions of the FTL, garbage collection (GC) and wear leveling (WL) greatly affect the performance and lifespan of an SSD.

GC and WL basically rely on a technique for distinguishing hot data from cold data. However, existing hot/cold data classification mechanisms consider only the frequency of data access when distinguishing hot data from cold data.

As an example of the background art of the present invention, Korean Registered Patent Publication No. 10-1311031 (September 24, 2013) relates to a method of efficiently accessing a plurality of load-distributed storage devices, and discloses a method of distinguishing hot data from cold data based on the access time of each of a plurality of pieces of data.

The present invention has been devised to solve the above problem. An object of the present invention is to provide a method that, in determining whether data is hot or cold, gives significant consideration to recency in addition to access frequency, so that more recently accessed data is defined as hotter data.

In addition, unlike conventional techniques that classify hot and cold data statically, the present invention can classify hot and cold data dynamically by adaptively changing the classification criterion for each different set of workload data.

A data storage device that dynamically determines whether data is hot or cold according to various embodiments of the present invention may include a host interface that communicates with a host device, a flash memory, and at least one processor. The processor selects, based on a stack distance of workload data, a bloom filter in which to record an indicator for the workload data from among a plurality of bloom filters; records the indicator for the workload data in the selected bloom filter based on at least one hash function; obtains a hot/cold decision value for the workload data based on the indicators for the workload data recorded in each of the plurality of bloom filters and a recency-based weight for each of the plurality of bloom filters; and determines whether the workload data is hot data or cold data based on the decision value.

A method of operating a data storage device that dynamically determines whether data is hot or cold according to various embodiments of the present invention may include: selecting, based on a stack distance of workload data, a bloom filter in which to record an indicator for the workload data from among a plurality of bloom filters; recording the indicator for the workload data in the selected bloom filter based on at least one hash function; obtaining a hot/cold decision value for the workload data based on the indicators for the workload data recorded in each of the plurality of bloom filters and a recency-based weight for each of the plurality of bloom filters; and determining whether the workload data is hot data or cold data based on the decision value.

According to various embodiments of the present invention, in determining whether data is hot or cold, recency is given significant consideration in addition to access frequency, so that more recently accessed data can be defined as hotter data.

FIG. 1 is a block diagram of a data storage device according to an embodiment of the present invention.
FIG. 2 shows the architecture of a flash system according to an embodiment of the present invention.
FIG. 3 illustrates a framework for a plurality of bloom filters according to an embodiment of the present invention.
FIG. 4 is a block diagram of a data storage device according to an embodiment of the present invention.
FIGS. 5 and 6 are flowcharts of a method for determining hot/cold data according to an embodiment of the present invention.
FIG. 7 is a flowchart of a method of operating a storage device that dynamically determines whether data is hot or cold according to an embodiment of the present invention.

Hereinafter, the operating principle of preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. In describing the embodiments, a detailed description of a related well-known function or configuration will be omitted where it is judged that it could obscure the gist of the present disclosure. The terms used below are defined in consideration of their functions in the present invention and may vary depending on the intention or custom of a user or operator. Therefore, the terms should be interpreted based on the content of this specification as a whole and the corresponding functions.

FIG. 1 is a block diagram of a data storage device according to an embodiment of the present invention.

Hereinafter, for convenience of description, the data storage device 1 is described using a solid state drive (SSD) as an example, but its use is not limited thereto. For example, an electronic device that includes NAND or a cache, such as a smartphone or a PC, may also be included in the data storage device 1.

In addition, the various embodiments of the present invention described below may be applied to a variety of data processing technologies, such as storage tiering, which combines storage of different performance levels.

Referring to FIG. 1, the data storage device 1 may include a host interface 11, a RAM buffer 12, a flash memory 13, and a controller 14.

The host interface 11 may transfer a signal received from the host to the controller 14.

The host interface 11 may be configured to connect the data storage device 1 to the host. The host interface 11 may be compatible with Peripheral Component Interconnect Express (PCIe), Serial Attached SCSI (SAS), Serial AT Attachment (SATA), or other configurations known in the art.

In addition, the data storage device 1 may be connected to the host through a network fabric such as Ethernet, Fibre Channel, InfiniBand, or another network fabric. In this case, the data storage device 1 may be compatible with Non-Volatile Memory Express over Fabrics (NVMe-oF).

The RAM buffer 12 may store a mapping table for the data stored in the NAND. For example, because the NAND that makes up an SSD is organized into blocks and pages, address mapping is required to translate between logical addresses used by the operating system and physical addresses in the NAND. This mapping relationship may be stored in the RAM buffer 12 and provided to the controller 14.

In addition, the RAM buffer 12 may temporarily store data so that the data being processed can be transferred in a mode supported by the CPU and the bus system.

The flash memory 13 is a non-volatile semiconductor storage device. The flash memory 13 may read and write data under the control of the controller 14.

For example, the flash memory 13 may take the form of several NAND flash devices connected in parallel. Examples of the flash memory 13 include single-level cell (SLC), multi-level cell (MLC), triple-level cell (TLC), quad-level cell (QLC), and penta-level cell (PLC) types.

The controller 14 may perform reading, writing, lifetime management, and the like for the data storage device 1. The controller 14 may perform the address mapping described above in the flash translation layer (FTL), and may perform wear leveling and garbage collection to maintain the long-term performance of the NAND and extend its lifespan.

Here, the FTL is a firmware layer that translates an address issued by the operating system (a logical address) into a flash address (a physical address). Garbage collection is an operation that gathers and erases the unused areas that accumulate as the data storage device 1 is used. Wear leveling changes the physical address associated with a logical address by modifying the contents of the address mapping table.

The controller 14 may include a processor 141, a flash controller 142, and a buffer manager 143.

The processor 141 may read and execute the various commands requested by the host.

The processor 141 may include one or more processing cores. For example, the processor 141 may include a plurality of ARM (Advanced RISC Machine) processing cores, such as Cortex processing cores.

For example, the processor 141 may be configured to operate according to embedded firmware stored in advance in NOR flash.

The flash controller 142 may be configured to perform various types of maintenance and other tasks. The flash controller 142 may also be configured to manage data input and output. For example, the flash controller 142 may be configured to provide wear leveling, block picking, garbage collection, and encryption, as well as the flash translation layer and mapping.

In addition, the flash controller 142 may access the various groups included in the flash memory using one or more channels.

The buffer manager 143 may be configured to provide a buffer between the host interface 11 and the flash controller 142 to facilitate input/output requests. For example, the buffer manager 143 may include integrated SRAM as well as a DRAM controller connected to the RAM buffer 12.

The configuration of the data storage device 1 has been described in detail above. Hereinafter, a method of dynamically determining whether data is hot or cold according to various embodiments of the present invention is described in detail with reference to FIGS. 2 to 7. In what follows, the method is assumed to be performed by the controller 14 of the data storage device 1.

FIG. 2 shows the architecture of a flash system according to an embodiment of the present invention.

Since the general FTL configuration is well known, a detailed description is omitted here.

The FTL according to an embodiment of the present invention may include a hot/cold data identification unit (Hot Data Identifier).

The hot/cold data identification unit 21 may identify hot and cold data using a plurality of bloom filters.

Hot data and cold data can be defined in various ways, but in general, hot data is data that is modified frequently and cold data is data that is modified infrequently.

If hot data and cold data are stored in the same page, then every time the hot data changes, the cold data must be copied along with it as part of the read-modify-write operation. The cold data must also keep being moved to other pages for wear leveling. The GC described above can be processed efficiently only when cold data and hot data are separated as much as possible.

For example, the hot/cold data identification unit 21 may select, based on the stack distance of workload data, a bloom filter in which to record an indicator for the workload data from among the plurality of bloom filters. The hot/cold data identification unit 21 may then record the indicator for the workload data in the selected bloom filter based on at least one hash function.

In this case, the hot/cold data identification unit 21 may obtain a hot/cold decision value for the data based on the indicators for the workload data recorded in each of the plurality of bloom filters and a recency-based weight for each of the plurality of bloom filters. The hot/cold data identification unit 21 may then determine whether the workload data is hot data or cold data based on the obtained decision value.

FIG. 3 illustrates a framework for a plurality of bloom filters according to an embodiment of the present invention.

Specifically, FIG. 3 shows a framework for a method of determining (identifying) hot/cold data using a plurality of bloom filters.

Referring to FIG. 3, V independent bloom filters and K independent hash functions are defined. Each of the V bloom filters may consist of M bits for recording the K hash values.

For example, when the first write request is received at the flash translation layer (FTL), the corresponding LBA may be hashed by the K hash functions. For each of the K hash output values, a 1 may be written at the corresponding bit position of the first bloom filter. Here, the output values of the hash functions range from 1 to M, and each hash output value corresponds to one bit position of the M-bit bloom filter.

When the second write request is received, the second bloom filter may be selected, and for each of the K hash output values a 1 may be written at the corresponding bit position of the second bloom filter. Here, the bloom filters may be selected in a round-robin fashion.

Each of the bloom filters described above may be erased or reset when all M of its bits have been set, or according to a predefined time period.
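The per-filter recording step described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the class name `BloomFilter`, the salted SHA-256 hashing used to stand in for the K independent hash functions, and the 0-based bit positions (the text numbers them 1 to M) are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """M-bit Bloom filter with K hash functions (illustrative sketch)."""

    def __init__(self, m_bits, k_hashes):
        self.m = m_bits
        self.k = k_hashes
        self.bits = [0] * m_bits

    def _positions(self, lba):
        # Assumption: K hash functions are emulated by salting SHA-256
        # with the function index. Positions are 0-based (0..M-1).
        positions = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{lba}".encode()).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.m)
        return positions

    def add(self, lba):
        # Write a 1 at each of the K bit positions for this LBA.
        for pos in self._positions(lba):
            self.bits[pos] = 1

    def contains(self, lba):
        # True if all K positions are set (false positives are possible).
        return all(self.bits[pos] for pos in self._positions(lba))

    def is_full(self):
        # One of the two reset conditions in the text: all M bits set.
        return all(self.bits)

    def reset(self):
        self.bits = [0] * self.m
```

A filter built this way answers "possibly recorded" or "definitely not recorded" for an LBA, which is all the later decision step needs from each filter.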

The hot/cold data determination method using a plurality of bloom filters according to an embodiment of the present invention does not use a bloom filter counter to count the number of times an LBA appears. Instead, the method can capture the LBA frequency by checking each of the plurality of bloom filters and counting the filters in which a specific LBA has been recorded.

For example, when a specific LBA is input, the controller 14 may select one of the plurality of bloom filters and write a 1 at the corresponding bit positions. If the hash output values of that LBA are already recorded in the selected bloom filter, the controller 14 may check the other bloom filters sequentially (in round-robin order) until it finds one in which the hash output values of that LBA are not recorded. When such a bloom filter is found, the controller 14 may record the hash output values of the LBA in it.
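The round-robin search for a filter that does not yet hold the LBA can be sketched as below. The function names, the constants `M_BITS`, `K_HASHES`, and `V_FILTERS`, the salted SHA-256 hashing, and the behavior when every filter already holds the LBA (returning `None`) are assumptions for illustration; the source does not specify that last case.

```python
import hashlib

M_BITS, K_HASHES, V_FILTERS = 1024, 2, 4  # hypothetical sizes


def positions(lba):
    # Assumption: K hash functions emulated by salting SHA-256.
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{lba}".encode()).digest()[:8], "big")
        % M_BITS
        for i in range(K_HASHES)
    ]


def contains(bloom, lba):
    return all(bloom[p] for p in positions(lba))


def record(filters, start, lba):
    """Record lba in the first filter, scanning round-robin from `start`,
    that does not already hold it. Returns the filter index, or None if
    every filter already holds the LBA (an assumed behavior)."""
    for step in range(len(filters)):
        idx = (start + step) % len(filters)
        if not contains(filters[idx], lba):
            for p in positions(lba):
                filters[idx][p] = 1
            return idx
    return None


filters = [bytearray(M_BITS) for _ in range(V_FILTERS)]
first = record(filters, 0, 7)   # lands in filter 0
second = record(filters, 0, 7)  # filter 0 already holds LBA 7, so filter 1
```

Because each write of the same LBA lands in a different filter, the number of filters that report the LBA approximates its write frequency without a per-LBA counter.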

FIG. 4 is a block diagram of a data storage device according to an embodiment of the present invention.

Referring to FIG. 4, the data storage device 1' may include a workload data analysis unit 41, a bloom filter selection unit 42, and a hot/cold data classification unit 43.

The workload data analysis unit 41 computes the stack distance (SD) of the input workload data. The workload data analysis unit 41 may compute the stack distance each time workload data is input. For example, the workload data may be defined as a logical block address (LBA).

The stack distance may be defined as the number of unique objects among the objects that appear between two occurrences of the same object. For example, given the object sequence 'a b c c c d e e e a b', the objects appearing between the two occurrences of 'a' are 'b c c c d e e e', and the unique objects among them are 'b c d e'. Accordingly, the stack distance of 'a' in this sequence is 4 (SD(a)=4).
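The definition above can be checked with a direct, unoptimized sketch. The function name `stack_distances` and the convention of returning `None` for a first occurrence are assumptions for the example; the slice-and-set approach is chosen for clarity, not efficiency.

```python
def stack_distances(trace):
    """Return the stack distance of each access in `trace`.

    The stack distance is the number of unique objects seen between the
    current access and the previous access to the same object. A first
    occurrence has no previous access, so None is returned for it
    (an assumed convention).
    """
    last_seen = {}
    result = []
    for i, obj in enumerate(trace):
        if obj in last_seen:
            between = trace[last_seen[obj] + 1 : i]
            result.append(len(set(between)))
        else:
            result.append(None)
        last_seen[obj] = i
    return result


trace = "a b c c c d e e e a b".split()
distances = stack_distances(trace)
# The second 'a' (index 9) has stack distance 4, matching SD(a)=4 in the text.
```

This version rescans the intervening accesses on every hit, so it is quadratic in the worst case and keeps the whole trace in memory.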

Because the stack distance can be computed only if the history of every object that appears (or is accessed) is kept, its memory usage is large. In a cache mechanism in particular, the SD must be computed for every object, so the computational complexity is very high. In general, a doubly linked list is used as the data structure for implementing the stack.

The workload data analysis unit 41 may calculate, based on the obtained stack distance, a recency weight that reflects the recency of the input workload data. The recency weight may be calculated based on Equation 1 below.

[Equation 1: the recency weight expressed in terms of the stack distance and the average stack distance. The equation and its symbols are not reproduced in this text.]

Here, the terms of Equation 1 are the recency weight, the stack distance, and the average stack distance.

Through workload analysis, the workload data analysis unit 41 may give recency a greater weight than frequency for workload data in which recency matters more, that is, when the stack distance is relatively small.

Conversely, when the stack distance is relatively large, the workload data analysis unit 41 may give frequency a greater weight than recency.

Based on the recency weight, the bloom filter selection unit 42 may select, from among the plurality of bloom filters, the bloom filter in which to record the indicator for the workload data. Here, the indicator may be defined as the value marked at the bloom filter bits that correspond to the hash output values of the workload data. The bloom filter selection unit 42 may also be referred to as a recency weight allocation unit.

As an example, the bloom filter selection unit 42 may set one selection probability for the most recently reset bloom filter among the plurality of bloom filters and another selection probability for at least one bloom filter other than the most recently reset one, and may then select, from among the plurality of bloom filters, the bloom filter in which to record the indicator. [The two probability expressions are not reproduced in this text.]
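Since the two probability expressions are not available in this text, the following is only a sketch of one way a probabilistic selection of this kind could work. It assumes, purely for illustration, that the most recently reset filter is chosen with probability `w` (a recency weight in [0, 1]) and that the remaining probability is split uniformly among the other filters. The function name `select_filter` and its parameters are hypothetical.

```python
import random


def select_filter(num_filters, newest, w, rng=random):
    """Pick the index of the bloom filter to record into.

    Assumption: the most recently reset filter (`newest`) is chosen with
    probability w, and each of the other filters with probability
    (1 - w) / (num_filters - 1).
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must be in [0, 1]")
    if num_filters == 1 or rng.random() < w:
        return newest
    others = [i for i in range(num_filters) if i != newest]
    return rng.choice(others)


rng = random.Random(0)  # seeded only to make the example repeatable
choice = select_filter(4, newest=2, w=0.7, rng=rng)
```

Under this assumption, a larger recency weight steers more writes into the newest filter, so recently written data accumulates where the decision step later applies the largest weight.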

The hot/cold data classification unit 43 may classify the workload data as hot data or cold data based on the workload data and the number of the selected bloom filter, and may store the result. Here, the hot/cold data classification unit 43 may use a plurality of bloom filters.

The hot/cold data classification unit 43 may obtain the hot/cold decision value for the workload data based on the indicators for the workload data recorded in each of the plurality of bloom filters and the recency-based weight for each of the plurality of bloom filters.

Specifically, the hot/cold data classification unit 43 may obtain the hot/cold decision value based on Equation 2 below.

[Equation 2 and its symbols are not reproduced in this text.]

Here, the terms of Equation 2 are: the hot/cold decision value; an indicator term that is 1 if the workload data is recorded in the n-th bloom filter (BF) and 0 otherwise; and the output value of a predefined exponential function for the n-th BF.
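The three terms described for Equation 2 suggest a weighted sum over the filters, although the exact equation is not available in this text. The sketch below assumes that form. The function names and the specific exponential weight `base ** (-age)`, where age 0 is the most recently reset filter, are hypothetical choices for illustration and are not taken from the source.

```python
def recency_weights(num_filters, base=2.0):
    """Hypothetical exponential weights: index 0 is the most recently
    reset filter and gets the largest weight (base ** 0 == 1.0)."""
    return [base ** (-age) for age in range(num_filters)]


def decision_value(recorded, weights):
    """Weighted sum of per-filter indicators (assumed form of Equation 2).

    recorded[n] is 1 if the workload data is recorded in the n-th bloom
    filter and 0 otherwise; weights[n] is that filter's recency weight.
    """
    if len(recorded) != len(weights):
        raise ValueError("recorded and weights must have the same length")
    return sum(w for hit, w in zip(recorded, weights) if hit)


weights = recency_weights(4)                   # [1.0, 0.5, 0.25, 0.125]
value = decision_value([1, 1, 0, 0], weights)  # 1.0 + 0.5
```

With weights of this shape, a hit in a recently reset filter contributes more to the decision value than a hit in an older one, which is how recency enters alongside frequency.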

Each of the plurality of bloom filters may be reset according to a predefined period. In this case, the exponential-function term of Equation 2 is the recency-based weight for each of the plurality of bloom filters. That is, the more recently a bloom filter was reset, the more recent the workload data whose indicators it holds, so a more recently reset bloom filter may be assigned a larger recency-based weight.

As an example, this weight may be defined by the same function as a term of Equation 1. [The symbols identifying the two terms are not reproduced in this text.]

As an example, assume that four bloom filters are used. Also assume that an indicator for the input workload data is recorded only in the first and second bloom filters among the plurality of bloom filters.

In this case, the hot/cold data classification unit 43 may calculate the hot/cold decision value based on Equation 3 below.

[Equation 3 is not reproduced in this text.]
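Under the assumption that the decision value is a weighted sum of per-filter indicators, the four-filter example can be worked through as follows. The symbols are introduced here only for illustration and are not the source's notation: $H$ is the hot/cold decision value, $x_n$ is 1 if the workload data is recorded in the n-th bloom filter and 0 otherwise, and $f(n)$ is the recency-based weight of the n-th bloom filter.

```latex
H = \sum_{n=1}^{4} x_n \, f(n)
  = 1 \cdot f(1) + 1 \cdot f(2) + 0 \cdot f(3) + 0 \cdot f(4)
  = f(1) + f(2)
```

Only the two filters that hold the indicator contribute, so the decision value in this example is the sum of the weights of the first and second bloom filters.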

The hot/cold data classification unit 43 may determine whether the workload data is hot data or cold data based on the obtained decision value.

For example, the hot/cold data classification unit 43 may determine that the workload data is hot data if the hot/cold decision value is greater than a predefined threshold, and that it is cold data if the hot/cold decision value is less than the predefined threshold.

According to the various embodiments of the present invention described above, the data storage device 1' can perform the hot/cold determination for workload data dynamically by calculating the stack distance and the hot/cold decision value for each piece of input workload data.

도 5 및 도 6은 본 발명의 일 실시 예에 따른 핫/콜드 데이터 판정방법에 대한 흐름도이다. 5 and 6 are flowcharts of a method for determining hot/cold data according to an embodiment of the present invention.

이하, 도 5 및 도 6에 대한 내용은 상술한 도 4에 대한 내용이 적용되므로 설명의 편의를 위해 중복되는 설명은 생략한다. 또한, 도5 및 도 6은 연속되는 과정으로 해석되어야 할 것이다.Hereinafter, since the contents of FIG. 4 are applied to the contents of FIGS. 5 and 6, overlapping descriptions will be omitted for convenience of description. 5 and 6 should be interpreted as a continuous process.

Referring to FIG. 5, the controller 14 may receive workload data (S501). The controller 14 may obtain a stack distance for the received workload data (S502).
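As an illustration of step S502, the sketch below computes stack distances under the conventional LRU definition, that is, the number of distinct addresses accessed since the previous access to the same address. That definition, the function name `stack_distances`, and the use of `None` for first-time accesses are assumptions made for this example; it is not presented as the controller's actual procedure.

```python
from collections import OrderedDict

def stack_distances(trace):
    """Return the LRU stack distance of each access in `trace`.

    A first-time access has no previous reference, so it is reported as None.
    """
    stack = OrderedDict()  # keys ordered from least to most recently used
    distances = []
    for lba in trace:
        if lba in stack:
            keys = list(stack)
            # Count the distinct addresses touched since the last access to `lba`.
            distances.append(len(keys) - 1 - keys.index(lba))
            stack.move_to_end(lba)
        else:
            distances.append(None)
            stack[lba] = True
    return distances

example = stack_distances([1, 2, 3, 1, 2, 2])
```

A small stack distance means the address was re-accessed soon after its previous access, which is the recency signal the later steps build on. The list scan makes this sketch linear per access; it is meant for clarity rather than speed.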

The controller 14 may determine a freshness weight for the workload data based on the obtained stack distance (S503).

The controller 14 may select, from among the plurality of bloom filters, the bloom filter in which to record the indicator for the workload data based on the freshness weight (S504). For example, once the freshness weight is determined, the controller 14 may determine a selection probability for each of the plurality of bloom filters.
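The probabilistic selection in step S504 might look like the following sketch. The specific probabilities are assumptions for illustration only: the most recently reset filter (index 0) is given the freshness weight itself as its probability, and the remaining probability is split evenly among the other filters. The source's own probability expressions are not available in this text, and the function names are hypothetical.

```python
import random

def selection_probabilities(weight, num_filters):
    """Hypothetical rule: index 0 (most recently reset) gets `weight`,
    the other filters share the remainder equally."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if num_filters == 1:
        return [1.0]
    rest = (1.0 - weight) / (num_filters - 1)
    return [weight] + [rest] * (num_filters - 1)

def select_filter(weight, num_filters, rng=random):
    """Draw one filter index according to the selection probabilities."""
    probs = selection_probabilities(weight, num_filters)
    return rng.choices(range(num_filters), weights=probs, k=1)[0]

probs = selection_probabilities(0.7, 4)
chosen = select_filter(0.7, 4, random.Random(0))  # seeded generator for repeatability
```

Passing a seeded `random.Random` instance keeps the draw reproducible in tests while leaving the default behaviour random.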

The controller 14 may determine whether the bloom filter reset period has been reached (S505). When the reset period of the selected bloom filter has been reached (S505 - Yes), all data of the selected bloom filter may be initialized (S506). When the reset period of the selected bloom filter has not been reached (S505 - No), the hashing process for recording the indicator for the workload data may begin without initializing the bloom filter.
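The reset check of steps S505 and S506 can be sketched as below. Two simplifications are assumed and should not be read as the disclosed design: the reset period is measured as a count of recorded accesses, and a Python `set` stands in for the bloom filter's bit array. The class name `PeriodicFilter` is hypothetical.

```python
class PeriodicFilter:
    """A filter that is cleared once it has absorbed `reset_period` records."""

    def __init__(self, reset_period):
        self.items = set()          # stand-in for the bloom filter contents
        self.reset_period = reset_period
        self.since_reset = 0        # records written since the last reset

    def record(self, lba):
        if self.since_reset >= self.reset_period:   # S505 - Yes
            self.items.clear()                      # S506: initialize all data
            self.since_reset = 0
        self.items.add(lba)                         # recording step, simplified
        self.since_reset += 1

f = PeriodicFilter(reset_period=2)
for lba in (1, 2, 3):
    f.record(lba)
```

After the third record the filter has been reset once, so only the newest address remains; older history ages out, which is what lets the classification adapt over time.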

The controller 14 may obtain a hash output value (hash value) for the workload data based on at least one hash function (S507).

The controller 14 may then record the input of the workload data by setting the position corresponding to the hash output value in the selected bloom filter to 1 (S508).
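Steps S507 and S508 correspond to the standard bloom filter insert operation, sketched below. The filter size, the number of hash functions, and the use of salted SHA-256 digests as the hash functions are assumptions chosen for the example; the source specifies only "at least one hash function".

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=2):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)   # one byte per bit, for readability

    def _positions(self, lba):
        # Derive `num_hashes` positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{lba}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, lba):
        # S507: hash the address; S508: set each resulting position to 1.
        for pos in self._positions(lba):
            self.bits[pos] = 1

    def __contains__(self, lba):
        # May report false positives, never false negatives.
        return all(self.bits[pos] for pos in self._positions(lba))

    def reset(self):
        self.bits = bytearray(self.num_bits)

bf = BloomFilter()
bf.add(42)
```

A membership test on each filter then supplies the 0/1 indicator used when the hot/cold determination value is computed.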

The controller 14 may also obtain the hot/cold determination value of the selected workload data (S509).

When the hot/cold determination value is greater than the predefined threshold (S510 - Yes), the controller 14 may classify the selected workload data as hot data (S511). When the hot/cold determination value is not greater than the predefined threshold (S510 - No), the controller 14 may classify the selected workload data as cold data (S512).

According to the various embodiments of the present invention described above, recency is given significant consideration in addition to access frequency when determining whether data is hot or cold, so that more recently accessed data is more likely to be defined as hot data.

In addition, unlike existing techniques that classify hot and cold data statically, the criterion for distinguishing hot data from cold data is changed adaptively for each different set of workload data, so that hot and cold data can be distinguished dynamically.

FIG. 7 is a flowchart of a method of operating a storage device that dynamically determines whether data is hot or cold according to an embodiment of the present invention.

Referring to FIG. 7, the method of operating a storage device that dynamically determines whether data is hot or cold may include: selecting, based on a stack distance of workload data, a bloom filter in which to record an indicator for the workload data from among a plurality of bloom filters (S710); recording the indicator for the workload data in the selected bloom filter based on at least one hash function (S720); obtaining a hot/cold determination value of the workload data based on the indicator for the workload data recorded in each of the plurality of bloom filters and a weight according to freshness for each of the plurality of bloom filters (S730); and determining whether the workload data is hot data or cold data based on the determination value (S740).

The above-described method may further include determining a freshness weight for the workload data based on the stack distance of the workload data, and selecting, based on the freshness weight, the bloom filter in which to record the indicator for the workload data from among the plurality of bloom filters.

In the above-described embodiment of the present invention, the freshness weight may be obtained based on the following equation.

Here, the symbols in the equation denote, in order, the freshness weight, the stack distance, and the average stack distance.

In addition, in the above-described embodiment of the present invention, the selection probability of the most recently reset bloom filter among the plurality of bloom filters may be set to [expression omitted in source], and the selection probability of at least one bloom filter other than the most recently reset bloom filter may be set to [expression omitted in source].

In the above-described embodiment of the present invention, [symbol omitted in source] may satisfy the following equation.

In addition, in the above-described embodiment of the present invention, the hot/cold determination value may be obtained based on the following equation.

Here, the symbols in the equation denote, in order: the hot/cold determination value; an indicator that is 1 when the workload data is recorded in the n-th bloom filter (BF) and 0 otherwise; and the output value of a predefined exponential function for the n-th BF.

In the above-described embodiment of the present invention, the workload data may be determined to be hot data when the hot/cold determination value is greater than a predefined threshold, and to be cold data when the hot/cold determination value is smaller than the predefined threshold.

In the above-described embodiment of the present invention, [symbol omitted in source] is a weight according to freshness for each of the plurality of bloom filters, and is larger for a more recently reset bloom filter among the plurality of bloom filters.

In the above-described embodiment of the present invention, each of the plurality of bloom filters may be reset according to a predefined period. In addition, the workload data may be defined as a logical block address (LBA).

Meanwhile, the method of operating a storage device that dynamically determines whether data is hot or cold according to the various embodiments described above may be implemented as computer-executable program code, stored in various non-transitory computer readable media, and provided to servers or devices for execution by a processor.

As an example, in a non-transitory computer-readable medium storing computer instructions that, when executed by a processor of an electronic device, cause the electronic device to perform operations, the operations may include: selecting, based on a stack distance of workload data, a bloom filter in which to record an indicator for the workload data from among a plurality of bloom filters; recording the indicator for the workload data in the selected bloom filter based on at least one hash function; obtaining a hot/cold determination value of the workload data based on the indicator for the workload data recorded in each of the plurality of bloom filters and a weight according to freshness for each of the plurality of bloom filters; and determining whether the workload data is hot data or cold data based on the determination value.

As an example, a non-transitory computer readable medium may be provided that stores a program for performing: selecting, based on a stack distance of workload data, a bloom filter in which to record an indicator for the workload data from among a plurality of bloom filters; recording the indicator for the workload data in the selected bloom filter based on at least one hash function; obtaining a hot/cold determination value of the workload data based on the indicator for the workload data recorded in each of the plurality of bloom filters and a weight according to freshness for each of the plurality of bloom filters; and determining whether the workload data is hot data or cold data based on the determination value.

A non-transitory readable medium is a medium that stores data semi-permanently and can be read by a device, as opposed to a medium that stores data only for a short moment, such as a register, cache, or memory. Specifically, the various applications or programs described above may be stored and provided on a non-transitory readable medium such as a CD, DVD, hard disk, Blu-ray disc, USB device, memory card, or ROM.

While embodiments of the present invention have been shown and described above, those skilled in the art will understand that various changes in form and detail may be made without departing from the spirit and scope of the embodiments as defined by the appended claims and their equivalents.

Data storage device: 1, 1'
Host interface: 11
RAM buffer: 12
Flash memory: 13
Controller: 14
Processor: 141
Flash controller: 142
Buffer manager: 143
Workload data analysis unit: 41
Bloom filter selection unit: 42
Hot/cold data classification unit: 43

Claims

1. A data storage device for dynamically determining whether data is hot or cold, comprising:
a host interface for communicating with a host device;
a flash memory; and
at least one processor configured to:
select, based on a stack distance of workload data, a bloom filter in which to record an indicator for the workload data from among a plurality of bloom filters,
record the indicator for the workload data in the selected bloom filter based on at least one hash function,
obtain a hot/cold determination value of the workload data based on the indicator for the workload data recorded in each of the plurality of bloom filters and a weight according to freshness for each of the plurality of bloom filters, and
determine whether the workload data is hot data or cold data based on the determination value.

2. The data storage device of claim 1,
wherein the at least one processor is configured to:
determine a freshness weight for the workload data based on the stack distance of the workload data; and
select, based on the freshness weight, the bloom filter in which to record the indicator for the workload data from among the plurality of bloom filters.

3. The data storage device of claim 2,
wherein the freshness weight is obtained based on the following equation,

where the symbols in the equation denote, in order, the freshness weight, the stack distance, and the average stack distance.

4. The data storage device of claim 3,
wherein the selection probability of the most recently reset bloom filter among the plurality of bloom filters is set to [expression omitted in source], and the selection probability of at least one bloom filter other than the most recently reset bloom filter among the plurality of bloom filters is set to [expression omitted in source].

5. The data storage device of claim 3,
wherein [symbol omitted in source] satisfies the following equation.

6. The data storage device of claim 1,
wherein the hot/cold determination value is obtained based on the following equation,

where the symbols in the equation denote, in order: the hot/cold determination value; an indicator that is 1 when the workload data is recorded in the n-th bloom filter (BF) and 0 otherwise; and the output value of a predefined exponential function for the n-th BF.

7. The data storage device of claim 6,
wherein the at least one processor is configured to
determine that the workload data is hot data when the hot/cold determination value is greater than a predefined threshold, and determine that the workload data is cold data when the hot/cold determination value is smaller than the predefined threshold.

8. The data storage device of claim 6,
wherein [symbol omitted in source] is
a weight according to freshness for each of the plurality of bloom filters, and
is larger for a more recently reset bloom filter among the plurality of bloom filters.

9. The data storage device of claim 1,
wherein each of the plurality of bloom filters is reset according to a predefined period.

10. The data storage device of claim 1,
wherein the workload data is a logical block address (LBA).

11. A method of operating a data storage device for dynamically determining whether data is hot or cold, the method comprising:
selecting, based on a stack distance of workload data, a bloom filter in which to record an indicator for the workload data from among a plurality of bloom filters;
recording the indicator for the workload data in the selected bloom filter based on at least one hash function;
obtaining a hot/cold determination value of the workload data based on the indicator for the workload data recorded in each of the plurality of bloom filters and a weight according to freshness for each of the plurality of bloom filters; and
determining whether the workload data is hot data or cold data based on the determination value.

12. The method of claim 11, further comprising:
determining a freshness weight for the workload data based on the stack distance of the workload data; and
selecting, based on the freshness weight, the bloom filter in which to record the indicator for the workload data from among the plurality of bloom filters.

13. The method of claim 12,
wherein the freshness weight is obtained based on the following equation,

where the symbols in the equation denote, in order, the freshness weight, the stack distance, and the average stack distance.

14. The method of claim 13,
wherein the selection probability of the most recently reset bloom filter among the plurality of bloom filters is set to [expression omitted in source].

15. The method of claim 13,
wherein [symbol omitted in source] satisfies the following equation.

16. The method of claim 11,
wherein the hot/cold determination value is obtained based on the following equation,

where the symbols in the equation denote, in order: the hot/cold determination value; an indicator that is 1 when the workload data is recorded in the n-th bloom filter (BF) and 0 otherwise; and the output value of a predefined exponential function for the n-th BF.

17. The method of claim 16, further comprising:
determining that the workload data is hot data when the hot/cold determination value is greater than a predefined threshold, and determining that the workload data is cold data when the hot/cold determination value is smaller than the predefined threshold.

18. The method of claim 16,
wherein [symbol omitted in source] is
a weight according to freshness for each of the plurality of bloom filters, and
is larger for a more recently reset bloom filter among the plurality of bloom filters.

19. The method of claim 11,
wherein each of the plurality of bloom filters is reset according to a predefined period.

20. The method of claim 11,
wherein the workload data is a logical block address (LBA).

21. A non-transitory computer-readable medium storing computer instructions that, when executed by a processor of an electronic device, cause the electronic device to perform operations comprising:
selecting, based on a stack distance of workload data, a bloom filter in which to record an indicator for the workload data from among a plurality of bloom filters;
recording the indicator for the workload data in the selected bloom filter based on at least one hash function;
obtaining a hot/cold determination value of the workload data based on the indicator for the workload data recorded in each of the plurality of bloom filters and a weight according to freshness for each of the plurality of bloom filters; and
determining whether the workload data is hot data or cold data based on the determination value.