KR101526110B1 - Flash translation layer design framework for provably correct crash recovery - Google Patents

Flash translation layer design framework for provably correct crash recovery Download PDF

Info

Publication number
KR101526110B1
Authority
KR
South Korea
Prior art keywords
processing module
log
block
data
information
Application number
KR1020140013590A
Other languages
Korean (ko)
Other versions
KR20140100907A (en)
Inventor
민상렬
남이현
이수관
윤진혁
성윤제
김홍석
최진용
박정수
Original Assignee
서울대학교산학협력단
Application filed by 서울대학교산학협력단
Priority to PCT/KR2014/001028 (published as WO2014123372A1)
Publication of KR20140100907A
Application granted
Publication of KR101526110B1


Abstract

A flash translation layer design framework for flash memory is disclosed. The flash translation layer according to one embodiment includes a first log that processes data, a second log that processes mapping information, and a third log that processes checkpoint information. Here, the first log and the second log can recover from errors using the checkpoint information.


Description

{FLASH TRANSLATION LAYER DESIGN FRAMEWORK FOR PROVABLY CORRECT CRASH RECOVERY}

The following embodiments relate to a flash translation layer design framework for flash memory.

Unlike an HDD, flash memory cannot be updated in place: page-based read and program operations and block-based erase operations are the basic operations, bad blocks are permitted, and the lifetime of flash memory is limited. Therefore, to build a high-performance, high-reliability storage device that uses flash memory as its storage medium, the advantages of flash memory must be exploited effectively while its constraints are overcome; this is the role of the flash translation layer (FTL). To overcome the in-place-update limitation, the FTL introduces a mapping between logical sector addresses and flash memory physical addresses, thereby presenting the host system with an updatable block storage device. The FTL also uses this mapping to keep bad blocks that arise during operation from being used again, and to perform wear leveling so that no particular physical block is erased excessively.
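As a concrete (if simplified) illustration of the mapping idea, the following C sketch shows how an FTL-style logical-to-physical table turns an in-place logical update into an append to a fresh physical page. All names and sizes here are assumptions for illustration, not taken from the patent.

```c
/* Minimal sketch: an FTL mapping table redirecting logical updates
 * to fresh physical pages instead of reprogramming in place. */
#include <stdint.h>
#include <stdio.h>

#define NUM_LOGICAL_PAGES 16
#define INVALID_PPN UINT32_MAX

static uint32_t l2p[NUM_LOGICAL_PAGES];   /* logical -> physical page map */
static uint32_t next_free_ppn = 0;        /* naive append-only allocator  */

/* A logical overwrite never reprograms the old page; it claims a fresh
 * physical page and updates the mapping, leaving the old page invalid. */
static void ftl_write(uint32_t lpn) {
    l2p[lpn] = next_free_ppn++;   /* old physical page becomes garbage */
}

int main(void) {
    for (uint32_t i = 0; i < NUM_LOGICAL_PAGES; i++) l2p[i] = INVALID_PPN;
    ftl_write(3);                 /* first write of logical page 3      */
    ftl_write(3);                 /* "update": remapped, not rewritten  */
    printf("lpn 3 -> ppn %u\n", l2p[3]);   /* prints 1, not 0 */
    return 0;
}
```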

Recently, as flash memory geometries have shrunk and densities have grown, the reliability burden placed on the FTL has grown correspondingly. In particular, the probability of bit-flip errors, in which correctly written data degrades after some time, is increasing, and the number of erase/program cycles allowed per block is steadily decreasing. In addition, error conditions related to power failures that did not appear in the past are being discovered, and they have become serious threats to the reliability of flash memory storage devices.

If a power failure occurs while a page of an MLC flash memory is being written, the block or page on which an operation (program/erase) was in flight at the moment of the failure can leave unintended residual effects visible to the FTL, and even the data of a related page that had previously been written successfully can be lost with it. Moreover, during recovery it is difficult to identify the block in which the power failure occurred, and another power failure may strike during recovery itself. To overcome these difficulties and provide a reliable storage system, the FTL must account for recoverability at all times during normal operation, and the power failure recovery process must remove the physical abnormal states of the flash memory and restore the logical consistency of the data.

To overcome the limitations of existing power failure recovery schemes, which are incomplete and FTL-specific, the embodiments propose a power failure recovery scheme that is complete, systematic, and verifiably correct, embodied in the HIL (Hierarchically Interacting set of Logs) framework.

The HIL framework provides the log as the basic building block of FTL design and implementation. A log provides an in-place-updatable linear address space for the persistent recording of data, and serves as a container for the host data and FTL metadata that constitute the storage system. FTL developers can design and implement a wide variety of FTLs by combining logs.

The power failure recovery scheme provided by the HIL framework guarantees complete recovery from power failures occurring at arbitrary times, taking residual effects, the sibling page problem, and nested power failures into account. Because the HIL framework is designed for provability, the correctness of its power failure recovery can be proven formally.

Each log in the HIL framework is implemented as a separate thread, and each thread produces its own stream of flash memory operations, so an FTL built on the framework can naturally exploit maximal thread-level and flash-level parallelism. Except for a synchronization interface that guarantees data consistency and recoverability, each log thread executes independently and in parallel.

A flash translation layer according to one aspect comprises: a first processing module for processing data; a second processing module for processing mapping information of the data; and a third processing module for processing checkpoint information including information on an uninterpreted block of the first processing module and information on an uninterpreted block of the second processing module. Here, the first processing module can recover from an error using the uninterpreted block of the first processing module, and the second processing module can recover from the error using the uninterpreted block of the second processing module.

In this case, for error recovery, the first processing module may detect an error page, copy the valid pages of the error block containing the error page to the uninterpreted block of the first processing module, and, when the copy is complete, logically swap the error block and the uninterpreted block of the first processing module.

Also, the checkpoint information of the first processing module may further include a block write list of the first processing module, and the first processing module may detect the error page using the block write list.

In addition, the checkpoint information of the first processing module may further include a block write list of the first processing module and a write pointer of the first processing module, and the first processing module may detect the error page by checking pages for errors along the block write list of the first processing module, starting from the page indicated by the write pointer of the first processing module.
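The detection step described in the two preceding paragraphs can be sketched in C as follows. This is a minimal illustration, assuming a page_is_intact() check standing in for the ECC/CRC verification of each page's spare area; none of the identifiers are the patent's own.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGES_PER_BLOCK 64

/* Stand-in for the real per-page integrity check (ECC/CRC in the
 * spare area); always "intact" in this simulation. */
static bool page_is_intact(uint32_t block, uint32_t page) {
    (void)block; (void)page;
    return true;
}

/* Walk the block write list from the write pointer and report the
 * first page that fails the check: the crash frontier. */
bool find_error_page(const uint32_t *block_write_list, int list_len,
                     int wp_block_idx, uint32_t wp_page,
                     uint32_t *err_blk, uint32_t *err_pg) {
    uint32_t page = wp_page;   /* pages before the pointer are durable */
    for (int i = wp_block_idx; i < list_len; i++) {
        for (; page < PAGES_PER_BLOCK; page++) {
            if (!page_is_intact(block_write_list[i], page)) {
                *err_blk = block_write_list[i];
                *err_pg  = page;
                return true;           /* crash frontier located */
            }
        }
        page = 0;              /* next block in the list, from page 0 */
    }
    return false;              /* no residual effect found */
}
```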

In addition, the first processing module may transmit to the third processing module the checkpoint information updated by the logical swap of the error block and the uninterpreted block of the first processing module.

After the error is recovered using the uninterpreted block of the first processing module, the first processing module may obtain the mapping information of the data from the pages in which the data is stored and transmit the mapping information of the data to the second processing module.

In addition, the checkpoint information of the first processing module may further include a block write list of the first processing module, a write pointer of the first processing module, and a replay pointer of the first processing module, and the first processing module may acquire mapping information from the page indicated by the replay pointer of the first processing module up to the page indicated by the write pointer of the first processing module, following the block write list of the first processing module, and transmit the mapping information to the second processing module.

When receiving a write command, the first processing module may store data corresponding to the write command in a cache, and when receiving a read command, it may determine whether data corresponding to the read command exists in the cache.

The first processing module may store the data and the mapping information of the data in the same page of the flash memory. The first processing module may store the data in the flash memory and then transmit the mapping information of the data to the second processing module.

The first processing module may store the data in a flash memory on a page-by-page basis, and may advance the write pointer along a block write list when page-level data is stored.

The first processing module may transmit a persistence request signal to the second processing module when the write pointer crosses a block boundary; the second processing module may, in response to the persistence request, store the mapping information of the block corresponding to the persistence request signal in the flash memory; and the first processing module may advance the replay pointer along the block write list upon receiving a persistence completion signal from the second processing module.
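The two-pointer protocol in the paragraph above can be sketched as follows; it is a hedged illustration, with the structure and helper names (send_persist_request and so on) being assumptions rather than the patent's interfaces.

```c
#include <stdint.h>
#include <stdio.h>

#define PAGES_PER_BLOCK 64

struct log_pointers {
    uint64_t write_ptr;    /* next page slot in block-write-list order    */
    uint64_t replay_ptr;   /* first page whose mapping may not be durable */
};

/* Stand-in for the message to the upper (mapping) log. */
static void send_persist_request(uint64_t block_no) {
    printf("persist mapping of block %llu\n", (unsigned long long)block_no);
}

static void on_page_programmed(struct log_pointers *lp) {
    lp->write_ptr++;
    if (lp->write_ptr % PAGES_PER_BLOCK == 0)  /* crossed a block boundary */
        send_persist_request(lp->write_ptr / PAGES_PER_BLOCK - 1);
}

static void on_persist_complete(struct log_pointers *lp) {
    lp->replay_ptr += PAGES_PER_BLOCK;  /* replay never starts before this */
}

int main(void) {
    struct log_pointers lp = { 0, 0 };
    for (int i = 0; i < PAGES_PER_BLOCK; i++) on_page_programmed(&lp);
    on_persist_complete(&lp);           /* upper log answered */
    printf("write=%llu replay=%llu\n", (unsigned long long)lp.write_ptr,
           (unsigned long long)lp.replay_ptr);
    return 0;
}
```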

In addition, for error recovery, the second processing module may detect an error page, copy the valid pages of the error block containing the error page to the uninterpreted block of the second processing module, and, when the copy is complete, logically swap the error block and the uninterpreted block of the second processing module.

The checkpoint information of the second processing module may further include a block write list of the second processing module, and the second processing module may detect the error page using the block write list.

In addition, the checkpoint information of the second processing module may further include a block write list of the second processing module and a write pointer of the second processing module, and the second processing module may detect the error page by checking pages for errors along the block write list of the second processing module, starting from the page indicated by the write pointer of the second processing module.

In addition, the second processing module may transmit to the third processing module the checkpoint information updated by the logical swap of the error block and the uninterpreted block of the second processing module.

In addition, the flash translation layer may further include a higher-level processing module for processing upper mapping information of the mapping information, and after the error is recovered using the uninterpreted block of the second processing module, the second processing module may acquire the upper mapping information of the mapping information from the pages in which the mapping information is stored and transmit the upper mapping information of the mapping information to the higher-level processing module.

In addition, the checkpoint information of the second processing module may further include a block write list of the second processing module, a write pointer of the second processing module, and a replay pointer of the second processing module, and the second processing module may acquire upper mapping information from the page indicated by the replay pointer of the second processing module up to the page indicated by the write pointer of the second processing module, following the block write list of the second processing module, and transmit the upper mapping information to the higher-level processing module.

When receiving a mapping command, the second processing module may store the mapping information corresponding to the mapping command in a cache, and when receiving a read command, it may determine whether mapping information corresponding to the read command exists in the cache.

In addition, the second processing module may store upper mapping information of the mapping information and the mapping information in the same page of the flash memory.

In addition, the flash translation layer may further include a higher-level processing module for processing upper mapping information of the mapping information, and the second processing module may store the mapping information in the flash memory and then transmit the upper mapping information of the mapping information to the higher-level processing module.

In addition, the second processing module may store the mapping information in a flash memory on a page-by-page basis, and may advance the write pointer along the block write list when page-by-page mapping information is stored.

In addition, the flash translation layer may further include a higher-level processing module that processes upper mapping information of the mapping information; the second processing module may transmit a persistence request signal to the higher-level processing module when its write pointer crosses a block boundary; the higher-level processing module may, in response to the persistence request, store the upper mapping information of the block corresponding to the persistence request signal in the flash memory; and the second processing module may advance its replay pointer along the block write list upon receiving a persistence completion signal from the higher-level processing module.

The error may also include a power failure that occurs asynchronously.

In addition, a processing module included in the flash translation layer may include: an interface unit connected to at least one of a host, another processing module, and a flash memory; a cache unit including volatile memory; and a processing unit that processes data or information according to the type of the processing module, using the interface unit and the cache unit.

The flash translation layer may further include a fourth processing module for processing block state information; the checkpoint information may further include information on an uninterpreted block of the fourth processing module, and the fourth processing module can recover from the error using the uninterpreted block of the fourth processing module.

In addition, the flash translation layer may further include a fifth processing module operating as a non-volatile buffer for another processing module; the checkpoint information may further include information on an uninterpreted block of the fifth processing module, and the fifth processing module can recover from the error using the uninterpreted block of the fifth processing module.

Another aspect provides a storage device comprising: a flash memory controller that manages synchronous errors generated by flash memory operations; a first processing module for processing data; a second processing module for processing mapping information of the data; and a third processing module for processing checkpoint information including information on an uninterpreted block of the first processing module and information on an uninterpreted block of the second processing module. The first processing module can recover from an asynchronous error using the uninterpreted block of the first processing module, and the second processing module can recover from the asynchronous error using the uninterpreted block of the second processing module.

A flash translation layer according to another aspect includes a D-log for processing data, and a plurality of M-logs for hierarchically processing the mapping information of the data.

In this case, each of the plurality of M-logs may store the information received from its lower log in the flash memory resource allocated to it and, when the size of the information received from the lower log is larger than a predetermined size, transmit the mapping information of that flash memory resource to its upper log.

Also, among the plurality of M-logs, a log for which that size is smaller than the predetermined size may be determined to be the top-level M-log.

Also, the highest-level M-log may store the information received from the lower-level log in its own flash memory resource, and may transmit the mapping information of the flash memory resource to the C-log processing the checkpoint information.

The characteristics of each of the plurality of M-logs may be set individually per M-log, and the characteristics of each of the plurality of M-logs may include at least one of the mapping unit of that M-log and the cache management policy of that M-log.

The flash translation layer may further include an L-log for processing block state information, and a plurality of LM-logs for hierarchically processing the mapping information of the block state information.

Each of the plurality of LM-logs may store the information received from its lower log in the flash memory resource allocated to it and, when the size of the information received from the lower log is larger than a predetermined size, transmit the mapping information of that flash memory resource to its upper log.

Also, among the plurality of LM-logs, a log for which that size is smaller than the predetermined size may be determined to be the top-level LM-log.

Also, the highest-level LM-log may store information received from the lower-level log in its own flash memory resource, and may transmit the mapping information of the flash memory resource to a C-log for processing checkpoint information.

The characteristics of each of the plurality of LM-logs may be set individually per LM-log, and the characteristics of each of the plurality of LM-logs may include at least one of the mapping unit of that LM-log and the cache management policy of that LM-log.

A method for designing a flash translation layer according to another aspect includes providing a plurality of building blocks for constructing a flash translation layer. Here, the plurality of building blocks include: a first processing block for processing data; at least one second processing block for hierarchically processing the mapping information of the data; and a third processing block for processing checkpoint information including information on an uninterpreted block of the first processing block and information on an uninterpreted block of the at least one second processing block. The first processing block may recover from an error using the uninterpreted block of the first processing block, and the at least one second processing block may recover from the error using the uninterpreted block of the at least one second processing block.

The method for designing the flash translation layer may further comprise receiving a configuration related to the design of the flash translation layer, and generating the flash translation layer based on the plurality of building blocks and the configuration.

In addition, the configuration may include: a setting related to the number of threads implementing the flash translation layer; a setting related to the number of cores driving the flash translation layer; a setting related to the number of threads processed per core; and a setting related to the mapping between a plurality of cores and a plurality of threads.

Also, the method for designing the flash translation layer may include receiving a second configuration associated with the design of the flash translation layer, and adaptively regenerating the flash translation layer based on the plurality of building blocks and the second configuration.

FIG. 1 illustrates a classification of flash memory faults according to an embodiment;
FIG. 2 illustrates a storage system architecture according to an embodiment;
FIG. 3 illustrates a log interface according to an embodiment;
FIG. 4 illustrates a flash translation layer built from a combination of logs according to an embodiment;
FIG. 5 illustrates a hierarchical mapping structure for a storage device providing a logical address space according to an embodiment;
FIG. 6 illustrates host write command processing according to an embodiment;
FIG. 7 illustrates recursive processing of mapping data over the hierarchical mapping structure according to an embodiment;
FIG. 8 illustrates host read command processing according to an embodiment;
FIG. 9 illustrates a method of guaranteeing data consistency according to an embodiment;
FIG. 10 illustrates asynchronous errors according to an embodiment;
FIG. 11 illustrates the elimination of residual effects during the structural recovery process according to an embodiment;
FIG. 12 illustrates the structural recovery process from the perspective of the storage system according to an embodiment;
FIG. 13 illustrates the replay of mapping information during the functional recovery process according to an embodiment;
FIG. 14 illustrates examples in which recoverability is destroyed when there is no synchronization between logs according to an embodiment;
FIG. 15 illustrates an interface that guarantees recoverability according to an embodiment;
FIG. 16 illustrates the functional recovery process from the perspective of the storage system according to an embodiment;
FIG. 17 illustrates the steps of formal verification according to an embodiment;
FIG. 18 illustrates temporal properties of a storage system according to an embodiment;
FIG. 19 illustrates the guarantee of data consistency according to an embodiment;
FIG. 20 illustrates the guarantee of recoverability according to an embodiment.

Classification of Flash Memory Faults in the Embodiments

A NAND flash memory chip is composed of a plurality of blocks, and a block comprises a plurality of pages. Each page is divided into a data area for storing host data and a spare area for storing metadata associated with the host data, such as mapping information and ECC information. Flash memory supports three operations: read, program, and erase. Because in-place update is impossible, an erase operation must be performed before data is written again. The minimum unit of the erase operation is the block, and the minimum unit of the program and read operations is the page. In terms of reliability, NAND flash memory permits bad blocks that cannot guarantee normal operation, and bad blocks can arise during operation as well as at the initial fabrication stage. It is also possible for the bits of a previously written page to be inverted by disturbance mechanisms.
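The geometry just described can be summarized in a small C model; the sizes chosen below are typical illustrative values, not values given in the patent.

```c
/* Compact model of NAND geometry: a chip is an array of blocks, a block
 * an array of pages, each page split into a data area and a spare area
 * for per-page metadata (mapping entry, ECC/CRC). */
#include <stdint.h>

#define PAGE_DATA_BYTES   4096   /* host data area                 */
#define PAGE_SPARE_BYTES  128    /* mapping info, ECC, CRC, etc.   */
#define PAGES_PER_BLOCK   64
#define BLOCKS_PER_CHIP   1024

struct nand_page {
    uint8_t data[PAGE_DATA_BYTES];
    uint8_t spare[PAGE_SPARE_BYTES];
};

struct nand_block {
    struct nand_page pages[PAGES_PER_BLOCK]; /* programmed in ascending order */
};

struct nand_chip {
    struct nand_block blocks[BLOCKS_PER_CHIP]; /* erase unit = one block */
};
```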

Referring to FIG. 1, flash memory errors are classified into synchronous faults arising from factors internal to the flash memory and asynchronous faults caused by external environmental factors. Synchronous errors indicate failures of internal flash memory operations such as erase/program/read. Erase and program errors are caused by corruption of cells for unknown reasons, regardless of whether the block has reached the end of its rated life; such an error is permanent for the block, so blocks with these errors must be handled so that they are never used again in the system. A read error, on the other hand, is a bit-flip error caused by program disturbance, read disturbance, or a data retention failure. A bit-flip error is transient: erasing the page returns it to a normal page that can be used again. Because synchronous errors occur as the result of a flash memory operation, they can be resolved in a lower layer independent of the FTL (a bad block management module and an error correction module), and this layered approach can present the FTL with the illusion of a flash memory in which no synchronous error ever occurs. The embodiments assume this layered approach, so the FTL handles only asynchronous power failures. This is an important assumption: it greatly reduces the complexity of the power failure recovery algorithm and makes it possible to prove correct recovery in any power failure situation.

The asynchronous error to be handled by the FTL is an error that can occur at any time, with no temporal relation to internal flash memory operations such as erase/program/read; specifically, a power failure. Because an erase/program operation that changes the state of a flash memory page is not guaranteed to be atomic, a page can be left in various states if power fails during the operation. In particular, even if data appears to have been written successfully to the page on which the power failure occurred during programming, that page is highly vulnerable to read disturbance and can non-deterministically return incorrect data each time it is read; such a page is called a residual effect page. A residual effect page can turn into a source of system error at any moment, so if residual effect pages are not identified and removed during power failure recovery, the reliability of the storage device cannot be guaranteed. What makes the problem harder is the sibling page problem of MLC flash memory chips: FTL metadata (or host data) that was already written successfully in the past may be lost at any time if a power failure occurs while its sibling page is being programmed. This means that, from the perspective of the FTL, the successful completion of a program operation on a physical page no longer guarantees the durability of the data recorded on that page.

Design of the Flash Translation Layer Framework in the Embodiments

FIG. 2 shows the flash memory storage device architecture assumed in the embodiments: a flash memory subsystem composed of a flash memory controller and a BMS layer, with the FTL implemented on top using the HIL framework. The flash memory controller is assumed to control a plurality of flash memory chips, support flash memory read/write/erase operations, and detect and report errors that occur during operation execution. The BMS layer is assumed to handle all synchronous errors that occur during the execution of flash memory operations. To do this, a typical BMS layer maintains spare blocks for replacing blocks on which a write/erase error occurs, and error control logic, such as an ECC encoder/decoder, to handle read errors. The BMS layer can therefore provide the upper layer with a virtual physical address space in which no flash memory operation error occurs, until the spare blocks are exhausted or a read error occurs that the ECC circuit cannot correct.

The FTL is implemented, using the HIL framework, on top of the flash memory subsystem that exposes a virtual physical address space free of synchronous flash memory errors. For each kind of (meta)data managed by the FTL, such as host data, mapping data, and block state data, the HIL framework defines a log: an object that implements an updatable address space using flash memory blocks in the virtual physical address space. Users can combine logs to implement various FTLs. A log consists of a flash log, a series of flash memory blocks for durably writing its data, and a cache implemented in volatile memory for performance.

Logs can be referred to as processing modules because they process metadata such as host data, mapping data, or block state data. Because the HIL framework provides logs of various types as the building blocks for designing an FTL, a log can also be referred to as a processing block. A log may be implemented in software, in hardware, or in a combination of the two.

FIG. 3 shows the general log interface. (1) A read/write interface is required to provide other logs or the external host system with a linear address space, updatable in place, for the (meta)data the log manages. (2) An interface with the flash memory subsystem is required for the log to read and write flash memory. (3) When the log writes data to flash memory, it must update the mapping information for that data, and (4) it must reclaim the space invalidated by write operations and update block state information, so interfaces toward the mapping log and the block state log are needed. In addition, persistent data must be guaranteed to be recoverable after a sudden power failure; to that end, every log requires an interface for periodically checkpointing all the information needed for recovery.
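One way to render this general log interface is as a C operations table; the member names below paraphrase the four interface groups in the text and are assumptions, not the patent's identifiers.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pla_t;   /* pseudo logical address (see below) */

struct log_ops {
    /* (1) in-place-updatable linear address space offered to others */
    int (*read)(pla_t addr, void *buf, size_t len);
    int (*write)(pla_t addr, const void *buf, size_t len);

    /* (2) access to the flash memory subsystem underneath */
    int (*flash_read)(uint32_t block, uint32_t page, void *buf);
    int (*flash_program)(uint32_t block, uint32_t page, const void *buf);
    int (*flash_erase)(uint32_t block);

    /* (3) mapping update toward the upper mapping log */
    int (*install_mapping)(pla_t addr, uint32_t block, uint32_t page);

    /* (4) block-state update (invalidation/GC) toward the block state
     * log, plus periodic checkpointing of recovery information */
    int (*update_block_state)(uint32_t block, int state);
    int (*checkpoint)(void);
};
```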

Depending on the kind of data managed by the FTL, log types include the D-type log for processing host data, the M-type log for processing mapping information, the L-type log for processing block state information, and the C-type log for processing checkpoint information used in power failure recovery. In some cases, a W-type log that operates as a non-volatile buffer for another log may be added. The log types may be further divided: an LM-type log, distinct from the M-type log that processes the mapping information of host data, processes the mapping information of block state information. The D-type, M-type, L-type, C-type, W-type, and LM-type logs are referred to below as the D-log, M-log, L-log, C-log, W-log, and LM-log, respectively.

Although the interface and behavior differ slightly by log type, all logs are alike in that each uses the resources allocated to it to provide other logs or the host system with a linear address space, updatable in place, over its own data. The HIL framework integrates the address spaces managed by all the logs that make up the FTL and defines the result as the pseudo logical address (PLA) space. As shown in FIG. 3, each log occupies a distinct region, managed by itself, within this virtual logical address space.

The HIL framework can design and implement an arbitrary variety of FTLs through combinations of these different kinds of logs. The connection structure among the logs is determined by the hierarchical structure of the mapping and by the attributes of the data each log manages. FIG. 4 shows an example of a connection structure between logs. When a log writes data to flash memory, mapping information is generated, and that mapping information is managed by the upper mapping log. Accordingly, the logs in the HIL framework form a fixed hierarchy, as shown in FIG. 4, through their mapping relations. To process a read/write command from the host computer, the FTL must perform address translation by consulting the mapping information; in the HIL framework, this translation traverses the multiple logs on the mapping hierarchy. The hierarchical extension of the mapping structure, and host read and write processing over it, are described later.

When a log writes data, it must also update the state information of the physical blocks. In the HIL framework, the L-log is assumed to be dedicated to updating the block state information transmitted by all the logs; this relation appears as connection lines through which every log sends block state information to the L-log. The C-log records the information used for power failure recovery. Every log has its own checkpoint information, but because the amount of checkpoint information is generally small, one checkpoint log can manage the checkpoint information of all logs. Hereinafter, for ease of description, the embodiments assume an FTL composed of a D-log, M-logs, and a C-log. The same scheme applies as-is, however, to the L-log, which manages the liveness and invalidation of blocks and pages for garbage collection, and to the W-log, which operates as a non-volatile buffer for another log in a hybrid-mapping FTL.

I. Hierarchical mapping structure

When a log writes data to flash memory, mapping information is generated. Since a mapping entry is a pointer to the flash memory physical address at which host data is stored, the mapping information is much smaller than the host data. The total size of the mapping information is determined by the size of the host data area, the mapping unit, and the size of a mapping entry. As in FIG. 5, if the flash memory page size is 4 KB and a mapping entry is 4 B, the mapping information for a 4 GB host address space is 4 MB, 1/1024 of 4 GB. Viewed the same way as host data, the mapping data also forms a linear address space. The 4 GB user address space is written to flash memory page by page, and the resulting 4 MB mapping address space is itself divided recursively into page-size units and written to flash memory, so the mapping data for the mapping data occupies, in turn, only 4 KB. As shown in FIG. 5, the relation between the D-log and the M0-log is the same as the relation between the M0-log and the M1-log. Since mapping data has a smaller address space than the data it describes, and the mapping relation is recursive, the mapping relations among logs induced by host writes naturally form a hierarchy. This lets the embodiments provide a fully general hierarchical structure, with no restriction on how many levels of mapping are used.
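The size arithmetic of this example can be made explicit with a short program; it assumes the 4 KB page and 4 B entry sizes from FIG. 5.

```c
/* Each mapping level is entry_size/page_size = 1/1024 the size of the
 * level below: 4 GB -> 4 MB (M0) -> 4 KB (M1), one page, where the
 * hierarchy can stop. */
#include <stdint.h>
#include <stdio.h>

int main(void) {
    const uint64_t page_size  = 4096;   /* 4 KB flash page  */
    const uint64_t entry_size = 4;      /* 4 B map entry    */
    uint64_t space = 4ULL << 30;        /* 4 GB host space  */

    for (int level = 0; space > page_size; level++) {
        /* one entry per page of the level below */
        space = (space / page_size) * entry_size;
        printf("M%d mapping data: %llu bytes\n",
               level, (unsigned long long)space);
    }
    return 0;   /* prints 4 MB for M0, then 4 KB for M1 */
}
```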

In the HIL framework, the mapping hierarchy can continue until the top-level mapping data fits within some bounded size (e.g., a single page). In the example of FIG. 5, the address space of the M0-log is 4 MB, so the address space of the M1-log is 4 KB, exactly one flash memory page; the mapping hierarchy can therefore terminate at the M1-log. An FTL implemented in the HIL framework keeps only the top-level mapping information in volatile memory during operation, and can track all up-to-date mapping information and data locations by following the mapping hierarchy. Because the entire mapping table need not be held in volatile memory, even a flash memory storage system with limited volatile memory can support a very large FTL (e.g., a page-mapped FTL). The FTL can periodically checkpoint the location of the top-level mapping information during operation and scan only the checkpoint area when the storage device restarts. For example, the flash memory location of the mapping information processed by the top-level log (e.g., the M1-log) may be stored in the C-log.

FIG. 5 also shows the regions occupied by the D-log, M0-log, and M1-log when the HIL maintains a byte-addressed virtual logical address space internally in order to provide a 4 GB logical address space to the host system. The D-log provides the logical address space to the host system and accesses the virtual logical address space implemented by the M0-log. Similarly, the M0-log provides a virtual logical address space to the D-log and accesses the virtual logical address space managed by the M1-log to implement it. In this way, a read or write command from the host system starts at the data log and propagates in a chain up the log hierarchy, undergoing address translation from virtual logical addresses as it goes. As FIG. 5 shows, the logs on the mapping hierarchy interconnect easily because the upper and lower interfaces of a log are identical within the mapping relation. The HIL framework can therefore construct hierarchical mapping structures of many shapes even as storage capacity grows or the mapping unit of a log changes.

II. Host write and read operations

An FTL implemented in the HIL framework provides a storage interface through which the host system can read and write an arbitrary number of sectors, and it must meet the data consistency and durability requirements of an ordinary storage device such as an HDD. In addition, the FTL should naturally accommodate the command queuing capabilities supported by typical storage interfaces, such as SATA Native Command Queuing (NCQ) or SCSI Tagged Command Queuing (TCQ).

1. Host write

FIG. 6 shows how a host write request is processed on a three-level mapping hierarchy. The host write procedure divides into a foreground operation, in which the host data is stored in the cache of the D-log and a write completion response is returned, and a background operation, in which the mapping information is updated in sequence along the mapping hierarchy of the logs.

A sector write request to the logical address space is converted by the write thread into page-level install_data operations on the virtual logical address space and passed to the D-log (1). The write thread interfaces only with the D-log in the HIL framework; it runs independently like a log, but it is not a log because it leaves no non-volatile state. The D-log stores the data requested by the host in volatile memory (2) and returns a completion response with installed_v (3). When the write thread has received an installed_v response for every page write constituting a single host write request, it can respond to the host with write_ack. If a host read request arrives for the same logical addresses as data for which the write thread has already sent its completion response, but which still sits in the volatile buffer and has not been written to flash memory, the FTL serves the read with the latest data in the buffer, thereby guaranteeing data consistency during device operation.

If there is no separate command or setting from the host system that forces the data to be persisted, the D-log writes the cached data to flash memory at a suitable time according to its internal buffer management policy (4). The location at which a logical page is written constitutes its mapping information, which is transferred to the upper M0-log by the install_mapping command (5). Because the install_mapping command received by the M0-log is conceptually identical to a write command received from the write thread, the M0-log stores the mapping information in volatile memory and then returns installed_v to the D-log (6). After this completion response, the M0-log must guarantee the consistency of the mapping data to the D-log by returning the most recently installed mapping data whenever the D-log queries the mapping information. This is the exact analogue of the guarantee that the D-log's write completion response gives the host system for host data during storage operation.

The M0-log processes mapping data in essentially the same way the data log processes host data. When mapping data is written from the M0-log to flash memory, a mapping for that mapping data is created, and a new install_mapping command for the M1-log's updatable address space is sent to the M1-log. This chain of write requests, propagated up the address spaces of the hierarchical logs by the same write-processing structure, repeats recursively until it reaches the top-level mapping log. FIG. 7 shows the recursive procedure in which each mapping log gathers multiple mapping entries received from its lower log, writes them into one page, and sends a single mapping entry up to its upper log in turn. Because a mapping entry is much smaller than the data it points to (e.g., 4 B vs. 4 KB), a page-level operation in the lower log becomes a request-level operation in the upper log.
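The recursive propagation just described can be sketched as follows; the linked hierarchy and function names are illustrative assumptions, and the caching and flush policy of a real mapping log is elided.

```c
#include <stdint.h>
#include <stdio.h>

struct map_entry { uint32_t block, page; };

struct hil_log {
    const char     *name;
    struct hil_log *upper;      /* NULL at the top-level mapping log */
};

/* install_mapping: conceptually identical to a write into the upper
 * log's address space (see FIG. 7). */
static void install_mapping(struct hil_log *upper, uint64_t pla,
                            struct map_entry m);

/* Called when a log writes one cached page to flash. */
static void flush_page(struct hil_log *log, uint64_t pla,
                       struct map_entry where_written) {
    if (log->upper)                       /* propagate until the top */
        install_mapping(log->upper, pla, where_written);
}

static void install_mapping(struct hil_log *upper, uint64_t pla,
                            struct map_entry m) {
    printf("%s: install pla=%llu -> (%u,%u)\n", upper->name,
           (unsigned long long)pla, m.block, m.page);
    /* cache the entry; when a full page of entries is later flushed,
     * flush_page() fires again and the chain repeats recursively */
}

int main(void) {
    struct hil_log m1 = { "M1-log", NULL };
    struct hil_log m0 = { "M0-log", &m1 };
    struct hil_log d  = { "D-log",  &m0 };
    flush_page(&d, 42, (struct map_entry){ 7, 3 });
    return 0;
}
```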

The cache maintained by each log improves the response time of write requests between the host and the D-log and between logs, and improves the efficiency of flash memory writes. Moreover, since the logs of each level can execute in parallel, each can independently write the data in its volatile memory to flash memory. An FTL implemented in the HIL framework can therefore optimize performance by running each log in its own thread concurrently and issuing read/write requests to several flash memory chips at the same time.

2. Host read

FIG. 8 shows how a read request for logical address p is processed on a three-level mapping hierarchy. Because a host write starts with the data log's write and propagates in a chain to the upper mapping logs, the latest data or mapping information is always found bottom-first. In addition, since each log maintains volatile memory, the most recently updated or referenced data may still reside there. To guarantee data consistency, an FTL implemented in the HIL framework therefore operates with bottom-up queries, mirroring the write path.

The read thread first queries the D-log; on a hit, the D-log can return the data from volatile memory to the host system and complete the read without touching flash memory. Otherwise the data must be read from flash memory, and a bottom-up query process begins in order to obtain the necessary mapping information. The query ascends to the parent log until the queried data is found in volatile memory; if no log on the mapping hierarchy has it, the query reaches the top-level mapping log. In the HIL framework, a query to the top-level mapping log always succeeds, because all mapping information managed by the top-level mapping log is assumed to be kept in volatile memory. When a lower log misses but its parent log hits, the data found in the parent log is the mapping information of the lower log, so the read thread re-issues the query top-down, passing along the mapping information received from the parent. A log receiving such a re-query must read the data from flash memory using the mapping information passed with it, so the re-query always returns the queried data. This top-down re-query is repeated, from the level at which the queried data was returned from volatile memory, back down to the data log, in the reverse order of the bottom-up queries.

The right side of FIG. 8 walks through the bottom-up query and top-down re-query for the case where the queried data is absent from the D-log but present in the M0-log. First, the read thread converts the host read request into page-level query requests and passes them to the data log (1). Like the write thread, a read thread maintains only volatile data structures for its internal operation, which distinguishes it from a log, which writes non-volatile data to flash memory. The D-log has no data in its volatile cache for the queried virtual logical address pla_d, so it answers with a cache miss (Query_ack(miss, null)) (2); the read thread then sends a new query to the M0-log to obtain the mapping information for the pla_d address (3). A query to an upper mapping log has the same interface as a query to the data log. In the scenario of FIG. 8, the M0-log takes a cache hit and returns the hit together with the query result (B_i, P_j), the mapping information for logical page pla_d, in its Query_ack to the read thread (4). The read thread then re-sends the query command to the D-log along with the mapping information obtained from the upper mapping log (5); the D-log reads the data from the given flash memory address (6, 7) and completes the read response to the read thread (8). After performing the same bottom-up query for every logical page constituting one host request and receiving the final completion response from the data log, the read thread can deliver the collected data to the host system. The interface between the read thread and the data log is the same as that between the read thread and the M0-log, and indeed any upper mapping log Mi, so the scenario of FIG. 8 extends directly to the case where the cache hit occurs at the Mi-log in the bottom-up query.
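A compressed sketch of the bottom-up query followed by the top-down re-query might look like this; the interfaces are assumed for illustration, and the per-level translation of the queried address is elided.

```c
#include <stdbool.h>
#include <stdint.h>

struct rlog {
    /* returns true on cache hit; *out is the cached value (data, or the
     * mapping information for the log below) */
    bool (*query)(uint64_t pla, uint64_t *out);
    /* read from flash at the location given by the parent's mapping */
    uint64_t (*requery)(uint64_t pla, uint64_t mapping);
};

/* logs[0] = D-log, logs[n-1] = top-level mapping log (always hits).
 * Address translation between levels is elided for brevity. */
uint64_t host_read(struct rlog *logs, int n, uint64_t pla) {
    uint64_t val = 0;
    int hit = 0;
    while (hit < n && !logs[hit].query(pla, &val))
        hit++;                            /* bottom-up: ascend on miss */
    for (int i = hit - 1; i >= 0; i--)
        val = logs[i].requery(pla, val);  /* top-down: val is mapping  */
    return val;                           /* data for the host         */
}
```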

Unlike the write thread, the read thread interacts directly with all the mapping logs as well as the data log, so that host read requests can be processed with as much parallelism as possible. A host write command is non-blocking: the response can be returned as soon as the data is in volatile memory, and the actual flash writes can be performed later, spread across several flash memory chips. A host read command, however, is blocking: the response can be completed only after the requested data has been received, so it is important to be able to issue requests to as many flash memory chips as possible at once. In particular, if the host interface supports command queuing such as NCQ, each host read command can be distributed across multiple read threads, and each read thread can interact with the logs independently, issuing reads to the flash memory chips simultaneously, which greatly improves host read throughput. The HIL framework is designed with this extension in mind, which is why the read thread interacts directly with each log.

To guarantee data consistency between host writes and reads, data whose write has completed may be evicted from the buffer only after the installed_v response for its mapping arrives from the upper mapping log. FIG. 9 shows the problem that can arise if the data is released from the buffer after the flash write completes but before installed_v is received. If the data and its mapping information disappear from the lower log before it is certain that the mapping for the latest data has been reflected in the upper mapping log, there can be an interval during which the latest data, or all mapping information leading to it, appears to have vanished from the system. A thread processing a read operation runs concurrently with and independently of the log threads handling writes; if an independently executing read thread performs a read during this interval, it will return stale version data instead of the latest version of the logical page. This situation must be prevented, because it violates the data consistency requirement that a read of any logical address return the result of the most recently completed write to that address. Therefore, every log must obey the synchronization rule of releasing information about a logical page from its cache only after receiving the installed_v response, as shown on the right of FIG. 9. This rule is a key element of the HIL framework's design for provability, by which its correctness can be verified.
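The synchronization rule can be captured in a few lines; the field names below are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

struct cache_entry {
    uint64_t pla;
    bool     flushed;        /* written to flash                    */
    bool     mapping_acked;  /* installed_v received from upper log */
};

/* Violating this predicate is exactly the failure of FIG. 9: a window
 * where neither the data nor a persistent path to it exists. */
bool may_evict(const struct cache_entry *e) {
    return e->flushed && e->mapping_acked;
}

void on_installed_v(struct cache_entry *e) {
    e->mapping_acked = true;   /* eviction now safe */
}
```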

Asynchronous Error Recovery in the Embodiments

The power failure recovery scheme built into the HIL framework guarantees complete recovery from power failures. Below, the effects of a power failure are described first, and then the recovery method is explained.

FIG. 10 shows the two effects of a power failure during an operation that changes the internal state of the flash memory: the residual effect and the lost update. The residual effect is the unintended state left in the flash memory when power fails during a flash write or erase operation. Because the flash memory chip does not guarantee the atomicity of an operation, a power failure during a write can leave the page in one of three states, as shown on the right of FIG. 10: 1) nothing was written, 2) part of the data was written, or 3) all of the data appears to have been written. In states 1) and 2) the intended data was not written, so the page cannot be used; such pages can be detected by ECC or the like during power failure recovery and either skipped or erased and reused. State 3) is subtler: because the flash chip's write operation is not atomic, the data cannot be fully trusted even though it has no ECC error. A page on which power failed during programming is vulnerable to read disturbance, and even if it returns the correct data at first, there is a high probability that it will return wrong data on subsequent reads during operation. If the pages or blocks bearing residual effects are not handled properly, the resulting errors are potentially unpredictable, and as power failures repeat, these latent error pages can accumulate. If a page bearing a residual effect holds system meta-information such as mapping information and the FTL does not handle it appropriately, the information may read correctly for a while and then corrupt later, severely damaging the consistency of the storage device.

A lost update occurs when power fails while the FTL is performing the series of internal operations needed to process one external command, such as a host write request, before the whole job is complete, thereby destroying consistency among the FTL's data structures. The inconsistencies possible at the moment of power failure vary with the FTL implementation. Fundamentally, either the data has been written to flash memory but its mapping information has not, or, as in FIG. 10, the mapping information held in volatile memory disappears in the failure, making it impossible to recover the mapping for data written before the crash. Even if the FTL has guaranteed durability to the host system, for example in response to a flush-cache command, the data cannot be recovered because its mapping is lost; the storage device may even end up in an unrecoverable state after the power failure. Conversely, if the mapping information reaches flash memory first and power fails before the data is written, the mapping may point at wrong data, and the storage device cannot guarantee data integrity. Therefore, whenever the FTL performs an operation that changes the state of the flash memory, it must keep recoverability in mind, and the recovery process must remedy both effects, residual effects and lost updates, systematically and provably.

Power failure recovery in the HIL framework consists of two steps: structural recovery and functional recovery. Structural recovery atomically removes all residual effects in the storage; it is a new concept that does not exist in previously known recovery techniques. Functional recovery restores consistency between data and metadata: each log replays the work that would have been done during normal operation had no power failure occurred.

I. Structural recovery

In structural recovery, each log removes its residual effects independently of the other logs. To eliminate residual effects, it is necessary to 1) bound the set of blocks on which a state-changing flash operation could have been in flight at the moment of power failure, and identify within them the block and page where the failure occurred; 2) prepare a spare block into which the valid data of an affected block can be copied; 3) guarantee that the elimination of residual effects completes atomically across all the logs constituting the HIL framework, even under nested power failures occurring at arbitrary points during recovery; and 4) provide a way to shorten the time needed to find the block where the power failure occurred.

Calling the block or page of flash memory bearing a residual effect the crash frontier, the HIL framework maintains, for each log, a block write list: the ordered list of blocks the log is using or will use in the near future. During normal operation, a log writes to blocks in the order given by its block write list, and writes pages in ascending order within each block. In the recovery process, to determine which block a log was writing at the moment of power failure, the FTL scans only the blocks on the block write list, not the entire flash memory space. Whether a residual effect actually occurred is determined using the ECC and CRC information recorded in the spare area together with the data. During recovery the log scans the block write list and copies data up to the page immediately preceding the page of the crash frontier block at which the CRC error occurred.

During recovery, the crash frontier page may or may not read back normally. Reading back normally merely means it is a latent error page that may still return wrong data later through read disturbance. Structural recovery in the HIL framework removes such residual effect pages by moving the valid pages of the power failure block to a new block once that block has been identified: if a residual effect page returns a valid value it is re-recorded on a new page, so the residual effect is removed naturally, and if it does not return a valid value it is excluded from the copy and thereby removed from the system.

The page copy performed for structural recovery must itself survive a further power failure during the valid page copy, so it is performed on a special block called the uninterpreted ("do not care") block. An uninterpreted block is a block reserved specifically for structural recovery operations and is maintained per log together with the block write list. The system attaches no interpretation to this block at all: even if a power failure occurs while programming the uninterpreted block during structural recovery, the block is still the uninterpreted block after reboot, so system consistency is unaffected. The uninterpreted block is erased before each use, so the valid page copy of structural recovery always starts on a fresh block. When all writes to the uninterpreted block have completed and the exchange of roles between the power failure block on the block write list and the uninterpreted block has been committed, structural recovery is complete. In other words, even if power fails again during structural recovery, every write command executed on the uninterpreted block is harmless, and the next recovery attempt can restart from the initial recovery point as if nothing had happened. Each log can also keep a write pointer to find the power failure block quickly: the log writes in the order given by its block write list and advances the write pointer as writes complete. Pages before the write pointer are guaranteed durable and need not be scanned at power failure recovery, so the scan time is reduced compared with scanning from the start of the block write list. FIG. 11 summarizes the key data structures and mechanisms by which logs eliminate residual effects.
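Per-log residual effect removal, as just described, can be sketched as follows. The flash primitives are assumed to be supplied by the flash memory subsystem, and all identifiers are illustrative, not the patent's.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGES_PER_BLOCK 64
#define PAGE_BYTES      4096

/* Assumed interfaces of the flash subsystem underneath:
 * flash_read_ok() returns false when the ECC/CRC gate rejects a page. */
bool flash_read_ok(uint32_t blk, uint32_t pg, void *buf);
void flash_program(uint32_t blk, uint32_t pg, const void *buf);
void flash_erase(uint32_t blk);

struct log_ckpt {
    uint32_t error_block;          /* crash-frontier block              */
    uint32_t uninterpreted_block;  /* reserved block, ignored by system */
};

void remove_residual_effects(struct log_ckpt *ck) {
    uint8_t page[PAGE_BYTES];
    flash_erase(ck->uninterpreted_block);  /* always start on a clean block */
    for (uint32_t pg = 0; pg < PAGES_PER_BLOCK; pg++) {
        /* a page that reads back valid is re-recorded on a fresh page;
         * one that does not is simply excluded from the copy */
        if (flash_read_ok(ck->error_block, pg, page))
            flash_program(ck->uninterpreted_block, pg, page);
    }
    /* logical role swap, in volatile state only; it takes effect
     * atomically when the C-log writes the combined checkpoint */
    uint32_t tmp = ck->error_block;
    ck->error_block = ck->uninterpreted_block;
    ck->uninterpreted_block = tmp;
}
```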

FIG. 12 shows the structural recovery process at the storage level. An FTL built on the HIL framework can be composed of a plurality of logs, each operating in parallel, so the key to structural recovery is to make the removal of residual effects across multiple logs atomic. To do this, structural recovery proceeds as follows.

Structural recovery begins with the C-log locating the most recent checkpoint information in the checkpoint area. Since the checkpoint area is of bounded size and every page storing checkpoint information carries a timestamp, the checkpoint information can be found by scanning the checkpoint area in a short time (1). The checkpoint information contains what each log needs for power failure recovery, such as its block write list and its uninterpreted block, and the C-log transmits it to the corresponding logs (2). Using the received checkpoint information, each log scans the blocks on its block write list starting from its write pointer, locates the power failure block and the position of the ECC/CRC error within it (3), and copies the valid pages into its uninterpreted block (4).

When the copy is complete, each log swaps, in volatile memory, the block numbers of the uninterpreted block and the power-failure block recorded in its checkpoint information, as shown on the right side of FIG. 12 (5). By updating this volatile HIL meta information, the residual effect is completely eliminated and each shadow block holds only the valid pages of the power-failure block. The checkpoint information changed by the copy and by the update of HIL metadata in each log is then retransmitted to the checkpoint log (6). When the changed HIL metadata of all logs has reached the C-type log, a complete shadow tree exists. The C-type log gathers the final checkpoint information of all logs and writes it to the checkpoint area; when this write completes successfully, all the operations described above take effect atomically (7). The C-type log then explicitly informs all logs that structural recovery has completed successfully, and each log begins functional recovery.

Throughout structural recovery, every erase and write that changes the state of the flash memory is performed on uninterpreted blocks, except for the final page write performed by the C-type log; all other work happens in volatile memory. Therefore, even if a power failure occurs again at any moment during structural recovery, the logical state of the storage device is unchanged and the system returns to the state it was in when power-failure recovery started. The complex residual-effect elimination performed in each log is gathered and atomically committed by the single checkpoint write of the C-type log.
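The storage-level flow of steps (1) through (7) can be summarized in code. The sketch below is a hedged outline in C; every function name is an illustrative stub standing in for an operation described above, not the HIL framework's API:

```c
#include <stddef.h>

typedef struct { int dummy; } checkpoint_t;
typedef struct log log_t;   /* per-log state, contents elided */

/* Illustrative stubs for the framework operations in steps (1)-(7). */
checkpoint_t scan_checkpoint_area_for_latest(void);
void deliver_checkpoint(log_t *l, const checkpoint_t *cp);
int  scan_from_write_pointer(log_t *l);              /* -> power-failure block */
void copy_valid_pages_to_uninterpreted_block(log_t *l, int bad_block);
void swap_block_numbers_in_volatile_metadata(log_t *l, int bad_block);
void send_updated_metadata_to_c_log(log_t *l);
void write_final_checkpoint_page(void);              /* the one real flash write */
void announce_structural_recovery_done(void);

void structural_recovery(log_t *logs[], size_t nlogs)
{
    checkpoint_t cp = scan_checkpoint_area_for_latest();        /* (1) */
    for (size_t i = 0; i < nlogs; i++) {
        deliver_checkpoint(logs[i], &cp);                       /* (2) */
        int bad = scan_from_write_pointer(logs[i]);             /* (3) */
        copy_valid_pages_to_uninterpreted_block(logs[i], bad);  /* (4) */
        swap_block_numbers_in_volatile_metadata(logs[i], bad);  /* (5) */
        send_updated_metadata_to_c_log(logs[i]);                /* (6) */
    }
    /* A single checkpoint write commits every log's swap atomically. (7) */
    write_final_checkpoint_page();
    announce_structural_recovery_done();
}
```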

II. Functional recovery

When structural recovery ends, each log begins functional recovery. From the perspective of a log, functional recovery replays the install_mapping processing that would have occurred had no power failure happened, restoring the logical address space of each log to its latest state before the power failure.

A power failure destroys all the information the FTL keeps in volatile memory. To make replay possible, during normal operation 1) the information needed for replay must be kept non-volatile every time data is written to flash memory, and 2) the logs must order their flash memory writes so that unrecoverable situations cannot arise. In addition, during replay, 3) the difference between the state after residual effects have been removed and the state before the power failure must be taken into account, and 4) a method for reducing the replay time is needed.

The object of replay is the mapping information for the data written to flash memory. Since a flash memory page consists of a data area and a spare area for recording meta information, the log writes the mapping information to be used during replay into the same flash memory page as the data, as shown in FIG. 13. Because the data and the mapping information, which serves as the recovery information, are stored in the same flash page by a single flash write operation, the log need not reason about the storage order of data versus recovery information, nor manage a separate space for recovery information. The HIL framework thus makes recovery information easy to maintain during normal operation, and because the data and its mapping information are always on the same page, the replay process is simple to handle.
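For illustration, a page layout binding the data and its mapping record could look like the following sketch; the 4 KiB data area, 64-byte spare area, and field names are assumptions, not the patent's specification:

```c
#include <stdint.h>

#define DATA_BYTES  4096
#define SPARE_BYTES 64

typedef struct {
    uint64_t logical_addr;   /* mapping record replayed at recovery */
    uint64_t timestamp;
    uint32_t crc;            /* detects torn or disturbed pages     */
} spare_area_t;

typedef struct {
    uint8_t      data[DATA_BYTES];
    spare_area_t spare;
} flash_page_t;

/* Because both fields are programmed by one page write, recovery never
 * observes data without its mapping record, or vice versa. */
```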

In principle, each log can execute independently. To survive power failures, however, a minimum of synchronization between logs is required during normal operation. Fundamentally, data must be written to flash memory before the mapping information that points to it. If the logs were completely independent and the upper log wrote mapping information to flash memory before the lower log had written the corresponding data, as in the left column of FIG. 14, the mapping information would refer to data that was never actually written, compromising the data integrity of the storage device. Because each log follows the rule that mapping information is transferred to the parent log only after the child log has written the data, the hierarchy obeys the flash memory write-order constraint between lower and upper logs.

In MLC flash memory with a sibling-page structure, the persistence of the data in a page is not guaranteed until all pages in the sibling relationship have been fully programmed. The second column of FIG. 14 shows a situation in which the data of a first page that had already been written successfully is lost because a power failure occurred while the second page in the sibling relationship was being written. If the mapping log writes mapping information without accounting for this, the mapping information may end up referring to lost data.

Finally, the third figure in FIG. 14 shows the mapping/data consistency problem that arises during power-failure recovery itself. To remove residual effects during recovery, the power-failure block is replaced by the uninterpreted block. If the mapping information for valid data written in the power-failure block had already been persisted by the upper mapping log before the failure, that mapping information still points to the old power-failure block rather than to the formerly uninterpreted block that now holds the valid data, so after recovery it refers to the wrong location.

To solve these problems, the HIL framework provides a synchronization interface consisting of NV_grant, a command with which a lower log permits its parent to persist mapping information, and Installed_NV, a response indicating that the parent log has persistently written that mapping information. FIG. 15 outlines the operation of NV_grant and Installed_NV. The NV_grant command for the data in a particular page may be sent to the parent log only after the entire block containing that page has been written to flash memory and the write to the next block on the block write list has begun. Observing these conditions guarantees (1) that data in the sibling-page structure of an MLC chip cannot be lost by a power failure, and (2) that the page cannot belong to the power-failure block. The parent log keeps mapping information received before the NV_grant command only in its cache, and may write the mapping information to flash memory once it receives NV_grant from the lower log. When the mapping information is not merely written but persistent, the parent responds to the lower data log with Installed_NV. On receiving Installed_NV, the lower log may advance the replay pointer of its block write list past the pages whose mapping information is now persistent, as shown in (7) of FIG. 15. For pages before the replay pointer, both the data and its mapping information are persistent, so their mapping updates no longer need to be replayed during power-failure recovery. Because replay starts from the replay pointer, the replay time is reduced compared with replaying from the beginning of the block write list. Blocks on a log's block write list that the replay pointer has passed are no longer needed for recovery and can be truncated from the list.
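A minimal sketch of the handshake, with assumed type and function names, may clarify the two NV_grant conditions and the cache-then-persist ordering:

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t lpn, ppn;    /* one cached mapping entry                   */
    bool     nv_granted;  /* parent may persist it only when this holds */
} map_entry_t;

/* Lower log side: NV_grant for a page may be issued only after the whole
 * block containing the page has been written AND the write to the next
 * block on the block write list has started, ruling out sibling-page
 * loss and membership in a power-failure block. */
bool may_send_nv_grant(bool block_fully_written, bool next_block_started)
{
    return block_fully_written && next_block_started;
}

/* Parent log side: on NV_grant the cached entry becomes writable; once
 * the mapping is persistent in flash, Installed_NV is answered so the
 * lower log can advance its replay pointer and truncate its list. */
void on_nv_grant(map_entry_t *e) { e->nv_granted = true; }
```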

A log begins functional recovery through the replay process once the structural recovery operation has completed. After residual effects have been removed by structural recovery, the state of the log is as follows: 1) all data structures held in volatile memory have been lost, and 2) the mapping information for data written after the replay pointer has not yet been persisted, but conceptually the state is the same as before the power failure. A separate replay mechanism is therefore kept to a minimum, and most of the install_mapping machinery of normal operation is reused. During replay, the log scans the pages along the block write list starting from the replay pointer and retransmits their mapping information to the parent log. In normal operation the data is written and then the mapping information is transmitted; during replay the data is already written, so only the mapping information stored alongside the data needs to be transmitted, and it can be obtained from the spare area while scanning. The mapping information being retransmitted may already have been written to flash memory by the upper mapping log before the power failure; since mapping updates are idempotent, transmitting the mapping information in the same order yields the same result even if the same mapping information is written to flash memory several times.

FIG. 16 shows functional recovery in a storage system with a three-level mapping hierarchy. In the hierarchical log configuration of the HIL framework, functional recovery completes from the top-level log downward, each level becoming ready to receive mapping-update requests from its sub-logs. When the mapping information of the top-level mapping log fits in one physical page, its write pointer and replay pointer coincide, and the functional recovery of the top-level mapping log's logical address space completes simply by reading the page indicated by the latest write pointer. The recovered logical address space of the top-level mapping log M1 describes the mapping information of the lower mapping log M0. The M0 log then replays (retransmits to its parent) the mapping information from its replay pointer to the last written page, completing the functional recovery of its own logical address space. This top-down, cascading functional recovery continues until it reaches the last data log. When functional recovery completes in the data log, the logical address space for host data has been restored to its latest state before the power interruption.
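The cascade can be pictured as a loop from the top-level log down to the data log. The following C sketch uses illustrative stubs and relies on the idempotence of replaying a page, as described above; none of the names are the framework's own:

```c
typedef struct log log_t;
struct log {
    log_t *parent;              /* NULL for the top-level mapping log */
    int replay_ptr, write_ptr;  /* page positions on the block write list */
};

/* Re-sends the mapping record in the page's spare area to l->parent.
 * Updates are idempotent, so repeats after a second crash are safe. */
void replay_page(log_t *l, int page);

void functional_recovery(log_t *levels[], int n)
{
    /* levels[0] is the top-level mapping log, levels[n-1] the data log;
     * each level finishes before its sub-log starts replaying. */
    for (int i = 0; i < n; i++)
        for (int p = levels[i]->replay_ptr; p < levels[i]->write_ptr; p++)
            replay_page(levels[i], p);
}
```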

Verification of the correctness of the flash translation layer design framework according to an embodiment

Since the HIL framework aims at generality, supporting the design of arbitrary FTLs, implementing one specific FTL instance and testing its correctness does not verify the HIL framework itself. Formal verification of correctness was therefore carried out on an abstracted system model of the HIL framework. To this end, the system was designed for provability from the design stage.

FIG. 17 shows the steps of the formal verification process for the HIL framework. First, the assumptions about the system were established; next, correctness criteria were defined, making explicit what it means for the storage system to operate correctly. A formal specification of the HIL framework was then created. This involved defining an abstract model of the HIL framework's behavior, giving clear definitions of its concepts, and formally stating its core rules. Finally, the theorem on the correctness of the HIL framework, based on the system assumptions, the correctness criteria, and the formal specification, was proved mathematically through deductive reasoning.

Table 1 shows the correctness criterion of the storage system, which is the starting point of the correctness proof.

[Correctness Criterion]
"If, for a read request to any logical page p in the address space, the storage system always returns the most recent data version among the past write requests to that logical page p, then the storage system can be said to operate correctly."

In the above correctness criterion, the notion of the 'most recent' data and the notion of 'always' on the time axis need to be defined more precisely. Viewed along its time axis, the storage system can be seen as a pair of normal operation and power-failure recovery repeated a finite number of times between initialization and end of life, as shown in FIG.

The recency of data means different things during normal operation and during power-failure recovery. While the storage system operates normally, the most recent data is the data of the latest write acknowledged as successful (in volatile memory). When the storage system recovers from a power failure, the most recent data is the last data made persistent before the power failure. Refining the correctness criterion accordingly: during normal operation the system must return, for a read request, the data of the most recent acknowledged write, that is, guarantee data consistency; during power-failure recovery it must recover the data last made persistent before the power interruption, that is, guarantee recoverability. Thus, if the HIL framework is proved to satisfy these two properties (data consistency and recoverability), this is the same as proving that the correctness criterion is satisfied 'always', over a time axis in which the pair of normal operation and power-failure recovery repeats n times.
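As a hedged formalization (the notation here is introduced for illustration and is not the patent's own), the two refined criteria can be written as:

```latex
% ack(p): latest write to logical page p acknowledged to the host;
% dur(p): last write to p made persistent before the power failure;
% val(w): the data version carried by write w.
\text{Consistency (normal operation):}\quad
  \forall p,\ \mathrm{read}(p) = \mathrm{val}\bigl(\mathrm{ack}(p)\bigr)
\qquad
\text{Recoverability (after recovery):}\quad
  \forall p,\ \mathrm{read}(p) = \mathrm{val}\bigl(\mathrm{dur}(p)\bigr)
```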

The HIL framework introduces rules that log operations must follow to guarantee data consistency and recoverability. First, consider the rule introduced to guarantee data consistency and the reasoning used in its proof. In FIG. 19, when Data(X) in Cache_i is recorded in Flash_log_i, new mapping information Link(X) is generated for it. Link(X) is then carried to Cache_{i+1} of the parent log i+1 through the inter-log interface. According to the installed_V rule of the HIL framework described earlier, Data(X) may be removed from Cache_i only after Link(X) is confirmed to be successfully installed in Cache_{i+1}. Observing this rule guarantees that the link information for Data(X) is never lost, at any arbitrary moment during normal operation, with respect to a read request for Data(X). The principle is the same as in rock climbing, where to guarantee safety one releases a hold only after confirming that the hand stretched upward has caught the next hold.
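The installed_V gate can be expressed as a one-line predicate; the structure below is an assumption for illustration only:

```c
#include <stdbool.h>

typedef struct {
    bool link_installed_in_parent;  /* set when Cache_{i+1} confirms
                                       installation of Link(X)        */
} cache_entry_t;

/* The "rock climbing" invariant: Data(X) may leave Cache_i only after
 * its Link(X) is confirmed installed one level up. */
bool may_evict(const cache_entry_t *e)
{
    return e->link_installed_in_parent;
}
```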

Next, the rule the HIL framework introduces to guarantee recoverability is described with reference to FIG. In the HIL framework, to improve recovery performance, each flash log maintains a small replay set (Redo_set) anchored at the replay pointer instead of replaying all of the log's mappings. The Installed_NV rule of the HIL framework described in FIG. 15 then states the condition under which Data(X) recorded in Flash_log_i may be removed from the replay set Redo_set_i: Link(X) must be confirmed to be persistent in Flash_log_{i+1} of the parent log. Under this rule, the link information for Data(X) is not lost in any power-failure situation, because until Link(X) is persistent in the parent log, it is regenerated by the replay process of Log_i.
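A sketch of the corresponding redo-set maintenance, with assumed names, shows how entries leave Redo_set_i only once the parent link is durable:

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    long page;                     /* page holding Data(X)              */
    bool link_durable_in_parent;   /* set on Installed_NV from log i+1  */
} redo_entry_t;

/* Compact the redo set in place; returns the new entry count. Entries
 * whose Link(X) is persistent upstream no longer need replaying. */
size_t trim_redo_set(redo_entry_t *set, size_t n)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        if (!set[i].link_durable_in_parent)  /* must still be replayed */
            set[kept++] = set[i];
    return kept;
}
```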

According to embodiments, any of a variety of FTLs can be configured through a combination of logs. HIL is a general framework for FTL design that can encompass a variety of mapping units, write buffering and caching policies, and garbage collection policies. In addition, the HIL framework according to the embodiments guarantees successful recovery in any power-failure situation, and its correctness has been formally verified. HIL thus addresses the reliability problems of recent flash memory comprehensively. The embodiments also open the possibility of performance optimization through thread parallelism, which couples effectively to the parallelism of flash memory requests.

Design of a flash translation layer according to an embodiment

According to the FTL design technique of one embodiment, a plurality of building blocks for constructing the flash translation layer can be provided to the user. Here, the plurality of building blocks includes a first processing block (for example, a D-type log) for processing data, at least one second processing block (for example, an M-type log) for hierarchically processing the mapping information of the data, and a third processing block (for example, a C-type log) for processing checkpoint information for the first processing block and the at least one second processing block. For example, a user interface for configuring the flash translation layer from the plurality of building blocks may be provided, with which the user can design a desired FTL.
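As an illustrative sketch (the enum and struct are assumptions, not the framework's API), a composition resembling the three-level hierarchy of FIG. 16 could be declared as:

```c
/* Building-block kinds named in the text: D-type (data), M-type
 * (mapping), C-type (checkpoint). */
typedef enum { LOG_D, LOG_M, LOG_C } log_kind_t;

typedef struct log_block {
    log_kind_t        kind;
    struct log_block *parent;  /* where mapping info flows upward */
} log_block_t;

/* One possible instance: D -> M0 -> M1 -> C. */
static log_block_t C  = { LOG_C, 0 };
static log_block_t M1 = { LOG_M, &C };
static log_block_t M0 = { LOG_M, &M1 };
static log_block_t D  = { LOG_D, &M0 };
```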

In addition, a user interface may be provided through which the user enters settings related to the design of the FTL. The settings related to the design of the FTL include a setting for the number of threads implementing the FTL, a setting for the number of cores driving the FTL, a setting for the number of threads processed per core, and a setting for the mapping between a plurality of cores and a plurality of threads. For example, an individual log may be configured to be implemented as a function, as a single thread, or as multiple threads. If each log is implemented as a function, the FTL can be implemented as a single thread; if each log is implemented as a single thread or as multiple threads, the FTL is implemented as a multi-threaded program.

If the FTL is implemented as multiple threads and is configured to run on multiple cores, a thread-to-core mapping can be established. For example, if the number of cores is two and the FTL is implemented as ten threads, the mapping between the two cores and the ten threads can be set.
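For example, a thread-to-core mapping setting for this two-core, ten-thread case might be represented as in the following sketch; the configuration structure is an assumption, only the counts come from the example above:

```c
#define NTHREADS 10

typedef struct {
    int ncores;
    int core_of_thread[NTHREADS];  /* thread i runs on this core */
} ftl_deploy_cfg_t;

static const ftl_deploy_cfg_t cfg = {
    .ncores = 2,
    /* e.g., even-numbered threads on core 0, odd-numbered on core 1 */
    .core_of_thread = { 0, 1, 0, 1, 0, 1, 0, 1, 0, 1 },
};
```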

An FTL can then be generated based on the plurality of building blocks used in its design and the settings associated with that design. Here, the HIL framework provides the synchronization interface needed to satisfy data consistency and data recoverability even when several threads execute in parallel. For example, individual logs check for cache hits first to satisfy data consistency, and upper and lower logs exchange NV_grant and Installed_NV to satisfy data recoverability.

Thus, a single log can be implemented as multiple threads running in parallel. Alternatively, a log can be implemented as a function, so that several logs combine into a single thread. An FTL created with the HIL framework can therefore be used without any code changes from single-core to multi-core environments. Furthermore, in each case, a single thread or multiple threads may be driven on each core. In this way, the HIL framework can be deployed in many forms in multi-core/multi-threaded environments.

Also, according to the FTL design method of an embodiment, a generated FTL can be adaptively regenerated according to changed settings. For example, a user interface may be provided with which the user changes settings related to the design of the FTL. In this case, the FTL is adaptively regenerated from the plurality of building blocks used in the previously generated FTL and the new settings. In other words, even if the hardware configuration changes, the FTL can be regenerated without code changes.

The embodiments described above may be implemented in hardware components, software components, and/or a combination of the two. For example, the devices, methods, and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a field programmable array, a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications on the operating system, and may access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, the processing device is sometimes described as a single element, but those skilled in the art will recognize that it may include a plurality of processing elements and/or multiple types of processing elements. For example, the processing device may comprise a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or command it independently or collectively. The software and/or data may be embodied in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or in a transmitted signal wave, permanently or temporarily, so as to be interpreted by the processing device or to provide instructions or data to it. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.

The method according to an embodiment may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and constructed for the embodiments, or may be known and available to those skilled in computer software. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media; and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine code such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

While the embodiments have been particularly shown and described with reference to exemplary embodiments thereof, it is to be understood that the invention is not limited to the disclosed exemplary embodiments, and various technical modifications and variations can be made from the above description by those of ordinary skill in the art. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described systems, structures, devices, and circuits are combined in a form different from that described, or replaced or substituted by other components or equivalents. Therefore, other implementations, other embodiments, and equivalents to the claims are also within the scope of the following claims.

Claims (42)

1. A flash translation layer comprising:
a first processing module for processing data;
a second processing module for processing mapping information of the data; and
a third processing module for processing checkpoint information including information about an uninterpreted block of the first processing module and information about an uninterpreted block of the second processing module,
wherein the first processing module recovers from errors using the uninterpreted block of the first processing module, and the second processing module recovers from errors using the uninterpreted block of the second processing module, and
wherein each processing module included in the flash translation layer comprises: an interface unit connected to at least one of a host, another processing module, and a flash memory; a cache unit including a volatile memory; and a processing unit for processing data or information according to the type of the processing module, using the interface unit and the cache unit.
2. The flash translation layer according to claim 1, wherein, for error recovery, the first processing module detects an error page, copies the valid pages in the error block containing the error page to the uninterpreted block of the first processing module, and logically swaps the error block and the uninterpreted block of the first processing module.
3. The flash translation layer according to claim 2, wherein the checkpoint information of the first processing module further comprises a block write list of the first processing module, and the first processing module detects the error page using the block write list.
4. The flash translation layer according to claim 2, wherein the checkpoint information of the first processing module further comprises a block write list of the first processing module and a write pointer of the first processing module, and the first processing module detects the error page by checking pages for errors along the block write list of the first processing module, starting from the page indicated by the write pointer.
5. The flash translation layer according to claim 2, wherein the first processing module transmits to the third processing module checkpoint information updated by the logical swap of the error block and the uninterpreted block of the first processing module.
6. The flash translation layer according to claim 1, wherein, after an error is recovered using the uninterpreted block of the first processing module, the first processing module obtains the mapping information of the data from the page in which the data is stored and transmits the mapping information of the data to the second processing module.
7. The flash translation layer according to claim 6, wherein the checkpoint information of the first processing module further comprises a block write list of the first processing module, a write pointer of the first processing module, and a replay pointer of the first processing module, and the first processing module obtains mapping information from the page indicated by the replay pointer of the first processing module up to the last page identified as normally written after structural recovery, along the block write list of the first processing module, and transmits the mapping information to the second processing module.
8. The flash translation layer according to claim 1, wherein the first processing module stores data corresponding to a write command in a cache when the write command is received, and determines whether data corresponding to a read command exists in the cache when the read command is received.
9. The flash translation layer according to claim 1, wherein the first processing module stores the data and the mapping information of the data in the same page of the flash memory.
10. The flash translation layer according to claim 1, wherein the first processing module stores the data in a flash memory and transmits the mapping information of the data to the second processing module.
11. The flash translation layer according to claim 1, wherein the first processing module stores the data in a flash memory on a page-by-page basis and advances a write pointer along a block write list whenever a page of data is stored.
12. The flash translation layer according to claim 11, wherein the first processing module transmits a persistence request signal to the second processing module when the write pointer passes a block boundary, the second processing module persists the mapping information of the block corresponding to the persistence request signal in response to the persistence request signal, and the first processing module advances a replay pointer along the block write list upon receiving a persistence completion signal from the second processing module.
13. The flash translation layer according to claim 1, wherein, for error recovery, the second processing module detects an error page, copies the valid pages in the error block containing the error page to the uninterpreted block of the second processing module, and logically swaps the error block and the uninterpreted block of the second processing module.
14. The flash translation layer according to claim 13, wherein the checkpoint information of the second processing module further comprises a block write list of the second processing module, and the second processing module detects the error page using the block write list.
15. The flash translation layer according to claim 13, wherein the checkpoint information of the second processing module further comprises a block write list of the second processing module and a write pointer of the second processing module, and the second processing module detects the error page by checking pages for errors along the block write list of the second processing module, starting from the page indicated by the write pointer.
16. The flash translation layer according to claim 13, wherein the second processing module transmits to the third processing module checkpoint information updated by the logical swap of the error block and the uninterpreted block of the second processing module.
17. The flash translation layer according to claim 1, further comprising an upper processing module for processing upper mapping information of the mapping information, wherein, after an error is recovered using the uninterpreted block of the second processing module, the second processing module obtains the upper mapping information of the mapping information from the page in which the mapping information is stored and transmits the upper mapping information of the mapping information to the upper processing module.
18. The flash translation layer according to claim 17, wherein the checkpoint information of the second processing module further comprises a block write list of the second processing module, a write pointer of the second processing module, and a replay pointer of the second processing module, and the second processing module obtains upper mapping information from the page indicated by the replay pointer of the second processing module up to the last page identified as normally written after structural recovery, along the block write list of the second processing module, and transmits the upper mapping information to the upper processing module.
19. The flash translation layer according to claim 1, wherein the second processing module stores mapping information corresponding to a mapping command in a cache when the mapping command is received, and determines whether mapping information corresponding to a read command exists in the cache when the read command is received.
20. The flash translation layer according to claim 1, wherein the second processing module stores the mapping information and the upper mapping information of the mapping information in the same page of the flash memory.
21. The flash translation layer according to claim 1, further comprising an upper processing module for processing upper mapping information of the mapping information, wherein the second processing module stores the mapping information in a flash memory and transmits the upper mapping information of the mapping information to the upper processing module.
22. The flash translation layer according to claim 1, wherein the second processing module stores the mapping information in a flash memory on a page-by-page basis and advances a write pointer along a block write list whenever a page of mapping information is stored.
23. The flash translation layer according to claim 22, further comprising an upper processing module for processing upper mapping information of the mapping information, wherein the second processing module transmits a persistence request signal to the upper processing module when the write pointer passes a block boundary, the upper processing module persists the upper mapping information of the block corresponding to the persistence request signal in response to the persistence request signal, and the second processing module advances a replay pointer along the block write list upon receiving a persistence completion signal from the upper processing module.
24. The flash translation layer according to claim 1, wherein the error includes a power failure occurring asynchronously.
25. (Deleted)
26. The flash translation layer according to claim 1, further comprising a fourth processing module for processing block state information, wherein the checkpoint information of the third processing module further includes information about an uninterpreted block of the fourth processing module, and the fourth processing module recovers from errors using the uninterpreted block of the fourth processing module.
27. The flash translation layer according to claim 1, further comprising a fifth processing module operating as a non-volatile buffer for another processing module, wherein the checkpoint information of the third processing module further includes information about an uninterpreted block of the fifth processing module, and the fifth processing module recovers from errors using the uninterpreted block of the fifth processing module.
28. A computer-readable recording medium on which a program is recorded, the program comprising: a D-log for processing data; and a plurality of M-logs for hierarchically processing the mapping information of the data, wherein each of the plurality of M-logs stores information received from its lower log in the flash memory resource allocated to it, and transfers the mapping information of the flash memory resource to its upper log when the size of the information received from the lower log is larger than a predetermined size.
29. A flash translation layer comprising: a first processing module for processing data; and a plurality of second processing modules for hierarchically processing the mapping information of the data, wherein each of the plurality of second processing modules stores information received from its lower processing module in the flash memory resource allocated to it, and transfers the mapping information of the flash memory resource to its upper processing module when the size of the information received from the lower processing module is larger than a predetermined size.
30. The computer-readable recording medium according to claim 28, wherein, among the plurality of M-logs, a log for which the size of the information received from its lower log is equal to or less than the predetermined size is determined as the highest M-log.
31. The computer-readable recording medium according to claim 30, wherein the highest M-log stores the information received from its lower log in the flash memory resource allocated to it, and transfers the mapping information of the flash memory resource to a C-log that processes checkpoint information.
32. The computer-readable recording medium according to claim 28, wherein the characteristics of each of the plurality of M-logs are set for each M-log, and the characteristics of each of the plurality of M-logs comprise at least one of a mapping unit of each of the plurality of M-logs and a cache management policy of each of the plurality of M-logs.
33. The computer-readable recording medium according to claim 28, wherein the program further comprises: an L-log for processing block state information; and a plurality of LM-logs for hierarchically processing the mapping information of the block state information.
34. The computer-readable recording medium according to claim 33, wherein each of the plurality of LM-logs stores information received from its lower log in the flash memory resource allocated to it, and transfers the mapping information of the flash memory resource to its upper log when the size of the information received from the lower log is larger than a predetermined size.
35. The computer-readable recording medium according to claim 33, wherein, among the plurality of LM-logs, a log for which the size of the information received from its lower log is equal to or less than the predetermined size is determined as the highest LM-log.
36. The computer-readable recording medium according to claim 35, wherein the highest LM-log stores the information received from its lower log in the flash memory resource allocated to it, and transfers the mapping information of the flash memory resource to a C-log that processes checkpoint information.
37. The computer-readable recording medium according to claim 33, wherein the characteristics of each of the plurality of LM-logs are set for each LM-log, and the characteristics of each of the plurality of LM-logs comprise at least one of a mapping unit of each of the plurality of LM-logs and a cache management policy of each of the plurality of LM-logs.
38. A method comprising providing a plurality of building blocks for configuring a flash translation layer, wherein the plurality of building blocks comprises: a first processing block for processing data; at least one second processing block for hierarchically processing the mapping information of the data; and a third processing block for processing checkpoint information for the first processing block and the at least one second processing block, and wherein the plurality of building blocks provide a synchronization interface that satisfies data consistency and data recoverability.
39. The method according to claim 38, further comprising: receiving a setting associated with the design of the flash translation layer; and generating the flash translation layer based on the plurality of building blocks and the setting.
40. The method according to claim 39, wherein the setting comprises at least one of: a setting associated with the number of threads implementing the flash translation layer; a setting associated with the number of cores driving the flash translation layer; a setting associated with the number of threads processed per core; and a setting associated with the mapping between a plurality of cores and a plurality of threads.
41. The method according to claim 39, further comprising: receiving a second setting associated with the design of the flash translation layer; and adaptively regenerating the flash translation layer based on the plurality of building blocks and the second setting.
42. A computer-readable recording medium having recorded thereon a program for executing the method of any one of claims 38 to 41.
KR1020140013590A 2013-02-07 2014-02-06 Flash transition layor design framework for provably correct crash recovery KR101526110B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/KR2014/001028 WO2014123372A1 (en) 2013-02-07 2014-02-06 Flash translation layer design framework for provable and accurate error recovery

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020130013659 2013-02-07
KR20130013659 2013-02-07

Publications (2)

Publication Number Publication Date
KR20140100907A KR20140100907A (en) 2014-08-18
KR101526110B1 (en) 2015-06-10

Family

ID=51746549

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020140013590A KR101526110B1 (en) 2013-02-07 2014-02-06 Flash transition layor design framework for provably correct crash recovery

Country Status (1)

Country Link
KR (1) KR101526110B1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180002259A (en) 2016-06-29 2018-01-08 주식회사 파두 Structure and design method of flash translation layer
US10698816B2 (en) 2018-06-29 2020-06-30 Micron Technology, Inc. Secure logical-to-physical caching
US10990304B2 (en) * 2019-06-27 2021-04-27 Western Digital Technologies, Inc. Two-dimensional scalable versatile storage format for data storage devices


Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012008731A2 (en) * 2010-07-12 2012-01-19 (주)이더블유비엠코리아 Device and method for managing flash memory using block unit mapping

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party

Title

Ji-Hyeok Yun, "X-BMS: A bad block management scheme with verifiable correctness for flash memory-based storage systems" (in Korean), Ph.D. dissertation, Dept. of Computer Engineering, Seoul National University, Feb. 2011. *

Also Published As

Publication number Publication date
KR20140100907A (en) 2014-08-18


Legal Events

A201: Request for examination
E902: Notification of reason for refusal
E701: Decision to grant or registration of patent right
GRNT: Written decision to grant
FPAY: Annual fee payment (payment date: 2018-04-25; year of fee payment: 4)
FPAY: Annual fee payment (payment date: 2019-04-29; year of fee payment: 5)