CN113220241A - Cross-layer design-based hybrid SSD performance and service life optimization method - Google Patents

Info

Publication number
CN113220241A
Authority
CN
China
Prior art keywords
cmt
ftl
request
block
bml
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110582732.6A
Other languages
Chinese (zh)
Inventor
顾能华
吕梅蕾
陈勇
叶文通
朱秋琴
徐拥华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Quzhou University
Original Assignee
Quzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Quzhou University
Priority to CN202110582732.6A
Publication of CN113220241A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023 Free address space management
    • G06F12/0238 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • G06F12/0246 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory in block erasable memory, e.g. flash memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1009 Address translation using page tables, e.g. page table structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0616 Improving the reliability of storage systems in relation to life time, e.g. increasing Mean Time Between Failures [MTBF]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656 Data buffering arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0679 Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/068 Hybrid storage device

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for optimizing the performance and service life of a hybrid SSD based on cross-layer design, comprising the following steps: the load-characteristic identification that is traditionally scattered across the BML and the FTL is centralized into a new WAL layer, which is then designed and optimized; a page/block adaptive BML is formed by adding a Normal region between the Hot region and the Cold region; and the FTL is designed and optimized by splitting the CMT of DFTL into an H-CMT, a CMT and an S-CMT, by optimizing the distribution of FTL-layer data between SLC and MLC and the wear leveling between them, and by adding a programming mode selection module to the FTL layer so that writes can exploit deep-level flash memory characteristics. By adopting a hybrid SLC/MLC chip structure in the NAND flash memory array layer, the invention addresses the firmware design problem of SSDs based on such a hybrid structure and achieves a trade-off among their performance, service life and cost: the read/write performance of the cross-layer hybrid SSD is improved, its service life is extended, and its cost is reduced.

Description

Cross-layer design-based hybrid SSD performance and service life optimization method
Technical Field
The invention belongs to the technical field of SSD storage, and particularly relates to a hybrid SSD performance and service life optimization method based on cross-layer design.
Background
Over the last half century, continuous progress in computer architecture and chip fabrication has steadily widened the gap between CPU performance and the input/output (IO) performance of computer systems. The bottleneck of IO performance is the Hard Disk Drive (HDD). Although HDD capacity has increased greatly over the years, access speed has improved only to a limited extent because of the mechanical rotating structure. For example, over the past 20 years the CPU frequency has increased by about 600 times, while hard disk speed has increased by only 20 times. Compared with a magnetic disk, flash memory is a high-speed, low-power, shock-resistant, small and lightweight chip-level storage medium, and is regarded as a key component for improving the IO performance of computer systems. Over the last 10 years, through the joint efforts of industry and academia, flash memory technology has advanced greatly, and the storage field is undergoing a major shift in which Solid State Drives (SSDs) based on NAND flash memory replace HDDs.
An SSD based on NAND flash memory generally consists of a host interface layer, a Buffer Management Layer (BML), a Flash Translation Layer (FTL), and a NAND Flash Array Layer (FAL). The host interface layer is responsible for communicating with the host. The BML manages the data buffer of the SSD and is a key component for improving performance and prolonging service life. The FTL makes the SSD appear as a conventional hard disk that supports only read and write operations, so that it works with current file systems; it generally consists of three modules, namely address mapping, garbage collection and wear leveling, of which address mapping is the core. The FAL performs the actual physical data storage and consists of multiple NAND flash memory chips. The BML and the FTL are software-oriented and constitute the key firmware in SSD design.
Currently, SSD firmware design in the prior art still suffers from the following three drawbacks: (1) the FTL and the BML are designed separately, without cooperative design; (2) firmware design for hybrid SSDs is lacking; (3) SSD firmware design makes little use of deep-level flash memory characteristics.
Disclosure of Invention
The invention aims to provide a cross-layer design-based hybrid SSD performance and service life optimization method which, by optimizing the WAL layer, the BML layer and the FTL layer, solves the problems that existing methods lack cooperative design of the FTL and the BML, lack hybrid SSD firmware design, and lack SSD firmware design that exploits deep-level flash memory characteristics.
In order to solve the technical problems, the invention is realized by the following technical scheme:
The invention relates to a hybrid SSD performance and service life optimization method based on cross-layer design, which comprises the following steps:
S1, identifying the load characteristics that are traditionally scattered across the BML and the FTL and centralizing them into a new WAL layer, and designing and optimizing the WAL layer;
S2, designing and optimizing the BML;
S3, designing and optimizing the FTL;
the optimization method of step S1 is as follows:
S11, for any request, first judging whether it hits in the BML, and if so, updating its access mode according to its access history;
S12, if the request does not hit in the BML, judging whether it hits in the FTL; if it hits in the FTL, using the previously recorded access mode to predict the current mode; otherwise, identifying a rough access mode from the request size or from the relation between the logical page number of the current request and the logical page numbers of the requests in the BML;
S13, when the BML evicts a data item, sending its access mode to the FTL as a parameter;
S14, the FTL adding a 2-bit parameter to each logical mapping item to record the access mode;
the BML in step S2 is a page/block adaptive BML divided into a Hot region, a Normal region and a Cold region; data pages are migrated among and evicted from the Hot region, the Normal region and the Cold region as follows:
S21, data pages can be migrated only between two adjacent regions, that is, only two types of data migration exist: between the Hot region and the Normal region, and between the Normal region and the Cold region;
S21, when a request does not hit in the BML, processing it according to the access mode identified in step S1: sequential accesses are loaded into the Cold region and other accesses are loaded into the Normal region;
S22, when a request hits in the BML, judging which of the Hot region, the Normal region and the Cold region it hits, and then carrying out the corresponding data processing;
S23, when the space of the three regions is insufficient, performing page/block eviction;
S24, when selecting a block to evict from the Cold region, considering not only the least recently accessed principle of the Cold region but also the garbage collection efficiency of each block, in order to reduce the garbage collection cost of the FTL;
the optimization method of step S3 is as follows:
S31, classifying the data by splitting the CMT of DFTL into an H-CMT, a CMT and an S-CMT, wherein the H-CMT caches frequently accessed mapping items, the S-CMT caches sequentially accessed mapping items, and the CMT caches ordinary randomly accessed mapping items; the H-CMT is managed at fine granularity by single mapping item, and the CMT and the S-CMT are clustered by translation page;
S32, optimizing the distribution of FTL-layer data between SLC and MLC and the wear leveling between them;
S33, adding a programming mode selection module to the FTL layer of step S32, so that writes are aware of and exploit deep-level flash memory characteristics.
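Steps S11 to S14 can be illustrated with a minimal Python sketch. It is not the patented implementation: the four mode names, the 2-bit encoding, the thresholds SEQ_SIZE_PAGES and HOT_HITS, and the dictionary layout of the BML and FTL are all assumptions made for illustration, since the text fixes only the decision order and the 2-bit width.

```python
from enum import IntEnum


class AccessMode(IntEnum):
    """Hypothetical 2-bit encoding; the text only fixes the width (2 bits)."""
    UNKNOWN = 0
    RANDOM = 1
    SEQUENTIAL = 2
    HOT = 3


SEQ_SIZE_PAGES = 8  # assumed: requests at least this large count as sequential
HOT_HITS = 2        # assumed: BML hits needed before a page is treated as hot


def identify_mode(lpn, size_pages, bml, ftl_modes):
    """Return the access mode of a request for logical page `lpn`.

    bml:       dict lpn -> {"mode": AccessMode, "hits": int}  (buffered pages)
    ftl_modes: dict lpn -> AccessMode  (mode stored with each mapping item)
    """
    entry = bml.get(lpn)
    if entry is not None:
        # S11: hit in the BML, update the mode from the access history.
        entry["hits"] += 1
        if entry["hits"] >= HOT_HITS:
            entry["mode"] = AccessMode.HOT
        return entry["mode"]
    if lpn in ftl_modes:
        # S12: hit in the FTL, reuse the previously recorded mode.
        return ftl_modes[lpn]
    # S12 (otherwise): rough guess from the request size or from whether
    # the preceding logical page is already buffered in the BML.
    if size_pages >= SEQ_SIZE_PAGES or (lpn - 1) in bml:
        return AccessMode.SEQUENTIAL
    return AccessMode.RANDOM


def evict_from_bml(lpn, bml, ftl_modes):
    """S13/S14: on eviction, pass the mode to the FTL, which stores it."""
    ftl_modes[lpn] = bml.pop(lpn)["mode"]
```

Because the mode travels with the evicted item, a later miss on the same logical page can be classified from the FTL record rather than from the rough size or adjacency guess.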
Further, in step S1 the load characteristics are divided into macroscopic characteristics and microscopic characteristics; the macroscopic characteristics of the load are analyzed with a segmented statistical method;
the microscopic characteristics of the load are analyzed with a machine learning-based hot data identification algorithm; the hot data identification algorithm is divided into an off-line learning stage and an on-line learning stage.
Further, in the off-line learning stage, features are modeled and each request is classified manually to obtain a training set, machine learning is then carried out on the training set, and finally the effective features and model parameters are output;
in the on-line learning stage, when each request arrives, features are first extracted and the trained classification model is used directly for classification; in addition, a small training sample set is collected on line for on-line machine learning to update the model parameters, so that the classification model can adapt to changes in the load characteristics.
Furthermore, the Hot region, the Normal region and the Cold region each adopt a different organization mode;
the Hot region is organized by page (fine granularity) and sorted by priority value; the Normal region is organized by virtual block (medium granularity); the Cold region is organized by logical block (coarse granularity).
Further, the priority value of the Hot region is calculated as follows:
P1=f1(ti,tl)
where P1 is the priority value, ti is the average update interval of the page/block, and tl is the last update time.
Further, in step S22, when a request hits in the first half of the Normal region, it is migrated from the Normal region to the Hot region; otherwise, it is reordered according to the least recently accessed rule for virtual blocks;
if the request hits in the Cold region, it is migrated to the Normal region, with sorting by least recently accessed logical block.
Further, when page/block eviction is performed in step S23, the Hot region and the Normal region evict the page/block at the tail of the queue to the next region, while the Cold region selects a block to evict according to step S24 and sends the block data to the FTL, which decides the appropriate flash memory location to write to;
the eviction order of the blocks in step S24 is calculated as follows:
P2=f2(tl,n,D)
where P2 is the eviction priority of a Cold block, tl is the last update time of the block, n is the number of dirty pages it contains, and D is the location distribution of the dirty pages in the flash memory.
Further, during the classification processing of step S31, when a request hits in none of the H-CMT, the CMT and the S-CMT, sequential requests are loaded into the S-CMT and other requests are loaded into the CMT, according to the request access mode identified in step S1; when a request hits in the CMT or the S-CMT, it is promoted to the H-CMT;
during eviction, the H-CMT uses a simple least recently accessed principle to evict the mapping item at the tail of the queue into the CMT, while the CMT and the S-CMT use a batch eviction principle;
when the CMT and the S-CMT evict translation pages in batches, the eviction priority P3 of a translation page is calculated as follows:
P3=f3(tl,n)
where tl is the last access time of the translation page and n is the number of mapping items within the page.
Further, in order to map the data corresponding to some of the H-CMT mapping items to the SLC, a variable is added to each mapping item of the H-CMT to record its number of updates (writes), and the normalized wear degrees of the SLC and the MLC are calculated; wear leveling inside the SLC and inside the MLC is realized with an existing wear leveling algorithm:
rws = (total_es × lm) / (ns × ls)
rwm = total_em / nm
where rws and rwm are the relative wear degrees of the SLC and the MLC respectively, total_es and total_em are the total erase counts of the SLC and the MLC respectively, ns and nm are the total block counts of the SLC and the MLC respectively, and ls and lm are the wear upper limits of the SLC and the MLC respectively.
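The relative wear calculation can be worked through in a short Python sketch. It reads the first formula as total_es × lm / (ns × ls), that is, the average erase count per SLC block scaled by lm/ls so that it is comparable with the average erase count per MLC block; this grouping is an interpretation of the formula as printed, and the endurance figures in the usage note are illustrative values, not values from the source.

```python
def normalized_wear(total_es, ns, ls, total_em, nm, lm):
    """Return (rws, rwm), the relative wear degrees of the SLC and MLC regions.

    total_es, total_em: total erase counts of the SLC / MLC blocks
    ns, nm:             total block counts of SLC / MLC
    ls, lm:             wear upper limits (endurance) of SLC / MLC
    """
    # Assumed grouping: (total_es / ns) is the mean erase count per SLC block,
    # and the factor lm / ls rescales it to MLC-equivalent wear.
    rws = total_es * lm / (ns * ls)
    rwm = total_em / nm
    return rws, rwm
```

For example, with 100 SLC blocks erased 1000 times in total, 900 MLC blocks erased 900 times in total, and assumed limits ls = 100000 and lm = 10000, both regions have a relative wear of 1.0, so neither is wearing faster than the other.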
The invention has the following beneficial effects:
1. By adopting a hybrid SLC/MLC chip structure in the NAND flash memory array layer, the invention addresses the firmware design problem of SSDs based on a hybrid SLC/MLC structure and achieves a trade-off among the performance, service life and cost of such SSDs: the read/write performance of the cross-layer hybrid SSD is improved, its service life is extended, and its cost is reduced.
2. The invention optimizes the design of the BML and the FTL with a cross-layer design method: the macroscopic and microscopic load characteristics are identified in real time through the cooperation of the BML and the FTL, the BML is designed to be FTL-aware, and the FTL is designed to be aware of the flash memory array layer, so that the performance of the hybrid SSD is improved.
3. Programming mode selection based on deep-level flash memory characteristics exploits the trade-off among the number of P/E cycles, the data retention time and the programming speed, which provides more room for optimizing the read/write performance and extending the service life of the hybrid SSD.
Of course, a product practicing the invention need not achieve all of the above advantages at the same time.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a block diagram of a hybrid SSD based on a cross-layer design;
FIG. 2 is a block diagram of machine learning-based hot data identification;
FIG. 3 is a general structural view of the FTL.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in FIGS. 1-3, the present invention is a hybrid SSD performance and service life optimization method based on cross-layer design, and the optimization method includes:
S1, identifying the load characteristics that are traditionally scattered across the BML and the FTL and centralizing them into a new WAL layer, and designing and optimizing the WAL layer;
S2, designing and optimizing the BML;
S3, designing and optimizing the FTL;
the optimization method of step S1 is as follows:
S11, for any request, first judging whether it hits in the BML, and if so, updating its access mode according to its access history;
S12, if the request does not hit in the BML, judging whether it hits in the FTL; if it hits in the FTL, using the previously recorded access mode to predict the current mode; otherwise, identifying a rough access mode from the request size or from the relation between the logical page number of the current request and the logical page numbers of the requests in the BML;
S13, when the BML evicts a data item, sending its access mode to the FTL as a parameter;
S14, the FTL adding a 2-bit parameter to each logical mapping item to record the access mode; it can be seen that, through the cooperation of the BML and the FTL, the accuracy with which the BML identifies access modes is exploited and the FTL can store more access modes, at a cost of only 2 bits of additional storage per mapping item;
the BML in step S2 is a page/block adaptive BML divided into a Hot region, a Normal region and a Cold region; data pages are migrated among and evicted from the Hot region, the Normal region and the Cold region as follows:
S21, data pages can be migrated only between two adjacent regions, that is, only two types of data migration exist: between the Hot region and the Normal region, and between the Normal region and the Cold region;
S21, when a request does not hit in the BML, processing it according to the access mode identified in step S1: sequential accesses are loaded into the Cold region and other accesses are loaded into the Normal region;
S22, when a request hits in the BML, judging which of the Hot region, the Normal region and the Cold region it hits, and then carrying out the corresponding data processing;
S23, when the space of the three regions is insufficient, performing page/block eviction;
S24, when selecting a block to evict from the Cold region, considering not only the least recently accessed principle of the Cold region but also the garbage collection efficiency of each block, in order to reduce the garbage collection cost of the FTL;
the optimization method of step S3 is as follows:
S31, classifying the data by splitting the CMT of DFTL into an H-CMT, a CMT and an S-CMT, wherein the H-CMT caches frequently accessed mapping items, the S-CMT caches sequentially accessed mapping items, and the CMT caches ordinary randomly accessed mapping items; the H-CMT is managed at fine granularity by single mapping item, and the CMT and the S-CMT are clustered by translation page, that is, mapping items belonging to the same translation page are grouped together for management;
S32, optimizing the distribution of FTL-layer data between SLC and MLC and the wear leveling between them, so that hot data are stored in the SLC and cold data are stored in the MLC;
S33, adding a programming mode selection module to the FTL layer of step S32, so that writes are aware of and exploit deep-level flash memory characteristics by trading off the number of P/E cycles, the programming speed and the data retention time: when the load is heavy, hot data are written to the flash memory with fast writes, which improves the write performance of the SSD; when the load is light, cold data are written to the flash memory with slow writes, which increases the number of P/E cycles the SSD can sustain; in addition, a mechanism is provided to ensure that fast-written data that are still valid when their retention time is about to expire are rewritten, so that the data retention time is not reduced.
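The programming mode selection of step S33 can be sketched in Python as follows. The text specifies only two cases (hot data under heavy load use fast writes, cold data under light load use slow writes) and the need to rewrite still-valid fast-written data before its retention time expires; everything else here is an assumption for illustration, including the fallback to slow writes in the unspecified cases, the retention figures, the refresh margin and all names.

```python
from dataclasses import dataclass

FAST, SLOW = "fast", "slow"

# Hypothetical retention times in arbitrary time units: fast writes are
# assumed to trade retention time for programming speed.
RETENTION = {FAST: 100.0, SLOW: 1000.0}
REFRESH_MARGIN = 10.0  # assumed: rewrite this long before retention expires


def select_program_mode(is_hot, load_is_heavy):
    """Fast write for hot data under heavy load; otherwise slow write.

    The two mixed cases (hot/light, cold/heavy) are not specified in the
    text; falling back to slow writes for them is an assumption.
    """
    return FAST if (is_hot and load_is_heavy) else SLOW


@dataclass
class WrittenPage:
    mode: str
    written_at: float
    valid: bool = True


def needs_rewrite(page, now):
    """True if a still-valid fast-written page is about to exceed its retention."""
    if page.mode != FAST or not page.valid:
        return False
    return now >= page.written_at + RETENTION[FAST] - REFRESH_MARGIN
```

A background task could scan fast-written pages periodically and rewrite those for which needs_rewrite returns True; pages that have already been invalidated by a newer update need no rewrite, which is why short-lived hot data suit fast writes.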
In the embodiment provided by the present invention, in step S1 the load characteristics are divided into macroscopic characteristics and microscopic characteristics; the macroscopic characteristics of the load are analyzed with a segmented statistical method; specifically, the servicing of N access requests is taken as one period, the operation types and access modes of the N requests are counted, the macroscopic characteristics of the load in that period are calculated once the sampling interval is reached, and the currently measured macroscopic characteristics are used to predict those of the next period;
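The segmented statistical method can be sketched as a small Python class. The two ratios tracked here (write ratio and sequential ratio) are assumed examples of "operation types and access modes"; the source does not list the exact macroscopic characteristics, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MacroFeatures:
    write_ratio: float        # fraction of requests in the period that were writes
    sequential_ratio: float   # fraction of requests that were sequential


class SegmentedStats:
    """Counts requests in periods of N and predicts the next period."""

    def __init__(self, period):
        self.period = period          # N requests per period
        self._count = 0
        self._writes = 0
        self._sequential = 0
        self.predicted = None         # prediction for the current period

    def record(self, is_write, is_sequential):
        """Account for one serviced request and return the current prediction."""
        self._count += 1
        self._writes += int(is_write)
        self._sequential += int(is_sequential)
        if self._count == self.period:
            # End of the period: the measured characteristics become the
            # prediction for the next period, and the counters restart.
            self.predicted = MacroFeatures(
                self._writes / self.period,
                self._sequential / self.period,
            )
            self._count = self._writes = self._sequential = 0
        return self.predicted
```

With period set to 4, recording three writes and one read, of which one request is sequential, yields a predicted write ratio of 0.75 and a sequential ratio of 0.25 for the following period.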
the microscopic characteristics of the load are analyzed with a machine learning-based hot data identification algorithm; in essence, hot data identification is a binary classification problem, that is, when a request arrives, it is judged whether its data belong to the hot class or the cold class; the hot data identification algorithm is divided into an off-line learning stage and an on-line learning stage.
In the embodiment provided by the invention, in the off-line learning stage, features are modeled and each request is classified manually to obtain a training set, machine learning is then carried out on the training set, and finally the effective features and model parameters are output;
in the on-line learning stage, when each request arrives, features are first extracted and the trained classification model is used directly for classification; in addition, a small training sample set is collected on line for on-line machine learning to update the model parameters, so that the classification model can adapt to changes in the load characteristics.
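The on-line stage can be illustrated with a deliberately simple classifier. The source does not name the learning algorithm or the features, so the perceptron below is a stand-in chosen for brevity, and the feature vector is hypothetical; the initial weights play the role of the parameters output by the off-line stage.

```python
class OnlineHotClassifier:
    """Binary hot/cold classifier with perceptron-style on-line updates.

    A stand-in model: the source only says that a trained classification
    model is applied per request and refined with samples collected on line.
    """

    def __init__(self, weights, bias=0.0, learning_rate=0.1):
        self.w = list(weights)     # parameters from the off-line stage
        self.b = bias
        self.lr = learning_rate

    def predict(self, features):
        """Return True if the request is classified as hot."""
        score = sum(w * x for w, x in zip(self.w, features)) + self.b
        return score > 0

    def update(self, features, is_hot):
        """Adjust the parameters using one labelled on-line sample."""
        error = int(is_hot) - int(self.predict(features))
        if error:
            # Perceptron rule: move the boundary only on misclassification.
            self.w = [w + self.lr * error * x for w, x in zip(self.w, features)]
            self.b += self.lr * error
```

In use, predict would be called on the feature vector extracted from each arriving request, and update would be called on the small set of samples whose true class becomes known later, for example once the observed update count of a page is available.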
In the embodiment provided by the invention, the Hot region, the Normal region and the Cold region each adopt a different organization mode;
the Hot region is organized by page (fine granularity) and sorted by priority value; the Normal region is organized by virtual block (medium granularity); the Cold region is organized by logical block (coarse granularity).
In the embodiment provided by the present invention, the priority value of the Hot region is calculated as follows:
P1=f1(ti,tl)
where P1 is the priority value, ti is the average update interval of the page/block, and tl is the last update time; the Hot region is sorted by this priority value, which takes into account both the access frequency and the access recency of a page.
In the embodiment provided by the present invention, in step S22, when a request hits in the first half of the Normal region, it is migrated from the Normal region to the Hot region; otherwise, it is reordered according to the least recently accessed rule for virtual blocks;
if the request hits in the Cold region, it is migrated to the Normal region, with sorting by least recently accessed logical block.
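The migration rules between adjacent regions can be sketched in Python. This is a simplified model under stated assumptions: every region is tracked per page in recency order (the source organizes the Normal and Cold regions by virtual and logical block and sorts the Hot region by priority value), capacity limits and eviction are omitted, and the class name is hypothetical.

```python
from collections import OrderedDict


class ThreeRegionBuffer:
    """Hot / Normal / Cold regions; the front of each queue is most recent."""

    def __init__(self):
        self.hot = OrderedDict()
        self.normal = OrderedDict()
        self.cold = OrderedDict()

    @staticmethod
    def _push_front(region, lpn):
        region[lpn] = True
        region.move_to_end(lpn, last=False)

    def access(self, lpn, sequential=False):
        """Process one request and return where it hit ("miss" if nowhere)."""
        if lpn in self.hot:
            self.hot.move_to_end(lpn, last=False)
            return "hot"
        if lpn in self.normal:
            position = list(self.normal).index(lpn)
            if position < len(self.normal) / 2:
                # Hit in the first half of Normal: promote to Hot.
                del self.normal[lpn]
                self._push_front(self.hot, lpn)
            else:
                # Hit in the second half: only reorder within Normal.
                self.normal.move_to_end(lpn, last=False)
            return "normal"
        if lpn in self.cold:
            # Hit in Cold: migrate to the adjacent Normal region.
            del self.cold[lpn]
            self._push_front(self.normal, lpn)
            return "cold"
        # Miss: sequential accesses go to Cold, all others to Normal.
        self._push_front(self.cold if sequential else self.normal, lpn)
        return "miss"
```

Note that a page never jumps directly between Cold and Hot: a page loaded into Cold needs one hit to reach Normal and a further hit in the first half of Normal to reach Hot, which matches the rule that data migrate only between adjacent regions.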
In the embodiment provided by the present invention, when page/block eviction is performed in step S23, the Hot region and the Normal region evict the page/block at the tail of the queue to the next region, while the Cold region selects a block to evict according to step S24 and sends the block data to the FTL, which decides the appropriate flash memory location to write to;
the eviction order of the blocks in step S24 is calculated as follows:
P2=f2(tl,n,D)
where P2 is the eviction priority of a Cold block, tl is the last update time of the block, n is the number of dirty pages it contains, and D is the location distribution of the dirty pages in the flash memory; specifically, obtaining the dirty page location distribution D requires the cooperation of the FTL, which is rarely considered in conventional BML design. Qualitatively, a block should be evicted preferentially when its last update time tl is earlier, when its number of dirty pages n is larger, or when its dirty page location distribution D is more concentrated. This is because an earlier tl indicates poorer locality of the block, while a larger n or a more concentrated D leads to higher subsequent garbage collection efficiency.
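One possible form of f2 that respects the qualitative rules above is sketched below in Python. The source leaves f2 unspecified, so the linear weighting, the use of the current time to turn tl into an age, and the concentration measure (the reciprocal of the number of distinct flash blocks holding the dirty pages) are all assumptions for illustration.

```python
def eviction_priority(tl, n, dirty_locations, now, w_age=1.0, w_n=1.0, w_d=1.0):
    """Hypothetical f2: a higher value means the Cold block is evicted sooner.

    tl:              last update time of the block
    n:               number of dirty pages in the block
    dirty_locations: flash block ids holding the old copies of the dirty
                     pages (supplied by the FTL); fewer distinct ids means
                     a more concentrated distribution D
    """
    age = now - tl  # earlier tl -> larger age -> higher priority
    if dirty_locations:
        concentration = 1.0 / len(set(dirty_locations))
    else:
        concentration = 0.0
    return w_age * age + w_n * n + w_d * concentration


def pick_victim(blocks, now):
    """blocks: dict block_id -> (tl, n, dirty_locations); return the victim id."""
    return max(blocks, key=lambda b: eviction_priority(*blocks[b], now=now))
```

With equal age and dirty page count, the block whose dirty pages all map to one flash block is preferred over a block whose dirty pages are spread over four, since writing it back invalidates pages in a single flash block and makes that block cheaper to garbage collect.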
In the embodiment provided by the present invention, during the classification processing of step S31, when a request misses in the H-CMT, the CMT and the S-CMT, the access mode identified in step S1 is used to load sequential requests into the S-CMT and other requests into the CMT; when a request hits in the CMT or the S-CMT, it is promoted to the H-CMT;
during elimination, the H-CMT adopts a simple least-recently-used principle to eliminate the mapping item at the tail of the queue into the CMT, while the CMT and the S-CMT adopt a batch elimination principle, namely, dirty mapping items belonging to the same translation page are written back to the translation block of the flash memory together in a single batch, which improves on the traditional DFTL method of updating one mapping item at a time;
when the CMT and S-CMT remove the translation pages in batches, the removal priority P3 of the translation pages is calculated as follows:
P3 = f3(tl, n), where tl is defined as the last access time of the translation page and n is defined as the number of dirty mapping items in the page; in order to better exploit the spatial locality of accesses, a prefetching strategy is adopted when a new mapping item is read in after a request misses in the H-CMT, the CMT and the S-CMT. The prefetch size depends on the access mode: sequential requests prefetch more mapping items into the S-CMT, while random requests prefetch fewer.
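The classification, promotion, prefetching and batch elimination described in this paragraph can be sketched together as below. This is an illustrative model only, under several assumptions: the names `TieredCMT` and `pick_batch_victim`, the constants `ENTRIES_PER_TPAGE` and `PREFETCH`, the dictionary `flash_map` standing in for the translation pages in flash, and in particular the form chosen for f3 (age multiplied by the number of dirty items) are hypothetical, since the source gives f3 only as a function of tl and n. Capacity limits and the H-CMT to CMT demotion are omitted for brevity.

```python
ENTRIES_PER_TPAGE = 4                        # hypothetical; real translation pages hold far more
PREFETCH = {"sequential": 4, "random": 1}    # hypothetical prefetch sizes per access mode


class TieredCMT:
    def __init__(self, flash_map):
        self.flash_map = flash_map           # stand-in for the mapping table in flash: lpn -> ppn
        self.h_cmt, self.cmt, self.s_cmt = {}, {}, {}

    def _cached(self, lpn):
        return lpn in self.h_cmt or lpn in self.cmt or lpn in self.s_cmt

    def lookup(self, lpn, mode):
        """Return the physical page for lpn; mode is 'sequential' or 'random'."""
        if lpn in self.h_cmt:
            return self.h_cmt[lpn]
        for tier in (self.cmt, self.s_cmt):
            if lpn in tier:                  # hit in CMT or S-CMT: promote to H-CMT
                self.h_cmt[lpn] = tier.pop(lpn)
                return self.h_cmt[lpn]
        # Miss in all three tables: load the entry and prefetch its successors.
        target = self.s_cmt if mode == "sequential" else self.cmt
        for n in range(lpn, lpn + PREFETCH[mode]):
            if n in self.flash_map and not self._cached(n):
                target[n] = self.flash_map[n]
        return target.get(lpn)


def pick_batch_victim(dirty_lpns, last_access, now):
    """Group dirty entries by translation page and pick one page to flush as a batch."""
    groups = {}
    for lpn in dirty_lpns:
        groups.setdefault(lpn // ENTRIES_PER_TPAGE, []).append(lpn)

    def p3(tpage):
        # Hypothetical f3: older pages with more dirty items score higher.
        return (now - last_access[tpage]) * len(groups[tpage])

    victim = max(groups, key=p3)
    return victim, sorted(groups[victim])


cache = TieredCMT({i: 100 + i for i in range(16)})
cache.lookup(0, "sequential")                # loads entries 0..3 into the S-CMT
cache.lookup(1, "sequential")                # S-CMT hit, promoted to the H-CMT
print(sorted(cache.s_cmt), sorted(cache.h_cmt))
```

Flushing all dirty items of one translation page together means a single translation-page write covers several updates, which is the saving the batch principle aims at.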
In the embodiment provided by the invention, in order to map the data corresponding to part of the H-CMT mapping items to the SLC, a variable is added to each mapping item of the H-CMT to record the number of updates (writes) of that mapping item. The normalized wear degrees of the SLC and the MLC are then calculated, and the assignment threshold assign_th is dynamically adjusted according to them: data whose mapping item has been updated fewer than assign_th times is assigned to the MLC (cold data), and otherwise it is assigned to the SLC (hot data). When the normalized SLC wear degree exceeds the MLC wear degree by a certain amount, assign_th is increased so that less data is assigned to the SLC; when the normalized SLC wear degree is lower than the MLC wear degree by a certain amount, assign_th is decreased so that more data is assigned to the SLC. Wear balance inside the SLC and inside the MLC is realized with an existing wear-balance algorithm. The relative wear degrees are calculated as follows:
rws = (total_es × lm) / (ns × ls),  rwm = total_em × 1 / nm
wherein rws and rwm are the relative wear degrees of the SLC and the MLC respectively, total_es and total_em are the total erase counts of the SLC and the MLC respectively, ns and nm are the total block numbers of the SLC and the MLC respectively, and ls and lm are the wear upper limits of the SLC and the MLC respectively.
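As a worked illustration of the wear formulas and the threshold adjustment, the sketch below reads the SLC formula as the average erase count per SLC block scaled by lm/ls, so that it is comparable with the MLC average; that grouping is an interpretation of the flattened formula, not something the text states explicitly. The function names, the absolute `margin` standing in for "a certain amount", the `step` size and the lower bound `min_th` are all hypothetical.

```python
def slc_relative_wear(total_es, ns, ls, lm):
    """rws: average SLC erase count per block, scaled by lm/ls (assumed grouping)."""
    return total_es * lm / (ns * ls)


def mlc_relative_wear(total_em, nm):
    """rwm: average MLC erase count per block."""
    return total_em / nm


def adjust_assign_th(assign_th, rws, rwm, margin=5.0, step=1, min_th=1):
    """Raise the threshold when the SLC is more worn, lower it when the MLC is."""
    if rws > rwm + margin:
        return assign_th + step              # fewer entries qualify for the SLC
    if rws < rwm - margin:
        return max(min_th, assign_th - step) # more entries qualify for the SLC
    return assign_th


def place(update_count, assign_th):
    """Entries updated fewer than assign_th times are cold and go to the MLC."""
    return "MLC" if update_count < assign_th else "SLC"


rws = slc_relative_wear(total_es=50_000, ns=100, ls=100_000, lm=10_000)
rwm = mlc_relative_wear(total_em=30_000, nm=1_000)
th = adjust_assign_th(4, rws, rwm)
print(rws, rwm, th, place(4, th))
```

With these example numbers the SLC is relatively more worn, so the threshold rises from 4 to 5 and an entry with 4 updates, which previously qualified for the SLC, is now placed in the MLC.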
In the description herein, references to the description of "one embodiment," "an example," "a specific example" or the like are intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The preferred embodiments of the invention disclosed above are intended to be illustrative only. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise embodiments disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best utilize the invention. The invention is limited only by the claims and their full scope and equivalents.

Claims (9)

1. A hybrid SSD performance and lifetime optimization method based on a cross-layer design, the optimization method comprising:
S1, identifying and centralizing the load characteristics that are traditionally dispersed across the BML and the FTL to form a new WAL layer, and designing and optimizing the WAL layer;
S2, designing and optimizing the BML;
S3, designing and optimizing the FTL;
the optimization method of step S1 is as follows:
S11, for any request, first judging whether it hits in the BML; if so, updating its access mode according to its access history;
S12, if it does not hit in the BML, judging whether it hits in the FTL; if it hits in the FTL, using the previous access mode to predict the current mode; otherwise, identifying a rough access mode according to the request size or the relation between the logical page number of the current request and the logical page numbers of requests in the BML;
S13, when the BML eliminates a data item, sending its access mode to the FTL as a parameter;
S14, the FTL adds a parameter (2 bits) for recording the access mode to each logical mapping item;
the BML in step S2 is a page/block adaptive BML, and is divided into a Hot region, a Normal region, and a Cold region; the data page migration and elimination method among the Hot region, the Normal region and the Cold region is as follows:
S21, the data pages of two adjacent regions can be migrated to each other, that is, only two types of data migration exist: between the Hot region and the Normal region, and between the Normal region and the Cold region;
S21, when a request does not hit in the BML, processing it according to the access mode identified in step S1, loading sequential accesses into the Cold area and other accesses into the Normal area;
S22, after a request hits in the BML, judging in which of the Hot area, the Normal area and the Cold area it hits, and then carrying out the corresponding data processing;
S23, when the space of the three areas is insufficient, performing page/block elimination;
S24, when selecting a block to eliminate in the Cold area, considering not only the least-recently-used principle of the Cold area but also the garbage collection efficiency of each block, in order to reduce the garbage collection cost of the FTL;
the optimization method of step S3 is as follows:
S31, realizing classification processing of data by splitting the CMT of the DFTL into an H-CMT, a CMT and an S-CMT, wherein the H-CMT is responsible for caching frequently accessed mapping items, the S-CMT is responsible for caching sequentially accessed mapping items, and the CMT is responsible for caching ordinary randomly accessed mapping items; the H-CMT is managed at fine granularity by single mapping item, and the CMT and the S-CMT are clustered by translation page;
S32, optimizing the allocation of FTL-layer data between the SLC and the MLC and the wear balance between them;
S33, adding a programming mode selection module to the FTL layer of step S32, so that write operations take advantage of awareness of deep-level flash memory characteristics.
2. The method of claim 1, wherein the load characteristics are divided into macroscopic characteristics and microscopic characteristics in step S1; the load macroscopic characteristic analysis is realized by adopting a sectional statistical method;
analyzing the microscopic characteristics of the load by adopting a hot data identification algorithm based on machine learning; the hot data identification algorithm is divided into an off-line learning stage and an on-line learning stage.
3. The method of claim 2, wherein in the off-line learning stage, feature modeling and classification are performed manually on each request to obtain a training set, and then the training set is used for machine learning, and finally valid features and model parameters are output;
in the on-line learning stage, when each request arrives, feature extraction is firstly carried out, then the trained classification model is directly used for classification, and a small training sample set is collected on line for on-line machine learning to obtain model parameters, so that the classification model can adapt to the change of load characteristics.
4. The method of claim 1, wherein the Hot region, the Normal region and the Cold region are organized differently;
the Hot area is organized by page, which is fine granularity, and is sorted by priority value; the Normal area is organized by virtual block, which is medium granularity; the Cold area is organized by logical block, which is coarse granularity.
5. The method of claim 4, wherein the priority value of the Hot region is calculated as follows:
P1=f1(ti,tl)
where P1 is defined as the priority value, ti is defined as the page/block average update interval, and tl is defined as the last update time.
6. The method of claim 1, wherein when a request in step S22 hits in the first half of the Normal region, the request is migrated from the Normal region to the Hot region; otherwise, it is reordered within the Normal region according to the least-recently-used rule for virtual blocks;
if the request hits in the Cold region, it is migrated to the Normal region; the Cold region itself is ordered by least-recently-used logical block.
7. The method of claim 1, wherein when performing page/block elimination in step S23, the Hot area and the Normal area each eliminate the page/block at the tail of their queue into the next area, while the Cold area selects a block to eliminate according to the principle of step S24 and sends the block data to the FTL, and the FTL decides where in the flash memory the block data is written;
the elimination order of the blocks in step S24 is calculated as follows:
P2=f2(tl,n,D)
wherein, P2 is defined as the culling priority of Cold blocks, tl is defined as the last update time of the blocks, n is defined as the number of dirty pages contained, and D is defined as the location distribution of the dirty pages in the flash memory.
8. The method of claim 1, wherein during the classification processing of step S31, when a request misses in the H-CMT, the CMT and the S-CMT, sequential requests are loaded into the S-CMT and other requests into the CMT according to the access mode identified in step S1; when a request hits in the CMT or the S-CMT, it is promoted to the H-CMT;
during elimination, the H-CMT adopts a simple least-recently-used principle to eliminate the mapping item at the tail of the queue into the CMT, and the CMT and the S-CMT adopt a batch elimination principle;
when the CMT and S-CMT remove the translation pages in batches, the removal priority P3 of the translation pages is calculated as follows:
P3 = f3(tl, n), where tl is defined as the last access time of the translation page and n is defined as the number of dirty mapping items in the page.
9. The method of claim 8, wherein in order to map data corresponding to part of the H-CMT mapping entries into SLCs, a variable is added to each mapping entry of H-CMT, the number of updates (writes) is recorded, and the normalized wear level of SLCs and MLCs is calculated; the wear balance inside the SLC and the MLC is realized by adopting the existing wear balance algorithm:
rws = (total_es × lm) / (ns × ls),  rwm = total_em × 1 / nm
wherein rws and rwm are the relative wear degrees of the SLC and the MLC respectively, total_es and total_em are the total erase counts of the SLC and the MLC respectively, ns and nm are the total block numbers of the SLC and the MLC respectively, and ls and lm are the wear upper limits of the SLC and the MLC respectively.
CN202110582732.6A 2021-05-27 2021-05-27 Cross-layer design-based hybrid SSD performance and service life optimization method Pending CN113220241A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110582732.6A CN113220241A (en) 2021-05-27 2021-05-27 Cross-layer design-based hybrid SSD performance and service life optimization method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110582732.6A CN113220241A (en) 2021-05-27 2021-05-27 Cross-layer design-based hybrid SSD performance and service life optimization method

Publications (1)

Publication Number Publication Date
CN113220241A true CN113220241A (en) 2021-08-06

Family

ID=77099594

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110582732.6A Pending CN113220241A (en) 2021-05-27 2021-05-27 Cross-layer design-based hybrid SSD performance and service life optimization method

Country Status (1)

Country Link
CN (1) CN113220241A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120284587A1 (en) * 2008-06-18 2012-11-08 Super Talent Electronics, Inc. Super-Endurance Solid-State Drive with Endurance Translation Layer (ETL) and Diversion of Temp Files for Reduced Flash Wear
CN102981963A (en) * 2012-10-30 2013-03-20 华中科技大学 Implementation method for flash translation layer of solid-state disc
US20130173844A1 (en) * 2011-12-29 2013-07-04 Jian Chen SLC-MLC Wear Balancing
CN106980799A (en) * 2017-03-10 2017-07-25 华中科技大学 The nonvolatile memory encryption system that a kind of abrasion equilibrium is perceived
CN109446117A (en) * 2018-09-06 2019-03-08 杭州电子科技大学 A kind of solid state hard disk page grade flash translation layer (FTL) design method
CN110248373A (en) * 2018-03-07 2019-09-17 中国移动通信有限公司研究院 A kind of cross-layer optimizing backing method and device, equipment, storage medium
CN110413537A (en) * 2019-07-25 2019-11-05 杭州电子科技大学 A kind of flash translation layer (FTL) and conversion method towards hybrid solid-state hard disk

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王志奇: "一种优化的闪存转换层的设计与实现", 《通信技术》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210806