CN108664217B - Caching method and system for reducing jitter of writing performance of solid-state disk storage system - Google Patents

Info

Publication number
CN108664217B
CN108664217B CN201810294987.0A CN201810294987A CN108664217B CN 108664217 B CN108664217 B CN 108664217B CN 201810294987 A CN201810294987 A CN 201810294987A CN 108664217 B CN108664217 B CN 108664217B
Authority
CN
China
Prior art keywords
write request
data
new write
new
cache
Prior art date
Legal status
Active
Application number
CN201810294987.0A
Other languages
Chinese (zh)
Other versions
CN108664217A
Inventor
孙辉
贾晨
陈国栋
Current Assignee
Anhui University
Original Assignee
Anhui University
Priority date
Filing date
Publication date
Application filed by Anhui University
Priority to CN201810294987.0A
Publication of CN108664217A
Application granted
Publication of CN108664217B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0674 Disk device
    • G06F3/0676 Magnetic disk device

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a caching method for reducing write-performance jitter of a solid-state disk storage system, which comprises the following steps: S1, when a new write request reaches the cache system, storing the new write request data in the matching cache cluster; S2, generating fingerprint information from the new write request data, matching the fingerprint information against a target fingerprint library, and determining the operation type of the new write request from the matching result; and S3, according to the operation type of the new write request, selecting how the new write request data is processed when the cache cluster holding it is written back to the flash memory. By means of the fingerprint-library technique, the invention preferentially selects clusters with more updated data pages to be written back to the flash memory and performs garbage collection on the original data blocks when the storage units holding them are idle. This improves the utilization of the cache space, reduces the large number of invalid data pages produced when frequently updated data in the cache is written back to the flash memory, reduces garbage-collection operations, and thus reduces the write-performance jitter of the solid-state disk.

Description

Caching method and system for reducing jitter of writing performance of solid-state disk storage system
Technical Field
The invention relates to the technical field of cache optimization, and in particular to a caching method and system for reducing write-performance jitter of a solid-state disk storage system.
Background
Solid-state disks, as a new type of storage device, have in recent years been widely used as the storage medium in various kinds of consumer electronics. With the widespread use of flash-based solid-state disks, problems caused by the physical characteristics of the storage medium itself have become increasingly prominent. Because of the erase-before-write property, updates must be performed out of place, which harms both the performance and the lifetime of the solid-state disk: the storage space must be erased before data can be written, so when data on the disk is updated, the new version is written into an already-erased data block and the original copy of the data is marked invalid. The invalid pages that accumulate in this way must eventually be reclaimed by garbage collection. A garbage-collection operation consists of two parts: rewriting the valid data pages of the block to be erased, and erasing the block once all of its pages are invalid. The rewriting of valid pages may contend with external new write requests for the bus, and an external new write request can only be served after the rewriting completes, so the performance of external writes jitters and the response time of a user's new write request increases sharply. In addition, the erase time of flash is much longer than its read and write times, which further increases the time overhead of garbage collection. For a transactional storage system, the large response delays caused by write-performance jitter are unacceptable; severe write-performance jitter is fatal to such a system. A cache system on the solid-state disk, however, can effectively reduce the impact of garbage collection on user response time.
Disclosure of Invention
Based on the technical problems in the background art, the invention provides a caching method and a caching system for reducing the jitter of the writing performance of a solid-state disk storage system.
The invention provides a caching method for reducing write-performance jitter of a solid-state disk storage system, which comprises the following steps:
S1, when a new write request reaches the cache system, storing the new write request data in the matching cache cluster;
S2, generating fingerprint information from the new write request data, matching the fingerprint information against a target fingerprint library, and determining the operation type of the new write request from the matching result;
S3, according to the operation type of the new write request, selecting how the new write request data is processed when the cache cluster holding it is written back to the flash memory.
Preferably, step S1 specifically includes:
when a new write request reaches the cache system, searching the cache system, according to the data page address of the new write request, for a cache cluster whose data page address range contains that address; if such a cluster exists, storing the new write request data in it; if not, applying for cache space in the cache system to form a new cache cluster, storing the new write request data in that cluster, constructing a target fingerprint library unit in the target fingerprint library, and storing in that unit the fingerprint information of all data pages in the flash memory data block corresponding to the cluster.
Preferably, in step S2, the target fingerprint library comprises a plurality of target fingerprint library units, and each target fingerprint library unit stores the fingerprint information of all data pages contained in one flash memory data block.
Preferably, step S2 specifically includes:
generating fingerprint information from the new write request data and matching it against the target fingerprint library; if no fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library, judging the new write request to be a new write operation; if fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library and the two content-field fingerprints match, judging the new write request to be a repeated write operation; and if fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library but the two content-field fingerprints do not match, judging the new write request to be an update operation.
Preferably, step S3 specifically includes:
selecting, according to the operation type of the new write request, the processing of the new write request data when the cache cluster holding it is written back to the flash memory, as follows:
when the new write request is a new write operation, writing the new write request data back to the flash memory together with its cache cluster when that cluster is written back; when the new write request is a repeated write operation, discarding the new write request data when its cache cluster is written back to the flash memory; when the new write request is an update operation, incrementing the updated data page counter of the cache cluster holding the new write request data by one and setting the original flash memory data page corresponding to the new write request data page to the invalid state, and, when the updated data page counter of that cache cluster exceeds the preset threshold, writing the new write request data back to the flash memory together with its cache cluster.
The invention further provides a cache system for reducing write-performance jitter of a solid-state disk storage system, which comprises:
an information storage module, configured to store the new write request data in the matching cache cluster when a new write request reaches the cache system;
an information matching module, configured to generate fingerprint information from the new write request data, match the fingerprint information against a target fingerprint library, and determine the operation type of the new write request from the matching result;
and a dynamic operation module, configured to select, according to the operation type of the new write request, how the new write request data is processed when the cache cluster holding it is written back to the flash memory.
Preferably, the information storage module is specifically configured to:
when a new write request reaches the cache system, searching the cache system, according to the data page address of the new write request, for a cache cluster whose data page address range contains that address; if such a cluster exists, storing the new write request data in it; if not, applying for cache space in the cache system to form a new cache cluster, storing the new write request data in that cluster, constructing a target fingerprint library unit in the target fingerprint library, and storing in that unit the fingerprint information of all data pages in the flash memory data block corresponding to the cluster.
Preferably, in the information matching module, the target fingerprint library comprises a plurality of target fingerprint library units, and each target fingerprint library unit stores the fingerprint information of all data pages contained in one flash memory data block.
Preferably, the information matching module is specifically configured to:
generating fingerprint information from the new write request data and matching it against the target fingerprint library; if no fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library, judging the new write request to be a new write operation; if fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library and the two content-field fingerprints match, judging the new write request to be a repeated write operation; and if fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library but the two content-field fingerprints do not match, judging the new write request to be an update operation.
Preferably, the dynamic operation module is specifically configured to:
selecting, according to the operation type of the new write request, the processing of the new write request data when the cache cluster holding it is written back to the flash memory, as follows:
when the new write request is a new write operation, writing the new write request data back to the flash memory together with its cache cluster when that cluster is written back; when the new write request is a repeated write operation, discarding the new write request data when its cache cluster is written back to the flash memory; when the new write request is an update operation, incrementing the updated data page counter of the cache cluster holding the new write request data by one and setting the original flash memory data page corresponding to the new write request data page to the invalid state, and, when the updated data page counter of that cache cluster exceeds the preset threshold, writing the new write request data back to the flash memory together with its cache cluster.
In the caching method for reducing write-performance jitter of a solid-state disk storage system provided by the invention, the caching strategy organizes data in the form of clusters and designs a fingerprint library oriented to write/update operations, which serves as a bridge for semantic interaction between the cache and the flash storage. The fingerprint information stored in each fingerprint library unit is used to detect whether each I/O request written into the cache space is an update operation, so that the system can sense in real time whether the number of invalid pages in the flash memory data block corresponding to a cache cluster has reached a threshold; when the number of updated data pages in an inactive cluster reaches the threshold, the cluster is written back to the flash memory, and garbage collection is performed on the original data block when the storage unit holding it is idle. This improves the utilization of the cache space, reduces the large number of invalid data pages produced by frequent update operations between the cache and the flash memory, and reduces garbage-collection operations, thereby minimizing bus conflicts between the valid-data-page rewrites of garbage collection and external new write requests and reducing the write-performance jitter of the solid-state disk.
Drawings
FIG. 1 is a schematic diagram illustrating steps of a caching method for reducing jitter of write performance of a solid-state disk storage system;
FIG. 2 is a schematic diagram of a cache system for reducing jitter in write performance of a solid-state disk storage system;
FIG. 3 is a schematic diagram of a cache system for reducing jitter of write performance of a solid-state disk storage system;
FIG. 4 is a diagram illustrating an exemplary fingerprint database matching process;
FIG. 5 is a diagram illustrating an embodiment of a process for writing data back to flash memory.
Detailed Description
With the growing scale of big-data applications, data processing places ever higher demands on the storage performance of computers. Traditional disk-based storage systems offer low performance and cannot meet the needs of big-data applications with strict response-time requirements, whereas flash-based solid-state disks are now widely used in big-data processing storage systems.
Solid-state disks, as a new type of storage device, have in recent years been widely used as the storage medium in various kinds of consumer electronics, and they have unique advantages over conventional magnetic-disk devices. First, read and write speeds are high: a conventional disk contains mechanical components and needs a long seek time, while a solid-state disk uses electronic devices as its storage medium, so its addressing overhead is low and the physical characteristics of the medium give it higher access performance than a conventional disk. Second, a solid-state disk consumes less energy and is shock resistant; its performance keeps improving, and 3D flash technology has effectively increased its capacity, making it clearly better suited to large-scale storage systems than a traditional disk.
Currently, flash memory storage media are widely used for solid state disks. The minimum unit of write operation in the flash memory storage medium is a page, the granularity of erase operation is a block, and the flash memory storage medium has the physical characteristics of inconsistent granularity of read/write operation and erase operation, asymmetric read-write speed, erasure before write and the like. With the widespread use of flash solid state disks, problems due to the physical characteristics of their storage media themselves are increasingly emerging. Off-site update operations due to the erase-before-write feature can compromise solid state disk performance and lifetime. The flash memory solid-state disk needs to erase the storage space before writing data, so when the data in the solid-state disk needs to be updated, the updated data needs to be written into the erased data block, and then the original data corresponding to the updated data is set to be invalid. The garbage recovery mainly comprises two parts, namely rewriting operation of an effective data page in a block to be erased and erasing operation of an invalid data block, wherein the rewriting operation of the effective data page in the block to be erased possibly contends with an external new write request for a bus, and the external new write request can be responded after the rewriting operation is completed, so that the performance of the external new write request is jittered, and the response time of the new write request of a user is greatly increased; in addition, the erase time of the flash solid-state disk is much longer than the read-write time, which again increases the time overhead of garbage collection. It is not allowable that the jitter of the write performance of the transactional memory system causes a large response delay, and the large jitter of the write performance is fatal to the transactional memory system. And the cache system on the solid-state disk can effectively reduce the influence of garbage recovery on the response time of the user.
Much prior work has addressed the optimization of solid-state disk cache systems, and many classical algorithms have been developed. The LRU algorithm is the most representative cache-management algorithm. LRU arranges data pages in a linked list according to the recency of their accesses; when the cache is full and a page must be replaced, the page at the tail of the list is evicted. The principle of LRU is simple, but it judges the activity of data from a single access record only, cannot accurately evict the least active data, and may evict active data from the cache, which produces a large number of invalid data pages in the flash memory and increases the frequency of garbage collection.
To address this, the CFLRU algorithm adapts the traditional LRU algorithm into a cache-management algorithm for flash media. Taking the asymmetric read and write costs of flash into account, CFLRU uses a window to divide the linked list into a working region (Work Region) containing recently accessed data pages and a clean-first region (Clean-First Region) containing pages that have not been accessed recently; when the cache is full, clean pages in the clean-first region are preferentially chosen for replacement. CFLRU improves the accuracy of selecting inactive data at replacement time, but when the window size is not chosen appropriately and the clean-first region contains no clean pages, the algorithm degenerates into the conventional LRU algorithm; it also does not consider the access frequency of dirty pages and may leave cold dirty pages in the limited cache space.
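The eviction idea of CFLRU can be sketched minimally as follows; the class interface, window handling and dirty-flag model are simplifying assumptions for illustration, not the algorithm's original implementation:

    from collections import OrderedDict

    class CFLRUCache:
        """Minimal CFLRU-style sketch: prefer evicting a clean page from the clean-first
        region at the LRU end of the list; fall back to plain LRU (evicting the true LRU
        page even if dirty) when that region holds no clean page."""

        def __init__(self, capacity: int, window: int):
            self.capacity = capacity
            self.window = window          # size of the clean-first region
            self.pages = OrderedDict()    # lpn -> dirty flag; most recently used at the end

        def access(self, lpn: int, is_write: bool) -> None:
            dirty = self.pages.pop(lpn, False) or is_write   # on a hit, keep/raise dirty flag
            if len(self.pages) >= self.capacity:
                self._evict()
            self.pages[lpn] = dirty                          # reinsert at the MRU end

        def _evict(self) -> None:
            lru_first = list(self.pages.items())             # index 0 is least recently used
            for lpn, dirty in lru_first[: self.window]:      # scan the clean-first region
                if not dirty:
                    del self.pages[lpn]                      # clean victim: no flash write
                    return
            victim, _ = lru_first[0]                         # degenerate to plain LRU
            del self.pages[victim]                           # a dirty victim would be flushed

By contrast, the write-back policy described below selects whole clusters by the number of updated pages they contain rather than individual pages by recency.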
Generally speaking, when a user I/O accesses a storage unit of the solid-state disk in which a garbage-collection operation is in progress, the user's read or write may contend for resources with the valid-data-page rewriting of that garbage collection and be delayed until the garbage collection finishes; the response to the read or new write request is prolonged, causing performance jitter of the solid-state disk.
The write-performance jitter caused by garbage collection severely affects user response time. When jitter occurs, latency spikes appear: the user's read and write requests cannot be answered in time, delays grow long, and the user experience degrades sharply. The harm is even greater in systems with strict real-time requirements, such as real-time transaction processing.
To shorten the response time of user requests and reduce performance jitter, existing methods mainly cache active data and write part of the cached data back to the flash memory only when cache space runs short, so that user reads and new write requests are served as much as possible from the cache, whose read and write speeds are much higher. However, when these methods write cache data back, they do not consider whether the target storage unit is currently performing garbage collection. If it is, the data rewriting inside the garbage-collection operation and the write-back request contend for the bus, and the write-back request must wait until garbage collection completes before it is served, which greatly lengthens its response time and causes performance jitter.
To overcome the above shortcomings of existing caches, the present invention provides a caching method and system for reducing write-performance jitter of a solid-state disk storage system, as shown in FIGS. 1 to 5, which illustrate the embodiment described below.
Referring to FIG. 1, the caching method for reducing write-performance jitter of a solid-state disk storage system according to the present invention includes the following steps:
S1, when a new write request reaches the cache system, storing the new write request data in the matching cache cluster;
in the embodiment, a page level mapping mode is adopted between the logical address and the physical address, and is combined with a data organization mode of a cache cluster and a data block in the flash memory; the cache is stored and managed in a cluster mode by taking a data block as a unit, wherein the size of the cluster is the same as that of the data block in the flash memory storage unit. Each free data block in the cache is allocated and stores, in a cluster, a data page within a particular LPN (local page number) range, the range of LPNs that the data block can store depending on the LPN address of the data page first allocated within the data block. Through the form of clustering, the storage distribution of the data pages becomes relatively ordered, and in the cache, clusters with more updated data pages in the cache can be written back to the flash memory storage unit by better utilizing the target fingerprint library, so that the garbage recovery efficiency in the flash memory storage unit is effectively improved. For example, assuming that an SSD can store 64 data pages at most in each data block, data pages with logical page addresses (LPN) between 0 and 63 will be stored in the same data block, and data pages with logical page addresses (LPN) between 64 and 127 will be stored in the same data block.
In this embodiment, step S1 specifically includes:
when a new write request reaches the cache system, searching the cache system, according to the data page address of the new write request, for a cache cluster whose data page address range contains that address; if such a cluster exists, storing the new write request data in it; if not, applying for cache space in the cache system to form a new cache cluster, storing the new write request data in that cluster, constructing a target fingerprint library unit in the target fingerprint library, and storing in that unit the fingerprint information of all data pages in the flash memory data block corresponding to the cluster.
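A minimal sketch of this admission step, assuming simple dictionary-backed structures; the SHA-1 content digest and the helper flash_fingerprints_of_block are hypothetical choices used only for illustration:

    import hashlib

    PAGES_PER_BLOCK = 64   # as in the example above
    clusters = {}          # lbn -> {lpn: data}: the cache clusters
    fingerprint_db = {}    # lbn -> {lpn: content fingerprint}: the target fingerprint library

    def page_fingerprint(data: bytes) -> str:
        # Any content digest can serve as the "content field"; SHA-1 is only an assumption.
        return hashlib.sha1(data).hexdigest()

    def flash_fingerprints_of_block(lbn: int) -> dict:
        """Hypothetical helper: a real device would read flash data block `lbn` and return
        {lpn: fingerprint} for every valid page it holds; kept as a placeholder here."""
        return {}

    def admit_write(lpn: int, data: bytes) -> None:
        """Step S1: store the write in its matching cluster, creating the cluster and its
        target fingerprint library unit if they do not exist yet."""
        lbn = lpn // PAGES_PER_BLOCK
        if lbn not in clusters:                                     # no matching cluster cached
            clusters[lbn] = {}                                      # apply for cache space
            fingerprint_db[lbn] = flash_fingerprints_of_block(lbn)  # build its fingerprint unit
        clusters[lbn][lpn] = data                                   # store the new write data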
S2, generating fingerprint information from the new write request data, matching the fingerprint information against a target fingerprint library, and determining the operation type of the new write request from the matching result;
In step S2, the target fingerprint library comprises a plurality of target fingerprint library units, and each target fingerprint library unit stores the fingerprint information of all data pages contained in one flash memory data block.
Step S2 specifically includes:
generating fingerprint information from the new write request data and matching it against the target fingerprint library; if no fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library, judging the new write request to be a new write operation; if fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library and the two content-field fingerprints match, judging the new write request to be a repeated write operation; and if fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library but the two content-field fingerprints do not match, judging the new write request to be an update operation.
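Continuing the structures of the previous sketch, the three-way decision of step S2 can be sketched as follows; the returned labels are illustrative names only:

    def classify_write(lpn: int, data: bytes) -> str:
        """Step S2: classify a new write request against the target fingerprint library unit
        of its flash data block (structures from the step-S1 sketch above)."""
        lbn = lpn // PAGES_PER_BLOCK
        stored = fingerprint_db.get(lbn, {}).get(lpn)   # content field for this logical address
        if stored is None:
            return "new_write"                          # no page with this LPN in flash yet
        if stored == page_fingerprint(data):
            return "repeated_write"                     # same logical address, same content
        return "update"                                 # same logical address, changed content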
S3, according to the operation type of the new write request, selecting how the new write request data is processed when the cache cluster holding it is written back to the flash memory.
Step S3 specifically includes:
selecting, according to the operation type of the new write request, the processing of the new write request data when the cache cluster holding it is written back to the flash memory, as follows:
when the new write request is a new write operation, writing the new write request data back to the flash memory together with its cache cluster when that cluster is written back; when the new write request is a repeated write operation, discarding the new write request data when its cache cluster is written back to the flash memory; when the new write request is an update operation, incrementing the updated data page counter of the cache cluster holding the new write request data by one and setting the original flash memory data page corresponding to the new write request data page to the invalid state, and, when the updated data page counter of that cache cluster exceeds the preset threshold, writing the new write request data back to the flash memory together with its cache cluster.
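A minimal sketch of this per-operation handling, continuing the previous sketches; UPDATE_THRESHOLD, update_counter, invalidate_flash_page and the queue names are hypothetical, and the threshold value of 16 is assumed:

    UPDATE_THRESHOLD = 16          # "preset threshold"; the value is assumed for illustration
    update_counter = {}            # lbn -> number of updated data pages in the cluster
    drop_on_write_back = {}        # lbn -> LPNs whose cached copy is discarded at write-back
    write_back_queue = []          # clusters selected to be written back to the flash memory

    def invalidate_flash_page(lpn: int) -> None:
        """Hypothetical helper: mark the original flash data page for this LPN as invalid."""

    def handle_write(lpn: int, data: bytes) -> None:
        admit_write(lpn, data)                 # step S1 (earlier sketch)
        op = classify_write(lpn, data)         # step S2 (earlier sketch)
        lbn = lpn // PAGES_PER_BLOCK
        if op == "repeated_write":
            # An identical copy already exists in flash: drop this page at write-back time.
            drop_on_write_back.setdefault(lbn, set()).add(lpn)
        elif op == "update":
            invalidate_flash_page(lpn)         # the original flash data page becomes stale
            update_counter[lbn] = update_counter.get(lbn, 0) + 1
            if update_counter[lbn] > UPDATE_THRESHOLD and lbn not in write_back_queue:
                write_back_queue.append(lbn)   # update-rich cluster: schedule its write-back
        # A "new_write" needs nothing extra; the page leaves with its cluster's write-back.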
The method is explained below by means of a specific example, with reference to FIG. 3, FIG. 4 and FIG. 5:
assume that a new write request with LPN = P0 arrives at the cache system; according to the cluster-based cache data organization, this request should be stored in cache cluster 1, so the cache is first checked for the existence of cache cluster 1 to hold the new write request data page with LPN = P0;
if cache cluster 1 exists, the new write request data page with LPN = P0 is written into cache cluster 1 in the cache;
if cache cluster 1 does not exist, a free cache cluster is found and its LBN is set to 1, the data page with LPN = P0 is stored into cache cluster 1, free space in the target fingerprint library is applied for and allocated to a newly created target fingerprint library unit 1, and the fingerprint information generated from flash memory data block 1 corresponding to cache cluster 1 is stored into target fingerprint library unit 1;
whether a data page with LPN = P0 exists in flash memory data block 1 is judged from whether the corresponding content field of target fingerprint library unit 1 in the target fingerprint library is empty, and if the page exists its fingerprint information is matched. (Specifically, the content field of the new write request with LPN = P0 is run through the same string-compression algorithm as the target fingerprint library, giving a content fingerprint of 0001 for the new write request, and this content fingerprint is matched against the content field for LPN = P0 in target fingerprint library unit 1.)
If the content field for LPN = P0 in target fingerprint library unit 1 is not empty and the match fails, the current write is regarded as an update operation: after the data pages of cache cluster 1 are written back to the flash memory, the page with LPN = P0 in flash memory data block 1 becomes invalid. Each update operation is therefore counted. If the number of updated data pages in cache cluster 1 now exceeds the preset threshold, all data pages of cache cluster 1 are written back to a free flash memory data block 1' in the flash memory, the valid pages of the original flash memory data block 1 (LBN = 1) are then rewritten into the same free flash memory data block 1', all pages of the original flash memory data block 1 are marked invalid, and the block is added to the garbage-collection queue to await a garbage-collection (erase) operation.
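The write-back and merge step of this example can be sketched as follows, continuing the earlier sketches; read_valid_flash_pages, program_block and gc_queue are hypothetical helpers standing in for flash-translation-layer operations:

    gc_queue = []   # data blocks whose erase is deferred until their storage unit is idle

    def read_valid_flash_pages(lbn: int) -> dict:
        """Hypothetical helper: {lpn: data} for pages of flash data block `lbn` that are still
        valid and not superseded by the cluster being written back; placeholder here."""
        return {}

    def program_block(free_block_id: int, pages: dict) -> None:
        """Hypothetical helper: program the given pages into the free flash data block."""

    def write_back_cluster(lbn: int, free_block_id: int) -> None:
        """Write cluster `lbn` back to a free flash block (block 1' in the example), merging
        in the surviving valid pages of the original block so only an erase remains for GC."""
        cached = {lpn: d for lpn, d in clusters.pop(lbn, {}).items()
                  if lpn not in drop_on_write_back.pop(lbn, set())}   # drop repeated writes
        merged = {**read_valid_flash_pages(lbn), **cached}            # cached pages take priority
        program_block(free_block_id, merged)
        gc_queue.append(lbn)                    # the old block is now all-invalid: erase later
        fingerprint_db.pop(lbn, None)           # remove its target fingerprint library unit
        update_counter.pop(lbn, None)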
With this scheme, the valid-data-page copy (rewrite) work is moved forward to the moment the cluster is written back to the flash memory, so the later garbage collection of the block is reduced to a single erase; this shortens garbage-collection time and further lowers the chance of a bus conflict between rewrites and external new write requests during garbage collection. Because on-board memory is limited, the fingerprint information of every data block in the flash memory cannot be kept in the cache. Instead, taking the clusters held in the cache as the objects, the fingerprint information of the flash data pages whose logical block number matches a cached cluster is extracted and stored in the target fingerprint library, so that when a data page is written into the cache it can be judged whether the user's new write request will make the corresponding data page in the flash storage unit stale. In addition, when a cluster in the cache is written back to the flash memory, the fingerprint information of the flash memory data block corresponding to that cluster is removed from the target fingerprint library.
Referring to FIG. 2, the cache system for reducing write-performance jitter of a solid-state disk storage system according to the present invention includes:
an information storage module, configured to store the new write request data in the matching cache cluster when a new write request reaches the cache system;
in the embodiment, a page level mapping mode is adopted between the logical address and the physical address, and is combined with a data organization mode of a cache cluster and a data block in the flash memory; the cache is stored and managed in a cluster mode by taking a data block as a unit, wherein the size of the cluster is the same as that of the data block in the flash memory storage unit. Each free data block in the cache is allocated and stores, in a cluster, a data page within a particular LPN (local page number) range, the range of LPNs that the data block can store depending on the LPN address of the data page first allocated within the data block. Through the form of clustering, the storage distribution of the data pages becomes relatively ordered, and in the cache, clusters with more updated data pages in the cache can be written back to the flash memory storage unit by better utilizing the target fingerprint library, so that the garbage recovery efficiency in the flash memory storage unit is effectively improved. For example, assuming that an SSD can store 64 data pages at most in each data block, data pages with logical page addresses (LPN) between 0 and 63 will be stored in the same data block, and data pages with logical page addresses (LPN) between 64 and 127 will be stored in the same data block.
In this embodiment, the information storage module is specifically configured to:
when a new write request reaches the cache system, searching the cache system, according to the data page address of the new write request, for a cache cluster whose data page address range contains that address; if such a cluster exists, storing the new write request data in it; if not, applying for cache space in the cache system to form a new cache cluster, storing the new write request data in that cluster, constructing a target fingerprint library unit in the target fingerprint library, and storing in that unit the fingerprint information of all data pages in the flash memory data block corresponding to the cluster.
an information matching module, configured to generate fingerprint information from the new write request data, match the fingerprint information against a target fingerprint library, and determine the operation type of the new write request from the matching result;
In the information matching module, the target fingerprint library comprises a plurality of target fingerprint library units, and each target fingerprint library unit stores the fingerprint information of all data pages contained in one flash memory data block.
In this embodiment, the information matching module is specifically configured to:
generating fingerprint information from the new write request data and matching it against the target fingerprint library; if no fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library, judging the new write request to be a new write operation; if fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library and the two content-field fingerprints match, judging the new write request to be a repeated write operation; and if fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library but the two content-field fingerprints do not match, judging the new write request to be an update operation.
And a dynamic operation module, configured to select, according to the operation type of the new write request, how the new write request data is processed when the cache cluster holding it is written back to the flash memory.
In this embodiment, the dynamic operation module is specifically configured to:
selecting, according to the operation type of the new write request, the processing of the new write request data when the cache cluster holding it is written back to the flash memory, as follows:
when the new write request is a new write operation, writing the new write request data back to the flash memory together with its cache cluster when that cluster is written back; when the new write request is a repeated write operation, discarding the new write request data when its cache cluster is written back to the flash memory; when the new write request is an update operation, incrementing the updated data page counter of the cache cluster holding the new write request data by one and setting the original flash memory data page corresponding to the new write request data page to the invalid state, and, when the updated data page counter of that cache cluster exceeds the preset threshold, writing the new write request data back to the flash memory together with its cache cluster.
The system is explained below by means of a specific example, with reference to FIG. 3, FIG. 4 and FIG. 5:
assume that a new write request with LPN = P0 arrives at the cache system; according to the cluster-based cache data organization, this request should be stored in cache cluster 1, so the cache is first checked for the existence of cache cluster 1 to hold the new write request data page with LPN = P0;
if cache cluster 1 exists, the new write request data page with LPN = P0 is written into cache cluster 1 in the cache;
if cache cluster 1 does not exist, a free cache cluster is found and its LBN is set to 1, the data page with LPN = P0 is stored into cache cluster 1, free space in the target fingerprint library is applied for and allocated to a newly created target fingerprint library unit 1, and the fingerprint information generated from flash memory data block 1 corresponding to cache cluster 1 is stored into target fingerprint library unit 1;
whether a data page with LPN = P0 exists in flash memory data block 1 is judged from whether the corresponding content field of target fingerprint library unit 1 in the target fingerprint library is empty, and if the page exists its fingerprint information is matched. (Specifically, the content field of the new write request with LPN = P0 is run through the same string-compression algorithm as the target fingerprint library, giving a content fingerprint of 0001 for the new write request, and this content fingerprint is matched against the content field for LPN = P0 in target fingerprint library unit 1.)
If the content field for LPN = P0 in target fingerprint library unit 1 is not empty and the match fails, the current write is regarded as an update operation: after the data pages of cache cluster 1 are written back to the flash memory, the page with LPN = P0 in flash memory data block 1 becomes invalid. Each update operation is therefore counted. If the number of updated data pages in cache cluster 1 now exceeds the preset threshold, all data pages of cache cluster 1 are written back to a free flash memory data block 1' in the flash memory, the valid pages of the original flash memory data block 1 (LBN = 1) are then rewritten into the same free flash memory data block 1', all pages of the original flash memory data block 1 are marked invalid, and the block is added to the garbage-collection queue to await a garbage-collection (erase) operation.
With the cache-management strategy of this embodiment, on top of the clustered data organization, a fingerprint-library technique is designed that provides "update" semantic interaction between the cache and the flash memory. Clusters that are of low activity and will produce more invalid data pages once written back are preferentially selected for write-back to the flash memory, which reduces the number of garbage collections, effectively reduces the write-performance jitter caused by possible bus contention during garbage collection, avoids excessive delays in responding to user requests, improves garbage-collection efficiency, and reduces write amplification.
In the caching method for reducing write-performance jitter of a solid-state disk storage system provided by the invention, the caching strategy organizes data in the form of clusters and designs a fingerprint library oriented to write/update operations, which serves as a bridge for semantic interaction between the cache and the flash storage. The fingerprint information stored in each fingerprint library unit is used to detect whether each I/O request written into the cache space is an update operation, so that the system can sense in real time whether the number of invalid pages in the flash memory data block corresponding to a cache cluster has reached a threshold; when the number of updated data pages in an inactive cluster reaches the threshold, the cluster is written back to the flash memory, and garbage collection is performed on the original data block when the storage unit holding it is idle. This improves the utilization of the cache space, reduces the large number of invalid data pages produced by frequent update operations between the cache and the flash memory, and reduces garbage-collection operations, thereby minimizing bus conflicts between the valid-data-page rewrites of garbage collection and external new write requests and reducing the write-performance jitter of the solid-state disk.
The above description is only a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited thereto. Any equivalent substitution or modification of the technical solutions and inventive concepts described herein that would readily occur to a person skilled in the art within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention.

Claims (4)

1. A cache method for reducing jitter of write performance of a solid-state disk storage system is characterized by comprising the following steps:
S1, when a new write request reaches the cache system, storing the new write request data in the matching cache cluster;
S2, generating fingerprint information from the new write request data, matching the fingerprint information against a target fingerprint library, and determining the operation type of the new write request from the matching result;
S3, according to the operation type of the new write request, selecting how the new write request data is processed when the cache cluster holding it is written back to the flash memory;
in step S2, the target fingerprint library comprises a plurality of target fingerprint library units, and each target fingerprint library unit stores the fingerprint information of all data pages contained in one flash memory data block;
step S2 specifically includes:
generating fingerprint information from the new write request data and matching it against the target fingerprint library; if no fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library, judging the new write request to be a new write operation; if fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library and the two content-field fingerprints match, judging the new write request to be a repeated write operation; and if fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library but the two content-field fingerprints do not match, judging the new write request to be an update operation;
step S3 specifically includes:
selecting, according to the operation type of the new write request, the processing of the new write request data when the cache cluster holding it is written back to the flash memory, as follows:
when the new write request is a new write operation, writing the new write request data back to the flash memory together with its cache cluster when that cluster is written back; when the new write request is a repeated write operation, discarding the new write request data when its cache cluster is written back to the flash memory; when the new write request is an update operation, incrementing the updated data page counter of the cache cluster holding the new write request data by one and setting the original flash memory data page corresponding to the new write request data page to the invalid state, and, when the updated data page counter of that cache cluster exceeds the preset threshold, writing the new write request data back to the flash memory together with its cache cluster.
2. The caching method for reducing the write performance jitter of the solid-state disk storage system according to claim 1, wherein step S1 specifically includes:
when a new write request reaches the cache system, searching the cache system, according to the data page address of the new write request, for a cache cluster whose data page address range contains that address; if such a cluster exists, storing the new write request data in it; if not, applying for cache space in the cache system to form a new cache cluster, storing the new write request data in that cluster, constructing a target fingerprint library unit in the target fingerprint library, and storing in that unit the fingerprint information of all data pages in the flash memory data block corresponding to the cluster.
3. A cache system for reducing jitter in write performance of a solid state disk storage system, comprising:
an information storage module, configured to store the new write request data in the matching cache cluster when a new write request reaches the cache system;
an information matching module, configured to generate fingerprint information from the new write request data, match the fingerprint information against a target fingerprint library, and determine the operation type of the new write request from the matching result;
a dynamic operation module, configured to select, according to the operation type of the new write request, how the new write request data is processed when the cache cluster holding it is written back to the flash memory;
in the information matching module, the target fingerprint library comprises a plurality of target fingerprint library units, and each target fingerprint library unit stores the fingerprint information of all data pages contained in one flash memory data block;
the information matching module is specifically configured to:
generating fingerprint information from the new write request data and matching it against the target fingerprint library; if no fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library, judging the new write request to be a new write operation; if fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library and the two content-field fingerprints match, judging the new write request to be a repeated write operation; and if fingerprint information of a data page with the same logical address as the new write request is found in the target fingerprint library but the two content-field fingerprints do not match, judging the new write request to be an update operation;
the dynamic operation module is specifically configured to:
selecting, according to the operation type of the new write request, the processing of the new write request data when the cache cluster holding it is written back to the flash memory, as follows:
when the new write request is a new write operation, writing the new write request data back to the flash memory together with its cache cluster when that cluster is written back; when the new write request is a repeated write operation, discarding the new write request data when its cache cluster is written back to the flash memory; when the new write request is an update operation, incrementing the updated data page counter of the cache cluster holding the new write request data by one and setting the original flash memory data page corresponding to the new write request data page to the invalid state, and, when the updated data page counter of that cache cluster exceeds the preset threshold, writing the new write request data back to the flash memory together with its cache cluster.
4. The cache system for reducing write performance jitter of a solid state disk storage system of claim 3, wherein the information storage module is specifically configured to:
when a new write request reaches the cache system, searching the cache system, according to the data page address of the new write request, for a cache cluster whose data page address range contains that address; if such a cluster exists, storing the new write request data in it; if not, applying for cache space in the cache system to form a new cache cluster, storing the new write request data in that cluster, constructing a target fingerprint library unit in the target fingerprint library, and storing in that unit the fingerprint information of all data pages in the flash memory data block corresponding to the cluster.
CN201810294987.0A 2018-04-04 2018-04-04 Caching method and system for reducing jitter of writing performance of solid-state disk storage system Active CN108664217B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810294987.0A CN108664217B (en) 2018-04-04 2018-04-04 Caching method and system for reducing jitter of writing performance of solid-state disk storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810294987.0A CN108664217B (en) 2018-04-04 2018-04-04 Caching method and system for reducing jitter of writing performance of solid-state disk storage system

Publications (2)

Publication Number Publication Date
CN108664217A CN108664217A (en) 2018-10-16
CN108664217B true CN108664217B (en) 2021-07-13

Family

ID=63783079

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810294987.0A Active CN108664217B (en) 2018-04-04 2018-04-04 Caching method and system for reducing jitter of writing performance of solid-state disk storage system

Country Status (1)

Country Link
CN (1) CN108664217B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109491620B (en) * 2018-11-23 2020-08-14 柏科数据技术(深圳)股份有限公司 Storage data rewriting method, device, server and storage medium
CN111857578A (en) * 2020-06-30 2020-10-30 浪潮(北京)电子信息产业有限公司 Data information reading and writing method, device, equipment and storage medium
CN113485649B (en) * 2021-07-23 2023-03-24 天翼云科技有限公司 Data storage method, system, device, medium and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102646069A (en) * 2012-02-23 2012-08-22 华中科技大学 Method for prolonging service life of solid-state disk
CN103049397A (en) * 2012-12-20 2013-04-17 中国科学院上海微系统与信息技术研究所 Method and system for internal cache management of solid state disk based on novel memory
CN103678158A (en) * 2013-12-26 2014-03-26 中国科学院信息工程研究所 Optimization method and system for data layout
CN106293525A (en) * 2016-08-05 2017-01-04 上海交通大学 A kind of method and system improving caching service efficiency
CN106445413A (en) * 2012-12-12 2017-02-22 华为技术有限公司 Processing method and device for data in trunk system
CN106599146A (en) * 2016-12-06 2017-04-26 腾讯科技(深圳)有限公司 Cache page processing method and device and cache page update request processing method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9690706B2 (en) * 2015-03-25 2017-06-27 Intel Corporation Changing cache ownership in clustered multiprocessor
US10089228B2 (en) * 2016-05-09 2018-10-02 Dell Products L.P. I/O blender countermeasures

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102646069A (en) * 2012-02-23 2012-08-22 华中科技大学 Method for prolonging service life of solid-state disk
CN106445413A (en) * 2012-12-12 2017-02-22 华为技术有限公司 Processing method and device for data in trunk system
CN103049397A (en) * 2012-12-20 2013-04-17 中国科学院上海微系统与信息技术研究所 Method and system for internal cache management of solid state disk based on novel memory
CN103678158A (en) * 2013-12-26 2014-03-26 中国科学院信息工程研究所 Optimization method and system for data layout
CN106293525A (en) * 2016-08-05 2017-01-04 上海交通大学 A kind of method and system improving caching service efficiency
CN106599146A (en) * 2016-12-06 2017-04-26 腾讯科技(深圳)有限公司 Cache page processing method and device and cache page update request processing method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Hui Sun et al., "Exploring SSD Endurance Model based on Write Amplification and Temperature", 2016 Seventh International Green and Sustainable Computing Conference (IGSC), IEEE Computer Society, 2017, pp. 1-5. *

Also Published As

Publication number Publication date
CN108664217A (en) 2018-10-16

Similar Documents

Publication Publication Date Title
CN107066393B (en) Method for improving mapping information density in address mapping table
US10915475B2 (en) Methods and apparatus for variable size logical page management based on hot and cold data
US8935484B2 (en) Write-absorbing buffer for non-volatile memory
US9342458B2 (en) Cache allocation in a computerized system
US20180121351A1 (en) Storage system, storage management apparatus, storage device, hybrid storage apparatus, and storage management method
US9489239B2 (en) Systems and methods to manage tiered cache data storage
US20070094445A1 (en) Method to enable fast disk caching and efficient operations on solid state disks
CN105930282B (en) A kind of data cache method for NAND FLASH
CN103885728A (en) Magnetic disk cache system based on solid-state disk
US8572321B2 (en) Apparatus and method for segmented cache utilization
CN110968269A (en) SCM and SSD-based key value storage system and read-write request processing method
CN108664217B (en) Caching method and system for reducing jitter of writing performance of solid-state disk storage system
CN110674056B (en) Garbage recovery method and device
CN113254358A (en) Method and system for address table cache management
CN114185492B (en) Solid state disk garbage recycling method based on reinforcement learning
KR101026634B1 (en) A method of data storage for a hybrid flash memory
CN102650972A (en) Data storage method, device and system
CN110968527B (en) FTL provided caching
US6324633B1 (en) Division of memory into non-binary sized cache and non-cache areas
CN114115711B (en) Quick buffer storage system based on nonvolatile memory file system
CN112559384B (en) Dynamic partitioning method for hybrid solid-state disk based on nonvolatile memory
CN114610654A (en) Solid-state storage device and method for writing data into solid-state storage device
CN111796757B (en) Solid state disk cache region management method and device
Kwon et al. Fast responsive flash translation layer for smart devices
KR101353967B1 (en) Data process method for reading/writing data in non-volatile memory cache having ring structure

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant