WO2023245942A1 - SSD finite window data deduplication identification method and apparatus, and computer device - Google Patents

SSD finite window data deduplication identification method and apparatus, and computer device

Info

Publication number
WO2023245942A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
read
deduplication
write
command
Prior art date
Application number
PCT/CN2022/127731
Other languages
French (fr)
Chinese (zh)
Inventor
王猛
徐伟华
韩道静
Original Assignee
苏州忆联信息系统有限公司
Priority date
Filing date
Publication date
Application filed by 苏州忆联信息系统有限公司

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C16/00 Erasable programmable read-only memories
    • G11C16/02 Erasable programmable read-only memories electrically programmable
    • G11C16/04 Erasable programmable read-only memories electrically programmable using variable threshold transistors, e.g. FAMOS
    • G11C16/0483 Erasable programmable read-only memories electrically programmable using variable threshold transistors, e.g. FAMOS comprising cells having several storage transistors connected in series
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C16/00 Erasable programmable read-only memories
    • G11C16/02 Erasable programmable read-only memories electrically programmable
    • G11C16/06 Auxiliary circuits, e.g. for writing into memory
    • G11C16/10 Programming or data input circuits
    • G11C16/14 Circuits for erasing electrically, e.g. erase voltage switching circuits

Definitions

  • the present application relates to the technical field of solid state drives, and in particular to an SSD limited window data deduplication identification method, device, computer equipment and storage medium.
  • SSD solid-state drive
  • LBA logical address
  • L2P table logical to physical address mapping table
  • An SSD limited window data deduplication and identification method includes:
  • the command is a read command
  • the data reading is completed according to the normal path, the read command is divided into data segments, and the data summary is calculated for each segment of data;
  • the command is a write command, allocate a write buffer and receive data from the host, divide the write command into data segments, and calculate its data summary for each segment of data;
  • the corresponding read data segment logical address and write data segment logical address are sent to the write deduplication application module.
  • the write deduplication application module replaces writing the write data to NAND with a mapping table copy.
  • the step of dividing the read command into data segments, and calculating the data summary for each segment of data further includes:
  • Each data segment contains the read data segment logical address, data summary and data content;
  • the step further includes:
  • the step of further determining whether the data content of each data segment of the write command is consistent with the data content in the read data deduplication window further includes:
  • the write deduplication application module copies the mapping table of the read data segment logical address to the mapping table of the write data segment logical address to complete the data copy.
  • the SSD limited window data deduplication and identification device includes:
  • a first judgment module, used to obtain the command issued by the host and judge whether the command is a read command;
  • the read deduplication window module is used to, if the command is a read command, complete data reading according to the normal path, divide the read command into data segments, and calculate a data digest for each data segment; and to add the read data segment logical address, data digest and data content to the read deduplication queue, moving the read data deduplication window while keeping the number of records in the window constant;
  • the second judgment module is used to continue to judge whether the command is a write command if the command is not a read command;
  • the write deduplication identification module is used to, if the command is a write command, allocate a write buffer and receive data from the host, divide the write command into data segments, and calculate a data digest for each data segment;
  • a third judgment module is used to judge whether the data summary of each data segment of the write command is consistent with the data summary in the read data deduplication window;
  • the fourth judgment module is used to further judge, if the data digests are consistent, whether the data content of each data segment of the write command is consistent with the data content in the read data deduplication window; if the data content is also consistent, the corresponding read data segment logical address and write data segment logical address are sent to the write deduplication application module;
  • a write deduplication application module, which is used to replace writing the write data to NAND with a mapping table copy.
  • the read deduplication window module is also used to:
  • Each data segment contains the read data segment logical address, data summary and data content;
  • the read deduplication window module is also used to:
  • the fourth judgment module is also used to:
  • the write deduplication application module copies the mapping table of the read data segment logical address to the mapping table of the write data segment logical address to complete the data copy.
  • a computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the steps of any one of the above methods are implemented.
  • the SSD internally maintains a read data deduplication window of a certain depth, including the logical address, data summary and data content of the recently read data segment.
  • the SSD internally generates a data summary for the written data segment in real time, and compares the summary of the written data segment with the data summary in the current read data deduplication window.
  • the SSD can independently identify duplicate data within a limited window and then use methods such as mapping table copying to reduce data writing and improve SSD reliability. At the same time, the performance of intra-disk data copying is greatly improved.
  • Figure 1 is a schematic diagram of a typical SSD reading and writing process in traditional technology
  • Figure 2 is a schematic flow chart of host file data copying in traditional technology
  • FIG. 3 is a schematic flowchart of the SSD limited window data deduplication and identification method in one embodiment
  • Figure 4 is a schematic flowchart of an SSD limited window data deduplication and identification method in another embodiment
  • Figure 5 is a schematic diagram of a limited window deduplication module introduced in one embodiment
  • Figure 6 is a schematic diagram of read deduplication window maintenance in one embodiment
  • Figure 7 is a schematic diagram of the specific implementation of the SSD limited window data deduplication and identification method in one embodiment
  • Figure 8 is a structural block diagram of an SSD limited window data deduplication and identification device in one embodiment
  • Figure 9 is an internal structure diagram of a computer device in one embodiment.
  • the host sends commands to the SSD hardware module; the SSD hardware module (PCIe/NVMe), after receiving a command, passes it to the firmware module for processing; the SSD firmware front-end module divides the command into mapping units (typically 4KB), submits the operation request to the buffer management module, and allocates a read/write buffer; if it is a write command, data transmission with the host is established based on the allocated buffer, and after the data transmission completes the host is notified that the command is completed; if it is a read command, the operation request is submitted to the mapping table management module; the mapping table management module is responsible for allocating the corresponding physical address according to the logical address (write command) or converting the logical address into a NAND physical address (read command); the operation request is submitted to the back-end module, which initiates a NAND read/write request according to the physical address and waits for the NAND read/write operation to complete; if it is a read command,
  • mapping units typically 4KB
  • SSD converts LBA to LPA; queries the physical address of the data to be read; initiates reading of the corresponding physical address.
  • S3 SSD returns data to the host.
  • S5 Convert LBA to LPA; allocate the physical address of the data to be written and update the mapping table; initiate writing of the corresponding physical address.
  • the source data needs to be read out from the SSD/NAND in sequence, transmitted to the host memory through the bus (PCIe), and then transmitted to the SSD through the bus, and then written to the NAND.
  • PCIe bus
  • this application proposes an SSD limited window data deduplication and identification method, which aims to identify the characteristics of this type of repeated data, so as to optimize the performance of the SSD and reduce the amount of writing to the SSD.
  • an SSD limited window data deduplication and identification method which method includes:
  • Step 302 Obtain the command issued by the host and determine whether the command is a read command
  • Step 304 if the command is a read command, complete the data reading according to the normal path, divide the read command into data segments, and calculate its data summary for each segment of data;
  • Step 306 Add the read data segment logical address, data summary and data content to the read deduplication queue, move the read data deduplication window and keep the number of records in the window constant;
  • Step 308 If the command is not a read command, continue to determine whether the command is a write command;
  • Step 310 if the command is a write command, allocate a write buffer and receive data from the host, divide the write command into data segments, and calculate its data summary for each segment of data;
  • Step 312 determine whether the data digest of each data segment of the write command is consistent with the data digest in the read data deduplication window
  • Step 314 if the data digests are consistent, further determine whether the data content of each data segment of the write command is consistent with the data content in the read data deduplication window;
  • Step 316 if the data content is also consistent, the corresponding read data segment logical address and write data segment logical address are sent to the write deduplication application module.
  • the write deduplication application module replaces writing the write data to NAND with a mapping table copy.
  • a method for deduplication and identification of SSD limited window data which can effectively identify data duplication in such limited command window scenarios, providing feasibility for further optimizing performance and improving lifespan. Specifically, it includes the following steps:
  • a deduplication module is added to the command processing path, which is subdivided into three sub-modules: read deduplication window, write deduplication identification, and write deduplication application:
  • the read deduplication window is used to maintain the information of the latest read data segment inside the SSD: logical address, data summary, and data content to match the newly written data segment.
  • Write deduplication identification is used to generate a data digest for each data segment newly written by the host and compare it with the records saved in the read deduplication window queue to determine whether the digests are the same; where the digests are the same, the data itself is further compared to confirm whether it is completely consistent; for write commands whose digest and data content both match, the logical address of the write data segment and the logical address of the hit read data segment are sent to the write deduplication application module.
  • the write deduplication application module is used to copy the mapping table of the read logical address to the mapping table of the write logical address according to the received logical addresses of the write data segment and the read data segment to complete the data copy.
  • the SSD internally maintains a read data deduplication window of a certain depth, including the logical address, data summary and data content of the recently read data segment.
  • the SSD internally generates a data summary for the written data segment in real time, and compares the summary of the written data segment with the data summary in the current read data deduplication window.
  • the SSD can independently identify duplicate data within a limited window and then use methods such as mapping table copying to reduce data writing and improve SSD reliability. At the same time, the performance of intra-disk data copying is greatly improved.
  • an SSD limited window data deduplication and identification method is provided.
  • the read command is divided into data segments, and the step of calculating the data summary for each segment of data also includes:
  • Step 402 Divide the read command into multiple data segments according to a certain size. Each data segment contains the read data segment logical address, data summary and data content;
  • Step 404 Maintain the read data segments recently accessed by the host from far to near according to time to form a read deduplication queue
  • Step 406 In the read data deduplication window, only a certain number of recently accessed read data segments are kept, and further records are discarded to reduce resource overhead and hit check costs.
  • the read command is first divided into data segments of a certain size (configurable, for example 4KB, 8KB, etc.), and each data segment contains the following information:
  • Data digest: the digest corresponding to the data segment; common algorithms such as SHA/MD5... can be used, for coarse comparison (fast, but with a certain probability of misjudgment);
  • Data content: the original content of the data segment, used for exact comparison (slower).
  • the read data segments accessed by the latest host are maintained to form a read deduplication queue.
  • N pieces of data are generally read and N pieces of data are written, so the write command will basically hit the latest N data segments of the read command.
  • the size of the read deduplication window (number of data segments) can be adjusted to improve efficiency.
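The effect of the window size can be illustrated with a small simulation. The sketch below is not taken from the application: the function name `copy_hits`, the 4KB segments, the use of SHA-256, and the "read all N segments, then write the same N" access pattern are all assumptions made for illustration.

```python
import hashlib
from collections import deque


def copy_hits(n_segments, depth):
    """Read n distinct segments, then write the same n; count digest hits.

    Hypothetical helper: models a copy where all reads precede all writes.
    """
    # Build n distinct 4KB segments (4-byte counter repeated 1024 times).
    segments = [i.to_bytes(4, "big") * 1024 for i in range(n_segments)]

    # Read phase: the window keeps only the `depth` most recent digests.
    window = deque(maxlen=depth)
    for seg in segments:
        window.append(hashlib.sha256(seg).digest())

    # Write phase: a write hits if its digest is still inside the window.
    in_window = set(window)
    return sum(hashlib.sha256(seg).digest() in in_window for seg in segments)


hits_full = copy_hits(8, 8)    # window as deep as the copy
hits_small = copy_hits(8, 4)   # window shallower than the copy
```

Under these assumptions a window at least as deep as the copy catches every write, while a shallower window only catches the most recently read segments, which is why the depth is worth tuning against resource overhead.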
  • an SSD limited window data deduplication and identification method is provided.
  • the method also includes: if the data digests are consistent, XOR the data content of the write data segment with the data content of the hit read data segment and determine whether the result is 0; if the result is 0, the data is judged to be duplicated, and the logical address of the write data segment and the logical address of the hit read data segment are sent to the write deduplication application module.
  • the write deduplication application module copies the mapping table of the read data segment logical address to the mapping table of the write data segment logical address to complete the data copy.
  • the digest of the written data segment is compared with the data digests in the current read data deduplication window. If a digest matches, the corresponding read and write data are XORed; if the result is 0, the data is confirmed to be duplicated.
  • Step 7.1 Obtain new commands from the host.
  • Step 7.2 Determine whether it is a read command; if so, continue to step 7.3; if not, jump to step 7.7.
  • Step 7.3 Complete data reading according to the normal path.
  • Step 7.4 Divide the read command into data segments, and calculate the data summary for each segment of data.
  • Step 7.5 Add (data segment logical address, data summary, data content) to the read deduplication queue, and move the read data deduplication window to keep the number of records in the window constant.
  • Step 7.7 Determine whether it is a write command; if so, proceed to step 7.8; if not, proceed according to the normal path.
  • Step 7.8 Allocate write buffer and receive data from host.
  • Step 7.9 Divide the write command into data segments and calculate its data summary for each segment of data.
  • Step 7.10 Compare the data summary of each data segment of the write command with the data summary in the read data deduplication window.
  • Step 7.11 determine whether they are equal; if yes, proceed to step 7.12; if not, write the data to NAND according to the normal path.
  • Step 7.12 XOR the content of the write data segment with the data content of the hit read data segment.
  • Step 7.13 Determine whether the result is 0; if so, proceed to step 7.14; if not, write the data to NAND according to the normal path.
  • Step 7.14 Data is repeated and the corresponding read data segment logical address and write data segment logical address are sent to the write deduplication application module.
  • Step 7.15 Based on the logical addresses of the read and write data segments, the write deduplication application module copies the mapping table instead of writing the data into NAND.
  • Step 7.16 the command is completed.
  • an SSD limited window data deduplication identification device 800 which device includes:
  • the first judgment module 801 is used to obtain the command issued by the host and judge whether the command is a read command
  • the read deduplication window module 802 is used to, if the command is a read command, complete data reading according to the normal path, divide the read command into data segments, and calculate a data digest for each data segment; and to add the read data segment logical address, data digest and data content to the read deduplication queue, moving the read data deduplication window while keeping the number of records in the window constant;
  • the second judgment module 803 is used to continue to judge whether the command is a write command if the command is not a read command;
  • the write deduplication identification module is used to, if the command is a write command, allocate a write buffer and receive data from the host, divide the write command into data segments, and calculate a data digest for each data segment;
  • the third judgment module 805 is used to judge whether the data digest of each data segment of the write command is consistent with the data digest in the read data deduplication window;
  • the fourth judgment module 806 is used to further judge, if the data digests are consistent, whether the data content of each data segment of the write command is consistent with the data content in the read data deduplication window; if the data content is also consistent, the corresponding read data segment logical address and write data segment logical address are sent to the write deduplication application module;
  • the write deduplication application module is used to replace writing the write data to NAND with a mapping table copy.
  • the read deduplication window module 802 is also used to:
  • Each data segment contains the read data segment logical address, data summary and data content;
  • the read deduplication window module 802 is also used to:
  • the fourth judgment module 806 is also used to:
  • the write deduplication application module copies the mapping table of the read data segment logical address to the mapping table of the write data segment logical address to complete the data copy.
  • a computer device is provided, the internal structure diagram of which can be shown in Figure 9 .
  • the computer device includes a processor, memory, and network interface connected through a system bus. Wherein, the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes non-volatile storage media and internal memory.
  • the non-volatile storage medium stores operating systems, computer programs and databases. This internal memory provides an environment for the execution of operating systems and computer programs in non-volatile storage media.
  • the network interface of the computer device is used to communicate with external terminals through a network connection.
  • the computer program is executed by a processor to implement an SSD limited window data deduplication and identification method.
  • Figure 9 is only a block diagram of a partial structure related to the solution of the present application, and does not constitute a limitation on the computer equipment to which the solution of the present application is applied.
  • A specific computer device may include more or fewer parts than shown, combine certain parts, or have a different arrangement of parts.
  • a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor.
  • When the processor executes the computer program, the steps in each of the above method embodiments are implemented.
  • a computer-readable storage medium on which a computer program is stored.
  • When the computer program is executed by a processor, the steps in each of the above method embodiments are implemented.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
  • SRAM static RAM
  • DRAM dynamic RAM
  • SDRAM synchronous DRAM
  • DDRSDRAM double data rate SDRAM
  • ESDRAM enhanced SDRAM
  • SLDRAM Synchlink DRAM
  • RDRAM Rambus direct RAM
  • DRDRAM direct Rambus dynamic RAM
  • RDRAM Rambus dynamic RAM

Abstract

An SSD finite window data deduplication identification method and apparatus, a computer device, and a storage medium. The method comprises: if a command is a write command, allocating a write buffer area and receiving data from a host, dividing the write command according to data segments, and calculating a data abstract for each data segment; determining whether the data abstract for each data segment of the write command is consistent with the data abstract in a read data deduplication window; if yes, further determining whether the data content of each data segment of the write command is consistent with the data content in the read data deduplication window; and if yes, sending a corresponding read data segment logical address and a write data segment logical address to a write deduplication application module, so that the write deduplication application module replaces the manner of writing write data to NAND with a mapping table copying manner. The present application can reduce data writing, improve the SSD reliability, and greatly improve the performance of on-disk data copying.

Description

SSD limited window data deduplication identification method, device and computer equipment
This application claims priority to Chinese patent application No. 202210731384.9, filed with the China Patent Office on June 24, 2022 and entitled "SSD limited window data deduplication identification method, device and computer equipment", the entire contents of which are incorporated into this application by reference.
Technical field
The present application relates to the technical field of solid-state drives, and in particular to an SSD limited window data deduplication identification method, device, computer equipment and storage medium.
Background
With the development of solid-state drive technology, SSDs (solid-state drives) have come into wide use. In the PC market they have gradually replaced traditional HDDs, giving users a better experience in terms of reliability and performance. An SSD uses NAND internally as its storage medium. The host accesses the SSD by logical address (LBA), while the SSD internally maintains a logical-to-physical address mapping table (L2P table) that indicates the physical address at which the data of each logical address is stored. When data is written, a physical storage address is allocated for the logical address written by the host, and the corresponding L2P table entry is updated. When data is read, the physical address is looked up in the L2P table by logical address, and the data is then read and returned to the host.
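The write and read behaviour of the L2P table described above can be sketched in a few lines. This is a toy model, not the firmware of any real drive: the class name `L2PTable`, the dictionary standing in for NAND, and the append-only allocator are all assumptions introduced for illustration.

```python
class L2PTable:
    """Toy logical-to-physical map; addresses are plain integers."""

    def __init__(self):
        self.l2p = {}        # logical address -> physical address
        self.nand = {}       # physical address -> bytes (stand-in for NAND)
        self.next_free = 0   # hypothetical append-only allocator

    def write(self, lba, data):
        ppa = self.next_free       # allocate a fresh physical address
        self.next_free += 1
        self.nand[ppa] = data      # store the data at that address
        self.l2p[lba] = ppa        # update the mapping entry
        return ppa

    def read(self, lba):
        ppa = self.l2p[lba]        # look up the physical address
        return self.nand[ppa]      # read the data and return it


table = L2PTable()
table.write(100, b"hello")
table.write(100, b"world")   # an overwrite lands at a new physical address
```

In this sketch an overwrite never modifies the old physical location; it only redirects the mapping entry, which is the property the mapping table copy described later relies on.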
At present, existing SSDs hold a large amount of duplicate data: the logical and physical addresses differ but the data content is identical, and storing it consumes additional NAND erase/write life. When a host copies a file in the typical way, the data must be read from the source logical address to the host and then written to a new logical address (the destination address), so a large amount of duplicate data appears within a limited time window. In this process data is read from and written to NAND multiple times and crosses the bus multiple times, which severely constrains performance. Furthermore, writing such duplicate data consumes additional NAND erase/write cycles and therefore SSD life.
Technical problem
Based on this, it is necessary to provide an SSD limited window data deduplication identification method, device, computer equipment and storage medium to address the above technical problems.
Technical solution
An SSD limited window data deduplication identification method, the method including:
obtaining the command issued by the host and determining whether the command is a read command;
if the command is a read command, completing the data reading according to the normal path, dividing the read command into data segments, and calculating a data digest for each data segment;
adding the read data segment logical address, data digest and data content to the read deduplication queue, and moving the read data deduplication window while keeping the number of records in the window constant;
if the command is not a read command, continuing to determine whether the command is a write command;
if the command is a write command, allocating a write buffer and receiving data from the host, dividing the write command into data segments, and calculating a data digest for each data segment;
determining whether the data digest of each data segment of the write command is consistent with a data digest in the read data deduplication window;
if the data digests are consistent, further determining whether the data content of each data segment of the write command is consistent with the data content in the read data deduplication window;
if the data content is also consistent, sending the corresponding read data segment logical address and write data segment logical address to the write deduplication application module, which replaces writing the write data to NAND with a mapping table copy.
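The decision path of the method can be sketched end to end as follows. This is a minimal model under stated assumptions, not the claimed implementation: the class `DedupSsd`, the one-segment-per-command simplification, the SHA-256 digest, the list standing in for NAND, and the return labels are hypothetical. The step that forgets window records for an overwritten address is an extra safeguard added for this sketch and is not described in the source.

```python
import hashlib
from collections import deque


class DedupSsd:
    """Toy model: each command carries exactly one data segment."""

    def __init__(self, depth=8):
        self.l2p = {}                       # logical -> physical address
        self.nand = []                      # list index = physical address
        self.window = deque(maxlen=depth)   # (lba, digest, content) records

    def _program(self, lba, content):
        self.nand.append(content)           # normal write path to "NAND"
        self.l2p[lba] = len(self.nand) - 1

    def handle(self, op, lba, content=None):
        if op == "read":
            data = self.nand[self.l2p[lba]]             # normal read path
            digest = hashlib.sha256(data).digest()
            self.window.append((lba, digest, data))     # slide the window
            return data
        if op == "write":
            digest = hashlib.sha256(content).digest()
            # Digest comparison first, then exact content comparison.
            hit = next((r for r in self.window
                        if r[1] == digest and r[2] == content), None)
            # Sketch-only safeguard: drop records for the overwritten address.
            self.window = deque((r for r in self.window if r[0] != lba),
                                maxlen=self.window.maxlen)
            if hit is not None:
                self.l2p[lba] = self.l2p[hit[0]]        # mapping table copy
                return "dedup"
            self._program(lba, content)
            return "written"
        return "other"                                  # neither read nor write


ssd = DedupSsd()
first = ssd.handle("write", 1, b"A" * 4096)    # window empty: normal write
ssd.handle("read", 1)                          # segment enters the window
second = ssd.handle("write", 2, b"A" * 4096)   # same content: mapping copy
third = ssd.handle("write", 3, b"B" * 4096)    # new content: normal write
```

After this sequence only two segments have been stored, and logical addresses 1 and 2 share one physical location.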
In one embodiment, the step of dividing the read command into data segments and calculating a data digest for each data segment further includes:
dividing the read command into multiple data segments of a certain size, each data segment containing the read data segment logical address, data digest and data content;
maintaining the read data segments most recently accessed by the host, ordered from oldest to newest, to form the read deduplication queue.
In one embodiment, after the step of maintaining the read data segments most recently accessed by the host, ordered from oldest to newest, to form the read deduplication queue, the method further includes:
keeping only a certain number of the most recently accessed read data segments in the read data deduplication window and discarding older records, to reduce resource overhead and the cost of hit checking.
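A bounded queue is one straightforward way to realize this window. The sketch below is illustrative only: `split_read`, the 4KB segment size, the window depth of 4, the use of SHA-256 as the digest, and logical addresses counted in segment units are assumptions, not details from the application.

```python
import hashlib
from collections import deque

SEGMENT_SIZE = 4096   # assumed segment size
WINDOW_DEPTH = 4      # assumed depth, kept small for illustration


def split_read(start_lba, payload, segment_size=SEGMENT_SIZE):
    """Split a read payload into (logical address, digest, content) records."""
    records = []
    for offset in range(0, len(payload), segment_size):
        content = payload[offset:offset + segment_size]
        digest = hashlib.sha256(content).digest()
        records.append((start_lba + offset // segment_size, digest, content))
    return records


# deque(maxlen=...) discards the oldest record when a new one is appended,
# so the queue stays ordered oldest to newest at a constant record count.
window = deque(maxlen=WINDOW_DEPTH)

# A six-segment read: segment n is filled with the byte value n.
payload = b"".join(bytes([n]) * SEGMENT_SIZE for n in range(6))
for record in split_read(0, payload):
    window.append(record)

kept_addresses = [record[0] for record in window]
```

With a depth of 4 and six segments read, the two oldest records fall out of the window and only the four most recent remain available for matching.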
In one embodiment, the step of further determining, if the data digests are consistent, whether the data content of each data segment of the write command is consistent with the data content in the read data deduplication window further includes:
if the data digests are consistent, XORing the data content of the write data segment with the data content of the hit read data segment and determining whether the result is 0;
if the result is 0, judging that the data is duplicated and sending the write data segment logical address and the hit read data segment logical address to the write deduplication application module, which copies the mapping table of the read data segment logical address to the mapping table of the write data segment logical address to complete the data copy.
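The XOR check and the mapping table copy can be sketched as two small functions. The names `xor_is_zero` and `apply_write_dedup`, the dictionary used as the mapping table, and the example addresses are hypothetical; a hardware implementation would likely XOR the buffers word by word rather than as one large integer.

```python
def xor_is_zero(a: bytes, b: bytes) -> bool:
    """Return True when two equal-length buffers XOR to all zero bits."""
    if len(a) != len(b):
        return False
    # Interpret each buffer as one big integer and XOR them in one step.
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big") == 0


def apply_write_dedup(l2p: dict, read_lba: int, write_lba: int) -> None:
    """Point the write address at the physical location of the read address."""
    l2p[write_lba] = l2p[read_lba]


# Hypothetical state: logical address 10 was read and maps to physical 700.
l2p = {10: 700}
read_content = b"\xAB" * 4096
write_content = bytes(read_content)   # host writes identical data to LBA 20

if xor_is_zero(write_content, read_content):
    apply_write_dedup(l2p, read_lba=10, write_lba=20)
```

The exact comparison matters because the digest alone can collide; only a zero XOR result confirms that the two segments are identical before the mapping entry is copied.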
An SSD limited-window data deduplication identification apparatus, the apparatus including:
a first judgment module, configured to obtain a command issued by the host and determine whether the command is a read command;
a read deduplication window module, configured to, if the command is a read command, complete the data read along the normal path, divide the read command into data segments, and calculate a data digest for each data segment; and to add the read data segment logical address, the data digest, and the data content to a read deduplication queue, move the read data deduplication window, and keep the number of records in the window constant;
a second judgment module, configured to, if the command is not a read command, further determine whether the command is a write command;
a write deduplication identification module, configured to, if the command is a write command, allocate a write buffer and receive data from the host, divide the write command into data segments, and calculate a data digest for each data segment;
a third judgment module, configured to determine whether the data digest of each data segment of the write command is consistent with a data digest in the read data deduplication window;
a fourth judgment module, configured to, if the data digests are consistent, further determine whether the data content of each data segment of the write command is consistent with the data content in the read data deduplication window, and, if the data content is also consistent, send the corresponding read data segment logical address and write data segment logical address to a write deduplication application module;
the write deduplication application module, configured to copy the mapping table instead of writing the write data to NAND.
In one embodiment, the read deduplication window module is further configured to:
divide the read command into multiple data segments of a certain size, each data segment including a read data segment logical address, a data digest, and data content;
maintain the read data segments most recently accessed by the host in chronological order, from oldest to newest, to form a read deduplication queue.
In one embodiment, the read deduplication window module is further configured to:
retain, in the read data deduplication window, only a certain number of the most recently accessed read data segments, and discard older records to reduce resource overhead and the cost of hit checking.
In one embodiment, the fourth judgment module is further configured to:
if the data digests are consistent, XOR the data content of the write data segment with the data content of the hit read data segment and determine whether the result is 0;
if the result is 0, determine that the data is duplicated, and send the logical address of the write data segment and the logical address of the hit read data segment to the write deduplication application module, which copies the mapping table of the read data segment logical address to the mapping table of the write data segment logical address to complete the data copy.
A computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of any one of the above methods when executing the computer program.
A computer-readable storage medium storing a computer program, wherein the computer program implements the steps of any one of the above methods when executed by a processor.
Beneficial Effects
In the above SSD limited-window data deduplication identification method, apparatus, computer device, and storage medium, the SSD internally maintains a read data deduplication window of a certain depth that holds the logical address, data digest, and data content of recently read data segments. When the host writes data, the SSD generates a data digest for each written data segment in real time and compares it with the data digests in the current read data deduplication window. When the digests match, the corresponding data content is compared as well. This allows the SSD to identify duplicate data within a limited window on its own, so that methods such as mapping table copying can be used to reduce data writes, improve SSD reliability, and greatly improve the performance of data copies within the drive.
Description of the Drawings
Figure 1 is a schematic diagram of a typical SSD read/write flow in the conventional art;
Figure 2 is a schematic flowchart of a host file data copy in the conventional art;
Figure 3 is a schematic flowchart of the SSD limited-window data deduplication identification method in one embodiment;
Figure 4 is a schematic flowchart of the SSD limited-window data deduplication identification method in another embodiment;
Figure 5 is a schematic diagram of the limited-window deduplication module introduced in one embodiment;
Figure 6 is a schematic diagram of read deduplication window maintenance in one embodiment;
Figure 7 is a schematic diagram of a specific execution of the SSD limited-window data deduplication identification method in one embodiment;
Figure 8 is a structural block diagram of the SSD limited-window data deduplication identification apparatus in one embodiment;
Figure 9 is an internal structure diagram of a computer device in one embodiment.
Embodiments of the Invention
To make the purpose, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the present application and do not limit it.
Referring to the typical SSD read/write flow shown in Figure 1, the process is as follows. The host sends a command to the SSD hardware module. The SSD hardware module (PCIe/NVMe) receives the command and hands it to the firmware for processing. The SSD firmware front-end module splits the command into mapping units (typically 4 KB) and submits an operation request to the buffer management module, which allocates read/write buffers. For a write command, data transfer with the host is established using the allocated buffer, and the host is notified that the command is complete once the transfer finishes. For a read command, an operation request is submitted to the mapping table management module. The mapping table management module allocates a physical address for the logical address (write command) or translates the logical address into a NAND physical address (read command). An operation request is then submitted to the back-end module, which issues a NAND read/write request for the physical address and waits for that request to complete. For a read command, once the data is ready, the transfer of data from the NAND cache register to the host is started.
Referring to the host file data copy flow shown in Figure 2, the process is as follows:
S0: Obtain the logical addresses of the source data and destination data to be copied.
S1: The host reads source data segment 1.
S2: The SSD converts the LBA to an LPA, looks up the physical address of the data to be read, and issues a read of that physical address.
S3: The SSD returns the data to the host.
S4: The host writes data segment 1 to the destination address.
S5: The SSD converts the LBA to an LPA, allocates a physical address for the data to be written and updates the mapping table, and issues a write to that physical address.
Repeat S1-S5 until all data segments have been copied.
In this process, the source data must be read from the SSD/NAND piece by piece, transferred over the bus (PCIe) to host memory, transferred back over the bus to the SSD, and then written to NAND. The many operations involved severely limit data copy performance, and writing the duplicate data to NAND consumes SSD lifetime.
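The cost of the host-driven copy described above can be made concrete with a small calculation. The following is only an illustrative sketch: the function name `conventional_copy_cost`, the 4 KB segment size, and the simplification of one NAND read and one NAND program per segment are assumptions, not figures from the source.

```python
def conventional_copy_cost(num_segments: int, segment_bytes: int = 4096) -> dict:
    """Count the work of a host-driven copy (steps S1-S5 repeated per segment).

    Assumes, for illustration, exactly one NAND read and one NAND program
    per segment and ignores caching and write amplification.
    """
    return {
        "nand_reads": num_segments,    # S2: one NAND read per segment
        "nand_writes": num_segments,   # S5: one NAND program per segment
        # S3 moves each segment to the host, S4 moves it back to the SSD.
        "bus_bytes": 2 * num_segments * segment_bytes,
    }


# Copying 256 segments of 4 KB (1 MiB of file data).
print(conventional_copy_cost(256))
```

Under these assumptions every copied byte crosses the bus twice and is programmed to NAND once, which is the overhead the proposed method aims to avoid.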
On this basis, the present application proposes an SSD limited-window data deduplication identification method that identifies this kind of duplicate data in order to optimize SSD performance and reduce the amount of data written to the SSD.
In one embodiment, as shown in Figure 3, an SSD limited-window data deduplication identification method is provided, the method including:
Step 302: obtain a command issued by the host and determine whether the command is a read command;
Step 304: if the command is a read command, complete the data read along the normal path, divide the read command into data segments, and calculate a data digest for each data segment;
Step 306: add the read data segment logical address, the data digest, and the data content to the read deduplication queue, move the read data deduplication window, and keep the number of records in the window constant;
Step 308: if the command is not a read command, further determine whether the command is a write command;
Step 310: if the command is a write command, allocate a write buffer and receive data from the host, divide the write command into data segments, and calculate a data digest for each data segment;
Step 312: determine whether the data digest of each data segment of the write command is consistent with a data digest in the read data deduplication window;
Step 314: if the data digests are consistent, further determine whether the data content of each data segment of the write command is consistent with the data content in the read data deduplication window;
Step 316: if the data content is also consistent, send the corresponding read data segment logical address and write data segment logical address to the write deduplication application module, which copies the mapping table instead of writing the write data to NAND.
This embodiment provides an SSD limited-window data deduplication identification method that can effectively identify duplicate data within such a limited command window, making further performance optimization and lifetime improvement feasible. It includes the following steps:
First, a command issued by the host is obtained and it is determined whether the command is a read command. If the command is a read command, the data read is completed along the normal path, the read command is divided into data segments, and a data digest is calculated for each data segment.
Refer to the schematic diagram of the limited-window deduplication module shown in Figure 5. A deduplication module is added to the command processing path and is subdivided into three submodules: read deduplication window, write deduplication identification, and write deduplication application.
The read deduplication window maintains information about the data segments most recently read inside the SSD (logical address, data digest, and data content) for matching against newly written data segments.
Write deduplication identification generates a data digest for each data segment newly written by the host and compares it with the records stored in the read deduplication window queue to determine whether the digests are the same. If the digests are the same, the data itself is compared to confirm that it is fully identical. For a write command whose digest and data content both match, the logical address of the write and the logical address of the hit read data segment are sent to the write deduplication application module.
The write deduplication application module, based on the received logical addresses of the write data segment and the read data segment, copies the mapping table of the read logical address to the mapping table of the write logical address to complete the data copy.
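The mapping table copy performed by the write deduplication application module can be sketched as follows. This is a minimal illustration that assumes the mapping table is a flat dictionary from logical segment address to physical address and that "copying the mapping table" means copying the entry for one logical address. The name `apply_write_dedup` is hypothetical, and any bookkeeping for physical locations shared by several logical addresses (for example during garbage collection) is not described in the source and is omitted here.

```python
def apply_write_dedup(l2p: dict, read_lba: int, write_lba: int) -> None:
    """Point write_lba at the physical location already holding the data.

    No NAND program is issued: only the logical-to-physical entry changes.
    """
    l2p[write_lba] = l2p[read_lba]


# Logical address 10 was read earlier and lives at physical address 7.
mapping = {10: 7}
apply_write_dedup(mapping, read_lba=10, write_lba=20)
print(mapping)  # both logical addresses now share physical address 7
```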
In this embodiment, the SSD internally maintains a read data deduplication window of a certain depth that holds the logical address, data digest, and data content of recently read data segments. When the host writes data, the SSD generates a data digest for each written data segment in real time and compares it with the data digests in the current read data deduplication window. When the digests match, the corresponding data content is compared as well. This allows the SSD to identify duplicate data within a limited window on its own, so that methods such as mapping table copying can be used to reduce data writes, improve SSD reliability, and greatly improve the performance of data copies within the drive.
In one embodiment, as shown in Figure 4, an SSD limited-window data deduplication identification method is provided, in which the step of dividing the read command into data segments and calculating a data digest for each data segment further includes:
Step 402: divide the read command into multiple data segments of a certain size, each data segment including a read data segment logical address, a data digest, and data content;
Step 404: maintain the read data segments most recently accessed by the host in chronological order, from oldest to newest, to form a read deduplication queue;
Step 406: in the read data deduplication window, retain only a certain number of the most recently accessed read data segments and discard older records to reduce resource overhead and the cost of hit checking.
Refer to the schematic diagram of read deduplication window maintenance shown in Figure 6. Specifically, the read command is first divided into data segments of a certain size (configurable, for example 4 KB, 8 KB, and so on), and each data segment carries the following information:
Read data segment logical address: the starting logical address (LBA) of the data segment;
Data digest: the digest of the data segment, which can be computed with a common algorithm such as SHA or MD5 and is used for coarse comparison (fast, but with some probability of a false match);
Data content: the original content of the data segment, used for exact comparison (slower).
Then, the read data segments most recently accessed by the host are maintained in chronological order, from oldest to newest, to form the read deduplication queue.
Finally, the read data deduplication window retains only a certain number of the most recently accessed read data segments (the number is configurable, for example 4, 8, or 16). Older records are discarded, which reduces resource overhead and the cost of hit checking.
In a typical data copy scenario, the host generally reads N pieces of data and then writes N pieces of data, so a write command will usually hit one of the data segments of the N most recent read commands. The size of the read deduplication window (the number of data segments) can be tuned according to N to improve efficiency.
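The window maintenance described above can be sketched with a bounded queue. This is a minimal illustration under stated assumptions: the names `ReadRecord` and `ReadDedupWindow` are hypothetical, SHA-256 is chosen as the digest only because the text mentions SHA as one option, and a depth of 4 is one of the example sizes given.

```python
import hashlib
from collections import deque
from typing import NamedTuple


class ReadRecord(NamedTuple):
    lba: int       # starting logical address of the read data segment
    digest: bytes  # data digest, used for the coarse comparison
    data: bytes    # original content, used for the exact comparison


class ReadDedupWindow:
    def __init__(self, depth: int = 4):
        # Oldest record on the left, newest on the right. A full deque with
        # maxlen drops its oldest item on append, keeping the count constant.
        self._records = deque(maxlen=depth)

    def add(self, lba: int, data: bytes) -> None:
        self._records.append(ReadRecord(lba, hashlib.sha256(data).digest(), data))

    def lbas(self) -> list:
        return [r.lba for r in self._records]


window = ReadDedupWindow(depth=4)
for lba in range(6):
    window.add(lba, bytes([lba]) * 16)
print(window.lbas())  # the two oldest records have been discarded
```

Using a fixed-length queue keeps both memory use and the number of digest comparisons per write bounded by the window depth.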
In one embodiment, an SSD limited-window data deduplication identification method is provided, the method further including: if the data digests are consistent, XORing the data content of the write data segment with the data content of the hit read data segment and determining whether the result is 0; if the result is 0, determining that the data is duplicated, and sending the logical address of the write data segment and the logical address of the hit read data segment to the write deduplication application module, which copies the mapping table of the read data segment logical address to the mapping table of the write data segment logical address to complete the data copy.
In this embodiment, the digest of the written data segment is first compared with the data digests in the current read data deduplication window. If a digest matches, the corresponding read and write data are then XORed, and if the result is 0 the data is confirmed to be duplicated.
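The two-stage check (digest first, then XOR of the contents) might look like the sketch below. The function name `is_duplicate` is hypothetical, SHA-256 stands in for whichever digest algorithm an implementation chooses, and the explicit length check is an added precaution of this sketch.

```python
import hashlib


def is_duplicate(write_data: bytes, read_digest: bytes, read_data: bytes) -> bool:
    """Return True only if the digest matches and the XOR of the contents is 0."""
    # Stage 1: coarse comparison on the digest (fast, may give a false match).
    if hashlib.sha256(write_data).digest() != read_digest:
        return False
    # Stage 2: exact comparison. Equal-length buffers are identical exactly
    # when their bitwise XOR is zero.
    if len(write_data) != len(read_data):
        return False
    xor = int.from_bytes(write_data, "big") ^ int.from_bytes(read_data, "big")
    return xor == 0


segment = b"\x5a" * 4096
print(is_duplicate(segment, hashlib.sha256(segment).digest(), segment))
```

The second stage is what guards against a digest collision: a record whose digest happens to match but whose content differs yields a non-zero XOR and is rejected.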
A specific example is described below. Referring to the execution flow shown in Figure 7, the method includes:
Step 7.1: Obtain a new command from the host.
Step 7.2: Determine whether it is a read command. If so, continue to step 7.3; if not, go to step 7.7.
Step 7.3: Complete the data read along the normal path.
Step 7.4: Divide the read command into data segments and calculate a data digest for each data segment.
Step 7.5: Add (data segment logical address, data digest, data content) to the read deduplication queue and move the read data deduplication window, keeping the number of records in the window constant.
Step 7.6: The command is complete.
Step 7.7: Determine whether it is a write command. If so, continue to step 7.8; if not, process the command along the normal path.
Step 7.8: Allocate a write buffer and receive data from the host.
Step 7.9: Divide the write command into data segments and calculate a data digest for each data segment.
Step 7.10: Compare the data digest of each data segment of the write command with the data digests in the read data deduplication window.
Step 7.11: Determine whether they are equal. If so, continue to step 7.12; if not, write the data to NAND along the normal path.
Step 7.12: XOR the content of the write data segment with the data content of the hit read data segment.
Step 7.13: Determine whether the result is 0. If so, continue to step 7.14; if not, write the data to NAND along the normal path.
Step 7.14: The data is duplicated; send the corresponding read data segment logical address and write data segment logical address to the write deduplication application module.
Step 7.15: Based on the logical addresses of the read and write data segments, the write deduplication application module copies the mapping table instead of writing the write data to NAND.
Step 7.16: The command is complete.
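Steps 7.1 to 7.16 can be tied together in a toy model. The sketch below is illustrative only and rests on several assumptions: the class `TinySsd` and all of its members are hypothetical, logical addresses count whole 4 KB segments, SHA-256 is the digest, byte equality stands in for the XOR check (the two are equivalent), and stale window records are dropped when their address is overwritten, which is a precaution of this sketch rather than something the source describes.

```python
import hashlib
from collections import deque

SEGMENT = 4096  # assumed data segment size in bytes


class TinySsd:
    def __init__(self, window_depth: int = 8):
        self.l2p = {}        # logical segment address -> physical address
        self.nand = {}       # physical address -> stored segment
        self.next_ppa = 0
        self.nand_writes = 0
        self.window = deque(maxlen=window_depth)  # (lba, digest, data)

    def read(self, lba: int, count: int = 1) -> bytes:
        out = []
        for seg_lba in range(lba, lba + count):
            data = self.nand[self.l2p[seg_lba]]           # step 7.3
            digest = hashlib.sha256(data).digest()        # step 7.4
            self.window.append((seg_lba, digest, data))   # step 7.5
            out.append(data)
        return b"".join(out)

    def write(self, lba: int, payload: bytes) -> None:
        for off in range(0, len(payload), SEGMENT):       # step 7.9
            seg = payload[off:off + SEGMENT]
            seg_lba = lba + off // SEGMENT
            digest = hashlib.sha256(seg).digest()
            # Steps 7.10-7.13: digest match first, then exact content match
            # (seg == data is equivalent to the XOR result being 0).
            hit = next((r for r in self.window
                        if r[1] == digest and r[2] == seg), None)
            if hit is not None:
                self.l2p[seg_lba] = self.l2p[hit[0]]      # steps 7.14-7.15
            else:
                self.nand[self.next_ppa] = seg            # normal write path
                self.l2p[seg_lba] = self.next_ppa
                self.next_ppa += 1
                self.nand_writes += 1
            # Sketch-only precaution: forget records made stale by this write.
            self.window = deque(
                (r for r in self.window if r[0] != seg_lba or r[2] == seg),
                maxlen=self.window.maxlen)


ssd = TinySsd(window_depth=4)
source = bytes([1]) * SEGMENT + bytes([2]) * SEGMENT
ssd.write(0, source)             # initial data: two NAND programs
copied = ssd.read(0, 2)          # host reads the source segments
ssd.write(100, copied)           # host writes them to the destination
print(ssd.nand_writes, ssd.l2p)
```

In this model the copy to logical address 100 is served entirely by mapping updates, so the NAND write count stays at the two programs made for the original data.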
It should be understood that although the steps in the flowcharts of Figures 1-7 are shown in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering constraint on these steps, and they may be executed in other orders. Moreover, at least some of the steps in Figures 1-7 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be executed at different times. They also need not be executed sequentially, and may instead be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in Figure 8, an SSD limited-window data deduplication identification apparatus 800 is provided, the apparatus including:
a first judgment module 801, configured to obtain a command issued by the host and determine whether the command is a read command;
a read deduplication window module 802, configured to, if the command is a read command, complete the data read along the normal path, divide the read command into data segments, and calculate a data digest for each data segment; and to add the read data segment logical address, the data digest, and the data content to the read deduplication queue, move the read data deduplication window, and keep the number of records in the window constant;
a second judgment module 803, configured to, if the command is not a read command, further determine whether the command is a write command;
a write deduplication identification module 804, configured to, if the command is a write command, allocate a write buffer and receive data from the host, divide the write command into data segments, and calculate a data digest for each data segment;
a third judgment module 805, configured to determine whether the data digest of each data segment of the write command is consistent with a data digest in the read data deduplication window;
a fourth judgment module 806, configured to, if the data digests are consistent, further determine whether the data content of each data segment of the write command is consistent with the data content in the read data deduplication window, and, if the data content is also consistent, send the corresponding read data segment logical address and write data segment logical address to the write deduplication application module;
a write deduplication application module 807, configured to copy the mapping table instead of writing the write data to NAND.
In one embodiment, the read deduplication window module 802 is further configured to:
divide the read command into multiple data segments of a certain size, each data segment including a read data segment logical address, a data digest, and data content;
maintain the read data segments most recently accessed by the host in chronological order, from oldest to newest, to form a read deduplication queue.
In one embodiment, the read deduplication window module 802 is further configured to:
retain, in the read data deduplication window, only a certain number of the most recently accessed read data segments, and discard older records to reduce resource overhead and the cost of hit checking.
In one embodiment, the fourth judgment module 806 is further configured to:
if the data digests are consistent, XOR the data content of the write data segment with the data content of the hit read data segment and determine whether the result is 0;
if the result is 0, determine that the data is duplicated, and send the logical address of the write data segment and the logical address of the hit read data segment to the write deduplication application module, which copies the mapping table of the read data segment logical address to the mapping table of the write data segment logical address to complete the data copy.
For the specific limitations of the SSD limited-window data deduplication identification apparatus, refer to the limitations of the SSD limited-window data deduplication identification method above, which are not repeated here.
In one embodiment, a computer device is provided, whose internal structure may be as shown in Figure 9. The computer device includes a processor, a memory, and a network interface connected through a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface is used to communicate with external terminals over a network connection. When executed by the processor, the computer program implements an SSD limited-window data deduplication identification method.
Those skilled in the art will understand that the structure shown in Figure 9 is only a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown, combine certain components, or arrange the components differently.
In one embodiment, a computer device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of each of the above method embodiments when executing the computer program.
In one embodiment, a computer-readable storage medium is provided, storing a computer program that implements the steps of each of the above method embodiments when executed by a processor.
Those of ordinary skill in the art will understand that all or part of the processes in the above method embodiments can be carried out by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined in any way. To keep the description concise, not every possible combination of these technical features is described. However, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application. Their descriptions are relatively specific and detailed, but they should not be construed as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent application shall be determined by the appended claims.

Claims (10)

  1. An SSD limited window data deduplication identification method, the method comprising:
    obtaining a command issued by a host and determining whether the command is a read command;
    if the command is a read command, completing the data read along the normal path, dividing the read command into data segments, and calculating a data digest for each data segment;
    adding the read data segment logical address, the data digest, and the data content to a read deduplication queue, and moving a read data deduplication window while keeping the number of records in the window constant;
    if the command is not a read command, further determining whether the command is a write command;
    if the command is a write command, allocating a write buffer and receiving data from the host, dividing the write command into data segments, and calculating a data digest for each data segment;
    determining whether the data digest of each data segment of the write command is consistent with a data digest in the read data deduplication window;
    if the data digests are consistent, further determining whether the data content of each data segment of the write command is consistent with the data content in the read data deduplication window;
    if the data content is also consistent, sending the corresponding read data segment logical address and write data segment logical address to a write deduplication application module, wherein the write deduplication application module performs a mapping table copy in place of writing the write data to NAND.
  2. The SSD limited window data deduplication identification method according to claim 1, characterized in that the step of dividing the read command into data segments and calculating a data digest for each data segment further comprises:
    dividing the read command into multiple data segments of a given size, each data segment comprising a read data segment logical address, a data digest, and data content;
    maintaining the read data segments most recently accessed by the host in time order, from oldest to newest, to form the read deduplication queue.
  3. The SSD limited window data deduplication identification method according to claim 2, characterized in that, after the step of maintaining the read data segments most recently accessed by the host in time order, from oldest to newest, to form the read deduplication queue, the method further comprises:
    keeping only a given number of the most recently accessed read data segments in the read data deduplication window, and discarding older records to reduce resource overhead and the cost of hit checking.
  4. The SSD limited window data deduplication identification method according to any one of claims 1 to 3, characterized in that the step of further determining, if the data digests are consistent, whether the data content of each data segment of the write command is consistent with the data content in the read data deduplication window further comprises:
    if the data digests are consistent, XORing the data content of the write data segment with the data content of the hit read data segment and determining whether the result is 0;
    if the result is 0, determining that the data is duplicated, and sending the write data segment logical address and the logical address of the hit read data segment to the write deduplication application module, wherein the write deduplication application module copies the mapping table of the read data segment logical address to the mapping table of the write data segment logical address to complete the data copy.
  5. An SSD limited window data deduplication identification apparatus, characterized in that the SSD limited window data deduplication identification apparatus comprises:
    a first judgment module, configured to obtain a command issued by a host and determine whether the command is a read command;
    a read deduplication window module, configured to, if the command is a read command, complete the data read along the normal path, divide the read command into data segments, and calculate a data digest for each data segment; and to add the read data segment logical address, the data digest, and the data content to a read deduplication queue, and move a read data deduplication window while keeping the number of records in the window constant;
    a second judgment module, configured to, if the command is not a read command, further determine whether the command is a write command;
    a write deduplication identification module, configured to, if the command is a write command, allocate a write buffer and receive data from the host, divide the write command into data segments, and calculate a data digest for each data segment;
    a third judgment module, configured to determine whether the data digest of each data segment of the write command is consistent with a data digest in the read data deduplication window;
    a fourth judgment module, configured to, if the data digests are consistent, further determine whether the data content of each data segment of the write command is consistent with the data content in the read data deduplication window; and, if the data content is also consistent, send the corresponding read data segment logical address and write data segment logical address to a write deduplication application module;
    the write deduplication application module, configured to perform a mapping table copy in place of writing the write data to NAND.
  6. The SSD limited window data deduplication identification apparatus according to claim 5, characterized in that the read deduplication window module is further configured to:
    divide the read command into multiple data segments of a given size, each data segment comprising a read data segment logical address, a data digest, and data content;
    maintain the read data segments most recently accessed by the host in time order, from oldest to newest, to form the read deduplication queue.
  7. The SSD limited window data deduplication identification apparatus according to claim 6, characterized in that the read deduplication window module is further configured to:
    keep only a given number of the most recently accessed read data segments in the read data deduplication window, and discard older records to reduce resource overhead and the cost of hit checking.
  8. The SSD limited window data deduplication identification apparatus according to any one of claims 5 to 7, characterized in that the fourth judgment module is further configured to:
    if the data digests are consistent, XOR the data content of the write data segment with the data content of the hit read data segment and determine whether the result is 0;
    if the result is 0, determine that the data is duplicated, and send the write data segment logical address and the logical address of the hit read data segment to the write deduplication application module, wherein the write deduplication application module copies the mapping table of the read data segment logical address to the mapping table of the write data segment logical address to complete the data copy.
  9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 4.
  10. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 4.
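
The flow recited in claims 1 to 4 can be illustrated with a minimal Python sketch. It is not the applicant's firmware, only a toy model under stated assumptions: the segment size, the window capacity, the use of CRC32 as the data digest, and every name (ReadDedupWindow, ToySSD, l2p, nand_writes) are hypothetical, since the claims do not fix any of them. The sketch also drops window records for a logical address once that address is overwritten; the claims do not recite this step, and it is added here only so the toy model stays self-consistent.

```python
import zlib
from collections import deque

SEGMENT_SIZE = 4096   # assumed segment size in bytes (not given in the claims)
WINDOW_RECORDS = 8    # assumed capacity of the read deduplication window


def digest(content: bytes) -> int:
    """Cheap per-segment data digest; CRC32 is an assumption."""
    return zlib.crc32(content)


def xor_is_zero(a: bytes, b: bytes) -> bool:
    """Claim 4 check: equal-length segments are identical when their XOR is 0."""
    return len(a) == len(b) and (
        int.from_bytes(a, "big") ^ int.from_bytes(b, "big")) == 0


class ReadDedupWindow:
    """Bounded queue of recently read segments, ordered oldest to newest."""

    def __init__(self, max_records: int = WINDOW_RECORDS):
        # A deque with maxlen discards the oldest record when a new one
        # arrives, which keeps the number of records in the window bounded.
        self.records = deque(maxlen=max_records)

    def add(self, lba: int, content: bytes) -> None:
        self.records.append((lba, digest(content), content))

    def find_duplicate(self, content: bytes):
        """Return the logical address of a matching read segment, or None."""
        d = digest(content)
        for lba, rec_digest, rec_content in reversed(self.records):
            # Digest match first, then the full content comparison.
            if rec_digest == d and xor_is_zero(content, rec_content):
                return lba
        return None

    def invalidate(self, lba: int) -> None:
        # Assumption, not in the claims: forget records for an overwritten address.
        kept = [r for r in self.records if r[0] != lba]
        self.records = deque(kept, maxlen=self.records.maxlen)


class ToySSD:
    """Toy device: a list stands in for NAND, a dict for the mapping table."""

    def __init__(self, window_records: int = WINDOW_RECORDS):
        self.nand = []        # physical segments, append-only
        self.l2p = {}         # logical segment address -> index into nand
        self.window = ReadDedupWindow(window_records)
        self.nand_writes = 0  # counts segments actually programmed

    def read(self, lba: int, count: int) -> bytes:
        out = []
        for seg_lba in range(lba, lba + count):
            content = self.nand[self.l2p[seg_lba]]   # normal read path
            self.window.add(seg_lba, content)        # record in the window
            out.append(content)
        return b"".join(out)

    def write(self, lba: int, data: bytes) -> None:
        for off in range(0, len(data), SEGMENT_SIZE):
            seg_lba = lba + off // SEGMENT_SIZE
            content = data[off:off + SEGMENT_SIZE]
            src = self.window.find_duplicate(content)
            if src is not None:
                # Duplicate: copy the mapping entry instead of programming NAND.
                self.l2p[seg_lba] = self.l2p[src]
            else:
                self.nand.append(content)
                self.l2p[seg_lba] = len(self.nand) - 1
                self.nand_writes += 1
            self.window.invalidate(seg_lba)
```

In this model, a host copy (a read of some segments followed by a write of the same content to other logical addresses) leaves nand_writes unchanged, because each written segment hits the window and only the mapping entry is copied. Content that was never read, or that has already slid out of the window, is programmed as usual.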
PCT/CN2022/127731 2022-06-24 2022-10-26 Ssd finite window data deduplication identification method and apparatus, and computer device WO2023245942A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210731384.9 2022-06-24
CN202210731384.9A CN114974365A (en) 2022-06-24 2022-06-24 SSD (solid State disk) limited window data deduplication identification method and device and computer equipment

Publications (1)

Publication Number Publication Date
WO2023245942A1 true WO2023245942A1 (en) 2023-12-28

Family

ID=82965981

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/127731 WO2023245942A1 (en) 2022-06-24 2022-10-26 Ssd finite window data deduplication identification method and apparatus, and computer device

Country Status (2)

Country Link
CN (1) CN114974365A (en)
WO (1) WO2023245942A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114974365A (en) * 2022-06-24 2022-08-30 苏州忆联信息系统有限公司 SSD (solid State disk) limited window data deduplication identification method and device and computer equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103150258A (en) * 2013-03-20 2013-06-12 中国科学院苏州纳米技术与纳米仿生研究所 Writing, reading and garbage collection method of solid-state memory system
CN106557571A (en) * 2016-11-23 2017-04-05 福建亿榕信息技术有限公司 A kind of data duplicate removal method and device based on K V storage engines
CN107193686A (en) * 2016-03-15 2017-09-22 伊姆西公司 Method and apparatus for data backup
CN109271097A (en) * 2017-12-28 2019-01-25 新华三大数据技术有限公司 Data processing method, data processing equipment and server
US20190129639A1 (en) * 2017-10-31 2019-05-02 EMC IP Holding Company LLC Optimizing inline deduplication during copies
CN114974365A (en) * 2022-06-24 2022-08-30 苏州忆联信息系统有限公司 SSD (solid State disk) limited window data deduplication identification method and device and computer equipment

Also Published As

Publication number Publication date
CN114974365A (en) 2022-08-30

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22947679

Country of ref document: EP

Kind code of ref document: A1