WO2020192710A1 - Method for processing garbage based on lsm database, solid state hard disk, and storage apparatus - Google Patents

Method for processing garbage based on lsm database, solid state hard disk, and storage apparatus

Publication number
WO2020192710A1
Authority
WO
WIPO (PCT)
Prior art keywords
log
data
solid state
hard disk
written
Application number
PCT/CN2020/081281
Other languages
French (fr)
Chinese (zh)
Inventor
刘绍全
陈祥
李卫军
杨亚飞
Original Assignee
深圳大普微电子科技有限公司
Application filed by 深圳大普微电子科技有限公司
Publication of WO2020192710A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G06F3/0641 De-duplication techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0674 Disk device

Definitions

  • This application relates to the field of storage technology, and in particular to a method for garbage disposal based on an LSM database, a solid state drive, and a storage device.
  • the LSM (Log Structured Merge Trees) database engine converts random write IO for operating solid-state drives into sequential write IO, which improves the write performance of the database and provides relatively good read performance.
  • the write operation of the LSM database engine specifically includes: when there is a write operation, the written data is first written into a buffer in memory, and the order of the written data is recorded in memory through a specific data structure; at the same time, the written data is appended to the logfile on the solid state drive so that it can be recovered if necessary.
  • the written data in memory is flushed to the solid state drive periodically or when it reaches a fixed size, so that multiple ordered sstfile files are stored on the solid state drive.
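The write path described above can be sketched as follows. This is a minimal illustration, not the LSM engine's actual code: the class name `TinyLSM`, the tab-separated file format, and the entry-count flush threshold (a stand-in for "a fixed size") are all assumptions made for the example.

```python
import os
import tempfile


class TinyLSM:
    """Toy LSM write path: memory buffer, append-only logfile, sorted flush."""

    def __init__(self, directory, flush_threshold=4):
        self.directory = directory
        self.flush_threshold = flush_threshold  # hypothetical size trigger
        self.memtable = {}                      # in-memory buffer of writes
        self.sst_count = 0
        self.logfile = open(os.path.join(directory, "logfile"), "a")

    def put(self, key, value):
        # Buffer the write in memory.
        self.memtable[key] = value
        # Append the same write to the logfile so it can be recovered.
        self.logfile.write(f"{key}\t{value}\n")
        self.logfile.flush()
        # Flush the buffer once it reaches the threshold.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the buffered data as one key-ordered sstfile.
        path = os.path.join(self.directory, f"sstfile_{self.sst_count}")
        with open(path, "w") as f:
            for key in sorted(self.memtable):
                f.write(f"{key}\t{self.memtable[key]}\n")
        self.sst_count += 1
        self.memtable.clear()
        return path

    def close(self):
        self.logfile.close()
```

Every `put` is an append to the logfile and every flush writes one new file, which is how random writes turn into sequential ones.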
  • garbage disposal is an important part of the solid state hard disk firmware design, and it is also the main factor affecting the stable performance of the solid state hard disk.
  • the flash memory (NAND) in a solid state drive has the following characteristic: the flash memory is made up of multiple physical blocks (Block), and a physical block must be erased before data can be written to it. If a physical block still contains some valid data pages (physical pages storing user data) before it is erased, then, in order not to lose user data, the data in the valid physical pages must be read out and written to another physical block before the physical block can be erased; this is garbage disposal.
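The relocate-then-erase behaviour of NAND garbage disposal can be sketched as below. The page representation (a `(data, is_valid)` tuple, or `None` for an erased page) and the function name `collect_block` are assumptions chosen for illustration, not part of any firmware.

```python
def collect_block(victim, spare):
    """Move valid pages from `victim` into `spare`, then erase `victim`.

    Each block is a list of pages. A page is (data, is_valid), or None when
    erased. Returns the number of pages that had to be copied. Raises
    ValueError if `spare` has no erased page left.
    """
    moved = 0
    for page in victim:
        if page is not None and page[1]:
            spare[spare.index(None)] = page  # copy into the first erased page
            moved += 1
    for i in range(len(victim)):
        victim[i] = None                     # erase the whole block at once
    return moved
```

The returned count is the extra write traffic caused by the erase: the more valid pages a block still holds, the more expensive it is to reclaim.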
  • when a solid state drive is used as the storage medium of the LSM database, the data merge operation performed by the LSM database engine, which is itself a form of garbage processing, cannot be perceived by the solid state drive. As a result, after the LSM database engine has performed the merge operation, the solid state drive performs garbage processing on the same data again, which causes repeated data movement and increases write amplification.
  • the present application provides a method for garbage processing based on an LSM database, a solid state drive, and a storage device, which can flexibly perform garbage processing on data in the LSM database and reduce write amplification.
  • a technical solution adopted by this application is to provide a garbage disposal method based on the LSM database.
  • the method includes: creating a log corresponding to the data to be written in the solid state drive, where the data to be written is an sstfile file or a logfile file in the LSM database; writing the data to be written into the corresponding log; performing a merge operation on the logs in the solid state drive when the condition for performing the merge operation is met; and deleting the merged log and marking the super block where the deleted log is located as being in a data invalid state, where the super block is a physical address space allocated in the solid state disk for storing each log.
  • the method further includes: obtaining the number of super blocks in the idle state in the solid state disk; obtaining the access intensity of the solid state disk; and detecting whether the number of super blocks in the idle state and the access intensity meet a predetermined condition; wherein, when the detection result is that the number of super blocks in the idle state and the access intensity meet the predetermined condition, the condition for performing the merge operation is satisfied.
  • the step of performing a merge operation on each log in the solid state drive includes: obtaining the number of logs stored in each super block and the amount of data corresponding to each log; obtaining the effective data volume corresponding to each super block according to the number of logs and the amount of data corresponding to each log; and merging the super block with the smallest effective data volume.
  • the step of creating a log corresponding to the data to be written in the solid state drive includes: obtaining a group identifier corresponding to the data to be written, wherein each group identifier corresponds to a super block, each super block includes a plurality of first logical blocks of a first predetermined size, all super blocks correspond to one entry parameter list, and each first logical block in each super block corresponds to one entry parameter in the entry parameter list; and dynamically allocating, in the super block corresponding to the group identifier, a free first logical block for storing the log.
  • the step of writing the data to be written into the corresponding log further includes: dividing the log after writing the data into a plurality of second logical blocks of a second predetermined size;
  • the method further includes: using the log identifier and the logical block identifier written in the entry parameters as the hash key value, obtaining the hash value according to the hash key value; obtaining the index value of the hash bucket according to the hash value; and detecting the index value Whether the corresponding hash bucket is empty; if the detection result is that the hash bucket is empty, the subscript of the entry parameter is saved in the hash bucket corresponding to the index value; if the detection result is that the hash bucket is not empty, then Save the subscript of the entry parameter in the corresponding hash link.
  • the method further includes: obtaining the group identifiers in the LSM database; and creating corresponding super blocks for different group identifiers in the solid state disk, where the sstfile files in the same layer of the LSM database belong to the same group identifier.
  • the method further includes: dividing each super block into a plurality of first logical blocks of a first predetermined size; and creating an entry parameter list corresponding to all super blocks according to the sum of the number of first logical blocks in each super block.
  • another technical solution adopted by this application is to provide a solid-state hard disk, which includes a processor and a storage controller coupled to the processor, wherein the storage controller stores program instructions for implementing any of the above methods for garbage disposal based on the LSM database; the processor is used to execute the program instructions stored by the storage controller to process garbage on the solid state drive; and the LSM database interacts with the solid state drive through SPDK.
  • Another technical solution adopted in this application is to provide a storage device that stores program files that can implement any of the above methods.
  • the beneficial effects of this application are as follows. The method for garbage disposal based on the LSM database, the solid state hard disk, and the storage device of this application create a log corresponding to the data to be written in the solid state hard disk; write the data to be written into the corresponding log; perform the merge operation on the logs in the solid state hard disk when the condition for performing the merge operation is met; and delete the merged log and mark the super block where the deleted log is located as being in a data invalid state.
  • the present application can flexibly perform garbage processing on the data of the LSM database stored in the solid state hard disk, and reduce write amplification.
  • FIG. 1 is a schematic flowchart of a garbage processing method based on an LSM database according to a first embodiment of the present application
  • FIG. 2 is a schematic flowchart of a garbage processing method based on an LSM database according to a second embodiment of the present application
  • FIG. 3 is a schematic diagram of the structure of a super block according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of the relationship between a super block, a log, and an entry parameter list in an embodiment of the present application
  • FIG. 5 is a schematic diagram of the relationship among the hash bucket, the entry parameter list, and the hash link in an embodiment of the present application;
  • FIG. 6 is a schematic structural diagram of a solid-state hard disk according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of the link between the LSM database and the solid state drive shown in FIG. 6;
  • FIG. 8 is a schematic structural diagram of a storage device according to an embodiment of the present application.
  • the terms "first", "second", and "third" in this application are only used for descriptive purposes, and cannot be understood as indicating or implying relative importance or implicitly indicating the number of indicated technical features. Therefore, the features defined with "first", "second", and "third" may explicitly or implicitly include at least one of the features.
  • "a plurality of" means at least two, such as two, three, etc., unless otherwise specifically defined. All directional indications (such as up, down, left, right, front, back, etc.) in the embodiments of this application are only used to explain the relative positional relationship between the components in a specific posture (as shown in the drawings). If the specific posture changes, the directional indication changes accordingly.
  • Fig. 1 is a flowchart of a method for garbage processing based on an LSM database according to a first embodiment of the present application. It should be noted that if there is substantially the same result, the method of the present application is not limited to the sequence of the process shown in FIG. 1. As shown in Figure 1, the method includes steps:
  • Step S101: Create a log corresponding to the data to be written in the solid state hard disk.
  • the nvme commands include the create-log command CreatLog, the append-log command AppendLog, the read-log command ReadLog, and the delete-log command DeleteLog.
  • in step S101, the LSM database uses the create-log command CreatLog to create a log corresponding to the data to be written in the solid state disk, where the data to be written is an sstfile file or a logfile file in the LSM database.
  • Step S102: Write the data to be written into the corresponding log.
  • in step S102, the LSM database uses the append-log command AppendLog to write the data to be written into the corresponding log, which is stored in a super block of the solid state disk.
  • Step S103: Perform a merge operation on the logs in the solid state disk when the condition for performing the merge operation is met.
  • in step S103, when the solid-state hard disk meets the trigger condition for executing the merge operation, the merge operation is performed on the logs in the solid-state hard disk.
  • the time at which the merge operation is performed on the logs in the solid state drive can be flexibly configured.
  • the LSM database uses the read-log command ReadLog to read out the valid data in the logs to be merged and moves it to another valid physical address space.
  • Step S104: Delete the merged log and mark the super block where the deleted log was located as being in a data invalid state, where the super block is a physical address space allocated in the solid state disk for storing each log.
  • in step S104, the LSM database uses the delete-log command DeleteLog to delete the merged log and, at the same time, marks the super block where the log was located as being in a data invalid state, so that the super block can be released quickly.
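Steps S101 to S104 can be sketched against an in-memory stand-in for the drive. The four method names mirror the commands CreatLog, AppendLog, ReadLog, and DeleteLog named in the text; everything else (the class `LogSSD`, the state strings, the helper `merge_logs`, and the rule that a super block becomes invalid once its last log is deleted) is an assumption made for this example, and a real merge would also discard stale records rather than copy everything.

```python
class LogSSD:
    """In-memory stand-in for a solid state drive exposing a log interface."""

    def __init__(self, num_super_blocks):
        self.state = ["free"] * num_super_blocks  # per-super-block state
        self.logs = {}                             # log_id -> bytearray
        self.location = {}                         # log_id -> super block index

    def creat_log(self, log_id, super_block):      # S101
        self.logs[log_id] = bytearray()
        self.location[log_id] = super_block
        self.state[super_block] = "in_use"

    def append_log(self, log_id, data):            # S102
        self.logs[log_id] += data

    def read_log(self, log_id):
        return bytes(self.logs[log_id])

    def delete_log(self, log_id):                  # S104
        super_block = self.location.pop(log_id)
        del self.logs[log_id]
        # Assumption: release the super block once no log lives in it.
        if super_block not in self.location.values():
            self.state[super_block] = "invalid"


def merge_logs(ssd, old_ids, new_id, target_super_block):
    """S103: copy the contents of the old logs into a new log, then delete them."""
    ssd.creat_log(new_id, target_super_block)
    for log_id in old_ids:
        ssd.append_log(new_id, ssd.read_log(log_id))
    for log_id in old_ids:
        ssd.delete_log(log_id)
```

Because the drive sees the delete directly, it can release the whole super block without copying any pages itself, which is where the reduction in write amplification comes from.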
  • FIG. 2 is a schematic flowchart of a method for garbage processing based on an LSM database according to a second embodiment of the present application. It should be noted that if there is substantially the same result, the method of the present application is not limited to the sequence of the process shown in FIG. 2. As shown in Figure 2, the method includes the following steps:
  • Step S201 Acquire group identifiers in the LSM database, and create corresponding super blocks in the solid state disk for groups corresponding to different group identifiers.
  • the group identifier in the LSM database corresponds to the layer identifier where the sstfile file in the LSM database is located, that is, the sstfile file of the same layer in the LSM database belongs to the same group identifier.
  • the solid state disk creates corresponding super blocks for groups corresponding to different group identifiers, where the super block is a physical address space allocated for storing data in the solid state disk.
  • Step S202 Divide each super block into a plurality of first logical blocks of a first predetermined size.
  • each super block is divided into a plurality of first logical blocks of a first predetermined size, for example, 1 MB, so as to use the first logical blocks in the super block to sequentially store the data to be written.
  • FIG. 3 is a schematic structural diagram of a super block according to an embodiment of the application.
  • Step S203: Create an entry parameter list corresponding to all super blocks according to the sum of the number of first logical blocks in each super block.
  • in step S203, the space for the entry parameter list (entry table) is requested according to the number of first logical blocks provided by the super blocks, and the entry table is created in that space.
  • for example, if the total size of all super blocks is 1 TB, the size of a first logical block is 1 MB, and each entry parameter (entry) in the entry parameter list is 4 bytes, the list has 1 TB / 1 MB = 1,048,576 entries and requires 4 MB of space.
  • the entry parameter does not need to store the identifier of the super block or the logical identifier of the first logical block (that is, which first logical block within the super block it is). Both can be calculated from the subscript of the entry parameter in the entry table, because the first logical blocks correspond one-to-one with the entry parameters.
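The sizing example and the subscript calculation can be checked with a few lines of arithmetic. The sizes come from the text; the function `locate` and its parameter `blocks_per_super_block` are hypothetical, and the sketch assumes every super block holds the same number of first logical blocks.

```python
TOTAL_CAPACITY = 1 << 40      # 1 TB across all super blocks
LOGICAL_BLOCK_SIZE = 1 << 20  # 1 MB per first logical block
ENTRY_SIZE = 4                # bytes per entry parameter

# One entry per first logical block.
num_entries = TOTAL_CAPACITY // LOGICAL_BLOCK_SIZE
table_bytes = num_entries * ENTRY_SIZE


def locate(subscript, blocks_per_super_block):
    """Map an entry subscript to (super block index, logical block index)."""
    return divmod(subscript, blocks_per_super_block)
```

With 1024 first logical blocks per super block, for instance, `locate(1030, 1024)` gives super block 1 and logical block 6, so neither value needs to be stored in the 4-byte entry.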
  • Step S204 Create a log corresponding to the data to be written in the solid state hard disk.
  • in step S204, the step of creating a log corresponding to the data to be written in the solid state disk includes: obtaining the group identifier corresponding to the data to be written, and dynamically allocating, in the super block corresponding to the group identifier, a free first logical block for storing the log.
  • when the LSM database creates a log, it needs to know the group identifier of the log, which is the group identifier of the data to be written. In addition, when the log is created, its space is allocated on the super block created for the group corresponding to that group identifier.
  • Step S205 Write the data to be written into the corresponding log.
  • the entry parameter (entry) corresponding to the first logical block stores the log identifier of the log and the logical block identifier of the second logical block that has been written (as shown in FIG. 4).
  • the information of the entry parameter corresponding to the first logical block will be updated at the same time.
  • the second logical block and the first logical block have the same size, for example, the same 1MB.
  • the step of writing the data to be written into the corresponding log further includes: using the log identifier (log id) and the logical block identifier (block id) recorded in the entry parameter as the hash key, and obtaining the hash value from the hash key; obtaining the index value of a hash bucket from the hash value; and detecting whether the hash bucket at that index value is empty. If the hash bucket is empty, the subscript of the entry parameter is saved in the hash bucket.
  • if the detection result is that the hash bucket is not empty, a hash conflict has occurred.
  • in this case, take the value stored in the hash bucket, that is, the subscript of an entry parameter, and find the hash link (hashlink) with the same subscript; the subscripts of the hash links correspond one-to-one with the subscripts of the entry parameters, and each hash link stores the subscript of the next entry parameter. If that hash link also holds a value, there is still a conflict, and the search continues with the hash link whose subscript equals the stored value, until a hash link with no conflict is found.
  • for example, if the hash bucket is not empty and its stored value is 2, the hash link with subscript 2 is examined and found to store the value 4; the hash link with subscript 4 is then examined and found to store the value 6; the hash link with subscript 6 is then examined and, being empty, is where the subscript of the entry parameter to be saved is stored.
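The bucket and hash link arrangement amounts to separate chaining in which the chain pointers are entry subscripts. A minimal sketch follows, with assumptions stated plainly: the class name `EntryIndex`, the sentinel `EMPTY = -1`, and the use of Python's built-in `hash` as the hash function are all choices made for the example; the source does not specify the hash function.

```python
EMPTY = -1  # sentinel for "no subscript stored"


class EntryIndex:
    def __init__(self, num_entries, num_buckets):
        self.entries = [None] * num_entries    # subscript -> (log_id, block_id)
        self.buckets = [EMPTY] * num_buckets   # bucket -> first entry subscript
        self.hashlink = [EMPTY] * num_entries  # subscript -> next subscript

    def _bucket(self, log_id, block_id):
        # Hash key is (log id, block id); the index is the hash modulo buckets.
        return hash((log_id, block_id)) % len(self.buckets)

    def insert(self, subscript, log_id, block_id):
        self.entries[subscript] = (log_id, block_id)
        b = self._bucket(log_id, block_id)
        if self.buckets[b] == EMPTY:
            self.buckets[b] = subscript        # no conflict: store in the bucket
            return
        cur = self.buckets[b]                  # conflict: walk the chain
        while self.hashlink[cur] != EMPTY:
            cur = self.hashlink[cur]
        self.hashlink[cur] = subscript         # first empty hash link

    def lookup(self, log_id, block_id):
        cur = self.buckets[self._bucket(log_id, block_id)]
        while cur != EMPTY:
            if self.entries[cur] == (log_id, block_id):
                return cur
            cur = self.hashlink[cur]
        return None
```

Using a single bucket forces every insert to collide, which reproduces the 2, 4, 6 chain from the example: the bucket holds 2, hash link 2 holds 4, and hash link 4 holds 6.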
  • Step S206 Perform a merge operation on the logs in the solid state hard disk when the conditions for performing the merge operation are met.
  • in step S206, judging whether the condition for performing the merge operation is met specifically includes: obtaining the number of super blocks in the idle state in the solid state disk; obtaining the access intensity of the solid state disk; and detecting whether the number of super blocks in the idle state and the access intensity meet a predetermined condition; wherein, when the detection result is that the number of super blocks in the idle state and the access intensity meet the predetermined condition, the condition for performing the merge operation is satisfied.
  • the access intensity of the solid state drive is the business load of the solid state drive, that is, the number of accesses to the solid state drive per unit of time (a time window).
  • when the detection result is that the number of super blocks in the idle state is less than a first predetermined threshold and the access intensity of the solid state disk is less than a second predetermined threshold, the merge operation is performed. That is to say, the merge operation is performed when few super blocks are idle and the access intensity of the solid state disk is low, so that the merge operation does not interfere with the host's access to the solid state disk. Conversely, when the detection result is that the number of super blocks in the idle state is greater than or equal to the first predetermined threshold or the access intensity of the solid state disk is greater than or equal to the second predetermined threshold, the merge operation is not performed, to ensure the quality of the host's access to the solid state disk.
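The trigger condition reduces to a two-threshold test. In the sketch below the function and parameter names are hypothetical; the text only refers to a "first predetermined threshold" for idle super blocks and a "second predetermined threshold" for access intensity, without giving values.

```python
def should_merge(free_super_blocks, access_intensity,
                 first_threshold, second_threshold):
    """Merge only when idle super blocks are scarce AND the drive is quiet."""
    return (free_super_blocks < first_threshold
            and access_intensity < second_threshold)
```

Both comparisons are strict, matching the text: reaching either threshold is enough to postpone the merge.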
  • the step of performing a merge operation on the logs in the solid state drive includes: obtaining the number of logs stored in each super block and the amount of data corresponding to each log, which can be obtained by traversing the entry parameters; obtaining the effective data volume corresponding to each super block according to the number of logs and the amount of data corresponding to each log; and merging the super block with the smallest effective data volume. That is to say, the LSM database can choose an appropriate time to merge the super block with the smallest amount of effective data in the solid state disk, according to the number of idle super blocks inside the solid state disk and the current access intensity of the solid state disk.
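Choosing the super block with the smallest effective data volume can be sketched as below. The input shape (a dictionary from super block identifier to a per-log byte count) and the name `pick_victim` are assumptions; in the described design these figures would be gathered by traversing the entry parameters.

```python
def pick_victim(super_blocks):
    """Return the id of the super block holding the least valid data.

    super_blocks: {super_block_id: {log_id: valid_bytes}}
    """
    valid = {sb: sum(logs.values()) for sb, logs in super_blocks.items()}
    return min(valid, key=valid.get)
```

Picking the minimum keeps the amount of data that must be copied during the merge as small as possible.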
  • the data in the super block needs to be read, and the operation of reading the data specifically includes: obtaining the corresponding log identifier (log id) and logical block identifier (block id) from the logical address (lba) of the read IO; using the log id and block id as the hash key to calculate the hash value; generating the index value of the hash bucket from the hash value; obtaining the subscript of an entry parameter from the hash bucket at that index value; and determining whether the log id and block id stored in that entry parameter are equal to the log id and block id of this read IO. If they are equal, the entry parameter is used to calculate the identifier of the corresponding super block and the logical block identifier of the first logical block, and the data in the first logical block is read; if they are not equal, look up the
  • Step S207 Delete the merged log and mark the super block where the deleted log is located in a data invalid state.
  • step S207 each log in the merged super block is deleted and the super block is marked as a data invalid state, thereby releasing the super block to achieve the purpose of garbage disposal.
  • FIG. 6 is a schematic structural diagram of a solid state drive according to an embodiment of the application.
  • the solid state drive 10 includes a processor 11 and a storage controller 12 coupled to the processor.
  • the storage controller 12 stores program instructions for implementing the LSM database-based garbage processing method described in any of the above embodiments.
  • the processor 11 is configured to execute program instructions stored by the storage controller 12 to perform garbage processing on the solid state hard disk.
  • the processor 11 may also be referred to as a CPU (Central Processing Unit, central processing unit).
  • the processor 11 may be an integrated circuit chip with signal processing capabilities.
  • the processor 11 may also be a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • FIG. 7 is a schematic diagram of the link between the LSM database and the solid state drive shown in FIG. 6.
  • the LSM database 20 performs IO with the solid state drive 10 through SPDK 30, where SPDK 30 is the Storage Performance Development Kit.
  • the nvme commands include the create-log command CreatLog, the append-log command AppendLog, the read-log command ReadLog, and the delete-log command DeleteLog.
  • the solid state drive 10 presents its storage in the form of logs, and the interaction between the LSM database 20 and the solid state drive 10 is realized by adding dedicated nvme commands.
  • FIG. 8 is a schematic structural diagram of a storage device according to an embodiment of the application.
  • the storage device in the embodiment of the present application stores a program file 21 that can implement all the above methods.
  • the program file 21 can be stored in the storage device in the form of a software product, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage devices include media that can store program code, such as a USB flash drive (U disk), mobile hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), magnetic disk, or optical disk, or terminal devices such as computers, servers, mobile phones, and tablets.
  • the beneficial effects of this application are as follows. The method for garbage disposal based on the LSM database, the solid state hard disk, and the storage device of this application create a log corresponding to the data to be written in the solid state hard disk; write the data to be written into the corresponding log; perform the merge operation on the logs in the solid state hard disk when the condition for performing the merge operation is met; and delete the merged log and mark the super block where the deleted log is located as being in a data invalid state.
  • the present application can avoid the LSM database engine and the solid state hard disk in the prior art from performing repeated garbage processing on the same data. Further, the present application can flexibly perform garbage processing on the data of the LSM database stored in the solid state hard disk, and reduce write amplification.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are merely illustrative. For example, the division of units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • each unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.

Abstract

Disclosed in the present application are a method for processing garbage based on an LSM database, a solid state hard disk, and a storage apparatus. The method comprises: in a solid state hard disk, creating a log corresponding to data to be written; writing the data to be written to the corresponding log; when a condition for executing a merge operation is met, executing a merge operation on the logs in the solid state hard disk; and deleting the merged logs and labelling the super block in which the deleted logs were located as an invalid data state. By means of the present method, the present application can flexibly perform garbage processing on the data of an LSM database stored in the solid state hard disk, reducing write amplification.

Description

基于LSM数据库的垃圾处理的方法、固态硬盘以及存储装置Garbage processing method based on LSM database, solid state hard disk and storage device 技术领域Technical field
本申请涉及存储技术领域,特别是涉及一种基于LSM数据库的垃圾处理的方法、固态硬盘以及存储装置。This application relates to the field of storage technology, and in particular to a method for garbage disposal based on an LSM database, a solid state drive, and a storage device.
背景技术Background technique
LSM(Log Structured Merge Trees)数据库引擎,把操作固态硬盘的随机写IO转换成顺序写IO,提高数据库的写性能,并提供相对较好的读性能。The LSM (Log Structured Merge Trees) database engine converts random write IO for operating solid-state drives into sequential write IO, which improves the write performance of the database and provides relatively good read performance.
LSM数据库引擎的写操作的具体包括:当有写操作时,将写入数据首先写入内存的缓冲区内,内存中通过特定数据结构记录写入数据的先后顺序,同时将写入数据追加写到固态硬盘的logfile文件中,以备必要时恢复。内存中的写入数据定时或按固定大小地刷到固态硬盘,以在固态硬盘上存储多个有序的sstfile文件。The write operation of the LSM database engine specifically includes: when there is a write operation, the written data is first written into the buffer of the memory, the sequence of the written data is recorded in the memory through a specific data structure, and the written data is additionally written To the logfile of the solid-state hard drive for recovery if necessary. The written data in the memory is flushed to the solid-state hard disk regularly or in a fixed size to store multiple ordered sstfile files on the solid-state hard disk.
其中,随着越来越多写操作,固态硬盘上积累的sstfile文件也越来越多,这些文件不可写且有序,LSM数据库会定时对sstfile文件进行合并操作(compaction),合并完成删除相应的sstfile文件,减少文件数量。Among them, with more and more write operations, there are more and more sstfile files accumulated on the solid state drive. These files are not writable and in order. The LSM database will periodically perform compaction on the sstfile files, and the merge is completed and deleted. Sstfile file, reduce the number of files.
众所周知,在固定硬盘技术领域,垃圾处理是固态硬盘固件设计中的重要一环,也是影响固态硬盘稳定性能的主要因素。其中,固态硬盘中的闪存(NAND)的颗粒特性为:多个物理块(Block)形成闪存,物理块必须擦除后才能写入数据。如果在擦除前某物理块中存在部分有效数据页(存储了用户数据的物理页),为了不丢失用户数据,必须将有效物理页中的数据读出来写到另一个物理块中,然后才能擦除该物理块,这就是垃圾处理。As we all know, in the field of fixed hard disk technology, garbage disposal is an important part of the solid state hard disk firmware design, and it is also the main factor affecting the stable performance of the solid state hard disk. Among them, the particle characteristics of the flash memory (NAND) in the solid state drive are: multiple physical blocks (Block) form the flash memory, and the physical blocks must be erased before data can be written. If there are some valid data pages (physical pages storing user data) in a physical block before erasing, in order not to lose user data, the data in the valid physical pages must be read out and written to another physical block before Erase the physical block, which is garbage disposal.
When a solid state drive is used as the storage medium of an LSM database, the solid state drive cannot perceive the data merge operations, that is, the garbage processing, performed by the LSM database engine. As a result, after the LSM database engine has performed a merge operation, the solid state drive performs garbage processing on the same data again, which moves the data repeatedly and increases write amplification.
Summary of the invention
The present application provides an LSM database-based garbage processing method, a solid state drive, and a storage device, which can flexibly perform garbage processing on the data of an LSM database and reduce write amplification.
To solve the above technical problem, one technical solution adopted by the present application is to provide an LSM database-based garbage processing method. The method includes: creating, in a solid state drive, a log corresponding to data to be written, where the data to be written is an sstfile file or a logfile file of the LSM database; writing the data to be written into the corresponding log; performing a merge operation on the logs in the solid state drive when a condition for performing the merge operation is met; and deleting the merged logs and marking the super block in which the deleted logs were located as being in a data-invalid state, where a super block is a physical address space allocated in the solid state drive for storing the logs.
The method further includes: obtaining the number of super blocks in the solid state drive that are in an idle state; obtaining the access intensity of the solid state drive; and detecting whether the number of idle super blocks and the access intensity meet a predetermined condition. When the detection result is that the number of idle super blocks and the access intensity meet the predetermined condition, the condition for performing the merge operation is met.
The step of performing a merge operation on the logs in the solid state drive includes: obtaining the number of logs stored in each super block and the amount of data corresponding to each log; obtaining the amount of valid data of each super block from the number of logs and the amount of data corresponding to each log; and merging the super block with the smallest amount of valid data.
The step of creating, in the solid state drive, a log corresponding to the data to be written includes: obtaining the group identifier corresponding to the data to be written, where each group identifier corresponds to one super block, each super block includes multiple first logical blocks of a first predetermined size, all super blocks correspond to one entry parameter list, and each first logical block in each super block corresponds to one entry parameter in the entry parameter list; and dynamically allocating, in the super block corresponding to the group identifier, free first logical blocks for storing the log.
The step of writing the data to be written into the corresponding log further includes: dividing the log, after the data has been written, into multiple second logical blocks of a second predetermined size; finding the entry parameter corresponding to a free first logical block in the super block; writing the content of the second logical block into the corresponding free first logical block; and setting the entry parameter corresponding to the first logical block that has been written to the log identifier of the log and the logical block identifier of the second logical block that was written.
The method further includes: using the log identifier and the logical block identifier written into the entry parameter as a hash key, and obtaining a hash value from the hash key; obtaining the index value of a hash bucket from the hash value; detecting whether the hash bucket corresponding to the index value is empty; if the hash bucket is empty, saving the subscript of the entry parameter in the hash bucket corresponding to the index value; and if the hash bucket is not empty, saving the subscript of the entry parameter in the corresponding hash link.
Before the step of creating, in the solid state drive, a log corresponding to the data to be written, the method further includes: obtaining the group identifiers in the LSM database; and creating, in the solid state drive, a corresponding super block for each different group identifier, where sstfile files in the same level of the LSM database belong to the same group identifier.
The method further includes: dividing each super block into multiple first logical blocks of a first predetermined size; and creating an entry parameter list corresponding to all super blocks, sized according to the total number of first logical blocks across the super blocks.
To solve the above technical problem, another technical solution adopted by the present application is to provide a solid state drive that includes a processor and a storage controller coupled to the processor. The storage controller stores program instructions for implementing any of the LSM database-based garbage processing methods described above; the processor is configured to execute the program instructions stored in the storage controller to perform garbage processing on the solid state drive; and the LSM database interacts with the solid state drive through SPDK.
To solve the above technical problem, yet another technical solution adopted by the present application is to provide a storage device that stores a program file capable of implementing any of the methods described above.
The beneficial effects of the present application are as follows. The LSM database-based garbage processing method, solid state drive, and storage device of the present application create, in the solid state drive, a log corresponding to the data to be written; write the data to be written into the corresponding log; perform a merge operation on the logs in the solid state drive when the condition for performing the merge operation is met; and delete the merged logs and mark the super block in which the deleted logs were located as being in a data-invalid state. In this way, the present application can flexibly perform garbage processing on the data of an LSM database stored in a solid state drive and reduce write amplification.
Description of the drawings
FIG. 1 is a schematic flowchart of an LSM database-based garbage processing method according to a first embodiment of the present application;
FIG. 2 is a schematic flowchart of an LSM database-based garbage processing method according to a second embodiment of the present application;
FIG. 3 is a schematic structural diagram of a super block according to an embodiment of the present application;
FIG. 4 is a schematic diagram of the relationship among a super block, a log, and an entry parameter list according to an embodiment of the present application;
FIG. 5 is a schematic diagram of the relationship among a hash bucket, an entry parameter list, and a hash link according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a solid state drive according to an embodiment of the present application;
FIG. 7 is a schematic diagram of the connection between an LSM database and the solid state drive shown in FIG. 6;
FIG. 8 is a schematic structural diagram of a storage device according to an embodiment of the present application.
Detailed description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in this application without creative effort fall within the scope of protection of this application.
The terms "first", "second", and "third" in this application are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly specifying the number of the technical features referred to. A feature qualified by "first", "second", or "third" may therefore explicitly or implicitly include at least one such feature. In the description of this application, "multiple" means at least two, for example two or three, unless expressly and specifically defined otherwise. All directional indications in the embodiments of this application (such as up, down, left, right, front, back, and so on) are used only to explain the relative positions and movements of components in a particular posture (as shown in the drawings); if that posture changes, the directional indication changes accordingly. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product, or device.
A reference to an "embodiment" herein means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. Occurrences of the phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. A person skilled in the art understands, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
FIG. 1 is a flowchart of an LSM database-based garbage processing method according to a first embodiment of the present application. Note that the method of the present application is not limited to the order of the flow shown in FIG. 1, provided that substantially the same result is obtained. As shown in FIG. 1, the method includes the following steps:
Step S101: Create, in the solid state drive, a log corresponding to the data to be written.
In this embodiment, all interfaces provided by the solid state drive are carried over the NVMe protocol; that is, custom NVMe commands are defined on top of the NVMe protocol to implement the corresponding interfaces. To provide higher performance, the LSM database uses SPDK to send the corresponding NVMe commands directly and thereby interact with the solid state drive. The NVMe commands include a create-log command CreatLog, an append-log command AppendLog, a read-log command ReadLog, and a delete-log command DeleteLog, among others.
In step S101, the LSM database uses the create-log command CreatLog to create, in the solid state drive, a log corresponding to the data to be written, where the data to be written is an sstfile file or a logfile file of the LSM database.
Step S102: Write the data to be written into the corresponding log.
In step S102, the LSM database uses the append-log command AppendLog to write the data to be written into the corresponding log, which is stored in a super block of the solid state drive.
Step S103: When the condition for performing a merge operation is met, perform the merge operation on the logs in the solid state drive.
In step S103, once the solid state drive meets the trigger condition for the merge operation, the merge operation is performed on the logs in the solid state drive. In other words, the timing of the merge operation on the logs in the solid state drive can be configured flexibly.
During the merge, the LSM database uses the read-log command ReadLog to read out the valid data in the logs to be merged and move it to another valid physical address space.
Step S104: Delete the merged logs and mark the super block in which the deleted logs were located as being in a data-invalid state, where a super block is a physical address space allocated in the solid state drive for storing the logs.
In step S104, the LSM database uses the delete-log command DeleteLog to delete the merged logs and, at the same time, marks the super block in which those logs were located as data-invalid, so that the super block can be released quickly.
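By way of illustration only, steps S101 to S104 can be exercised against a small in-memory model of a drive with a log interface. The class `LogSSD` and its method names are hypothetical stand-ins for the CreatLog, AppendLog, ReadLog, and DeleteLog commands; they are not NVMe or SPDK calls, and the rule that a super block becomes invalid only once its last log is deleted is a simplification chosen for this sketch.

```python
class LogSSD:
    """In-memory stand-in for a solid state drive that exposes a log interface."""

    def __init__(self):
        self.logs = {}                    # log id -> bytearray of appended data
        self.superblock_of = {}           # log id -> id of the super block holding it
        self.invalid_superblocks = set()  # super blocks marked data-invalid

    def create_log(self, log_id, superblock_id):   # models CreatLog
        self.logs[log_id] = bytearray()
        self.superblock_of[log_id] = superblock_id

    def append_log(self, log_id, data):            # models AppendLog
        self.logs[log_id] += data

    def read_log(self, log_id):                    # models ReadLog
        return bytes(self.logs[log_id])

    def delete_log(self, log_id):                  # models DeleteLog
        del self.logs[log_id]
        sb = self.superblock_of.pop(log_id)
        # Simplification: mark the super block invalid once no log lives in it.
        if sb not in self.superblock_of.values():
            self.invalid_superblocks.add(sb)


ssd = LogSSD()
ssd.create_log("sst-1", superblock_id=0)      # S101
ssd.append_log("sst-1", b"k1=v1;")            # S102
ssd.create_log("sst-2", superblock_id=0)
ssd.append_log("sst-2", b"k2=v2;")

# S103: merge both logs into a new log placed in another super block.
ssd.create_log("sst-3", superblock_id=1)
ssd.append_log("sst-3", ssd.read_log("sst-1") + ssd.read_log("sst-2"))

# S104: delete the merged logs; super block 0 becomes data-invalid.
ssd.delete_log("sst-1")
ssd.delete_log("sst-2")
```

Because the database's merge and the release of super block 0 happen in the same pass, the drive has no valid data left to relocate in that super block.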
FIG. 2 is a schematic flowchart of an LSM database-based garbage processing method according to a second embodiment of the present application. Note that the method of the present application is not limited to the order of the flow shown in FIG. 2, provided that substantially the same result is obtained. As shown in FIG. 2, the method includes the following steps:
Step S201: Obtain the group identifiers in the LSM database, and create, in the solid state drive, a corresponding super block for the group corresponding to each different group identifier.
In step S201, a group identifier in the LSM database corresponds to the identifier of the level in which an sstfile file of the LSM database is located; that is, sstfile files in the same level of the LSM database belong to the same group identifier. The solid state drive creates a corresponding super block for the group corresponding to each different group identifier, where a super block is a physical address space allocated in the solid state drive for storing data.
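The grouping in step S201 can be pictured with a short sketch. It assumes, for illustration, that the group identifier is simply the LSM level number and that super block identifiers are assigned in ascending order of level; the function name and the file names are hypothetical.

```python
def create_superblocks(levels):
    """Map each group id (here: the LSM level number) to its own super block id."""
    return {group_id: sb_id for sb_id, group_id in enumerate(sorted(set(levels)))}


# Level of each sstfile; files in the same level share a group id.
sstfile_levels = {"a.sst": 0, "b.sst": 0, "c.sst": 1, "d.sst": 3}

superblock_for_group = create_superblocks(sstfile_levels.values())
placement = {name: superblock_for_group[level] for name, level in sstfile_levels.items()}
```

Files of the same level land in the same super block, so data that tends to be merged and deleted together is also stored together.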
Step S202: Divide each super block into multiple first logical blocks of a first predetermined size.
In step S202, each super block is divided into multiple first logical blocks of a first predetermined size, for example 1 MB, so that the first logical blocks in the super block can be used in turn to store the data to be written.
FIG. 3 is a schematic structural diagram of a super block according to an embodiment of the present application. As shown in FIG. 3, a super block includes multiple first logical blocks BLOCKN (N = 0, 1, 2, ...). The first logical blocks in a super block are allocated dynamically: once an allocated first logical block has been filled, if not all of the data to be written has been stored yet, a new first logical block is allocated to continue storing the data.
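A minimal sketch of this dynamic allocation follows. The class `Superblock`, the helper `store`, and the tiny 4-byte block size are assumptions chosen to keep the example readable; the text above uses 1 MB blocks.

```python
class Superblock:
    """A super block divided into fixed-size first logical blocks."""

    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))   # indices of free first logical blocks
        self.used = {}                        # block index -> stored chunk

    def allocate(self):
        return self.free.pop(0)               # raises IndexError when the super block is full


def store(superblock, data, block_size):
    """Store data chunk by chunk, allocating a new block only when one is needed."""
    allocated = []
    for offset in range(0, len(data), block_size):
        block = superblock.allocate()
        superblock.used[block] = data[offset:offset + block_size]
        allocated.append(block)
    return allocated


sb = Superblock(num_blocks=4)
blocks = store(sb, b"abcdefghij", block_size=4)   # 10 bytes need three 4-byte blocks
```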
Step S203: Create an entry parameter list corresponding to the super blocks, sized according to the total number of first logical blocks across the super blocks.
In step S203, space for storing the entry parameter list (entry table) is requested according to the number of first logical blocks provided by the super blocks, and the entry table is created in that space.
Specifically, assuming that the total size of all super blocks is 1 TB, that each first logical block is 1 MB, and that each entry parameter (entry) in the entry table occupies 4 bytes, the entry table holds 1 TB / 1 MB = 2^20 entries and therefore requires 4 MB of space.
An entry does not need to store the identifier of the super block or the logical identifier of the first logical block, that is, which first logical block of which super block it refers to. These can be computed from the subscript of the entry in the entry table, that is, from the position of the entry in the entry table. In other words, first logical blocks and entries correspond one to one.
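The sizing arithmetic and the subscript-to-location mapping can be checked with a few lines of Python. The constant `BLOCKS_PER_SUPERBLOCK` is a hypothetical value introduced for this sketch, since the text does not state the size of an individual super block.

```python
TOTAL_SUPERBLOCK_BYTES = 1 << 40   # 1 TB across all super blocks (example from the text)
FIRST_BLOCK_BYTES = 1 << 20        # 1 MB per first logical block
ENTRY_BYTES = 4                    # bytes per entry in the entry table

num_entries = TOTAL_SUPERBLOCK_BYTES // FIRST_BLOCK_BYTES   # one entry per first logical block
table_bytes = num_entries * ENTRY_BYTES

BLOCKS_PER_SUPERBLOCK = 1024       # hypothetical: 1 GB super blocks


def locate(entry_index):
    """Return (super block number, first logical block number) for an entry subscript."""
    return divmod(entry_index, BLOCKS_PER_SUPERBLOCK)
```

With these assumptions, entry 2050 maps to first logical block 2 of super block 2, without the entry itself storing either number.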
Step S204: Create, in the solid state drive, a log corresponding to the data to be written.
In step S204, the step of creating a log corresponding to the data to be written includes: obtaining the group identifier corresponding to the data to be written, and dynamically allocating, in the super block corresponding to the group identifier, free first logical blocks for storing the log.
When the LSM database creates a log, it needs to know the group identifier of the log, which is the group identifier of the data to be written. In addition, when the log is created, space for the log is allocated from the super block created for the group corresponding to that group identifier.
Step S205: Write the data to be written into the corresponding log.
In step S205, the step of writing the data to be written into the corresponding log includes: writing the data to be written into the log and dividing the log into multiple second logical blocks BlockN (N = 0, 1, 2, ...) of a second predetermined size; finding the entry corresponding to a free first logical block; writing the content of the second logical block into the corresponding free first logical block; and setting the entry corresponding to the first logical block that has been written to the log identifier of the log and the logical block identifier of the second logical block that was written (as shown in FIG. 4).
In other words, once the data to be written has been stored as a log in a first logical block of a super block of the solid state drive, the information in the entry corresponding to that first logical block is updated at the same time.
Preferably, the second logical block and the first logical block have the same size, for example both 1 MB.
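As a rough sketch of step S205, the following code splits a log into second logical blocks, places each one in a free first logical block, and records `(log id, block id)` in the matching entry. For brevity it uses a single flat entry table and ignores which super block belongs to which group; `FREE`, `append_log`, and the 4-byte block size are assumptions of the sketch.

```python
FREE = None   # marker for an entry whose first logical block is unused


def append_log(entry_table, storage, log_id, data, block_size):
    """Write `data` as second logical blocks and update the entry of each block used."""
    for block_id, offset in enumerate(range(0, len(data), block_size)):
        index = entry_table.index(FREE)                  # find a free first logical block
        storage[index] = data[offset:offset + block_size]
        entry_table[index] = (log_id, block_id)          # entry now names the log and its block


entry_table = [FREE] * 6
storage = [b""] * 6
entry_table[1] = (7, 0)   # pretend block 1 already holds block 0 of log 7

append_log(entry_table, storage, log_id=9, data=b"aaaabbbbcc", block_size=4)
```

Log 9 ends up in first logical blocks 0, 2, and 3, skipping the block already occupied by log 7.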
Preferably, the step of writing the data to be written into the corresponding log further includes: using the log identifier (log id) and the logical block identifier (block id) written into the entry as a hash key, and obtaining a hash value from the hash key; obtaining the index value (index) of a hash bucket from the hash value; detecting whether the hash bucket corresponding to the index is empty; if the hash bucket is empty, saving the subscript of the entry in the hash bucket corresponding to the index; and if the hash bucket is not empty, saving the subscript of the entry in the corresponding hash link (hashlink).
It should be emphasized that if the hash bucket is found not to be empty, a hash collision has occurred. In this case, the value stored in the hash bucket, that is, the subscript of an entry, is used to find the hash link with the same subscript (the subscripts of the hash links correspond one to one with the subscripts of the entries), and the subscript of the newest entry is stored in that hash link. If the hash link with that subscript also already holds a value, there is a further collision; the value it holds, again an entry subscript, is then used to find the next hash link, and so on until no collision remains.
For example, as shown in FIG. 5, the hash bucket is not empty and holds the value 2, so the hash link with subscript 2 is examined and found to hold the value 4; the hash link with subscript 4 is examined and found to hold the value 6; the hash link with subscript 6 is examined and found to be empty, so the subscript of the entry to be saved can be stored there.
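The chained insertion described above, including the FIG. 5 example, can be sketched as follows. The sentinel `EMPTY`, the function names, and the use of Python's built-in `hash` are assumptions of this sketch; the text does not specify a hash function.

```python
EMPTY = -1   # sentinel for an empty hash bucket or hash link slot


def bucket_index(log_id, block_id, num_buckets):
    """Hash key -> hash value -> bucket index (hash function chosen for illustration)."""
    return hash((log_id, block_id)) % num_buckets


def insert(buckets, hashlink, bucket, entry_index):
    """Record `entry_index` in `bucket`, following the hash links on a collision."""
    if buckets[bucket] == EMPTY:
        buckets[bucket] = entry_index        # no collision: store in the bucket itself
        return
    cursor = buckets[bucket]                 # collision: walk the chain of entry subscripts
    while hashlink[cursor] != EMPTY:
        cursor = hashlink[cursor]
    hashlink[cursor] = entry_index           # store at the first empty hash link


buckets = [EMPTY] * 4
hashlink = [EMPTY] * 10                      # one slot per entry in the entry table

# Reproduce the FIG. 5 state: bucket -> 2 -> 4 -> 6, then add entry 9.
for entry_index in (2, 4, 6):
    insert(buckets, hashlink, 0, entry_index)
insert(buckets, hashlink, 0, 9)
```

Because the hash links are indexed by entry subscript, the chain needs no extra node allocation: each entry has exactly one "next" slot.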
Step S206: When the condition for performing a merge operation is met, perform the merge operation on the logs in the solid state drive.
In step S206, whether the condition for performing the merge operation is met is determined as follows: obtain the number of super blocks in the solid state drive that are in an idle state; obtain the access intensity of the solid state drive; and detect whether the number of idle super blocks and the access intensity meet a predetermined condition. When they do, the condition for performing the merge operation is met. The access intensity of the solid state drive is its workload, which can be understood as the number of accesses to the solid state drive within a window time, that is, per unit time.
Specifically, the merge operation is performed when the number of idle super blocks is less than a first predetermined threshold and the access intensity of the solid state drive is less than a second predetermined threshold. In other words, the merge operation is performed when few idle super blocks remain and the access intensity is low, which avoids interfering with the host's access to the solid state drive. Conversely, when the number of idle super blocks is greater than or equal to the first predetermined threshold, or the access intensity of the solid state drive is greater than or equal to the second predetermined threshold, the merge operation is not performed, so as to preserve the quality of the host's access to the solid state drive.
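The trigger condition amounts to a conjunction of two comparisons. In the sketch below the function and parameter names are hypothetical, and the threshold values in the usage lines are arbitrary examples rather than values from the application.

```python
def should_merge(idle_superblocks, access_intensity, first_threshold, second_threshold):
    """Merge only when idle super blocks are scarce AND the drive is lightly loaded."""
    return idle_superblocks < first_threshold and access_intensity < second_threshold


# Hypothetical thresholds: fewer than 8 idle super blocks, fewer than 1000 accesses per window.
quiet_and_low = should_merge(3, 200, first_threshold=8, second_threshold=1000)
busy_and_low = should_merge(3, 5000, first_threshold=8, second_threshold=1000)
quiet_and_plenty = should_merge(20, 200, first_threshold=8, second_threshold=1000)
```

Only the first case triggers a merge; in the other two, either the host is busy or there is no shortage of idle super blocks.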
The step of performing a merge operation on the logs in the solid state drive includes: obtaining the number of logs stored in each super block and the amount of data corresponding to each log, which can be done by traversing the entries; obtaining the amount of valid data of each super block from the number of logs and the amount of data corresponding to each log; and merging the super block with the smallest amount of valid data. In other words, based on the number of idle super blocks inside the solid state drive and the current access intensity of the solid state drive, the LSM database can choose a suitable time to merge the super block that holds the least valid data.
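One possible way to select the super block to merge by traversing the entry table is sketched below. It assumes that every occupied first logical block counts as one full block of valid data and that super blocks holding no data are idle and therefore not candidates; `pick_victim` and its parameters are hypothetical names.

```python
from collections import defaultdict


def pick_victim(entry_table, blocks_per_superblock, block_bytes):
    """Return (super block with least valid data, valid bytes per super block)."""
    # super block number -> log id -> bytes held by that log in the super block
    log_bytes = defaultdict(lambda: defaultdict(int))
    for index, entry in enumerate(entry_table):
        if entry is not None:
            log_id, _block_id = entry
            log_bytes[index // blocks_per_superblock][log_id] += block_bytes
    valid = {sb: sum(logs.values()) for sb, logs in log_bytes.items()}
    return min(valid, key=valid.get), valid


# Two super blocks of four first logical blocks each; entries are (log id, block id).
entry_table = [(1, 0), (1, 1), (2, 0), None,
               (3, 0), None, None, None]
victim, valid = pick_victim(entry_table, blocks_per_superblock=4, block_bytes=1)
```

Choosing the super block with the least valid data minimizes how much has to be copied before the super block can be released.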
When a super block is merged, the data in the super block has to be read out. The read proceeds as follows: obtain the corresponding log identifier (log id) and logical block identifier (block id) from the logical block address (lba) of the read IO; use the log id and block id as the hash key to compute the hash value; generate the index of the hash bucket from the hash value; obtain the subscript of an entry from the hash bucket at that index; and determine whether the log id and block id stored in that entry are equal to the log id and block id of this read IO. If they are equal, compute the identifier of the corresponding super block and the logical block identifier of the first logical block from the subscript of the entry, and read the data in that first logical block. If they are not equal, look up the entry subscript stored in the hash link and compare again, repeating until a match is found.
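The lookup can be sketched as a walk along the hash chain. The sketch takes the bucket index as an argument instead of computing it, does not model the translation from lba to `(log id, block id)`, and returns `None` when the chain is exhausted, which is a behaviour added here for safety rather than one described in the text. All names are hypothetical.

```python
EMPTY = -1   # sentinel for an empty hash bucket or hash link slot


def lookup(buckets, hashlink, entry_table, bucket, log_id, block_id, blocks_per_superblock):
    """Return (super block number, first logical block number) for a read, or None."""
    cursor = buckets[bucket]
    while cursor != EMPTY:
        if entry_table[cursor] == (log_id, block_id):
            # The entry subscript alone identifies the physical location.
            return divmod(cursor, blocks_per_superblock)
        cursor = hashlink[cursor]            # mismatch: follow the hash link and compare again
    return None


# Entries 2, 4, and 6 all collided into bucket 0 and are chained 2 -> 4 -> 6.
entry_table = [None, None, (5, 0), None, (5, 1), None, (8, 0), None]
buckets = [2, EMPTY]
hashlink = [EMPTY, EMPTY, 4, EMPTY, 6, EMPTY, EMPTY, EMPTY]

found = lookup(buckets, hashlink, entry_table, 0, log_id=8, block_id=0, blocks_per_superblock=4)
missing = lookup(buckets, hashlink, entry_table, 0, log_id=9, block_id=9, blocks_per_superblock=4)
```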
Step S207: Delete the merged logs and mark the super block in which the deleted logs were located as being in a data-invalid state.
In step S207, the logs in the merged super block are deleted and the super block is marked as data-invalid, which releases the super block and thereby accomplishes the garbage processing.
Refer to FIG. 6, which is a schematic structural diagram of a solid state drive according to an embodiment of the present application. As shown in FIG. 6, the solid state drive 10 includes a processor 11 and a storage controller 12 coupled to the processor 11.
The storage controller 12 stores program instructions for implementing the LSM database-based garbage processing method described in any of the above embodiments.
The processor 11 is configured to execute the program instructions stored in the storage controller 12 to perform garbage processing on the solid state drive.
The processor 11 may also be referred to as a CPU (Central Processing Unit). The processor 11 may be an integrated circuit chip with signal processing capability. The processor 11 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor or any conventional processor.
请一并参考图7,图7是LSM数据库与图6所示的固态硬盘的链接示意图。如图7所示,LSM数据库20通过SPDK30与固态硬盘进10行交互,其中,SPDK30为存储性能开发工具包。Please refer to FIG. 7 together, which is a schematic diagram of the link between the LSM database and the solid state drive shown in FIG. 6. As shown in FIG. 7, the LSM database 20 interacts with the solid-state hard disk through SPDK30 for 10 lines, where SPDK30 is a storage performance development kit.
Specifically, all interfaces provided by the solid state drive 10 are carried over the NVMe protocol; that is, custom NVMe commands are defined on top of the NVMe protocol to implement the corresponding interfaces. To provide higher performance, the LSM database 20 uses SPDK 30 to send the corresponding NVMe commands directly to the solid state drive 10. The NVMe commands include, among others, create log (CreatLog), append to log (AppendLog), read log (ReadLog), and delete log (DeleteLog).
From another perspective, the solid state drive 10 is a solid state drive that exposes its storage in the form of logs, and the interaction between the LSM database 20 and the solid state drive 10 is realized by adding special NVMe commands.
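The log-style interface can be illustrated with a small in-memory model. This is a sketch under stated assumptions: the class `LogDevice`, its method signatures, and the returned offsets are hypothetical, and the model does not use SPDK or issue real NVMe commands. Only the four command names (CreatLog, AppendLog, ReadLog, DeleteLog) come from the description.

```python
from typing import Dict


class LogDevice:
    """In-memory stand-in for a drive that exposes logs instead of block addresses."""

    def __init__(self) -> None:
        self._logs: Dict[int, bytearray] = {}
        self._next_id = 1

    def creat_log(self) -> int:
        """Models CreatLog: allocate an empty log and return its identifier."""
        log_id = self._next_id
        self._next_id += 1
        self._logs[log_id] = bytearray()
        return log_id

    def append_log(self, log_id: int, data: bytes) -> int:
        """Models AppendLog: append data and return the offset where it was written."""
        log = self._logs[log_id]
        offset = len(log)
        log.extend(data)
        return offset

    def read_log(self, log_id: int, offset: int, length: int) -> bytes:
        """Models ReadLog: read `length` bytes starting at `offset`."""
        return bytes(self._logs[log_id][offset:offset + length])

    def delete_log(self, log_id: int) -> None:
        """Models DeleteLog: drop the log and everything stored in it."""
        del self._logs[log_id]

    def exists(self, log_id: int) -> bool:
        return log_id in self._logs


device = LogDevice()
sst = device.creat_log()                           # one log per sstfile or logfile
first_offset = device.append_log(sst, b"key1=val1;")
second_offset = device.append_log(sst, b"key2=val2;")
payload = device.read_log(sst, second_offset, 10)
device.delete_log(sst)
```

Because the host addresses data by log identifier and offset, deleting a log tells the drive exactly which data has become invalid, which is the information a conventional block interface does not convey.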
Refer to FIG. 8, which is a schematic structural diagram of a storage device according to an embodiment of the present application. The storage device of this embodiment stores a program file 21 capable of implementing all of the above methods. The program file 21 may be stored in the storage device in the form of a software product and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage device includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc, or a terminal device such as a computer, a server, a mobile phone, or a tablet.
The beneficial effects of the present application are as follows. The LSM database-based garbage processing method, solid state drive, and storage device of the present application create, in the solid state drive, a log corresponding to the data to be written; write the data to be written into the corresponding log; perform a merge operation on the logs in the solid state drive when the condition for performing the merge operation is satisfied; and delete the merged logs and mark the super block in which the deleted logs are located as data-invalid. In this way, the present application avoids the situation in the prior art in which the LSM database engine and the solid state drive each perform garbage processing on the same data. Further, the present application can flexibly perform garbage processing on the LSM database data stored in the solid state drive and reduce write amplification.
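The merge trigger and the choice of which super block to merge can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names, the threshold values `min_free` and `max_intensity`, and the assumption that the condition combines "few idle super blocks" with "low access intensity" are all hypothetical, since the application only refers to a predetermined condition. Selecting the super block with the smallest amount of valid data follows the method as claimed.

```python
from typing import Dict, List


def should_merge(free_super_blocks: int, access_intensity: float,
                 min_free: int = 4, max_intensity: float = 0.5) -> bool:
    """Assumed form of the predetermined condition: merge only when idle
    super blocks are scarce and the drive is not busy serving host I/O."""
    return free_super_blocks < min_free and access_intensity < max_intensity


def valid_data_amount(log_sizes: List[int]) -> int:
    """Valid data of one super block, derived from its logs and their sizes."""
    return sum(log_sizes)


def pick_victim(blocks: Dict[int, List[int]]) -> int:
    """Return the id of the super block holding the least valid data.

    `blocks` maps super block id -> sizes (in bytes) of the logs it stores.
    Raises ValueError if `blocks` is empty.
    """
    return min(blocks, key=lambda block_id: valid_data_amount(blocks[block_id]))


blocks = {
    0: [4096, 8192],       # 12288 bytes still valid
    1: [1024],             # 1024 bytes still valid
    2: [2048, 2048, 512],  # 4608 bytes still valid
}
victim = pick_victim(blocks)
```

Choosing the super block with the least valid data minimizes the amount that must be rewritten before the block can be released, which is one way the approach can reduce write amplification.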
In the several embodiments provided in the present application, it should be understood that the disclosed system, device, and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, devices, or units, and may be electrical, mechanical, or of another form.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The above are only embodiments of the present application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present application.

Claims (10)

  1. A garbage processing method based on an LSM database, characterized in that the method comprises:
    creating, in a solid state drive, a log corresponding to data to be written, wherein the data to be written is an sstfile file or a logfile file in the LSM database;
    writing the data to be written into the corresponding log;
    performing a merge operation on the logs in the solid state drive when a condition for performing the merge operation is satisfied;
    deleting the merged logs and marking the super block in which the deleted logs are located as data-invalid, wherein the super block is a physical address space allocated in the solid state drive for storing the logs.
  2. The method according to claim 1, characterized in that the method further comprises:
    obtaining the number of super blocks in an idle state in the solid state drive;
    obtaining the access intensity of the solid state drive;
    detecting whether the number of super blocks in the idle state and the access intensity satisfy a predetermined condition;
    wherein, when the detection result is that the number of super blocks in the idle state and the access intensity satisfy the predetermined condition, the condition for performing the merge operation is satisfied.
  3. The method according to claim 1 or 2, characterized in that the step of performing a merge operation on the logs in the solid state drive comprises:
    obtaining the number of logs stored in each super block and the amount of data corresponding to each log;
    obtaining the amount of valid data corresponding to each super block according to the number of logs and the amount of data corresponding to each log;
    merging the super block with the smallest amount of valid data.
  4. The method according to claim 1, characterized in that the step of creating, in the solid state drive, a log corresponding to data to be written comprises:
    obtaining a group identifier corresponding to the data to be written, wherein each group identifier corresponds to one super block, each super block includes a plurality of first logical blocks of a first predetermined size, all the super blocks correspond to one entry parameter list, and each first logical block in each super block corresponds to one entry parameter in the entry parameter list;
    dynamically allocating, in the super block corresponding to the group identifier, a free first logical block for storing the log.
  5. The method according to claim 4, characterized in that the step of writing the data to be written into the corresponding log comprises:
    writing the data to be written into the log and dividing the log into a plurality of second logical blocks of a second predetermined size;
    looking up the entry parameter corresponding to a free first logical block in the super block;
    writing the content of the second logical block into the corresponding free first logical block;
    modifying the entry parameter corresponding to the first logical block for which the write operation has been completed to the log identifier of the log and the logical block identifier of the written second logical block.
  6. The method according to claim 5, characterized in that the step of writing the data to be written into the corresponding log further comprises:
    using the log identifier and the logical block identifier written into the entry parameter as a hash key, and obtaining a hash value according to the hash key;
    obtaining an index value of a hash bucket according to the hash value;
    detecting whether the hash bucket corresponding to the index value is empty;
    if the detection result is that the hash bucket is empty, saving the subscript of the entry parameter in the hash bucket corresponding to the index value;
    if the detection result is that the hash bucket is not empty, saving the subscript of the entry parameter in the corresponding hash chain.
  7. The method according to claim 1, characterized in that, before the step of creating, in the solid state drive, a log corresponding to data to be written, the method further comprises:
    obtaining the group identifiers in the LSM database;
    creating, in the solid state drive, corresponding super blocks for the groups corresponding to different group identifiers, wherein the sstfile files of the same level in the LSM database belong to the same group identifier.
  8. The method according to claim 7, characterized in that the method further comprises:
    dividing each super block into a plurality of first logical blocks of a first predetermined size;
    creating an entry parameter list corresponding to all the super blocks according to the total number of first logical blocks in the super blocks.
  9. A solid state drive, characterized in that the solid state drive comprises a processor and a storage controller coupled to the processor, wherein:
    the storage controller stores program instructions for implementing the LSM database-based garbage processing method according to any one of claims 1-8;
    the processor is configured to execute the program instructions stored in the storage controller to perform garbage processing on the solid state drive;
    wherein the LSM database interacts with the solid state drive through SPDK.
  10. A storage device, characterized in that it stores a program file capable of implementing the method according to any one of claims 1-8.
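For illustration only, and without limiting the claims, the entry parameter list and hash-bucket indexing recited in claims 4 to 6 might be modeled as below. The class `EntryIndex`, its fields, and the use of Python's built-in `hash` are assumptions made for this sketch. As a simplification, free slots are searched across the whole entry list rather than within the super block of a particular group.

```python
from typing import List, Optional, Tuple

Entry = Optional[Tuple[int, int]]  # (log id, second-logical-block id), or None if free


class EntryIndex:
    def __init__(self, num_entries: int, num_buckets: int) -> None:
        self.entries: List[Entry] = [None] * num_entries         # one per first logical block
        self.buckets: List[Optional[int]] = [None] * num_buckets  # head subscript per bucket
        self.chain: List[Optional[int]] = [None] * num_entries    # next subscript in the chain

    def _bucket(self, log_id: int, block_id: int) -> int:
        # Hash key -> hash value -> bucket index value.
        return hash((log_id, block_id)) % len(self.buckets)

    def record_write(self, log_id: int, block_id: int) -> int:
        """Claim a free entry for a written block and index it; returns its subscript."""
        slot = self.entries.index(None)  # raises ValueError when no entry is free
        self.entries[slot] = (log_id, block_id)
        bucket = self._bucket(log_id, block_id)
        if self.buckets[bucket] is None:
            self.buckets[bucket] = slot  # empty bucket: store the subscript directly
        else:
            cursor = self.buckets[bucket]  # occupied bucket: append to its hash chain
            while self.chain[cursor] is not None:
                cursor = self.chain[cursor]
            self.chain[cursor] = slot
        return slot

    def lookup(self, log_id: int, block_id: int) -> Optional[int]:
        """Return the entry subscript for (log_id, block_id), or None if absent."""
        cursor = self.buckets[self._bucket(log_id, block_id)]
        while cursor is not None:
            if self.entries[cursor] == (log_id, block_id):
                return cursor
            cursor = self.chain[cursor]
        return None


index = EntryIndex(num_entries=4, num_buckets=2)
slots = [index.record_write(log_id=1, block_id=n) for n in range(3)]
found = index.lookup(1, 2)
missing = index.lookup(9, 9)
```

With three writes and only two buckets, at least one bucket must hold a chain, so the example exercises both the empty-bucket and the non-empty-bucket branches.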
PCT/CN2020/081281 2019-03-28 2020-03-26 Method for processing garbage based on lsm database, solid state hard disk, and storage apparatus WO2020192710A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910243592.2A CN110007860A (en) 2019-03-28 2019-03-28 Method, solid state hard disk and the storage device of garbage disposal based on LSM database
CN201910243592.2 2019-03-28

Publications (1)

Publication Number Publication Date
WO2020192710A1 true WO2020192710A1 (en) 2020-10-01

Family

ID=67168637

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/081281 WO2020192710A1 (en) 2019-03-28 2020-03-26 Method for processing garbage based on lsm database, solid state hard disk, and storage apparatus

Country Status (2)

Country Link
CN (1) CN110007860A (en)
WO (1) WO2020192710A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112596682A (en) * 2020-12-28 2021-04-02 郝东东 Storage device and storage method for block chain

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110007860A (en) * 2019-03-28 2019-07-12 深圳大普微电子科技有限公司 Method, solid state hard disk and the storage device of garbage disposal based on LSM database
CN110531927B (en) * 2019-08-06 2023-05-09 深圳大普微电子科技有限公司 Garbage collection method based on block classification and nonvolatile storage device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8965849B1 (en) * 2012-08-06 2015-02-24 Amazon Technologies, Inc. Static sorted index replication
CN106708427A (en) * 2016-11-17 2017-05-24 华中科技大学 Storage method suitable for key value pair data
CN108182154A (en) * 2017-12-22 2018-06-19 深圳大普微电子科技有限公司 A kind of reading/writing method and solid state disk of the journal file based on solid state disk
CN108733306A (en) * 2017-04-14 2018-11-02 华为技术有限公司 A kind of Piece file mergence method and device
CN108804019A (en) * 2017-04-27 2018-11-13 华为技术有限公司 A kind of date storage method and device
CN110007860A (en) * 2019-03-28 2019-07-12 深圳大普微电子科技有限公司 Method, solid state hard disk and the storage device of garbage disposal based on LSM database

Also Published As

Publication number Publication date
CN110007860A (en) 2019-07-12


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20777272

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 25.02.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20777272

Country of ref document: EP

Kind code of ref document: A1