CN112597070B - Object recovery method and device - Google Patents

Object recovery method and device

Info

Publication number
CN112597070B
Authority
CN
China
Prior art keywords
metadata
row object
bitmap
garbage
row
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011277460.0A
Other languages
Chinese (zh)
Other versions
CN112597070A (en
Inventor
何孝金
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Big Data Technologies Co Ltd
Original Assignee
New H3C Big Data Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Big Data Technologies Co Ltd filed Critical New H3C Big Data Technologies Co Ltd
Priority to CN202011277460.0A priority Critical patent/CN112597070B/en
Publication of CN112597070A publication Critical patent/CN112597070A/en
Application granted granted Critical
Publication of CN112597070B publication Critical patent/CN112597070B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023 Free address space management
    • G06F12/0253 Garbage collection, i.e. reclamation of unreferenced memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Abstract

The application provides an object recovery method and device. The method comprises: while performing a merge (compact) operation on the metadata generated by the current layer of a database, judging whether the currently read metadata is garbage data; if it is garbage data, determining the redirect-on-write (ROW) object corresponding to the metadata, and marking as garbage the position corresponding to the metadata in the bitmap of the determined ROW object; updating the garbage amount of the determined ROW object based on the size of the metadata; acquiring the garbage amount of each ROW object, and screening out, from the ROW objects, a target ROW object whose garbage amount meets the recovery condition; acquiring the bitmap of the screened target ROW object; judging whether the bitmap of the target ROW object has unmarked bits; and, when there are no unmarked bits, releasing the target ROW object. Fast reclamation of ROW objects is thereby achieved.

Description

Object recovery method and device
Technical Field
The present application relates to the field of storage technologies, and in particular, to an object recovery method and apparatus.
Background
At present, solid state disks (SSDs), a high-performance storage medium, are in large-scale commercial use. In the field of distributed storage, full flash memory distributed storage systems built entirely on SSDs are also continuously being introduced. In a distributed full flash system, data is written in redirect-on-write (ROW) mode, which better exploits the performance advantages of SSDs and better supports features such as deduplication and compression. However, ROW is a completely new write mode that must work together with garbage collection (GC): only when GC releases space in time can ROW writes be continuously supported. In other words, improving GC performance also greatly improves ROW write performance. To improve GC performance, the object with the largest amount of garbage needs to be accurately identified, so that the valid data moved in each collection is minimized.
In the prior art, objects that store garbage are identified at write time: when data is written, the metadata of the data is read out, and the original object in which the data was stored is found and marked as garbage, so that GC can decide whether to reclaim the object according to its garbage amount. With this approach to object recovery, the metadata is mainly stored using RocksDB or a similar variant. Under this technology, metadata modifications are append-only writes, and write amplification of the metadata is reduced by aggregating metadata across multiple storage levels. In this scenario, reading the original metadata before every metadata modification incurs a large read overhead and severely affects object recovery performance.
Therefore, how to reclaim objects quickly, and thereby improve the overall performance of the distributed full flash memory system, is a technical problem worth considering.
Disclosure of Invention
In view of the above, the present application provides an object recovery method and apparatus for rapidly recovering an object, so as to improve the overall performance of a distributed full flash memory system.
Specifically, the method is realized through the following technical scheme:
according to a first aspect of the present application, there is provided an object recovery method comprising:
judging whether the currently read metadata is garbage data in the process of performing a merge (compact) operation on the metadata generated by the current layer of the database;
if it is garbage data, determining the redirect-on-write (ROW) object corresponding to the metadata, and performing garbage marking processing on the position, corresponding to the metadata, in the bitmap of the determined ROW object; updating the garbage amount of the determined ROW object based on the size of the metadata;
acquiring the garbage amount of each ROW object, and screening out target ROW objects with the garbage amount meeting the recovery conditions from each ROW object;
aiming at the screened target ROW object, acquiring a bitmap of the target ROW object;
judging whether the bitmap of the target ROW object has unmarked bits or not;
when there are no unmarked bits, the target ROW object is released.
According to a second aspect of the present application, there is provided an object recovery apparatus comprising:
a first judging module, configured to judge whether the currently read metadata is garbage data in the process of performing a merge (compact) operation on the metadata generated by the current layer of the database;
a determining module, configured to determine, if the judgment result of the first judging module is garbage data, the redirect-on-write (ROW) object corresponding to the metadata, and to perform garbage marking processing on the position, corresponding to the metadata, in the bitmap of the determined ROW object;
a statistical module, configured to update the determined amount of garbage of the ROW object based on the size of the metadata;
the first acquisition module is used for acquiring the garbage amount of each ROW object and screening target ROW objects with the garbage amount meeting the recovery condition from each ROW object;
the second acquisition module is used for acquiring a bitmap of the target ROW object aiming at the screened target ROW object;
a second judging module, configured to judge whether an unmarked bit exists in a bitmap of the target ROW object;
and the releasing module is used for releasing the target ROW object when the judgment result of the second judging module is that the unmarked bit does not exist.
According to a third aspect of the present application, there is provided an electronic device comprising a processor and a machine-readable storage medium, the machine-readable storage medium storing a computer program executable by the processor, the processor being caused by the computer program to perform the method provided by the first aspect of the embodiments of the present application.
According to a fourth aspect of the present application, there is provided a machine-readable storage medium storing a computer program which, when invoked and executed by a processor, causes the processor to perform the method provided by the first aspect of the embodiments of the present application.
The beneficial effects of the embodiment of the application are as follows:
The present application maintains a bitmap for each ROW object and counts the garbage amount of each ROW object, and then recovers objects based on the garbage amount and the bitmap. Because the garbage amount and the bitmap are obtained during the compact operation, which has to read the metadata anyway, the prior-art scheme of reading the metadata again at write time to judge whether it is garbage data is avoided. One read of the metadata is saved, along with the time that read would take, and fast recovery of ROW objects is thereby achieved.
Drawings
Fig. 1 is a schematic flowchart of an object recovery method provided in an embodiment of the present application;
Fig. 2 is a flowchart of an object recovery method provided in an embodiment of the present application;
Fig. 3 is a schematic diagram illustrating the processing logic of the compact operation and the bitmap in RocksDB provided by an embodiment of the present application;
Fig. 4 is a block diagram of an object recovery apparatus according to an embodiment of the present application;
Fig. 5 is a block diagram of an electronic device implementing an object recovery method according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. The following description refers to the accompanying drawings in which the same numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with aspects such as the present application.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the corresponding listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. The word "if" as used herein may be interpreted as "at the time of ...", "when ...", or "in response to a determination", depending on the context.
The object recovery method provided by the present application is described in detail below.
Fig. 1 and fig. 2 are both flowcharts of the object recovery method provided in the present application. The steps of fig. 1 and the steps of fig. 2 are two interdependent flows, and this embodiment does not limit their execution order. In practice, the two flows may be executed by different processes in the device, for example by two processes running in parallel, so the execution order may be determined according to the actual situation. The process of identifying and marking garbage data is described first with reference to fig. 1; the method may include the following steps:
S101, judging whether the currently read metadata is garbage data in the process of performing a merge (compact) operation on the metadata generated by the current layer of the database; if it is garbage data, executing step S102; if it is not garbage data, executing step S103.
In this step, the database may be, but is not limited to, RocksDB. RocksDB stores data in multiple layers of SST (sorted string table) files. When the storage capacity of a layer reaches a certain threshold, a merge (compact) operation is triggered: the newly generated metadata of the current layer and the old metadata of the next layer are merged and deduplicated, and then rewritten to the next layer. Since both the new metadata and the old metadata are read during the compact operation, this embodiment proposes to identify, during the compact operation, whether the metadata currently read by the compact operation is garbage data. If it is, step S102 is executed; if it is not, step S103 is executed.
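The merge-and-deduplicate behaviour described above can be illustrated with a toy model. The sketch below is not RocksDB's implementation; the `compact` function and its `(key, value)` entry layout are assumptions made for illustration. It merges a newer layer into an older one and reports which old entries were superseded, since these are the entries a compact pass sees anyway.

```python
from typing import Dict, List, Tuple

Entry = Tuple[str, str]  # (key, value); hypothetical layout for illustration


def compact(current_layer: List[Entry],
            next_layer: List[Entry]) -> Tuple[List[Entry], List[Entry]]:
    """Toy merge of a newer layer into an older one.

    Returns the merged, deduplicated entries (sorted by key) and the old
    entries that were superseded by a newer value for the same key.
    """
    merged: Dict[str, str] = dict(next_layer)      # start from the older entries
    superseded: List[Entry] = []
    for key, value in current_layer:               # newer entries win
        if key in merged and merged[key] != value:
            superseded.append((key, merged[key]))
        merged[key] = value
    return sorted(merged.items()), superseded


level1 = [("UserObj1_LBA1", "RowObj2_LBA2")]
level2 = [("UserObj1_LBA0", "RowObj1_LBA0"), ("UserObj1_LBA1", "RowObj1_LBA1")]
merged, superseded = compact(level1, level2)
print(merged)
print(superseded)
```

Both layers are read in full during the merge, which is why the garbage check can be attached to this pass without an extra metadata read.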
S102, determining a ROW object corresponding to the metadata during writing, and performing garbage marking processing on a position corresponding to the metadata in a bitmap of the determined ROW object; and updating the determined amount of garbage of the ROW object based on the size of the metadata.
In this step, when it is determined that currently read metadata is garbage data during the compact operation, in order to achieve fast recovery of an ROW object, in this embodiment, a bitmap is maintained for each ROW object, a value of each bit of the bitmap is used to represent whether corresponding metadata belonging to the ROW object is garbage data, that is, one bit corresponds to one metadata, and a value of the bit represents whether corresponding metadata is garbage data.
On this basis, when the metadata read by the compact operation is determined to be garbage data, the ROW object corresponding to the metadata is determined, and the bitmap corresponding to that ROW object is obtained. The bit corresponding to the metadata in the bitmap is then identified, and its value is modified to the value that marks garbage data. For example, if the value 1 is used to indicate that the corresponding metadata is garbage data, every bit of the bitmap may be left unset (0) at initialization; when the metadata is later confirmed to be garbage data, the value of its bit in the bitmap is modified to 1, indicating that the metadata corresponding to that bit is garbage data.
In addition, when the ROW object is collected, it is not that the ROW object is collected immediately when there is garbage data, so this embodiment proposes to count the garbage amount of each ROW object and then collect the ROW object according to the garbage amount of each ROW object, which will be described in detail later. Based on this principle, after the metadata is determined to be garbage data in step S102, the garbage amount of the ROW object corresponding to the metadata may be updated according to the size of the metadata, that is, the updated garbage amount may be a sum of the garbage amount before updating and the size of the metadata.
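As a minimal sketch of the per-object bookkeeping described in step S102, the following Python class keeps one bitmap and one garbage counter per ROW object. The class name `RowObject`, its fields, and the choice to ignore a repeated mark (so a size is never counted twice) are assumptions for illustration and are not stated in the source.

```python
from dataclasses import dataclass


@dataclass
class RowObject:
    """Hypothetical in-memory record for one ROW object."""
    name: str
    slots: int              # number of metadata entries the object can hold
    bitmap: int = 0         # bit i == 1 means entry i is garbage; 0 means unmarked
    garbage_bytes: int = 0  # running garbage amount

    def is_marked(self, bit: int) -> bool:
        return bool(self.bitmap >> bit & 1)

    def mark_garbage(self, bit: int, size: int) -> None:
        """Set the bit for one metadata entry and add its size to the garbage amount."""
        if not 0 <= bit < self.slots:
            raise IndexError(f"bit {bit} out of range")
        if self.is_marked(bit):       # assumption: a repeated mark is a no-op
            return
        self.bitmap |= 1 << bit
        self.garbage_bytes += size    # new amount = old amount + metadata size


obj = RowObject("RowObj1", slots=8)
obj.mark_garbage(1, 4096)
obj.mark_garbage(3, 8192)
obj.mark_garbage(3, 8192)             # ignored: bit 3 is already marked
print(bin(obj.bitmap), obj.garbage_bytes)
```

Storing the bitmap as a Python integer keeps the sketch short; a real system would more likely use a fixed-width bit array persisted alongside the object.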
S103, judging whether the reading of the metadata generated by the current layer is finished; if yes, ending the process; if not, go to step S104.
S104, continuously reading next metadata, taking the read metadata as the currently read metadata, and continuously executing the step S101.
Specifically, when the metadata read in step S101 is not garbage data, it may be judged whether reading of the metadata generated by the current layer is complete. If not, the next metadata is read from the current layer and taken as the currently read metadata of step S101, and step S101 is executed again, until all the metadata generated by the current layer has been read, that is, until the present compact operation is complete. When the next compact operation is performed, the flow shown in fig. 1 is executed again. Because the garbage judgment is made while the compact operation is already reading the metadata, compared with the prior-art scheme of reading the metadata again to judge whether it is garbage data, this embodiment saves one read of the metadata, the time that read would take, and the performance overhead it would cause, which lays a foundation for quickly recovering objects.
It should be noted that the flow shown in fig. 1 may be executed periodically or at regular time, after the garbage amount of each ROW object is counted, the garbage amount of each ROW object is recorded for use in object recovery, and after the bitmap is updated, the bitmap of each ROW object is also synchronously recorded for use in object recovery.
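The loop formed by steps S101 to S104 can be sketched as follows. The `Meta` record, the `mark_during_compact` function, and the dictionaries that hold the bitmaps and garbage amounts are hypothetical names introduced here; the garbage test is passed in as a callback because the source describes it separately.

```python
from typing import Callable, Dict, Iterable, NamedTuple


class Meta(NamedTuple):
    """Hypothetical metadata record read during a compact pass."""
    user_lba: str  # key: address of the user object
    row_obj: str   # ROW object the entry points to
    bit: int       # position of the entry in that ROW object's bitmap
    size: int      # size in bytes


def mark_during_compact(entries: Iterable[Meta],
                        is_garbage: Callable[[Meta], bool],
                        bitmaps: Dict[str, int],
                        garbage: Dict[str, int]) -> None:
    for meta in entries:                      # S104: take the next metadata entry
        if not is_garbage(meta):              # S101: garbage check
            continue                          # S103: carry on until the layer is exhausted
        mask = 1 << meta.bit
        if bitmaps.get(meta.row_obj, 0) & mask:
            continue                          # already marked in an earlier pass
        bitmaps[meta.row_obj] = bitmaps.get(meta.row_obj, 0) | mask       # S102: mark bit
        garbage[meta.row_obj] = garbage.get(meta.row_obj, 0) + meta.size  # S102: add size


# Assumed latest redirection table: UserObj1_LBA1 now lives in RowObj2.
current = {"UserObj1_LBA1": "RowObj2"}
layer = [Meta("UserObj1_LBA0", "RowObj1", 0, 4096),
         Meta("UserObj1_LBA1", "RowObj1", 1, 4096)]
bitmaps: Dict[str, int] = {}
garbage: Dict[str, int] = {}
mark_during_compact(layer,
                    lambda m: current.get(m.user_lba, m.row_obj) != m.row_obj,
                    bitmaps, garbage)
print(bitmaps, garbage)
```

The two dictionaries stand in for the recorded bitmaps and garbage amounts that the recovery flow reads later.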
Referring to the flow shown in fig. 2, the object recycling process will be described next, which includes the following steps:
S201, acquiring the garbage amount of each ROW object, and screening out target ROW objects with the garbage amount meeting the recovery conditions from each ROW object.
In this step, the garbage amount corresponding to each ROW object may be counted based on the process shown in fig. 1, and then the garbage amount of each ROW object at the current time point may be obtained when the process shown in fig. 2 is executed.
In addition, when object recovery is performed, objects with garbage (the garbage amount of an object is not 0) can be processed one by one, but this affects write performance. Therefore, the target ROW object with the garbage amount meeting the recovery condition can be screened from the ROW objects, the recovery condition can be that the garbage amount is the largest, or the garbage amount ranks to the top N, and the like.
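A minimal sketch of the screening in step S201, assuming the garbage amounts are held in a dictionary keyed by ROW object name. The function name `pick_targets` and the `top_n` parameter are hypothetical; `top_n=1` corresponds to the "largest garbage amount" condition and larger values to the "top N" condition.

```python
import heapq
from typing import Dict, List


def pick_targets(garbage: Dict[str, int], top_n: int = 1) -> List[str]:
    """Return up to top_n ROW objects with the largest non-zero garbage amount."""
    candidates = {name: amount for name, amount in garbage.items() if amount > 0}
    return heapq.nlargest(top_n, candidates, key=candidates.__getitem__)


garbage = {"RowObj1": 12288, "RowObj2": 0, "RowObj3": 4096, "RowObj4": 20480}
print(pick_targets(garbage))            # ['RowObj4']
print(pick_targets(garbage, top_n=2))   # ['RowObj4', 'RowObj1']
```

Limiting recovery to a few high-garbage objects keeps the amount of valid data that must be moved small, which is the motivation given above for not processing every object with garbage.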
S202, aiming at the screened target ROW object, acquiring a bitmap of the target ROW object.
In this step, after the target ROW object is screened out, the bitmap of the target ROW object can be found from the stored bitmaps of the respective ROW objects.
S203, judging whether the bitmap of the target ROW object has unmarked bits or not, and if the bitmap has the unmarked bits, executing the step S204; if there are no unmarked bits, step S206 is executed.
In this step, when step S203 is executed, the bitmap of the target ROW object may be traversed bit by bit, if the current bit is not marked, step S204 is executed, and after the determination result of the metadata corresponding to the current bit is determined based on step S204, the next bit in the bitmap is continuously traversed until the last bit is traversed.
Alternatively, it is also possible to first go through all the unmarked bits included in the bitmap, and then perform step S204 for each unmarked bit in turn.
S204, judging whether metadata corresponding to unmarked bits in a bitmap of the target ROW object is junk data; if not, go to step S205; if the data is garbage data, step S206 is executed.
In this step, after step S204 is performed based on all unmarked bits in the bitmap of the target ROW object, if it is determined that metadata corresponding to all unmarked bits are garbage data, it indicates that there is no valid data under the target ROW object, step S206 may be performed, that is, the target ROW object is released; if any bit value exists in the bitmap of the target ROW object to represent that the metadata corresponding to the bit is not junk data, step S205 is executed.
S205, moving the metadata corresponding to the unmarked bits to other ROW objects and writing the metadata corresponding to the unmarked bits into the database at the positions corresponding to the other ROW objects.
In this step, when it is determined that the metadata corresponding to an unmarked bit in the bitmap of the target ROW object is not garbage data, the metadata is still valid. Since the target ROW object has a large garbage amount and meets the recovery condition, the valid metadata may be moved to another ROW object so that it is not emptied along with the target, and the valid metadata is also written into the database at the position corresponding to that other ROW object. When all valid metadata of the target ROW object has been moved to other ROW objects, step S206 may be performed: the target ROW object is released and all metadata under it is emptied, thereby implementing fast recovery of the target ROW object.
When the valid metadata in the target ROW object is moved to another ROW object, the garbage amount of the other ROW object is relatively small, or the other ROW object is a new ROW object, and after the movement, the value of the bit corresponding to the valid metadata in the bitmap corresponding to the other ROW object may be set to 0 to indicate that the metadata corresponding to the bit is valid data.
S206, releasing the target ROW object.
Specifically, when all the bits in the bitmap of the traversal target ROW object in step S203 are marked, it is determined that all the metadata corresponding to the target ROW object are garbage data, and at this time, the garbage data may be deleted, and the target ROW object is directly released.
When the valid data (the metadata corresponding to the unmarked bits and not being the garbage data) corresponding to the target ROW object is moved to other ROW objects based on step S205, indicating that the valid data is completely moved, the metadata corresponding to the moved target ROW object only remains the garbage data, and at this time, the garbage data can be cleared, and the target ROW object is released.
It should be noted that, when a plurality of target ROW objects are screened out in step S201, the flow of steps S202 to S206 is executed for each target ROW object. The execution process is similar for each target ROW object and is not described in detail here.
By implementing the flow shown in fig. 2, the present application maintains a bitmap for each ROW object, counts the garbage amount of each ROW object, and then recovers objects based on the garbage amount and the bitmap. Because the garbage amount and the bitmap are obtained during the compact operation, which has to read the metadata anyway, the prior-art scheme of reading the metadata again at write time to judge whether it is garbage data is avoided. One read of the metadata is saved, along with the time that read would take, and fast recovery of ROW objects is thereby achieved.
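Steps S202 to S206 for a single target ROW object can be sketched as below. The function `reclaim`, its callback parameters, and the assumption that every one of the `slots` bit positions holds a metadata entry are illustrative choices, not details given in the source.

```python
from typing import Callable, Dict, List, Tuple


def reclaim(target: str,
            bitmaps: Dict[str, int],
            slots: int,
            is_garbage: Callable[[str, int], bool],
            relocate: Callable[[str, int], None]) -> List[int]:
    """Release one target ROW object; return the bits whose metadata was relocated."""
    moved: List[int] = []
    bitmap = bitmaps[target]              # S202: fetch the target's bitmap
    for bit in range(slots):              # S203: traverse bit by bit
        if bitmap >> bit & 1:
            continue                      # marked during compact: known garbage
        if is_garbage(target, bit):       # S204: re-check the unmarked entry
            continue
        relocate(target, bit)             # S205: move valid metadata elsewhere
        moved.append(bit)
    del bitmaps[target]                   # S206: release the target ROW object
    return moved


bitmaps = {"RowObj1": 0b0110}             # bits 1 and 2 were marked during compact
stale = {("RowObj1", 0)}                  # bit 0 was overwritten since the last compact
moved_log: List[Tuple[str, int]] = []
moved = reclaim("RowObj1", bitmaps, 4,
                lambda obj, bit: (obj, bit) in stale,
                lambda obj, bit: moved_log.append((obj, bit)))
print(moved, moved_log, bitmaps)
```

When every bit is already marked, the loop never calls either callback and the object is released directly, which matches the fast path described for step S206.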
Optionally, in this embodiment, the metadata corresponding to the unmarked bits in the bitmap of the target ROW object may be obtained according to the following method: determining an address of metadata in a database corresponding to the unmarked bits based on the positions of the unmarked bits in the bitmap; and acquiring the metadata corresponding to the determined address from the database.
Specifically, each metadata corresponding to each ROW object has a unique bit in the bitmap corresponding to the metadata, and each bit corresponds to the address of the metadata in the database, that is, different bits correspond to different addresses (addresses of the metadata). On the basis, after the unmarked bits in the bitmap are searched in the traversal process, the addresses corresponding to the positions of the unmarked bits in the bitmap can be determined, and then the metadata corresponding to the addresses can be searched from the database based on the determined addresses.
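The source states only that each bit position corresponds to one metadata address; it does not say how the mapping is computed. The sketch below assumes, purely for illustration, that a ROW object is divided into fixed-size slots so that the address is the object's base address plus the bit index times the slot size. `SLOT_SIZE` and all function names are hypothetical.

```python
from typing import List

SLOT_SIZE = 4096  # assumed fixed size of one entry, in bytes


def bit_to_address(base_lba: int, bit: int, slot_size: int = SLOT_SIZE) -> int:
    """Address of the entry that a bitmap position refers to."""
    return base_lba + bit * slot_size


def address_to_bit(base_lba: int, address: int, slot_size: int = SLOT_SIZE) -> int:
    """Bitmap position for an entry address inside the same ROW object."""
    offset = address - base_lba
    if offset < 0 or offset % slot_size:
        raise ValueError("address is not aligned to a slot of this ROW object")
    return offset // slot_size


def unmarked_addresses(base_lba: int, bitmap: int, slots: int,
                       slot_size: int = SLOT_SIZE) -> List[int]:
    """Addresses of all entries whose bit is still unmarked."""
    return [bit_to_address(base_lba, b, slot_size)
            for b in range(slots) if not bitmap >> b & 1]


addresses = unmarked_addresses(0x10000, 0b0110, slots=4)
print([hex(a) for a in addresses])
```

With variable-size entries the same lookup would need an explicit index from bit position to address instead of arithmetic.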
Optionally, in this embodiment, whether the metadata is garbage data may be determined according to the following method: determining a user object corresponding to the metadata; judging whether the determined user object is redirected to a new ROW object; and if the data is redirected to a new ROW object, confirming that the metadata is garbage data.
Specifically, when step S101 and step S204 are executed, whether the metadata is garbage data can be determined according to the above-mentioned method, and the general idea is to determine whether the metadata is overwritten, and when the metadata is confirmed to be overwritten, the metadata is confirmed to be garbage data, and when the metadata is not overwritten, the metadata is confirmed to be valid data. For example, in step S101, after performing compact operation and reading current metadata, a user object corresponding to the metadata may be confirmed, and then it is confirmed whether the user object is redirected to a new ROW object, if so, the metadata is confirmed to be garbage data, otherwise, the metadata is confirmed to be valid data. For another example, in step S204, after the metadata is found based on the address corresponding to the unmarked bit, the user object corresponding to the metadata may be confirmed, and then it is confirmed whether the user object is redirected to a new ROW object, if so, the metadata is confirmed as garbage data, otherwise, the metadata is confirmed as valid data.
Specifically, for each metadata, a correspondence between the user object and the ROW object of the metadata may be maintained. The correspondence is stored as a key-value (Key-Value) pair, where the Key may be the address of the user object, denoted UserObj_LBA, and the Value may be the address of the corresponding ROW object, denoted RowObj_LBA. After the metadata is found, the correspondence maintained for it can therefore be looked up to obtain the address UserObj_LBA of its user object, and it can then be queried whether UserObj_LBA has been redirected to another ROW object: if it has, the metadata is determined to be garbage data; if it has not, the metadata is determined to be valid data. Similarly, the address corresponding to an unmarked bit can be determined from the position of that bit in the bitmap. This address is the Value corresponding to the metadata, that is, the address RowObj_LBA to which the metadata is mapped in the ROW object. Once RowObj_LBA is determined, the Key corresponding to it, that is, the address UserObj_LBA of the user object, can be found, and it can be queried whether UserObj_LBA has been redirected to another ROW object: if it has, the metadata is determined to be garbage data; if it has not, the metadata is determined to be valid data.
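The two lookups described above (from Key to Value, and from Value back to Key) can be sketched with plain dictionaries. The `current` table holding the latest UserObj_LBA to RowObj_LBA mapping and the `reverse` index from RowObj_LBA back to the Key that was written there are assumptions; the source does not specify how either lookup is stored.

```python
from typing import Dict

# Assumed latest mapping: Key (UserObj_LBA) -> Value (RowObj_LBA).
current: Dict[str, str] = {
    "UserObj1_LBA0": "RowObj1_LBA0",
    "UserObj1_LBA1": "RowObj2_LBA2",   # overwritten: redirected to a new ROW object
}

# Assumed reverse index: Value (RowObj_LBA) -> Key that was written at that address.
reverse: Dict[str, str] = {
    "RowObj1_LBA0": "UserObj1_LBA0",
    "RowObj1_LBA1": "UserObj1_LBA1",
    "RowObj2_LBA2": "UserObj1_LBA1",
}


def is_garbage(user_lba: str, row_lba: str, mapping: Dict[str, str]) -> bool:
    """Garbage if the user object now points at a different ROW address."""
    latest = mapping.get(user_lba)
    return latest is not None and latest != row_lba


def is_garbage_by_row_lba(row_lba: str,
                          reverse_index: Dict[str, str],
                          mapping: Dict[str, str]) -> bool:
    """Same check when only the ROW address (from a bitmap position) is known."""
    return is_garbage(reverse_index[row_lba], row_lba, mapping)


print(is_garbage("UserObj1_LBA1", "RowObj1_LBA1", current))
print(is_garbage_by_row_lba("RowObj1_LBA0", reverse, current))
```

The first call corresponds to the check in step S101, where the Key is read directly from the metadata; the second corresponds to step S204, where the starting point is the address derived from an unmarked bit.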
To better understand the object recovery method provided in this embodiment, take RocksDB as the database, as shown in fig. 3. When the flow of fig. 1 is executed on the basis of fig. 3, a compact operation is performed on the data generated by the Level1 layer in fig. 3, that is, the data is merged into the Level2 layer. Suppose that, while reading Level2 data, the currently read metadata is the metadata corresponding to the first SST in the Level2 layer in fig. 3, and that the Key in the key-value pair maintained for this metadata is UserObj1_LBA1. If this Key is found to have been redirected to the ROW object corresponding to RowObj2_LBA2, the metadata can be confirmed to be garbage data, and the bitmap of the ROW object corresponding to the metadata is modified: the corresponding ROW object RowObj1 is found based on the Value RowObj1_LBA1 of the metadata, the bitmap of RowObj1 is then found (see fig. 3), the bit corresponding to the metadata in that bitmap is identified based on the size of the metadata, and the value of that bit is set to 1 to indicate that the metadata is garbage data. At the same time, the garbage amount of the ROW object RowObj1 is updated according to the size of the metadata. On this basis, if the garbage amount of RowObj1 is determined to be the largest among the ROW objects, the flow shown in fig. 2 may be executed and the bitmap traversed bit by bit. Bit 0 of the bitmap of RowObj1 in fig. 3 is found not to be marked as garbage data, so the address UserObj1_LBA0 of the corresponding user object is queried based on the address of bit 0 (that is, the address of the metadata in RowObj1, RowObj1_LBA0), and it is judged whether UserObj1_LBA0 has been redirected to another ROW object; if so, the metadata corresponding to bit 0 is determined to be garbage data. Traversal then continues with bit 1; since bit 1 is marked, it is skipped and traversal continues with bit 2, and so on until the last bit of the bitmap of RowObj1 has been traversed. After the traversal is complete, if valid metadata is found under the ROW object RowObj1, the relocation process is executed, and the ROW object RowObj1 is then released. Because the search for garbage data is carried out during the compact operation, one read of the metadata is saved and ROW objects can be recovered quickly.
Based on the same inventive concept, the application also provides an object recovery apparatus corresponding to the object recovery method. For the implementation of the object recovery apparatus, reference may be made to the above description of the object recovery method, which is not repeated here.
Referring to fig. 4, fig. 4 is a diagram of an object recycling apparatus according to an exemplary embodiment of the present application, including:
a first determining module 401, configured to determine whether currently read metadata is garbage data in the process of performing a merge (compact) operation on metadata generated by the current layer of a database;
a determining module 402, configured to determine, if the determination result of the first determining module 401 is garbage data, the redirect-on-write (ROW) object corresponding to the metadata, and to perform garbage marking processing on the position, corresponding to the metadata, in the bitmap of the determined ROW object;
a statistics module 403, configured to update the determined garbage amount of the ROW object based on the size of the metadata;
a first obtaining module 404, configured to obtain the garbage amount of each ROW object, and screen out, from each ROW object, a target ROW object whose garbage amount meets the recovery condition;
a second obtaining module 405, configured to obtain, for the screened target ROW object, a bitmap of the target ROW object;
a second determining module 406, configured to determine whether an unmarked bit exists in the bitmap of the target ROW object;
a releasing module 407, configured to release the target ROW object when the determination result of the second determining module 406 is that there is no unmarked bit.
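The cooperation of the obtaining, second determining and releasing modules can be sketched as follows. This is only an illustration under stated assumptions: the recovery condition is taken to be "largest garbage amount", as in the RocksDB example, and the function names and dictionary layout are hypothetical.

```python
def pick_target(garbage_by_row):
    """Return the name of the ROW object with the largest garbage amount, or None."""
    if not garbage_by_row:
        return None
    return max(garbage_by_row, key=garbage_by_row.get)


def unmarked_bits(bitmap):
    """Positions of the bits that have not been marked as garbage."""
    return [i for i, bit in enumerate(bitmap) if bit == 0]


def try_release(name, bitmaps, garbage_by_row):
    """Release the ROW object only when every bit of its bitmap is marked."""
    if unmarked_bits(bitmaps[name]):
        return False  # some data may still be valid, so it has to be checked first
    del bitmaps[name]
    del garbage_by_row[name]
    return True


garbage_by_row = {"RowObj1": 12288, "RowObj2": 4096}
bitmaps = {"RowObj1": [1, 1, 1], "RowObj2": [1, 0, 0]}
target = pick_target(garbage_by_row)
released = try_release(target, bitmaps, garbage_by_row)
```

A threshold on the garbage amount would be an equally plausible recovery condition; only `pick_target` would change in that case.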
Optionally, the object recovery apparatus provided in this embodiment further includes: a data movement module (not shown), wherein:
the first determining module 401 is further configured to determine whether metadata corresponding to an unmarked bit in a bitmap of the target ROW object is garbage data when the determination result of the second determining module 406 is that the unmarked bit exists;
a data moving module (not shown in the figure), configured to move the metadata corresponding to the unmarked bits to other ROW objects and write the metadata corresponding to the unmarked bits into a database at a position corresponding to the other ROW objects if the determination result of the first determining module 401 is that the data is not junk data;
the releasing module 407 is further configured to release the target ROW object.
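The optional relocation path can be sketched in Python as below. The sketch assumes in-memory dictionaries in place of the database and the ROW objects; `reclaim`, `payloads`, `current_map` and the spare ROW object `RowObj3` are hypothetical names introduced only for illustration.

```python
def reclaim(target, bitmaps, payloads, current_map, spare):
    """Move still-valid records out of `target`, then release it.

    bitmaps:     row name -> list of bits (1 = marked as garbage)
    payloads:    (row name, slot) -> (user_key, data)
    current_map: user_key -> (row name, slot) of the live copy
    spare:       another ROW object that receives the valid records (assumed to exist)
    """
    moved = []
    for slot, bit in enumerate(bitmaps[target]):
        if bit == 1:
            continue  # already marked during the compact operation
        user_key, data = payloads[(target, slot)]
        if current_map.get(user_key) != (target, slot):
            continue  # redirected in the meantime, so it is garbage after all
        new_slot = len(bitmaps[spare])
        bitmaps[spare].append(0)
        payloads[(spare, new_slot)] = (user_key, data)
        current_map[user_key] = (spare, new_slot)  # record the new position
        moved.append(user_key)
    # Release the target: drop its records and its bitmap.
    for slot in range(len(bitmaps[target])):
        payloads.pop((target, slot), None)
    del bitmaps[target]
    return moved


bitmaps = {"RowObj1": [0, 1, 0], "RowObj3": []}
payloads = {
    ("RowObj1", 0): ("UserObj1_LBA0", b"a"),
    ("RowObj1", 1): ("UserObj1_LBA1", b"b"),
    ("RowObj1", 2): ("UserObj1_LBA2", b"c"),
}
current_map = {
    "UserObj1_LBA0": ("RowObj2", 5),
    "UserObj1_LBA1": ("RowObj2", 2),
    "UserObj1_LBA2": ("RowObj1", 2),
}
moved = reclaim("RowObj1", bitmaps, payloads, current_map, "RowObj3")
```

In this run only UserObj1_LBA2 is still valid, so it is the only record copied to the spare ROW object before RowObj1 is released.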
Optionally, the object recovery apparatus provided in this embodiment further includes:
a metadata obtaining module (not shown in the figure) configured to obtain metadata corresponding to unmarked bits in the bitmap of the target ROW object according to the following method: determining an address of metadata in a database corresponding to the unmarked bits based on the positions of the unmarked bits in the bitmap; and acquiring metadata corresponding to the determined address from the database.
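The lookup performed by the metadata obtaining module can be sketched as follows, assuming that each bitmap bit covers one fixed-size slot and that the database can be queried by the address inside the ROW object. `SLOT_SIZE`, `slot_address`, `fetch_unmarked` and the dictionary standing in for the database are hypothetical.

```python
SLOT_SIZE = 4096  # assumed number of bytes covered by one bitmap bit


def slot_address(row_base, bit_index, slot_size=SLOT_SIZE):
    """Address inside the ROW object that the given bitmap bit stands for."""
    return row_base + bit_index * slot_size


def fetch_unmarked(bitmap, row_base, database):
    """Yield (address, metadata) for every unmarked bit that has a database entry."""
    for i, bit in enumerate(bitmap):
        if bit == 0:
            addr = slot_address(row_base, i)
            if addr in database:
                yield addr, database[addr]


# Hypothetical reverse index: address inside RowObj1 -> user object address.
database = {0: "UserObj1_LBA0", 8192: "UserObj1_LBA2"}
pending = list(fetch_unmarked([0, 1, 0], 0, database))
```

Each returned pair can then be passed to the garbage check, as in the traversal of bit 0 in the RocksDB example.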
Optionally, the first determining module 401 is specifically configured to determine a user object corresponding to the metadata; determine whether the determined user object has been redirected to a new ROW object; and, if the user object has been redirected to a new ROW object, confirm that the metadata is garbage data.
Based on the same inventive concept, embodiments of the present application provide an electronic device, which may be a device for managing a database, such as a database server. As shown in fig. 5, the apparatus includes a processor 501 and a machine-readable storage medium 502, the machine-readable storage medium 502 stores a computer program capable of being executed by the processor 501, and the processor 501 is caused by the computer program to execute the object recycling method provided by the embodiment of the present application.
The machine-readable storage medium may include a RAM (Random Access Memory) or a DDR SDRAM (Double Data Rate Synchronous Dynamic Random Access Memory), and may also include an NVM (Non-Volatile Memory), such as at least one disk memory. Alternatively, the machine-readable storage medium may be at least one storage device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application-Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In addition, the embodiment of the application provides a machine-readable storage medium, which stores a computer program, and when the computer program is called and executed by a processor, the computer program causes the processor to execute the object recycling method provided by the embodiment of the application.
For the embodiments of the electronic device and the machine-readable storage medium, since the contents of the related methods are substantially similar to those of the foregoing embodiments of the methods, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the embodiments of the methods.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in a process, method, article, or apparatus that comprises the element.
The implementation process of the functions and actions of each unit/module in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are only schematic, where the units/modules described as separate parts may or may not be physically separate, and the parts displayed as units/modules may or may not be physical units/modules, may be located in one place, or may be distributed on multiple network units/modules. Some or all of the units/modules can be selected according to actual needs to achieve the purpose of the solution of the present application. One of ordinary skill in the art can understand and implement it without inventive effort.
The above description is only a preferred embodiment of the present application and should not be taken as limiting the present application, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (8)

1. An object recycling method, comprising:
judging whether the currently read metadata are junk data or not in the process of executing a merge compact operation on the metadata generated by the current layer of the database;
if the metadata is garbage data, determining the redirect-on-write (ROW) object corresponding to the metadata, and performing garbage marking processing on the position corresponding to the metadata in the bitmap of the determined ROW object; updating the garbage amount of the determined ROW object based on the size of the metadata;
acquiring the garbage amount of each ROW object, and screening out target ROW objects with the garbage amount meeting the recovery condition from each ROW object;
aiming at the screened target ROW object, acquiring a bitmap of the target ROW object;
judging whether the bitmap of the target ROW object has unmarked bits or not;
when the unmarked bit does not exist, releasing the target ROW object;
when the unmarked bits exist, judging whether metadata corresponding to the unmarked bits in the bitmap of the target ROW object is junk data;
if the data is not garbage data, moving the metadata corresponding to the unmarked bits to other ROW objects and writing the metadata corresponding to the unmarked bits into the database at the positions corresponding to the other ROW objects;
and releasing the target ROW object.
2. The method of claim 1, wherein the metadata corresponding to the unmarked bits in the bitmap of the target ROW object is obtained according to the following method:
determining an address of metadata in a database corresponding to the unmarked bits based on the positions of the unmarked bits in the bitmap;
and acquiring the metadata corresponding to the determined address from the database.
3. The method of claim 1, wherein determining whether the metadata is garbage data is performed by:
determining a user object corresponding to the metadata;
judging whether the determined user object is redirected to a new ROW object;
and if the user object is redirected to a new ROW object, confirming that the metadata is garbage data.
4. An object recovery device, comprising:
the first judgment module is used for judging whether the currently read metadata are garbage data or not in the process of executing the merged compact operation on the metadata generated by the current layer of the database;
a determining module, configured to, if the determination result of the first judgment module is that the metadata is garbage data, determine the redirect-on-write (ROW) object corresponding to the metadata, and perform garbage marking processing on the position corresponding to the metadata in the bitmap of the determined ROW object;
a statistics module, configured to update the garbage amount of the determined ROW object based on the size of the metadata;
the first acquisition module is used for acquiring the garbage amount of each ROW object and screening target ROW objects with the garbage amount meeting the recovery condition from each ROW object;
the second acquisition module is used for acquiring a bitmap of the target ROW object aiming at the screened target ROW object;
the second judgment module is used for judging whether the bitmap of the target ROW object has unmarked bits or not;
a releasing module, configured to release the target ROW object when the determination result of the second determining module is that there is no unmarked bit;
the first judging module is further configured to, when the judgment result of the second judging module is that an unmarked bit exists, judge whether metadata corresponding to the unmarked bit in the bitmap of the target ROW object is junk data;
a data moving module, configured to move the metadata corresponding to the unmarked bits to other ROW objects and write the metadata corresponding to the unmarked bits into a database at locations corresponding to the other ROW objects if the determination result of the first determining module is that the data is not junk data;
the releasing module is further configured to release the target ROW object.
5. The apparatus of claim 4, further comprising:
a metadata obtaining module, configured to obtain metadata corresponding to unmarked bits in a bitmap of the target ROW object according to the following method: determining an address of metadata in a database corresponding to the unmarked bits based on the positions of the unmarked bits in the bitmap; and acquiring the metadata corresponding to the determined address from the database.
6. The apparatus of claim 4,
the first judgment module is specifically configured to determine a user object corresponding to the metadata; judge whether the determined user object is redirected to a new ROW object; and if the user object is redirected to a new ROW object, confirm that the metadata is garbage data.
7. An electronic device comprising a processor and a machine-readable storage medium, the machine-readable storage medium storing a computer program executable by the processor, the processor being caused by the computer program to perform the method of any of claims 1-3.
8. A machine readable storage medium, having stored thereon a computer program which, when invoked and executed by a processor, causes the processor to perform the method of any of claims 1-3.
CN202011277460.0A 2020-11-16 2020-11-16 Object recovery method and device Active CN112597070B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011277460.0A CN112597070B (en) 2020-11-16 2020-11-16 Object recovery method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011277460.0A CN112597070B (en) 2020-11-16 2020-11-16 Object recovery method and device

Publications (2)

Publication Number Publication Date
CN112597070A CN112597070A (en) 2021-04-02
CN112597070B true CN112597070B (en) 2022-10-21

Family

ID=75183354

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011277460.0A Active CN112597070B (en) 2020-11-16 2020-11-16 Object recovery method and device

Country Status (1)

Country Link
CN (1) CN112597070B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115202590B (en) * 2022-09-15 2022-12-16 中电云数智科技有限公司 Method and device for processing SSD (solid state disk) redirection data

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101221535A (en) * 2008-01-25 2008-07-16 中兴通讯股份有限公司 Garbage recovery mobile communication terminal of Java virtual machine and recovery method thereof
CN102110146A (en) * 2011-02-16 2011-06-29 清华大学 Key-value storage-based distributed file system metadata management method
CN110226153A (en) * 2016-11-29 2019-09-10 净睿存储股份有限公司 Garbage collection system and process
CN111045956A (en) * 2019-12-22 2020-04-21 北京浪潮数据技术有限公司 Solid state disk garbage recycling method and device based on multi-core CPU

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9921959B2 (en) * 2016-03-11 2018-03-20 Oracle International Corporation Efficient reference classification and quick memory reuse in a system that supports concurrent garbage collection

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Reducing garbage collection overhead of log-structured file systems with GC journaling"; Hyunho Gwak et al.; 《2015 International Symposium on Consumer Electronics (ISCE)》; 20150806; full text *
"A generational recoverable Java garbage collector based on NVM"; 李鹤婷; 《中国优秀硕士学位论文全文数据库 信息科技辑》; 20200615 (No. 06); pp. I137-71 *
Exact reclamation of garbage cells in the Java virtual machine; 丁宇新 et al.; 《计算机学报》; 19991112 (No. 11); full text *

Also Published As

Publication number Publication date
CN112597070A (en) 2021-04-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant