CN116225748A - Data difference log query method, device, equipment and storage medium - Google Patents


Info

Publication number
CN116225748A
Authority
CN
China
Prior art keywords
log
data
difference
active
data block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211606464.8A
Other languages
Chinese (zh)
Inventor
廖孝军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Technologies Co Ltd
Original Assignee
New H3C Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Technologies Co Ltd filed Critical New H3C Technologies Co Ltd
Priority to CN202211606464.8A priority Critical patent/CN116225748A/en
Publication of CN116225748A publication Critical patent/CN116225748A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0709Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793Remedial or corrective actions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/1805Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815Journaling file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24552Database cache management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a data difference log query method, a device, equipment and a storage medium. In addition, whether the large data intervals are different or not is recorded through the hierarchical object, so that data difference inquiry can be accelerated, fault recovery can be accelerated, and the availability and reliability of dual-activity service can be improved.

Description

Data difference log query method, device, equipment and storage medium
Technical Field
The present invention relates to the field of storage computing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for querying a data difference log.
Background
The dual-active feature of a distributed block storage system builds Active-Active read-write access capability on two distributed block storage clusters. Two logical storage volumes with the dual-active feature form a dual-active volume, and the data of the dual-active volume is kept synchronized between the two distributed block storage systems. If either of the two storage systems forming the dual-active volume fails, the services on the failed system can be switched automatically to the other storage cluster and continue to run. Based on the dual-active feature, the recovery point objective (RPO) can be equal to zero and the recovery time objective (RTO) approaches zero, which guarantees the continuity of upper-layer services.
Writing data to a dual-active volume is called dual-active writing. During dual-active writing, a Data Change Log (DCL) is recorded, and when a cluster recovers from a fault, the DCL is used to restore the data of the logical storage volumes (Logical Unit Number, LUN) at both ends to a consistent state. The performance with which the storage system records the DCL directly affects the write performance of the dual-active volume. To support RPO = 0 and RTO = 0 for the dual-active volume, the data difference of the dual-active volume must be acquired quickly and its data restored to a consistent state; the ability to acquire the data difference quickly is particularly important when the amount of differing data is small. How to maximize the recording and query performance of the data difference log DCL is key to improving the performance of the data difference log system.
In a recording method of a distributed storage data difference log, a dual-active metadata pool is used for recording DCL, dual-active metadata block objects are in one-to-one correspondence with data block objects, before dual-active volume writing I/O, the DCL log is recorded into the corresponding DCL object, and after dual-writing is successful (both dual-active volumes successfully write the data block objects into persistent storage), the corresponding DCL object is cleared. After the failed distributed block storage system is restored, the data difference of the dual live volumes is queried, and the data of the logic volumes at two ends are restored to be consistent based on the DCL.
In the prior art, the logs are scattered across the corresponding DCL objects, which meets the DCL read-write performance requirement to a certain extent, but every write I/O incurs at least one additional write (write amplification), so the log overhead is large and the write performance of the dual-active volume suffers. In addition, during fault recovery the differences are acquired by traversing all DCL objects of the dual-active volume one by one. When the capacity of the dual-active volume is large, acquiring the data difference takes a long time even if only a small amount of data was written during the fault, so the read efficiency of the data difference log is low and the fault recovery speed is affected.
Disclosure of Invention
In view of the above, the present invention provides a method, apparatus, device and storage medium for querying a data difference log, which are used for solving the technical problem of low fault recovery performance of a distributed block storage system.
Based on one aspect of the embodiment of the invention, the invention provides a data difference log recording method, which is applied to a distributed block storage system configured with dual activity characteristics, and comprises the following steps:
receiving a dual-active volume data difference log (DCL) record request, and calculating, according to the logical block address (LBA) and the request range in the request, the data block object identifier of each data block object included in the request and the large-block data interval in which the data block object is located;
first, recording the data difference log of the data block object in an active log table in an active log mode and performing log persistence; the active log table is used for caching the data difference logs of data block objects in hot-spot write data regions, and the number of active logs is limited;
if the recording in the active log mode fails, recording the data difference log of the data block object in an independent recording mode and carrying out log persistence;
under the condition of fault single writing, recording a data difference log of the data block object in a log bitmap mode and carrying out log persistence;
persistence of the data difference log in a hierarchical object manner in a data difference log pool DCLPool; and synchronously updating the hierarchical object of the large-block data interval to which the data block object belongs to identify that the large-block data interval to which the data block object belongs has data difference when log persistence is carried out.
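The first step above, deriving data block object identifiers and large-block data intervals from the LBA and request range, might be sketched as follows. This is only an illustration under stated assumptions: the 512-byte sector size is an assumption, the 4 MB block size and 4 GB interval size are the example values given later in the description, and the function name `blocks_for_request` is hypothetical.

```python
SECTOR_SIZE = 512                 # assumption: the LBA counts 512-byte sectors
BLOCK_SIZE = 4 * 1024**2          # example value from the description: 4 MB data blocks
BIG_EXTENT_SIZE = 4 * 1024**3     # example value from the description: 4 GB intervals


def blocks_for_request(lba: int, length_bytes: int) -> list[tuple[int, int]]:
    """Return (data_block_id, big_extent_id) for every block the request touches."""
    if length_bytes <= 0:
        return []
    start = lba * SECTOR_SIZE
    end = start + length_bytes - 1            # last byte covered, inclusive
    first_block = start // BLOCK_SIZE
    last_block = end // BLOCK_SIZE
    blocks_per_extent = BIG_EXTENT_SIZE // BLOCK_SIZE
    return [(b, b // blocks_per_extent) for b in range(first_block, last_block + 1)]


if __name__ == "__main__":
    # A 1 KB write that straddles the boundary between block 0 and block 1.
    print(blocks_for_request(8191, 1024))
```

A request that crosses a block boundary yields one entry per touched data block object, so each block can be logged independently.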
Further, the method for recording the data difference log of the data block object in the activity log table in an activity log mode comprises the following steps:
inquiring whether an activity log table records the activity log of the data block object according to the data block object identification, if so, adding 1 to the reference count of the corresponding activity log, and directly returning to the successful response of the DCL record request; if not, an active log block is allocated for the data block object, if the allocation is successful, the data difference log information of the data block object is recorded in the allocated active log block to generate an active log, and the reference count is increased by 1 and then added into an active log table for log persistence.
Further, the method comprises the step of managing the activity log in the activity log table in the form of a state linked list:
the free linked list is used for managing and maintaining unallocated active log blocks (AL-extension) in an idle state; an active log block that has been allocated and written with data block object information forms an active log;
the commit linked list is used for managing and maintaining active logs that have been added to the active log table but whose log persistence has not completed;
the in-use linked list is used for managing and maintaining active logs whose log persistence has completed, wherein the reference count of an active log in this linked list is not 0, and when the reference count reaches 0, the active log is migrated to the LRU linked list;
the least recently used (LRU) linked list is used for managing and maintaining active logs whose reference count is 0 but which have not yet been retired; the AL-extension of a retired active log is migrated to the free linked list.
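The four state lists and the reference counting can be illustrated with a small in-memory model. This is a sketch, not the patented implementation: the class and method names are hypothetical, persistence is reduced to an explicit `persisted()` call, and two behaviours are assumptions the text does not specify (an active log found on the LRU list is reactivated, and the oldest LRU entry is retired only when no free block is left).

```python
from collections import OrderedDict, deque


class ActiveLogTable:
    """Toy model of the free / commit / in-use / LRU state lists."""

    def __init__(self, capacity: int):
        self.free = deque(range(capacity))   # unallocated active log block slots
        self.commit = {}                     # added to the table, persistence pending
        self.in_use = {}                     # persisted, reference count > 0
        self.lru = OrderedDict()             # reference count == 0, not yet retired

    def record(self, block_id: int) -> bool:
        """Return True if logged in active-log mode, False if the caller must fall back."""
        for table in (self.commit, self.in_use):
            if block_id in table:
                table[block_id]["refs"] += 1          # hit: only bump the reference count
                return True
        if block_id in self.lru:                      # assumption: reactivate an LRU hit
            al = self.lru.pop(block_id)
            al["refs"] = 1
            self.in_use[block_id] = al
            return True
        if not self.free and self.lru:                # assumption: retire on demand
            _, old = self.lru.popitem(last=False)     # least recently used entry
            self.free.append(old["slot"])
        if not self.free:
            return False                              # allocation failed
        self.commit[block_id] = {"slot": self.free.popleft(), "refs": 1}
        return True

    def persisted(self, block_id: int) -> None:
        """Log persistence finished: move from the commit list to the in-use list."""
        self.in_use[block_id] = self.commit.pop(block_id)

    def release(self, block_id: int) -> None:
        """Drop one reference; at zero the active log moves to the LRU list."""
        al = self.in_use[block_id]
        al["refs"] -= 1
        if al["refs"] == 0:
            del self.in_use[block_id]
            self.lru[block_id] = al
```

When `record()` returns `False`, the caller would fall back to the independent recording mode described above.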
Further, the method further comprises a data difference log deletion step:
calculating a data block object identifier and a large block data interval where the data block object is located according to the data difference log deletion request initial logical block address and the request range;
checking whether the data difference log is recorded in an independent recording mode according to the data block object identification, if so, acquiring a data difference log storage position, and deleting the data difference log;
if it is not a data difference log recorded in the independent recording mode, checking whether an activity log of the data block object exists in the activity log table, and if so, reducing its reference count by 1; if not, the data difference log was recorded in the log bitmap mode, and the corresponding data difference log is deleted from the data difference log (DCL) object corresponding to the data block object in the DCLPool based on the log bitmap recorded by the difference bitmap module;
and synchronously updating the hierarchical object of the large block data interval to which the data block object belongs to identify whether the large block data interval to which the data block object belongs has data difference or not when deleting the data difference log.
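The deletion steps above can be sketched as a three-way dispatch followed by the update of the large-block flag. The data structures here are hypothetical stand-ins (a dict for independently recorded logs, a dict of reference counts for the active log table, a set for the log bitmap), and the rule used to decide whether an interval still has differences is an assumption made for illustration.

```python
BLOCKS_PER_EXTENT = 1024   # assumption: 4 GB interval / 4 MB data blocks


def delete_dcl(block_id, lid_records, active_refs, bitmap_blocks, dirty_extents):
    """Delete the DCL of one data block object and refresh its large-block flag."""
    if block_id in lid_records:            # recorded in the independent mode
        del lid_records[block_id]
        mode = "independent"
    elif block_id in active_refs:          # active log exists: drop one reference
        active_refs[block_id] -= 1
        mode = "active"
    else:                                  # otherwise it was recorded via the bitmap
        bitmap_blocks.discard(block_id)
        mode = "bitmap"

    # Assumption: the interval stays flagged while any record in it remains.
    ext = block_id // BLOCKS_PER_EXTENT
    lo, hi = ext * BLOCKS_PER_EXTENT, (ext + 1) * BLOCKS_PER_EXTENT
    remaining = (
        any(lo <= b < hi for b in lid_records)
        or any(lo <= b < hi and refs > 0 for b, refs in active_refs.items())
        or any(lo <= b < hi for b in bitmap_blocks)
    )
    if remaining:
        dirty_extents.add(ext)
    else:
        dirty_extents.discard(ext)
    return mode
```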
Further, the method further comprises a data difference log query step:
calculating a data block object identifier and a large block data interval where the data block object is located according to a starting logical block address and a request range of a data difference log query request;
checking whether the calculated large data interval has a difference in the hierarchical object of the DCLPool, and if the calculated large data interval has no difference, returning that the difference is null;
if the difference exists, calculating a DCL object corresponding to the large data interval, and then inquiring all data difference logs and log bitmap records of the DCL object;
And sequentially checking all the data difference logs existing in the request range, and summarizing and returning the data differences in the request range.
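The benefit of the hierarchical check is that intervals without differences are skipped without reading any DCL object. A minimal sketch follows, assuming a hypothetical in-memory layout in which `dirty_extents` is the set of flagged large-block intervals and `dcl_logs` maps an interval number to the block identifiers logged in it; the names and the 1024 blocks per interval are illustrative assumptions.

```python
def query_differences(start_block, end_block, dirty_extents, dcl_logs,
                      blocks_per_extent=1024):
    """Return the differing block ids within [start_block, end_block]."""
    diffs = []
    first_ext = start_block // blocks_per_extent
    last_ext = end_block // blocks_per_extent
    for ext in range(first_ext, last_ext + 1):
        if ext not in dirty_extents:
            continue                       # no difference flagged: skip the interval
        lo = max(start_block, ext * blocks_per_extent)
        hi = min(end_block, (ext + 1) * blocks_per_extent - 1)
        # Only now read the detailed logs of this interval and filter by range.
        diffs.extend(b for b in sorted(dcl_logs.get(ext, ())) if lo <= b <= hi)
    return diffs


if __name__ == "__main__":
    logs = {1: {1030, 1500, 2047}}
    print(query_differences(0, 1499, {1}, logs))
```

With a large volume and few differences, most iterations of the loop end at the `continue`, which is the effect the description attributes to the hierarchical object.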
Based on another aspect of the embodiment of the invention, the invention further provides a data difference log recording device, which is applied to a distributed block storage system configured with dual activity characteristics. The device may be implemented in software, hardware or a combination of both. When implemented as a software module, the program code of the software module is loaded into a storage medium of the device and executed by a processor reading the program code in the storage medium, thereby implementing the functions of the respective constituent modules in the apparatus. The device comprises:
the activity log module is used for receiving a dual-active volume data difference log (DCL) record request, and calculating, according to the logical block address (LBA) and the request range in the request, the data block object identifier of each data block object included in the request and the large-block data interval in which the data block object is located; first recording the data difference log of the data block object in the active log table in the active log mode and performing log persistence; the active log table is used for caching the data difference logs of data block objects in hot-spot write data regions, and the number of active logs is limited;
The single record log management module is used for recording the data difference log of the data block object in an independent recording mode and carrying out log persistence when recording fails in an active log mode;
the difference bitmap module is used for recording the data difference log of the data block object in a log bitmap mode under the condition of fault single writing and carrying out log persistence;
the large-block interval management module is used for synchronously updating the hierarchical object of the large-block data interval to which the data block object belongs to identify that the large-block data interval to which the data block object belongs has data difference when log persistence is carried out;
the data difference log is persisted in a hierarchical object mode in a data difference log pool DCLPool.
Further, the activity log module includes:
the judging unit is used for inquiring whether the activity log table records the activity log of the data block object according to the data block object identification;
the response unit is used for directly returning the successful response of the DCL record request after adding 1 to the reference count of the corresponding activity log when the activity log of the data block object is queried in the activity log table;
and the allocation unit is used for allocating an active log block for the data block object when the active log of the data block object is not inquired in the active log table, recording the data difference log information of the data block object in the allocated active log block to generate an active log if the allocation is successful, adding the reference count of the active log into the active log table after increasing by 1, and carrying out log persistence.
Further, the activity log module further includes:
the state linked list management unit is used for managing the active logs in the active log table in the form of state linked lists, the state linked lists comprising a free linked list, a commit linked list, an in-use linked list and an LRU linked list;
the free linked list is used for managing and maintaining unallocated active log blocks (AL-extension) in an idle state; an active log block that has been allocated and written with data block object information forms an active log;
the commit linked list is used for managing and maintaining active logs that have been added to the active log table but whose log persistence has not completed;
the in-use linked list is used for managing and maintaining active logs whose log persistence has completed, wherein the reference count of an active log in this linked list is not 0, and when the reference count reaches 0, the active log is migrated to the LRU linked list;
the least recently used (LRU) linked list is used for managing and maintaining active logs whose reference count is 0 but which have not yet been retired; the AL-extension of a retired active log is migrated to the free linked list.
Further, the apparatus further includes a data difference log deletion module, which includes:
the first calculation unit is used for calculating a data block object identifier and the large-block data interval in which the data block object is located according to the starting logical block address and the request range of the data difference log deletion request;
a log deleting unit, configured to check, according to the data block object identifier, whether the log is a data difference log recorded in the independent recording mode, and if so, to acquire the storage location of the data difference log and then delete it; if it is not a data difference log recorded in the independent recording mode, to check whether an activity log of the data block object exists in the activity log table, and if so, to reduce its reference count by 1; if not, the data difference log was recorded in the log bitmap mode, and the corresponding data difference log is deleted from the data difference log (DCL) object corresponding to the data block object in the DCLPool based on the log bitmap recorded by the difference bitmap module;
and the first synchronous updating unit is used for synchronously updating the hierarchical object of the large-block data interval to which the data block object belongs, so as to identify whether that large-block data interval has a data difference, when deleting the data difference log.
Further, the apparatus further comprises a data difference log query module comprising:
the second calculation unit is used for calculating a data block object identifier and a large block data interval where the data block object is located according to the initial logical block address and the request range of the data difference log query request;
the difference judging module is used for checking whether the calculated large data interval has a difference in the hierarchical object of the DCLPool, and returning that the difference is null if no difference exists;
the calculation and query unit is used for, if a difference exists, calculating the DCL object corresponding to the large data interval, and then querying all data difference logs and log bitmap records of the DCL object; and sequentially checking all the data difference logs existing in the request range, and summarizing and returning the data differences in the request range.
It should be noted that the method of the present invention may be performed by a single device, such as a computer or a server. The method of this embodiment can also be applied in a distributed scenario, in which multiple devices cooperate to complete it. In such a distributed scenario, one of the devices may perform only one or more steps of the method of the present invention, and the devices interact with each other to complete the method.
The invention provides a data difference log recording method, device and equipment in which the data difference log DCL is stored in a relatively scattered yet centralized manner, balancing the recording and query performance of the data difference log, so that fault recovery can be accelerated and the availability and reliability of the dual-active service improved. In the invention, the active log is the difference log recorded for a hot-spot data interval, and a record request can return directly once the active log is cached, which reduces the number of times write I/O is written to disk and greatly improves write I/O performance, in particular the performance of sequential small-I/O writes to dual-active volumes. In addition, whether a large data interval has differences is recorded through the hierarchical object, so that data difference query is accelerated, fault recovery is accelerated, and the availability and reliability of the dual-active service are improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description will briefly describe the drawings required to be used in the embodiments of the present invention or the description in the prior art, and it is obvious that the drawings in the following description are only some embodiments described in the present invention, and other drawings may be obtained according to these drawings of the embodiments of the present invention for a person having ordinary skill in the art.
FIG. 1 is a schematic diagram of a distributed block storage system with dual activity features according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating a storage structure for recording data difference logs according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a data structure of a DCLPool record DCL according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating a process of recording a data difference log by an activity log module in a data difference log recording method according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating steps of a method for data discrepancy logging according to an embodiment of the present invention;
FIG. 6 is a flowchart of a distributed block store data difference log query provided by an embodiment of the present invention;
FIG. 7 is a flowchart of a distributed block store data difference log delete according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an electronic device for implementing the data difference log recording method according to an embodiment of the present invention.
Detailed Description
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of embodiments of the invention. As used in this embodiment of the invention, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The term "and/or" as used in this disclosure refers to any or all possible combinations comprising one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in embodiments of the present invention to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, the first information may also be referred to as second information, and similarly, the second information may also be referred to as first information, without departing from the scope of embodiments of the present invention. Furthermore, depending on the context, the word "if" as used herein may be interpreted as "when", "upon" or "in response to a determination".
In order to improve the recording and query performance of the data difference logs of the dual-active volumes of a distributed block storage system, as well as its fault recovery performance, the invention provides a data difference log recording method and a device and equipment for implementing the method. The core idea of the invention is to store the difference log in a centralized yet relatively scattered manner, balancing the recording and query performance of the difference log. The method reduces the number of times I/O is written to disk and greatly improves write I/O performance, in particular the performance of sequential small-I/O writes to dual-active volumes. Managing the data difference log DCL by large data blocks accelerates data difference query and fault recovery, and improves the availability and reliability of the dual-active service.
FIG. 1 is a schematic diagram of a distributed block storage system with the dual-active feature according to an embodiment of the present invention. Distributed block storage system A (storage A for short) and distributed block storage system B (storage B for short) are each provided with a dual-active module. Storage A and storage B form a dual-active distributed block storage system, and the dual-active modules ensure that the data of the dual-active volume formed by logical volume lun0 and logical volume lun1 stays synchronized and consistent. Under normal conditions, both storage A and storage B are in the Active state and can provide storage services to the service host. A write I/O from the service host is sent to the dual-active module of a storage system; the dual-active module first records the DCL and then performs the dual write, writing the I/O to both the local storage pool and the remote storage pool. If both ends are written successfully, the DCL is deleted. If the storage at one end or the link between the storage systems fails, the remaining end carries the service: on a write, the dual-active module records the DCL and performs a single write, and after the fault is recovered, the data difference of the dual-active volume must be calculated and the data then synchronized to a consistent state. To implement the dual-active feature, the distributed block storage system needs to provide record, query and delete interfaces for the data difference log.
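The write path described for FIG. 1 (record the DCL, write both ends, delete the DCL only when both writes succeed) can be sketched as follows. The function and parameter names are hypothetical, the two storage pools are modelled as plain callables that return a success flag, and the DCL is modelled as a set of block identifiers.

```python
def dual_write(block_id, data, write_local, write_remote, dcl):
    """Record the DCL first, then write both ends; keep the DCL if either end fails."""
    dcl.add(block_id)                         # the DCL is recorded before any data write
    ok_local = write_local(block_id, data)
    ok_remote = write_remote(block_id, data)
    if ok_local and ok_remote:
        dcl.discard(block_id)                 # both copies are consistent: delete the DCL
    return ok_local and ok_remote


if __name__ == "__main__":
    local, dcl = {}, set()

    def write_ok(block_id, data):
        local[block_id] = data
        return True

    def write_fail(block_id, data):           # simulates a failed remote end or link
        return False

    dual_write(1, b"a", write_ok, write_ok, dcl)
    dual_write(2, b"b", write_ok, write_fail, dcl)
    print(sorted(dcl))                        # only block 2 remains to be resynchronized
```

The DCL entries left behind after a failure are exactly the blocks that must be compared and resynchronized once the failed end recovers.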
FIG. 2 is a diagram illustrating a storage structure for recording data difference logs according to an embodiment of the present invention. The data pool (DataPool) is used for caching data block objects (data_obj1, data_obj2, …), and the data block objects correspond one-to-one with the data blocks in a dual-active logical volume (a logical volume with the dual-active feature in storage, dual-active volume for short). The data difference log pool (DCLPool) of a distributed block storage cluster (storage cluster for short) stores the data differences of dual-active volumes in the form of hierarchical objects. The data difference log cache module (DCLCache) includes a plurality of modules (ActiveLog, BigExtLog, LidMgr, Bitmap) for recording, querying and deleting DCLs. DCLCache is located in the protocol process.
Fig. 3 is a schematic diagram of the data structure in which DCLPool records DCLs according to an embodiment of the invention. DCLPool records DCLs hierarchically at different granularities by means of hierarchical objects and DCL objects. The first level of DCLPool is the L0 level (e.g., dcl_hdr_lun0); each L0-level object corresponds to a large-block data interval of a first granularity (e.g., 256 GB), and each unit of an L0-level object corresponds to an L1-level object. Each L1-level object corresponds to a large-block data interval of a second granularity (e.g., 4 GB) and includes a number (e.g., 64) of DCL objects. For the sake of difference log query performance, the DCL information of adjacent data blocks (e.g., 4 MB each) is recorded in one DCL object rather than being scattered. For data blocks written in single-write mode, the difference bitmap module (Bitmap) stores the data differences in an omap (a key-value object), and updates require a get-and-set interface.
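With the example sizes above, the position of a byte offset in the hierarchy follows from integer division. The sketch below assumes that the 64 DCL objects partition their 4 GB L1 interval evenly (64 MB each, that is, 16 data blocks of 4 MB); the text gives the counts but not this layout, and the function name `locate` is hypothetical.

```python
GB, MB = 1024**3, 1024**2

L0_SPAN = 256 * GB           # example first-granularity interval
L1_SPAN = 4 * GB             # example second-granularity interval
DCL_OBJS_PER_L1 = 64         # example number of DCL objects per L1 object
BLOCK_SIZE = 4 * MB          # example data block size
DCL_OBJ_SPAN = L1_SPAN // DCL_OBJS_PER_L1   # assumption: even partition, 64 MB each


def locate(offset: int) -> tuple[int, int, int, int]:
    """Map a byte offset to (L0 object, L1 unit in L0, DCL object in L1, block slot)."""
    l0 = offset // L0_SPAN
    l1 = (offset % L0_SPAN) // L1_SPAN
    dcl = (offset % L1_SPAN) // DCL_OBJ_SPAN
    slot = (offset % DCL_OBJ_SPAN) // BLOCK_SIZE
    return l0, l1, dcl, slot


if __name__ == "__main__":
    print(locate(256 * GB + 4 * GB + 64 * MB + 4 * MB))
```

Under these example values an L0 object has 64 units, one per L1 object, so a single L0 object can summarize the difference state of 256 GB.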
The large-block interval management module (BigExtLog) uses bits to identify and record whether data differences exist in the large-block data intervals of the dual-active volume (corresponding to the L0 and L1 levels). If the process has been restarted since the last synchronization started and a bit is 0, the object in the DCLPool is reloaded to check whether the detailed differences of that region need to be acquired.
The single record log management module (LidMgr) is used for separately recording the data difference log of the single drop disc during double-active-volume double-writing. When the allocation of the active log block al_extension fails during double-write of the double-active volume, the module individually records the data difference log in the form of (Lid, [ offset, len ]).
The active log module (active log) is used for recording the data difference log of the hot spot writing area in an active log mode, if the data difference log of the data block object is already recorded in the active log table, the DCL recording request only needs to be directly returned after the reference count of the corresponding active log is increased, so that the DCL can be prevented from being written into the DCLPool for multiple times, and the number of times of landing of the data difference log of the hot spot data area is reduced. The number of AL-extensions can be configured according to the needs of the actual service scenario.
A differential Bitmap module (Bitmap) is used to record in a log Bitmap fashion which data block objects were written to the data differential log in a single write fashion at the time of a storage system failure or a link failure. The log bitmap in this module starts to be swiped for persistence when AL-extension runs out.
FIG. 4 is a schematic diagram illustrating how the active log module records data difference logs in the data difference log recording method according to an embodiment of the present invention. The active log module manages and maintains the activity log table and the state linked lists. The activity log table may use an index array (Bucket 0 to Bucket n) to map data block objects to active log blocks (AL-extension, AL for short): for example, the value obtained by applying a hash function to the data block object identifier (ID) is used as the array unit sequence number, and each array unit links the AL-extension blocks that map to it. The activity log table is a globally shared data structure through which the activity log of a data block object can be looked up quickly. For convenience of description, the present invention refers to an AL-extension in which data block object information has been recorded as an activity log.
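The bucket-indexed activity log table described above can be illustrated with a minimal Python sketch. The class and field names are hypothetical, and zlib.crc32 merely stands in for the hash function, which the text does not specify.

```python
import zlib
from dataclasses import dataclass


@dataclass
class ActiveLog:
    volume_id: str      # dual-active volume identifier
    object_id: str      # data block object identifier
    ref_count: int = 0  # number of write I/Os in flight


class ActiveLogTable:
    """Index array of buckets; each bucket chains the logs hashed to it."""

    def __init__(self, n_buckets=16):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, object_id):
        # Assumption: CRC32 of the ID modulo the bucket count picks the bucket.
        index = zlib.crc32(object_id.encode()) % len(self.buckets)
        return self.buckets[index]

    def find(self, object_id):
        for log in self._bucket(object_id):
            if log.object_id == object_id:
                return log
        return None

    def add(self, log):
        self._bucket(log.object_id).append(log)

    def remove(self, object_id):
        bucket = self._bucket(object_id)
        bucket[:] = [log for log in bucket if log.object_id != object_id]
```

A lookup only scans the one bucket selected by the hash, which is what makes the query of a data block object's activity log fast.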
The active log module is further used for managing and maintaining the state of the active log through 4 state linked lists, namely a free linked list, a commit linked list, a use linked list and an LRU linked list (non-use linked list).
The free linked list (also called the idle linked list) is used for managing and maintaining unallocated AL-extension blocks in the idle state; in the initial state all AL-extension blocks are placed in the free linked list.
The commit linked list is used to manage and maintain activity logs that have been added to the activity log table but have not yet completed log persistence. After an activity log is added to the activity log table, it is migrated to the commit linked list, and the storage system persists the activity logs in the commit linked list to DCLPool.
The activity log module increments the reference count of the corresponding activity log by 1 each time a DCL record request is received for the same data block object, the reference count representing the number of write I/Os for the data block object.
The use linked list is used for managing and maintaining activity logs whose log persistence has completed, that is, the data difference log of the data block object recorded in the activity log has been successfully written into DCLPool; the reference count of an activity log in the use linked list is not 0. After the data difference log of the data block object corresponding to an AL-extension has been written into DCLPool as a DCL object, the activity log is migrated from the commit linked list to the use linked list. When both ends of the dual-active volume storage system complete the data persistence for one write I/O, the active log module decrements the reference count of the activity log of the data block object related to that write I/O by 1; when the reference count reaches 0, the active log module migrates the activity log from the use linked list to the LRU linked list.
The least recently used (LRU) linked list is used to manage and maintain activity logs whose reference count is 0 but which have not yet been eliminated. Such an activity log still holds its data block object information and remains linked in the activity log table. When a preset elimination condition is met, the activity log is eliminated from the activity log table, and its data block object information, reference count and other information are cleared, so that the AL-extension is released for reallocation.
When an AL-extension cannot be allocated from the free linked list for a new DCL record request and the LRU linked list is not empty, the active log module eliminates an activity log from the LRU linked list to release its AL-extension, and then allocates the released AL-extension.
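The movement of an activity log between the four state lists can be sketched as below. This is a simplified Python illustration, not the embodiment itself: the class and method names are hypothetical, the free list is reduced to a counter, and reviving an LRU entry into the use list on a new request is an assumption that the text does not spell out.

```python
from collections import OrderedDict


class ActiveLogStates:
    """Tracks which state list each activity log is on (sketch only)."""

    def __init__(self, n_blocks=64):
        self.free = n_blocks        # unallocated AL-extension blocks
        self.commit = {}            # object_id -> refs, persistence pending
        self.use = {}               # object_id -> refs (> 0), log persisted
        self.lru = OrderedDict()    # object_id -> None, refs == 0, oldest first

    def record(self, object_id):
        """Handle a DCL record request; False means allocation failed."""
        for state in (self.commit, self.use):
            if object_id in state:
                state[object_id] += 1
                return True
        if object_id in self.lru:
            # Assumption: a hit on an LRU entry moves it back to the use list.
            del self.lru[object_id]
            self.use[object_id] = 1
            return True
        if self.free == 0 and self.lru:
            # Eliminate the oldest LRU entry; a real system would also
            # delete its DCL records from DCLPool at this point.
            self.lru.popitem(last=False)
            self.free += 1
        if self.free == 0:
            return False            # caller falls back to another mode
        self.free -= 1
        self.commit[object_id] = 1
        return True

    def persisted(self, object_id):
        """Log persistence finished: commit list -> use list (or LRU)."""
        refs = self.commit.pop(object_id)
        if refs > 0:
            self.use[object_id] = refs
        else:
            self.lru[object_id] = None

    def write_done(self, object_id):
        """Both ends persisted one write I/O: drop one reference."""
        if object_id in self.commit:
            self.commit[object_id] -= 1
            return
        self.use[object_id] -= 1
        if self.use[object_id] == 0:
            del self.use[object_id]
            self.lru[object_id] = None
```

Keeping zero-reference entries on the LRU list instead of freeing them immediately lets a write that returns to the same hot-spot area reuse the existing record.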
FIG. 5 is a flowchart illustrating steps of a method for data discrepancy logging according to an embodiment of the present invention. The method comprises the following steps:
Step 501, a request for recording a dual-active volume data difference log (DCL) is received, and the data block object identifier of each data block object covered by the request, together with the large block data interval in which the data block object is located, is calculated according to the logical block address (LBA) and the request range in the request.
The logical block address (LBA) is a logical address within a logical volume formed by block devices in the block storage system. The storage space of the logical volume is divided into data blocks of a set size (for example, 4 MB), each data block is stored and managed in the data pool as an object, and a data block object identifier (ID) generally includes block device information and a data block identifier.
The LBA and the request range (Length) carried in a DCL record request may cover multiple data block objects, so this step needs to calculate the identifier (ID) of each data block object from the LBA and Length carried in the request. For convenience of description, the subsequent steps are described with reference to one data block object; those skilled in the art will understand that if a request covers multiple data block objects, each data block object is processed in turn, which is not described in detail in the present invention.
After receiving the DCL record request, the data difference log cache module (DCLCache) calculates the data block object identifier and the large block data interval in which the data block object is located, and then first asks the active log module, by data block object ID, whether an activity log of the data block object is recorded in the activity log table. The large block data interval is calculated so that the data difference information of that interval can be recorded in DCLPool as a hierarchical object.
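The calculation in step 501 of which data block objects a request covers can be sketched as follows. This is an illustrative Python sketch only: it assumes the LBA and Length are expressed in 512-byte sectors and that data blocks are 4 MB, and the function names and the ID format are hypothetical.

```python
SECTOR = 512                    # assumed sector size in bytes
BLOCK_SIZE = 4 * 1024 * 1024    # example data block size from the text


def covered_blocks(lba, length, sector=SECTOR, block_size=BLOCK_SIZE):
    """Return the numbers of all data blocks touched by [lba, lba + length)."""
    if length <= 0:
        return []
    start = lba * sector
    end = start + length * sector           # exclusive end, in bytes
    first = start // block_size
    last = (end - 1) // block_size
    return list(range(first, last + 1))


def object_id(volume_id, block_no):
    # Hypothetical ID format: block device information plus block number.
    return f"{volume_id}.{block_no}"
```

A request that straddles a block boundary yields two block numbers, each of which is then processed in turn as the text describes.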
Step 502, first recording the data difference log of the data block object in the activity log table as an activity log and performing log persistence; the activity log table is used for caching the data difference logs of data block objects in hot-spot write areas, and the number of activity logs is limited;
The active log module first queries, by data block object ID, whether the activity log table records an activity log of the data block object; if so, the reference count of the corresponding activity log is increased by 1 and a success response to the DCL record request is returned directly. An activity log records information such as the dual-active volume identifier, the data block object identifier and the reference count.
Each activity log in the activity log table maintained by the active log module corresponds to one data block object and indicates that a write operation was recently performed in the logical storage space of that data block object. Since write hot spots are usually limited to a few areas, a preset number of active log blocks (AL-extension) is provided for each dual-active volume, that is, the number of active log blocks per dual-active volume is limited. Considering the limit on the amount of data to be synchronized after a dual-active system failure, the number of AL-extension blocks can be configured as needed; for example, the default value can be set to 64.
The reference count of an activity log represents the number of write I/Os currently being executed. The reference count is decremented by 1 after each write I/O completes, that is, after the write data is persisted; when the reference count reaches 0 and an elimination condition is met, the active log module clears the contents of the activity log and releases the active log block. Therefore, if an activity log is found in the activity log table by the data block object identifier, the corresponding data block area is still a hot spot, and for a new DCL record request on that area a success response can be returned directly after the reference count is increased, which greatly speeds up data difference log recording and write response.
If the active log module does not find an activity log of the data block object in the activity log table, it allocates an AL-extension for the data block object. If the allocation succeeds, the data difference log information is recorded in the allocated AL-extension to generate an activity log, the reference count is increased by 1, and the activity log is added to the activity log table for log persistence.
Referring to the example of fig. 3, the active log module manages the active logs in the active log table in a state linked list (including free linked list, commit linked list, use linked list, LRU linked list), and the allocation of the active logs, persistence of the logs (log drop) and persistence of the data block objects (data drop) all change the state of the active logs, and the process of allocating the active log blocks AL-extension to the data block objects by the active log module is as follows:
Step 301, an idle AL-extension is first requested from the free linked list. If the free linked list contains an idle AL-extension, it is allocated to the data block object, and information such as the logical volume ID and the object identifier is written into it to generate the activity log of the data block object. The reference count of the activity log is increased by 1, the activity log is migrated to the commit linked list, and it is added to the activity log table.
Step 302, when the free linked list is empty but the LRU linked list is not, an activity log in the LRU linked list is eliminated and its AL-extension is released to the free linked list, after which the allocation of step 301 is executed. In other words, elimination is performed first, and the released AL-extension is returned to the free linked list so that it can be allocated within the current flow.
The step of the activity log module managing the activity log further comprises:
Step 303, after the storage system writes an activity log in the commit linked list to persistent storage, the activity log is migrated to the use linked list.
The commit linked list manages activity logs that have been added to the activity log table but have not yet been written into DCLPool; when an activity log is persisted successfully, it is migrated to the use linked list.
Step 304, when the active log module again receives a DCL record request for the same data block object, the reference count of the corresponding activity log is increased by 1; when the dual-active volume completes the persistence of one write I/O, the reference count of the corresponding activity log is decremented by 1.
The use linked list manages activity logs whose data block objects have not yet been fully persisted; the reference count of every activity log in the use linked list is non-zero. For the same data block object, one or more write I/Os may be sent by one or more service hosts, and each time the dual-active volume completes the persistence of one write I/O, the reference count of the corresponding activity log is decremented by 1.
Step 305, when the reference count of an activity log in the use linked list is decremented to 0, the active log module migrates the activity log to the LRU linked list.
A reference count of 0 means that both ends of the dual-active volume have completed the data persistence of every write I/O for the data block object recorded in the activity log; at this point the active log module migrates the activity log to the LRU linked list.
Step 306, activity logs in the LRU linked list are eliminated based on a preset elimination rule, and the AL-extension of each eliminated activity log is released back to the free linked list.
The elimination rule for the LRU linked list may be a first-in first-out (tail elimination) rule, that is, the activity log that was added to the LRU linked list earliest is eliminated first. During elimination, all data difference log records in DCLPool related to the data block object of the activity log are deleted (including deleting the record in the DCL object and updating the hierarchical object of the corresponding large block data interval), and information such as the data block identifier and the reference count in the activity log is cleared. After elimination completes, the active log module removes the corresponding AL-extension from the activity log table and returns it to the free linked list.
Optionally, activity logs are persisted through the key-value (KV) interface of the distributed storage. The key of an activity log is formed as "log_" + "object number" and records whether the data block object has a data difference, where the object number is the relative position number of the data block object within its large block data interval.
Step 503, if the recording in the active log mode fails, recording the data difference log of the data block object in a single recording mode and performing persistence;
When the free linked list and the LRU linked list are both empty, no AL-extension is available and the allocation fails. In this case the single record log management module (LidMgr) records the data difference log of the data block object individually and performs log persistence (individual logging while the dual-active volume is in dual-write mode). First, a globally unique log ID is obtained from the single record log management module, and the key is assembled from the prefix "log_off_", the object relative number and the obtained log ID, where the object relative number is the relative offset position of the data block object within its large block data interval. The value is [offset, length], where offset is the offset of the data block in the logical volume and length is the data block size. The data difference log information is thus described as a key-value pair; a DCL object is generated from the key-value pair and stored in DCLPool for log persistence.
Step 504, when a storage system fault or a link fault requires single-ended data persistence (fault single write), recording the data difference log of the data block object in log bitmap mode and performing log persistence;
When the storage system hosting the dual-active volume detects a system fault at the peer end or a storage link fault (single write under fault conditions), the data difference log of the data block object is recorded in log bitmap mode. In this step the key is assembled from the prefix "bm_", the object relative number and the obtained global log ID, [offset, length] is used as the value, a DCL object is generated from the key-value pair, and the difference bitmap module stores the DCL object in DCLPool for log persistence.
The DCL object is stored in a high-performance metadata pool. The name of the DCL object is calculated from the number of the large block data interval in which the current data block object is located, and the range of data block objects covered by a large block data interval is configurable. Optionally, the large block data interval can be set to 4 GB (corresponding to the L1 level object of DCLPool in FIG. 3), in which case the DCL object name is composed of "dcl_l1_" + lunid + "_" + "number".
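The naming scheme for DCL objects and for the keys of the three recording modes can be summarized in a short Python sketch. The prefixes come from the text; the function names are hypothetical, and the underscore placed between the object relative number and the log ID is an assumption, since the text does not state a separator.

```python
def dcl_object_name(lun_id, interval_no):
    # "dcl_l1_" + lunid + "_" + number, as given in the text.
    return f"dcl_l1_{lun_id}_{interval_no}"


def active_log_key(rel_no):
    # Activity log mode: "log_" + object number.
    return f"log_{rel_no}"


def single_record_key(rel_no, log_id):
    # Single record mode; the "_" before log_id is an assumed separator.
    return f"log_off_{rel_no}_{log_id}"


def bitmap_key(rel_no, log_id):
    # Log bitmap mode; the "_" before log_id is an assumed separator.
    return f"bm_{rel_no}_{log_id}"
```

Because all three key families live in the same DCL object, a query for one large block data interval can read every kind of difference record from a single object.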
Step 505, persisting the data difference log in a hierarchical object mode in a data difference log pool DCLPool; synchronously updating the hierarchical object of the large-block data interval to which the data block object belongs to identify that the large-block data interval to which the data block object belongs has data difference when log persistence is carried out;
In the process that the activity log module, the single record log management module and the difference bitmap module record the data difference log in the DCLPool to carry out log persistence, the method further comprises the following steps:
Before a DCL object is persisted, the big block section management module checks whether the hierarchical object in DCLPool already records a difference for the large block data interval to which the DCL object belongs. If not, the large block data interval is first recorded as having a data difference and that record is persisted, and only then is the DCL object persisted. For example, in the configuration of FIG. 3, when a DCL object named dcl_l1_lun0_0 needs to be recorded in DCLPool, it is first determined whether the L0 level object named dcl_hdr_lun0 records a data difference for the large block data interval of the corresponding L1 level object; if the recorded value is no difference (no_diff), the unit value is changed to has difference (have_diff). The large block data interval difference information can be organized as an array and fully cached; when it is accessed, the large block interval offset is calculated first, so whether a large block data interval records a data difference log object (a DCL object) can be obtained directly and quickly.
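The ordering described above, in which the large block interval flag is persisted before the DCL object, can be sketched as follows. This Python sketch is illustrative only: the class and function names are hypothetical, and plain dictionaries stand in for the hierarchical object and the DCL objects in DCLPool.

```python
NO_DIFF, HAVE_DIFF = 0, 1


class BigExtLog:
    """Fully cached array of per-interval difference flags (sketch)."""

    def __init__(self, n_intervals):
        self.flags = [NO_DIFF] * n_intervals
        self.persisted_flags = {}   # stand-in for the hierarchical object

    def mark(self, interval_no):
        if self.flags[interval_no] == NO_DIFF:
            # Persist the flag first, then update the cached copy.
            self.persisted_flags[interval_no] = HAVE_DIFF
            self.flags[interval_no] = HAVE_DIFF

    def has_diff(self, interval_no):
        return self.flags[interval_no] == HAVE_DIFF


def persist_dcl(big_ext_log, pool, interval_no, key, value):
    """Mark the interval as different, then write the DCL record."""
    big_ext_log.mark(interval_no)
    pool.setdefault(interval_no, {})[key] = value
```

Writing the flag first means a crash between the two writes can only leave a flag without a record, never a record hidden behind a cleared flag.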
Fig. 6 is a flowchart of a query of a data difference log of a distributed block storage system according to an embodiment of the present invention, where the query method is implemented based on a data difference log recorded by a data difference log recording method in the distributed block storage system, and the method includes:
Step 601, calculating the data block object ID and the large block data interval in which the data block object is located according to the starting LBA of the dual-active volume and the request range in the data difference log query request.
When the data difference log buffer module DCLCache receives a DCL inquiry request, the relative offset ID of a large block data interval where a contained data block object is located is calculated according to a logic block address LBA and a request range Length in the request, and then difference information of the large block data interval is obtained.
Step 602, checking whether a difference exists between the large data intervals;
step 603, if there is no difference, returning that the difference is null;
The big block section management module checks, according to the data block object ID and the large block data interval in which the data block object is located, whether the L0 and L1 level objects in DCLPool record a data difference for that interval. If not, it returns that the difference is null (no difference); if so, it continues to check the DCL object differences.
Step 604, if there is a difference, calculating a DCL object corresponding to the large data interval, and then querying all the data difference logs recorded in the DCL object in the active log mode, the independent recording mode and the log bitmap mode.
Firstly, calculating a DCL object name and an object relative number corresponding to a large data interval, and acquiring data difference information of a data block object through a distributed storage KV interface, wherein the method comprises the following steps:
1. Acquiring the activity log information of the data block object. The activity log is recorded with "log_" + "object relative number" + "log ID" as the key; if an activity log exists for the object, the whole data block object is returned as different;
2. Obtaining the data difference log of the data block object recorded in log bitmap mode, which is recorded with "bm_" + "object relative number" + "log ID" as the key;
3. Obtaining the independently recorded log of the data block object, which is recorded with "log_off_" + "object relative number" + "log ID" as the key;
step 605, sequentially checking all the data difference logs existing in the request range, and collecting and returning the data differences in the request range.
The activity logs, bitmap difference logs and independently recorded logs of all data block objects within the request range of the DCL query request are checked in DCLPool in turn, and their union is returned as the difference information.
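The query flow of steps 601 to 605 can be sketched in Python as below. This is a simplified illustration under assumptions: a dictionary stands in for the KV interface of a DCL object, activity log keys carry no log ID suffix, the other keys use an underscore before the log ID, and the function name query_diff is hypothetical.

```python
BLOCK_SIZE = 4 * 1024 * 1024   # example data block size from the text


def query_diff(pool, interval_flags, interval_no, rel_numbers):
    """Return the union of (offset, length) differences for the given
    data block objects of one large block data interval."""
    if not interval_flags.get(interval_no):
        return []                       # steps 602-603: no difference
    dcl = pool.get(interval_no, {})     # step 604: the DCL object
    ranges = set()
    for rel in rel_numbers:
        if f"log_{rel}" in dcl:
            # An activity log marks the whole data block as different.
            ranges.add((rel * BLOCK_SIZE, BLOCK_SIZE))
            continue
        for key, value in dcl.items():
            if key.startswith((f"bm_{rel}_", f"log_off_{rel}_")):
                ranges.add(tuple(value))
    return sorted(ranges)               # step 605: collected result
```

Checking the interval flag first lets a query over an unchanged region return without reading any DCL object.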
FIG. 7 is a flowchart of data difference log deletion in distributed block storage according to an embodiment of the present invention; the method includes:
Step 701, calculating the data block object identifier and the large block data interval in which the data block object is located according to the starting LBA and the request range of the data difference log deletion request.
The relative offset ID of a large block data interval where the data block object is located is calculated, and then the difference information of the large block data interval is obtained.
Step 702, checking, based on the data block object identifier, whether the data difference log was recorded in single record mode;
Step 703, if it was recorded in single record mode, acquiring the storage position of the data difference log and then deleting the data difference log;
Whether the log ID is valid is checked by searching the log ID table maintained by the single record log management module (all data difference logs recorded in independent recording mode are recorded in the log ID table). If the log ID is found, the data difference log of the data block object was recorded in independent recording mode. The relative number of the data block object is then acquired, the key is formed as "log_off_" + "object relative number" + "log ID", the name of the DCL object belonging to the large block data interval is calculated, and the interface is invoked to delete the log from the DCL object.
Step 704, if the data difference log was not recorded in independent recording mode, checking the activity log table for an activity log of the data block object; if one exists, reducing its reference count by 1; if none exists in the activity log table, the data difference log was recorded in log bitmap mode, and the corresponding data difference log is deleted from the DCL object corresponding to the data block object in DCLPool based on the log bitmap recorded by the difference bitmap module.
If the activity log exists, it is determined that an activity log was used to record the data difference log of the data block object. In this case the reference count of the corresponding activity log is reduced by 1 and checked: if it is now 0, the activity log is moved to the LRU linked list; if it is still not 0, a DCL deletion success response is returned directly;
If no activity log of the data block object exists, the log is deleted from the DCL object according to the log bitmap recorded by the difference bitmap module.
The above deleted data difference log may be bitmap difference log information recorded with "bm_" + "object relative number", and an object activity log recorded with "log_" + "object relative number".
In step 705, the activity log module periodically checks the LRU linked list to clear the least recently used activity log.
The active log module may start a timed task that periodically checks the activity logs in the LRU linked list and clears those that have not been used recently. Clearing an activity log includes: first removing the specified activity log from the activity log table, then eliminating the activity log by deleting its record from the DCL object, and finally returning the AL-extension to the free linked list.
In the above steps, when a data difference log is deleted, the hierarchical object of the large block data interval to which the data block object belongs needs to be updated synchronously to indicate whether that interval still has data differences. If all data difference logs in a large block data interval have been deleted, the flag of the corresponding hierarchical object in DCLPool is changed to no difference.
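The order of checks in the deletion flow, and the final update of the large block interval flag, can be sketched as follows. This Python sketch is illustrative only: the class, field and function names are hypothetical, and in-memory containers stand in for the log ID table, the activity log table, the log bitmap and the hierarchical object.

```python
class DclState:
    def __init__(self):
        self.lid_table = {}       # object_id -> log ID (single record mode)
        self.active_refs = {}     # object_id -> activity log reference count
        self.lru = []             # object_ids whose count has reached 0
        self.bitmap_logs = set()  # object_ids recorded in log bitmap mode
        self.have_diff = {}       # interval_no -> large-interval flag


def delete_dcl(state, object_id):
    """Apply steps 702-704 and report which mode handled the deletion."""
    if object_id in state.lid_table:
        del state.lid_table[object_id]
        return "single"
    if state.active_refs.get(object_id, 0) > 0:
        state.active_refs[object_id] -= 1
        if state.active_refs[object_id] == 0:
            # The record stays in DCLPool until the LRU entry is cleared.
            state.lru.append(object_id)
        return "active"
    state.bitmap_logs.discard(object_id)
    return "bitmap"


def refresh_interval_flag(state, interval_no, objects_in_interval):
    """Clear the interval flag once no difference record remains."""
    remaining = any(
        o in state.lid_table or o in state.active_refs
        or o in state.bitmap_logs
        for o in objects_in_interval
    )
    state.have_diff[interval_no] = remaining
    return remaining
```

In this sketch an activity log that sits on the LRU list still counts as a remaining record, which mirrors the text: its entry in the DCL object is only deleted when the periodic task of step 705 clears it.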
Fig. 8 is a schematic structural diagram of an electronic device for implementing a data difference log recording method according to an embodiment of the present invention, where the device 800 includes: a processor 810 such as a Central Processing Unit (CPU), a communication bus 820, a communication interface 840, and a storage medium 830. Wherein the processor 810 and the storage medium 830 may communicate with each other via a communication bus 820. The storage medium 830 has stored therein a computer program which, when executed by the processor 810, performs the functions of one or more steps of the data difference logging method provided by the present invention.
The storage medium may include a random access Memory (Random Access Memory, RAM) or a Non-Volatile Memory (NVM), such as at least one magnetic disk Memory. In addition, the storage medium may be at least one storage device located remotely from the processor. The processor may be a general-purpose processor including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; but also digital signal processors (Digital Signal Processing, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components.
It should be appreciated that embodiments of the invention may be implemented or realized in computer hardware, a combination of hardware and software, or by computer instructions stored in non-transitory memory. The method may be implemented in a computer program using standard programming techniques, including a non-transitory storage medium configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner. Each program may be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose. Furthermore, the operations of the processes described in the present invention may be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes (or variations and/or combinations thereof) described herein may be performed under control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications), by hardware, or combinations thereof, collectively executing on one or more processors. The computer program includes a plurality of instructions executable by one or more processors.
Further, the method may be implemented in any type of computing platform operatively connected to a suitable computing platform, including, but not limited to, a personal computer, mini-computer, mainframe, workstation, network or distributed computing environment, separate or integrated computer platform, or in communication with a charged particle tool or other imaging device, and so forth. Aspects of the invention may be implemented in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform, such as a hard disk, optical read and/or write storage medium, RAM, ROM, etc., such that it is readable by a programmable computer, which when read by a computer, is operable to configure and operate the computer to perform the processes described herein. Further, the machine readable code, or portions thereof, may be transmitted over a wired or wireless network. When such media includes instructions or programs that, in conjunction with a microprocessor or other data processor, implement the steps described above, the invention described herein includes these and other different types of non-transitory computer-readable storage media. The invention also includes the computer itself when programmed according to the methods and techniques of the present invention.
The foregoing is merely exemplary of the present invention and is not intended to limit the present invention. Various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (12)

1. A method of data difference logging, the method being applied to a distributed block storage system configured with dual activity characteristics, the method comprising:
receiving a dual-active volume data difference log (DCL) record request, and calculating the data block object identifier of each data block object covered by the request and the large block data interval in which the data block object is located according to the logical block address (LBA) and the request range in the request;
first recording the data difference log of the data block object in an activity log table as an activity log and performing log persistence; the activity log table is used for caching the data difference logs of data block objects in hot-spot write areas, and the number of activity logs is limited;
if the recording in the activity log mode fails, recording the data difference log of the data block object in an independent recording mode and performing log persistence;
under the condition of fault single writing, recording the data difference log of the data block object in a log bitmap mode and performing log persistence;
persistence of the data difference log in a hierarchical object manner in a data difference log pool DCLPool; and synchronously updating the hierarchical object of the large-block data interval to which the data block object belongs to identify that the large-block data interval to which the data block object belongs has data difference when log persistence is carried out.
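The recording flow of claim 1 can be illustrated with a minimal in-memory sketch. It is not the patented implementation: the sector size, object size, interval size, the names `locate` and `DclRecorder`, and the use of plain sets in place of persisted DCLPool objects are all assumptions made for illustration, and persistence itself is omitted.

```python
from dataclasses import dataclass, field

SECTOR = 512            # assumed size of one LBA unit, in bytes
OBJ_SIZE = 4 << 20      # assumed data block object size (4 MiB)
CHUNK_SIZE = 1 << 30    # assumed large-block data interval size (1 GiB)
OBJS_PER_CHUNK = CHUNK_SIZE // OBJ_SIZE


def locate(lba, length):
    """Map a request (start LBA, length in bytes) to (object_id, chunk_id) pairs."""
    if lba < 0 or length <= 0:
        raise ValueError("lba must be >= 0 and length must be > 0")
    first = (lba * SECTOR) // OBJ_SIZE
    last = (lba * SECTOR + length - 1) // OBJ_SIZE
    return [(obj, obj // OBJS_PER_CHUNK) for obj in range(first, last + 1)]


@dataclass
class DclRecorder:
    """Hypothetical recorder that tries the three recording modes in order."""
    al_capacity: int = 2                             # the active log table is bounded
    active: dict = field(default_factory=dict)       # object_id -> reference count
    independent: set = field(default_factory=set)    # independently recorded logs
    bitmap: set = field(default_factory=set)         # objects marked in the log bitmap
    dirty_chunks: set = field(default_factory=set)   # stand-in for the hierarchical object

    def record(self, lba, length, single_write=False):
        modes = []
        for obj, chunk in locate(lba, length):
            if single_write:
                # Assumption: under fault-induced single writing the bitmap mode is used.
                self.bitmap.add(obj)
                modes.append("bitmap")
            elif obj in self.active or len(self.active) < self.al_capacity:
                # Active log mode: reuse the entry or take a free slot.
                self.active[obj] = self.active.get(obj, 0) + 1
                modes.append("active")
            else:
                # Active log mode failed (table full): fall back to an independent record.
                self.independent.add(obj)
                modes.append("independent")
            # Mark the enclosing large-block interval as containing a difference.
            self.dirty_chunks.add(chunk)
        return modes
```

With `al_capacity=1`, a second distinct object falls through to the independent mode, while repeated writes to the first object only raise its reference count.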
2. The method according to claim 1, wherein recording the data difference log of the data block object in the active log table in the active log mode comprises:
querying, according to the data block object identifier, whether the active log table already records an active log of the data block object; if so, incrementing the reference count of the corresponding active log by 1 and directly returning a success response to the DCL record request; if not, allocating an active log block for the data block object and, if the allocation succeeds, recording the data difference log information of the data block object in the allocated active log block to generate an active log, incrementing its reference count by 1, adding it to the active log table, and persisting the log.
3. The method of claim 2, further comprising managing the active logs in the active log table by means of state linked lists, wherein:
a free linked list is used for managing and maintaining unallocated active log blocks (AL-existence) in the idle state, an active log block that has been allocated and written with data block object information constituting an active log;
a commit linked list is used for managing and maintaining active logs that have been added to the active log table but whose log persistence has not yet completed;
an in-use linked list is used for managing and maintaining active logs whose log persistence has completed, the reference count of each active log in the in-use linked list being non-zero, and an active log being migrated to the LRU linked list when its reference count reaches 0;
a least recently used (LRU) linked list is used for managing and maintaining active logs whose reference count is 0 but which have not yet been evicted, the AL-existence of an evicted active log being migrated to the free linked list.
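The reference counting of claim 2 and the four state lists of claim 3 can be sketched together as a small state machine. The class name `ActiveLogTable`, its method names, and the choice to evict from the LRU list only when no free block remains are assumptions for illustration; the claims do not say when eviction is triggered.

```python
from collections import OrderedDict, deque


class ActiveLogTable:
    """Hypothetical active log table with free, commit, in-use and LRU lists."""

    def __init__(self, nblocks):
        self.free = deque(range(nblocks))  # unallocated active log blocks
        self.commit = {}                   # added to the table, persistence pending
        self.in_use = {}                   # persisted, reference count > 0
        self.lru = OrderedDict()           # persisted, reference count == 0, oldest first

    def get(self, obj):
        """Take a reference on obj; return False if no block can be allocated."""
        for table in (self.in_use, self.commit):
            if obj in table:
                table[obj]["refs"] += 1
                return True
        if obj in self.lru:
            # Re-referenced before eviction: move back to the in-use list.
            entry = self.lru.pop(obj)
            entry["refs"] = 1
            self.in_use[obj] = entry
            return True
        if not self.free and self.lru:
            # Assumption: evict the least recently used entry on demand.
            _, victim = self.lru.popitem(last=False)
            self.free.append(victim["block"])
        if not self.free:
            return False  # caller falls back to another recording mode
        self.commit[obj] = {"block": self.free.popleft(), "refs": 1}
        return True

    def persisted(self, obj):
        """Log persistence finished: leave the commit list."""
        entry = self.commit.pop(obj)
        target = self.in_use if entry["refs"] > 0 else self.lru
        target[obj] = entry

    def put(self, obj):
        """Drop one reference; an in-use entry at 0 references moves to the LRU list."""
        entry = self.in_use.get(obj, self.commit.get(obj))
        if entry is None or entry["refs"] == 0:
            raise KeyError(obj)
        entry["refs"] -= 1
        if entry["refs"] == 0 and obj in self.in_use:
            self.lru[obj] = self.in_use.pop(obj)
```

A `False` return from `get` corresponds to the case in claim 1 where recording in the active log mode fails and the independent recording mode takes over.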
4. The method of claim 2, further comprising a data difference log deletion step:
calculating the data block object identifier and the large-block data interval in which the data block object is located according to the starting logical block address and the request range of a data difference log deletion request;
checking, according to the data block object identifier, whether the data difference log was recorded in the independent recording mode, and if so, obtaining the storage location of the data difference log and deleting it;
if the data difference log was not recorded in the independent recording mode, checking whether an active log of the data block object exists in the active log table, and if it exists, decrementing its reference count by 1; if it does not exist, the data difference log was recorded in the log bitmap mode, and the corresponding data difference log is deleted, based on the log bitmap recorded by the difference bitmap module, from the DCL object corresponding to the data block object in the DCLPool;
and, when the data difference log is deleted, synchronously updating the hierarchical object of the large-block data interval to which the data block object belongs to mark whether that large-block data interval still contains a data difference.
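The lookup order of the deletion step can be sketched as follows. The dictionary layout of `store`, the function name `delete_dcl`, the constant `OBJS_PER_CHUNK`, and the rule that an active log counts as a remaining difference only while its reference count is positive are assumptions for illustration, not details given in the claims.

```python
OBJS_PER_CHUNK = 256  # assumed number of data block objects per large-block interval


def delete_dcl(store, obj):
    """Delete the difference log of one data block object and return the mode hit.

    store is a hypothetical in-memory stand-in with the keys:
      'independent'  set of object ids recorded independently
      'active'       dict of object id -> reference count
      'bitmap'       set of object ids marked in the log bitmap
      'dirty_chunks' set of large-block interval ids flagged as different
    """
    if obj in store["independent"]:
        store["independent"].discard(obj)
        mode = "independent"
    elif obj in store["active"]:
        store["active"][obj] -= 1      # only the reference count is reduced
        mode = "active"
    else:
        store["bitmap"].discard(obj)   # otherwise it was recorded in bitmap mode
        mode = "bitmap"

    # Recompute the flag of the enclosing large-block interval.
    chunk = obj // OBJS_PER_CHUNK
    referenced = {o for o, refs in store["active"].items() if refs > 0}
    remaining = store["independent"] | store["bitmap"] | referenced
    if any(o // OBJS_PER_CHUNK == chunk for o in remaining):
        store["dirty_chunks"].add(chunk)
    else:
        store["dirty_chunks"].discard(chunk)
    return mode
```

Recomputing the interval flag on every deletion keeps the hierarchical object usable as a fast negative check at query time.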
5. The method of claim 2, further comprising a data difference log query step:
calculating the data block object identifier and the large-block data interval in which the data block object is located according to the starting logical block address and the request range of a data difference log query request;
checking, in the hierarchical object of the DCLPool, whether the calculated large-block data interval contains a difference, and if it does not, returning an empty difference;
if a difference exists, calculating the DCL object corresponding to the large-block data interval, and then querying all data difference logs and log bitmap records of the DCL object;
and sequentially checking all the data difference logs that fall within the request range, and summarizing and returning the data differences within the request range.
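The two-level lookup of the query step can be sketched as below: the hierarchical flag is consulted first, and only flagged intervals have their DCL object read. The `pool` layout, the function name `query_dcl`, and the constant `OBJS_PER_CHUNK` are hypothetical, and the request range is expressed directly in data block object identifiers rather than LBAs to keep the example short.

```python
OBJS_PER_CHUNK = 256  # assumed number of data block objects per large-block interval


def query_dcl(pool, first_obj, last_obj):
    """Return the sorted object ids in [first_obj, last_obj] that carry a difference.

    pool is a hypothetical in-memory stand-in for the DCLPool with the keys:
      'dirty_chunks' set of interval ids flagged in the hierarchical object
      'dcl_objects'  dict of interval id -> {'logs': set, 'bitmap': set}
    """
    result = []
    first_chunk = first_obj // OBJS_PER_CHUNK
    last_chunk = last_obj // OBJS_PER_CHUNK
    for chunk in range(first_chunk, last_chunk + 1):
        if chunk not in pool["dirty_chunks"]:
            continue  # no difference in this interval: skip without reading its DCL object
        dcl = pool["dcl_objects"].get(chunk, {"logs": set(), "bitmap": set()})
        # Merge the difference logs and the log bitmap records, then filter by range.
        for obj in sorted(dcl["logs"] | dcl["bitmap"]):
            if first_obj <= obj <= last_obj:
                result.append(obj)
    return result
```

A request whose intervals are all unflagged returns an empty list without touching any DCL object, which matches the empty-difference case of the claim.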
6. A data difference log recording apparatus, applied to a distributed block storage system configured with a dual-active feature, the apparatus comprising:
an active log module, configured to receive a data difference log (DCL) record request for a dual-active volume, to calculate, from the logical block address (LBA) and the request range in the request, the data block object identifier of each data block object covered by the request and the large-block data interval in which the data block object is located, and to first record the data difference log of the data block object in an active log table in an active log mode and persist the log, wherein the active log table caches the data difference logs of data block objects in data regions subject to hot-spot writes, and the number of active logs is limited;
a single-record log management module, configured to record the data difference log of the data block object in an independent recording mode and persist the log when recording in the active log mode fails;
a difference bitmap module, configured to record the data difference log of the data block object in a log bitmap mode and persist the log in the case of single writing caused by a fault;
a large-block interval management module, configured to synchronously update, when the log is persisted, the hierarchical object of the large-block data interval to which the data block object belongs to mark that the large-block data interval contains a data difference;
wherein the data difference log is persisted as hierarchical objects in a data difference log pool (DCLPool).
7. The apparatus of claim 6, wherein the active log module comprises:
a judging unit, configured to query, according to the data block object identifier, whether the active log table records an active log of the data block object;
a response unit, configured to increment the reference count of the corresponding active log by 1 and directly return a success response to the DCL record request when the active log of the data block object is found in the active log table;
and an allocation unit, configured to allocate an active log block for the data block object when the active log of the data block object is not found in the active log table and, if the allocation succeeds, to record the data difference log information of the data block object in the allocated active log block to generate an active log, increment its reference count by 1, add it to the active log table, and persist the log.
8. The apparatus of claim 7, wherein the active log module further comprises:
a state linked list management unit, configured to manage the active logs in the active log table by means of state linked lists, the state linked lists comprising a free linked list, a commit linked list, an in-use linked list, and an LRU linked list, wherein:
the free linked list is used for managing and maintaining unallocated active log blocks (AL-existence) in the idle state, an active log block that has been allocated and written with data block object information constituting an active log;
the commit linked list is used for managing and maintaining active logs that have been added to the active log table but whose log persistence has not yet completed;
the in-use linked list is used for managing and maintaining active logs whose log persistence has completed, the reference count of each active log in the in-use linked list being non-zero, and an active log being migrated to the LRU linked list when its reference count reaches 0;
the least recently used (LRU) linked list is used for managing and maintaining active logs whose reference count is 0 but which have not yet been evicted, the AL-existence of an evicted active log being migrated to the free linked list.
9. The apparatus of claim 7, further comprising a data difference log deletion module comprising:
a first calculation unit, configured to calculate the data block object identifier and the large-block data interval in which the data block object is located according to the starting logical block address and the request range of a data difference log deletion request;
a log deleting unit, configured to check, according to the data block object identifier, whether the data difference log was recorded in the independent recording mode, and if so, to obtain the storage location of the data difference log and delete it; if the data difference log was not recorded in the independent recording mode, to check whether an active log of the data block object exists in the active log table, and if it exists, to decrement its reference count by 1; and if it does not exist, the data difference log having been recorded in the log bitmap mode, to delete the corresponding data difference log, based on the log bitmap recorded by the difference bitmap module, from the DCL object corresponding to the data block object in the DCLPool;
and a first synchronous updating unit, configured to synchronously update, when the data difference log is deleted, the hierarchical object of the large-block data interval to which the data block object belongs to mark whether that large-block data interval still contains a data difference.
10. The apparatus of claim 7, further comprising a data difference log query module comprising:
a second calculation unit, configured to calculate the data block object identifier and the large-block data interval in which the data block object is located according to the starting logical block address and the request range of a data difference log query request;
a difference judging module, configured to check, in the hierarchical object of the DCLPool, whether the calculated large-block data interval contains a difference, and to return an empty difference if it does not;
a calculation query unit, configured to calculate the DCL object corresponding to the large-block data interval, to query all data difference logs and log bitmap records of the DCL object, to sequentially check all the data difference logs that fall within the request range, and to summarize and return the data differences within the request range.
11. An electronic device, comprising a processor, a communication interface, a storage medium, and a communication bus, wherein the processor, the communication interface, and the storage medium communicate with one another through the communication bus;
the storage medium is configured to store a computer program;
the processor is configured to carry out the method steps of any one of claims 1-5 when executing the computer program stored on the storage medium.
12. A storage medium having stored thereon a computer program which, when executed by a processor, implements the method of any of claims 1 to 5.
CN202211606464.8A 2022-12-12 2022-12-12 Data difference log query method, device, equipment and storage medium Pending CN116225748A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211606464.8A CN116225748A (en) 2022-12-12 2022-12-12 Data difference log query method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211606464.8A CN116225748A (en) 2022-12-12 2022-12-12 Data difference log query method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116225748A true CN116225748A (en) 2023-06-06

Family

ID=86575710

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211606464.8A Pending CN116225748A (en) 2022-12-12 2022-12-12 Data difference log query method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116225748A (en)

Similar Documents

Publication Publication Date Title
CN107209714B (en) Distributed storage system and control method of distributed storage system
US10133511B2 (en) Optimized segment cleaning technique
US8930648B1 (en) Distributed deduplication using global chunk data structure and epochs
JP2021508879A (en) Systems and methods for database management using additional dedicated storage devices
US10170151B2 (en) Method and system for handling random access write requests for a shingled magnetic recording hard disk drive
US20160371186A1 (en) Access-based eviction of blocks from solid state drive cache memory
US9910798B2 (en) Storage controller cache memory operations that forego region locking
EP3321792B1 (en) Method for deleting duplicated data in storage system, storage system and controller
US10503424B2 (en) Storage system
US10235059B2 (en) Technique for maintaining consistent I/O processing throughput in a storage system
CN108319430B (en) Method and device for processing IO (input/output) request
US11436102B2 (en) Log-structured formats for managing archived storage of objects
CN114253908A (en) Data management method and device of key value storage system
US20160357672A1 (en) Methods and apparatus for atomic write processing
US10114566B1 (en) Systems, devices and methods using a solid state device as a caching medium with a read-modify-write offload algorithm to assist snapshots
US9934248B2 (en) Computer system and data management method
US10346077B2 (en) Region-integrated data deduplication
US11599460B2 (en) System and method for lockless reading of metadata pages
CN116225748A (en) Data difference log query method, device, equipment and storage medium
US10853257B1 (en) Zero detection within sub-track compression domains
US11288238B2 (en) Methods and systems for logging data transactions and managing hash tables
WO2020215223A1 (en) Distributed storage system and garbage collection method used in distributed storage system
US11966590B2 (en) Persistent memory with cache coherent interconnect interface
US20230273731A1 (en) Persistent memory with cache coherent interconnect interface
US11150827B2 (en) Storage system and duplicate data management method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination