CN114647624A - Method, system and storage medium for capturing database consistent point in block-level CDP

Info

Publication number
CN114647624A
Authority
CN
China
Prior art keywords
request packet
layer
file
volume
driver
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210508886.5A
Other languages
Chinese (zh)
Other versions
CN114647624B (en)
Inventor
谢俊峰
黄传波
刘国印
涂磊
钱禹航
谢卓伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Vinchin Science And Technology Co
Original Assignee
Chengdu Vinchin Science And Technology Co
Application filed by Chengdu Vinchin Science And Technology Co
Priority to CN202210508886.5A
Publication of CN114647624A
Application granted
Publication of CN114647624B
Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a method, a system and a storage medium for capturing database consistency points in block-level CDP, and belongs to the field of data recovery. The method comprises: a step of obtaining the path where the redo log is located; a step of creating a consistency point cache; a step of installing a file filter driver and a volume filter driver; a sending step; a basic information obtaining step; a judging step; and a capturing step. The system comprises: a redo log path obtaining module; a consistency point cache creation module; a file filter driver and volume filter driver installation module; a sending module; a basic information obtaining module; a judging module; and a capturing module. The invention provides a lighter-weight way for the volume filter driver to capture consistency points: the capture is completed through the cooperation of the file filter driver, the consistency point cache and the volume filter driver, the specific content of an IO request packet does not need to be parsed, and only the InnoDB redo log checkpoint needs to be identified, which is simple, convenient and feasible.

Description

Method, system and storage medium for capturing database consistent point in block-level CDP
Technical Field
The invention belongs to the field of data recovery, and relates to a method, a system and a storage medium for capturing database consistent points in block-level CDP.
Background
InnoDB is a storage engine that supports transactions, rollback and multi-version concurrency control, and it is the default storage engine of the currently popular MySQL database. InnoDB keeps a checkpoint field at a fixed position in the redo log file, and a corresponding record is made in the checkpoint each time the database cache is flushed to disk. Block-level CDP is a backup method that continuously captures the IO requests of all volume devices; it can effectively avoid data loss caused by non-hardware failures such as user operation errors and virus attacks, and can provide more flexible target recovery points. With the development of storage technology, block-level CDP is gradually becoming the mainstream mode of data protection, and applying block-level CDP to mainstream databases to achieve rapid and complete data recovery is gradually becoming a research hotspot in the industry.
Some block-level continuous data protection methods that address database consistency already exist in the prior art. For example, the prior art discloses a CDP backup and recovery method for ensuring database consistency (publication number CN 108170766B), which determines whether an IO request packet has consistency by parsing the specific content of the IO request packets of the transaction log to find the last IO request packet generated by a transaction. This method can effectively solve the database consistency problem in block-level CDP, but it has a drawback: the content of every transaction log IO request packet must be parsed, which inevitably affects the performance of the protected client. In particular, IO request packets are generated in the kernel layer and issued at a very high speed, so parsing their specific content while they are being issued causes considerable delay.
Therefore, how to capture the block-level CDP consistent point and reduce the influence on the system efficiency as much as possible becomes a technical problem which needs to be solved at present.
Disclosure of Invention
In order to solve the technical problems in the background art, embodiments of the present invention provide a method, a system, and a storage medium for capturing a database consistency point in a block level CDP. The technical scheme is as follows:
in a first aspect, a method for capturing a database consistency point in a block level CDP is provided, the method comprising the steps of:
a redo log path obtaining step, obtaining the path where the InnoDB redo log is located;
a consistency point cache creation step, creating a consistency point cache in the kernel layer;
a file filter driver and volume filter driver installation step, installing the file filter driver on the upper layer of the file driver and installing the volume filter driver on the upper layer of the volume device driver;
a sending step, sending the path where the InnoDB redo log is located to the file filter driver;
a basic information obtaining step, wherein when the operating system issues an IO request, the file filter driver captures a write-type IO request packet of the file system layer and obtains basic information, the basic information comprising: write file path, offset, and data length;
a judging step, wherein the file filter driver judges, according to the basic information and the path where the InnoDB redo log is located, whether the file system layer IO request packet has consistency; if yes, recording the first address of the file system layer IO request packet in the consistency point cache, and then issuing the file system layer IO request packet; if not, issuing the file system layer IO request packet;
a capturing step, wherein the volume filter driver captures the consistency point by querying the consistency point cache.
In one embodiment, the redo log path obtaining step includes:
traversing all environment variables of the current operating system, and screening out the environment variable that holds the path where the start command of the database is located, wherein the database uses the InnoDB storage engine;
obtaining the path where the InnoDB redo log is located according to that environment variable and the directory structure of the database.
In one embodiment, the step of creating the cache of the consistent point comprises:
installing a kernel-state dynamic link library driver;
creating a spin lock, a non-paged memory allocation table and a hash table in the dynamic link library driver, wherein the hash table has an externally exposed interface, and the hash table manages a memory through the non-paged memory allocation table.
In one embodiment, the determining step includes:
judging, according to the basic information and the path where the InnoDB redo log is located, whether the file system layer IO request packet covers a checkpoint field of the database redo log; if so, applying for a memory of a specified size in the consistency point cache, storing the first address of the file system layer IO request packet into the memory, and then issuing the file system layer IO request packet; if not, issuing the file system layer IO request packet.
It should be understood that the file system layer IO request packet and the volume layer IO request packet belong to different states of the same IO request packet at different positions of the device stack, and have the same first address.
In one embodiment, the capturing step comprises:
a volume filter driver acquires a first address of a volume layer IO request packet;
the volume filter driver calls the interface of the hash table in the consistency point cache and judges whether the first address of the volume layer IO request packet is recorded in the consistency point cache; if yes, marking the backup data extracted from the volume layer IO request packet, deleting the first address of the volume layer IO request packet from the consistency point cache through the interface, and then issuing the volume layer IO request packet; if not, backing up the volume layer IO request packet, and then issuing the volume layer IO request packet.
It is worth explaining that the operation of marking the backup data extracted by the volume layer IO request packet corresponds to the completion of the consistent point capture work.
It is worth explaining that the volume layer IO request packet is generated by state change when the file system layer IO request packet reaches the volume filtering driver layer through the device stack; the volume layer IO request packet and the file system layer IO request packet have the same first address.
In a second aspect, there is provided a system for capturing a database consistency point in a block level CDP, the system comprising:
a redo log path obtaining module, used for obtaining the path where the InnoDB redo log is located;
a consistency point cache creation module, used for creating a consistency point cache in the kernel layer;
a file filter driver and volume filter driver installation module, used for installing the file filter driver on the upper layer of the file driver and installing the volume filter driver on the upper layer of the volume device driver;
a sending module, used for sending the path where the InnoDB redo log is located to the file filter driver;
a basic information obtaining module, used for capturing, by the file filter driver, a write-type IO request packet of the file system layer and obtaining basic information when the operating system issues an IO request, the basic information comprising: write file path, offset, and data length;
a judging module, used for judging, by the file filter driver, according to the basic information and the path where the InnoDB redo log is located, whether the file system layer IO request packet has consistency; if yes, recording the first address of the file system layer IO request packet in the consistency point cache, and then issuing the file system layer IO request packet; if not, issuing the file system layer IO request packet;
a capturing module, used for capturing, by the volume filter driver, the consistency point by querying the consistency point cache.
In one embodiment, the consistency point cache creation module comprises:
the installation unit is used for installing the kernel-mode dynamic link library driver;
the creating unit is used for creating a spin lock, a non-paged memory allocation table and a hash table in the dynamic link library driver, wherein the hash table is provided with an externally exposed interface, and the hash table manages a memory through the non-paged memory allocation table.
In one embodiment, the determining module includes:
a checkpoint coverage detection unit, used for judging, according to the basic information and the path where the InnoDB redo log is located, whether the file system layer IO request packet covers a checkpoint field of the database redo log; if so, applying for a memory of a specified size in the consistency point cache, storing the first address of the file system layer IO request packet into the memory, and then issuing the file system layer IO request packet; if not, issuing the file system layer IO request packet.
In one embodiment, the capturing module includes:
a first address obtaining unit, configured to obtain, by a volume filter driver, a first address of a volume layer IO request packet;
a consistency point judgment marking unit, configured to call, by the volume filter driver, the externally exposed hash table interface in the consistency point cache, and judge whether the first address of the volume layer IO request packet is recorded in the consistency point cache; if yes, marking the backup data extracted from the volume layer IO request packet, and deleting the first address of the volume layer IO request packet from the consistency point cache through the interface; if not, backing up the volume layer IO request packet, and then issuing the volume layer IO request packet.
In a third aspect, a computer readable storage medium is further provided, which has a computer program stored thereon, and when the program is executed by a processor, the program implements the method for capturing the database consistency point in the block level CDP.
The invention has the beneficial effects that:
1. the invention completes the capture of consistency points through the cooperation of the file filter driver, the consistency point cache and the volume filter driver; the specific content of IO request packets does not need to be parsed during capture, and only the InnoDB redo log checkpoint needs to be identified, which is simple, convenient and feasible;
2. compared with traditional solutions for file system data consistency, the method can identify consistency points while the file remains open, does not depend on monitoring whether the file is closed, and can therefore be applied to databases whose files stay open for a long time;
3. the invention obtains consistency points by identifying the write position of IO request packets and does not need to search the data carried by the IO request packets, so the identification takes little time and has little influence on the production efficiency of the operating system where the client is located.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a flowchart illustrating a method for capturing a database consistency point in a block level CDP according to an embodiment of the present invention.
Fig. 2 is a structural diagram of an InnoDB redo log file header in the first embodiment of the present invention.
Fig. 3 is a structural diagram of a capture system of a database consistency point in block level CDP according to a second embodiment of the present invention.
Fig. 4 is a structural diagram of the consistency point cache creation module in the second embodiment of the present invention.
Fig. 5 is a structural diagram of a determining module in the second embodiment of the present invention.
Fig. 6 is a structural diagram of a capture module according to a second embodiment of the present invention.
FIG. 7 is a diagram of a database table according to a second embodiment of the present invention.
In the drawings, the components represented by the respective reference numerals are listed below:
1001, redo log path obtaining module; 1002, consistency point cache creation module; 1003, file filter driver and volume filter driver installation module; 1004, sending module; 1005, basic information obtaining module; 1006, judging module; 1007, capturing module; 10021, installation unit; 10022, creating unit; 10061, checkpoint coverage detection unit; 10071, first address obtaining unit; 10072, consistency point judgment marking unit.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The method provided by the invention can be applied in the following environment: the operating system is Windows 10, and the parsing process is written in the C language.
Interpretation of terms:
(1) file system layer IO request packet: an IO request packet at the file system layer;
(2) volume layer IO request packet: an IO request packet at the volume device layer;
(3) checkpoint: a field in the InnoDB redo log file header, used to mark the last sequence number at which the database cache most recently finished being flushed to disk.
Example one
As shown in fig. 1, a method for capturing a database consistency point in a block level CDP is provided, the method comprising the steps of:
S1, obtaining the path where the InnoDB redo log is located.
It can be understood that the path where the InnoDB redo log is located specifically means the location, in the operating system, of the redo log belonging to the database that uses the InnoDB storage engine.
In addition, for ease of understanding, the following description will discuss step S1 in detail.
Optionally, the step S1 includes:
S11, traversing all environment variables of the current operating system, and screening out the environment variable that holds the path where the start command of the database is located, wherein the database uses the InnoDB storage engine.
It can be understood that the operating system records the path of the database start command in its environment variables, so the environment variable holding that path can be found by traversing each environment variable of the current operating system.
S12, obtaining the path where the InnoDB redo log is located according to the environment variable holding the path of the database start command and the directory structure of the database.
Generally, a database has a fixed directory structure. The location of the database root directory can be derived from the environment variable holding the path of the database start command, and the path where the InnoDB redo log is located can then be obtained from the directory structure of the database.
It is noted that InnoDB may have more than one redo log; for example, a MySQL database may contain multiple redo log files such as "ib_logfile0" and "ib_logfile1". Every redo log file has the same structure, but InnoDB only uses the checkpoints in the redo log named "ib_logfile0", so the path obtained here is that of the redo log named "ib_logfile0".
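For illustration only, steps S11 and S12 can be sketched in user-space C as follows. The sketch assumes, hypothetically, that the start-command directory appears as an entry of a ';'-separated PATH-style value, that the entry mentions "mysql" and ends in "\bin", and that the redo log sits at "<root>\data\ib_logfile0". The function names find_mysql_bin_dir and redo_log_path_from_bin_dir are illustrative and are not taken from the invention.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Case-insensitive substring search (portable helper for this sketch). */
static const char *find_ci(const char *hay, const char *needle)
{
    size_t n = strlen(needle);
    for (; *hay; ++hay) {
        size_t i = 0;
        while (i < n && hay[i] &&
               tolower((unsigned char)hay[i]) == tolower((unsigned char)needle[i]))
            ++i;
        if (i == n)
            return hay;
    }
    return NULL;
}

/* S11 (assumed heuristic): scan a ';'-separated value for an entry that
 * mentions "mysql" and ends in "\bin". Copies it to out; returns 1 if found. */
int find_mysql_bin_dir(const char *path_value, char *out, size_t out_size)
{
    assert(path_value && out);
    const char *p = path_value;
    while (*p) {
        const char *end = strchr(p, ';');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        if (len >= 4 && len < out_size) {
            memcpy(out, p, len);
            out[len] = '\0';
            if (find_ci(out, "mysql") && find_ci(out + len - 4, "\\bin"))
                return 1;
        }
        if (!end)
            break;
        p = end + 1;
    }
    if (out_size)
        out[0] = '\0';
    return 0;
}

/* S12: strip the trailing "\bin" to get the database root directory, then
 * append the assumed fixed layout "\data\ib_logfile0". Returns 1 on success. */
int redo_log_path_from_bin_dir(const char *bin_dir, char *out, size_t out_size)
{
    size_t len = strlen(bin_dir);
    if (len < 4)
        return 0;
    int n = snprintf(out, out_size, "%.*s\\data\\ib_logfile0",
                     (int)(len - 4), bin_dir);
    return n > 0 && (size_t)n < out_size;
}
```

In a real deployment the value passed to find_mysql_bin_dir would come from the environment of the protected machine, and the directory layout would have to match the installed database version.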
S2, creating a consistency point cache in the kernel layer.
Optionally, the step S2 includes:
S21, installing a kernel-mode dynamic link library driver.
S22, creating a spin lock, a non-paged memory allocation table and a hash table in the dynamic link library driver, wherein the hash table has an externally exposed interface, and the hash table manages memory through the non-paged memory allocation table.
Optionally, the step S22 includes:
S221, the non-paged memory allocation table allocates memory through a lazy loading mechanism.
Non-paged memory is a scarce resource in the operating system kernel. Applying for a large block of non-paged memory at one time is wasteful and also affects the execution efficiency of the operating system's kernel processes. Therefore, the consistency point cache adopts a lazy loading mechanism: only when new data is added to the consistency point cache is memory of the required size allocated through the non-paged memory allocation table.
It is worth explaining that, in the operating system kernel, memory is divided into paged memory and non-paged memory. Paged memory is virtual memory that can be swapped out to disk: the data it holds may be moved to disk when memory is insufficient and brought back into memory when the data needs to be accessed. Non-paged memory cannot be swapped out to disk, so the data stored in it can be accessed at any time.
In this embodiment, IO request packets are issued very quickly on the device stack of the operating system. If paged memory were used to temporarily store the first address of an IO request packet, the memory block could easily be swapped out before it is accessed, which would seriously reduce the issuing efficiency of the IO request packet. Using non-paged memory to temporarily store the first address avoids such swapping and therefore does not affect the issuing of the IO request packet. Hence the consistency point cache uses non-paged memory to achieve fast access without affecting the issuing efficiency of IO request packets.
It should be noted that the consistency point cache is used by the file filter driver to temporarily store the first addresses of file system layer IO request packets, and it also provides query and delete functions to the volume filter driver. Before issuing a volume layer IO request packet, the volume filter driver queries whether the first address of the IO request packet exists in the consistency point cache. The consistency point cache provides a query operation with O(1) time complexity through the hash table, so the efficiency with which the volume filter driver issues volume layer IO request packets is not affected.
It should be further noted that the query and delete operations of the volume filter driver and the temporary storage operation of the file filter driver are executed concurrently in the operating system kernel, so a spin lock is created to protect the hash table.
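The structure of the consistency point cache described in step S2 can be illustrated with a minimal user-space analogue in C. This is a sketch under stated assumptions, not the kernel implementation: a C11 atomic_flag stands in for the kernel spin lock, malloc stands in for allocation from non-paged memory, and the names cp_cache, cp_cache_insert, cp_cache_contains and cp_cache_remove are hypothetical. Each entry is allocated only when an address is inserted, mirroring the lazy loading mechanism, and lookups take O(1) time on average through a chained hash table.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CP_BUCKETS 64u

typedef struct cp_node {
    const void *irp_addr;          /* first address of a consistent IO request packet */
    struct cp_node *next;
} cp_node;

typedef struct {
    atomic_flag lock;              /* user-space stand-in for a kernel spin lock */
    cp_node *buckets[CP_BUCKETS];
} cp_cache;

void cp_cache_init(cp_cache *c)
{
    assert(c);
    memset(c->buckets, 0, sizeof c->buckets);
    atomic_flag_clear(&c->lock);
}

static void cp_lock(cp_cache *c)
{
    while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire)) {
        /* spin until the lock is released */
    }
}

static void cp_unlock(cp_cache *c)
{
    atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

static unsigned cp_hash(const void *p)
{
    return (unsigned)(((uintptr_t)p >> 4) % CP_BUCKETS);
}

/* Used by the file filter driver side: memory is requested only on insert. */
int cp_cache_insert(cp_cache *c, const void *irp_addr)
{
    cp_node *n = malloc(sizeof *n);
    if (!n)
        return 0;
    n->irp_addr = irp_addr;
    unsigned h = cp_hash(irp_addr);
    cp_lock(c);
    n->next = c->buckets[h];
    c->buckets[h] = n;
    cp_unlock(c);
    return 1;
}

/* Used by the volume filter driver side: query. */
int cp_cache_contains(cp_cache *c, const void *irp_addr)
{
    int found = 0;
    cp_lock(c);
    for (cp_node *n = c->buckets[cp_hash(irp_addr)]; n; n = n->next) {
        if (n->irp_addr == irp_addr) {
            found = 1;
            break;
        }
    }
    cp_unlock(c);
    return found;
}

/* Used by the volume filter driver side: delete. Returns 1 if an entry was removed. */
int cp_cache_remove(cp_cache *c, const void *irp_addr)
{
    cp_node *victim = NULL;
    cp_lock(c);
    for (cp_node **pp = &c->buckets[cp_hash(irp_addr)]; *pp; pp = &(*pp)->next) {
        if ((*pp)->irp_addr == irp_addr) {
            victim = *pp;
            *pp = victim->next;
            break;
        }
    }
    cp_unlock(c);
    int removed = (victim != NULL);
    free(victim);
    return removed;
}
```

Only the address is stored, never the packet contents, which is why each entry needs no more than a pointer-sized payload.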
S3, installing a file filter driver on the upper layer of the file driver, and installing a volume filter driver on the upper layer of the volume device driver.
It is worth noting that the consistency point capturing method is based on recognizing the redo log checkpoint. Because of the device stack level at which it sits, the volume filter driver cannot recognize the specific write file path of the IO request packets it filters, so a file filter driver is introduced to perform the recognition. Specifically, the file filter driver checks whether an IO request packet modifies a redo log checkpoint, and this check requires the write file path of the IO request packet. A file filter driver installed on the lower layer of the file driver cannot obtain the write file path of the IO request packet, whereas one installed on the upper layer of the file driver can; the file filter driver is therefore installed on the upper layer of the file driver.
S4, sending the path where the InnoDB redo log is located to the file filter driver.
It should be noted that sending the path where the InnoDB redo log is located to the file filter driver may be accomplished by a sending program in the application layer.
S5, when the operating system issues an IO request, the file filter driver captures a write-type IO request packet of the file system layer and obtains basic information, wherein the basic information comprises: write file path, offset, and data length.
It is worth explaining that the write file path, offset, and data length are specifically:
write file path: the location, in the operating system, of the file to be written by the IO request packet;
write file offset: the starting position within the file at which the IO request packet writes;
write file data length: the length of the data that the IO request packet writes to the file.
S6, according to the basic information and the path where the InnoDB redo log is located, the file filter driver judges whether the file system layer IO request packet has consistency; if yes, recording the first address of the file system layer IO request packet in the consistency point cache, and then issuing the file system layer IO request packet; if not, issuing the file system layer IO request packet.
Optionally, the step S6 includes:
S61, judging, according to the basic information and the path where the InnoDB redo log is located, whether the file system layer IO request packet covers a checkpoint field of the database redo log; if so, applying for a memory of a specified size in the consistency point cache, storing the first address of the file system layer IO request packet into the memory, and then issuing the file system layer IO request packet; if not, issuing the file system layer IO request packet.
Fig. 2 shows the contents of the first 2048 B of the InnoDB redo log. In the InnoDB redo log, the first 2048 B comprise four fields: a first block (0 B to 512 B), checkpoint 1 (512 B to 1024 B), an encrypted field (1024 B to 1536 B), and checkpoint 2 (1536 B to 2048 B).
The InnoDB redo log records the physical changes made to each data page of the database. It is mainly used for data recovery: when the database crashes or stops running abnormally, the data can be restored to a consistent state according to the checkpoints.
For ease of understanding, an operation example is provided for step S61: on a 64-bit Windows operating system, the file filter driver obtains the IO stack location of an IO request packet through the IoGetCurrentIrpStackLocation function of the Windows API. From the IO stack location, the write file path of the IO request packet is obtained as "E:\Programfiles\mysql-8.0.23-winx64\data\ib_logfile0", the write file offset as 0x600 (1536 B), and the write file data length as 0x200 (512 B). Since this write falls on checkpoint 2 of the redo log, the file filter driver judges that the IO request packet has consistency. It applies for a block of memory through the dynamic link library in which the consistency point cache resides, the size of the memory being 8 B (the size of a pointer on a 64-bit operating system), stores the first address of the IO request packet into this block of memory, and then issues the IO request packet.
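The judgment in step S61 reduces to a path comparison and a byte-range test, which can be sketched in C as below. The sketch assumes that "covers" means any overlap between the written range and one of the two checkpoint fields of Fig. 2, and that paths are compared exactly (a real Windows driver would likely need a case-insensitive comparison). The type write_info and the function write_covers_checkpoint are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Basic information obtained in step S5. */
typedef struct {
    const char *path;    /* write file path */
    uint64_t    offset;  /* write file offset, in bytes */
    uint32_t    length;  /* write file data length, in bytes */
} write_info;

/* Half-open byte ranges [start, end) of the checkpoint fields per Fig. 2. */
#define CP1_START  512u
#define CP1_END   1024u
#define CP2_START 1536u
#define CP2_END   2048u

/* True when [off, off + len) intersects [start, end). */
static int ranges_overlap(uint64_t off, uint64_t len, uint64_t start, uint64_t end)
{
    return len > 0 && off < end && off + len > start;
}

/* Returns 1 when the write targets the redo log and touches a checkpoint field.
 * Only the position of the write is inspected, never the data it carries. */
int write_covers_checkpoint(const write_info *w, const char *redo_log_path)
{
    assert(w && w->path && redo_log_path);
    if (strcmp(w->path, redo_log_path) != 0)
        return 0;
    return ranges_overlap(w->offset, w->length, CP1_START, CP1_END) ||
           ranges_overlap(w->offset, w->length, CP2_START, CP2_END);
}
```

With the values of the operation example above (offset 0x600, length 0x200), the written range is exactly the checkpoint 2 field, so the function reports a consistent write; a write to the first block or to the encrypted field does not.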
S7, the volume filter driver captures the consistency point by querying the consistency point cache.
Optionally, the step S7 includes:
S71, the volume filter driver obtains the first address of the volume layer IO request packet;
S72, the volume filter driver calls the interface of the hash table in the consistency point cache and judges whether the first address of the volume layer IO request packet is recorded in the consistency point cache; if yes, marking the backup data extracted from the volume layer IO request packet, deleting the first address of the volume layer IO request packet from the consistency point cache through the interface, and then issuing the volume layer IO request packet; if not, backing up the volume layer IO request packet, and then issuing the volume layer IO request packet.
It is worth explaining that, because of the device stack level at which it sits, the volume filter driver cannot identify the InnoDB redo log checkpoint, so the file filter driver is introduced to do the identification. After identifying a consistency point, the file filter driver temporarily stores it in the consistency point cache; after receiving the IO request packet issued by the file filter driver, the volume filter driver completes the final capture of the consistency point by querying the consistency point cache.
According to the technical solution of this embodiment, the volume filter driver captures consistency points in a lighter-weight way, the capture being completed through the cooperation of the file filter driver, the consistency point cache and the volume filter driver. The method does not need to parse the specific content of IO request packets; identifying the InnoDB redo log checkpoint is enough to capture the consistency point, which solves the technical problem in the prior art of a large impact on the efficiency of the operating system where the client is located.
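The cooperation between the two filter drivers in steps S6 and S7 can be simulated in user-space C as follows. This is a simplified sketch: a small fixed-size array stands in for the hash-table cache, the type io_request stands in for an IO request packet, and all names (cp_set, file_filter_on_write, volume_filter_on_write, backup_record) are hypothetical. It relies on the property stated in the description that the same packet has the same first address at the file system layer and at the volume layer.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16

typedef struct { int id; } io_request;         /* stand-in for an IO request packet */

typedef struct {
    const io_request *pending[MAX_PENDING];    /* simplified consistency point cache */
} cp_set;

typedef struct {
    int irp_id;
    int is_consistency_point;                  /* the mark added in step S72 */
} backup_record;

static int cp_set_add(cp_set *s, const io_request *irp)
{
    for (size_t i = 0; i < MAX_PENDING; ++i) {
        if (s->pending[i] == NULL) {
            s->pending[i] = irp;
            return 1;
        }
    }
    return 0;                                  /* set is full */
}

/* Query and delete in one call; returns 1 if the address was recorded. */
static int cp_set_take(cp_set *s, const io_request *irp)
{
    for (size_t i = 0; i < MAX_PENDING; ++i) {
        if (s->pending[i] == irp) {
            s->pending[i] = NULL;
            return 1;
        }
    }
    return 0;
}

/* File filter driver (S6): record the packet address when the write covers a
 * checkpoint; the packet is passed down the device stack in either case. */
void file_filter_on_write(cp_set *s, const io_request *irp, int covers_checkpoint)
{
    assert(s && irp);
    if (covers_checkpoint)
        cp_set_add(s, irp);
}

/* Volume filter driver (S71, S72): back up every write, and mark the backup
 * data as a consistency point when the packet address is found in the cache. */
backup_record volume_filter_on_write(cp_set *s, const io_request *irp)
{
    assert(s && irp);
    backup_record r = { irp->id, 0 };
    if (cp_set_take(s, irp))
        r.is_consistency_point = 1;
    return r;
}
```

Because the entry is deleted as soon as it is matched, a later packet that happens to reuse the same address is not mistaken for a consistency point.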
Example two
As shown in fig. 3, in one embodiment, a system for capturing database consistency points in block level CDP is provided, the system comprising:
a redo log path obtaining module 1001, configured to obtain the path where the InnoDB redo log is located;
a consistency point cache creation module 1002, configured to create a consistency point cache in the kernel layer;
a file filter driver and volume filter driver installation module 1003, configured to install a file filter driver on the file driver layer and install a volume filter driver on the volume device driver layer;
a sending module 1004, configured to send the path where the InnoDB redo log is located to the file filter driver;
a basic information obtaining module 1005, configured so that, when the operating system issues an IO request, the file filter driver captures a write type IO request packet of the file system layer and obtains basic information, where the basic information includes: write file path, offset, and data length;
a judging module 1006, configured so that the file filter driver judges, according to the basic information and the path where the InnoDB redo log is located, whether the file system layer IO request packet has consistency; if yes, the first address of the file system layer IO request packet is recorded in the consistency point cache and the file system layer IO request packet is then issued; if not, the file system layer IO request packet is issued;
a capture module 1007, configured so that the volume filter driver captures consistency points by querying the consistency point cache.
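As an illustration of what module 1001 might do, the following Python sketch derives a redo log path from environment variables, following the approach stated in claim 2. The executable names, the `data` directory name, and the `ib_logfile0` file name are assumptions about a typical MySQL layout and are not taken from this document; an actual installation may place its data directory elsewhere.

```python
import os
import tempfile
from pathlib import Path

# Assumed names; not specified by the patent text.
SERVER_NAMES = ("mysqld", "mysqld.exe")
DATA_DIR_NAME = "data"
REDO_LOG_NAME = "ib_logfile0"


def find_redo_log(environ):
    """Scan every environment variable for a directory holding the server binary."""
    for value in environ.values():
        for entry in value.split(os.pathsep):
            if not entry:
                continue
            bin_dir = Path(entry)
            try:
                has_server = any((bin_dir / name).is_file() for name in SERVER_NAMES)
            except (OSError, ValueError):
                continue  # not a usable path; skip it
            if has_server:
                # Assumed directory structure: <base>/bin/mysqld and <base>/data/ib_logfile0
                return bin_dir.parent / DATA_DIR_NAME / REDO_LOG_NAME
    return None


# Small demonstration against a temporary directory tree.
with tempfile.TemporaryDirectory() as root:
    bin_dir = Path(root) / "mysql" / "bin"
    bin_dir.mkdir(parents=True)
    (bin_dir / "mysqld").touch()
    env = {"HOME": root, "PATH": os.pathsep.join([root, str(bin_dir)])}
    found = find_redo_log(env)
    expected = Path(root) / "mysql" / "data" / REDO_LOG_NAME
```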
Optionally, as shown in fig. 4, on the basis of this embodiment, the consistency point cache creation module 1002 includes:
an installation unit 10021, configured to install a kernel-mode dynamic link library driver;
a creating unit 10022, configured to create a spin lock, a non-paged memory allocation table, and a hash table in the dynamic link library driver, where the hash table has an interface exposed to the outside, and the hash table manages a memory through the non-paged memory allocation table.
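The structure created by unit 10022 can be pictured with a user-space analogue in Python. This is only a sketch under stated assumptions: a `threading.Lock` stands in for the kernel spin lock, a fixed pool of slots stands in for the non-paged memory allocation table, and a `dict` stands in for the hash table. The class and method names are hypothetical.

```python
import threading


class ConsistencyPointCache:
    """Fixed-capacity cache of packet addresses with insert, query and delete."""

    def __init__(self, capacity):
        self._lock = threading.Lock()             # stand-in for the spin lock
        self._free_slots = list(range(capacity))  # stand-in for the allocation table
        self._slots = [None] * capacity           # preallocated storage
        self._index = {}                          # hash table: address -> slot number

    def insert(self, address):
        """Record an address; return False when no slot is left."""
        with self._lock:
            if address in self._index:
                return True
            if not self._free_slots:
                return False
            slot = self._free_slots.pop()
            self._slots[slot] = address
            self._index[address] = slot
            return True

    def contains(self, address):
        with self._lock:
            return address in self._index

    def remove(self, address):
        """Delete an address and give its slot back to the pool."""
        with self._lock:
            slot = self._index.pop(address, None)
            if slot is None:
                return False
            self._slots[slot] = None
            self._free_slots.append(slot)
            return True
```

Preallocating the slots reflects one plausible reason for using a non-paged allocation table: the drivers can record and release addresses without allocating memory on the IO path.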
Optionally, as shown in fig. 5, on the basis of this embodiment, the judging module 1006 includes:
a checkpoint coverage detection unit 10061, configured to determine, according to the basic information and the path where the InnoDB redo log is located, whether the file system layer IO request packet covers the checkpoint field of the database redo log; if so, a memory block of a specified size is requested in the consistency point cache, the first address of the file system layer IO request packet is stored in that memory, and the file system layer IO request packet is then issued; if not, the file system layer IO request packet is issued.
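The coverage test performed by unit 10061 amounts to a path comparison followed by an interval overlap check on the write offset and data length. The Python sketch below illustrates this. The checkpoint block offsets (512 and 1536) and block size (512 bytes) are assumptions about the redo log file header of the MySQL version used in the experiments and are not stated in this document; the function name is hypothetical.

```python
from pathlib import PureWindowsPath

# Assumed (start offset, size) of the checkpoint blocks in the first redo log file.
CHECKPOINT_BLOCKS = ((512, 512), (1536, 512))


def covers_checkpoint(write_path, offset, length, redo_log_path):
    """Return True if a write to write_path overlaps an assumed checkpoint block."""
    # PureWindowsPath compares case-insensitively and accepts both slash styles.
    if PureWindowsPath(write_path) != PureWindowsPath(redo_log_path):
        return False
    if length <= 0:
        return False
    write_end = offset + length
    # Two half-open intervals overlap when each starts before the other ends.
    return any(
        offset < start + size and start < write_end
        for start, size in CHECKPOINT_BLOCKS
    )
```

Only the write file path, offset, and data length are needed, which matches the basic information obtained by module 1005; the content of the packet is never inspected.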
Optionally, as shown in fig. 6, on the basis of this embodiment, the capture module 1007 includes:
a first address obtaining unit 10071, configured to obtain, by the volume filter driver, a first address of the volume layer IO request packet;
a consistency point judgment marking unit 10072, configured so that the volume filter driver calls the externally exposed hash table interface of the consistency point cache and judges whether the first address of the volume layer IO request packet is recorded in the consistency point cache; if yes, the backup data extracted from the volume layer IO request packet is marked, and the first address of the volume layer IO request packet is deleted from the consistency point cache through the interface; if not, the volume layer IO request packet is backed up and then issued.
In the technical scheme of this embodiment, the redo log path obtaining module 1001 obtains the path where the InnoDB redo log is located; the consistency point cache creation module 1002 creates a consistency point cache in the kernel layer; the file filter driver and volume filter driver installation module 1003 installs a file filter driver on the file driver layer and a volume filter driver on the volume device driver layer; the sending module 1004 sends the path where the InnoDB redo log is located to the file filter driver; the basic information obtaining module 1005, when the operating system issues an IO request, has the file filter driver capture a write type IO request packet of the file system layer and obtain basic information, including the write file path, offset, and data length; the judging module 1006 has the file filter driver judge, according to the basic information and the path where the InnoDB redo log is located, whether the file system layer IO request packet has consistency, recording the first address of the packet in the consistency point cache before issuing it if so, and issuing it directly if not; and the capture module 1007 has the volume filter driver capture consistency points by querying the consistency point cache. This solves the technical problem in the prior art that the consistency of recovered data cannot be maintained: identification of IO request packets at the file system layer is added to the block-level continuous data protection solution, and a consistency point can be identified by detecting the checkpoint generated by the InnoDB storage engine, thereby ensuring the consistency of the recovered data.
A set of experiments is provided below to further illustrate the embodiment. The operating system used in the experiments is Windows 10 20H2, and the database used is MySQL 8.0.23.
The experiment was performed 10 times, each at the scale of 10 million transactions, and the 10 million transactions were executed using Python scripts. The structure of the database table involved in the transactions is shown in fig. 7; the table comprises four fields: ID, name, age, and balance. Each transaction performs a transfer between two randomly selected records with different ages: an amount is first deducted from the balance of one record and then added to the balance of the other. The number of consistency points captured in each experiment is about 130, and one consistency point is randomly selected as the recovery point in each experiment.
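The workload described above can be sketched in Python as follows. This is not the script used in the experiments: it uses the standard-library `sqlite3` module with an in-memory database as a stand-in for MySQL, a much smaller number of records and transactions, and hypothetical table and function names. It shows why the sum of balances is a useful invariant: each transfer deducts and adds the same amount inside one transaction.

```python
import random
import sqlite3


def create_accounts(conn, count, opening_balance):
    """Create a table with the four fields ID, name, age and balance."""
    conn.execute(
        "CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, balance INTEGER)"
    )
    conn.executemany(
        "INSERT INTO account (id, name, age, balance) VALUES (?, ?, ?, ?)",
        [(i, f"user{i}", 20 + i, opening_balance) for i in range(1, count + 1)],
    )
    conn.commit()


def transfer(conn, rng, amount):
    """Move an amount between two randomly chosen records in a single transaction."""
    ids = [row[0] for row in conn.execute("SELECT id FROM account")]
    payer, payee = rng.sample(ids, 2)
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, payer))
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, payee))


def total_balance(conn):
    return conn.execute("SELECT SUM(balance) FROM account").fetchone()[0]


conn = sqlite3.connect(":memory:")
create_accounts(conn, count=10, opening_balance=1000)
before = total_balance(conn)
rng = random.Random(0)
for _ in range(1000):
    transfer(conn, rng, rng.randint(1, 50))
after = total_balance(conn)
```

If a backup were restored to a point in the middle of a transfer, the deduction could be present without the matching addition, and the total would differ; restoring to a consistency point is intended to avoid this.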
The evaluation criteria of the experiment are whether the sum of the balances of all records after data recovery is consistent with that before data recovery, and whether the database runs normally. The experimental results are as follows:
the first experiment generated 131 consistency points in total, the selected recovery point was the 7th consistency point, the data remained consistent after recovery, and the database ran normally;
the second experiment generated 131 consistency points in total, the selected recovery point was the 21st consistency point, the data remained consistent after recovery, and the database ran normally;
the third experiment generated 128 consistency points in total, the selected recovery point was the 74th consistency point, the data remained consistent after recovery, and the database ran normally;
the fourth experiment generated 130 consistency points in total, the selected recovery point was the 56th consistency point, the data remained consistent after recovery, and the database ran normally;
the fifth experiment generated 128 consistency points in total, the selected recovery point was the 45th consistency point, the data remained consistent after recovery, and the database ran normally;
the sixth experiment generated 130 consistency points in total, the selected recovery point was the 103rd consistency point, the data remained consistent after recovery, and the database ran normally;
the seventh experiment generated 131 consistency points in total, the selected recovery point was the 71st consistency point, the data remained consistent after recovery, and the database ran normally;
the eighth experiment generated 130 consistency points in total, the selected recovery point was the 13th consistency point, the data remained consistent after recovery, and the database ran normally;
the ninth experiment generated 128 consistency points in total, the selected recovery point was the 116th consistency point, the data remained consistent after recovery, and the database ran normally;
the tenth experiment generated 130 consistency points in total, the selected recovery point was the 79th consistency point, the data remained consistent after recovery, and the database ran normally.
The above results show that, in all ten experiments, the randomly selected recovery point preserved data consistency after recovery, and the database ran normally without crashing. The number of consistency points captured in each experiment was about 130 and remained stable. Therefore, this embodiment can stably capture consistency points, and data recovered from a consistency point remains consistent.
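The stability claim can be checked with a short calculation over the ten reported results. The Python sketch below only restates the figures listed above; the variable names are illustrative.

```python
from statistics import mean

# Consistency points generated and recovery point chosen in experiments 1 to 10.
counts = [131, 131, 128, 130, 128, 130, 131, 130, 128, 130]
recovery_points = [7, 21, 74, 56, 45, 103, 71, 13, 116, 79]

average = mean(counts)               # average number of consistency points per run
spread = max(counts) - min(counts)   # difference between the largest and smallest run
# Every chosen recovery point must be one of the points generated in that run.
all_valid = all(1 <= p <= c for p, c in zip(recovery_points, counts))
```

The counts average 129.7 and range from 128 to 131, which is consistent with the statement that about 130 consistency points were captured in each run.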
Example three
In one embodiment, a computer readable storage medium is provided, on which a computer program is stored, which when executed by a processor, implements the method for capturing a database consistency point in a block level CDP according to the first embodiment.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The above embodiments represent only several implementations of the present invention. Their description is specific and detailed, but it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of protection of the present invention. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (10)

1. A method for capturing a database consistent point in a block level CDP is characterized by comprising the following steps:
a redo log path obtaining step, obtaining the path where the InnoDB redo log is located;
a consistent point high-speed cache establishing step, namely establishing a consistent point high-speed cache in the kernel layer;
a file filter driver and a volume filter driver installation step, wherein the file filter driver is installed on the upper layer of the file driver, and the volume filter driver is installed on the upper layer of the volume equipment driver;
a sending step, sending the path where the InnoDB redo log is located to the file filter driver;
a basic information obtaining step, when an operating system issues an IO request, the file filter driver captures a write type IO request packet of a file system layer and obtains basic information, wherein the basic information comprises: write file path, offset, and data length;
a judging step, wherein the file filter driver judges whether the file system layer IO request packet has consistency according to the basic information and the path where the InnoDB redo log is located; if yes, recording the first address of the file system layer IO request packet in the consistency point cache, and then issuing the file system layer IO request packet; if not, issuing the file system layer IO request packet;
a capture step, the volume filter driver capturing consistency points by querying a consistency point cache.
2. The method of claim 1, wherein the step of obtaining the path of the redo log comprises:
traversing all environment variables of the current operating system, and screening out the environment variable of the path where the database start command is located, wherein the database uses the InnoDB storage engine;
and obtaining the path where the InnoDB redo log is located according to the environment variable of the path where the database start command is located and the directory structure of the database.
3. The method of claim 1, wherein the step of cache creation of the consistency point comprises:
installing a kernel-state dynamic link library driver;
creating a spin lock, a non-paged memory allocation table and a hash table in the dynamic link library driver, wherein the hash table has an externally exposed interface, and the hash table manages a memory through the non-paged memory allocation table.
4. The method of claim 3, wherein the determining step comprises:
judging whether the file system layer IO request packet covers the checkpoint field of the database redo log according to the basic information and the path where the InnoDB redo log is located; if so, requesting a memory block of a specified size in the consistency point cache, storing the first address of the file system layer IO request packet in the memory, and then issuing the file system layer IO request packet; and if not, issuing the file system layer IO request packet.
5. The method of claim 3, wherein the capturing step comprises:
a volume filter driver acquires a first address of a volume layer IO request packet;
the volume filter driver calls the hash table interface of the consistency point cache and judges whether the first address of the volume layer IO request packet is recorded in the consistency point cache; if yes, marking the backup data extracted from the volume layer IO request packet, deleting the first address of the volume layer IO request packet from the consistency point cache through the interface, and issuing the volume layer IO request packet; if not, backing up the volume layer IO request packet, and then issuing the volume layer IO request packet.
6. A system for capturing a database consistency point in a block level CDP, the system comprising:
a redo log path obtaining module, configured to obtain the path where the InnoDB redo log is located;
a consistency point cache creation module, configured to create a consistency point cache in the kernel layer;
a file filter driver and volume filter driver installation module, configured to install a file filter driver on the file driver layer and install a volume filter driver on the volume device driver layer;
a sending module, configured to send the path where the InnoDB redo log is located to the file filter driver;
a basic information obtaining module, configured to capture, by the file filter driver, a write type IO request packet of a file system layer and obtain basic information when an IO request is issued by an operating system, where the basic information includes: write file path, offset, and data length;
a judging module, configured so that the file filter driver judges whether the file system layer IO request packet has consistency according to the basic information and the path where the InnoDB redo log is located; if yes, recording the first address of the file system layer IO request packet in the consistency point cache, and then issuing the file system layer IO request packet; if not, issuing the file system layer IO request packet;
a capture module to capture a consistency point by the volume filter driver by querying a consistency point cache.
7. The system of claim 6, wherein the cache creation module comprises:
the installation unit is used for installing the kernel-state dynamic link library driver;
the creating unit is used for creating a spin lock, a non-paged memory allocation table and a hash table in the dynamic link library driver, wherein the hash table is provided with an externally exposed interface, and the hash table manages a memory through the non-paged memory allocation table.
8. The system of claim 7, wherein the determining module comprises:
a checkpoint coverage detection unit, configured to judge whether the file system layer IO request packet covers the checkpoint field of the database redo log according to the basic information and the path where the InnoDB redo log is located; if so, requesting a memory block of a specified size in the consistency point cache, storing the first address of the file system layer IO request packet in the memory, and then issuing the file system layer IO request packet; and if not, issuing the file system layer IO request packet.
9. The system of claim 7, wherein the capture module comprises:
a first address obtaining unit, configured to obtain, by a volume filter driver, a first address of a volume layer IO request packet;
a consistency point judgment marking unit, configured so that the volume filter driver calls the externally exposed hash table interface of the consistency point cache and judges whether the first address of the volume layer IO request packet is recorded in the consistency point cache; if yes, marking the backup data extracted from the volume layer IO request packet, and deleting the first address of the volume layer IO request packet from the consistency point cache through the interface; if not, backing up the volume layer IO request packet, and then issuing the volume layer IO request packet.
10. A computer readable storage medium having stored thereon a computer program, wherein the program when executed by a processor implements the method of capturing database consistency points in a block level CDP according to any of claims 1 to 5.
CN202210508886.5A 2022-05-11 2022-05-11 Method, system and storage medium for capturing database consistent point in block-level CDP Active CN114647624B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210508886.5A CN114647624B (en) 2022-05-11 2022-05-11 Method, system and storage medium for capturing database consistent point in block-level CDP

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210508886.5A CN114647624B (en) 2022-05-11 2022-05-11 Method, system and storage medium for capturing database consistent point in block-level CDP

Publications (2)

Publication Number Publication Date
CN114647624A true CN114647624A (en) 2022-06-21
CN114647624B CN114647624B (en) 2022-08-02

Family

ID=81996937

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210508886.5A Active CN114647624B (en) 2022-05-11 2022-05-11 Method, system and storage medium for capturing database consistent point in block-level CDP

Country Status (1)

Country Link
CN (1) CN114647624B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115827334A (en) * 2023-01-09 2023-03-21 四川大学 ORACLE database block-level CDP backup recovery method and system
CN117873403A (en) * 2024-03-11 2024-04-12 四川大学 Method and system for restoring tmp file in office document IO

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101501652A (en) * 2005-03-04 2009-08-05 伊姆西公司 Checkpoint and consistency markers
US20120210066A1 (en) * 2011-02-15 2012-08-16 Fusion-Io, Inc. Systems and methods for a file-level cache
CN103399921A (en) * 2013-08-01 2013-11-20 天津火星科技有限公司 Consensus point capturing method based on Oracle database
US20150026404A1 (en) * 2013-07-19 2015-01-22 Apple Inc. Least Recently Used Mechanism for Cache Line Eviction from a Cache Memory
US20150143044A1 (en) * 2013-11-15 2015-05-21 Apple Inc. Mechanism for sharing private caches in a soc
CN110442560A (en) * 2019-08-14 2019-11-12 上海达梦数据库有限公司 Method, apparatus, server and storage medium are recurred in a kind of log
CN111125040A (en) * 2018-10-31 2020-05-08 华为技术有限公司 Method, apparatus and storage medium for managing redo log
CN112035410A (en) * 2020-08-18 2020-12-04 腾讯科技(深圳)有限公司 Log storage method and device, node equipment and storage medium
CN112256485A (en) * 2020-10-30 2021-01-22 网易(杭州)网络有限公司 Data backup method, device, medium and computing equipment
CN113760846A (en) * 2021-01-08 2021-12-07 北京沃东天骏信息技术有限公司 Data processing method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
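Wang Yu (王郁): "Design and Implementation of an Encryption System Based on the Windows Secure Access Interface Layer", China Excellent Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology Series *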

Also Published As

Publication number Publication date
CN114647624B (en) 2022-08-02

Similar Documents

Publication Publication Date Title
CN114647624B (en) Method, system and storage medium for capturing database consistent point in block-level CDP
JP6556911B2 (en) Method and apparatus for performing an annotated atomic write operation
Lee et al. Smart layers and dumb result: IO characterization of an android-based smartphone
Kang et al. X-FTL: transactional FTL for SQLite databases
US8812446B2 (en) Block level backup and restore
CN101819525B (en) Method and equipment for searching configuration file of application in system
US7904493B2 (en) Method and system for object age detection in garbage collection heaps
US20060037079A1 (en) System, method and program for scanning for viruses
US20100280996A1 (en) Transactional virtual disk with differential snapshots
KR101674176B1 (en) Method and apparatus for fsync system call processing using ordered mode journaling with file unit
CN109683825B (en) Storage system online data compression method, device and equipment
US9477496B2 (en) Method and apparatus for loading classes and re-organizing class archives
US8484620B2 (en) Implementing performance impact reduction of watched variables
CN111666046B (en) Data storage method, device and equipment
CN113641446A (en) Memory snapshot creating method, device and equipment and readable storage medium
JP2015114750A (en) Examination program, information processing device, and information processing method
US11726991B2 (en) Bulk updating of mapping pointers with metadata transaction log
US11055184B2 (en) In-place garbage collection of a sharded, replicated distributed state machine based on supersedable operations
CN117112522A (en) Concurrent process log management method, device, equipment and storage medium
Cui et al. Pars: A page-aware replication system for efficiently storing virtual machine snapshots
CN106557572A (en) A kind of extracting method and system of Android application program file
CN103164290B (en) application memory management method and device
Seo et al. Deduplication flash file system with PRAM for non-linear editing
Song et al. CADedup: High-performance Consistency-aware Deduplication Based on Persistent Memory
Park et al. Memory efficient fork-based checkpointing mechanism for in-memory database systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant