CN112099996A

CN112099996A - Database cluster multi-node redo log recovery method based on page update sequence number

Info

Publication number: CN112099996A
Application number: CN202010993792.2A
Authority: CN
Inventors: 刘碧楠; 周勇亮; 吴嵩; 蒋旭; 于凯; 马岳; 李彬; 陈振巍
Original assignee: TIANJIN SHENZHOU GENERAL DATA TECHNOLOGY CO LTD
Current assignee: TIANJIN SHENZHOU GENERAL DATA TECHNOLOGY CO LTD
Priority date: 2020-09-21
Filing date: 2020-09-21
Publication date: 2020-12-18
Anticipated expiration: 2040-09-21
Also published as: CN112099996B

Abstract

The invention relates to a database cluster multi-node redo log recovery method based on page update sequence numbers, comprising the following steps: allocating space at the head of each page for a page update sequence number; whenever a node in the cluster updates a data page, updating the page update sequence number and that node's redo log; restarting the database cluster, wherein the node started first becomes the main node and performs instance recovery; the main node loads a control file from the shared disk, reads the redo log information of each node from the control file as a scanning handle, and stores the scanning handles in a scanning handle array; traversing all scanning handles and recovering the redo logs; and, once all scanning handles have been scanned, completing redo log recovery. The method maintains the page update sequence number both when a redo log record is added and during instance recovery, uses it to judge whether the updates to a page are continuous, and replays modifications made to the same page by different nodes in order, thereby ensuring data consistency and improving the reliability of system operation.

Description

Database cluster multi-node redo log recovery method based on page update sequence number

Technical Field

The invention belongs to the technical field of databases, relates to database recovery, and particularly relates to a database cluster multi-node redo log recovery method based on page update sequence numbers.

Background

When a transaction modifies a row of data in the database, the database reads the relevant data page from disk into memory and modifies it there. The in-memory page then differs from the page contents on disk; such a page is called a dirty page.

The database does not flush a dirty page back to disk every time one is generated, because doing so would produce a large number of random I/O operations and seriously degrade database performance. Instead, the database has a dedicated page flushing thread that periodically writes in-memory data pages back to disk; once flushed, a page becomes a clean page. If the database goes down unexpectedly (because of a power failure, system failure, or process crash) between the time a page becomes dirty and the time it is flushed and becomes clean, data errors can result and the user's modifications are lost, so transaction durability cannot be guaranteed.

The database solves this problem with the redo log, which ensures transaction durability. When a transaction modifies a data page, the modification is recorded in a redo log file. When the database restarts after a crash, it can be restored to a correct state by replaying the redo log.

In a shared-storage cluster, each database node can read and write the data files on the shared storage and serves clients independently. If the nodes shared a single redo log file, contention would be inevitable and database performance would suffer. Therefore, each database node in the cluster has its own redo log file. During operation, each node accesses only its own redo log file and never the redo log files of other nodes. When a node generates a redo log record, the record is written only to that node's redo log, and redo log writes on different nodes are completely independent of one another. This reduces interaction and waiting among the nodes and improves the overall performance of the cluster.

In a single-machine database, redo log recovery only requires replaying the log records in ascending order of LSN (note: the LSN is the logical sequence number of the log and increases as the log is written). In a database cluster, by contrast, each node has its own redo log, and the LSNs of different logs cannot be compared directly, so redo log recovery runs into problems. For example:

As shown in FIG. 1a, consider redo log recovery in the case where one data page has been modified by multiple instances; in the figure, two nodes modify the page alternately.

For the same data page, if the redo log of one node is replayed in full before the redo log of another node, the modifications are not replayed in the order in which they were made to the page, and the recovered data is wrong, as shown in FIG. 1b.

The log records that modify a given data page may be distributed across different redo log files, and recovery is correct only if they are replayed in the order in which the page was modified, as shown in FIG. 1c.
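The ordering problem described above can be illustrated with a small sketch. It is not taken from the patent: the two logs, the arithmetic operations standing in for page modifications, and the initial page value are all hypothetical, chosen only because the operations do not commute.

```python
import operator

# Hypothetical per-node redo logs for one shared page.
# Each record: (order of the modification on the page, operation, operand).
node_a_log = [(1, operator.add, 1), (3, operator.add, 4)]
node_b_log = [(2, operator.mul, 3), (4, operator.mul, 2)]


def replay(records, page=1):
    """Apply the records to the page value in the order given."""
    for _, op, operand in records:
        page = op(page, operand)
    return page


# Replaying node A's whole log and then node B's (the FIG. 1b situation).
node_by_node = replay(node_a_log + node_b_log)

# Replaying in the order the page was actually modified (the FIG. 1c situation).
in_page_order = replay(sorted(node_a_log + node_b_log, key=lambda r: r[0]))

print(node_by_node, in_page_order)
```

The two replays end with different page values, and only the second matches what the page held before the crash.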

In summary, how to replay, in order, the modifications made to the same page by different nodes so as to ensure data consistency is a problem that urgently needs to be solved.

Disclosure of Invention

The invention aims to overcome the defects in the prior art and provides a database cluster multi-node redo log recovery method based on page update sequence numbers, which is reasonable in design, high in efficiency and capable of effectively guaranteeing the consistency of a cluster database.

The invention solves this technical problem by adopting the following technical solution:

A database cluster multi-node redo log recovery method based on page update sequence numbers comprises the following steps:

step 1, allocating space at the head of each page for a page update sequence number;

step 2, whenever a node in the cluster updates a data page, updating the page update sequence number and that node's redo log;

step 3, restarting the database cluster, wherein the node started first becomes the main node, and the main node performs instance recovery;

step 4, the main node loads a control file from the shared disk, reads the redo log information of each node from the control file as a scanning handle, and stores the scanning handles in a scanning handle array;

step 5, traversing all scanning handles and recovering the redo logs;

step 6, once all scanning handles have been scanned, redo log recovery is complete.

Further, step 1 allocates 8 bytes at the head of the page as the page update sequence number.

Further, the specific implementation method of step 2 is as follows: when the data page is modified, a redo log is added, the modified content and the current page update sequence number are written into the redo log, and then the page update sequence number value is increased by one.

Further, the control file loaded by the main node from the shared disk is a binary file defining the physical state of the database cluster.

Further, the specific implementation method of step 5 includes the following steps:

first, traversing the scanning handles; if all scanning handles have been scanned completely, exiting the loop and going to step 6;

second, if the current scanning handle has been scanned completely, returning to the start of step 5 and beginning to scan the next redo log;

third, reading, parsing, and recovering the redo log records from the current scanning handle in a loop.

Further, the specific implementation method of the third sub-step of step 5 is as follows:

first, if the end of the redo log is reached, recovery of the current redo log is complete; the current scanning handle is marked as completely scanned, and the method returns to step 5 to begin scanning the next redo log;

second, a redo log record is read from the current scanning position of the scanning handle, the page update sequence number is obtained from the record and compared with the current update sequence number of the page, and the following processing is performed:

if the sequence number recorded in the redo log record is smaller than the current update sequence number of the page, the record is ignored and scanning continues with the next record;

if the sequence number recorded in the redo log record is equal to the current update sequence number of the page, the modification in the record is applied to the data page, the current update sequence number of the page is incremented by one, and scanning continues with the next record;

if the sequence number recorded in the redo log record is greater than the current update sequence number of the page, the page is discontinuous: the next in-order update to the page is recorded in the redo log of another node, so the method returns to step 5 and begins scanning the next redo log.

The invention has the advantages and positive effects that:

The invention is reasonably designed. It gives each page a page update sequence number that marks the order of updates to that page, and maintains this number both when redo log records are added and during instance recovery. By traversing each redo log and using the page update sequence number to judge whether the updates to a page are continuous, it replays the modifications made to the same page by different nodes in order, thereby ensuring data consistency and improving the reliability of system operation.

Drawings

FIG. 1a is a schematic diagram of a process in which a data page is modified by multiple instances;

FIG. 1b is a schematic diagram of recovery performed in node order;

FIG. 1c is a schematic diagram of recovery performed in page modification order;

FIG. 2 is a flow chart of writing a redo log according to the present invention;

FIG. 3 is a flow chart of recovering redo logs of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings.

The design idea of the invention is as follows. To mark the order in which a page is updated, an update sequence number is maintained for each page: 8 bytes are allocated at the head of the page as the page update sequence number, which represents the order of modifications to the page. After each update to a data page, the page update sequence number is incremented by one. When a redo log record is added, the current update sequence number of the page is recorded in it as well. During instance recovery, the redo log of one node is selected and replayed; if the page to be recovered is found to be discontinuous, that is, the next record's sequence number is ahead of the page's, recovery switches to the next redo log. This process repeats until all redo logs have been recovered. By traversing each redo log and using the page update sequence number to judge whether the updates to a page are continuous, the modifications made to the same page by different nodes are replayed in order, which ensures data consistency.

Based on the design idea, the invention provides a database cluster multi-node redo log recovery method based on page update sequence numbers, which comprises the following steps:

step 1, allocating space at the head of the page as a page update sequence number, wherein the page update sequence number represents the update sequence of the page.

In this embodiment, 8 bytes are allocated at the head of the page as the page update sequence number. When a redo log record is added, the modified content and the current page update sequence number are written into the redo log.
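As a minimal sketch of such a page header, the following reserves the first 8 bytes of a page buffer for the sequence number. The page size, the little-endian byte order, and the helper names are assumptions made for illustration; the patent specifies only the 8-byte width and the position at the head of the page.

```python
import struct

PAGE_SIZE = 8192      # assumed page size; not specified in the patent
USN_FORMAT = "<Q"     # 8-byte unsigned integer; little-endian is an assumption
USN_SIZE = struct.calcsize(USN_FORMAT)


def new_page() -> bytearray:
    """Return a zeroed page whose first 8 bytes hold the update sequence number."""
    return bytearray(PAGE_SIZE)


def read_usn(page: bytearray) -> int:
    """Read the page update sequence number from the head of the page."""
    return struct.unpack_from(USN_FORMAT, page, 0)[0]


def write_usn(page: bytearray, usn: int) -> None:
    """Store the page update sequence number at the head of the page."""
    struct.pack_into(USN_FORMAT, page, 0, usn)


page = new_page()
write_usn(page, read_usn(page) + 1)   # one update: sequence number goes from 0 to 1
print(read_usn(page))
```

An 8-byte unsigned counter that grows by one per update is large enough that wraparound is not a practical concern.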

Step 2, whenever a node in the cluster updates a data page, updating the page update sequence number and that node's redo log.

As shown in FIG. 2, the specific implementation method of this step is: when the data page is modified, a redo log record is added, the modified content and the current page update sequence number are written into the redo log, and then the page update sequence number is incremented by one.
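The write path of step 2 can be sketched as follows. The `Page` and `RedoRecord` classes, the key/value form of a modification, and the in-memory lists standing in for per-node redo log files are hypothetical simplifications, not structures defined by the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    page_id: int
    usn: int = 0                                # page update sequence number
    data: dict = field(default_factory=dict)    # stand-in for the page body


@dataclass(frozen=True)
class RedoRecord:
    page_id: int
    usn: int      # sequence number of the page at the time of the change
    key: str
    value: str


def modify_page(page: Page, redo_log: list, key: str, value: str) -> None:
    """Log the change with the page's current sequence number, then advance it."""
    redo_log.append(RedoRecord(page.page_id, page.usn, key, value))
    page.data[key] = value
    page.usn += 1


# Two nodes alternately modify the same shared page, each writing its own log.
page = Page(page_id=7)
log_a, log_b = [], []
modify_page(page, log_a, "x", "a0")
modify_page(page, log_b, "x", "b1")
modify_page(page, log_a, "x", "a2")
print([r.usn for r in log_a], [r.usn for r in log_b], page.usn)
```

Because every record carries the sequence number the page had just before the change, the records of the two logs together form one gap-free sequence per page, which is what recovery later relies on.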

Step 3, restarting the database cluster, wherein the node started first becomes the main node, and the main node performs instance recovery.

Step 4, the main node loads a control file (note: the control file is a binary file defining the physical state of the database cluster) from the shared disk, reads the redo log information of each node from the control file as a scanning handle, and stores the scanning handles in a scanning handle array.
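A scanning handle array might look like the following sketch. The patent does not describe the control file layout, so parsing is omitted and the redo log information is assumed to be already available as a mapping from node id to log path; the `ScanHandle` fields and the example paths are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScanHandle:
    node_id: int
    log_path: str            # where this node's redo log lives on the shared disk
    position: int = 0        # current scanning position within the log
    finished: bool = False   # set once the end of the log has been reached


def build_scan_handles(redo_log_info: dict) -> list:
    """Create one scanning handle per node from already-parsed control file data."""
    return [ScanHandle(node_id, path)
            for node_id, path in sorted(redo_log_info.items())]


# Hypothetical redo log information for a two-node cluster.
handles = build_scan_handles({
    2: "/shared/redo_node2.log",
    1: "/shared/redo_node1.log",
})
print([h.node_id for h in handles])
```

Keeping the position and the finished flag in the handle lets recovery leave a log part-way through and resume it later from the same record.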

Step 5, traversing all the scanning handles and recovering the redo logs.

As shown in fig. 3, the specific implementation method of this step includes the following steps:

First, if all the scanning handles have been scanned completely, the process exits the loop and goes to step 6.

Second, if the current scanning handle has been scanned completely, the process returns to the start of step 5 and begins scanning the next redo log.

Third, the redo log records are read, parsed, and recovered from the current scanning handle in a loop, as follows:

If the end of the redo log is reached, recovery of the current redo log is complete and the current scanning handle is marked as completely scanned; the process then returns to step 5 and begins scanning the next redo log.

Otherwise, a redo log record is read from the current scanning position of the scanning handle. The page update sequence number is obtained from the record and compared with the current update sequence number of the page, and the following processing is performed:

If the sequence number recorded in the redo log record is smaller than the current update sequence number of the page, the content of the record has already been flushed to the page; the record is ignored and scanning continues with the next record.

If the sequence number recorded in the redo log record is equal to the current update sequence number of the page, the record has not yet been flushed to the page and is the next update in sequence, so its modification is applied to the data page. The current update sequence number of the page is then incremented by one to advance redo log recovery, and scanning continues with the next record.

If the sequence number recorded in the redo log record is greater than the current update sequence number of the page, the page is discontinuous: this record is not the next update in sequence, and the next in-order update to the page is recorded in the redo log of another node. The process returns to step 5 and begins scanning the next redo log.

Step 6, at this point all the scanning handles have been scanned and recovery of all the redo logs is complete.
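Steps 5 and 6 can be sketched as the loop below. It is an illustration under stated assumptions rather than the patented implementation: pages are plain dictionaries, each redo log is an in-memory list of hypothetical `RedoRecord` objects, and the no-progress check that raises an error is an added safeguard the patent does not mention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoRecord:
    page_id: int
    usn: int      # page update sequence number stored with the record
    key: str
    value: str


@dataclass
class ScanHandle:
    records: list            # stand-in for one node's redo log
    position: int = 0
    finished: bool = False


def recover(pages: dict, handles: list) -> dict:
    """Replay all redo logs, applying each page's updates in sequence order."""
    while not all(h.finished for h in handles):       # step 6 once all are done
        progressed = False
        for handle in handles:                         # step 5: next redo log
            if handle.finished:
                continue
            while True:
                if handle.position == len(handle.records):
                    handle.finished = True             # end of this redo log
                    progressed = True
                    break
                rec = handle.records[handle.position]
                page = pages.setdefault(rec.page_id, {"usn": 0, "data": {}})
                if rec.usn > page["usn"]:
                    break          # gap: the next update is in another node's log
                if rec.usn == page["usn"]:
                    page["data"][rec.key] = rec.value  # apply the modification
                    page["usn"] += 1
                # rec.usn < page["usn"]: already on the page, nothing to apply
                handle.position += 1
                progressed = True
        if not progressed:
            # Added safeguard (assumption): avoid spinning forever on bad logs.
            raise RuntimeError("no redo log can advance; logs are inconsistent")
    return pages


# Page 7 was flushed after its first update, so it is on disk with sequence number 1.
pages = {7: {"usn": 1, "data": {"x": "a0"}}}
log_a = [RedoRecord(7, 0, "x", "a0"), RedoRecord(7, 2, "x", "a2")]
log_b = [RedoRecord(7, 1, "x", "b1"), RedoRecord(7, 3, "x", "b3")]
recover(pages, [ScanHandle(log_a), ScanHandle(log_b)])
print(pages[7])
```

In this example the first record of node A's log is skipped because the page already contains it, and recovery then alternates between the two logs each time it meets a gap, so the four updates are applied in the order in which they were originally made.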

It should be emphasized that the embodiments described herein are illustrative rather than restrictive, and thus the present invention is not limited to the embodiments described in the detailed description, but also includes other embodiments that can be derived from the technical solutions of the present invention by those skilled in the art.

Claims

1. A database cluster multi-node redo log recovery method based on page update sequence numbers, characterized in that the method comprises the following steps:

step 5, traversing all scanning handles and recovering redo logs;

2. The method for recovering the database cluster multi-node redo log based on the page update sequence number according to claim 1, characterized in that: in step 1, 8 bytes are allocated at the head of the page as the page update sequence number.

3. The method for recovering the database cluster multi-node redo log based on the page update sequence number according to claim 1, characterized in that: the specific implementation method of the step 2 comprises the following steps: when the data page is modified, a redo log is added, the modified content and the current page update sequence number are written into the redo log, and then the page update sequence number value is increased by one.

4. The method for recovering the database cluster multi-node redo log based on the page update sequence number according to claim 1, characterized in that: the control file loaded from the shared disk by the main node is a binary file defining the physical state of the database cluster.

5. The method for recovering the database cluster multi-node redo log based on the page update sequence number according to claim 1, characterized in that: the specific implementation method of the step 5 comprises the following steps: