CN115328698A - Data page recovery method and device, electronic equipment and storage medium - Google Patents


Publication number
CN115328698A
Authority
CN
China
Prior art keywords
data page
target
event
node
waiting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210955872.8A
Other languages
Chinese (zh)
Inventor
王巍
王海龙
韩朱忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Dameng Database Co Ltd
Original Assignee
Shanghai Dameng Database Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Dameng Database Co Ltd
Priority to CN202210955872.8A
Publication of CN115328698A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1471 Saving, restoring, recovering or retrying involving logging of persistent data for recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656 Data buffering arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0674 Disk device
    • G06F3/0676 Magnetic disk device

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data page recovery method and device, electronic equipment and a storage medium. The data page recovery method comprises the following steps: determining a data page cache of a first data page aimed at by a target waiting event, wherein the target waiting event is a waiting event for a target node in a waiting database cluster to respond, and the target node comprises a fault node and/or an active node; releasing other data page caches except the data page cache of the first data page in the active node; and recovering the data page cache of the second data page according to the first redo log of the second data page, wherein the second data page is the data page taking the active node as the recovery node, and the first redo log is recorded in the first log file in the disk. By using the method, the other data page caches except the data page cache of the first data page are released, and the data page is recovered according to the first redo log of the second data page, so that the recovery efficiency of the data page in the shared storage database cluster is improved.

Description

Data page recovery method and device, electronic equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of databases, in particular to a data page recovery method and device, electronic equipment and a storage medium.
Background
A shared storage database cluster (DMDSC) is a multi-instance, single-database system that allows a plurality of database instances to access and operate on the same database at the same time. After a node in the cluster fails, the database service cannot be provided.
In a shared storage database cluster environment, when a node fails, the remaining active nodes perform failure processing. The key to failure processing is the redo log, which records the operations that modify data pages. When a node modifies a data page, the corresponding redo log is generated at the same time. The modified data page does not need to be flushed to disk immediately; only the redo log needs to be flushed to disk. If the node fails at that moment, all data pages can be recovered to their latest state from the content of the redo log after the node is restarted.
However, in the prior art, recovering a data page requires frequently reading the Log Sequence Number (LSN) of the log from disk to determine which logs need to be redone, and the data page is then recovered by redoing those logs. When a large number of invalid data pages exist in the cache, frequently reading LSN values from disk generates a large number of disk reads/writes, which lowers the overall efficiency of data page recovery.
Disclosure of Invention
The invention provides a data page recovery method and device, electronic equipment and a storage medium, and aims to solve the problem of low data page recovery efficiency.
In a first aspect, an embodiment of the present invention provides a method for recovering a data page, including:
determining a data page cache of a first data page for which a target waiting event is aimed, wherein the target waiting event is a waiting event for a target node in a database cluster to respond, and the target node comprises a fault node and/or an active node;
releasing other data page caches except the data page cache of the first data page in the active node;
and recovering the data page cache of a second data page according to a first redo log of the second data page, wherein the second data page is a data page taking the active node as a recovery node, and the first redo log is recorded in a first log file in a disk.
In a second aspect, an embodiment of the present invention provides an apparatus for recovering a data page, including:
the data page cache determining module is used for determining a data page cache of a first data page for which a target waiting event is aimed, wherein the target waiting event is a waiting event for a target node in a database cluster to respond, and the target node comprises a fault node and/or an active node;
the data page cache releasing module is used for releasing other data page caches except the data page cache of the first data page in the active node;
and the data page cache recovery module is used for recovering the data page cache of a second data page according to a first redo log of the second data page, wherein the second data page is a data page taking the active node as a recovery node, and the first redo log is recorded in a first log file in a disk.
In a third aspect, an embodiment of the present invention provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform the method of restoring a data page according to the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where computer instructions are stored, and the computer instructions are configured to, when executed, cause a processor to implement the data page recovery method according to the first aspect.
The technical scheme of the embodiment of the invention comprises the steps of firstly determining a data page cache of a first data page for a target waiting event, wherein the target waiting event is a waiting event for a target node in a database cluster to respond, and the target node comprises a fault node and/or an active node; then releasing other data page caches except the data page cache of the first data page in the active node; and finally, recovering the data page cache of the second data page according to the first redo log of the second data page, wherein the second data page is the data page taking the active node as the recovery node, and the first redo log is recorded in the first log file in the disk. Under the condition of node failure, other data page caches except the data page cache of the first data page are released, and data page recovery is performed according to the first redo log of the second data page, so that frequent disk reading/writing is avoided, and the recovery efficiency of the data pages in the shared storage database cluster is improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present invention, nor do they necessarily limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a flowchart of a recovery method for a data page according to an embodiment of the present invention;
FIG. 2 is a flowchart of a recovery method for a data page according to a second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a device for restoring a data page according to a third embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device implementing the data page recovery method according to the embodiment of the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first", "second", etc. in the present invention are used for distinguishing similar objects, and are not necessarily used for describing a particular order or sequence. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It can be understood that, before the technical solutions disclosed in the embodiments of the present invention are used, the type, the use range, the use scenario, etc. of the personal information related to the present disclosure should be informed to the user and authorized by the user in a proper manner according to relevant laws and regulations.
Embodiment One
Fig. 1 is a flowchart of a data page recovery method according to an embodiment of the present invention, where this embodiment is applicable to a case where a data page is recovered when a node in a shared-storage database cluster fails, and the method may be executed by a data page recovery apparatus, which may be implemented in software and/or hardware and integrated in an electronic device. Further, electronic devices include, but are not limited to: computers, notebook computers, smart phones, servers, and the like. As shown in fig. 1, the method includes:
S110, determining a data page cache of a first data page for which a target waiting event is directed, wherein the target waiting event is a waiting event that waits for a target node in a database cluster to respond, and the target node comprises a failed node and/or an active node.
The data page may refer to a basic unit of data storage in the database management system, and is a basic unit of data access, modification, and I/O. The failed node may refer to a node in the shared storage database cluster that fails, and the failure may be understood as the node being unable to perform a specified function, and the type of the failure is not limited, such as a transaction failure, a system failure, or a medium failure. In the shared storage database cluster, when a node failure occurs, the remaining active nodes enter a failure processing flow, the active nodes may refer to nodes that can normally work in the shared storage database cluster, and the number of the active nodes is not limited.
The manner of determining a node failure is not limited. For example, a separate control node may be set in the shared storage database cluster to monitor the operating state of each node in the cluster; when a node fails, the control node controls the other active nodes to perform failure processing, or sends the information of the failed node to the other active nodes so that they perform failure processing. For another example, when a node fails, one or more other active nodes report to the control node, and the control node controls the other active nodes to perform failure processing, or sends the information of the failed node to the other active nodes so that they perform failure processing.
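As a minimal sketch of the first example, a control node could classify nodes by how recently each one reported in. The heartbeat mechanism, the `ControlNode` class, the `timeout` parameter and the message format are all assumptions made for illustration; the text above does not prescribe how the control node monitors the cluster.

```python
from dataclasses import dataclass, field


@dataclass
class ControlNode:
    """Hypothetical control node that tracks the last heartbeat of each cluster node."""
    timeout: float                                  # seconds of silence before a node counts as failed
    last_seen: dict = field(default_factory=dict)   # node name -> time of last heartbeat

    def heartbeat(self, node: str, now: float) -> None:
        """Record that `node` reported in at time `now`."""
        self.last_seen[node] = now

    def classify(self, now: float):
        """Split the known nodes into (failed, active) lists."""
        failed = sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)
        active = sorted(n for n in self.last_seen if n not in failed)
        return failed, active

    def notify(self, now: float) -> dict:
        """Build the message for each active node: the list of failed nodes to handle."""
        failed, active = self.classify(now)
        return {n: list(failed) for n in active} if failed else {}


control = ControlNode(timeout=3.0)
control.heartbeat("EP1", now=10.0)
control.heartbeat("EP2", now=5.0)      # EP2 has been silent for 5 seconds at now=10.0
messages = control.notify(now=10.0)    # EP1 is told that EP2 failed
```

Passing the clock value in explicitly keeps the sketch deterministic; a real monitor would read a clock and run periodically.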
The target waiting event may refer to a waiting event that waits for a target node in the database cluster to respond, the target node including a failed node and/or an active node, where a waiting event may refer to an event that is suspended while it waits.
In one embodiment, the target node comprises an active node. When a failed node exists in the cluster, the active node suspends its working threads and/or session threads and enters the failure processing flow, and the target waiting event can comprise a waiting event that waits for the active node to respond.
In one embodiment, the target node includes a failed node EP2. A session SESS1 of the node EP1 needs to modify data pages P1 and P2 in sequence; P1 has been modified, and SESS1 also holds the W permission of the data page P2. The global latch service (GBS) information of P2 is at the EP2 node, so EP1 requests the X lock permission of the data page P2 from EP2, and SESS1 waits for a response message from EP2. The target waiting event may then include the waiting event that waits for the failed node EP2 to respond.
The first data page targeted by the target wait event may refer to the data page that the target wait event needs to access or modify, such as the data pages P1 and P2 that need to be modified in the node EP1 in the above example.
Generally, a database allocates a contiguous region of memory as the data page cache to improve data access performance, and all data accesses and modifications are directed to data pages in the cache. The same data page may be distributed in the caches of different nodes, that is, the same data page may have data page caches in multiple nodes. The contents of these cached copies are not guaranteed to be identical: the latest content of the data page is stored only in some node caches, while other node caches store historical versions of the data page.
Determining the data page cache of the first data page for which the target wait event is directed may be understood as determining the data page cache of the first data page in the active node. The data page cache of the first data page in the active node may be the latest data page cache of the first data page, or may be a data page cache of a history version of the first data page.
The determination method of the data page cache of the first data page to which the target wait event is directed is not limited as long as the data page cache of the first data page to which the target wait event is directed can be determined. For example, the control node may determine, by monitoring the operating state of each node in the cluster, the node involved in the fault and the first data page corresponding to the node when the fault occurs, and further determine the data page cache of the first data page for the target waiting event.
Further, the target waiting event comprises a waiting event for waiting for the target node in the database cluster to perform data page authorization, and/or a waiting event for waiting for the target node in the database cluster to return the targeted data page.
When the target waiting event includes a waiting event that waits for a target node in the database cluster to perform data page authorization, waiting for the target node to perform data page authorization can be understood as waiting for the target node to grant the S or X lock permission of the data page to the active node, so that the active node can perform the corresponding read or modify operation on the data page.
When the target waiting event includes a waiting event that waits for a target node in the database cluster to return the targeted data page, waiting for the target node to return the targeted data page can be understood as waiting for the target node to return the latest version of that data page to the active node, so that the active node can obtain the latest data page. The data page targeted by the target waiting event in the active node may have the same version as the data page returned by the target node, or a different version; in either case, the data page returned by the target node prevails. Preferably, the two versions differ, that is, the active node obtains the latest version of the data page from the target node that caches it only when the data page stored in its own cache is not the latest version.
S120, releasing other data page caches except the data page cache of the first data page in the active node.
The active node may be any active node in the shared storage database cluster. The active node may hold the data page cache of the first data page as well as data page caches other than that of the first data page. Releasing the data page caches other than the data page cache of the first data page ensures that, during failure processing, the cache of the active node contains no invalid data page caches, so that frequent reading of LSN values from disk does not generate a large number of disk reads/writes.
For example, all data page caches in the active node may be searched, and for each data page cache, whether the data page cache is a data page cache of the first data page is determined, and if yes, the data page cache is reserved; if not, releasing the data page cache.
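The traversal described above can be sketched as follows. The dictionary-based buffer pool, the page identifiers and the function name `release_other_caches` are assumptions made for illustration, not structures named by this disclosure.

```python
def release_other_caches(buffer_pool: dict, first_page_ids: set) -> list:
    """Keep only the cached pages targeted by target waiting events.

    buffer_pool maps a page id to its cached content (hypothetical layout).
    Returns the ids of the page caches that were released.
    """
    # Collect the ids first so the dictionary is not mutated while iterating over it.
    released = [page_id for page_id in buffer_pool if page_id not in first_page_ids]
    for page_id in released:
        del buffer_pool[page_id]
    return released


pool = {"P1": b"page 1", "P2": b"page 2", "P7": b"page 7", "P9": b"page 9"}
released = release_other_caches(pool, first_page_ids={"P1", "P2"})
# pool now holds only P1 and P2; P7 and P9 were released
```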
S130, recovering the data page cache of a second data page according to a first redo log of the second data page, wherein the second data page is a data page taking the active node as a recovery node, and the first redo log is recorded in a first log file in the disk.
When a fault occurs, a plurality of active nodes can enter a fault processing flow at the same time, and each active node recovers a data page corresponding to the active node and having recovery authority. The recovery authority of the active node to the data page may be set in advance according to actual needs, for example, different active nodes may have recovery authority to the same data page; each active node may only have recovery authority for one or more data pages, that is, each data page may only have one active node to have recovery authority for it, or may have multiple active nodes to have recovery authority for it, and the present invention does not limit the recovery authority of the active nodes for the data pages.
The second data page may be a data page using the active node as a recovery node, that is, the active node has a recovery right for the second data page, and the active node may be used as the recovery node to recover the second data page.
The first log file may refer to a log file recorded within the disk, and the first redo log may be recorded in the first log file. Wherein the first redo log may record operations to add, delete, modify objects, or change data to the database.
A new LSN value may be generated automatically each time the first redo log is modified. The LSN may be an integer value starting from zero, and its maximum value is determined by the number of bytes used to store the LSN value. Each first redo log record corresponds to one LSN value, and when a data modification operation generates a first redo log record, the LSN value is also recorded in the data page header.
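The relationship between LSN values, redo log records and the data page header can be sketched as below. This is only an illustration under stated assumptions: the LSN is taken to be an unsigned integer (so that `num_bytes` bytes give a maximum of 2^(8·num_bytes) − 1), the counter starts at zero and the first record receives 1, and the names `LsnAllocator`, `modify` and `header_lsn` are hypothetical.

```python
class LsnAllocator:
    """Hypothetical monotonically increasing LSN source."""

    def __init__(self, num_bytes: int = 8):
        # Assumption: the LSN is stored as an unsigned integer of num_bytes bytes.
        self.max_lsn = (1 << (8 * num_bytes)) - 1
        self.current = 0

    def next(self) -> int:
        if self.current >= self.max_lsn:
            raise OverflowError("LSN space exhausted")
        self.current += 1
        return self.current


def modify(page: dict, allocator: LsnAllocator, change: dict, redo_log: list) -> int:
    """Apply a change to a page, emit its redo record, and stamp the page header."""
    lsn = allocator.next()
    redo_log.append({"lsn": lsn, "page_id": page["id"], "change": change})
    page["data"].update(change)
    page["header_lsn"] = lsn       # the LSN is recorded in the page header
    return lsn


allocator = LsnAllocator(num_bytes=8)
page = {"id": "P1", "header_lsn": 0, "data": {}}
redo_log = []
modify(page, allocator, {"a": 1}, redo_log)
modify(page, allocator, {"a": 2}, redo_log)
```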
In the DMDSC cluster system, the LSN values generated for modifications of the same data page are globally incremented, so the active node can recover the data page cache of the second data page according to the first redo log of the second data page. The manner of recovery is not limited; for example, the first redo log records of the second data page may be redone in ascending order of their LSN values, thereby recovering the data page cache of the second data page.
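A minimal sketch of redoing records in ascending LSN order is given below. The record layout, the `page_owner` mapping from page to recovery node, and the rule of skipping records whose LSN is not greater than the LSN in the page header are assumptions for illustration (the skip rule is a common redo convention, not something stated in this disclosure).

```python
def recover_pages(redo_records: list, recovery_node: str, page_owner: dict, cache: dict) -> dict:
    """Rebuild the cached pages for which `recovery_node` is the recovery node.

    redo_records: dicts with "lsn", "page_id" and "change" keys (hypothetical layout).
    page_owner:   maps a page id to the node holding recovery permission for it.
    cache:        the data page cache of the recovering node, updated in place.
    """
    mine = [r for r in redo_records if page_owner.get(r["page_id"]) == recovery_node]
    for rec in sorted(mine, key=lambda r: r["lsn"]):        # ascending LSN order
        page = cache.setdefault(rec["page_id"], {"header_lsn": 0, "data": {}})
        if rec["lsn"] <= page["header_lsn"]:
            continue                                         # already reflected in the page
        page["data"].update(rec["change"])
        page["header_lsn"] = rec["lsn"]
    return cache


records = [
    {"lsn": 3, "page_id": "P3", "change": {"a": 2}},
    {"lsn": 1, "page_id": "P3", "change": {"a": 1, "b": 1}},
    {"lsn": 2, "page_id": "P4", "change": {"x": 9}},
]
owner = {"P3": "EP1", "P4": "EP3"}
ep1_cache = recover_pages(records, "EP1", owner, cache={})
```

Only P3 is rebuilt here, because EP3 rather than EP1 is the recovery node of P4 in this example.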
The technical scheme of the embodiment of the invention comprises the steps of firstly determining a data page cache of a first data page for a target waiting event, wherein the target waiting event is a waiting event for a target node in a database cluster to respond, and the target node comprises a fault node and/or an active node; then releasing other data page caches except the data page cache of the first data page in the active node; and finally, recovering the data page cache of a second data page according to a first redo log of the second data page, wherein the second data page is the data page taking the active node as a recovery node, and the first redo log is recorded in a first log file in the disk. Under the condition of node failure, other data page caches except the data page cache of the first data page are released, and data page recovery is carried out according to the first redo log of the data page taking the active node as a recovery node, so that frequent disk reading/writing is avoided, and the recovery efficiency of the data page in the shared storage database cluster is improved.
Embodiment Two
Fig. 2 is a flowchart of a data page recovery method according to a second embodiment of the present invention, which is optimized on the basis of the first embodiment. This embodiment specifies the steps performed before determining the data page cache of the first data page for which the target waiting event is directed, and the steps performed after recovering the data page cache of the second data page according to the first redo log of the second data page.
In this embodiment, before determining the data page cache of the first data page for which the target waiting event is directed, the method may further include: determining a first target waiting event in the active node. After determining the first target waiting event in the active node, the method further includes: determining a second target waiting event in the active node. Before determining the second target waiting event in the active node, the method further includes: recording the modification of the first target waiting event to the target first data page into a second log file, and releasing the read-write permission of the first target waiting event to the target first data page. Before the modification of the first target waiting event to the target first data page is recorded into the second log file through the working thread of the first target waiting event and the read-write permission of the first target waiting event to the target first data page is released, the method further includes: waking up the first target waiting event.
In this embodiment, before determining the data page cache of the first data page for which the target waiting event is directed, the method further includes: suspending each target thread in the active node.
In this embodiment, after recovering the data page cache of the second data page according to the first redo log of the second data page, the method further includes: waking up a second target waiting event; and executing the second target waiting event again through the working thread of the second target waiting event according to a second redo log of the second target waiting event, wherein the second redo log is recorded in a second log file.
In this embodiment, after recovering the data page cache of the second data page according to the first redo log of the second data page, the method further includes: and adjusting the target thread in the suspended state in the active node to be in a running state.
For details not described in this embodiment, refer to the first embodiment.
As shown in fig. 2, a method for restoring a data page according to a second embodiment of the present invention includes:
S201, suspending each target thread in the active node.
The target threads comprise working threads and/or session threads. A working thread may refer to a core thread in the DMDSC cluster system; it takes a task out of the task queue, performs the corresponding processing according to the type of the task, and is responsible for all operations on actual data. A session thread may refer to a thread in the DMDSC cluster system that is responsible for establishing a session.
And each target thread in the active node is suspended, so that all the target threads of the active node can stop working, and the active node can enter a fault processing flow.
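One way to picture suspending and later resuming the target threads is a shared gate that each thread checks before it takes up its next task. The sketch below uses Python's `threading.Event` for the gate; the `TargetThread` class, the task list and the doubling "work" are placeholders, and suspension at task boundaries is an assumption rather than the mechanism of this disclosure.

```python
import threading


class TargetThread(threading.Thread):
    """Hypothetical working/session thread that checks a shared gate before each task."""

    def __init__(self, tasks, results, run_gate):
        super().__init__(daemon=True)
        self.tasks = tasks
        self.results = results
        self.run_gate = run_gate

    def run(self):
        for task in self.tasks:
            self.run_gate.wait()             # blocks while the node is in failure processing
            self.results.append(task * 2)    # placeholder for real work


def suspend_target_threads(gate: threading.Event) -> None:
    gate.clear()                             # threads stop at their next gate check


def resume_target_threads(gate: threading.Event) -> None:
    gate.set()                               # suspended threads return to the running state


run_gate = threading.Event()
suspend_target_threads(run_gate)             # the node enters the failure processing flow

results = []
worker = TargetThread([1, 2, 3], results, run_gate)
worker.start()
worker.join(timeout=0.2)                     # the worker is blocked at the gate
suspended_snapshot = list(results)           # nothing has been processed yet

resume_target_threads(run_gate)              # failure processing is finished
worker.join(timeout=5)
```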
S202, determining a first target waiting event in the active node.
The first target waiting event comprises an event that is currently waiting for a target node in the database cluster to respond. The manner of determining the first target waiting event in the active node is not limited, as long as it can be determined. For example, the control node may monitor the operating state of each node in the cluster; when a failure occurs, the control node determines the nodes involved in the failure, then determines the target node in the cluster whose response the active node needs to wait for, and thereby determines the first target waiting event in the active node. Alternatively, the active node determines the first target waiting event according to the processing progress of each waiting event in the active node.
In one embodiment, the target node includes a failed node EP2. A session SESS1 of the node EP1 needs to modify data pages P1 and P2 in sequence; P1 has been modified, and SESS1 also holds the W permission of the data page P2. The global latch service GBS information of P2 is at the EP2 node, so EP1 requests the X lock permission of the data page P2 from EP2, and SESS1 waits for a response message from EP2. The waiting event in which the active node waits for the failed node EP2 to respond can then be used as a first target waiting event in the active node.
S203, awakening the first target waiting event.
Waking up may refer to ending the wait of a waiting event so that the corresponding thread continues running. Waking up the first target waiting event allows the thread corresponding to the first target waiting event to continue running.
S204, recording the modification of the first target waiting event to the target first data page into a second log file, and releasing the read-write permission of the first target waiting event to the target first data page.
Wherein the second log file is stored in the cache. The second log file may be used to record modifications made to the corresponding data page by events such as the first target wait event prior to writing to disk. The target first data page is the data page for which the first target wait event is directed.
The manner of recording the modification of the first target waiting event to the target first data page into the second log file, and of releasing the read-write permission of the first target waiting event to the target first data page, is not limited. For example, both operations may be performed by the working thread of the first target waiting event.
In the DMDSC cluster system, when multiple sessions (SESS) within a node access or modify a data page, each session must acquire the read (R) or write (W) permission of the data page in addition to the LBS permission of the data page. If the session SESS1 holds the R permission of the data page P1, the session SESS2 can immediately and successfully apply for the R permission of P1; if SESS1 holds the R permission of P1 and SESS2 applies for the W permission of P1, SESS2 can succeed only after SESS1 releases the R permission; if SESS1 holds the W permission of P1 and SESS2 applies for the R or W permission of P1, SESS2 can succeed only after SESS1 releases the W permission. Therefore, releasing the read-write permission of the first target waiting event for the target first data page allows other sessions in the current node to request the read-write permission for that page.
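The compatibility rules above can be sketched as a small per-page permission table. This is only an illustrative model: the class name `PagePermission` and its methods are hypothetical and not part of DMDSC, the LBS permission is not modelled, and re-entrant requests by the same session are deliberately ignored.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class PagePermission:
    """Hypothetical R/W permission state of one data page inside one node."""
    readers: Set[str] = field(default_factory=set)
    writer: Optional[str] = None

    def try_acquire(self, session: str, mode: str) -> bool:
        """Return True if the permission is granted immediately, False if the session must wait."""
        if mode == "R":
            # R is compatible with other R holders but not with a W holder.
            if self.writer is None:
                self.readers.add(session)
                return True
            return False
        if mode == "W":
            # W is granted only when no session holds R or W.
            if self.writer is None and not self.readers:
                self.writer = session
                return True
            return False
        raise ValueError("mode must be 'R' or 'W'")

    def release(self, session: str) -> None:
        """Release whatever permission the session holds on this page."""
        self.readers.discard(session)
        if self.writer == session:
            self.writer = None
```

In this model, a `False` result corresponds to a session that has to wait for the holder to release its permission, which is the situation SESS2 is in when SESS1 holds the W permission of P1.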
In one embodiment, the target node includes a failed node EP2. A session SESS1 of the node EP1 needs to modify data pages P1 and P2 in sequence; P1 has been modified, and SESS1 also holds the W permission of the data page P2. The global latch service GBS information of P2 is at the EP2 node, so EP1 requests the X lock permission of the data page P2 from EP2, and SESS1 waits for the response message from EP2. The modification of the first target waiting event to the target first data page may be the modification operation of the session SESS1 of the node EP1 on P1. This modification operation is recorded in the second log file, and the session SESS1 releases the W permission on P1 and P2, so that other sessions in the node may subsequently request the read-write permission on the data pages P1 and P2.
Further, recording modification of the first target waiting event to the target first data page in a second log file, and releasing the read-write permission of the first target waiting event to the target first data page, including:
and recording the modification of the first target waiting event to the target first data page into a second log file through the working thread of the first target waiting event, and releasing the read-write permission of the first target waiting event to the target first data page.
The working thread of the first target waiting event is responsible for the operations related to the first target waiting event. Recording the modification of the first target waiting event to the target first data page into the second log file through this working thread, and releasing the read-write permission of the first target waiting event for the target first data page, facilitates the subsequent fault processing flow.
S205, determining a second target waiting event in the active node.
The second target waiting event comprises an event for waiting the read-write permission of the first target waiting event for a target first data page, and the target first data page is a data page to which the first target waiting event aims.
In one embodiment, the target node includes a failed node EP2. A session SESS1 of the node EP1 needs to modify data pages P1 and P2 in sequence; P1 has been modified, and SESS1 also holds the W permission of the data page P2. The global latch service GBS information of P2 is at the EP2 node, so EP1 requests the X lock permission of the data page P2 from EP2, and SESS1 waits for the response message from EP2. On this basis, SESS2 of the EP1 node also modifies the data page P1, and SESS2 needs to wait for SESS1 to complete the modification of data pages P1 and P2 and release the W permission on P1 and P2. At this time, the second target waiting event may include an event waiting for the W permission of the first target waiting event for the data pages P1 and P2.
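The distinction between the two kinds of waiting event in steps S202 and S205 can be sketched as follows. The `WaitEvent` record and the `classify` function are hypothetical names introduced only for illustration; the sketch assumes each waiting event knows either the node or the session it is waiting for, and the data pages it is directed at.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class WaitEvent:
    """Hypothetical description of one waiting event in the active node."""
    session: str                      # session that is waiting
    waits_for_node: Optional[str]     # node whose response is awaited, if any
    waits_for_session: Optional[str]  # session whose R/W permission is awaited, if any
    pages: Tuple[str, ...]            # data pages the event is directed at


def classify(events: Iterable[WaitEvent],
             target_nodes: Set[str]) -> Tuple[List[WaitEvent], List[WaitEvent]]:
    """Split waiting events into first and second target waiting events."""
    events = list(events)
    # First target waiting events wait directly for a response from a target node.
    first = [e for e in events if e.waits_for_node in target_nodes]
    first_sessions = {e.session for e in first}
    first_pages = {p for e in first for p in e.pages}
    # Second target waiting events wait for the R/W permission that a first
    # target waiting event holds on one of its data pages.
    second = [e for e in events
              if e.waits_for_session in first_sessions
              and first_pages.intersection(e.pages)]
    return first, second
```

With the scenario above, the event of SESS1 (waiting for EP2, directed at P1 and P2) falls into the first group and the event of SESS2 (waiting for SESS1 on P1) falls into the second group.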
S206, determining the data page cache of the first data page corresponding to the target waiting event.
S207, releasing other data page caches except the data page cache of the first data page in the active node.
S208, recovering the data page cache of the second data page according to the first redo log of the second data page.
S209, awakening a second target waiting event; and executing the second target waiting event again through the working thread of the second target waiting event according to a second redo log of the second target waiting event, wherein the second redo log is recorded in a second log file.
Waking up the second target wait event may be understood as terminating the wait event, and causing the thread corresponding to the second target wait event to continue running.
The second redo log of the second target wait event may refer to a modification record that has been made by the current operation corresponding to the second target wait event, and the second redo log is recorded in the second log file. And re-executing the second target waiting event according to the second redo log of the second target waiting event by the worker thread of the second target waiting event, which can be understood as that the worker thread of the second target waiting event re-requests the read-write permission for the target first data page.
Further, re-executing the second target wait event according to the second redo log of the second target wait event includes:
if the second redo log of the second target wait event has not been written to the first log file, the second target wait event is re-executed according to the second redo log of the second target wait event.
If the second redo log of the second target waiting event has not been written into the first log file, the second redo log has not been flushed to the disk. In this case the second target waiting event can be re-executed according to its second redo log, so that the cluster system can continue the operation that was in progress before the fault occurred, which completes the fault processing.
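The condition in step S209 can be sketched as a filter over the cached redo records. This is a simplified illustration under stated assumptions: the record layout, the `id` key used to tell whether a record already reached the first log file, and the function name `reexecute_wait_event` are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical record shape: {"id": ..., "page": ..., "change": ...}
RedoRecord = Dict[str, str]


def reexecute_wait_event(second_redo_log: Iterable[RedoRecord],
                         ids_in_first_log_file: Set[str],
                         apply: Callable[[RedoRecord], None]) -> List[str]:
    """Replay only the records that were not yet written to the first log file.

    Returns the ids of the replayed records, in replay order.
    """
    replayed = []
    for record in second_redo_log:
        if record["id"] in ids_in_first_log_file:
            # Already flushed to disk: covered by the redo of the first log file.
            continue
        apply(record)
        replayed.append(record["id"])
    return replayed
```

Passing the replay action as a callback keeps the sketch independent of how a real system would apply a redo record to a data page.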
S210, adjusting the target thread in the suspended state in the active node to be in the running state.
Adjusting the target thread in the suspended state in the active node to the running state can be understood as letting the target threads that were suspended for fault processing continue to run, which indicates that the active node has completed fault processing. The target thread may include a working thread and/or a session thread.
According to this technical scheme, data page recovery is described by taking any active node in the shared storage database cluster as an example, and the operations performed before determining the data page cache of the first data page targeted by the target waiting event, and after recovering the data page cache of the second data page according to the first redo log of the second data page, are specified. This makes the data page recovery process clearer and improves the recovery efficiency of data pages in the shared storage database cluster.
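Suspending the target threads before fault processing and returning them to the running state afterwards can be sketched with a shared gate that every working or session thread checks before it continues. This is only one possible mechanism, chosen for illustration; the class name `ThreadGate` and its methods are hypothetical and the source does not state how suspension is implemented.

```python
import threading


class ThreadGate:
    """Hypothetical gate that target threads pass through before doing work."""

    def __init__(self) -> None:
        self._running = threading.Event()
        self._running.set()  # threads run freely until fault processing starts

    def suspend(self) -> None:
        """Enter fault processing: threads block at their next checkpoint."""
        self._running.clear()

    def resume(self) -> None:
        """Fault processing finished: blocked threads continue running."""
        self._running.set()

    @property
    def is_suspended(self) -> bool:
        return not self._running.is_set()

    def checkpoint(self, timeout=None) -> bool:
        """Called by a target thread; blocks while the gate is suspended."""
        return self._running.wait(timeout)
```

A target thread would call `checkpoint()` at safe points in its loop, so that `suspend()` stops it without interrupting an operation halfway.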
The invention is illustrated below:
The invention provides a method for recovering data pages during fault processing. The specific process is as follows:
1. In a DSC database cluster (i.e., a shared storage database cluster) environment, when a node failure occurs, the remaining active nodes enter a failure processing flow.
2. All remaining active nodes suspend the working threads and the session threads (i.e., the target threads). Because of the node failure, logic that depends on data pages is blocked, and there may be multiple waiting events in the database. All waiting events that wait for a response from the failed node are called waiting event 1 (i.e., the first target waiting event), and waiting event 1 is registered in the global cache.
For example, assuming an EP2 node failure, the following several wait events may occur (the present invention is not limited to the following several wait events, but is used for example only):
a. SESS1 of the EP1 node needs to modify data pages P1 and P2 in sequence; P1 has been modified, and SESS1 also holds the W permission of data page P2. The GBS information of P2 is at the EP2 node, so EP1 requests the X lock permission of data page P2 from EP2, and SESS1 waits for the response message from EP2; this is waiting event 1. Waiting event 1 is registered in the global cache.
b. In scenario a, SESS2 of the EP1 node also modifies data page P1, so SESS2 needs to wait for SESS1 to release the W permission for data page P1 (SESS1 releases the W permission for P1 and P2 after it has modified both pages). Because SESS2 does not directly wait for the response message from EP2, this waiting event is not included in waiting event 1 and does not need to be registered in the global cache.
3. Wake up the waiting events in the global cache, namely waiting event 1. After waiting event 1 is woken up, register the information of the data pages (i.e., the target first data pages) in use by the current operation (for example, the information of data pages P1 and P2 in scenario a). Record the modification made by the current operation into an REDO log file (for example, record the modification made by the SESS1 session to data page P1 into the REDO log file), then release the R/W permission of the current operation for the data pages (for example, the SESS1 session releases the W permission for data pages P1 and P2), which is step S204, and then continue to wait for the subsequent flow. Besides waiting event 1, other waiting events in the cluster that are naturally woken up also need to continue waiting for the subsequent flow. All waiting events at this point are referred to as waiting event 2 (i.e., the second target waiting event), and waiting event 2 is registered in the global cache.
For example, because the SESS1 session releases the W permission for data pages P1 and P2 in this step, SESS2 is naturally woken up and continues to wait for the subsequent flow; it is included in waiting event 2 and registered in the global cache.
4. Scan all data page caches in the system, keep the data page caches registered in step 3 (e.g., P1 and P2 in step 3), and directly release the unregistered data page caches (i.e., steps S206 and S207).
5. An unregistered data page cache may have been modified earlier without the data page having been flushed to disk. In that case the data page modification operation (i.e., the first REDO log) has already been recorded in the online REDO log file (i.e., the first log file), which has been flushed to disk as part of the normal data modification flow, and the online REDO logs of all nodes are redone to restore this type of data page cache (i.e., step S208). After the redo is finished, all waiting events 2 are woken up. After a waiting event 2 is woken up, if it has an REDO log that has not been flushed (i.e., a second REDO log), that REDO log is redone first, and after the redo is completed the subsequent flow of the event continues to be executed (i.e., step S209).
For example, SESS1 and SESS2 are woken up. After SESS1 is woken up, the REDO log recorded by the current operation is redone (this REDO log has not yet been flushed to the disk); in the process, the W permission for data pages P1 and P2 is reacquired and the X lock request for P2 is reinitiated. After SESS2 is woken up, it requests the W permission for P1 again, and at this time it still needs to wait for SESS1 to release the W permission for P1.
According to the method for recovering the data page in the fault processing process, except for the data page cache in use, other cache data pages are discarded uniformly, and then the REDO logs of all the nodes are redone, so that the data page is recovered. The method avoids frequent disk reading/writing, and can realize higher fault processing efficiency when a large number of invalid data pages exist.
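Steps 4 and 5 above can be sketched as a single function over in-memory dictionaries. This is a simplified model under stated assumptions, not the patented implementation: pages are plain dictionaries, a redo record is a `(page id, field, new value)` tuple, the `disk` dictionary stands in for whatever base image the redo is applied to, and the name `recover_page_cache` is hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Page = Dict[str, int]
RedoRecord = Tuple[str, str, int]  # (page id, field, new value)


def recover_page_cache(cache: Dict[str, Page],
                       registered: Set[str],
                       disk: Dict[str, Page],
                       online_redo: Iterable[RedoRecord],
                       recovery_pages: Set[str]) -> Dict[str, Page]:
    """Keep registered page caches, release the rest, then redo the online log."""
    # Step 4: keep only the data page caches registered as in use.
    new_cache = {pid: page for pid, page in cache.items() if pid in registered}
    # Step 5: redo the online REDO log, in log order, for the pages this
    # node is responsible for recovering.
    for pid, name, value in online_redo:
        if pid in registered or pid not in recovery_pages:
            continue
        # Start from the last flushed image of the page, then apply the change.
        page = new_cache.setdefault(pid, dict(disk.get(pid, {})))
        page[name] = value
    return new_cache
```

The in-use pages are left untouched by the redo in this sketch because their pending changes are handled separately when waiting event 2 is woken up.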
Example three
Fig. 3 is a schematic structural diagram of a data page recovery apparatus according to a third embodiment of the present invention, which is applicable to a case where a data page is recovered when a node in a shared storage database cluster fails. As shown in fig. 3, the specific structure of the apparatus includes:
a data page cache determining module 21, configured to determine a data page cache of a first data page to which a target waiting event is addressed, where the target waiting event is a waiting event for a target node in a database cluster to respond, and the target node includes a failed node and/or an active node;
a data page cache releasing module 22, configured to release other data page caches in the active node except the data page cache of the first data page;
and a data page cache recovery module 23, configured to recover a data page cache of a second data page according to a first redo log of the second data page, where the second data page is a data page using the active node as a recovery node, and the first redo log is recorded in a first log file in the disk.
In the recovery apparatus for a data page provided in this embodiment, first, a data page cache of a first data page to which a target waiting event is addressed is determined by a data page cache determining module 21, where the target waiting event is a waiting event for a target node in a database cluster to respond, and the target node includes a failed node and/or an active node; then, releasing other data page caches except the data page cache of the first data page in the active node through a data page cache releasing module 22; and finally, recovering the data page cache of the second data page by the data page cache recovery module 23 according to the first redo log of the second data page, wherein the second data page is the data page using the active node as the recovery node, and the first redo log is recorded in the first log file in the disk. Under the condition of node failure, other data page caches except the data page cache of the first data page are released, and data page recovery is carried out according to the first redo log of the data page taking the active node as a recovery node, so that frequent disk reading/writing is avoided, and the recovery efficiency of the data page in the shared storage database cluster is improved.
Further, the target waiting event in the data page cache determining module 21 includes a waiting event for waiting for the target node in the database cluster to perform data page authorization, and/or a waiting event for the target node in the database cluster to return the targeted data page.
Further, the target wait event comprises a first target wait event, and the apparatus further comprises:
the first target waiting event determining module is used for determining a first target waiting event in the active node before determining the data page cache of the first data page aimed at by the target waiting event, wherein the first target waiting event comprises an event that is currently waiting for a response from a target node in the database cluster.
Further, the target wait event further includes a second target wait event, and the apparatus further includes:
the second target waiting event determining module is configured to determine a second target waiting event in the active node after determining the first target waiting event in the active node, where the second target waiting event includes an event that waits for a read-write permission of the first target waiting event for a target first data page, and the target first data page is a data page to which the first target waiting event is directed.
Further, the apparatus further comprises:
and the read-write permission release module is used for recording the modification of the first target waiting event to the target first data page into a second log file before determining a second target waiting event in the active node, and releasing the read-write permission of the first target waiting event to the target first data page, wherein the second log file is stored in the cache.
Further, the read-write permission release module is specifically configured to:
and recording the modification of the first target waiting event to the target first data page into a second log file through the working thread of the first target waiting event, and releasing the read-write permission of the first target waiting event to the target first data page.
Further, before the modification of the first target wait event to the target first data page is recorded in the second log file by the worker thread of the first target wait event, and the read-write permission of the first target wait event to the target first data page is released, the apparatus further includes:
and the first target waiting event awakening module is used for awakening the first target waiting event.
Further, the apparatus further comprises:
the second target waiting event awakening module is used for awakening a second target waiting event after recovering the data page cache of the second data page according to the first redo log of the second data page;
and the second target waiting event executing module is used for re-executing the second target waiting event according to a second redo log of the second target waiting event through the working thread of the second target waiting event, wherein the second redo log is recorded in a second log file.
Further, the second target wait event execution module is specifically configured to:
if the second redo log of the second target wait event has not been written to the first log file, the second target wait event is re-executed according to the second redo log of the second target wait event.
Further, the apparatus further comprises:
and the target thread suspension module is used for suspending each target thread in the active node before determining the data page cache of the first data page to which the target waiting event aims, wherein the target thread comprises a working thread and/or a session thread.
Further, the apparatus further comprises:
and the adjusting module is used for adjusting the target thread in the suspended state in the active node to be in the running state after the data page cache of the second data page is recovered according to the first redo log of the second data page.
The data page recovery device provided by the embodiment of the invention can execute the data page recovery method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example four
Fig. 4 is a schematic structural diagram of an electronic device implementing the data page recovery method according to the embodiment of the present invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 4, the electronic device 10 includes at least one processor 11, and a memory communicatively connected to the at least one processor 11, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, and the like, wherein the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various suitable actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data necessary for the operation of the electronic apparatus 10 may also be stored. The processor 11, the ROM 12, and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
A number of components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, or the like; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
Processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The processor 11 performs the various methods and processes described above, such as a method of restoring a data page.
In some embodiments, the method of restoring the data page may be implemented as a computer program tangibly embodied in a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the recovery method of data pages described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the method of restoring the data page by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Computer programs for implementing the methods of the present invention can be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user may provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical host and VPS service are overcome.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (14)

1. A method of restoring a data page, the method performed by an active node in a database cluster, the method comprising:
determining a data page cache of a first data page for which a target waiting event is aimed, wherein the target waiting event is a waiting event for a target node in a database cluster to respond, and the target node comprises a fault node and/or an active node;
releasing other data page caches except the data page cache of the first data page in the active node;
and recovering the data page cache of a second data page according to a first redo log of the second data page, wherein the second data page is a data page taking the active node as a recovery node, and the first redo log is recorded in a first log file in a disk.
2. The method according to claim 1, wherein the target wait event comprises a wait event waiting for a target node in the database cluster to perform data page authorization and/or a wait event waiting for a target node in the database cluster to return a targeted data page.
3. The method of claim 1, wherein the target wait event comprises a first target wait event, and further comprising, before the determining the data page cache of the first data page for which the target wait event is aimed:
determining a first target waiting event in the active node, wherein the first target waiting event comprises an event that is currently waiting for a response from a target node in the database cluster.
4. The method of claim 3, wherein the target wait event further comprises a second target wait event, and further comprising, after the determining the first target wait event in the active node:
determining a second target waiting event in the active node, where the second target waiting event includes an event of waiting for a read-write permission of the first target waiting event for a target first data page, and the target first data page is a data page to which the first target waiting event is directed.
5. The method of claim 4, prior to said determining the second target wait event in the active node, further comprising:
and recording the modification of the first target waiting event to the target first data page into a second log file, and releasing the read-write permission of the first target waiting event to the target first data page, wherein the second log file is stored in a cache.
6. The method according to claim 5, wherein the recording the modification of the first target waiting event to the target first data page into a second log file, and releasing the read-write permission of the first target waiting event to the target first data page comprises:
and recording the modification of the first target waiting event to the target first data page into a second log file through the working thread of the first target waiting event, and releasing the read-write permission of the first target waiting event to the target first data page.
7. The method according to claim 6, before the recording, by the worker thread of the first target wait event, the modification of the first target wait event to the target first data page in the second log file, and releasing the read-write permission of the first target wait event to the target first data page, further comprising:
waking the first target waiting event.
8. The method of claim 4, further comprising, after recovering the data page cache of the second data page according to the first redo log of the second data page:
waking the second target wait event;
and re-executing, by the worker thread of the second target wait event, the second target wait event according to a second redo log of the second target wait event, wherein the second redo log is recorded in a second log file.
9. The method of claim 8, wherein re-executing the second target wait event according to the second redo log of the second target wait event comprises:
if the second redo log of the second target wait event has not been written to the first log file, re-executing the second target wait event according to the second redo log of the second target wait event.
10. The method according to any one of claims 1 to 9, further comprising, before said determining the data page cache of the first data page targeted by the target wait event:
suspending each target thread in the active node, wherein the target threads comprise worker threads and/or session threads.
11. The method of claim 10, further comprising, after recovering the data page cache of the second data page according to the first redo log of the second data page:
switching each suspended target thread in the active node back to a running state.
12. An apparatus for recovering a data page, comprising:
a data page cache determining module, configured to determine a data page cache of a first data page targeted by a target wait event, wherein the target wait event is an event waiting for a target node in a database cluster to respond, and the target node comprises a failed node and/or an active node;
a data page cache releasing module, configured to release the data page caches in the active node other than the data page cache of the first data page; and
a data page cache recovery module, configured to recover a data page cache of a second data page according to a first redo log of the second data page, wherein the second data page is a data page for which the active node serves as the recovery node, and the first redo log is recorded in a first log file on a disk.
13. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform the method of restoring a data page of any one of claims 1-11.
14. A computer-readable storage medium storing computer instructions that, when executed, cause a processor to implement the method for restoring a data page of any one of claims 1-11.
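For illustration only, the ordering of steps recited in the method claims and mirrored by the modules of claim 12 can be sketched in Python as follows. This is a minimal single-process sketch under stated assumptions, not the claimed implementation: the names `WaitEvent`, `ActiveNode`, `recover_pages` and `recover_ids` are hypothetical, redo records are simplified to `(page_id, value)` pairs, and threads, locks and cluster messaging are reduced to plain fields.

```python
from dataclasses import dataclass, field


@dataclass
class WaitEvent:
    # Hypothetical stand-in for a target wait event; all field names are assumptions.
    page_id: int                                # the first data page the event targets
    redo: list = field(default_factory=list)    # simplified (page_id, value) redo records


@dataclass
class ActiveNode:
    page_cache: dict                            # page_id -> cached page content
    first_log_file: list                        # redo records held on disk
    second_log_file: list = field(default_factory=list)  # redo records held in cache
    threads_paused: bool = False


def recover_pages(node, first_events, second_events, recover_ids):
    node.threads_paused = True                  # claim 10: suspend target threads
    for ev in first_events:                     # claims 5-7: record modifications in the cached log
        node.second_log_file.extend(ev.redo)
    pinned = {ev.page_id for ev in first_events + second_events}
    for page_id in list(node.page_cache):       # release every cache except the pinned pages
        if page_id not in pinned:
            del node.page_cache[page_id]
    for page_id, value in node.first_log_file:  # replay the on-disk first redo log
        if page_id in recover_ids:
            node.page_cache[page_id] = value
    for ev in second_events:                    # claims 8-9: re-execute only redo not yet on disk
        for page_id, value in ev.redo:
            if (page_id, value) not in node.first_log_file:
                node.page_cache[page_id] = value
    node.threads_paused = False                 # claim 11: resume target threads
    return node


node = ActiveNode(page_cache={1: "a", 2: "b", 3: "c"}, first_log_file=[(4, "d")])
first = [WaitEvent(page_id=1, redo=[(1, "a2")])]
second = [WaitEvent(page_id=1, redo=[(1, "a3")])]
recover_pages(node, first, second, recover_ids={4})
print(sorted(node.page_cache.items()))  # [(1, 'a3'), (4, 'd')]
```

In this sketch the page targeted by the wait events stays cached while all other caches are dropped, page 4 is rebuilt from the on-disk log, and the second wait event is replayed because its redo record was never written to the first log file.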

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210955872.8A CN115328698A (en) 2022-08-10 2022-08-10 Data page recovery method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210955872.8A CN115328698A (en) 2022-08-10 2022-08-10 Data page recovery method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115328698A (en) 2022-11-11

Family

ID=83921217

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202210955872.8A Pending CN115328698A (en) 2022-08-10 2022-08-10 Data page recovery method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115328698A (en)

Similar Documents

Publication Publication Date Title
US20200019543A1 (en) Method, apparatus and device for updating data, and medium
US8365193B2 (en) Recoverable asynchronous message driven processing in a multi-node system
US9128895B2 (en) Intelligent flood control management
CN113364877B (en) Data processing method, device, electronic equipment and medium
CN113238815B (en) Interface access control method, device, equipment and storage medium
KR20210040866A (en) File resource processing method and apparatus, device and medium
CN111488492A (en) Method and apparatus for retrieving graph database
CN113553216B (en) Data recovery method and device, electronic equipment and storage medium
CN117725115A (en) Database sequence processing method, device, equipment and storage medium
US10970175B2 (en) Flexible per-request data durability in databases and other data stores
US20230161664A1 (en) Method of responding to operation, electronic device, and storage medium
CN115328698A (en) Data page recovery method and device, electronic equipment and storage medium
CN114513468B (en) Method, device, equipment, storage medium and product for protecting flow in Sentinel
CN115934742A (en) Fault processing method, device, equipment and storage medium
CN114691781A (en) Data synchronization method, system, device, equipment and medium
CN114579260A (en) Transaction processing method and system
CN114500443A (en) Message pushing method, device, system, electronic equipment and storage medium
CN114780022A (en) Method and device for realizing write-addition operation, electronic equipment and storage medium
CN112434013A (en) Data table migration method and device, electronic equipment and storage medium
CN113722389A (en) Data management method and device, electronic equipment and computer readable storage medium
CN111142795A (en) Control method, control device and control equipment for write operation of distributed storage system
US11663098B2 (en) Maintaining durability of a data object using unplanned delta components during transient failures
CN118151887B (en) Divider data processing method, device, terminal equipment and storage medium
CN110727898B (en) OTA website event assisted processing method, system, equipment and storage medium
CN115687244A (en) File processing monitoring method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination