CN115858236A - Data backup method and database cluster

Info

Publication number
CN115858236A
Authority
CN
China
Prior art keywords
log
data node
physical
data
storage device
Prior art date
Legal status
Pending
Application number
CN202111117550.8A
Other languages
Chinese (zh)
Inventor
郑程光
张琼
李绪立
张彤
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202111117550.8A priority Critical patent/CN115858236A/en
Priority to PCT/CN2022/120709 priority patent/WO2023046042A1/en
Publication of CN115858236A publication Critical patent/CN115858236A/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application discloses a data backup method, which includes: a first data node (the primary data node) writes a first physical log into a first storage device, and the first storage device transfers the first physical log to a second storage device, so that a second data node in a second database cluster can acquire the first physical log from the second storage device. The first storage device and the second storage device are deployed in different database clusters, and the second data node serves as a backup node of the first data node. In this application, a data node in the first database cluster (the primary cluster) can quickly synchronize its physical log to the second database cluster (the standby cluster) through the storage devices, improving data transmission efficiency during data backup.

Description

Data backup method and database cluster
Technical Field
The present application relates to the field of databases, and more particularly, to a data backup method and a database cluster.
Background
With the rapid development of information technologies such as cloud computing and big data, more and more enterprises centralize their applications, data, and systems. This large-scale centralization also puts the data at risk, and keeping an enterprise's core services online, that is, running without interruption, when a catastrophic emergency occurs has become a primary concern for enterprises. The finance and banking industries have particularly high requirements on data security and must guarantee both data safety and service availability. A dual-cluster disaster recovery scheme is therefore needed, in which the standby cluster retains the capability of continuously providing services when the primary cluster fails, so that data is protected and recovered quickly when a natural or man-made disaster occurs.
Data disaster recovery refers to establishing, in a different location, a data system that is an available copy of the local critical application data. When a disaster strikes the local data and the entire application system, at least one usable copy of the critical service data is preserved at the remote site. The main techniques involved are data backup and data replication; data disaster recovery is essentially the handling of remote data replication. The remote data may be a full real-time copy of the local production data (synchronous replication) or may lag slightly behind the local data (asynchronous replication).
Taking the data disaster recovery of an Oracle database as an example: to ensure high availability and high reliability, the Oracle database deploys a primary repository and a backup repository in two machine rooms in the same city (or in different places), where the backup repository is a backup of the primary repository, and data synchronization between the primary and backup repositories is performed using a data protection technology.
The basic flow of data synchronization is as follows: when the primary repository generates a physical log, the log is transmitted to the standby repository through a pre-configured transmission mode, using synchronous or asynchronous replication, thereby replicating data between the primary and standby repositories.
In the prior art, a dedicated network line is used to transmit the physical logs. However, under this architecture the dedicated line makes the network facilities costly, and the transmission efficiency of a dedicated network line is low.
Disclosure of Invention
The embodiment of the application provides a data backup method, which can improve the data transmission efficiency during data backup.
In a first aspect, the present application provides a data backup method, where the method is applied to a first database cluster, where the first database cluster includes a first data node, and the method includes: the first data node acquires a first physical log, wherein the first physical log comprises operation information of data in the first data node; the first data node writes the first physical log into a first storage device, the first storage device is used for transmitting the first physical log to a second storage device, so that a second data node in a second database cluster acquires the first physical log from the second storage device, the first storage device is deployed in the first database cluster, the second storage device is deployed in the second database cluster, the first database cluster and the second database cluster are different database clusters, and the second data node is used as a backup node of the first data node.
In one possible implementation, data may be synchronously replicated between the first storage device and the second storage device, which may be storage devices with remote, parallel data transfer capability.
Through the above method, a data node in the first database cluster (the primary cluster) can quickly synchronize its physical log to the second database cluster (the standby cluster) through the storage devices, which brings the system closer to the goal of an RPO (recovery point objective) of 0 while preserving high service performance, improving data transmission efficiency during data backup.
In one possible implementation, the first physical log includes operation information on data in the first data node, where the operation information indicates a modification operation, a write operation, and/or a delete operation on the data in the data node. The first physical log may be a redo log, also referred to as an XLog, which records the physical modifications of a data page (that is, describes how the data changed) and can be used to recover the physical data page after a transaction is committed.
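For illustration, a physical (redo) log record could be modeled as below; the field names and layout are assumptions made for this sketch, since the patent does not specify a record format:

```go
package main

import "fmt"

// RedoRecord is an illustrative sketch of a physical (redo/XLog) record:
// it describes a change to a data page rather than the SQL that caused it.
type RedoRecord struct {
	LSN     uint64 // log sequence number, monotonically increasing
	XID     uint64 // transaction that made the change
	PageID  uint32 // data page that was modified
	Offset  uint16 // byte offset of the change within the page
	NewData []byte // after-image applied to the page during log replay
}

func main() {
	rec := RedoRecord{LSN: 42, XID: 7, PageID: 1001, Offset: 128, NewData: []byte("tuple bytes")}
	// Replay applies the after-image to the page, restoring it physically.
	fmt.Printf("replay LSN %d: write %d bytes at page %d, offset %d\n",
		rec.LSN, len(rec.NewData), rec.PageID, rec.Offset)
}
```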
In a possible implementation, the first data node performs transaction commit on the data in the first data node according to the first physical log after writing the first physical log into the first storage device.
In the embodiment of the application, the first data node transfers the first physical log to the first storage device before the transaction commit of the first physical log is completed; that is, before the commit, the standby data node cannot yet acquire the first physical log for log replay. Once the first physical log has been successfully written into the first storage device, the first physical log is considered replicated to the standby data node, so even if the first data node fails before the transaction commit completes, the first physical log can still be transferred to the standby data node.
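A minimal Go sketch of this write-before-commit ordering, with an assumed SharedVolume interface standing in for the first storage device:

```go
package backup

import "fmt"

// SharedVolume stands in for the first storage device; Append must return
// only once the record is durable on the device (an assumed interface).
type SharedVolume interface {
	Append(rec []byte) error
}

// commitWithLogShipping enforces the ordering described above: the physical
// log is written to the shared storage device before the local transaction
// commit. A crash after Append but before the commit is safe, because the
// standby side can already obtain the log from the device.
func commitWithLogShipping(vol SharedVolume, redo []byte) error {
	if err := vol.Append(redo); err != nil {
		// Log not durable: the transaction must not commit.
		return fmt.Errorf("log write failed, commit aborted: %w", err)
	}
	// Only now is the transaction marked committed locally.
	return nil
}
```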
In one possible implementation, the first data node may write the first physical log into the first storage device, and the first storage device may include a storage area (a first storage area) partitioned for the first data node. For example, if the first data node belongs to a first shard in the first database cluster, the first storage device may include a shared volume partitioned for that shard, and every data node in the shard may share the shared volume. Specifically, the first storage device may include a first storage space for storing the physical logs of the first data node and a second storage space for storing the physical logs of a third data node, where the first storage space and the second storage space are different.
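The per-shard division of the device might be modeled as in the following sketch; the region layout and naming are illustrative assumptions:

```go
package backup

// Region is one shard's private area on the shared storage device; the
// layout below is an illustrative assumption, not the patent's exact format.
type Region struct {
	Start uint64 // byte offset of this shard's region on the device
	Size  uint64 // fixed size of the region
}

// Device maps each shard (and thus each primary data node) to its own
// region, so that different primaries write without contending.
type Device struct {
	Regions map[string]Region // key: shard name, e.g. "shard1", "shard2"
}

// RegionFor returns the storage space a given shard's primary writes to.
func (d *Device) RegionFor(shard string) (Region, bool) {
	r, ok := d.Regions[shard]
	return r, ok
}
```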
In one possible implementation, the first database cluster further includes a third data node, and when the third data node writes a second physical log into the first storage device, the first data node may write the first physical log into the first storage device in parallel. Wherein the third data node may be a master node in the first database cluster. Specifically, when the third data node writes a second physical log into the second storage space, the first data node may write the first physical log into the first storage space in parallel.
Here, parallel means that the action of the first data node writing the first physical log into the first storage space and the action of the third data node writing the physical log into the second storage space occur simultaneously in time.
By dividing different storage areas for different primary nodes in the first database cluster, physical logs can be written in parallel by different primary nodes, which increases the concurrency of data writing and improves log transmission efficiency during data backup.
Because the size of the storage space in the storage device is limited, in one possible implementation the first storage device includes a first storage space for storing the physical logs of the first data node. Based on the available storage space in the first storage space being smaller than the storage space required by the first physical log, the first data node may determine a target storage space, which stores a target physical log, from the first storage space, and replace the target physical log with the first physical log based on the target physical log having already been replayed by the second data node. That is, when the first storage space runs short, occupied storage space in it is emptied and reused; to prevent a log from being emptied before the standby data node has fetched it, the first data node may only empty physical logs that the standby data node has already replayed (the standby data node may feed this information back to the first data node after completing log replay).
In one possible implementation, the storage addresses of the first storage space include a head address and a tail address, and storage proceeds from the storage space corresponding to the head address to the storage space corresponding to the tail address. Determining a target storage space from the first storage space based on the available storage space being smaller than the storage space required by the first physical log then comprises: based on the storage space corresponding to the tail address in the first storage space being occupied, determining the storage space corresponding to the head address as the target storage space.
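A minimal Go sketch of this circular reuse policy, assuming a slot-based layout and a per-slot replayed flag (both illustrative, not the patent's on-disk format):

```go
package backup

import "errors"

// Slot holds one log record's bookkeeping.
type Slot struct {
	LSN      uint64
	Replayed bool // set when the standby reports replay of this LSN
}

// Ring is the first storage space, written from head to tail and then
// reused from the head again.
type Ring struct {
	slots []Slot
	next  int // next write position; wraps from the tail back to the head
}

// Write stores a record at the next position, but only if the record
// previously held there has already been replayed by the standby node.
func (r *Ring) Write(lsn uint64) error {
	s := &r.slots[r.next]
	if s.LSN != 0 && !s.Replayed {
		// The standby has not consumed this record yet: must not overwrite.
		return errors.New("storage space full: oldest log not yet replayed")
	}
	*s = Slot{LSN: lsn}
	r.next = (r.next + 1) % len(r.slots) // tail reached -> wrap to head
	return nil
}
```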
In one possible implementation, the first storage device is a bare device. A bare device, which may also be referred to as a raw partition (original partition), is a device file that is not formatted and is not read through a file system: the application program reads and writes it directly, without file-system buffering, and the operating system does not manage it directly. Data can thus be written from the data node's memory straight to the storage device, skipping the copy from the operating system cache into the storage device's operating system cache, giving higher I/O efficiency. The first data node can write the first physical log into the first storage device in a direct I/O manner, improving read/write performance.
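A sketch of such a direct I/O write, assuming Linux and a placeholder device path; O_DIRECT requires sector-aligned, sector-multiple buffers, which the helper below arranges:

```go
package backup

import (
	"os"
	"syscall"
	"unsafe"
)

const alignment = 4096 // typical sector/page alignment required by O_DIRECT

// alignedBuf returns a 4 KiB-aligned slice of length n (n should be a
// multiple of alignment for O_DIRECT writes).
func alignedBuf(n int) []byte {
	raw := make([]byte, n+alignment)
	off := alignment - int(uintptr(unsafe.Pointer(&raw[0]))%alignment)
	return raw[off : off+n]
}

// writeDirect writes data to the raw device, bypassing the OS page cache
// (Linux-specific; devicePath is a placeholder, e.g. "/dev/sdb1").
func writeDirect(devicePath string, data []byte) error {
	f, err := os.OpenFile(devicePath, os.O_WRONLY|syscall.O_DIRECT, 0)
	if err != nil {
		return err
	}
	defer f.Close()
	// Pad to the next alignment boundary and copy into an aligned buffer.
	buf := alignedBuf((len(data) + alignment - 1) / alignment * alignment)
	copy(buf, data)
	_, err = f.Write(buf)
	return err
}
```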
In a possible implementation, after the first data node performs transaction commit on the data in the first data node according to the first physical log, the first data node may further write a second physical log containing commit information into the first storage device, where the first storage device is configured to transfer the second physical log to the second storage device, so that a management node in the second database cluster acquires the second physical log from the second storage device, and the commit information indicates that the first data node has completed the transaction commit of the first physical log. The commit information may be used by the second database cluster as a global consistency point when log playback is performed.
The commit information may include a transaction commit number, which may be used to identify a committed database transaction. A transaction is a logical unit in which a data storage node performs database operations and consists of a sequence of database operations. A transaction in the committed state has executed successfully, and the data involved in the transaction has been written to the data storage node.
In one possible implementation, the first database cluster further includes a fourth data node configured to act as a backup node for the first data node, and the method further includes: the fourth data node acquires the first physical log from the first storage device; and the fourth data node performs log playback according to the first physical log.
It should be understood that the fourth data node may first read the control information in the header of the first storage device and verify it, and then compare the log write progress on the storage device with the write progress of its local physical logs. If there are newer physical logs on the storage device, the fourth data node copies them to the local node and replays them; if there is no data to read, it waits in a loop.
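That read loop might look like the following Go sketch; the ControlInfo fields and Reader interface are assumptions standing in for the device's actual control information:

```go
package backup

import "time"

// ControlInfo mirrors the control information in the device header; the
// fields are assumptions (verification of the check code is omitted here).
type ControlInfo struct {
	CheckCode uint32 // integrity check over the header
	WriteLSN  uint64 // how far the primary has written on the device
}

// Reader abstracts the standby's view of the shared storage device.
type Reader interface {
	Header() (ControlInfo, error)
	// ReadSince returns log bytes after lsn and the new write progress.
	ReadSince(lsn uint64) ([]byte, uint64, error)
}

// replayLoop compares the device's write progress with the local progress,
// copies newer logs to the local node and replays them, and otherwise waits.
func replayLoop(dev Reader, localLSN uint64, replay func([]byte)) {
	for {
		hdr, err := dev.Header()
		if err != nil || hdr.WriteLSN <= localLSN {
			time.Sleep(100 * time.Millisecond) // nothing new yet: wait, retry
			continue
		}
		logs, newLSN, err := dev.ReadSince(localLSN)
		if err != nil {
			continue
		}
		replay(logs) // apply the physical log locally
		localLSN = newLSN
	}
}
```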
In a second aspect, the present application provides a data backup method, where the method is applied to a second database cluster, where the second database cluster includes a second data node, the second data node is used as a backup node for a first data node, the first data node belongs to a first database cluster, the first database cluster and the second database cluster are different database clusters, a first storage device is deployed in the first database cluster, and a second storage device is deployed in the second database cluster, where the method includes: the second data node acquires a first physical log from the second storage device, wherein the first physical log is a physical log which is transmitted from the first data node and is stored in the second storage device through the first storage device, and the first physical log comprises operation information of data in the first data node; and the second data node performs log playback according to the first physical log.
Through the above method, a data node in the first database cluster (the primary cluster) can quickly synchronize its physical log to the second database cluster (the standby cluster) through the storage devices, which brings the system closer to the goal of an RPO (recovery point objective) of 0 while preserving high service performance, improving data transmission efficiency during data backup.
In one possible implementation, the operation information indicates a modification operation, a write operation, and/or a delete operation on data in the data node.
Similar to the first storage device, the second storage device may include a storage area (a third storage area) partitioned for the second data node, where the second data node belongs to the first shard in the second database cluster. Specifically, the second storage device may include a third storage space for storing the physical logs of the second data node and a fourth storage space for storing the physical logs of a fifth data node, where the third storage space and the fourth storage space are different. Furthermore, just as the first data node writes the first physical log into the first storage device in parallel with other nodes, the second data node may acquire the first physical log from the second storage device in parallel while the other data nodes read physical logs from the second storage device.
Here, parallel means that the action of the second data node reading the first physical log from the third storage space and the action of the fifth data node reading a physical log from the fourth storage space occur simultaneously in time.
By dividing different storage areas for different nodes in the second database cluster, physical logs can be read in parallel by the standby data nodes of different shards, which increases the concurrency of data reading and improves log transmission efficiency during data backup.
In a possible implementation, to guarantee distributed consistency among the standby data nodes, their log replay progress must be kept the same, that is, their log replay sequence numbers (LSNs) must be consistent (a log sequence number indicates replay progress; the greater the sequence number, the further along the progress). In this embodiment of the application, a management node in the second database cluster may maintain a piece of global information (which may be referred to as a barrier point): a log sequence number that every standby data node has already acquired, namely the smallest among the largest physical log sequence numbers for which each primary node has currently committed a transaction. For example, suppose the management node learns that the current transaction commit progress of primary node 1 is 1, 2; of primary node 2 is 1, 2; of primary node 3 is 1, 2, 3; and of primary node 4 is 1, 2, 3, 4. Then 2 is the smallest of the largest committed physical log sequence numbers (2, 2, 3, 4) acquired by the management node.
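The following Go sketch illustrates that min-of-max computation; the map-based interface is an assumption for illustration, not the patent's actual module API:

```go
package backup

// candidateBarrier derives the global consistency point (barrier) the way
// the text above describes: take each primary's largest committed log
// sequence number, then the minimum of those maxima across all primaries.
func candidateBarrier(committedMax map[string]uint64) uint64 {
	var min uint64
	first := true
	for _, lsn := range committedMax { // primary name -> max committed LSN
		if first || lsn < min {
			min, first = lsn, false
		}
	}
	return min // every primary has committed at least up to this LSN
}
```

With the maxima from the example above (2, 2, 3, 4), candidateBarrier returns 2.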
In a possible implementation, physical logs with the same sequence number on different primary data nodes among the plurality of primary data nodes correspond to the same task, and each primary data node commits transactions for its physical logs in ascending order of sequence number; that is, the sequence number indicates the primary node's transaction commit progress.
It should be understood that the functions of the management node may be implemented through the cooperation of modules such as the CMA and CMS in the second database cluster.
After the standby data node acquires a physical log carrying commit information from the second storage device, it can write the physical log into its local disk, parse the newly persisted physical log, store the parsed barrier points in a hash table, and record the largest barrier point currently received, where the largest barrier point is the log sequence number of the physical log most recently committed by the primary data node. The function of the management node can be implemented through the cooperation of the CMA, CMS, and ETCD: the CMA queries the largest barrier points of the CNs and DNs and reports them to the CMS, and the CMS takes the minimum of the largest barrier points across the standby data nodes and stores it in the ETCD as a "candidate sequence number" (or value to be checked). The CMA then fetches the value to be checked from the ETCD, queries the DNs to confirm whether that point exists on each of them (that is, determines whether every standby data node has acquired the physical log corresponding to the candidate sequence number), and reports the result to the CMS. The CMS then judges: if the physical log corresponding to the value to be checked exists on every standby data node, it is stored in the ETCD as the "target sequence number" (or simply the target value); otherwise it is discarded. The CMA reads the target value from the ETCD and updates the local target value. In one reporting round, the CMA performs three steps: reporting the maximum barrier value, locally checking whether the value to be checked exists, and updating the target value; the CMS performs two: updating the value to be checked and updating the target value. Barrier deletion marks the end of a consistency point and occurs during physical log replay: when a barrier point is replayed, the replay position is updated and the barrier point is deleted from the hash table, completing the whole lifecycle of a barrier from creation to deletion.
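As a complement to the candidate computation sketched earlier, the sketch below condenses the candidate-to-target promotion into plain function calls; the CMA/CMS/ETCD interactions and the Standby interface are assumptions made for illustration:

```go
package backup

// Standby abstracts one standby data node's answer to the CMA query
// "does the physical log for this sequence number exist locally?".
type Standby interface {
	HasLog(lsn uint64) bool
}

// promote implements the CMS judgment described above: the candidate
// ("value to be checked") becomes the target sequence number only if every
// standby data node already holds the corresponding physical log;
// otherwise the candidate is discarded and the previous target remains.
func promote(candidate, prevTarget uint64, standbys []Standby) uint64 {
	for _, s := range standbys {
		if !s.HasLog(candidate) {
			return prevTarget // not yet safe to advance the barrier
		}
	}
	return candidate // all standbys can replay up to this point
}
```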
In the embodiment of the application, a management node in the second database cluster maintains a target sequence number as global information. The target sequence number is the smallest of the reported log sequence numbers, and every standby data node has already obtained the physical log corresponding to it; each standby data node performs log replay when the log sequence number of the physical log currently to be replayed is equal to the target sequence number. This ensures that the standby data nodes replay up to the target sequence number, so that different standby data nodes are all restored to the same position, guaranteeing data consistency among the standby data nodes of the distributed database.
In a third aspect, the present application provides a first database cluster comprising a first data node, the first data node comprising:
the log acquisition module is used for acquiring a first physical log, wherein the first physical log comprises operation information of data in the first data node;
a log transfer module, configured to write the first physical log into a first storage device, where the first storage device is configured to transfer the first physical log to a second storage device, so that a second data node in a second database cluster obtains the first physical log from the second storage device, where the first storage device is deployed in the first database cluster, the second storage device is deployed in the second database cluster, the first database cluster and the second database cluster are different database clusters, and the second data node is configured to serve as a backup node of the first data node.
In one possible implementation, the operation information indicates a modification operation, a write operation, and/or a delete operation on data in the data node.
In one possible implementation, the first data node further includes:
and the transaction submitting module is used for submitting the transaction of the data in the first data node according to the first physical log after the first physical log is transmitted to the first storage device.
In one possible implementation, the first database cluster further includes a third data node; the log transfer module is specifically configured to:
and when the third data node writes the second physical log into the first storage device, writing the first physical log into the first storage device in parallel.
In one possible implementation, the first storage device includes a first storage space for storing physical logs of the first data node and a second storage space for storing physical logs of the third data node, the first storage space and the second storage space being different;
the log transfer module is specifically configured to:
and when the third data node writes a second physical log into the second storage space, the first data node writes the first physical log into the first storage space in parallel.
In a possible implementation, the first storage device includes a first storage space for storing a physical log of the first data node, and the log transfer module is specifically configured to:
determining a target storage space from the first storage space based on the available storage space in the first storage space being smaller than the storage space required by the first physical log, wherein the target storage space stores a target physical log;
replacing the target physical log in the target storage space with the first physical log based on the target physical log having been log replayed by the second data node.
In a possible implementation, the storage addresses of the first storage space include a head address and a tail address, and the storage order of the storage spaces is from the storage space corresponding to the head address to the storage space corresponding to the tail address;
the log transfer module is specifically configured to:
and determining the storage space corresponding to the head address from the first storage space as the target storage space based on the occupied storage space corresponding to the tail address in the first storage space.
In one possible implementation, the first storage device is a bare device.
In one possible implementation, the log transfer module is further configured to:
after the first data node performs transaction commit on the data in the first data node according to the first physical log, write a second physical log containing commit information into the first storage device, where the first storage device is configured to transfer the second physical log to the second storage device, so that a management node in the second database cluster can acquire the second physical log from the second storage device, and the commit information indicates that the first data node has completed the transaction commit of the first physical log.
In one possible implementation, the first database cluster further includes a fourth data node, the fourth data node being configured to act as a backup node for the first data node, the fourth data node including: the log obtaining module is used for obtaining the first physical log from the first storage device;
and the log playback module is used for performing log playback according to the first physical log.
In a fourth aspect, the present application provides a second database cluster, where the second database cluster includes a second data node, the second data node is used as a backup node for a first data node, the first data node belongs to a first database cluster, the first database cluster and the second database cluster are different database clusters, a first storage device is deployed in the first database cluster, a second storage device is deployed in the second database cluster, and the second data node includes:
a log obtaining module, configured to obtain a first physical log from the second storage device, where the first physical log is a physical log that is from the first data node and is stored in the second storage device by being transferred through the first storage device, and the first physical log includes operation information on data in the first data node;
and the log playback module is used for performing log playback according to the first physical log.
In one possible implementation, the operation information indicates a modification operation, a write operation, and/or a delete operation on data in the data node.
In one possible implementation, the second database cluster further includes a fifth data node; the log obtaining module is specifically configured to:
when the fifth data node acquires the physical logs from the second storage device, the second data node acquires the first physical logs from the second storage device in parallel.
In one possible implementation, the second storage device includes a third storage space for storing the physical log of the second data node and a fourth storage space for storing the physical log of the fifth data node, and the third storage space and the fourth storage space are different;
the log obtaining module is specifically configured to:
and when the fifth data node acquires the physical logs from the fourth storage space, the second data node acquires the first physical logs from the third storage space in parallel.
In one possible implementation, the first database cluster includes a plurality of primary data nodes including the first data node, the second database cluster includes a plurality of backup data nodes including the second data node, the second database cluster further includes a management node, the management node includes:
a commit information obtaining module, configured to obtain, from the second storage device, commit information from the first database cluster, where the commit information includes, for each primary data node of the plurality of primary data nodes, the log sequence number of the physical log for which that node most recently completed a transaction commit; a target sequence number is the smallest among these log sequence numbers, and each backup data node of the plurality of backup data nodes has obtained the physical log corresponding to the target sequence number;
the log playback module is specifically configured to: acquire the target sequence number from the management node; and after determining that the log sequence number of the first physical log is equal to the target sequence number, perform log playback according to the first physical log.
In one possible implementation, physical logs with the same sequence number on different primary data nodes among the plurality of primary data nodes correspond to the same task, and each primary data node commits transactions for its physical logs in ascending order of sequence number.
In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium comprising computer-readable instructions that, when run on a computer device, cause the computer device to perform the method of the first aspect and any optional implementation thereof, or of the second aspect and any optional implementation thereof.
In a sixth aspect, the present application provides a computer program product comprising computer-readable instructions that, when run on a computer device, cause the computer device to perform the method of the first aspect and any optional implementation thereof, or of the second aspect and any optional implementation thereof.
In a seventh aspect, the present application provides a chip system, which includes a processor configured to support the apparatus in implementing the functions recited in the above aspects, for example, transmitting or processing the data and/or information recited in the above methods. In one possible design, the chip system further includes a memory for storing the program instructions and data necessary for the execution device or the training device. The chip system may consist of a chip, or may include a chip and other discrete devices.
On the basis of the implementations provided by the above aspects, the present application may further combine them to provide more implementations.
Drawings
FIG. 1 is an architectural illustration provided by an embodiment of the present application;
FIG. 2 is an architectural illustration provided by an embodiment of the present application;
FIG. 3 is an architectural illustration provided by an embodiment of the present application;
FIG. 4 is a flowchart illustrating a data backup method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a storage space provided in an embodiment of the present application;
FIG. 6 is a schematic representation of a barrier point processing flow provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of a first database cluster provided in an embodiment of the present application;
FIG. 8 is a schematic diagram of a second database cluster provided in an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a computing device according to an embodiment of the present application.
Detailed Description
The embodiments of the present application are described below with reference to the drawings. The terminology used in the description of the embodiments section of the present application is for the purpose of describing particular embodiments of the present application only and is not intended to be limiting of the present application.
The terms "first," "second," and the like in the description and in the claims of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and are merely descriptive of the various embodiments of the application and how objects of the same nature can be distinguished. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Fig. 1 is a schematic diagram of the system logical structure of a data backup system according to an embodiment of the present application. The system may include a client, a primary repository (e.g., the first database cluster in this embodiment of the present application), and a backup repository (e.g., the second database cluster in this embodiment of the present application). The primary repository may include a plurality of shards (e.g., shard 1 and shard 2 shown in fig. 1), and each shard may include data nodes (DNs); for example, shard 1 shown in fig. 1 includes primary node 1 and a backup node that may serve as a backup of primary node 1, and shard 2 includes primary node 2 and a backup node that may serve as a backup of primary node 2. The primary repository may also include a coordinator node (CN), and a hardware device 1 may be deployed on the primary repository side, where hardware device 1 may be a storage device (for example, the first storage device in this embodiment).
As a backup of the primary repository, the backup repository may correspondingly deploy a plurality of shards, for example shard 1 and shard 2 shown in fig. 1, where shard 1 in the backup repository may serve as a backup of shard 1 in the primary repository: the backup nodes in shard 1 may serve as backups of primary node 1, and the backup nodes in shard 2 may serve as backups of primary node 2. In addition, a hardware device 2 may be deployed on the backup repository side, where hardware device 2 may be a storage device (for example, the second storage device in this embodiment).
The primary repository or the backup repository may each be a storage array, network attached storage (NAS), a storage area network (SAN), or another network storage architecture. Each storage node (e.g., the data nodes and the coordinator node described above) may be a logical unit number (LUN) or a file system. It should be understood that the embodiments of the present application do not limit the form of the repositories or the storage nodes.
The client may be connected to the primary database system and the standby database system through a network, where the network may be the Internet, an intranet, a local area network (LAN), a wide area network (WAN), a storage area network (SAN), or the like, or a combination thereof.
The primary node and backup node shown in FIG. 1 may be implemented by computing device 200 shown in FIG. 2.
Fig. 2 is a simplified logical block diagram of a computing device 200, as shown in fig. 2, the computing device 200 includes a processor 202, a memory unit 204, an input/output interface 206, a communication interface 208, a bus 210, and a storage device 212. The processor 202, the memory unit 204, the input/output interface 206, the communication interface 208 and the storage device 212 are communicatively connected to each other through a bus 210.
The processor 202 is the control center of the computing device 200 and is configured to execute relevant programs to implement the technical solutions provided by the embodiments of the present invention. Optionally, the processor 202 includes one or more central processing units (CPUs), such as CPU 1 and CPU 2 shown in fig. 2. Optionally, the computing device 200 may include a plurality of processors 202, and each processor 202 may be a single-core processor (including one CPU) or a multi-core processor (including multiple CPUs). Unless otherwise stated, in the embodiments of the present application, a component for performing a specific function, for example the processor 202 or the memory unit 204, may be implemented by configuring a general-purpose component to perform the corresponding function, or by a special-purpose component that specifically performs that function, which is not limited in the present application. The processor 202 may be a general-purpose central processing unit, a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, configured to execute related programs to implement the technical solutions provided in the present application.
The processor 202 may be coupled to one or more storage schemes via the bus 210. A storage scheme may include the memory unit 204 and the storage device 212. The storage device 212 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory unit 204 may be a random access memory; it may be integrated with the processor 202 or inside the processor 202, or may be one or more storage units separate from the processor 202.
Program code for execution by processor 202 or a CPU internal to processor 202 may be stored in storage device 212 or memory unit 204. Optionally, program code stored within storage device 212 (e.g., an operating system, application software, backup module, communication module, or storage control module, etc.) is copied into memory unit 204 for execution by processor 202.
The storage device 212 may be a physical hard disk or a partition thereof (including a small computer system interface storage or a global network block device volume), a network storage protocol (including a network or cluster file system such as the network file system NFS), a file-based virtual storage device (virtual disk image), or a logical-volume-based storage device. It may include high-speed random access memory (RAM) and may also include non-volatile memory, such as one or more disk memories, flash memories, or other non-volatile memories. In some embodiments, the storage device may further comprise a remote memory separate from the one or more processors 202, such as a network disk accessed via the communication interface 208 over a communication network, which may be the Internet, an intranet, a local area network (LAN), a wide area network (WAN), a storage area network (SAN), or the like, or a combination thereof.
An operating system (e.g., Darwin, RTXC, Linux, UNIX, OS X, Windows, or an embedded operating system such as VxWorks) includes various software components and/or drivers for controlling and managing conventional system tasks (e.g., memory management, storage device control, power management, etc.) and for facilitating communication between various software and hardware components.
The input/output interface 206 is used for receiving input data and information, and outputting data such as operation results.
Communication interface 208 enables communication between computing device 200 and other devices or communication networks using transceiver means, such as, but not limited to, transceivers.
Bus 210 may include a pathway to transfer information between components of computing device 200, such as processor 202, memory unit 204, input/output interface 206, communication interface 208, and storage device 212. Alternatively, the bus 210 may use a wired connection or a wireless communication, which is not limited in this application.
It should be noted that although the computing device 200 shown in fig. 2 shows only a processor 202, a memory unit 204, an input/output interface 206, a communication interface 208, a bus 210, and a storage device 212, those skilled in the art should understand that in a specific implementation the computing device 200 also contains other components necessary for proper operation.
The computing device 200 may be a general-purpose computing device or a special-purpose computing device, including but not limited to a portable computing device, a personal desktop computing device, a web server, a tablet computer, a mobile phone, a Personal Digital Assistant (PDA), and any other electronic devices, or a combination of two or more of them, and the application does not limit the specific implementation form of the computing device 200.
Moreover, computing device 200 of FIG. 2 is merely an example of one computing device 200, and computing device 200 may contain more or fewer components than shown in FIG. 2, or have a different arrangement of components. Those skilled in the art will appreciate that the computing device 200 may also contain hardware components that implement other additional functionality, according to particular needs. Those skilled in the art will appreciate that the computing device 200 may also contain only those elements necessary to implement an embodiment of the present invention, and need not contain all of the elements shown in FIG. 2. Also, the various components illustrated in FIG. 2 may be implemented in hardware, software, or a combination of hardware and software.
The hardware structure shown in fig. 2 and the above description are applicable to various computing devices provided in the embodiments of the present application, and are suitable for executing various data backup methods provided in the embodiments of the present application.
Referring to fig. 3, fig. 3 shows a product form according to an embodiment of the present application: a dual-cluster disaster recovery architecture for a distributed database with shared logs. The two database clusters are deployed in two physical regions, and when the data backup method runs, the program code provided by the embodiment of the present application runs in the host memory of the server. Taking the application scenario shown in fig. 3 as an example, the management-side client may issue instructions for building a cluster, establishing a disaster recovery relationship between two clusters, switching clusters, querying cluster status, and so on; after receiving an instruction, the OM module in the cluster controls the CM and other modules and the database nodes to complete the related operations and returns the execution result. A shared volume is a storage device with remote (physically distant) parallel data replication capability, used for the synchronous transfer of redo logs between the primary and standby clusters. When the database clusters run, the primary node on each shard of the primary cluster generates logs and writes them into the shared volume; the logs are synchronized into the corresponding shared volume of the standby cluster, and the standby data nodes of both the primary and standby clusters read the logs from the shared volumes and replay them.
Referring to fig. 4, fig. 4 is a flowchart illustrating a data backup method provided in an embodiment of the present application, where the method may be applied to a first database cluster, where the first database cluster includes a first data node, and the method includes:
401. The first data node acquires a first physical log, where the first physical log includes operation information on data in the first data node.
In one possible implementation, the first database cluster may be a distributed database, the first database cluster may be a primary cluster, and the second database cluster may be a backup cluster for the first database cluster.
For example, the first database cluster may be a database system based on a data-sharding distributed (shared-nothing) architecture. Each data node may be configured with its own central processing unit (CPU), memory, hard disk, and so on, and no resources are shared among the storage nodes. The first data node may be a data node of one shard in the first database cluster; for example, the first data node may be the primary data node of one shard in the first database cluster.
In one possible implementation, the first data node may be a data node (DN) in the first database cluster. The first database cluster may be deployed with at least one coordinator node and at least one data node, where a coordinator node may be deployed on a computing device and a data node may likewise be deployed on a computing device. A plurality of coordinator nodes may be deployed on different computing devices or on the same computing device, and a plurality of data nodes may be deployed on different computing devices. A coordinator node and a data node may be deployed separately on different computing devices, or on the same computing device.
In one possible implementation, data may be distributed across the data nodes, with no data shared between them. When a service is executed, the coordinator node receives a query request from a client and generates an execution plan, which is issued to each data node; a data node initializes the operators it will use (for example, a data operation (stream) operator) according to the received plan and then executes the plan issued by the coordinator node. The coordinator node and the data nodes, as well as data nodes on different physical nodes, may be connected through a network channel, which may use any of various communication protocols such as an extensible transmission control protocol (STCP).
In a possible implementation, the first data node, serving as a primary node in the first database cluster, may receive a data operation request from a client and generate the first physical log according to the data operation request; the second data node (or a fourth data node described in subsequent embodiments), serving as a backup node of the first data node, may acquire the first physical log and perform log replay according to it, thereby ensuring that the data on the first data node and the second data node is consistent.
In one possible implementation, the first physical log includes operation information on data in the first data node, where the operation information indicates a modification operation, a write operation, and/or a delete operation on the data in the data node. The first physical log may be a redo log, also referred to as an XLog, which records the physical modifications of a data page (that is, describes how the data changed) and can be used to recover the physical data page after a transaction is committed.
In one possible implementation, the log files in the database system may include logical log files and physical log files. A logical log records the original logic of a logical operation performed on the database system, for example the original logic of data access, data deletion, data modification, data query, database system upgrade, and database system management operations. A logical operation is the process of performing logical processing according to a user's data operation command and determining which data operations need to be performed on the data. When the data operation command is expressed in the Structured Query Language (SQL), the original logic of the logical operation may be computer instructions expressed as SQL statements. A physical log records how data in the database system changes (for example, changes of data pages in the data storage nodes); its content can be understood as the data changes caused by performing logical operations on the database system.
It should be understood that before the database is initialized, no primary-standby relationship is distinguished between the first database cluster and the second database cluster. At initialization, the two clusters are independent of each other, without primary and standby roles, and the nodes in each shard comprise one primary node and several standby data nodes. Each shard is configured with a shared volume (i.e., the storage spaces in the first storage device and the second storage device described in the following embodiments), and all nodes in the shard are guaranteed access rights to it; the primary node of a shard generates logs and stores them on the storage device corresponding to the shard. At this point the storage devices of the two clusters have not established a synchronous replication relationship and do not distinguish a primary end from a secondary end. One of the clusters is then selected as the disaster recovery standby cluster and stopped, to prevent it from writing data to the shared disk; the relevant parameters of the primary and standby clusters are configured, and a remote replication relationship between the storage devices is established, that is, data is synchronously replicated from the primary cluster's storage device to the standby cluster's storage device. The standby cluster issues a build (rebuild) request to the primary cluster over the network, completes the transfer and replication of data, logs, and so on, and the clusters are started, completing the establishment of the disaster recovery relationship.
402. And the first data node writes the first physical log into a first storage device, and the first storage device is used for transmitting the first physical log to a second storage device.
In one possible implementation, the first data node may write the first physical log to the first storage device after acquiring the first physical log.
The first storage device and the second storage device may be physical devices such as full-flash storage systems.
In one possible implementation, the first storage device is a bare device. A bare device, which may also be referred to as a raw partition (original partition), is a device file that is not formatted and is not read through a file system: the application program reads and writes it directly, without file-system buffering, and the operating system does not manage it directly. Data can thus be written from the data node's memory straight to the storage device, skipping the copy from the operating system cache into the storage device's operating system cache, giving higher I/O efficiency. The first data node can write the first physical log into the first storage device in a direct I/O manner, improving read/write performance.
In one possible implementation, after the first data node writes the first physical log into the first storage device, the first data node performs transaction commit on data in the first data node according to the first physical log.
In the embodiment of the application, the first data node transfers the first physical log to the first storage device before the transaction commit of the first physical log is completed; that is, before the commit, the standby data node cannot yet acquire the first physical log for log replay. Once the first physical log has been successfully written into the first storage device, the first physical log is considered replicated to the standby data node, so even if the first data node fails before the transaction commit completes, the first physical log can still be transferred to the standby data node.
In one possible implementation, the first storage device may include a storage area (a first storage area) divided for the first data node. For example, if the first data node belongs to a first shard in the first database cluster, the first storage device may include a shared volume divided for that shard, and each data node in the shard may share that volume. Specifically, the first storage device may include a first storage space for storing the physical log of the first data node and a second storage space for storing the physical log of the third data node, the first storage space and the second storage space being different.
In a possible implementation, the first database cluster further includes a third data node, which may be a master node in the first database cluster; while the third data node writes a second physical log into the first storage device, the first data node may write the first physical log into the first storage device in parallel. Specifically, while the third data node writes the second physical log into the second storage space, the first data node may write the first physical log into the first storage space in parallel.
Because different master nodes in the first database cluster are assigned different storage areas, physical-log writes from different master nodes can be executed in parallel, which improves the concurrency of data writing and the transfer efficiency of logs during data backup. A minimal sketch of such a per-node layout follows.
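As an illustration only, the per-node storage spaces could be laid out as fixed, non-overlapping regions of the shared volume; the region sizing and names below are assumptions.

```python
CTRL_SIZE = 16 * 1024 * 1024   # control-info header, per the Fig. 5 layout described below

def node_log_regions(volume_size: int, node_ids: list) -> dict:
    """Map each master node to a disjoint (start, end) byte range of the volume,
    so that different nodes can write their physical logs in parallel."""
    per_node = (volume_size - CTRL_SIZE) // len(node_ids)
    return {nid: (CTRL_SIZE + i * per_node, CTRL_SIZE + (i + 1) * per_node)
            for i, nid in enumerate(node_ids)}
```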
Because the storage space in the storage device is limited, in one possible implementation the first storage device includes a first storage space for storing the physical log of the first data node, and the first data node may determine a target storage space from the first storage space based on the available storage space in the first storage space being smaller than the storage space required by the first physical log, the target storage space storing a target physical log; the first data node then replaces the target physical log in the target storage space with the first physical log based on the target physical log having already been replayed by the second data node. That is, when the first storage space runs short, occupied storage space within it is reclaimed and reused; to prevent a physical log from being overwritten before the backup data node has fetched it, the physical log reclaimed by the first data node must be one that the backup data node has already replayed (the backup data node may feed this information back to the first data node after it completes the replay).
In a possible implementation, the storage addresses of the first storage space include a head address and a tail address, and storage proceeds in order from the storage space corresponding to the head address to the storage space corresponding to the tail address; when the storage space corresponding to the tail address in the first storage space is occupied, the first data node may determine the storage space corresponding to the head address as the target storage space, so that writing wraps around.
Referring to fig. 5, an area (for example, 16 MB in size) may be set aside at the head of the storage device for writing control information (control info), which may include a check code, the log write position, the file size, and so on. The physical log is written starting at the 16 MB offset, and the log storage area is used cyclically: when the write position (head) reaches the tail of the log area, writing continues from the 16 MB offset again. The master node (e.g., the first data node) generates a physical log, copies it from the local directory to the storage device, and updates the control information as it writes. Once the physical log has been written into the storage device, the log is considered successfully persisted, and the transaction is then committed. However, the storage device's space is limited, and under heavy host load a physical log could be overwritten before the standby has read it to local storage, rendering the standby unusable. The host therefore needs to know the largest log sequence number (LSN) currently persisted locally on the standby, and can decide from that LSN whether it may continue writing to the storage device. In other words, as long as standby-cluster nodes are connected, a position in the log area may be overwritten only after at least one standby-cluster node has copied the log at that position.
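The following sketch models this circular log area and the overwrite guard; the in-memory dictionary stands in for the device, and all names are illustrative.

```python
class CircularLogArea:
    """Log region following the 16 MB control header; offsets wrap back to zero,
    i.e., back to the 16 MB offset on the device."""

    def __init__(self, area_size: int):
        self.area_size = area_size
        self.write_pos = 0
        self.slots = {}                        # offset -> (lsn, record), device stand-in

    def append(self, lsn: int, record: bytes, standby_max_lsn: int) -> bool:
        if self.write_pos + len(record) > self.area_size:
            self.write_pos = 0                 # head reached the tail: wrap around
        old = self.slots.get(self.write_pos)
        if old is not None and old[0] > standby_max_lsn:
            return False                       # standby has not copied this log: wait
        self.slots[self.write_pos] = (lsn, record)
        self.write_pos += len(record)
        return True
```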
Each log has a unique LSN; that is, logs and LSNs correspond one to one, so a log can be uniquely identified by its LSN.
In a possible implementation, after the first data node commits the transaction for the data in the first data node according to the first physical log, the first data node may further write a second physical log containing commit information into the first storage device, and the first storage device transfers the second physical log to the second storage device so that a management node in the second database cluster can acquire it from there; the commit information indicates that the first data node has completed the transaction commit of the first physical log. The commit information may be referenced by the second database cluster as a global consistency point during log replay.
The commit information may include a transaction commit number, which can be used to identify a committed database transaction. A transaction is a logical unit in which a data storage node performs database operations, and it consists of a sequence of database operations. A transaction in the committed state has executed successfully, and the data involved in the transaction has been written to the data storage node.
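As a sketch only, such a commit record might carry the fields below; the exact on-device layout is not specified in this embodiment, so these field names are assumptions.

```python
from dataclasses import dataclass

@dataclass
class CommitRecord:
    """Illustrative content of the 'second physical log' carrying commit information."""
    lsn: int            # sequence number of this commit record itself
    committed_lsn: int  # LSN of the first physical log whose transaction committed
    commit_no: int      # transaction commit number identifying the committed transaction
```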
In a possible implementation, the first database cluster further includes a fourth data node that serves as a backup node of the first data node; for example, the fourth data node may be a data node in the same shard as the first data node. The fourth data node may then obtain the first physical log from the first storage device and perform log playback according to the first physical log.
It should be understood that the fourth data node may first read and verify the control information in the header of the first storage device, compare the log write progress on the storage device with the write progress of its local physical log, and, if the storage device holds a newer physical log, copy it to local storage and replay it; if there is no data to read, it waits in a loop.
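A minimal sketch of that cyclic wait follows, ignoring wrap-around of the circular area for brevity; `read_control_info`, `read`, and the `checksum_ok` field are hypothetical helpers on the volume object.

```python
import time

def standby_replay_loop(volume, local_progress: int, replay) -> None:
    """Poll the shared volume: verify the header, copy any new log bytes, replay them."""
    while True:
        ctrl = volume.read_control_info()      # hypothetical: checksum + write position
        if not ctrl.checksum_ok:
            continue                           # torn header read: try again
        if ctrl.write_pos > local_progress:
            chunk = volume.read(local_progress, ctrl.write_pos)
            replay(chunk)                      # copy to local disk and replay
            local_progress = ctrl.write_pos
        else:
            time.sleep(0.01)                   # nothing new yet: loop and wait
```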
In one possible implementation, data may be synchronously replicated between the first storage device and the second storage device, which may be storage devices with remote, parallel data-transfer capability. The first storage device may thereby transfer the first physical log to the second storage device.
In one possible implementation, the second storage device may include a third storage space for storing the physical log of the second data node and a fourth storage space for storing the physical log of the fifth data node, the third storage space and the fourth storage space being different, and the first storage device may transfer the first physical log to the third storage space in the second storage device.
403. A second data node in a second database cluster obtains the first physical log from the second storage device.
In a possible implementation, similar to the first database cluster, the second database cluster may be a distributed database; the first database cluster may be the primary cluster and the second database cluster its backup cluster. For example, the second database cluster may be a database system based on a data-sharding (shared-nothing) distributed architecture, in which each data node may be configured with its own central processing unit (CPU), memory, hard disk, and so on, resources are not shared among the storage nodes, and the second data node may be a data node of one shard in the second database cluster.
In one possible implementation, the second data node may be a data node (DN) in the second database cluster. The second database cluster may be deployed with at least one coordinator node (CN) and at least one data node, each deployed on a computing device. Multiple coordinator nodes may be deployed on different computing devices or on the same computing device, and multiple data nodes may respectively be deployed on different computing devices. A coordinator node and a data node may be deployed on different computing devices or on the same computing device.
In a possible implementation, the first data node, acting as a master node in the first database cluster, may receive a data operation request from a client and generate the first physical log according to that request; the second data node, acting as a backup node of the first data node, may acquire the first physical log from the second storage device and perform log playback according to it, thereby keeping the data on the first data node and the data on the second data node consistent. Specifically, the second data node may obtain the first physical log from a third storage space, the third storage space being the storage space allocated to the second data node in the second storage device.
Similar to the first storage device, the second storage device may include a storage area (a third storage area) divided for the second data node, the second data node belonging to the first shard in the second database cluster. Specifically, the second storage device may include a third storage space for storing the physical log of the second data node and a fourth storage space for storing the physical log of the fifth data node, the third storage space and the fourth storage space being different. Furthermore, just as when the first data node writes the first physical log into the first storage device, the second data node may acquire the first physical log from the second storage device in parallel while other data nodes read their physical logs from the second storage device.
Here, parallel means that the action of the first data node writing the first physical log into the first storage space and the action of the third data node writing the physical log into the second storage space occur simultaneously in time.
Because different storage areas in the second storage device are divided for the backup data nodes of different shards, physical-log reads by the backup data nodes of different shards can be executed in parallel, which improves the concurrency of data reading and the transfer efficiency of logs during data backup.
In one possible implementation, the first database cluster includes a plurality of data nodes including the first data node, and the second database cluster further includes a management node. The management node may obtain, from the second storage device, commit information from the first database cluster, the commit information including the log sequence number of the physical log for which each of the plurality of data nodes most recently completed a transaction commit; a target sequence number is then the smallest sequence number among the plurality of log sequence numbers.
In the prior art, a distributed consistency mechanism based on the storage devices generates a global barrier log to guarantee that a furthest recovery point common to the different shards can be found, but this cannot solve the problem of data synchronization failing when the storage devices suffer a network problem.
In a possible implementation, to guarantee distributed consistency among the backup data nodes, the log playback progress of the backup data nodes must be kept the same, that is, their log playback sequence numbers (LSNs) must be consistent (a log sequence number can indicate playback progress: the larger the sequence number, the further playback has advanced). In this embodiment of the application, a management node in the second database cluster may maintain a piece of global information (which may be called a barrier point). The global information may be a log sequence number of a physical log that every backup data node has already acquired, and it is the smallest among the largest physical-log sequence numbers for which each master node has currently completed transaction commit. For example, suppose the management node learns that master node 1 has committed logs 1 and 2, master node 2 has committed logs 1 and 2, master node 3 has committed logs 1, 2, and 3, and master node 4 has committed logs 1, 2, 3, and 4; the largest committed sequence numbers are then 2, 2, 3, and 4, and the smallest of these, 2, is the value the management node obtains.
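In code, this min-of-max selection over the example above is a one-liner; the values are taken from the example and are purely illustrative.

```python
committed = {1: [1, 2], 2: [1, 2], 3: [1, 2, 3], 4: [1, 2, 3, 4]}
barrier_point = min(max(lsns) for lsns in committed.values())
print(barrier_point)  # -> 2: smallest of the per-master maximum committed LSNs
```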
In a possible implementation, physical logs with the same sequence number on different master data nodes among the plurality of master data nodes correspond to the same task, and each master data node commits transactions for its physical logs in ascending order of sequence number; that is, the sequence numbers indicate the progress of each master node's transaction commits.
It should be understood that the management node may be an operation management module (OM), a cluster management module (CM), a cluster management agent (CMA), a cluster management server (CM Server, CMS), a global transaction manager (GTM), or the like.
After the backup data node acquires a physical log carrying commit information from the second storage device, it can write the physical log to its local disk, parse the newly landed physical log, store the parsed barrier points in a hash table, and record the largest barrier point received so far, the largest barrier point being the log sequence number of the physical log most recently committed by the master data node. The function of the management node can be realized by the CMA, CMS, and ETCD cooperating with one another: the CMS takes the smallest of the largest barrier points on the backup data nodes as a candidate sequence number (the "value to be detected"), confirms through the CMA that every backup data node has acquired the physical log corresponding to that candidate sequence number, and, if so, stores it in the ETCD as the target sequence number (the "target value"); otherwise the candidate is discarded. The full advancement cycle is described below.
Illustratively, in a multi-shard scenario there are multiple synchronization links sharing the storage devices, and distributed consistency must be guaranteed even when the progress of those links differs. The backup cluster needs to obtain the smallest of the current largest barrier points of the shards (namely, the smallest of the log sequence numbers of the physical logs most recently committed by each master data node among the plurality of master data nodes), and the backup cluster can then recover to that point. The process can be divided into four stages: barrier generation, barrier parsing and storage, barrier advancement, and barrier deletion. Barrier generation is the precondition of consistency: a barrier point may be initiated by any CN node, but the first CN is responsible for generating it; if the CN that initiated barrier generation is not the first CN, the first CN is notified to generate it, and the generating CN and/or DN nodes add the barrier point to the physical log. Barrier parsing and storage is the basis of consistency: after the corresponding backup data node on the backup cluster receives a log through the storage device, it writes the log to its local disk, first parses the newly landed log, stores the parsed barrier points in a hash table, and records the largest barrier point received so far. The hash table is created before the log-parsing thread is created and released when the cluster is unloaded; it stores the parsed barrier points, which are deleted as the physical log is replayed. Barrier advancement is the key to consistency and can be completed jointly by the CN, DN, CMA, CMS, and ETCD, as shown in FIG. 6. Advancement of the barrier consistency point may comprise five cycles: first, the CMA queries the largest barrier value of the CN and DN and reports it to the CMS; the CMS takes the smallest of the collected values as the "value to be detected" and stores it in the ETCD; the CMA obtains the "value to be detected" from the ETCD, queries the CN and DN to confirm whether they hold that point, and reports the result to the CMS; the CMS, after collecting the results, stores the value in the ETCD as the "target value" if every shard confirms that the "value to be detected" exists, and discards it otherwise; finally, the CMA reads the "target value" in the ETCD and updates its local "target value". In one reporting round, the CMA performs three steps: reporting the barrier maximum, locally checking whether the "value to be detected" exists, and updating the "target value"; the CMS performs two: updating the "value to be detected" and updating the "target value". Barrier deletion is the end point of consistency and occurs during physical-log replay: when a barrier point is replayed, the replay position is updated and the barrier point is deleted from the hash table, completing the whole process from generation to deletion of a barrier.
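A compressed sketch of one advancement round follows, with the CMA/CMS/ETCD roles collapsed into a single function; all names are assumed for illustration.

```python
from typing import Callable, Dict, Optional

def advance_barrier(max_barrier: Dict[str, int],
                    shard_has_lsn: Callable[[str, int], bool]) -> Optional[int]:
    """One round: take the 'value to be detected' as the minimum of the reported
    per-shard barrier maxima, confirm every shard holds it, then publish it."""
    candidate = min(max_barrier.values())          # CMS: the 'value to be detected'
    if all(shard_has_lsn(s, candidate) for s in max_barrier):
        return candidate                           # stored in ETCD as the 'target value'
    return None                                    # discarded; retried next round
```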
In the embodiment of the application, the management node in the second database cluster maintains a target sequence number as global information: the target sequence number is the smallest sequence number among the log sequence numbers, and each backup data node among the backup data nodes has already obtained the physical log corresponding to it. Each backup data node performs log playback when the log sequence number of the physical log currently to be replayed is equal to the target sequence number, which ensures that the backup data nodes all replay up to the target sequence number; different backup data nodes are thus restored to the same position, guaranteeing the consistency of data among different backup data nodes in the distributed database.
404. The second data node performs log playback according to the first physical log.
In a possible implementation, the second data node obtains the target sequence number from the management node, and after determining that the log sequence number of the first physical log is equal to the target sequence number, the second data node performs log playback according to the first physical log.
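As a sketch under the assumption that logs are replayed in strict LSN order, the gating described above might look like this; the queue and attribute names are illustrative.

```python
def replay_up_to_target(pending_logs, target_lsn: int, replay) -> None:
    """Replay queued physical logs in LSN order, stopping once the LSN of the
    next log would exceed the cluster-wide target sequence number."""
    for log in sorted(pending_logs, key=lambda l: l.lsn):
        if log.lsn > target_lsn:
            break              # wait for the management node to advance the target
        replay(log)
```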
When the second database cluster needs to become the primary cluster, either because the first database cluster fails or because a user manually adjusts the roles, this can be accomplished through a failover flow or a switchover flow.
In the failover flow, failover is performed when the primary cluster is abnormal: the standby cluster is promoted to primary and continues to provide production service. A client on the control side issues a cluster failover instruction; the state of the storage devices is checked, and if it is normal, a switch with RPO = 0 can be performed; the synchronous replication relationship of the storage devices is interrupted, and write protection is removed from the standby cluster's storage device so that it becomes readable and writable; the standby cluster is stopped, and for the CN nodes of the standby cluster the redo logs in the storage device are applied over the local logs; the relevant parameter information stored in the standby cluster's etcd is updated; and the OM modifies the mode parameters of the CM, CN, and DN and starts the cluster in primary-cluster mode.
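The ordering of those steps can be summarized in the following orchestration sketch; every method name here is hypothetical, standing in for the operations the paragraph describes.

```python
def failover_to_standby(storage, standby) -> None:
    """Promote the standby cluster after a primary failure (illustrative only)."""
    if storage.replication_state() != "normal":
        raise RuntimeError("storage abnormal: RPO = 0 switch not guaranteed")
    storage.break_sync_replication()               # interrupt the sync relationship
    storage.remove_write_protection(side="standby")
    standby.stop()
    standby.apply_storage_redo_over_local_logs(nodes="CN")
    standby.update_etcd_parameters(role="primary")
    standby.set_mode(components=("CM", "CN", "DN"), mode="primary")
    standby.start()                                # now serves production traffic
```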
In the switchover flow, with both the primary and standby clusters operating normally, a planned cluster role switch is initiated by the user: the primary cluster is demoted to standby, and the standby cluster is promoted to primary to provide production service in place of the original primary. A client on the control side first issues a cluster switchover instruction to the primary cluster; the state of the storage devices is checked, and if it is normal, a switch with RPO = 0 can be performed; the primary cluster is shut down, and the OM modifies the mode parameters of the CM, CN, and DN and starts the cluster in standby-cluster mode; the state of the storage devices is checked and a master-slave switch is performed on the storage devices, so that the replication direction of the data becomes from the original standby cluster to the original primary cluster; the standby cluster is stopped, and for its CN nodes the redo logs in the storage device are applied over the local logs; and the OM modifies the mode parameters of the CM, CN, and DN and starts the original standby cluster in primary-cluster mode.
The embodiment of the application provides a data backup method applied to a first database cluster that includes a first data node. The method includes: the first data node acquires a first physical log, the first physical log including operation information on data in the first data node; the first data node writes the first physical log into a first storage device, and the first storage device transfers the first physical log to a second storage device so that a second data node in a second database cluster acquires the first physical log from the second storage device, where the first storage device is deployed in the first database cluster, the second storage device is deployed in the second database cluster, the first database cluster and the second database cluster are different database clusters, and the second data node serves as a backup node of the first data node. In this way, a data node in the first database cluster (the primary cluster) can rapidly synchronize its physical log to the second database cluster (the backup cluster) through the storage devices, bringing the system closer to a recovery point objective (RPO) of 0 while preserving the high performance of the service and improving data-transfer efficiency during data backup.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a first database cluster 700 according to an embodiment of the present application, where the first database cluster 700 may include a first data node 70, and the first data node 70 may include:
a log obtaining module 701, configured to obtain a first physical log, where the first physical log includes operation information on data in the first data node 70;
for a detailed description of the log obtaining module 701, reference may be made to the description of step 401 in the foregoing embodiment, which is not described herein again.
In a specific implementation process, the log obtaining module 701 may be implemented by the processor 202 and the memory unit 204 shown in fig. 2. More specifically, the processor 202 may execute the associated code in the memory unit 204 to obtain the first physical log.
A log transferring module 702, configured to write the first physical log into a first storage device, where the first storage device is configured to transfer the first physical log to a second storage device, so that a second data node 80 in a second database cluster obtains the first physical log from the second storage device, where the first storage device is deployed in the first database cluster, the second storage device is deployed in the second database cluster, the first database cluster and the second database cluster are different database clusters, and the second data node 80 is configured to serve as a backup node for the first data node 70.
For a detailed description of the log transfer module 702, reference may be made to the description of step 402 in the foregoing embodiment, which is not described herein again.
In a specific implementation, the log transfer module 702 may be implemented by the processor 202, the memory unit 204 and the communication interface 208 shown in fig. 2. More specifically, the communication module and the backup module in the memory unit 204 may be executed by the processor 202 to cause the communication interface 208 to write the first physical log to the first storage device.
In one possible implementation, the operation information indicates a modification operation, a write operation, and/or a delete operation on data in the data node.
In one possible implementation, the first data node 70 further comprises:
a transaction committing module 703, configured to perform transaction committing on the data in the first data node 70 according to the first physical log after the first physical log is transferred to the first storage device.
In one possible implementation, the first database cluster further includes a third data node; the log transfer module 702 is specifically configured to:
and when the third data node writes the second physical log into the first storage device, writing the first physical log into the first storage device in parallel.
In a possible implementation, the first storage device includes a storage space for storing a physical log, and the log transfer module 702 is specifically configured to:
determining a target storage space from the storage space based on the fact that the available storage space in the storage space is smaller than the storage space required by the first physical log, wherein the target storage space stores a target physical log;
replacing the target physical log in the target storage space with the first physical log based on the target physical log having been log replayed by the second data node 80.
In one possible implementation, the storage addresses of the storage space include a head address and a tail address, and the storage order of the storage space is configured from the storage space corresponding to the head address to the storage space corresponding to the tail address;
the log transfer module 702 is specifically configured to:
and determining the storage space corresponding to the head address from the storage space as the target storage space based on the occupied storage space corresponding to the tail address in the storage space.
In one possible implementation, the first storage device is a bare device.
In one possible implementation, the log transfer module 702 is further configured to:
after the first data node 70 performs transaction commit on the data in the first data node 70 according to the first physical log, the first data node 70 writes a second physical log containing commit information into the first storage device, where the first storage device is configured to transfer the second physical log to a second storage device, so that a management node in a second database cluster obtains the second physical log from the second storage device, and the commit information indicates that the first data node 70 has completed the transaction commit of the first physical log.
In one possible implementation, the first database cluster further includes a fourth data node, the fourth data node being configured to act as a backup node for the first data node 70, the fourth data node including: the log obtaining module is used for obtaining the first physical log from the first storage device;
and the log playback module is used for performing log playback according to the first physical log.
Referring to fig. 8, fig. 8 is a schematic structural diagram of a second database cluster 800 according to an embodiment of the present disclosure, where the second database cluster 800 may include a second data node 80, where the second data node 80 is used as a backup node for a first data node 70, the first data node 70 belongs to a first database cluster, the first database cluster and the second database cluster are different database clusters, a first storage device is disposed in the first database cluster, a second storage device is disposed in the second database cluster, and the second data node 80 includes:
a log obtaining module 801, configured to obtain a first physical log from the second storage device, where the first physical log is a physical log that is from the first data node 70 and is stored in the second storage device by being transferred through the first storage device, and the first physical log includes operation information on data in the first data node 70;
for a detailed description of the log obtaining module 801, reference may be made to the description of step 403 in the foregoing embodiment, which is not described herein again.
In a specific implementation process, the log obtaining module 801 may be implemented by the processor 202, the memory unit 204 and the communication interface 208 shown in fig. 2. More specifically, the communication module in the memory unit 204 may be executed by the processor 202 to enable the communication interface 208 to retrieve the first physical log from the second storage device.
A log playback module 802, configured to perform log playback according to the first physical log.
For a detailed description of the log playback module 802, reference may be made to the description of step 404 in the foregoing embodiment, which is not described herein again.
In one possible implementation, the operation information indicates a modification operation, a write operation, and/or a delete operation on data in the data node.
In one possible implementation, the second database cluster further includes a fifth data node; the log obtaining module is specifically configured to:
when the fifth data node obtains the physical log from the second storage device, the second data node 80 obtains the first physical log from the second storage device in parallel.
In one possible implementation, the first database cluster includes a plurality of data nodes including the first data node 70, and the second database cluster further includes a management node, the management node including:
a commit information obtaining module, configured to obtain, from the second storage device, commit information from the first database cluster, where the commit information includes the log sequence number of the physical log for which each of the plurality of data nodes most recently completed a transaction commit, and the target sequence number is the smallest sequence number among the plurality of log sequence numbers;
the log playback module is specifically configured to acquire the target sequence number from the management node;
and after determining that the log sequence number of the first physical log is equal to the target sequence number, perform log playback according to the first physical log.
In one possible implementation, the physical logs of the same sequence number on different master data nodes in the plurality of master data nodes correspond to the same task, and each master data node in the plurality of master data nodes commits transactions for the physical logs in ascending order of sequence number.
An embodiment of the present application further provides a computing device, where the computing device may be a node in the first database cluster or a node in the second database cluster described in the foregoing embodiments. The computing device may be a server or a terminal, etc. The aforementioned database management node and/or data storage node may be deployed in the computing device. As shown in fig. 9, the computing device 90 includes: a processor 901, a communication interface 902 and a memory 903. The processor 901, the communication interface 902 and the memory 903 are connected to each other via a bus 904.
The memory 903 is used to store computer instructions. When the processor 901 executes the computer instructions in the memory 903, it can implement the functions of those instructions. For example, when the processor 901 executes the computer instructions in the memory 903, the data backup method provided by the embodiments of the present application can be implemented. For another example, when the database management node is deployed in the computing device, the processor 901, when executing the computer instructions in the memory 903, can implement the functions of the first data node and the fourth data node in the data backup method provided by the embodiments of the present application. For another example, when the data storage node is deployed in the computing device, the processor 901, when executing the computer instructions in the memory 903, can implement the function of the second data node in the data backup method provided by the embodiments of the present application.
In fig. 9, the bus 904 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 9, but this does not indicate only one bus or one type of bus.
In fig. 9, the processor 901 may be a hardware chip, which may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. Alternatively, the processor 901 may be a general-purpose processor, such as a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP.
In fig. 9, the memory 903 may include a volatile memory, such as a random-access memory (RAM). It may also include a non-volatile memory, such as a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD). It may also include combinations of the above types of memory.
The embodiment of the present application further provides a storage medium, where the storage medium is a nonvolatile computer-readable storage medium, and instructions in the storage medium are used to implement the data backup method provided in the embodiment of the present application.
The embodiment of the application also provides a computer program product containing instructions, and the instructions included in the computer program product are used for realizing the data backup method provided by the embodiment of the application. The computer program product may be stored on the storage medium.
The embodiment of the present application further provides a chip, where the chip includes a programmable logic circuit and/or a program instruction, and when the chip runs, the chip is used to implement the data backup method provided in the embodiment of the present application.
It should be noted that the above-described embodiments of the apparatus are merely illustrative, where the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiments of the apparatus provided in the present application, the connection relationship between the modules indicates that there is a communication connection therebetween, and may be implemented as one or more communication buses or signal lines.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented by software plus necessary general-purpose hardware, and certainly can also be implemented by special-purpose hardware including application-specific integrated circuits, special-purpose CPUs, special-purpose memories, special-purpose components, and the like. Generally, functions performed by computer programs can easily be implemented by corresponding hardware, and the specific hardware structures used to implement the same function can vary: analog circuits, digital circuits, dedicated circuits, and so on. For the present application, however, a software implementation is usually preferable. Based on such understanding, the technical solutions of the present application may, in essence, be embodied in the form of a software product stored in a readable storage medium, such as a floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc of a computer, and the product includes several instructions for enabling a computer device (which may be a personal computer, a training device, or a network device) to execute the methods of the embodiments of the present application.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, it may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of the application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a training device or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid-state drive (SSD)), among others.

Claims (28)

1. A method for data backup, the method being applied to a first database cluster, the first database cluster including a first data node, the method comprising:
the first data node acquires a first physical log, wherein the first physical log comprises operation information of data in the first data node;
the first data node writes the first physical log into a first storage device, the first storage device is used for transmitting the first physical log to a second storage device, so that a second data node in a second database cluster acquires the first physical log from the second storage device, wherein the first storage device is deployed in the first database cluster, the second storage device is deployed in the second database cluster, the first database cluster and the second database cluster are different database clusters, and the second data node is used as a backup node of the first data node.
2. The method according to claim 1, wherein the operation information indicates a modification operation, a write operation and/or a delete operation on data in the data node.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
after the first data node writes the first physical log into the first storage device, transaction submission is carried out on data in the first data node according to the first physical log.
4. The method according to any one of claims 1 to 3, wherein the first database cluster further comprises a third data node; and the first data node writing the first physical log into a first storage device comprises:
and when the third data node writes a second physical log into the first storage device, the first data node writes the first physical log into the first storage device in parallel.
5. The method of claim 4, wherein the first storage device comprises a first storage space for storing physical logs of the first data node and a second storage space for storing physical logs of the third data node, the first storage space and the second storage space being different;
when the third data node writes the second physical log into the first storage device, the first data node writes the first physical log into the first storage device in parallel, including:
and when the third data node writes a second physical log into the second storage space, the first data node writes the first physical log into the first storage space in parallel.
6. The method of any of claims 1 to 5, wherein the first storage device comprises a first storage space for storing a physical log of the first data node, and wherein writing the first physical log to the first storage device comprises:
determining a target storage space from the first storage space based on the available storage space in the first storage space being smaller than the storage space required by the first physical log, wherein the target storage space stores a target physical log;
replacing the target physical log in the target storage space with the first physical log based on the target physical log having been subjected to log replay by the second data node.
7. The method according to claim 6, wherein the storage addresses of the first storage space comprise a head address and a tail address, and the storage order of the first storage space is from the storage space corresponding to the head address to the storage space corresponding to the tail address;
the determining a target storage space from the first storage space based on the available storage space in the first storage space being less than the storage space required by the first physical log comprises:
and determining the storage space corresponding to the head address from the first storage space as the target storage space based on the occupied storage space corresponding to the tail address in the first storage space.
8. The method of any of claims 1 to 7, further comprising:
after the first data node commits the data in the first data node according to the first physical log in a transaction manner, the first data node writes a second physical log containing commit information into the first storage device, the first storage device is used for transmitting the second physical log to a second storage device, so that a management node in a second database cluster can obtain the second physical log from the second storage device, and the commit information indicates that the first data node has completed the transaction commit of the first physical log.
9. The method of any of claims 1 to 8, wherein the first database cluster further comprises a fourth data node configured to act as a backup node for the first data node, the method further comprising:
the fourth data node acquires the first physical log from the first storage device;
and the fourth data node performs log playback according to the first physical log.
10. A data backup method, applied to a second database cluster, wherein the second database cluster comprises a second data node, the second data node is used as a backup node of a first data node, the first data node belongs to a first database cluster, the first database cluster and the second database cluster are different database clusters, a first storage device is deployed in the first database cluster, and a second storage device is deployed in the second database cluster, and the method comprises:
the second data node acquires a first physical log from the second storage device, wherein the first physical log is a physical log which is transmitted from the first data node and is stored in the second storage device through the first storage device, and the first physical log comprises operation information of data in the first data node;
and the second data node performs log playback according to the first physical log.
11. The method according to claim 10, wherein the operation information indicates a modification operation, a write operation and/or a delete operation on data in the data node.
12. The method of claim 10 or 11, wherein the second database cluster further comprises a fifth data node; the second data node obtains a first physical log from the second storage device, and the method comprises the following steps:
when the fifth data node acquires the physical logs from the second storage device, the second data node acquires the first physical logs from the second storage device in parallel.
13. The method of claim 12, wherein the second storage device comprises a third storage space for storing the physical log of the second data node and a fourth storage space for storing the physical log of the fifth data node, and wherein the third storage space and the fourth storage space are different;
when the fifth data node acquires the physical logs from the second storage device, the second data node acquires the first physical logs from the second storage device in parallel, and the method includes:
and when the fifth data node acquires the physical logs from the fourth storage space, the second data node acquires the first physical logs from the third storage space in parallel.
14. The method of any of claims 10 to 13, wherein the first database cluster includes a plurality of primary data nodes including the first data node, wherein the second database cluster includes a plurality of backup data nodes including the second data node, wherein the second database cluster further includes a management node, and wherein the method further comprises:
the management node acquires, from the second storage device, commit information from the first database cluster, wherein the commit information comprises the log sequence number of the physical log most recently committed by each master data node of the plurality of master data nodes, a target sequence number is the smallest sequence number among the plurality of log sequence numbers, and each backup data node of the plurality of backup data nodes has acquired the physical log corresponding to the target sequence number;
the second data node performs log playback according to the first physical log, and the method comprises the following steps:
the second data node acquires the target sequence number from the management node;
and after determining that the log sequence number of the first physical log is equal to the target sequence number, the second data node performs log playback according to the first physical log.
15. The method of claim 14, wherein physical logs of the same sequence number on different master data nodes of the plurality of master data nodes correspond to the same task, and each master data node of the plurality of master data nodes commits transactions for the physical logs in ascending order of sequence number.
16. A first database cluster, the first database cluster comprising a first data node, the first data node comprising:
the log acquisition module is used for acquiring a first physical log, wherein the first physical log comprises operation information of data in the first data node;
a log transfer module, configured to write the first physical log into a first storage device, where the first storage device is configured to transfer the first physical log to a second storage device, so that a second data node in a second database cluster obtains the first physical log from the second storage device, where the first storage device is deployed in the first database cluster, the second storage device is deployed in the second database cluster, the first database cluster and the second database cluster are different database clusters, and the second data node is configured to serve as a backup node of the first data node.
17. The first database cluster of claim 16, wherein the first data node further comprises:
and the transaction submitting module is used for submitting the transaction of the data in the first data node according to the first physical log after the first physical log is transmitted to the first storage device.
18. A first database cluster according to claim 16 or 17, characterized in that it further comprises a third data node; the log transfer module is specifically configured to:
and when the third data node writes the second physical log into the first storage device, writing the first physical log into the first storage device in parallel.
19. The first database cluster of claim 18, wherein the first storage device comprises a first storage space for storing physical logs of the first data node and a second storage space for storing physical logs of the third data node, the first storage space and the second storage space being different;
the log transfer module is specifically configured to:
and when the third data node writes a second physical log into the second storage space, writing the first physical log into the first storage space in parallel.
20. The first database cluster according to any of claims 16 to 19, wherein the first storage device comprises a first storage space for storing a physical log of the first data node, and the log transfer module is specifically configured to:
determining a target storage space from the first storage space based on the available storage space in the first storage space being smaller than the storage space required by the first physical log, wherein the target storage space stores a target physical log;
replacing the target physical log in the target storage space with the first physical log based on the target physical log having been log replayed by the second data node.
21. The first database cluster of claim 20, wherein the storage addresses of the first storage space comprise a head address and a tail address, and the storage order of the first storage space is from the storage space corresponding to the head address to the storage space corresponding to the tail address;
the log transfer module is specifically configured to:
and determining the storage space corresponding to the head address from the first storage space as the target storage space based on the occupied storage space corresponding to the tail address in the first storage space.
22. A second database cluster, wherein the second database cluster includes a second data node, the second data node is configured to serve as a backup node for a first data node, the first data node belongs to a first database cluster, the first database cluster and the second database cluster are different database clusters, a first storage device is deployed in the first database cluster, a second storage device is deployed in the second database cluster, and the second data node includes:
a log obtaining module, configured to obtain a first physical log from the second storage device, where the first physical log is a physical log that is from the first data node and is stored in the second storage device by being transferred through the first storage device, and the first physical log includes operation information on data in the first data node;
and the log playback module is used for performing log playback according to the first physical log.
23. The second database cluster of claim 22, wherein the second database cluster further comprises a fifth data node; and the log obtaining module is specifically configured to:
when the fifth data node acquires a physical log from the second storage device, acquire the first physical log from the second storage device in parallel.
24. The second database cluster according to claim 23, wherein the second storage device comprises a third storage space for storing physical logs of the second data node and a fourth storage space for storing physical logs of the fifth data node, the third storage space and the fourth storage space being different;
the log obtaining module is specifically configured to:
and when the fifth data node acquires a physical log from the fourth storage space, acquire the first physical log from the third storage space in parallel.
25. The second database cluster of any of claims 22 to 24, wherein the first database cluster comprises a plurality of primary data nodes including the first data node, wherein the second database cluster comprises a plurality of backup data nodes including the second data node, wherein the second database cluster further comprises a management node, wherein the management node comprises:
a commit information obtaining module, configured to obtain, from the second storage device, commit information from the first database cluster, wherein the commit information comprises the log sequence number of the physical log most recently committed by each master data node of the plurality of master data nodes, a target sequence number is the smallest sequence number among the plurality of log sequence numbers, and each backup data node of the plurality of backup data nodes has acquired the physical log corresponding to the target sequence number;
the log playback module is specifically configured to acquire the target sequence number from the management node;
and after determining that the log sequence number of the first physical log is equal to the target sequence number, perform log playback according to the first physical log.
26. The second database cluster of claim 25, wherein the physical logs with the same sequence number on different master data nodes of the plurality of master data nodes correspond to the same task, and each master data node of the plurality of master data nodes commits transactions for the physical logs in ascending order of sequence number.
27. A computer readable storage medium comprising computer readable instructions which, when run on a computer device, cause the computer device to perform the method of any of claims 1 to 15.
28. A computer program product comprising computer readable instructions which, when run on a computer device, cause the computer device to perform the method of any one of claims 1 to 15.
CN202111117550.8A 2021-09-23 2021-09-23 Data backup method and database cluster Pending CN115858236A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111117550.8A CN115858236A (en) 2021-09-23 2021-09-23 Data backup method and database cluster
PCT/CN2022/120709 WO2023046042A1 (en) 2021-09-23 2022-09-23 Data backup method and database cluster

Publications (1)

Publication Number Publication Date
CN115858236A (en) 2023-03-28

Family

ID=85652386


Country Status (2)

Country Link
CN (1) CN115858236A (en)
WO (1) WO2023046042A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117171266B (en) * 2023-08-28 2024-05-14 北京逐风科技有限公司 Data synchronization method, device, equipment and storage medium
CN116955015B (en) * 2023-09-19 2024-01-23 恒生电子股份有限公司 Data backup system and method based on data storage service
CN117667515A (en) * 2023-12-08 2024-03-08 广州鼎甲计算机科技有限公司 Backup management method and device for main and standby clusters, computer equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10936545B1 (en) * 2013-12-20 2021-03-02 EMC IP Holding Company LLC Automatic detection and backup of primary database instance in database cluster
CN106570007A (en) * 2015-10-09 2017-04-19 阿里巴巴集团控股有限公司 Method and equipment for data synchronization of distributed caching system
CN110377577B (en) * 2018-04-11 2022-03-04 北京嘀嘀无限科技发展有限公司 Data synchronization method, device, system and computer readable storage medium
CN112905390A (en) * 2021-03-31 2021-06-04 恒生电子股份有限公司 Log data backup method, device, equipment and storage medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116760693A (en) * 2023-08-18 2023-09-15 天津南大通用数据技术股份有限公司 Method and system for switching main and standby nodes of database
CN116760693B (en) * 2023-08-18 2023-10-27 天津南大通用数据技术股份有限公司 Method and system for switching main and standby nodes of database
CN117194566A (en) * 2023-08-21 2023-12-08 泽拓科技(深圳)有限责任公司 Multi-storage engine data copying method, system and computer equipment
CN117194566B (en) * 2023-08-21 2024-04-19 泽拓科技(深圳)有限责任公司 Multi-storage engine data copying method, system and computer equipment
CN117033087A (en) * 2023-10-10 2023-11-10 武汉吧哒科技股份有限公司 Data processing method, device, storage medium and management server
CN117033087B (en) * 2023-10-10 2024-01-19 武汉吧哒科技股份有限公司 Data processing method, device, storage medium and management server

Also Published As

Publication number Publication date
WO2023046042A1 (en) 2023-03-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination