WO2023046042A1 - Data backup method and database cluster - Google Patents
- Publication number
- WO2023046042A1 (PCT/CN2022/120709)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- log
- data node
- data
- storage device
- physical log
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
Definitions
- the present application relates to the field of databases, and more specifically, relates to a data backup method and a database cluster.
- Data disaster recovery refers to establishing an off-site data system that holds an available copy of key local application data.
- The system keeps at least one available copy of business-critical data off-site.
- The main technologies used are data backup and data replication, and data disaster recovery processing is in effect off-site data replication.
- The remote data can be a full real-time copy of the local production data (synchronous replication), or it can lag slightly behind the local data (asynchronous replication).
- Take the data disaster recovery of an Oracle database as an example.
- The Oracle database deploys a primary storage library and a backup storage library in two computer rooms in the same city (or in different places).
- The backup storage library is the backup of the primary storage library, and data protection technology is used to synchronize data between the primary and backup storage libraries.
- The basic process of data synchronization is as follows: when the primary storage library generates a physical log, the log is transmitted to the backup storage library by synchronous or asynchronous replication through a pre-configured transmission method, thereby replicating data between the primary and backup storage libraries.
- The physical logs are transmitted over a dedicated network line.
- The dedicated network line makes the network facilities under this architecture costly, and transmission over a dedicated network line is inefficient.
- the embodiment of the present application provides a data backup method, which can improve data transmission efficiency during data backup.
- The present application provides a data backup method applied to a first database cluster that includes a first data node. The method includes: the first data node obtains a first physical log, where the first physical log includes operation information on data in the first data node; and the first data node writes the first physical log into a first storage device, where the first storage device is used to transfer the first physical log to a second storage device, so that a second data node in a second database cluster obtains the first physical log from the second storage device. The first storage device is deployed in the first database cluster, the second storage device is deployed in the second database cluster, the first database cluster and the second database cluster are different database clusters, and the second data node is used as a backup node of the first data node.
- synchronous replication of data may be performed between the first storage device and the second storage device, and the first storage device and the second storage device may be storage devices capable of remote and parallel data transmission.
- The data nodes in the first database cluster can quickly synchronize physical logs to the second database cluster (the standby cluster) through the storage devices, which brings the recovery point objective (RPO) closer to 0 while preserving high business performance and improving data transmission efficiency during data backup.
- The first physical log includes operation information on the data in the first data node, and the operation information indicates a modification operation, a write operation, and/or a deletion operation on the data in the first data node.
- the first physical log can be a Redo log, also known as an XLog, which can record the physical modification of the data page (or describe it as a change in data), and can be used to restore the physical data page after the transaction is committed.
- After transferring the first physical log to the first storage device, the first data node performs a transaction commit on the data in the first data node according to the first physical log.
- The transaction commit ensures the persistence of the operation information recorded in the first physical log. If the first data node transferred the first physical log to the first storage device only after completing the transaction commit, the first data node might fail before the commit completed, in which case the first physical log could not be delivered to the first storage device and the standby data node could not obtain it for log playback. In the embodiment of this application, the first data node transfers the first physical log to the first storage device before it commits the transaction on the data in the first data node. Once the first physical log has been successfully written into the first storage device, it can be considered to have been copied to the standby data node. Even if the first data node fails before the transaction commit completes, the first physical log can still be transferred to the standby data node.
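The write-then-commit ordering described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the `StorageDevice` and `DataNode` classes, their methods, and the `crash_before_commit` flag are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class StorageDevice:
    """Hypothetical stand-in for the first storage device."""
    logs: list = field(default_factory=list)

    def write(self, record: bytes) -> bool:
        self.logs.append(record)
        return True  # a real device would report success only after persisting


@dataclass
class DataNode:
    """Hypothetical primary data node."""
    storage: StorageDevice
    committed: list = field(default_factory=list)

    def apply(self, record: bytes, crash_before_commit: bool = False) -> None:
        # Step 1: hand the physical log to the storage device first.
        if not self.storage.write(record):
            raise OSError("log write failed; do not commit")
        # Step 2: a failure at this point no longer loses the log,
        # because the storage device already holds it for the standby.
        if crash_before_commit:
            raise RuntimeError("simulated node failure before commit")
        # Step 3: commit locally only after the write succeeded.
        self.committed.append(record)
```

With this ordering, a record that was written but never committed on the primary is still present on the storage device, which is the property the paragraph above relies on.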
- The first data node may write the first physical log into the first storage device, and the first storage device may include a storage area (a first storage area) allocated for the first data node. For example, the first data node belongs to the first shard in the first database cluster.
- The first storage device may include a shared volume for the first shard, and each data node in the first shard can share the shared volume.
- The first storage device may include a first storage space for storing the physical log of the first data node and a second storage space for storing the physical log of the third data node, and the first storage space is different from the second storage space.
- The first database cluster further includes a third data node, and while the third data node writes the second physical log into the first storage device, the first data node may write the first physical log into the first storage device in parallel.
- The third data node may be the master node in the first database cluster. Specifically, when the third data node writes the second physical log into the second storage space, the first data node writes the first physical log into the first storage space in parallel.
- Parallelism here means that the action of the first data node writing the first physical log into the first storage space and the action of the third data node writing its physical log into the second storage space happen at the same time.
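As a rough illustration of two data nodes writing to separate storage spaces at the same time, the sketch below uses two threads and two in-memory buffers. The buffers, the `write_log` helper, and the record contents are hypothetical; a real deployment would write to distinct regions of the storage device.

```python
import threading


def write_log(space: bytearray, record: bytes) -> None:
    # Each node writes only into its own storage space, so no lock is needed.
    space[: len(record)] = record


# Hypothetical storage spaces: one per data node.
first_space = bytearray(16)   # for the first data node
second_space = bytearray(16)  # for the third data node

t1 = threading.Thread(target=write_log, args=(first_space, b"log-from-node-1"))
t3 = threading.Thread(target=write_log, args=(second_space, b"log-from-node-3"))
t1.start()
t3.start()
t1.join()
t3.join()
```

Because the two spaces do not overlap, neither writer has to wait for the other.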
- The first storage device includes a first storage space for storing the physical log of the first data node. When the storage space available in the first storage space is less than the storage space required by the first physical log, the first data node may determine a target storage space from the first storage space, where the target storage space stores a target physical log, and, based on the target physical log having been played back by the second data node, replace the target physical log in the target storage space with the first physical log.
- When the first storage space is insufficient, occupied storage space in the first storage space is emptied and reused. To prevent a physical log from being cleared before the standby data node has received it, the physical log cleared by the first data node must be one that the standby data node has already played back (the standby data node can feed this information back to the first data node after it completes the log playback).
- The storage address of the first storage space includes a head address and a tail address, and the storage order of the storage space runs from the storage space corresponding to the head address to the storage space corresponding to the tail address.
- Determining the target storage space from the first storage space, based on the available storage space in the first storage space being less than the storage space required by the first physical log, includes: based on the storage space corresponding to the tail address being occupied, determining the storage space corresponding to the head address in the first storage space as the target storage space.
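The head-to-tail reuse rule can be sketched as a small ring of fixed slots. This is only an illustration under simplifying assumptions: slots are equal-sized, each holds one log, and the `LogSpace` class, its methods, and the replay acknowledgement are hypothetical names.

```python
class LogSpace:
    """Hypothetical first storage space, modelled as a ring of slots."""

    def __init__(self, slots: int) -> None:
        self.slots = [None] * slots  # each slot holds (lsn, record) or None
        self.next_slot = 0           # next write position, starting at the head
        self.replayed_lsn = 0        # highest LSN the standby reported as played back

    def ack_replayed(self, lsn: int) -> None:
        # Feedback from the standby data node after it completes playback.
        self.replayed_lsn = max(self.replayed_lsn, lsn)

    def write(self, lsn: int, record: bytes) -> bool:
        old = self.slots[self.next_slot]
        # Reuse an occupied slot only if its log was already played back.
        if old is not None and old[0] > self.replayed_lsn:
            return False
        self.slots[self.next_slot] = (lsn, record)
        # After the tail slot, wrap around to the head.
        self.next_slot = (self.next_slot + 1) % len(self.slots)
        return True
```

A writer that gets `False` would have to wait for further playback feedback before retrying; how that wait is handled is outside this sketch.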
- the first storage device is a raw device.
- A raw device may also be called a raw partition.
- the first data node may write the first physical log into the first storage device based on direct I/O, which improves read and write performance.
- The first data node may also write a second physical log that includes commit information into the first storage device. The first storage device is used to transfer the second physical log to the second storage device, so that the management node in the second database cluster can obtain the second physical log from the second storage device. The commit information indicates that the first data node has completed the transaction commit of the first physical log.
- the commit information can be used as a reference for the global consistency point when the second database cluster performs log playback.
- The commit information may include a transaction commit number, which may be used to identify a committed database transaction (also simply called a transaction).
- A transaction is a logical unit in which a data storage node performs database operations, and it consists of a sequence of database operations.
- A transaction in the committed state has been executed successfully, and the data involved in the transaction has been written to the data storage node.
- The first database cluster further includes a fourth data node that is used as a backup node for the first data node, and the method further includes: the fourth data node acquires the first physical log from the first storage device; and the fourth data node performs log playback according to the first physical log.
- The fourth data node may first read the control information in the header of the first storage device and, after verification, compare the log write progress on the storage device with its local physical log. If the storage device holds a newer physical log, the fourth data node reads that physical log, copies it locally, and plays it back. If there is no data to read, it waits in a loop.
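One polling round of the backup node can be sketched as below. The header layout (a `magic` marker and a `written_lsn` counter), the `SharedLogDevice` class, and the assumption that log sequence numbers are consecutive integers starting at 1 are all illustrative choices, not details given in the application.

```python
class SharedLogDevice:
    """Hypothetical storage device: a header with write progress plus records."""

    def __init__(self) -> None:
        self.header = {"magic": "PLOG", "written_lsn": 0}
        self.records = {}

    def append(self, lsn: int, record: bytes) -> None:
        self.records[lsn] = record
        self.header["written_lsn"] = lsn


def poll_once(device: SharedLogDevice, local_log: list, replayed: list) -> int:
    """Copy and play back any logs newer than the local copy; return the count."""
    header = device.header
    if header.get("magic") != "PLOG":       # verify the control information
        raise ValueError("header verification failed")
    local_lsn = len(local_log)              # local progress (LSNs start at 1)
    copied = 0
    for lsn in range(local_lsn + 1, header["written_lsn"] + 1):
        record = device.records[lsn]
        local_log.append(record)            # copy to local storage
        replayed.append(record)             # play the log back
        copied += 1
    return copied                           # 0 means: nothing new, poll again later
```

A caller would invoke `poll_once` repeatedly, sleeping briefly whenever it returns 0.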
- The present application provides a data backup method applied to a second database cluster. The second database cluster includes a second data node, and the second data node is used as a backup node of a first data node. The first data node belongs to a first database cluster, the first database cluster and the second database cluster are different database clusters, a first storage device is deployed in the first database cluster, and a second storage device is deployed in the second database cluster. The method includes: the second data node acquires a first physical log from the second storage device, where the first physical log is a physical log that comes from the first data node and is stored in the second storage device via the first storage device, and the first physical log includes operation information on the data in the first data node; and the second data node performs log playback according to the first physical log.
- The data nodes in the first database cluster can quickly synchronize physical logs to the second database cluster (the standby cluster) through the storage devices, which brings the recovery point objective (RPO) closer to 0 while preserving high business performance and improving data transmission efficiency during data backup.
- the operation information indicates a modification operation, a write operation, and/or a deletion operation on the data in the data node.
- The second storage device may include a storage area (a third storage area) for the second data node. Take the second data node belonging to the first shard in the second database cluster as an example:
- The second storage device may include a shared volume allocated for the first shard, and each data node in the first shard may share the shared volume.
- The second storage device may include a third storage space for storing the physical log of the second data node and a fourth storage space for storing the physical log of the fifth data node, and the third storage space is different from the fourth storage space.
- While other data nodes read physical logs from the second storage device, the second data node can acquire the first physical log from the second storage device in parallel.
- The management node in the second database cluster can maintain a piece of global information (which can be called a barrier point). The global information includes the log sequence number of a physical log that each standby data node has obtained, and this log sequence number is the smallest among the largest physical log sequence numbers for which each master node has currently committed transactions.
- For example, the management node obtains the following: the current transaction commit progress of master node 1 is 1, 2; that of master node 2 is 1, 2; that of master node 3 is 1, 2, 3; and that of master node 4 is 1, 2, 3, 4. The largest committed sequence numbers are therefore 2, 2, 3, and 4, and 2 is the smallest physical log sequence number among the largest physical log sequence numbers for which each master node has currently committed transactions.
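The rule "smallest of the largest committed sequence numbers" can be written in a few lines. The node names and progress values below are made up for illustration and are not taken from the application.

```python
from typing import Dict, List


def barrier_point(progress: Dict[str, List[int]]) -> int:
    """Return the smallest of each master node's largest committed LSN."""
    return min(max(lsns) for lsns in progress.values())


# Hypothetical commit progress reported by three master nodes.
progress = {
    "master-a": [1, 2, 3, 4, 5],
    "master-b": [1, 2, 3],
    "master-c": [1, 2, 3, 4],
}
barrier = barrier_point(progress)  # master-b is the slowest, so the barrier is 3
```

Taking the minimum guarantees that every master node has already committed the transaction behind the chosen sequence number.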
- Physical logs with the same sequence number on different primary data nodes among the multiple primary data nodes correspond to the same task, and each primary data node commits transactions for its physical logs in ascending order of sequence number; that is, the sequence number can indicate the transaction commit progress of the master node.
- The management node may be realized through cooperation among modules such as CMA, CMS, and ETCD in the second database cluster.
- The coordinating node in the first database cluster can obtain the current transaction commit progress of each primary data node (that is, the log sequence number of the physical log whose transaction commit has completed, which can be called a barrier point), and transmit the commit information containing this progress to the second database cluster in the form of a physical log (this step can also be completed by the primary data node).
- The standby data node obtains the physical log carrying the commit information from the second storage device, writes the log to its local disk, and parses the newly persisted physical log. The parsed barrier point is stored in a hash table, and the largest barrier point currently received is recorded, and the largest barrier point is the primary data node
- The function of the management node can be realized by the cooperation of CMA, CMS, and ETCD.
- The CMA queries the maximum barrier value of the CN and the DN and reports it to the CMS. The CMS takes the minimum of the largest barrier points on the standby data nodes as the "candidate serial number" (also called the value to be detected) and stores it in ETCD.
- The CMA obtains the "value to be detected" from ETCD, queries the DN, and confirms whether the value exists on the DN.
- The CMS then performs the following judgment: if the physical log corresponding to the "value to be detected" exists on every standby data node, the value can be stored in ETCD as the "target serial number" (or simply the target value); otherwise it is discarded. The CMA reads the "target value" from ETCD and updates its local "target value".
- In one reporting round, the CMA needs to report the maximum barrier value, query locally whether the "value to be detected" exists, and update the "target value"; the CMS needs to update the "value to be detected" and the "target value".
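The candidate-then-target negotiation described above can be sketched with plain functions and a dictionary standing in for ETCD. Everything here is an assumption for illustration: the function names, the dictionary keys, and the way each standby's received barrier points are represented as a set.

```python
from typing import Dict, Optional, Set


def propose_candidate(max_barrier_per_standby: Dict[str, int]) -> int:
    """CMS side: the minimum of the largest barrier points becomes the candidate."""
    return min(max_barrier_per_standby.values())


def confirm_target(candidate: int,
                   barriers_on_standby: Dict[str, Set[int]],
                   current_target: Optional[int]) -> Optional[int]:
    """CMS side: promote the candidate only if every standby holds it."""
    if all(candidate in received for received in barriers_on_standby.values()):
        return candidate
    return current_target  # otherwise discard the candidate, keep the old target


etcd = {}  # hypothetical stand-in for the ETCD key-value store
max_barrier = {"standby-1": 7, "standby-2": 5, "standby-3": 6}
etcd["candidate"] = propose_candidate(max_barrier)

received = {"standby-1": {3, 5, 7}, "standby-2": {3, 5}, "standby-3": {3, 5, 6}}
etcd["target"] = confirm_target(etcd["candidate"], received, etcd.get("target"))
```

The two-step check matters because a barrier point that is below a standby's maximum is not necessarily present on that standby; the confirmation step verifies presence before the value is used as a replay target.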
- Barrier deletion is the end point of consistency. Barrier deletion occurs during physical log playback: when playback reaches the barrier point, the playback position is updated and the barrier point is deleted from the hash table, which completes the life cycle of a barrier from generation to deletion.
- In this way, the management node in the second database cluster maintains a target sequence number as global information. The target sequence number is the smallest sequence number among the plurality of log sequence numbers, and each of the plurality of standby data nodes has obtained the physical log corresponding to the target sequence number. Each standby data node performs log playback only when the log sequence number of the physical log to be played back is equal to the target sequence number. This ensures that every standby data node plays back up to the target sequence number, so that different standby data nodes are restored to the same position and data consistency between different standby data nodes in the distributed database is guaranteed.
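A minimal sketch of target-bounded playback follows. It reads "played back until the target serial number" as "play back every log whose sequence number does not exceed the target", which is an interpretation rather than something the source states outright; the `replay_up_to` function and the tuple layout are hypothetical.

```python
from typing import List, Tuple


def replay_up_to(pending: List[Tuple[int, str]],
                 target_lsn: int,
                 applied: List[str]) -> List[Tuple[int, str]]:
    """Play back logs with LSN <= target_lsn; return the logs left pending."""
    remaining = []
    for lsn, record in sorted(pending):
        if lsn <= target_lsn:
            applied.append(record)            # play back in ascending LSN order
        else:
            remaining.append((lsn, record))   # hold until the target advances
    return remaining


# Two hypothetical standby nodes that have received different amounts of log.
applied_a: List[str] = []
applied_b: List[str] = []
left_a = replay_up_to([(1, "a"), (2, "b"), (3, "c"), (4, "d")], 3, applied_a)
left_b = replay_up_to([(1, "a"), (2, "b"), (3, "c")], 3, applied_b)
```

Both nodes stop at the same position even though one of them already holds a later log, which is the consistency property the paragraph describes.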
- the present application provides a first database cluster, where the first database cluster includes a first data node, and the first data node includes:
- a log acquisition module configured to acquire a first physical log, where the first physical log includes operation information on data in the first data node
- a log transfer module configured to write the first physical log into a first storage device, where the first storage device is configured to transfer the first physical log to a second storage device, so that a second data node in a second database cluster obtains the first physical log from the second storage device, wherein the first storage device is deployed in the first database cluster, the second storage device is deployed in the second database cluster, the first database cluster and the second database cluster are different database clusters, and the second data node is used as a backup node for the first data node.
- the operation information indicates a modification operation, a write operation, and/or a deletion operation on the data in the data node.
- the first data node further includes:
- a transaction commit module configured to perform transaction commit on the data in the first data node according to the first physical log after transferring the first physical log to the first storage device.
- the first database cluster further includes a third data node; the log transfer module is specifically configured to:
- the first storage device includes a first storage space for storing the physical log of the first data node and a second storage space for storing the physical log of the third data node , the first storage space is different from the second storage space;
- the log transfer module is specifically used for:
- When the third data node writes the second physical log into the second storage space, the first data node writes the first physical log into the first storage space in parallel.
- the first storage device includes a first storage space for storing the physical log of the first data node, and the log transfer module is specifically configured to:
- the target physical log in the target storage space is replaced by the first physical log.
- the storage address of the first storage space includes a head address and a tail address
- the storage order of the storage space is from the storage space corresponding to the head address to the storage space corresponding to the tail address storage
- the log transfer module is specifically used for:
- the first storage device is a raw device.
- the log transfer module is also used to:
- After the first data node commits the data in the first data node according to the first physical log, the first data node writes the second physical log containing commit information into the first storage device. The first storage device is used to transfer the second physical log to the second storage device, so that the management node in the second database cluster obtains the second physical log from the second storage device, and the commit information indicates that the first data node has completed the transaction commit of the first physical log.
- The first database cluster further includes a fourth data node, the fourth data node is used as a backup node for the first data node, and the fourth data node includes: a log acquisition module configured to obtain the first physical log from the first storage device;
- a log playback module configured to perform log playback according to the first physical log.
- the present application provides a second database cluster
- the second database cluster includes a second data node
- the second data node is used as a backup node for the first data node
- the first data node belongs to the first database cluster, and the first database cluster and the second database cluster are different database clusters
- the first storage device is deployed in the first database cluster
- the second storage device is deployed in the second database cluster
- the second data node includes:
- a log acquisition module configured to acquire a first physical log from the second storage device, where the first physical log is a physical log that comes from the first data node and is stored in the second storage device via the first storage device, and the first physical log includes operation information on the data in the first data node;
- a log playback module configured to perform log playback according to the first physical log.
- the operation information indicates a modification operation, a write operation, and/or a deletion operation on the data in the data node.
- the second database cluster further includes a fifth data node; the log acquisition module is specifically configured to:
- the second data node obtains the first physical log from the second storage device in parallel.
- the second storage device includes a third storage space for storing the physical log of the second data node and a fourth storage space for storing the physical log of the fifth data node , the third storage space is different from the fourth storage space;
- the log acquisition module is specifically used for:
- the second data node obtains the first physical log from the third storage space in parallel.
- the first database cluster includes multiple primary data nodes including the first data node
- the second database cluster includes multiple standby data nodes including the second data node
- the second database cluster further includes a management node
- the management node includes:
- a commit information acquisition module configured to acquire commit information of the first database cluster from the second storage device, where the commit information includes the log sequence number of the physical log for which each master data node among the plurality of master data nodes has most recently completed a transaction commit; the target sequence number is the smallest sequence number among the multiple log sequence numbers, and each standby data node among the multiple standby data nodes has obtained the physical log corresponding to the target sequence number
- the log playback module is specifically used for the second data node to obtain the target serial number from the management node;
- Physical logs with the same sequence number on different primary data nodes among the multiple primary data nodes correspond to the same task, and each primary data node among the multiple primary data nodes commits transactions for its physical logs in ascending order of sequence number.
- The embodiment of the present application provides a computer-readable storage medium that includes computer-readable instructions; when the computer-readable instructions are run on a computer device, the computer device is caused to execute the above-mentioned first aspect and any optional method thereof, as well as the above-mentioned second aspect and any optional method thereof.
- The embodiment of the present application provides a computer program product that includes computer-readable instructions; when the computer-readable instructions are run on a computer device, the computer device executes the above-mentioned first aspect and any optional method thereof, as well as the above-mentioned second aspect and any optional method thereof.
- The present application provides a chip system that includes a processor configured to support the above-mentioned device in implementing the functions involved in the above-mentioned aspects, for example, sending or processing the data or information involved in the above-mentioned method.
- the system-on-a-chip further includes a memory, and the memory is used for storing necessary program instructions and data of the execution device or the training device.
- the system-on-a-chip may consist of chips, or may include chips and other discrete devices.
- FIG. 1 is a schematic diagram of the architecture provided by the embodiment of the present application.
- FIG. 2 is a schematic diagram of the architecture provided by the embodiment of the present application.
- FIG. 3 is a schematic diagram of the architecture provided by the embodiment of the present application.
- FIG. 4 is a schematic flow chart of a data backup method provided in an embodiment of the present application.
- FIG. 5 is a schematic diagram of the storage space provided by the embodiment of the present application.
- FIG. 6 is a schematic diagram of the barrier point processing flow provided by the embodiment of the present application.
- FIG. 7 is a schematic diagram of the first database cluster provided by the embodiment of the present application.
- FIG. 8 is a schematic diagram of the second database cluster provided by the embodiment of the present application.
- FIG. 9 is a schematic structural diagram of a computing device provided by an embodiment of the present application.
- Figure 1 is a schematic diagram of the system logic structure of a data backup system according to an embodiment of the application.
- The system may include a client, a main storage library (such as the first database cluster in the embodiment of the application) and a backup storage library (such as the second database cluster in the embodiment of the application). The main storage library can contain multiple fragments (such as fragment 1 and fragment 2 shown in Figure 1), and each fragment can include data nodes (data node, DN). For example, fragment 1 shown in Figure 1 includes master node 1 and a backup node, and the backup node of fragment 1 can serve as the backup of master node 1; fragment 2 includes master node 2 and a backup node, and the backup node of fragment 2 can serve as the backup of master node 2.
- the main storage library can include a coordinator node (coordinator node, CN).
- a hardware device 1 can also be deployed on one side of the main storage library, and the hardware device 1 can be a storage device (such as the first storage device in the embodiment of the present application) .
- The backup database can be deployed with multiple shards, such as shard 1 and shard 2 shown in Figure 1, where shard 1 in the backup database can serve as the backup of shard 1 in the main database: multiple backup nodes in shard 1 can serve as backups for primary node 1, and multiple backup nodes in shard 2 can serve as backups for primary node 2.
- A hardware device 2 can also be deployed on the side of the backup storage library.
- the hardware device 2 may be a storage device (for example, the second storage device in the embodiment of the present application).
- the primary storage library or the backup storage library can be a storage array or a network storage architecture such as a network attached storage (Network Attached Storage, NAS) or a storage area network (storage area network, SAN) respectively.
- Each storage node (such as the data node and coordination node described above) can be a logical unit number (logical unit number, LUN) or a file system. It should be understood that the embodiment of the present application does not limit the expression forms of the storage repository and the storage node.
- The primary and secondary database systems can also include clients, and the clients can be connected to the primary database system and the standby database system through a network, where the network can be the Internet, an intranet, a local area network (LAN), a wireless local area network (WLAN), a storage area network (SAN), etc., or a combination of the above networks.
- the primary node and backup node shown in FIG. 1 can be implemented by the computing device 200 shown in FIG. 2 .
- FIG. 2 is a schematic diagram of a simplified logical structure of a computing device 200. As shown in FIG. 2, the processor 202, the memory unit 204, the input/output interface 206, the communication interface 208 and the storage device 212 are connected to each other through the bus 210.
- the processor 202 is the control center of the computing device 200, and is used to execute related programs to realize the technical solutions provided by the embodiments of the present invention.
- the processor 202 includes one or more central processing units (Central Processing Unit, CPU), for example, the central processing unit 1 and the central processing unit 2 shown in FIG. 2 .
- the computing device 200 may further include multiple processors 202, and each processor 202 may be a single-core processor (including one CPU) or a multi-core processor (including multiple CPUs).
- A component for performing a specific function, for example, the processor 202 or the memory unit 204, can be implemented by configuring a general-purpose component to perform the corresponding function, or by a dedicated component that specifically performs that function, which is not limited in this application.
- The processor 202 can adopt a general-purpose central processing unit, a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, for executing related programs so as to realize the technical solutions provided by this application.
- Processor 202 may be connected to one or more storage schemes via bus 210 .
- the storage scheme may include memory unit 204 and storage device 212 .
- the storage device 212 can be a read-only memory (Read Only Memory, ROM), a static storage device, a dynamic storage device or a random access memory (Random Access Memory, RAM).
- the memory unit 204 may be a random access memory.
- the memory unit 204 may be integrated with the processor 202 or inside the processor 202 , or may be one or more storage units independent of the processor 202 .
- Program codes for execution by the processor 202 or a CPU within the processor 202 may be stored in the storage device 212 or the memory unit 204 .
- Program codes stored in the storage device 212 (for example, the operating system, application software, backup module, communication module or storage control module) are copied to the memory unit 204 for execution by the processor 202.
- The storage device 212 can be a physical hard disk or a partition thereof (including small computer system interface (SCSI) storage or a global network block device (GNBD) volume), a network storage protocol (including network or cluster file systems such as the network file system NFS), a file-based virtual storage device (virtual disk image), or a storage device based on logical volumes. It may include high-speed random access memory (RAM), and may also include non-volatile memories, such as one or more disk memories, flash memories, or other non-volatile memories. In some embodiments, the storage device may further include a remote memory separate from the one or more processors 202, such as a network disk accessed through the communication interface 208 and a communication network.
- The communication network may be the Internet, an intranet, a local area network (LAN), a wireless local area network (WLAN), a storage area network (SAN), etc., or a combination of the above networks.
- The operating system includes various software components and/or drivers for controlling and managing routine system tasks (such as memory management, storage device control, power management, etc.) and for facilitating communication between the various hardware and software components.
- the input/output interface 206 is used to receive input data and information, and output data such as operation results.
- Communication interface 208 enables communication between computing device 200 and other devices or communication networks using transceiving means such as, but not limited to, transceivers.
- Bus 210 may comprise a path for carrying information between various components of computing device 200 (e.g., processor 202, memory unit 204, input/output interface 206, communication interface 208, and storage device 212).
- the bus 210 may use a wired connection manner or a wireless communication manner, which is not limited in this application.
- Although FIG. 2 shows only some components of the computing device 200, those skilled in the art should appreciate that the computing device 200 also includes other components necessary for proper operation.
- The computing device 200 can be a general-purpose computing device or a special-purpose computing device, including but not limited to any electronic device such as a portable computing device, a personal desktop computing device, a network server, a tablet computer, a mobile phone, or a personal digital assistant (PDA), or a combination of two or more of the above devices. The present application does not limit the specific implementation form of the computing device 200 in any way.
- the computing device 200 in FIG. 2 is only an example of the computing device 200 , and the computing device 200 may include more or fewer components than those shown in FIG. 2 , or have different component configurations. According to specific needs, those skilled in the art should understand that the computing device 200 may also include hardware devices for implementing other additional functions. Those skilled in the art should understand that the computing device 200 may also only include the components necessary to implement the embodiment of the present invention, and does not necessarily include all the components shown in FIG. 2 . Meanwhile, various components shown in FIG. 2 may be implemented in hardware, software, or a combination of hardware and software.
- the hardware structure shown in FIG. 2 and the above description are applicable to various computing devices provided in the embodiments of the present application, and are suitable for executing various data backup methods provided in the embodiments of the present application.
- FIG. 3 is a product implementation form of the embodiment of the present application, which mainly includes a dual-cluster disaster recovery architecture of a distributed database with log sharing.
- the dual database clusters are respectively deployed in two physical areas, and the program code of the data backup method provided by the embodiment of the present application runs in the host memory of the server during operation.
- the client on the management and control side can issue commands such as building a cluster, establishing a dual-cluster disaster recovery relationship, cluster switching, and cluster status query.
- The OM module in the cluster controls modules such as the CM and the database nodes to complete the related operations and returns the execution results.
- A shared volume is a storage device capable of remote (physically distant) parallel data replication, and is used to synchronously transmit redo logs between the active and standby clusters.
- When the database cluster is running, the primary node on each shard of the primary cluster generates logs and writes them to the shared volume, and the logs are synchronized to the corresponding shared volume of the standby cluster.
- the standby data nodes of the primary cluster and the standby cluster read the logs from the shared volume and perform playback.
- FIG. 4 is a schematic flowchart of a data backup method provided in an embodiment of the present application, wherein the method can be applied to a first database cluster, and the first database cluster includes a first data node.
- The method includes:
- the first data node acquires a first physical log, where the first physical log includes operation information on data in the first data node.
- the first database cluster may be a distributed database
- the first database cluster may be a master cluster
- the second database cluster may serve as a backup cluster of the first database cluster.
- The first database cluster may be a database system based on a data-sharding distributed architecture (shared nothing architecture); each data node may be configured with a central processing unit (central processing unit, CPU), memory, a hard disk, etc., and resources are not shared between the nodes.
- the first data node may be a data node of a shard in the first database cluster, for example, the first data node may be a master data node of a shard in the first database cluster.
- the first data node may be a data node DN in the first database cluster.
- the first database cluster can be deployed with at least one data node, where the coordinator node can be deployed on a computing device.
- Data nodes may be deployed on computing devices. Multiple coordinating nodes can be deployed on different computing devices, or can be deployed on the same computing device. Multiple data nodes can be deployed on different computing devices.
- the coordinator node and the data node can be deployed on different computing devices, or can be deployed on the same computing device.
- the data can be distributed on the data nodes, and the data between the data nodes is not shared.
- The coordinating node receives a query request from the client, generates an execution plan, and sends it to each data node. The data node initializes the operators to be used (such as a data operation (stream) operator) according to the received plan, and then executes the execution plan delivered by the coordinating node.
- Coordinating nodes and data nodes, as well as data nodes in different physical nodes, can be connected through a network channel, and the network channel can use various communication protocols such as the scalable transmission control protocol (STCP).
- The first data node, as the master node in the first database cluster, can receive a data operation request from the client and generate the first physical log according to the data operation request. As the backup node of the first data node, the second data node (or the fourth data node described in the subsequent embodiments) can obtain the first physical log and perform log playback according to it, to ensure the consistency of data on the first data node and the second data node.
- the first physical log includes operation information on the data in the first data node, and the operation information indicates a modification operation, a write operation, and/or
- the first physical log can be a Redo log, also known as an XLog, which can record the physical modification of the data page (or describe it as a change in data), and can be used to restore the physical data page after the transaction is committed.
- log files in the database system may include logical log files and physical log files.
- the logical log in the logical log file is used to record the original logic of the logical operation performed on the database system.
- logical logs are used to record the original logic of logical operations such as data access, data deletion, data modification, data query, database system upgrade, and database system management performed on the database system.
- the logical operation refers to a process of performing logical processing according to a user's data operation command to determine which data operations need to be performed on the data.
- the data operation command is expressed in a structured query language (structured query language, SQL)
- the original logic of the logic operation may be a computer instruction expressed in an SQL statement.
- the physical log in the physical log file is used to record the change of data in the database system (for example, record the change of the data page in the data storage node).
- the content of the physical log records can be understood as data changes caused by logical operations performed on the database system.
- Each shard contains a primary node and multiple standby data nodes.
- Each slice is configured with a shared volume (that is, the storage space in the first storage device and the second storage device described in the subsequent embodiments), ensuring that all nodes in the slice have access to the shared volume.
- the master node of the cluster generates logs and stores them on the storage device corresponding to the shard.
- the storage device between the clusters does not establish a synchronous replication relationship, and does not distinguish between the master and slave ends.
- One of the clusters is selected as the disaster recovery cluster and stopped, to prevent it from writing data to the shared disk. The relevant parameter information of the active and standby clusters is configured, and the remote replication relationship of the storage devices is established, that is, data is synchronously replicated from the master-end storage device of the active cluster to the slave-end storage device of the standby cluster.
- the standby cluster sends a build (reconstruction) request to the primary cluster through the network, completes the transmission and replication of data and logs, starts the cluster, and completes the establishment of the disaster recovery relationship.
- the first data node writes the first physical log into a first storage device, and the first storage device is configured to transfer the first physical log to a second storage device.
- the first data node may write the first physical log into the first storage device.
- the first storage device and the second storage device may be physical devices such as an all-flash storage system.
- The first storage device may be a raw device, which can also be called a raw partition.
- the first data node may write the first physical log into the first storage device based on direct I/O, which improves read and write performance.
- The first data node may write the first physical log into the first storage device before performing the transaction commit on the data in the first data node.
- The transaction commit can ensure the persistence of the operation information on the data in the first physical log. If the first data node transferred the first physical log to the first storage device only after completing the transaction commit of the first physical log, the first data node might fail before the transaction commit is completed, in which case the first physical log could not be delivered to the first storage device, that is, the standby data node could not obtain the first physical log for log playback. In the embodiment of this application, the first data node transfers the first physical log to the first storage device before committing the transaction on the data in the first data node; once the first physical log has been successfully written into the first storage device, it can be considered that the first physical log has been copied to the standby data node. Even if the first data node fails before the transaction commit is completed, the first physical log can still be transferred to the standby data node.
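- The ordering described above (write the physical log to the shared storage device first, commit afterwards) can be illustrated with a minimal sketch. The classes `SharedLogDevice` and `PrimaryDataNode` and the `fail_before_commit` flag are hypothetical names introduced only for this illustration; they model the devices in memory and are not part of the application.

```python
class SharedLogDevice:
    """In-memory stand-in for the first storage device (assumption)."""

    def __init__(self):
        self.records = []

    def write(self, record: bytes) -> None:
        self.records.append(record)


class PrimaryDataNode:
    """Hypothetical primary node that persists the log before committing."""

    def __init__(self, device: SharedLogDevice):
        self.device = device
        self.committed = []

    def apply(self, physical_log: bytes, fail_before_commit: bool = False) -> bool:
        # Step 1: write the physical log to the shared device first.
        self.device.write(physical_log)
        # Step 2: commit only afterwards; a crash here loses the commit,
        # but the log is already available to the standby data node.
        if fail_before_commit:
            return False  # simulated failure between log write and commit
        self.committed.append(physical_log)
        return True


device = SharedLogDevice()
node = PrimaryDataNode(device)
node.apply(b"redo-1")
node.apply(b"redo-2", fail_before_commit=True)
```

- In this sketch the second record still reaches the device even though its commit never completes, which is the property the paragraph above relies on.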
- The first data node may write the first physical log into the first storage device, and the first storage device may include a storage area (first storage area) allocated for the first data node. For example, if the first data node belongs to the first shard in the first database cluster, the first storage device may include a shared volume for the first shard, and each data node in the first shard can share the shared volume.
- The first storage device may include a first storage space for storing the physical log of the first data node and a second storage space for storing the physical log of the third data node, and the first storage space is different from the second storage space.
- The first database cluster further includes a third data node, and while the third data node writes the second physical log into the first storage device, the first data node may write the first physical log into the first storage device in parallel.
- the third data node may be the master node in the first database cluster. Specifically, when the third data node writes the second physical log into the second storage space, the first data node writes the first physical log into the first storage space in parallel.
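- Because each data node writes into its own storage space, the writes do not contend for the same region. The following is a minimal sketch of that idea, assuming a flat in-memory byte region as the storage device; the region boundaries `FIRST_SPACE` and `SECOND_SPACE`, the 64-byte size, and the helper `write_log` are illustrative assumptions, not values from the application.

```python
import threading

# Stand-in for the first storage device: one flat byte region (assumed size).
device = bytearray(64)
FIRST_SPACE = (0, 32)    # assumed region for the first data node
SECOND_SPACE = (32, 64)  # assumed region for the third data node


def write_log(space, payload):
    """Write a physical log record at the start of the node's own storage space."""
    start, end = space
    if len(payload) > end - start:
        raise ValueError("record does not fit in the storage space")
    device[start:start + len(payload)] = payload


# Two writers run concurrently, each confined to a disjoint region.
writers = [
    threading.Thread(target=write_log, args=(FIRST_SPACE, b"first-node-redo")),
    threading.Thread(target=write_log, args=(SECOND_SPACE, b"third-node-redo")),
]
for t in writers:
    t.start()
for t in writers:
    t.join()
```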
- The first storage device includes a first storage space for storing the physical log of the first data node. When the storage space available in the first storage space is less than the storage space required by the first physical log, the first data node may determine a target storage space from the first storage space, where the target storage space stores a target physical log; based on the target physical log having been replayed by the second data node, the target physical log in the target storage space is replaced by the first physical log.
- That is, when the first storage space is insufficient, occupied storage space in the first storage space is emptied and reused. To prevent a physical log from being cleared before the standby data node has received it, the physical log cleared by the first data node must be one that has already been played back by the standby data node (this information can be fed back to the first data node after the standby data node completes the log playback).
- The storage address of the first storage space includes a head address and a tail address, and the storage order of the storage space is from the storage space corresponding to the head address to the storage space corresponding to the tail address. Based on the storage space corresponding to the tail address in the first storage space being occupied, the first data node may determine that the storage space corresponding to the head address is the target storage space.
- For example, an area (exemplarily, 16 MB in size) can be reserved at the head of the storage device for writing control information (control info), which can include a check code, the log write location, the file size and other information.
- The physical log is written starting from the position after the first 16 MB, and the physical log storage area can be used cyclically: when the write position (head) reaches the tail of the log area (tail), writing resumes from the 16 MB offset.
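- The cyclic use of the log area, together with the rule that only already-replayed log may be overwritten, can be sketched as follows. This is a simplified model under stated assumptions: positions are tracked as monotonically growing logical byte counters, a record that crosses the end of the area is assumed to continue at the 16 MB offset, and the class name `CircularLogArea` and its methods are hypothetical.

```python
CONTROL_AREA_SIZE = 16 * 1024 * 1024  # 16 MB control-information region (from the text)


class CircularLogArea:
    """Hypothetical model of the reusable physical-log region of a storage space."""

    def __init__(self, device_size: int):
        if device_size <= CONTROL_AREA_SIZE:
            raise ValueError("device must be larger than the control area")
        self.capacity = device_size - CONTROL_AREA_SIZE
        self.written = 0   # total log bytes ever written (logical position)
        self.replayed = 0  # total log bytes the standby has confirmed replaying

    def physical_offset(self, logical_pos: int) -> int:
        # Wrap back to the 16 MB offset once the tail of the log area is reached.
        return CONTROL_AREA_SIZE + logical_pos % self.capacity

    def confirm_replay(self, logical_pos: int) -> None:
        # Feedback from the standby node: log up to this position was replayed.
        self.replayed = max(self.replayed, min(logical_pos, self.written))

    def append(self, length: int):
        """Return the physical start offset, or None if unreplayed log would be overwritten."""
        free = self.capacity - (self.written - self.replayed)
        if length > free:
            return None
        offset = self.physical_offset(self.written)
        self.written += length
        return offset


area = CircularLogArea(CONTROL_AREA_SIZE + 100)
first = area.append(60)    # fits in the empty area
blocked = area.append(60)  # refused: would overwrite log not yet replayed
area.confirm_replay(60)    # standby reports the first record as replayed
second = area.append(60)   # now allowed; wraps around the tail of the area
```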
- The master node (for example, the first data node) generates a physical log, copies the physical log from the local directory to the storage device, and updates the control information while writing. After the physical log is written to the storage device, the log is considered to have been persisted successfully, and the transaction is then committed.
- each log has a unique LSN, or in other words, the log and the LSN are one-to-one, so a log can be uniquely determined according to the LSN.
- The first data node may also write a second physical log that includes commit information into the first storage device. The first storage device is used to transfer the second physical log to the second storage device, so that the management node in the second database cluster can obtain the second physical log from the second storage device. The commit information indicates that the first data node has completed the transaction commit of the first physical log.
- the commit information can be used as a reference for the global consistency point when the second database cluster performs log playback.
- The commit information may include a transaction commit number, which may be used to identify a committed database transaction (also simply called a transaction).
- a transaction is a logical unit for a data storage node to perform database operations, and consists of a sequence of database operations.
- a transaction in the submitted state indicates that the transaction has been successfully executed, and the data involved in the transaction has been written to the data storage node.
- The first database cluster further includes a fourth data node, which is used as a backup node of the first data node; for example, the fourth data node and the first data node can be data nodes in the same shard. The fourth data node can obtain the first physical log from the first storage device and perform log playback according to the first physical log.
- The fourth data node may first read the control information in the header of the first storage device and, after verification, compare the write progress of the log on the storage device with the local physical log. If there is a newer physical log on the storage device, the node reads the physical log, copies it to local storage, and plays it back. If there is no data to read, it waits in a loop.
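- One iteration of the standby read loop described above can be sketched as follows. The layout of the control information, the use of CRC32 as the check code, and the names `ControlInfo`, `make_control`, `is_valid` and `StandbyNode` are assumptions made for illustration; the application does not specify them.

```python
import zlib
from dataclasses import dataclass


@dataclass
class ControlInfo:
    write_lsn: int   # log write position recorded in the device header
    check_code: int  # checksum protecting the header (CRC32 is an assumption)


def make_control(write_lsn: int) -> ControlInfo:
    return ControlInfo(write_lsn, zlib.crc32(str(write_lsn).encode()))


def is_valid(ctrl: ControlInfo) -> bool:
    return ctrl.check_code == zlib.crc32(str(ctrl.write_lsn).encode())


class StandbyNode:
    def __init__(self):
        self.local_log = []  # (lsn, payload) records copied to local storage
        self.replayed = []   # LSNs that have been played back

    def poll_once(self, ctrl: ControlInfo, device_log) -> int:
        """Verify the header, copy newer records locally, replay them; return the count."""
        if not is_valid(ctrl):
            return 0  # header failed verification; retry on the next iteration
        local_lsn = self.local_log[-1][0] if self.local_log else 0
        newer = [r for r in device_log if local_lsn < r[0] <= ctrl.write_lsn]
        for lsn, payload in newer:
            self.local_log.append((lsn, payload))
            self.replayed.append(lsn)  # actual page replay is omitted here
        return len(newer)


device_log = [(1, b"a"), (2, b"b"), (3, b"c")]
standby = StandbyNode()
standby.poll_once(make_control(2), device_log)  # copies and replays LSN 1 and 2
standby.poll_once(make_control(3), device_log)  # copies and replays LSN 3
```

- A real node would call `poll_once` repeatedly and sleep when it returns 0, which corresponds to waiting in a loop when there is no data to read.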
- synchronous replication of data may be performed between the first storage device and the second storage device, and the first storage device and the second storage device may be storage devices capable of remote and parallel data transmission.
- the first storage device may transfer the first physical log to the second storage device.
- The second storage device may include a third storage space for storing the physical log of the second data node and a fourth storage space for storing the physical log of the fifth data node, where the third storage space is different from the fourth storage space, and the first storage device may transfer the first physical log to the third storage space in the second storage device.
- a second data node in the second database cluster acquires the first physical log from the second storage device.
- the second database cluster can be a distributed database
- the first database cluster can be the master cluster
- the second database cluster can be used as the backup cluster of the first database cluster
- The second database cluster can be a database system based on a data-sharding distributed architecture (shared nothing architecture); each data node can be configured with a central processing unit (central processing unit, CPU), memory, a hard disk, etc., and resources are not shared between the nodes. The second data node may be a data node of a shard in the second database cluster.
- the second data node may be a data node DN in the second database cluster.
- the second database cluster can be deployed with at least one data node, where the coordinator node can be deployed on a computing device.
- Data nodes may be deployed on computing devices.
- Multiple coordinating nodes may be deployed on different computing devices, or may be deployed on the same computing device.
- Multiple data nodes can be deployed on different computing devices.
- the coordinator node and the data node can be deployed on different computing devices, or can be deployed on the same computing device.
- The first data node, as the master node in the first database cluster, can receive a data operation request from the client and generate the first physical log according to the data operation request. As the backup node of the first data node, the second data node can obtain the first physical log from the second storage device and perform log playback according to the first physical log, so as to ensure the consistency of data on the first data node and the second data node.
- the second data node may obtain the first physical log from a third storage space, where the third storage space is a storage space allocated for the second data node in the second storage device.
- The second storage device may include a storage area (third storage area) for the second data node. Taking the second data node belonging to the first fragment in the second database cluster as an example, the second storage device may include a shared volume allocated for the first fragment, and each data node in the first fragment may share the shared volume.
- The second storage device may include a third storage space for storing the physical log of the second data node and a fourth storage space for storing the physical log of the fifth data node, and the third storage space is different from the fourth storage space.
- While other data nodes read physical logs from the second storage device, the second data node can acquire the first physical log from the second storage device in parallel.
- The so-called parallelism can be understood as follows: the action of the first data node writing the first physical log into the first storage space and the action of the third data node writing the physical log into the second storage space happen simultaneously in time.
- The first database cluster includes multiple data nodes including the first data node, and the second database cluster further includes a management node. The management node can also obtain, from the second storage device, the commit information from the first database cluster; the commit information includes the log sequence number of the physical log of the latest transaction commit completed by each data node among the plurality of data nodes, and the target sequence number is the smallest sequence number among the plurality of log sequence numbers.
- the existing technology adopts a distributed consistency mechanism based on storage devices and generates a global barrier log to ensure that the farthest recovery point common to different shards can be found, but it cannot solve the problem of data synchronization failure caused by network problems in storage devices.
- The management node in the second database cluster can maintain global information (which can be called a barrier point). The global information indicates the log sequence number of the physical log obtained by each standby data node, and this log sequence number is the smallest among the largest physical log sequence numbers whose transactions each master node has currently committed.
- For example, suppose the management node obtains the following: the current transaction commit progress of master node 1 is 1, 2; that of master node 2 is 1, 2; that of master node 3 is 1, 2, 3; and that of master node 4 is 1, 2, 3, 4. The largest committed sequence numbers are then 2, 2, 3 and 4, so 2 is the smallest physical log sequence number among the maximum physical log sequence numbers that the master nodes have currently committed.
- Physical logs with the same sequence number on different primary data nodes among the multiple primary data nodes correspond to the same task, and each primary data node commits the transactions of its physical logs in ascending order of sequence number; that is, the sequence number can indicate the transaction commit progress of the master node.
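- The rule for choosing the global consistency point (the smallest of the per-node largest committed sequence numbers) can be written as a short sketch. The function name `consistency_point` and the node names and sequence numbers below are illustrative values chosen for this example only.

```python
def consistency_point(commit_progress):
    """Return the smallest of the largest committed sequence numbers per primary node.

    commit_progress maps a node name to the sequence numbers it has committed.
    Raises ValueError if the mapping or any node's list is empty.
    """
    return min(max(lsns) for lsns in commit_progress.values())


# Illustrative values (not taken from the application).
progress = {
    "primary A": [1, 2, 3],
    "primary B": [1, 2, 3, 4, 5],
    "primary C": [1, 2, 3, 4],
}
target = consistency_point(progress)  # 3: every primary has committed up to 3
```

- Because each primary commits in ascending order of sequence number, every sequence number up to the returned value has been committed on every primary, which is why it is a safe point to replay to.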
- The management node can be an operation and maintenance management module (operation manager, OM), a cluster management module (cluster manager, CM), a cluster management agent (CM agent, CMA), a cluster management service (CM server, CMS), a global transaction manager (global transaction manager, GTM), etc.
- The coordinating node in the first database cluster can obtain the current transaction commit progress of each primary data node (that is, the log sequence number of the physical log whose transaction commit is complete, which can be called a barrier point), and transmit commit information that includes this transaction commit progress to the second database cluster in the form of a physical log (this step can also be completed by the primary data node).
- The standby data node obtains the physical log carrying the commit information from the second storage device and writes the log to the local disk. The newly written physical log is parsed, the parsed barrier point is stored in a hash table, and the largest barrier point currently received is recorded; the largest barrier point corresponds to the log sequence number of the physical log of the primary data node's latest transaction commit.
- the function of the management node can be realized by the cooperation of CMA, CMS, and ETCD.
- the CMA queries the maximum barrier value of the CN and DN and reports it to the CMS, and the CMS can take the minimum of the largest barrier points of the standby data nodes as the "candidate sequence number" (also called the value to be detected) and store it in ETCD;
- the CMA obtains the "value to be detected" from ETCD, queries the DN, and confirms whether the corresponding point exists in the DN;
- the CMS performs the following judgment: if the physical log corresponding to the "value to be detected" exists in each standby data node, the value can be stored in ETCD as the "target sequence number" (or simply the target value); otherwise it is discarded. The CMA reads the "target value" in ETCD and updates the local "target value".
- in one report, the CMA needs to query and report the maximum barrier value, query locally whether the "value to be detected" exists, and update the "target value"; the CMS needs to update the "value to be detected" and the "target value".
- Barrier deletion is the end point of consistency. Barrier deletion occurs during physical log playback. During log playback, when the playback reaches a barrier point, the playback position is updated and the barrier point is deleted from the hash table, thus completing the entire process from barrier generation to barrier deletion.
- the standby cluster needs to obtain the minimum barrier point among the current maximum barrier points of each fragment (that is, the log sequence number of the physical log of the latest transaction submission of each primary data node among multiple primary data nodes), and the database backup set can be restored to the minimum barrier point.
- obtaining the minimum barrier point can be divided into four stages: barrier generation, barrier parsing and storage, barrier advancement, and barrier deletion.
- barrier generation is the premise of consistency. Barrier points can be initiated by any CN node, but the first CN is responsible for generating them.
- if the CN that initiates barrier generation is not the first CN, it notifies the first CN to generate the barrier.
- after the barrier is generated, the CN and/or DN nodes add it to the physical log.
- Barrier parsing and storage is the basis of consistency. After the corresponding standby data node on the standby cluster receives the log through the storage device, it writes the log to the local disk. It then parses the newly written log, stores the parsed barrier points in the hash table, and records the maximum barrier point currently received.
- the hash table is created before the log parsing thread is created, and released when the cluster is uninstalled. The parsed barrier points are stored in the hash table, and these barriers will be deleted when playing back the physical log.
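The barrier bookkeeping described above (store parsed barrier points in a hash table, track the largest one received, delete a point when playback reaches it) can be sketched as follows. This is a simplified illustration under stated assumptions: the class name `BarrierTable` and its methods are hypothetical, and a Python `set` stands in for the hash table.

```python
from typing import Optional, Set


class BarrierTable:
    """Hash-based store of parsed barrier points (hypothetical helper)."""

    def __init__(self) -> None:
        self._points: Set[int] = set()
        self.max_received: Optional[int] = None

    def record(self, barrier_lsn: int) -> None:
        """Store a parsed barrier point and track the largest one seen."""
        self._points.add(barrier_lsn)
        if self.max_received is None or barrier_lsn > self.max_received:
            self.max_received = barrier_lsn

    def contains(self, barrier_lsn: int) -> bool:
        return barrier_lsn in self._points

    def on_replayed(self, lsn: int) -> bool:
        """Delete the barrier point once playback reaches it."""
        if lsn in self._points:
            self._points.remove(lsn)
            return True
        return False


table = BarrierTable()
for lsn in (10, 20, 30):
    table.record(lsn)
removed = table.on_replayed(20)
```

Note that `max_received` is deliberately not lowered on deletion: it records the largest barrier point received so far, not the largest one still pending.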
- Barrier advancement is the key to consistency, and this part can be carried out through the cooperation of CN, DN, CMA, CMS, and ETCD, as shown in Figure 6.
- the advancement of the barrier consistency point can include five cycles: in the first cycle, the CMA queries the maximum barrier value of the CN and DN and reports it to the CMS; the CMS collects the reported values, takes the minimum among them as the "value to be detected", and stores it in ETCD; the CMA obtains the "value to be detected" from ETCD, queries the CN and DN, confirms whether the point exists in the DN, and reports the result to the CMS, which makes its judgment after collecting all the results.
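The advancement step above (propose the minimum of the reported maxima as the value to be detected, then promote it to the target value only if every node holds that point) can be sketched as follows. This is a rough sketch, not the CMA/CMS code: the function names, the node labels, and the plain dictionary standing in for ETCD are all assumptions made for illustration.

```python
from typing import Dict, Optional, Set


def choose_candidate(max_barriers: Dict[str, int]) -> int:
    """Value to be detected: the minimum of the per-node maximum barrier points."""
    return min(max_barriers.values())


def confirm_target(candidate: int, received: Dict[str, Set[int]]) -> Optional[int]:
    """Promote the candidate only if every node has received that barrier point."""
    if all(candidate in points for points in received.values()):
        return candidate
    return None  # discarded; a later cycle proposes a new candidate


etcd: Dict[str, int] = {}  # plain dict standing in for ETCD
received = {"dn1": {10, 20, 30}, "dn2": {10, 20, 25}}
maxima = {node: max(points) for node, points in received.items()}

etcd["candidate"] = choose_candidate(maxima)
target = confirm_target(etcd["candidate"], received)
if target is not None:
    etcd["target"] = target
```

In this sample, `dn1` has not received barrier point 25, so the candidate is discarded and no target value is stored; the existence check is what prevents a node from being asked to stop at a point it never saw.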
- a target sequence number is maintained as global information by the management node in the second database cluster. The target sequence number is the smallest sequence number among the plurality of log sequence numbers, and each standby data node among the plurality of standby data nodes has obtained the physical log corresponding to the target sequence number. Each standby data node performs log playback only when the log sequence number of the physical log to be played back is equal to the target sequence number. This ensures that each standby data node plays back up to the target sequence number, so that different standby data nodes are restored to the same position and data consistency between different standby data nodes in the distributed database is guaranteed.
- the second data node performs log replay according to the first physical log.
- the second data node obtains the target sequence number from the management node, and after determining that the log sequence number of the first physical log is equal to the target sequence number, the second data node performs log playback according to the first physical log.
- when the second database cluster needs to become the primary cluster, this can be realized through the failover process or the switchover process.
- the failover process performs a failover when an abnormality occurs in the primary cluster, that is, the standby cluster is promoted to primary cluster and continues to provide production services.
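The gating of playback on the target sequence number can be sketched as follows. This is one possible reading, stated as an assumption: pending logs are replayed in ascending order up to and including the target, and nothing beyond it is applied, so that every standby node stops at the same position. The function name `replay_until_target` and the `apply` callback are hypothetical.

```python
from typing import Callable, Iterable, List


def replay_until_target(pending: Iterable[int], target: int,
                        apply: Callable[[int], None]) -> List[int]:
    """Replay pending logs in ascending order, stopping at the target number."""
    replayed: List[int] = []
    for lsn in sorted(pending):
        if lsn > target:
            break  # logs past the target wait for the next target value
        apply(lsn)
        replayed.append(lsn)
    return replayed


applied: List[int] = []
done = replay_until_target([3, 1, 2, 5, 4], target=3, apply=applied.append)
```

Logs 4 and 5 stay pending here even though this node already holds them, because another standby node may not have received them yet.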
- the client on the control side issues the cluster failover command to check the status of the storage device.
- switch to RPO 0; interrupt the synchronization relationship of the storage devices and remove the write protection of the standby cluster's storage device, so that the storage device is readable and writable; stop the standby cluster and, for the CN node of the standby cluster, overwrite the local log with the redo log in the storage device; update the relevant parameter information stored in etcd of the standby cluster; OM modifies the mode parameters of CM, CN, and DN and starts the cluster in primary cluster mode.
- the switchover process is a planned cluster role switch initiated by the user when the active and standby clusters are running normally, that is, the active cluster is downgraded to the standby cluster, and the standby cluster is promoted to the active cluster to replace the original active cluster to provide production services.
- the client on the management and control side first sends a cluster switchover command to the main cluster to check the status of the storage device.
- OM modifies the mode parameters of CM, CN, and DN and starts the cluster in standby cluster mode; check the status of the storage device, and the storage device performs a primary/standby switch, that is, the direction of data replication becomes synchronous transmission from the original standby cluster to the original primary cluster; stop the standby cluster and, for the CN node of the standby cluster, overwrite the local log with the redo log; OM modifies the mode parameters of CM, CN, and DN and starts that cluster in primary cluster mode.
- An embodiment of the present application provides a data backup method. The method is applied to a first database cluster that includes a first data node, and the method includes: the first data node obtains a first physical log, where the first physical log includes operation information on data in the first data node; the first data node writes the first physical log into a first storage device, and the first storage device is used to transfer the first physical log to a second storage device, so that a second data node in a second database cluster obtains the first physical log from the second storage device. The first storage device is deployed in the first database cluster, the second storage device is deployed in the second database cluster, the first database cluster and the second database cluster are different database clusters, and the second data node is used as a backup node of the first data node.
- the data nodes in the first database cluster can quickly synchronize the physical log to the second database cluster (standby cluster) through the storage devices, thereby coming closer to a recovery point objective (RPO) of 0 while ensuring high service performance and improving data transmission efficiency during data backup.
- FIG. 7 is a schematic structural diagram of a first database cluster 700 according to an embodiment of the present application.
- the first database cluster 700 may include a first data node 70, and the first data node 70 may include:
- a log obtaining module 701 configured to obtain a first physical log, where the first physical log includes operation information on the data in the first data node 70;
- for a specific description of the log acquisition module 701, reference may be made to the description of step 401 in the above embodiment, and details are not repeated here.
- the log acquisition module 701 may be implemented by the processor 202 and the memory unit 204 shown in FIG. 2 . More specifically, the processor 202 may execute related codes in the memory unit 204 to obtain the first physical log.
- a log transfer module 702 configured to write the first physical log into a first storage device, where the first storage device is configured to transfer the first physical log to a second storage device, so that the second data node 80 obtains the first physical log from the second storage device, wherein the first storage device is deployed in the first database cluster, the second storage device is deployed in the second database cluster, the first database cluster and the second database cluster are different database clusters, and the second data node 80 is used as a backup node for the first data node 70.
- for a specific description of the log transfer module 702, reference may be made to the description of step 402 in the above embodiment, and details are not repeated here.
- the log transfer module 702 may be implemented by the processor 202, the memory unit 204, and the communication interface 208 shown in FIG. 2. More specifically, the processor 202 may execute the communication module and the backup module in the memory unit 204, so that the communication interface 208 writes the first physical log into the first storage device.
- the operation information indicates a modification operation, a write operation, and/or a deletion operation on the data in the data node.
- the first data node 70 further includes:
- the transaction commit module 703 is configured to perform transaction commit on the data in the first data node 70 according to the first physical log after transferring the first physical log to the first storage device.
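The ordering described above, in which the physical log is written to the first storage device before the transaction is committed, can be sketched as follows. This is a toy model under stated assumptions: the classes `StorageDevice` and `PrimaryDataNode` and the method `persist_then_commit` are hypothetical names, and an in-memory list stands in for the storage device.

```python
from typing import List, Tuple


class StorageDevice:
    """In-memory stand-in for the first storage device."""

    def __init__(self) -> None:
        self.logs: List[Tuple[int, bytes]] = []

    def write(self, lsn: int, payload: bytes) -> None:
        self.logs.append((lsn, payload))


class PrimaryDataNode:
    def __init__(self, device: StorageDevice) -> None:
        self.device = device
        self.committed: List[int] = []

    def persist_then_commit(self, lsn: int, payload: bytes) -> None:
        # Write first: if this raises, the commit below never runs.
        self.device.write(lsn, payload)
        self.committed.append(lsn)


device = StorageDevice()
node = PrimaryDataNode(device)
node.persist_then_commit(1, b"insert row 42")
```

Committing only after the write succeeds means that every committed transaction has a log already handed to the storage device, which is what allows the standby side to approach an RPO of 0.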
- the first database cluster further includes a third data node; the log transfer module 702 is specifically configured to:
- the first storage device includes storage space for storing physical logs
- the log transfer module 702 is specifically configured to:
- the target physical log in the target storage space is replaced by the first physical log.
- the storage address of the storage space includes a head address and a tail address, and the storage order of the storage space is configured to run from the storage space corresponding to the head address to the storage space corresponding to the tail address;
- the log transfer module 702 is specifically used for:
- the first storage device is a raw device.
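The fragments above describe a fixed storage space written from the head address to the tail address, in which an existing target physical log can be replaced by the first physical log. One way to picture this is a ring of slots; the sketch below assumes that writing wraps back to the head once the tail is reached and overwrites the oldest log, which the text does not state explicitly. The class name `LogRing` is hypothetical.

```python
from typing import List, Optional


class LogRing:
    """Fixed number of log slots written in order from head to tail."""

    def __init__(self, slots: int) -> None:
        if slots <= 0:
            raise ValueError("slots must be positive")
        self._slots: List[Optional[bytes]] = [None] * slots
        self._next = 0  # index of the slot that receives the next log

    def write(self, log: bytes) -> Optional[bytes]:
        """Store a log and return the log it replaced, if any."""
        replaced = self._slots[self._next]
        self._slots[self._next] = log
        # Assumed behaviour: wrap to the head after the tail slot.
        self._next = (self._next + 1) % len(self._slots)
        return replaced


ring = LogRing(2)
first = ring.write(b"log-1")
second = ring.write(b"log-2")
third = ring.write(b"log-3")  # replaces the oldest entry
```

A real raw-device layout would address byte ranges rather than list slots; the list is used here only to keep the replacement order visible.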
- the log transfer module 702 is further configured to:
- after the first data node 70 commits the data in the first data node 70 according to the first physical log, the first data node 70 writes a second physical log containing commit information into the first storage device, and the first storage device is used to transfer the second physical log to the second storage device, so that the management node in the second database cluster can obtain the second physical log from the second storage device; the commit information indicates that the first data node 70 has completed the transaction commit of the first physical log.
- the first database cluster further includes a fourth data node, the fourth data node is used as a backup node for the first data node 70, and the fourth data node includes: a log obtaining module, configured to obtain the first physical log from the first storage device;
- a log playback module configured to perform log playback according to the first physical log.
- FIG. 8 is a schematic structural diagram of a second database cluster 800 according to an embodiment of the present application.
- the second database cluster 800 may include a second data node 80, and the second data node 80 is used as a backup node of the first data node 70.
- a log obtaining module 801 configured to obtain a first physical log from the second storage device, where the first physical log is a physical log that comes from the first data node 70 and is stored in the second storage device, and the first physical log includes operation information on the data in the first data node 70;
- for a specific description of the log acquisition module 801, reference may be made to the description of step 403 in the above embodiment, and details are not repeated here.
- the log acquisition module 801 may be implemented by the processor 202 , the memory unit 204 and the communication interface 208 shown in FIG. 2 . More specifically, the processor 202 may execute the communication module in the memory unit 204, so that the communication interface 208 obtains the first physical log from the second storage device.
- the log playback module 802 is configured to perform log playback according to the first physical log.
- for a specific description of the log playback module 802, reference may be made to the description of step 404 in the above embodiment, and details are not repeated here.
- the operation information indicates a modification operation, a write operation, and/or a deletion operation on the data in the data node.
- the second database cluster further includes a fifth data node; the log acquisition module is specifically configured to:
- the second data node 80 obtains the first physical log from the second storage device in parallel.
- the first database cluster includes a plurality of data nodes including the first data node 70, and the second database cluster further includes a management node, and the management node includes:
- a commit information acquiring module configured to acquire, from the second storage device, commit information from the first database cluster, where the commit information includes the log sequence number of the physical log for which each data node among the plurality of data nodes has most recently completed transaction commit, and the target sequence number is the smallest sequence number among the plurality of log sequence numbers;
- the log playback module is specifically used for the second data node 80 to obtain the target serial number from the management node;
- physical logs with the same sequence number on different primary data nodes among the multiple primary data nodes correspond to the same task, and each primary data node among the multiple primary data nodes commits the transactions of its physical logs in ascending order of sequence number.
- the embodiment of the present application also provides a computing device, which may be a node in the first database cluster or a node in the second database cluster described in the above embodiments.
- the computing device may be a server or a terminal.
- the foregoing database management node and/or data storage node may be deployed in the computing device.
- the computing device 90 includes: a processor 901 , a communication interface 902 and a memory 903 .
- the processor 901 , the communication interface 902 and the memory 903 are connected to each other through a bus 904 .
- the memory 903 is used to store computer instructions.
- when the processor 901 executes the computer instructions in the memory 903, the functions of the computer instructions can be realized.
- the data recovery method provided in the embodiment of the present application can be implemented.
- when the database management node is deployed in the computing device and the processor 901 executes the computer instructions in the memory 903, the functions of the first data node and the fourth data node in the data backup method provided by the embodiment of the present application can be realized.
- when the data storage node is deployed in the computing device and the processor 901 executes the computer instructions in the memory 903, the function of the second data node in the data backup method provided by the embodiment of the present application can be realized.
- the bus 904 can be divided into an address bus, a data bus, a control bus, and the like.
- for ease of representation, only one thick line is used in FIG. 9, but this does not mean that there is only one bus or one type of bus.
- the processor 901 may be a hardware chip, and the hardware chip may be an application-specific integrated circuit (application-specific integrated circuit, ASIC), a programmable logic device (programmable logic device, PLD) or a combination thereof.
- the aforementioned PLD may be a complex programmable logic device (complex programmable logic device, CPLD), a field-programmable gate array (field-programmable gate array, FPGA), a general array logic (generic array logic, GAL) or any combination thereof.
- it may also be a general-purpose processor, for example, a central processing unit (central processing unit, CPU), a network processor (network processor, NP), or a combination of a CPU and NP.
- the memory 903 may include a volatile memory (volatile memory), such as a random-access memory (random-access memory, RAM). It may also include a non-volatile memory (non-volatile memory), such as a flash memory (flash memory), a hard disk drive (hard disk drive, HDD) or a solid-state drive (solid-state drive, SSD). Combinations of the above types of memory may also be included.
- the embodiment of the present application also provides a storage medium, which is a non-volatile computer-readable storage medium, and the instructions in the storage medium are used to implement the data backup method provided in the embodiment of the present application.
- the embodiment of the present application also provides a computer program product including instructions, and the instructions included in the computer program product are used to realize the data backup method provided in the embodiment of the present application.
- the computer program product can be stored on the storage medium.
- the embodiment of the present application also provides a chip, the chip includes a programmable logic circuit and/or program instructions, which are used to implement the data backup method provided in the embodiment of the present application when the chip is running.
- the device embodiments described above are only illustrative; the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units.
- a unit can be located in one place, or it can be distributed to multiple network units. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
- the connection relationship between the modules indicates that they have communication connections, which can be specifically implemented as one or more communication buses or signal lines.
- the essence of the technical solution of this application, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, USB flash drive, removable hard disk, ROM, RAM, magnetic disk, or optical disk, and includes several instructions to make a computer device (which can be a personal computer, training device, network device, etc.) execute the method of each embodiment of the present application.
- all or part of them may be implemented by software, hardware, firmware or any combination thereof.
- when implemented using software, it may be implemented in whole or in part in the form of a computer program product.
- the computer program product includes one or more computer instructions.
- the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
- the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave).
- the computer-readable storage medium may be any available medium that can be stored by a computer, or a data storage device such as a training device or a data center integrated with one or more available media.
- the available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a DVD), or a semiconductor medium (such as a solid state disk (Solid State Disk, SSD)), etc.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Quality & Reliability (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the present application disclose a data backup method. The method comprises: a primary data node writing a first physical log into a first storage device, the first storage device being configured to transmit the first physical log to a second storage device, so that a second data node in a second database cluster obtains the first physical log from the second storage device, the first storage device and the second storage device being deployed in different database clusters, and the second data node being used as a backup node of the first data node. According to the present application, the data nodes in the first database cluster (a primary cluster) can quickly synchronize a physical log to the second database cluster (a standby cluster) by means of the storage devices, thereby improving data transmission efficiency during data backup.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111117550.8A CN115858236A (zh) | 2021-09-23 | 2021-09-23 | Data backup method and database cluster |
CN202111117550.8 | 2021-09-23 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023046042A1 (fr) | 2023-03-30 |
Family
ID=85652386
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/120709 WO2023046042A1 (fr) | 2021-09-23 | 2022-09-23 | Data backup method and database cluster |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115858236A (fr) |
WO (1) | WO2023046042A1 (fr) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
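CN116955015A (zh) * | 2023-09-19 | 2023-10-27 | 恒生电子股份有限公司 | Data backup system and method based on a data storage service |
CN117033087A (zh) * | 2023-10-10 | 2023-11-10 | 武汉吧哒科技股份有限公司 | Data processing method and apparatus, storage medium, and management server |
CN117171266A (zh) * | 2023-08-28 | 2023-12-05 | 北京逐风科技有限公司 | Data synchronization method, apparatus, device, and storage medium |
CN117667515A (zh) * | 2023-12-08 | 2024-03-08 | 广州鼎甲计算机科技有限公司 | Backup management method and apparatus for primary and standby clusters, computer device, and storage medium |
CN117857568A (zh) * | 2023-12-25 | 2024-04-09 | 慧之安信息技术股份有限公司 | Edge device capacity expansion configuration method and system based on cloud-edge collaboration |
CN118410115A (zh) * | 2024-07-03 | 2024-07-30 | 上海联鼎软件股份有限公司 | Automatic active-active disaster recovery method, apparatus, and storage medium for an Oracle database |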
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
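CN116760693B (zh) * | 2023-08-18 | 2023-10-27 | 天津南大通用数据技术股份有限公司 | Method and system for switching over primary and standby database nodes |
CN117194566B (zh) * | 2023-08-21 | 2024-04-19 | 泽拓科技(深圳)有限责任公司 | Multi-storage-engine data replication method, system, and computer device |
CN118484345B (zh) * | 2024-07-15 | 2024-09-20 | 浪潮云信息技术股份公司 | Hot backup method for a distributed file system based on the Linux operating system |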
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106570007A (zh) * | 2015-10-09 | 2017-04-19 | 阿里巴巴集团控股有限公司 | Method and device for data synchronization in a distributed cache system |
CN110377577A (zh) * | 2018-04-11 | 2019-10-25 | 北京嘀嘀无限科技发展有限公司 | Data synchronization method, apparatus, and system, and computer-readable storage medium |
US10936545B1 (en) * | 2013-12-20 | 2021-03-02 | EMC IP Holding Company LLC | Automatic detection and backup of primary database instance in database cluster |
CN112905390A (zh) * | 2021-03-31 | 2021-06-04 | 恒生电子股份有限公司 | Log data backup method, apparatus, device, and storage medium |
-
2021
- 2021-09-23 CN CN202111117550.8A patent/CN115858236A/zh active Pending
-
2022
- 2022-09-23 WO PCT/CN2022/120709 patent/WO2023046042A1/fr active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10936545B1 (en) * | 2013-12-20 | 2021-03-02 | EMC IP Holding Company LLC | Automatic detection and backup of primary database instance in database cluster |
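CN106570007A (zh) * | 2015-10-09 | 2017-04-19 | 阿里巴巴集团控股有限公司 | Method and device for data synchronization in a distributed cache system |
CN110377577A (zh) * | 2018-04-11 | 2019-10-25 | 北京嘀嘀无限科技发展有限公司 | Data synchronization method, apparatus, and system, and computer-readable storage medium |
CN112905390A (zh) * | 2021-03-31 | 2021-06-04 | 恒生电子股份有限公司 | Log data backup method, apparatus, device, and storage medium |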
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
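CN117171266A (zh) * | 2023-08-28 | 2023-12-05 | 北京逐风科技有限公司 | Data synchronization method, apparatus, device, and storage medium |
CN117171266B (zh) * | 2023-08-28 | 2024-05-14 | 北京逐风科技有限公司 | Data synchronization method, apparatus, device, and storage medium |
CN116955015A (zh) * | 2023-09-19 | 2023-10-27 | 恒生电子股份有限公司 | Data backup system and method based on a data storage service |
CN116955015B (zh) * | 2023-09-19 | 2024-01-23 | 恒生电子股份有限公司 | Data backup system and method based on a data storage service |
CN117033087A (zh) * | 2023-10-10 | 2023-11-10 | 武汉吧哒科技股份有限公司 | Data processing method and apparatus, storage medium, and management server |
CN117033087B (zh) * | 2023-10-10 | 2024-01-19 | 武汉吧哒科技股份有限公司 | Data processing method and apparatus, storage medium, and management server |
CN117667515A (zh) * | 2023-12-08 | 2024-03-08 | 广州鼎甲计算机科技有限公司 | Backup management method and apparatus for primary and standby clusters, computer device, and storage medium |
CN117857568A (zh) * | 2023-12-25 | 2024-04-09 | 慧之安信息技术股份有限公司 | Edge device capacity expansion configuration method and system based on cloud-edge collaboration |
CN118410115A (zh) * | 2024-07-03 | 2024-07-30 | 上海联鼎软件股份有限公司 | Automatic active-active disaster recovery method, apparatus, and storage medium for an Oracle database |
CN118410115B (zh) * | 2024-07-03 | 2024-09-06 | 上海联鼎软件股份有限公司 | Automatic active-active disaster recovery method, apparatus, and storage medium for an Oracle database |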
Also Published As
Publication number | Publication date |
---|---|
CN115858236A (zh) | 2023-03-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2023046042A1 (fr) | Data backup method and database cluster | |
US11836155B2 (en) | File system operation handling during cutover and steady state | |
US11734306B2 (en) | Data replication method and storage system | |
US12073091B2 (en) | Low overhead resynchronization snapshot creation and utilization | |
US10929428B1 (en) | Adaptive database replication for database copies | |
US7299378B2 (en) | Geographically distributed clusters | |
JP4461147B2 (ja) | Cluster database using remote data mirroring | |
US10452680B1 (en) | Catch-up replication with log peer | |
US20240061603A1 (en) | Co-located Journaling and Data Storage for Write Requests | |
US11461192B1 (en) | Automatic recovery from detected data errors in database systems | |
WO2024051027A1 (fr) | Data configuration method and system for big data | |
US11461018B2 (en) | Direct snapshot to external storage | |
US11681592B2 (en) | Snapshots with selective suspending of writes | |
US11265374B2 (en) | Cloud disaster recovery | |
US20230252045A1 (en) | Life cycle management for standby databases | |
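CN117992467A (zh) | Data processing system, method, and apparatus, and related device | |
CN117931831A (zh) | Data processing system, data processing method and apparatus, and related device |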
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22872086 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 22872086 Country of ref document: EP Kind code of ref document: A1 |