WO2021093323A1 - Data recovery method and system, data storage node, and database management node
- Publication number
- WO2021093323A1 (PCT/CN2020/096006)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- data storage
- storage node
- transaction
- distributed
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- This application relates to the field of databases, and in particular to a data recovery method and system, data storage node, and database management node.
- the data recovery of the database refers to the recovery of the database from the current state of the database to a previous consistent state. For example, after the database fails, the data in the database is restored to the state it was in at a certain point in time before the database failed.
- The database management node can control the data storage nodes to re-execute the logical operations recorded in the logical log of the binary log file, and to perform the corresponding data operations on the backup data of each data storage node according to the results of those logical operations, thereby realizing data recovery of the distributed database.
- the logical log is used to record the original logic of the logical operation performed on the database.
- This application provides a data recovery method and system, a data storage node, and a database management node, which can solve the problem of slow database recovery speed in related technologies.
- this application provides a data recovery method.
- the method is applied to a distributed database system, which includes a database management node and multiple data storage nodes.
- The method includes: a database management node receives a data recovery request, which is used to request data recovery for the distributed database system; the database management node sends a data recovery command to a first data storage node based on the data recovery request, the first data storage node being any one of the multiple data storage nodes; and the first data storage node, as instructed by the data recovery command, performs data operations on its backup data according to the data operations recorded in its physical log file, so as to recover the data of the distributed database system.
- In this method, the database management node sends a data recovery command to a data storage node, so that the data storage node, as instructed by the command, performs data operations on its backup data according to the data operations recorded in its own physical log file.
- The method may further include: when there is an unfinished distributed transaction on the first data storage node and a second data storage node has committed the distributed transaction, the first data storage node commits the distributed transaction, where the second data storage node is any one of the multiple data storage nodes that processes the distributed transaction together with the first data storage node; or, when there is an unfinished distributed transaction on the first data storage node and the second data storage node has rolled back the distributed transaction, the first data storage node rolls back the distributed transaction.
- Cleaning up the distributed transactions in the distributed database system in this way gives the multiple data storage nodes that jointly process the same distributed transaction the same processing state for that transaction, which ensures the consistency of data recovery.
- The process in which the database management node sends the data recovery command to the first data storage node based on the data recovery request may include: when the data recovery request is used to request that the distributed database system be recovered to a target recovery point, the database management node determines, based on the target recovery point and the transaction commit numbers recorded in the physical log files of the multiple data storage nodes, a target transaction commit number used to indicate that the distributed database system is at the target recovery point; and the database management node sends a data recovery command carrying the target transaction commit number to the first data storage node.
- When the data recovery request requests that the distributed database system be recovered to a target recovery point, the distributed database system can be recovered to that target recovery point according to user requirements.
- The database management node determining, based on the target recovery point and the transaction commit numbers recorded in the physical log files of the multiple data storage nodes, the target transaction commit number used to indicate that the distributed database system is at the target recovery point may include: the database management node determines, for each data storage node, the transaction commit number at the target recovery point based on that node's physical log file; and, among the transaction commit numbers of the multiple data storage nodes at the target recovery point, the database management node determines the largest one as the target transaction commit number.
- The first data storage node performing, as instructed by the data recovery command, data operations on its backup data according to the data operations recorded in its physical log file may include: the first data storage node executes on the backup data, in order of the commit times of the multiple transaction commit numbers recorded in its physical log file, the data operations involved in each transaction commit number, until the transaction commit number of the data operation to be executed next is greater than the target transaction commit number.
- This application provides a data storage node, which includes: a receiving module for receiving a data recovery command sent by a database management node; and an execution module for performing, as instructed by the data recovery command, data operations on the backup data of the data storage node according to the data operations recorded in the physical log file of the data storage node, so as to recover the data of the distributed database system.
- The data storage node is any one of the multiple data storage nodes in the distributed database system.
- The execution module is also used to commit the distributed transaction when there is an unfinished distributed transaction on the data storage node and a second data storage node has committed the distributed transaction, where the second data storage node is any one of the multiple data storage nodes that processes the distributed transaction together with the data storage node; or, the execution module is also used to roll back the distributed transaction when there is an unfinished distributed transaction on the data storage node and the second data storage node has rolled back the distributed transaction.
- The execution module is specifically configured to: execute on the backup data, in order of the commit times of the multiple transaction commit numbers recorded in the physical log file of the data storage node, the data operations involved in each transaction commit number, until the transaction commit number of the data operation to be executed next is greater than the target transaction commit number.
- The target transaction commit number is used to indicate that the distributed database system is at the target recovery point, and the data recovery request is used to request that the distributed database system be restored to the target recovery point.
- This application provides a database management node, which includes: a receiving module for receiving a data recovery request, the data recovery request being used to request data recovery for a distributed database system; and a sending module for sending, based on the data recovery request, a data recovery command to a first data storage node, so that the first data storage node, as instructed by the data recovery command, performs data operations on its backup data according to the data operations recorded in its physical log file. The first data storage node is any one of the multiple data storage nodes in the distributed database system.
- The sending module includes: a determining sub-module, which is used to determine, when the data recovery request is used to request that the distributed database system be recovered to a target recovery point, a target transaction commit number used to indicate that the distributed database system is at the target recovery point, based on the target recovery point and the transaction commit numbers recorded in the physical log files of the multiple data storage nodes; and a sending sub-module, which is used to send a data recovery command carrying the target transaction commit number to the first data storage node.
- The determining sub-module is specifically used to: determine, for each data storage node, the transaction commit number at the target recovery point based on that node's physical log file; and, among the transaction commit numbers of the multiple data storage nodes at the target recovery point, determine the largest one as the target transaction commit number.
- this application provides a distributed database system, which includes the database management node of any one of the first aspect and multiple data storage nodes.
- The present application provides a computing device that includes a processor and a memory; the processor executes computer instructions stored in the memory, so that the computing device realizes the function of the database management node in any data recovery method of the first aspect.
- The present application provides a computing device that includes a processor and a memory; the processor executes computer instructions stored in the memory, so that the computing device realizes the function of the data storage node in any data recovery method of the first aspect.
- the present application provides a storage medium, and computer instructions in the storage medium are used to implement the function of a database management node in any data recovery method of the first aspect.
- the present application provides a storage medium, and computer instructions in the storage medium are used to implement the function of a data storage node in any data recovery method of the first aspect.
- the present application provides a computer program product containing instructions.
- the instructions included in the computer program product are used to implement the function of a database management node in any data recovery method of the first aspect.
- this application provides a computer program product containing instructions.
- the instructions included in the computer program product are used to implement the function of a data storage node in any data recovery method of the first aspect.
- FIG. 1 is a schematic structural diagram of a distributed database system involved in a data recovery method provided by an embodiment of the present application
- FIG. 2 is a flowchart of a data recovery method provided by an embodiment of the present application
- FIG. 3 is a flowchart of a method for a database management node to determine a target transaction commit number according to a target recovery point according to an embodiment of the present application
- FIG. 4 is a schematic structural diagram of a data storage node provided by an embodiment of the present application.
- Figure 5 is a schematic structural diagram of another database management node provided by an embodiment of the present application.
- FIG. 6 is a schematic structural diagram of a sending module provided by an embodiment of the present application.
- Fig. 7 is a schematic structural diagram of a computing device provided by an embodiment of the present application.
- a data storage node and a database management node are usually deployed in a database system.
- the data storage node is mainly used to store data.
- the database management node is mainly used to manage the database system.
- log files can be used to record operations performed on the data in the database system.
- The data in the database system can be restored according to the operations recorded in the log file, so as to restore the database system from its current state to a previous state.
- Log files in the database system include logical log files and physical log files.
- the logical log in the logical log file is used to record the original logic of the logical operation performed on the database system.
- the logical log is used to record the original logic of logical operations such as data access, data deletion, data modification, data query, database system upgrade, and database system management performed on the database system.
- the logical operation refers to the process of performing logical processing according to the user's data operation command to determine which data operations need to be performed on the data.
- the original logic of the logical operation may be a computer instruction expressed in a SQL statement.
- the physical log in the physical log file is used to record the changes of the data in the database system (for example, record the changes of the data pages in the data storage node).
- the content of the physical log record can be understood as the data change caused by the logical operation of the database system.
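- The difference between the two kinds of log can be made concrete with a small sketch. The record layouts below (`LogicalLogRecord`, `PhysicalLogRecord`, and their fields) are hypothetical and are not taken from this application; they only illustrate that a logical record keeps the original statement, while a physical record keeps the resulting change to a data page and can be replayed without re-running the statement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalLogRecord:
    """Hypothetical logical log entry: the original statement, not its effect."""
    statement: str


@dataclass(frozen=True)
class PhysicalLogRecord:
    """Hypothetical physical log entry: the change made to one data page."""
    commit_number: int
    page_id: int
    offset: int
    new_bytes: bytes


def apply_physical(pages: dict, rec: PhysicalLogRecord) -> None:
    # Replaying a physical record is a direct byte copy into the page;
    # no SQL is parsed or executed.
    page = pages[rec.page_id]
    page[rec.offset:rec.offset + len(rec.new_bytes)] = rec.new_bytes


# The same update seen from both sides (illustrative values only).
logical = LogicalLogRecord("UPDATE t SET v = 'B' WHERE id = 1")
pages = {7: bytearray(b"AAAA")}
apply_physical(pages, PhysicalLogRecord(commit_number=100, page_id=7,
                                        offset=1, new_bytes=b"B"))
```

Replaying the logical record would require the node to parse and execute the statement again, whereas the physical record is applied with a single slice assignment.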
- logical logs are uniformly stored in a binary log file (binlog).
- The database management node in the distributed database system can control the data storage nodes according to the logical operations recorded in the logical log of the binary log file, so that data operations are performed on the backup data of the data storage nodes to realize data recovery of the distributed database system.
- Each data storage node is configured with its own central processing unit (CPU), memory, and hard disk, and the data storage nodes do not share these resources.
- the logical operations performed on all data storage nodes are uniformly recorded in the binlog, and the physical log in the data storage node records changes in the data in the data storage node.
- The database management node can control each data storage node to perform logical operations according to the logical operations recorded in the binlog, and to perform the corresponding data operations on the backup data of each data storage node according to the results of those logical operations.
- The embodiment of the present application provides a data recovery method in which the database management node sends a data recovery command to a data storage node, so that the data storage node, as instructed by the data recovery command, performs data operations on its backup data according to the data operations recorded in its own physical log file, thereby achieving data recovery in the distributed database system.
- the data recovery method can be used for data recovery of the database in a disaster recovery scenario.
- the distributed database system involved in the data recovery method provided by the embodiment of the present application may include: a database management node and multiple data storage nodes.
- The database management node and each data storage node, as well as different data storage nodes, can be connected through a wired or wireless network.
- Figure 1 is a schematic diagram of a distributed database system that includes database management node 01, data storage node 02, and data storage node 03. Database management node 01 and data storage node 02, database management node 01 and data storage node 03, and data storage node 02 and data storage node 03 are all connected through a wired or wireless network.
- the data storage node is mainly used to store data.
- the database management node is mainly used to manage the distributed database system.
- the database management node is also used to receive a data recovery request sent by the user through the terminal, and send a data recovery command to the data storage node according to the data recovery request.
- the data recovery request is used to request data recovery for the distributed database system.
- The data storage node is also used to perform, as instructed by the data recovery command sent by the database management node, data operations on its backup data according to the data operations recorded in its physical log, so as to recover the data of the distributed database system.
- the method may include the following steps:
- Step 201 The database management node receives a data recovery request.
- the user can send a data recovery request to the database management node through the terminal to request data recovery for the distributed database system. For example, when the database system fails, the user can send a data recovery request to the database management node to request that the database system be restored to the state before the database system fails.
- the data recovery request may also carry a target recovery point, and the target recovery point is used to indicate the consistency state to which the distributed database system is recovered.
- the target recovery point may be the point in time to which the distributed database system is recovered.
- In that case, the data recovery request is used to request that the distributed database system be restored to the state it was in at that point in time, that is, that the distributed database system be restored to that point in time.
- The target recovery point can also be a transaction commit number to which the distributed database system is restored; accordingly, the data recovery request is used to request that the distributed database system be restored to the state it was in after the transaction identified by that transaction commit number was committed.
- The transaction commit number is used to identify a committed database transaction (also simply called a transaction).
- a transaction is a logical unit for data storage nodes to perform database operations, and consists of a sequence of database operations.
- a transaction in the committed state indicates that the transaction has been successfully executed and the data involved in the transaction has been written to the data storage node.
- Step 202 The database management node sends a data recovery command to the first data storage node based on the data recovery request.
- After receiving the data recovery request, the database management node can send a data recovery command to all data storage nodes in the distributed database system, instructing each of them to perform data recovery operations on its own backup data so as to realize data recovery of the distributed database system.
- the first data storage node is any one of multiple data storage nodes in the distributed database system.
- The database management node may determine, according to the target recovery point, a stop condition that indicates when to stop data recovery, and send a data recovery command carrying the stop condition to the first data storage node, instructing the first data storage node to perform the data recovery operation and to stop performing it when the stop condition is reached.
- the stop condition can be represented by a target transaction commit number, that is, the target transaction commit number is used to indicate the target recovery point to which the distributed database system needs to be restored. That is, in the process of performing data recovery, after performing the data operation involved in the target transaction commit number, it can be determined that the distributed database system has been recovered to the target recovery point.
- When the target recovery point is a transaction commit number, the target transaction commit number is that transaction commit number.
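- One way to picture a data recovery command that carries a stop condition is sketched below. The `RecoveryCommand` type, the `build_commands` helper, and the node identifiers are assumptions made for illustration; the application does not define a message format. Treating a missing target as "replay the whole physical log" is likewise an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecoveryCommand:
    # Stop condition expressed as a target transaction commit number.
    # Assumption of this sketch: None means no stop condition was requested.
    target_commit_number: Optional[int] = None


def build_commands(node_ids, target_commit_number=None):
    """Return one identical command per data storage node."""
    cmd = RecoveryCommand(target_commit_number)
    return {node_id: cmd for node_id in node_ids}


commands = build_commands(["node01", "node02"], target_commit_number=104)
```

Every node receives the same target, which is what lets the nodes stop at a mutually consistent point.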
- the process of determining the target transaction commit number by the database management node according to the target recovery point may include:
- Step 2021 The database management node determines the transaction commit number at the target recovery point for each data storage node based on the physical log file of each data storage node.
- The database management node can query the physical log of each data storage node according to the target recovery point and determine the transaction commit number corresponding to the target recovery point in that log; this is the transaction commit number of the data storage node at the target recovery point.
- the transaction commit number corresponding to the target recovery point may be the transaction commit number submitted at the target recovery point.
- the transaction commit number corresponding to the target recovery point may be the transaction commit number that was submitted last before the target recovery point.
- the distributed database system includes a data storage node 01 and a data storage node 02.
- the transaction commit number and commit time recorded in the physical log of data storage node 01 are shown in Table 1.
- According to Table 1, data storage node 01 committed transaction commit number 104 at 10:00, so it can be determined that the transaction commit number of data storage node 01 at the target recovery point is 104.
- According to Table 2, data storage node 02 committed transaction commit number 103 at 10:00, so it can be determined that the transaction commit number of data storage node 02 at the target recovery point is 103.
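- Step 2021 can be sketched as a scan of each node's physical log, taking the last transaction commit number committed at or before the target recovery point. Tables 1 and 2 are not reproduced in this text, so the log contents below are illustrative: only the 10:00 entries (104 for node 01, 103 for node 02) come from the example, the other commit times are invented, and the ordering of commit numbers is inferred from the later replay examples. The function name `commit_number_at` is hypothetical.

```python
from datetime import time


def commit_number_at(log, recovery_point):
    """Return the commit number of the last transaction committed at or
    before recovery_point, or None if nothing was committed by then.

    `log` is a list of (commit_time, commit_number) in commit-time order.
    """
    result = None
    for commit_time, commit_number in log:
        if commit_time > recovery_point:
            break
        result = commit_number
    return result


# Illustrative logs; only the 10:00 rows are taken from the example.
node01_log = [(time(9, 50), 100), (time(10, 0), 104), (time(10, 5), 102),
              (time(10, 6), 101), (time(10, 20), 105)]
node02_log = [(time(9, 50), 100), (time(9, 55), 102), (time(10, 0), 103),
              (time(10, 1), 106)]

at_node01 = commit_number_at(node01_log, time(10, 0))
at_node02 = commit_number_at(node02_log, time(10, 0))
```

With a target recovery point of 10:00, the sketch yields 104 for node 01 and 103 for node 02, matching the values stated in the example.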
- Step 2022 the database management node determines the largest transaction commit number among the transaction commit numbers at the target recovery point of the multiple data storage nodes as the target transaction commit number.
- When a data storage node finishes executing a transaction, it sends the database management node a request to assign a transaction commit number.
- the database management node will allocate a transaction commit number to the data storage node according to the request, so that the data storage node can commit the transaction according to the allocated transaction commit number.
- the database management node allocates the transaction commit number to the data storage node according to the request time for sending the request to allocate the transaction commit number. The earlier the request to assign the transaction commit number is sent, the smaller the transaction commit number allocated by the database management node to the data storage node.
- the database management node will assign the same transaction commit number to multiple data storage nodes that jointly process the same distributed transaction. That is, when the same transaction commit number is recorded in the physical logs of multiple data storage nodes, it means that the multiple data storage nodes jointly process the transaction indicated by the transaction commit number.
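- The allocation rules described above (earlier requests get smaller numbers, and nodes that jointly process one distributed transaction get the same number) can be sketched as follows. The class `CommitNumberAllocator`, the `txn_id` key, and the starting value 100 are hypothetical; the application does not say how the database management node recognizes that two requests belong to the same distributed transaction.

```python
import itertools


class CommitNumberAllocator:
    """Hands out increasing commit numbers in request-arrival order."""

    def __init__(self, start=100):
        self._counter = itertools.count(start)
        self._by_txn = {}

    def allocate(self, txn_id):
        # A later request for the same distributed transaction (for
        # example, from another participating node) reuses the number
        # that was already assigned to it.
        if txn_id not in self._by_txn:
            self._by_txn[txn_id] = next(self._counter)
        return self._by_txn[txn_id]


allocator = CommitNumberAllocator()
first = allocator.allocate("txn-a")         # requested by node 01
second = allocator.allocate("txn-b")        # requested later
same_as_first = allocator.allocate("txn-a") # node 02, same transaction
```

Because the number is shared, seeing the same transaction commit number in two physical logs identifies the nodes that took part in that transaction.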
- Therefore, the transaction commit number with the largest value among the transaction commit numbers of the multiple data storage nodes at the target recovery point can be determined as the target transaction commit number.
- Continuing with the example in step 2021: the transaction commit number of data storage node 01 at the target recovery point is 104, and the transaction commit number of data storage node 02 at the target recovery point is 103, so the target transaction commit number can be determined to be 104.
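- Step 2022 reduces to taking a maximum over the per-node values. A minimal sketch, in which the function name and the dictionary keyed by node name are assumptions for illustration:

```python
def target_commit_number(per_node_commit_numbers):
    """Pick the largest per-node commit number at the target recovery point."""
    return max(per_node_commit_numbers.values())


# Values from the worked example: node 01 is at 104, node 02 is at 103.
target = target_commit_number({"node01": 104, "node02": 103})
```

Choosing the largest value means no node stops replay before a transaction that another node had already committed by the target recovery point.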
- the database management node can determine one or more time points at which the distributed database system is in a consistent state according to the physical logs in each data storage node. Then, select a time point from the one or more time points, determine the target transaction commit number corresponding to the selected time point, and then send the data recovery command carrying the target transaction commit number to multiple data storage nodes, To instruct the multiple data storage nodes to restore the distributed database system to the consistency state corresponding to the selected time point.
- Step 203 The first data storage node, as instructed by the data recovery command, performs data operations on its backup data according to the data operations recorded in its physical log file, so as to recover the data of the distributed database system.
- The physical log file records the data changes caused by the data operations involved in each transaction commit number, in the order of the commit times of the transaction commit numbers. Therefore, when data operations are performed on the backup data of the first data storage node according to the physical log file, the data operations involved in each transaction commit number can be performed on the backup data in order of the commit times of the multiple transaction commit numbers recorded in the physical log file of the first data storage node.
- When the data recovery command carries a stop condition for instructing to stop data recovery, and the execution of the data operations reaches the stop condition, it can be determined that the distributed database system has been restored to the specified consistency point, and the data recovery operation can be stopped at that time.
- When the target transaction commit number is used to indicate the stop condition, and the transaction commit number of the data operation to be executed next is greater than the target transaction commit number, it can be determined that the recovery of the data stored in the first data storage node is complete, and the data recovery operation can be stopped.
- the situation of stopping the data recovery operation includes at least the following two situations:
- In the first case, a transaction commit number greater than the target transaction commit number is recorded in the physical log after the target transaction commit number, and its log record time is adjacent in time sequence to the log record time of the target transaction commit number. In this case, stopping the data recovery operation amounts to stopping once the data operations involved in the target transaction commit number have been completed.
- For example, suppose the target transaction commit number is 103. The transaction commit number 106 is recorded after the target transaction commit number 103, and the log record time of transaction commit number 106 is adjacent in time sequence to the log record time of the target transaction commit number 103. In this case, the data recovery operation can be stopped after the data operations involved in the transactions indicated by transaction commit numbers 100, 102, and 103 have been executed in sequence.
- the essence of stopping the data recovery operation is to stop the data recovery operation after completing the data operation involved in the transaction commit number before the transaction commit number greater than the target transaction commit number.
- the essence of stopping the execution of the data recovery operation may also be that the data operation involved in the target transaction commit number is completed.
- For example, suppose the target transaction commit number determined in step 2022 is 104. The physical log of data storage node 01 records the target transaction commit number 104; transaction commit number 105 is greater than the target transaction commit number 104 and is recorded after it, but the log record time of transaction commit number 105 and the log record time of the target transaction commit number 104 are not adjacent in time sequence. In this case, the data recovery operation can be stopped after the data operations involved in the transactions indicated by transaction commit numbers 100, 104, 102, and 101 have been executed in sequence.
- When the physical log of the first data storage node does not record the target transaction commit number, the data recovery operation can be stopped when the transaction commit number of the data operation to be executed next is greater than the target transaction commit number, so that as much data as possible before the target recovery point is recovered and the integrity of the recovered data is guaranteed.
- the target transaction commit number determined in step 2022 is 104.
- the target transaction commit number 104 is not recorded in the physical log of the data storage node 02
- the transaction commit number of the data storage node 02 at the target recovery point is 103
- the first transaction commit number recorded after the transaction commit number 103 that is greater than 103 is 106.
- after the data operations involved in the transactions indicated by the transaction commit numbers 100, 102, and 103 have been executed in sequence, the transaction commit number of the next data operation to be executed is 106. At this point, it can be determined that the data recovery operation on the data stored in the data storage node 02 has been completed, and the data recovery operation can be stopped.
- Step 204: The first data storage node determines whether there is an unfinished distributed transaction on the first data storage node.
- the distributed transaction in the distributed database system may also be cleaned up.
- performing a cleanup operation on a distributed transaction means: for a distributed transaction that has not finished executing on a data storage node, the transaction is processed according to the processing state that the other data storage nodes hold for it, so that this data storage node and the other data storage nodes have the same processing state for the transaction. This ensures that the multiple data storage nodes that jointly process the distributed transaction hold a consistent processing state for it.
- the process by which the first data storage node determines whether there is an unfinished distributed transaction on the first data storage node may be: the first data storage node queries its physical log, and when the physical log indicates that a distributed transaction is neither committed nor rolled back, the distributed transaction is determined to be an unfinished distributed transaction.
- a functional module for managing distributed transactions can be deployed in the distributed database system, and the functional module can query whether there are unfinished distributed transactions in each data storage node.
- when a data storage node needs to query whether it has an unfinished distributed transaction, it can do so by calling this functional module.
- a data storage node needs to request memory in advance when executing a distributed transaction and uses the requested memory to store the related data produced while executing the transaction; after the distributed transaction is completed (for example, committed or rolled back), the requested memory is refreshed.
- the functional module can therefore query the memory allocated to each data storage node for storing data related to distributed transactions, and when the memory holds data related to a distributed transaction processed by a data storage node, determine that this data storage node has not completed the distributed transaction.
- the functional module can be deployed in physical nodes other than the data storage node and the database management node.
- Step 205: After determining that there is an unfinished distributed transaction on the first data storage node, the first data storage node obtains the processing status of the distributed transaction on the second data storage node.
- the first data storage node may send processing status query requests to the other data storage nodes, asking them to feed back their processing status of the distributed transaction.
- the processing state query request may be sent to a second data storage node, where the second data storage node is any data storage node that processes distributed transactions together with the first data storage node among the multiple data storage nodes.
- after the second data storage node receives the processing status query request, it can query its physical log for the unfinished distributed transaction indicated by the request, to obtain the processing status of that transaction recorded in the physical log.
- the functional module can also obtain the processing status of the unfinished distributed transaction on the other data storage nodes. Therefore, the first data storage node can call the functional module to obtain the processing status of the unfinished distributed transaction on the second data storage node. The functional module can query the physical logs of the other data storage nodes to obtain the processing status of the unfinished distributed transaction recorded there.
- this step 205 may also be executed by the database management node.
- the implementation process may be: after the first data storage node determines that there is an unfinished distributed transaction, it sends the database management node a notification indicating that the distributed transaction has not been completed, and the database management node sends processing status query requests to the other data storage nodes according to the notification.
- after receiving the processing status of the distributed transaction fed back by the other data storage nodes, the database management node sends the processing status to the first data storage node.
- the database management node may also implement this step 205 by calling a function module.
- alternatively, both step 204 and step 205 may be executed by the database management node. In that case, refer to the description of the corresponding step for the implementation process.
- Step 206: When there is an unfinished distributed transaction on the first data storage node and the second data storage node has committed the unfinished distributed transaction, the first data storage node commits the distributed transaction.
- this step 206 can be implemented by the first data storage node calling the functional module. This reduces the resources that the first data storage node spends on handling unfinished distributed transactions, so that more of its resources can be used for data storage and related processing.
- Step 207: When there is an unfinished distributed transaction on the first data storage node and the second data storage node has rolled back the unfinished distributed transaction, the first data storage node rolls back the distributed transaction.
- this step 207 can also be implemented by calling a function module by the first data storage node.
- the data storage node 01 and the data storage node 02 jointly process the distributed transactions indicated by the transaction commit numbers 100, 102, 101, and 107.
- during data recovery, the data storage node 01 committed the transaction commit numbers 100, 104, 102, and 101
- the data storage node 02 committed the transaction commit numbers 100, 102, and 103.
- in step 204, the data storage node 02 can determine from its physical log that it has unfinished distributed transactions, namely the distributed transaction indicated by the transaction commit number 101 and the distributed transaction indicated by the transaction commit number 107.
- in step 205, the data storage node 02 determines that the data storage node 01 has committed the distributed transaction indicated by the transaction commit number 101 and has rolled back the distributed transaction indicated by the transaction commit number 107. Then, in step 206, the data storage node 02 can commit the distributed transaction indicated by the transaction commit number 101, and in step 207, the data storage node 02 can roll back the distributed transaction indicated by the transaction commit number 107.
- in the data recovery method, the database management node sends a data recovery command to the data storage node, so that the data storage node performs, according to the instructions of the data recovery command, data operations on its backup data in accordance with the data operations recorded in the physical log file of the first data storage node, thereby recovering the data of the distributed database system.
- in addition, the distributed transactions in the distributed database system are cleaned up, so that the multiple data storage nodes that jointly process the same distributed transaction have the same processing state for that transaction, which ensures the consistency of data recovery.
- the embodiment of the present application also provides a data storage node, which is used to execute the steps executed by the data storage node in the data recovery method.
- FIG. 4 provides an example of module division of a data storage node.
- the data storage node 40 includes:
- the receiving module 401 is configured to receive a data recovery command sent by the database management node.
- the execution module 402 is configured to perform, according to the instructions of the data recovery command, data operations on the backup data of the data storage node in accordance with the data operations recorded in the physical log file of the data storage node, so as to perform data recovery on the distributed database system.
- the data storage node is any one of multiple data storage nodes in the distributed database system.
- the execution module 402 is further configured to commit the distributed transaction when there is an unfinished distributed transaction on the data storage node and the second data storage node has committed the distributed transaction, where the second data storage node is any one of the multiple data storage nodes that processes the distributed transaction together with the data storage node.
- the execution module 402 is further configured to roll back the distributed transaction when there is an unfinished distributed transaction on the data storage node and the second data storage node has rolled back the distributed transaction.
- the execution module 402 is specifically configured to: in the order of the commit times of the multiple transaction commit numbers recorded in the physical log file of the data storage node, execute on the backup data the data operations involved in each transaction commit number in turn, until the transaction commit number of the next data operation to be executed is greater than the target transaction commit number.
- the target transaction commit number is used to indicate that the distributed database system is at the target recovery point, and the data recovery request is used to request the distributed database system to be restored to the target recovery point.
- the data storage node receives the data recovery command sent by the database management node through the receiving module, and the execution module performs, according to the instructions of the data recovery command, data operations on the backup data of the data storage node in accordance with the data operations recorded in the physical log file of the data storage node, thereby recovering the data of the distributed database system. Compared with related technologies, there is no need to execute a series of logical operations according to the original logic recorded in the logical log, which simplifies the data recovery process and effectively improves the recovery speed of the database.
- after performing the data operations on the backup data of the data storage node in accordance with the data operations recorded in the physical log file of the first data storage node, the execution module performs cleanup operations on the distributed transactions in the distributed database system, so that the multiple data storage nodes that jointly process the same distributed transaction have the same processing state for that transaction, which ensures the consistency of data recovery.
- the embodiment of the present application also provides a database management node, which is used to execute the steps executed by the database management node in the data recovery method.
- FIG. 5 provides an example of module division of the database management node.
- the database management node 60 includes:
- the receiving module 601 is configured to receive a data recovery request, and the data recovery request is used to request data recovery for the distributed database system.
- the sending module 602 is configured to send a data recovery command to the first data storage node based on the data recovery request, so that the first data storage node performs, according to the instructions of the data recovery command, data operations on the backup data of the first data storage node in accordance with the data operations recorded in the physical log file of the first data storage node, so as to perform data recovery on the distributed database system, where the first data storage node is any one of multiple data storage nodes in the distributed database system.
- the sending module 602 includes:
- the determining submodule 6021 is configured to, when the data recovery request is used to request that the distributed database system be restored to the target recovery point, determine, based on the target recovery point and the transaction commit numbers recorded in the physical log files of the multiple data storage nodes in the distributed database system, the target transaction commit number used to indicate that the distributed database system is at the target recovery point.
- the sending submodule 6022 is configured to send a data recovery command carrying the target transaction commit number to the first data storage node.
- the determining submodule 6021 is specifically configured to: determine, for each data storage node and based on the physical log file of that node, the transaction commit number at the target recovery point; and determine the largest of the transaction commit numbers of the multiple data storage nodes at the target recovery point as the target transaction commit number.
- the sending module sends a data recovery command to the first data storage node based on the data recovery request, so that the data storage node performs, according to the instructions of the data recovery command, data operations on its backup data in accordance with the data operations recorded in the physical log file of the first data storage node, thereby recovering the data of the distributed database system.
- the embodiment of the present application also provides a distributed database system, which includes a database management node and a plurality of data storage nodes.
- the database management node is used to implement the function of the database management node in the data recovery method provided in the embodiment of the present application.
- the data storage node is used to implement the functions implemented by the data storage node in the data recovery method provided in the embodiment of the present application.
- the distributed database system may be a database system with a distributed architecture based on data sharding. For example, it can be a MySQL Cluster database.
- the embodiment of the present application also provides a computing device.
- the computing device can be a server or a terminal.
- the aforementioned database management node and/or data storage node may be deployed in the computing device.
- the computing device 70 includes a processor 701, a communication interface 702, and a memory 703.
- the processor 701, the communication interface 702, and the memory 703 are connected to each other through a bus 704.
- the memory 703 is used to store computer instructions.
- when the processor 701 executes a computer instruction in the memory 703, it can implement the function of the computer instruction.
- when the processor 701 executes the computer instructions in the memory 703, it can implement the data recovery method provided in the embodiment of the present application.
- when the database management node is deployed in the computing device and the processor 701 executes the computer instructions in the memory 703, the function of the database management node in the data recovery method provided in the embodiment of the present application can be realized.
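- when the data storage node is deployed in the computing device and the processor 701 executes the computer instructions in the memory 703, the function of the data storage node in the data recovery method provided in the embodiment of the present application can be realized, such as performing step 203 to step 207.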
- the bus 704 can be divided into an address bus, a data bus, a control bus, and so on.
- only one thick line is used to represent the bus in FIG. 7, but this does not mean that there is only one bus or one type of bus.
- the processor 701 may be a hardware chip, and the hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof.
- the above-mentioned PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL) or any combination thereof.
- it may also be a general-purpose processor, for example, a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP.
- the memory 703 may include volatile memory, such as random-access memory (RAM). It may also include non-volatile memory, such as flash memory, a hard disk drive (HDD), or a solid-state drive (SSD). It may also include a combination of the above-mentioned types of memory.
- the embodiment of the present application also provides a storage medium, which is a non-volatile computer-readable storage medium, and the instructions in the storage medium are used to implement the steps executed by the database management node in the data recovery method provided by the embodiment of the present application, or to implement the functional modules of the database management node.
- the embodiment of the present application also provides a storage medium, which is a non-volatile computer-readable storage medium, and the instructions in the storage medium are used to implement the steps executed by the data storage node in the data recovery method provided by the embodiment of the present application, or to implement the functional modules of the data storage node.
- the embodiments of the present application also provide a computer program product containing instructions.
- the instructions included in the computer program product are used to implement the steps executed by the database management node in the data recovery method provided in the embodiments of the present application, or to implement the database management node.
- the computer program product can be stored on the storage medium.
- the embodiments of the present application also provide a computer program product containing instructions.
- the instructions included in the computer program product are used to implement the steps performed by the data storage node in the data recovery method provided in the embodiments of the present application, or are used to implement the data storage node.
- the computer program product can be stored on the storage medium.
- the embodiment of the present application also provides a chip, which includes a programmable logic circuit and/or program instructions, which is used to implement the function of the database management node in the data recovery method provided by the embodiment of the present application when the chip is running.
- An embodiment of the present application also provides a chip, which includes a programmable logic circuit and/or program instructions, which is used to implement the function of a data storage node in the data recovery method provided in the embodiment of the present application when the chip is running.
- the terms “first”, “second” and “third” are only used for descriptive purposes, and cannot be understood as indicating or implying relative importance.
- the term “at least one” refers to one or more, and the term “plurality” refers to two or more, unless expressly defined otherwise.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
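A data recovery method, relating to the field of databases. The method includes: a database management node receives a data recovery request for requesting data recovery for a distributed database system (201), and sends a data recovery command to a first data storage node based on the data recovery request (202). The first data storage node is any one of multiple data storage nodes. After receiving the data recovery command, the first data storage node performs, according to the instructions of the data recovery command, data operations on the backup data of the first data storage node in accordance with the data operations recorded in the physical log file of the first data storage node, so as to perform data recovery on the distributed database system (203). The method is used to perform data recovery on a database; it simplifies the data recovery process and effectively improves the recovery speed of the database.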
Description
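This application relates to the field of databases, and in particular to a data recovery method and system, a data storage node, and a database management node.
With the advent of the big data era, data usage has grown exponentially, and the requirements for database data recovery are becoming ever higher, as are the requirements for the integrity of the recovered data and for the recovery point in time. Performing data recovery on a database means restoring the database from its current state to a previous consistent state. For example, after a database failure, the data in the database is restored to the state it was in at a point in time before the failure.
The logical logs of a distributed database (for example, a distributed database based on data sharding) are stored uniformly in a binary log file (binlog). In related technologies, when the distributed database needs to be restored to a consistent state, the database management node can, based on the binary log file, control the data storage nodes to execute the logical operations recorded in the logical logs of the binary log file and to perform the corresponding data operations on the backup data of each data storage node according to the results of the logical operations, thereby recovering the data of the distributed database. A logical log records the original logic of the logical operations performed on the database.
However, because this data recovery process has to execute logical operations according to the original logic recorded in the logical log, the database is recovered slowly.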
Summary of the invention
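This application provides a data recovery method and system, a data storage node, and a database management node, which can solve the problem of slow database recovery in related technologies.
In a first aspect, this application provides a data recovery method. The method is applied to a distributed database system that includes a database management node and multiple data storage nodes. The method includes: the database management node receives a data recovery request, which is used to request data recovery for the distributed database system; the database management node sends a data recovery command to a first data storage node based on the data recovery request, where the first data storage node is any one of the multiple data storage nodes; and the first data storage node performs, according to the instructions of the data recovery command, data operations on the backup data of the first data storage node in accordance with the data operations recorded in the physical log file of the first data storage node, so as to perform data recovery on the distributed database system.
In the data recovery method provided by the embodiments of this application, the database management node sends a data recovery command to the data storage node, so that the data storage node performs, according to the instructions of the data recovery command, data operations on its backup data in accordance with the data operations recorded in the physical log file of the first data storage node, thereby recovering the data of the distributed database system. Compared with related technologies, there is no need to execute a series of logical operations according to the original logic recorded in the logical log, which simplifies the data recovery process and effectively improves the recovery speed of the database.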
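Optionally, after the first data storage node performs the data operations on its backup data, the method may further include: when there is an unfinished distributed transaction on the first data storage node and a second data storage node has committed the distributed transaction, the first data storage node commits the distributed transaction, where the second data storage node is any one of the multiple data storage nodes that processes the distributed transaction together with the first data storage node; or, when there is an unfinished distributed transaction on the first data storage node and the second data storage node has rolled back the distributed transaction, the first data storage node rolls back the distributed transaction.
After the data operations are performed on the backup data of the data storage node in accordance with the data operations recorded in the physical log file of the first data storage node, a cleanup operation is performed on the distributed transactions in the distributed database system, so that the multiple data storage nodes that jointly process the same distributed transaction have the same processing state for that transaction, which ensures the consistency of data recovery.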
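In an implementation, the process in which the database management node sends the data recovery command to the first data storage node based on the data recovery request may include: when the data recovery request is used to request that the distributed database system be restored to a target recovery point, the database management node determines, based on the target recovery point and the transaction commit numbers recorded in the physical log files of the multiple data storage nodes, a target transaction commit number used to indicate that the distributed database system is at the target recovery point; and the database management node sends a data recovery command carrying the target transaction commit number to the first data storage node.
When the data recovery request requests that the distributed database system be restored to the target recovery point, executing the data recovery method provided by the embodiments of this application according to the request makes it possible to restore the distributed database system to that target recovery point as the user requires.
The process in which the database management node determines the target transaction commit number based on the target recovery point and the transaction commit numbers recorded in the physical log files of the multiple data storage nodes may include: the database management node determines, for each data storage node and based on the physical log file of that node, the transaction commit number at the target recovery point; and the database management node determines the largest of the transaction commit numbers of the multiple data storage nodes at the target recovery point as the target transaction commit number.
The larger the value of a transaction commit number, the later the request to allocate that transaction commit number was sent. Accordingly, the larger the transaction commit number, the closer the operation time of the data operations it involves is to the target recovery point, and the more complete the data obtained by recovering according to it. Therefore, determining the largest transaction commit number as the target transaction commit number ensures that the distributed database system is effectively restored to the target recovery point.
In an implementation, the process in which the first data storage node performs, according to the instructions of the data recovery command, data operations on its backup data in accordance with the data operations recorded in its physical log file may include: the first data storage node executes on the backup data, in the order of the commit times of the multiple transaction commit numbers recorded in its physical log file, the data operations involved in each transaction commit number in turn, until the transaction commit number of the next data operation to be executed is greater than the target transaction commit number.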
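In a second aspect, this application provides a data storage node, which includes: a receiving module, configured to receive a data recovery command sent by a database management node; and an execution module, configured to perform, according to the instructions of the data recovery command, data operations on the backup data of the data storage node in accordance with the data operations recorded in the physical log file of the data storage node, so as to perform data recovery on the distributed database system, where the data storage node is any one of multiple data storage nodes in the distributed database system.
Optionally, the execution module is further configured to commit the distributed transaction when there is an unfinished distributed transaction on the data storage node and a second data storage node has committed the distributed transaction, where the second data storage node is any one of the multiple data storage nodes that processes the distributed transaction together with the data storage node; or, the execution module is further configured to roll back the distributed transaction when there is an unfinished distributed transaction on the data storage node and the second data storage node has rolled back the distributed transaction.
Optionally, the execution module is specifically configured to: in the order of the commit times of the multiple transaction commit numbers recorded in the physical log file of the data storage node, execute on the backup data the data operations involved in each transaction commit number in turn, until the transaction commit number of the next data operation to be executed is greater than the target transaction commit number, where the target transaction commit number is used to indicate that the distributed database system is at the target recovery point, and the data recovery request is used to request that the distributed database system be restored to the target recovery point.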
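In a third aspect, this application provides a database management node, which includes: a receiving module, configured to receive a data recovery request, which is used to request data recovery for the distributed database system; and a sending module, configured to send a data recovery command to a first data storage node based on the data recovery request, so that the first data storage node performs, according to the instructions of the data recovery command, data operations on the backup data of the first data storage node in accordance with the data operations recorded in the physical log file of the first data storage node, where the first data storage node is any one of multiple data storage nodes in the distributed database system.
Optionally, the sending module includes: a determining submodule, configured to, when the data recovery request is used to request that the distributed database system be restored to a target recovery point, determine, based on the target recovery point and the transaction commit numbers recorded in the physical log files of the multiple data storage nodes, the target transaction commit number used to indicate that the distributed database system is at the target recovery point; and a sending submodule, configured to send a data recovery command carrying the target transaction commit number to the first data storage node.
Optionally, the determining submodule is specifically configured to: determine, for each data storage node and based on the physical log file of that node, the transaction commit number at the target recovery point; and determine the largest of the transaction commit numbers of the multiple data storage nodes at the target recovery point as the target transaction commit number.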
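In a fourth aspect, this application provides a distributed database system, which includes the database management node and the multiple data storage nodes of any implementation of the first aspect.
In a fifth aspect, this application provides a computing device, which includes a processor and a memory; the processor executes computer instructions stored in the memory, so that the computing device implements the function of the database management node in the data recovery method of any implementation of the first aspect.
In a sixth aspect, this application provides a computing device, which includes a processor and a memory; the processor executes computer instructions stored in the memory, so that the computing device implements the function of the data storage node in the data recovery method of any implementation of the first aspect.
In a seventh aspect, this application provides a storage medium; the computer instructions in the storage medium are used to implement the function of the database management node in the data recovery method of any implementation of the first aspect.
In an eighth aspect, this application provides a storage medium; the computer instructions in the storage medium are used to implement the function of the data storage node in the data recovery method of any implementation of the first aspect.
In a ninth aspect, this application provides a computer program product containing instructions; the instructions included in the computer program product are used to implement the function of the database management node in the data recovery method of any implementation of the first aspect.
In a tenth aspect, this application provides a computer program product containing instructions; the instructions included in the computer program product are used to implement the function of the data storage node in the data recovery method of any implementation of the first aspect.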
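FIG. 1 is a schematic structural diagram of a distributed database system involved in a data recovery method provided by an embodiment of this application;
FIG. 2 is a flowchart of a data recovery method provided by an embodiment of this application;
FIG. 3 is a flowchart of a method by which a database management node determines a target transaction commit number according to a target recovery point, provided by an embodiment of this application;
FIG. 4 is a schematic structural diagram of a data storage node provided by an embodiment of this application;
FIG. 5 is a schematic structural diagram of a database management node provided by an embodiment of this application;
FIG. 6 is a schematic structural diagram of a sending module provided by an embodiment of this application;
FIG. 7 is a schematic structural diagram of a computing device provided by an embodiment of this application.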
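To make the objectives, technical solutions, and advantages of this application clearer, the implementations of this application are described in further detail below with reference to the accompanying drawings.
A database system usually contains data storage nodes and a database management node. The data storage nodes are mainly used to store data. The database management node is mainly used to manage the database system. In a database system, log files can be used to record the operations performed on the data in the system. Accordingly, when the data in the database system needs to be recovered, recovery operations can be performed on that data according to the operations recorded in the log files, so as to restore the database system from its current state to a previous state.
The log files in a database system include logical log files and physical log files. The logical logs in a logical log file record the original logic of the logical operations performed on the database system, for example, the original logic of logical operations such as data access, data deletion, data modification, data query, database system upgrade, and database system management. A logical operation is the process of performing logical processing according to a user's data operation command to determine which data operations need to be performed on the data. When the data operation command is expressed in structured query language (SQL), the original logic of the logical operation can be computer instructions expressed as SQL statements. The physical logs in a physical log file record the changes to the data in the database system (for example, changes to the data pages in a data storage node). The content of a physical log can be understood as the data changes caused by performing logical operations on the database system.
In a distributed database system that does not share storage resources, the logical logs are stored uniformly in a binary log file (binlog). When data recovery needs to be performed on the data in the database system, the database management node in the distributed database system can, based on the binary log file, control the data storage nodes to perform data operations on the backup data of each data storage node according to the logical operations recorded in the logical logs of the binary log file, thereby recovering the data of the distributed database system.
For example, in a database system with a distributed architecture based on data sharding (shared-nothing architecture), each data storage node is configured with its own central processing unit (CPU), memory, and hard disk, and the storage nodes do not share resources. In such a database system, the binlog uniformly records the logical operations performed on all data storage nodes, and the physical log in each data storage node records the changes to the data in that node. During data recovery, the database management node can control every data storage node to execute the logical operations recorded in the binlog and to perform the corresponding data operations on the backup data of each data storage node according to the results of the logical operations.
However, when data is recovered according to the logical log, the data storage nodes have to execute logical operations according to the original logic recorded in the logical log, so the database system is recovered slowly.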
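The embodiments of this application provide a data recovery method in which the database management node sends a data recovery command to the data storage node, so that the data storage node performs, according to the instructions of the data recovery command, data operations on its backup data in accordance with the data operations recorded in its physical log file, thereby recovering the data of the distributed database system. Compared with related technologies, there is no need to execute a series of logical operations according to the original logic recorded in the logical log, which simplifies the data recovery process and effectively improves the recovery speed of the database. The data recovery method can be used to recover a database in disaster recovery scenarios.
The distributed database system involved in the data recovery method provided by the embodiments of this application may include a database management node and multiple data storage nodes. The database management node and the data storage nodes, as well as different data storage nodes, can be connected through wired or wireless networks. For example, FIG. 1 is a schematic diagram of a distributed database system that includes a database management node 01, a data storage node 02, and a data storage node 03; the database management node 01 and the data storage node 02, the database management node 01 and the data storage node 03, and the data storage node 02 and the data storage node 03 are all connected through wired or wireless networks.
The data storage nodes are mainly used to store data. The database management node is mainly used to manage the distributed database system. The database management node is also used to receive a data recovery request sent by a user through a terminal and to send a data recovery command to the data storage nodes according to the request. The data recovery request is used to request data recovery for the distributed database system. The data storage node is also used to perform, according to the instructions of the data recovery command sent by the database management node, data operations on its backup data in accordance with the data operations recorded in its physical log, so as to perform data recovery on the distributed database system.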
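The implementation process of the data recovery method provided by the embodiments of this application is described below. As shown in FIG. 2, the method may include the following steps:
Step 201: The database management node receives a data recovery request.
A user can send a data recovery request to the database management node through a terminal to request data recovery for the distributed database system. For example, when the database system fails, the user can send a data recovery request to the database management node to request that the database system be restored to its state before the failure.
Optionally, the data recovery request may also carry a target recovery point, which indicates the consistent state to which the distributed database system is to be restored. For example, the target recovery point may be the point in time to which the distributed database system is to be restored. In that case, the data recovery request is used to request that the distributed database system be restored to the state it was in at that point in time, that is, restored to that point in time. Alternatively, the target recovery point may be the transaction commit number to which the distributed database system is to be restored; in that case, the data recovery request is used to request that the distributed database system be restored to the state it was in after that transaction commit number was committed.
A transaction commit number identifies a committed database transaction (also simply called a transaction). A transaction is the logical unit in which a data storage node performs database operations and consists of a sequence of database operations. A transaction in the committed state has been executed successfully, and the data it involves has been written to the data storage node.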
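Step 202: The database management node sends a data recovery command to the first data storage node based on the data recovery request.
After receiving the data recovery request, the database management node can send a data recovery command to all data storage nodes in the distributed database system, instructing every data storage node to perform a data recovery operation on its own backup data so as to recover the data of the distributed database system. The first data storage node is any one of the multiple data storage nodes in the distributed database system.
Optionally, when the data recovery request carries a target recovery point, the database management node can determine, according to the target recovery point, a stop condition that indicates when data recovery should stop, and send a data recovery command carrying the stop condition to the first data storage node, instructing the first data storage node to perform the data recovery operation and to stop when the stop condition is reached.
In an implementation, the stop condition can be expressed as a target transaction commit number, which indicates the target recovery point to which the distributed database system needs to be restored. That is, during data recovery, once the data operations involved in the target transaction commit number have been executed, it can be determined that the distributed database system has been restored to the target recovery point. When the target recovery point is a transaction commit number, the target transaction commit number is that transaction commit number. When the target recovery point is a point in time, as shown in FIG. 3, the process by which the database management node determines the target transaction commit number according to the target recovery point may include: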
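Step 2021: The database management node determines, for each data storage node and based on the physical log file of that node, the transaction commit number at the target recovery point.
Because the physical log file records the transaction commit numbers of all committed transactions and the commit time of each transaction commit number, the database management node can query the physical log of each data storage node according to the target recovery point and determine the transaction commit number that corresponds to the target recovery point on each node; this is the transaction commit number of that data storage node at the target recovery point. The transaction commit number corresponding to the target recovery point can be the transaction commit number committed at the target recovery point. Alternatively, when a data storage node did not commit any transaction commit number at the target recovery point, it can be the transaction commit number committed most recently before the target recovery point.
For example, suppose the target recovery point is the point in time to which the distributed database system is to be restored, and that point in time is 10:00. The distributed database system includes a data storage node 01 and a data storage node 02. The transaction commit numbers recorded in the physical log of the data storage node 01 and their commit times are shown in Table 1. According to Table 1, the data storage node 01 committed the transaction commit number 104 at 10:00, so the transaction commit number of the data storage node 01 at the target recovery point is 104. The transaction commit numbers recorded in the physical log of the data storage node 02 and their commit times are shown in Table 2. According to Table 2, the data storage node 02 committed the transaction commit number 103 at 10:00, so the transaction commit number of the data storage node 02 at the target recovery point is 103.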
Table 1
Transaction commit number | 100 | 104 | 102 | 101 | 105 | 107 |
Commit time | 9:58 | 10:00 | 10:01 | 10:02 | 10:03 | 10:04 |
Table 2
Transaction commit number | 100 | 102 | 103 | 106 | 101 | 107 |
Commit time | 9:58 | 9:59 | 10:00 | 10:02 | 10:03 | 10:04 |
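Step 2022: The database management node determines the largest of the transaction commit numbers of the multiple data storage nodes at the target recovery point as the target transaction commit number.
After a data storage node finishes executing a transaction, it sends the database management node a request to allocate a transaction commit number. The database management node allocates a transaction commit number to the data storage node according to the request, so that the data storage node can commit the transaction under the allocated number. The database management node allocates transaction commit numbers according to the time at which the allocation request was sent: the earlier the request was sent, the smaller the transaction commit number allocated to the data storage node. For a distributed transaction, the database management node allocates the same transaction commit number to all the data storage nodes that jointly process that transaction. That is, when the physical logs of multiple data storage nodes record the same transaction commit number, those data storage nodes jointly processed the transaction indicated by that number.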
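The allocation rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of this application: the class name `CommitNumberAllocator`, the `txn_id` key, and the starting value 100 are hypothetical, and the sketch assumes that requests reach the allocator in the order in which they were sent.

```python
class CommitNumberAllocator:
    """Hypothetical allocator held by the database management node."""

    def __init__(self, start=100):
        self._next = start      # next unused commit number (assumed start value)
        self._by_txn = {}       # txn_id -> commit number already allocated

    def allocate(self, txn_id):
        # Every participant of one distributed transaction gets the same number.
        if txn_id not in self._by_txn:
            # Earlier requests receive smaller numbers.
            self._by_txn[txn_id] = self._next
            self._next += 1
        return self._by_txn[txn_id]


allocator = CommitNumberAllocator()
first = allocator.allocate("txn-a")         # requested by one node
second = allocator.allocate("txn-b")        # a later, unrelated transaction
first_again = allocator.allocate("txn-a")   # another participant of txn-a
```

Because participants share one number, a commit number that appears in several physical logs identifies a distributed transaction, which is what the later cleanup steps rely on.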
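It follows that the larger the value of a transaction commit number, the later the request to allocate that transaction commit number was sent. Accordingly, the larger the transaction commit number, the closer the operation time of the data operations it involves is to the target recovery point, and the more complete the data obtained by recovering according to it. Therefore, to ensure that the distributed database system can be effectively restored to the target recovery point, the transaction commit number with the largest value among those of the multiple data storage nodes at the target recovery point can be determined as the target transaction commit number.
For example, continuing the example in step 2021, the transaction commit number of the data storage node 01 at the target recovery point is 104 and that of the data storage node 02 is 103. To ensure that the distributed database can be effectively restored to the target recovery point, the target transaction commit number can be determined to be 104.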
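Steps 2021 and 2022 can be sketched with the data of Table 1 and Table 2. This is an illustrative sketch only: the function names and the representation of a physical log as a list of `(commit number, commit time)` pairs in log order are assumptions made for the example.

```python
from datetime import time

# Physical log entries in log order: (transaction commit number, commit time).
NODE_01_LOG = [(100, time(9, 58)), (104, time(10, 0)), (102, time(10, 1)),
               (101, time(10, 2)), (105, time(10, 3)), (107, time(10, 4))]
NODE_02_LOG = [(100, time(9, 58)), (102, time(9, 59)), (103, time(10, 0)),
               (106, time(10, 2)), (101, time(10, 3)), (107, time(10, 4))]


def commit_number_at(log, recovery_point):
    """Step 2021: number committed at the recovery point, or the latest one before it."""
    found = None
    for number, committed_at in log:
        if committed_at > recovery_point:
            break               # entries are ordered by commit time
        found = number
    return found


def target_commit_number(logs, recovery_point):
    """Step 2022: the largest per-node commit number at the recovery point."""
    per_node = [commit_number_at(log, recovery_point) for log in logs]
    candidates = [n for n in per_node if n is not None]
    return max(candidates) if candidates else None


target = target_commit_number([NODE_01_LOG, NODE_02_LOG], time(10, 0))
```

With a recovery point of 10:00, the per-node values are 104 and 103, so `target` is 104, matching the example in the text.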
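It should be noted that when the data recovery request does not carry a target recovery point, the request is used to request that the distributed database system be restored to some consistent state. In this case, after receiving the data recovery request, the database management node can determine, according to the physical logs in the data storage nodes, one or more points in time at which the distributed database system was in a consistent state. It then selects one of these points in time, determines the target transaction commit number corresponding to the selected point in time, and sends a data recovery command carrying that target transaction commit number to the multiple data storage nodes, instructing them to restore the distributed database system to the consistent state corresponding to the selected point in time.
Step 203: The first data storage node performs, according to the instructions of the data recovery command, data operations on the backup data of the first data storage node in accordance with the data operations recorded in the physical log file of the first data storage node, so as to perform data recovery on the distributed database system.
The physical log file records, in the order of the commit times of the transaction commit numbers, the data changes caused by the data operations involved in each transaction commit number. Therefore, when data operations are performed on the backup data of the first data storage node in accordance with the physical log file, the data operations involved in each transaction commit number can be executed on the backup data in turn, in the order of the commit times of the transaction commit numbers recorded in the physical log file of the first data storage node.
In addition, if the data recovery command carries a stop condition that indicates when data recovery should stop, then once the execution of the data operations reaches the stop condition, it can be determined that the distributed database system has been restored to the specified consistency point, and the data recovery operation can be stopped. For example, when the stop condition is expressed as a target transaction commit number, then while the data operations involved in the transaction commit numbers are being executed on the backup data in turn, once the transaction commit number of the next data operation to be executed is greater than the target transaction commit number, it can be determined that the data recovery operation on the data stored in the first data storage node is complete, and the data recovery operation can be stopped.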
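Stopping the data recovery operation when the transaction commit number of the next data operation to be executed is greater than the target transaction commit number covers at least the following two cases:
Case 1: When the transaction commit number greater than the target transaction commit number is recorded in the physical log after the target transaction commit number, and its log record time is adjacent in time sequence to that of the target transaction commit number, stopping the data recovery operation amounts to stopping once the data operations involved in the target transaction commit number are complete.
For example, suppose the target transaction commit number is 103. As shown in Table 2, the transaction commit number 106 is recorded after the target transaction commit number 103, and the log record time of the transaction commit number 106 is adjacent in time sequence to that of the target transaction commit number 103. In this case, the data recovery operation can be stopped after the data operations involved in the transactions indicated by the transaction commit numbers 100, 102, and 103 have been executed in sequence.
Case 2: When the transaction commit number greater than the target transaction commit number is recorded in the physical log after the target transaction commit number, and its log record time is not adjacent in time sequence to that of the target transaction commit number, stopping the data recovery operation amounts to stopping after completing the data operations involved in the transaction commit numbers that precede the transaction commit number greater than the target transaction commit number. Alternatively, it can amount to stopping once the data operations involved in the target transaction commit number are complete.
For example, continuing the example in step 2022, the target transaction commit number determined in step 2022 is 104. For the data storage node 01, as shown in Table 1, the physical log records the target transaction commit number 104; the transaction commit number 105 is greater than the target transaction commit number 104 and is recorded after it, and the log record time of the transaction commit number 105 is not adjacent in time sequence to that of the target transaction commit number 104. When step 203 is performed, the data recovery operation can be stopped after the data operations involved in the transactions indicated by the transaction commit numbers 100, 104, 102, and 101 have been executed in sequence. Alternatively, the data recovery operation can be stopped after the data operations involved in the transactions indicated by the transaction commit numbers 100 and 104 have been executed in sequence.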
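As described above, the larger the value of a transaction commit number, the later the request to allocate it was sent, and the later the data operations indicated by it were completed. Therefore, if the data recovery operation is stopped only after the data operations involved in the target transaction commit number, and those involved in the transaction commit numbers preceding the transaction commit number greater than the target transaction commit number, have been completed, as much data as possible before the target recovery point is recovered, which improves the integrity of the recovered data.
In addition, when the first data storage node did not participate in the transaction indicated by the target transaction commit number, the physical log of the first data storage node does not record the target transaction commit number. In this case, the data recovery operation can be stopped when the transaction commit number of the next data operation to be executed is greater than the target transaction commit number, so that as much data as possible before the target recovery point is recovered and the integrity of the recovered data is guaranteed.
For example, continuing the example in step 2022, the target transaction commit number determined in step 2022 is 104. For the data storage node 02, the physical log does not record the target transaction commit number 104; the transaction commit number of the data storage node 02 at the target recovery point is 103, and the first transaction commit number after 103 that is greater than 103 is 106. When step 203 is performed, after the data operations involved in the transactions indicated by the transaction commit numbers 100, 102, and 103 have been executed in sequence, the transaction commit number of the next data operation to be executed is 106. At this point, it can be determined that the data recovery operation on the data stored in the data storage node 02 is complete, and the data recovery operation can be stopped.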
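The replay loop of step 203 with a target transaction commit number as the stop condition can be sketched as follows. The sketch is hedged: the function name `replay_until`, the page names, and the modelling of a physical log entry as `(commit number, page changes)` are hypothetical; only the commit number order comes from Table 1 and Table 2.

```python
def replay_until(backup, physical_log, target_commit_number):
    """Apply logged page changes in log order until the next commit number exceeds the target."""
    applied = []
    for commit_number, page_changes in physical_log:
        if commit_number > target_commit_number:
            break                       # stop condition reached
        backup.update(page_changes)     # physical replay: overwrite the changed pages
        applied.append(commit_number)
    return applied


# Commit number order follows Table 1 and Table 2; page contents are made up.
NODE_01_LOG = [(100, {"p1": "a"}), (104, {"p2": "b"}), (102, {"p1": "c"}),
               (101, {"p3": "d"}), (105, {"p2": "e"}), (107, {"p3": "f"})]
NODE_02_LOG = [(100, {"p1": "a"}), (102, {"p1": "c"}), (103, {"p4": "g"}),
               (106, {"p4": "h"}), (101, {"p3": "d"}), (107, {"p3": "f"})]

backup_01, backup_02 = {}, {}
applied_01 = replay_until(backup_01, NODE_01_LOG, 104)
applied_02 = replay_until(backup_02, NODE_02_LOG, 104)
```

With the target 104, node 01 replays 100, 104, 102, and 101 and stops at 105, while node 02 replays 100, 102, and 103 and stops at 106, even though its log never records 104. This corresponds to the variant that recovers as much data as possible before the target recovery point.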
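Step 204: The first data storage node determines whether there is an unfinished distributed transaction on the first data storage node.
After the first data storage node performs the data recovery operation on its backup data, a cleanup operation can also be performed on the distributed transactions in the distributed database system to ensure that the recovered data is consistent across the data storage nodes. Performing a cleanup operation on a distributed transaction means: for a distributed transaction that has not finished executing on a data storage node, the transaction is processed according to the processing state that the other data storage nodes hold for it, so that this data storage node and the other data storage nodes have the same processing state for the transaction. This ensures that the multiple data storage nodes that jointly process the distributed transaction hold a consistent processing state for it.
In a first implementation, the process by which the first data storage node determines whether it has an unfinished distributed transaction may be: the first data storage node queries its physical log, and when the physical log indicates that a distributed transaction is neither committed nor rolled back, the distributed transaction is determined to be an unfinished distributed transaction.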
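The first implementation can be illustrated with a short sketch. The record layout is an assumption made for the example: the text does not specify how the physical log marks transaction states, so the sketch uses hypothetical `(commit number, event)` pairs with the events `"prepare"`, `"commit"`, and `"rollback"`.

```python
def unfinished_transactions(records):
    """Return transactions that appear in the log but were neither committed nor rolled back."""
    seen = []           # transactions in first-seen order
    finished = set()    # transactions with a commit or rollback record
    for txn, event in records:
        if txn not in seen:
            seen.append(txn)
        if event in ("commit", "rollback"):
            finished.add(txn)
    return [txn for txn in seen if txn not in finished]


# Hypothetical log state of one node after replay.
records = [
    (100, "prepare"), (100, "commit"),
    (101, "prepare"),
    (102, "prepare"), (102, "commit"),
    (107, "prepare"),
]
pending = unfinished_transactions(records)
```

Here `pending` contains 101 and 107, the two transactions for which the log holds neither a commit nor a rollback record.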
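In a second implementation, a functional module for managing distributed transactions can be deployed in the distributed database system, and this functional module can query whether there are unfinished distributed transactions on each data storage node. When a data storage node needs to query whether it has an unfinished distributed transaction, it can do so by calling the functional module. A data storage node needs to request memory in advance when executing a distributed transaction and uses the requested memory to store the related data produced while executing the transaction; after the distributed transaction is completed (for example, committed or rolled back), the requested memory is refreshed. Therefore, the functional module can query the memory allocated to each data storage node for storing data related to distributed transactions, and when the memory holds data related to a distributed transaction processed by a data storage node, determine that this data storage node has not completed the distributed transaction. Optionally, the functional module can be deployed on a physical node other than the data storage nodes and the database management node.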
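Step 205: After determining that the first data storage node has an unfinished distributed transaction, the first data storage node obtains the processing state of the distributed transaction on a second data storage node.
In a first possible implementation, after determining that it has an unfinished distributed transaction, the first data storage node may send a processing state query request to other data storage nodes, to request them to report their processing state for the distributed transaction. For example, the processing state query request may be sent to the second data storage node, where the second data storage node is any one of the multiple data storage nodes that processes the distributed transaction jointly with the first data storage node. After receiving the processing state query request, the second data storage node may query its physical log for the unfinished distributed transaction indicated by the request, to obtain the processing state of that distributed transaction recorded in the physical log.
In a second possible implementation, the functional module can also obtain the processing state of the unfinished distributed transaction on other data storage nodes. Therefore, the first data storage node may call the functional module to obtain the processing state of the unfinished distributed transaction on the second data storage node. The functional module may query the physical logs of the other data storage nodes to obtain the processing state of the unfinished distributed transaction recorded in those physical logs.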
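It should be noted that step 205 may also be performed by the database management node. The implementation may be as follows: after determining that it has an unfinished distributed transaction, the first data storage node sends the database management node a notification indicating that the distributed transaction is unfinished; the database management node sends a processing state query request to the other data storage nodes according to the notification; and after receiving the processing state of the distributed transaction reported by the other data storage nodes, the database management node sends the processing state to the first data storage node. The database management node may also implement step 205 by calling the functional module.
Alternatively, both step 204 and step 205 may be performed by the database management node. In that case, for the implementation, refer to the descriptions of the corresponding steps.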
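Step 206: When the first data storage node has an unfinished distributed transaction and the second data storage node has committed the unfinished distributed transaction, the first data storage node commits the distributed transaction.
When another data storage node has committed the distributed transaction, that data storage node has successfully executed the distributed transaction. In this case, the first data storage node may commit the distributed transaction, so that the processing state of the distributed transaction is consistently the committed state. In one possible implementation, step 206 may be implemented by the first data storage node calling the functional module. This reduces the resources of the first data storage node that are occupied by handling unfinished distributed transactions, so that more of its resources can be used for data storage and related processing.
Step 207: When the first data storage node has an unfinished distributed transaction and the second data storage node has rolled back the unfinished distributed transaction, the first data storage node rolls back the distributed transaction.
When another data storage node has rolled back the distributed transaction, that data storage node did not successfully execute the distributed transaction. In this case, the first data storage node may roll back the distributed transaction, so that the processing state of the distributed transaction is consistently the rolled-back state. Similarly, step 207 may also be implemented by the first data storage node calling the functional module.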
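For example, continuing with the example in step 203, data storage node 01 and data storage node 02 jointly processed the distributed transactions indicated by transaction commit numbers 100, 102, 101, and 107. During data recovery, data storage node 01 committed transaction commit numbers 100, 104, 102, and 101, and data storage node 02 committed transaction commit numbers 100, 102, and 103. In step 204, data storage node 02 can determine from its physical log that it has unfinished distributed transactions, namely the distributed transaction indicated by transaction commit number 101 and the distributed transaction indicated by transaction commit number 107. In step 205, data storage node 02 determines that data storage node 01 has committed the distributed transaction indicated by transaction commit number 101 and has rolled back the distributed transaction indicated by transaction commit number 107. Then, in step 206, data storage node 02 may commit the distributed transaction indicated by transaction commit number 101, and in step 207, data storage node 02 may roll back the distributed transaction indicated by transaction commit number 107.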
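The cleanup decision in steps 206 and 207 amounts to adopting the peer node's final state. The following Python sketch illustrates that rule with the commit numbers from the example; the `TxnState` enum and the `resolve_unfinished` function are hypothetical names, and leaving a transaction undecided when the peer has not finished it either is an assumption of this sketch, since the description does not address that situation.

```python
from enum import Enum
from typing import Dict, Iterable


class TxnState(Enum):
    COMMITTED = "committed"
    ROLLED_BACK = "rolled_back"
    UNFINISHED = "unfinished"


def resolve_unfinished(unfinished: Iterable[int],
                       peer_states: Dict[int, TxnState]) -> Dict[int, TxnState]:
    """Decide, per unfinished transaction, whether to commit or roll back locally."""
    decisions: Dict[int, TxnState] = {}
    for commit_no in unfinished:
        peer_state = peer_states.get(commit_no, TxnState.UNFINISHED)
        # Follow the peer only when it reached a final state (step 206 or 207).
        if peer_state in (TxnState.COMMITTED, TxnState.ROLLED_BACK):
            decisions[commit_no] = peer_state
    return decisions


# Node 02 has unfinished transactions 101 and 107; node 01 reports its states.
node01_states = {101: TxnState.COMMITTED, 107: TxnState.ROLLED_BACK}
decisions = resolve_unfinished([101, 107], node01_states)
for commit_no, state in decisions.items():
    print(commit_no, state.value)
```

Here node 02 commits transaction 101 and rolls back transaction 107, matching the example above.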
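In summary, in the data recovery method provided in the embodiments of this application, the database management node sends a data recovery command to a data storage node, so that the data storage node, as instructed by the data recovery command, performs on its backup data the data operations recorded in the physical log file of the first data storage node, thereby recovering the data of the distributed database system. Compared with the related art, there is no need to execute a series of logical operations according to the original logic recorded in a logical log, which simplifies the data recovery procedure and effectively increases the recovery speed of the database.
In addition, after the data operations recorded in the physical log file of the first data storage node are performed on the backup data of the data storage node, a cleanup operation is performed on the distributed transactions in the distributed database system, so that the multiple data storage nodes that jointly process the same distributed transaction have the same processing state for it, which ensures the consistency of data recovery.
It should be noted that the order of the steps of the data recovery method may be adjusted appropriately, and steps may be added or removed as required. Any variation of the method that a person skilled in the art could readily conceive of within the technical scope disclosed in this application shall fall within the protection scope of this application, and is therefore not described further.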
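An embodiment of this application further provides a data storage node, configured to perform the steps of the data recovery method that are performed by a data storage node.
FIG. 4 provides an example of a module division of the data storage node. As shown in FIG. 4, the data storage node 40 includes:
A receiving module 401, configured to receive a data recovery command sent by a database management node.
An execution module 402, configured to perform, as instructed by the data recovery command, the data operations recorded in the physical log file of the data storage node on the backup data of the data storage node, so as to recover the data of the distributed database system, where the data storage node is any one of multiple data storage nodes in the distributed database system.
Optionally, the execution module 402 is further configured to commit a distributed transaction when the data storage node has the distributed transaction unfinished and a second data storage node has committed the distributed transaction, where the second data storage node is any one of the multiple data storage nodes that processes the distributed transaction jointly with the data storage node.
Alternatively, the execution module 402 is further configured to roll back a distributed transaction when the data storage node has the distributed transaction unfinished and the second data storage node has rolled back the distributed transaction.
Optionally, the execution module 402 is specifically configured to: perform on the backup data, in order of the commit times of the multiple transaction commit numbers recorded in the physical log file of the data storage node, the data operations involved in each corresponding transaction commit number, until the transaction commit number of the data operation to be performed next is greater than a target transaction commit number, where the target transaction commit number indicates that the distributed database system is at a target recovery point, and the data recovery request is used to request that the distributed database system be restored to the target recovery point.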
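In summary, in the data storage node provided in the embodiments of this application, the receiving module receives the data recovery command sent by the database management node, and the execution module, as instructed by the data recovery command, performs the data operations recorded in the physical log file of the data storage node on the backup data of the data storage node, thereby recovering the data of the distributed database system. Compared with the related art, there is no need to execute a series of logical operations according to the original logic recorded in a logical log, which simplifies the data recovery procedure and effectively increases the recovery speed of the database.
In addition, after the data operations recorded in the physical log file of the first data storage node are performed on the backup data of the data storage node, the execution module performs a cleanup operation on the distributed transactions in the distributed database system, so that the multiple data storage nodes that jointly process the same distributed transaction have the same processing state for it, which ensures the consistency of data recovery.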
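An embodiment of this application further provides a database management node, configured to perform the steps of the data recovery method that are performed by a database management node.
FIG. 5 provides an example of a module division of the database management node. As shown in FIG. 5, the database management node 60 includes:
A receiving module 601, configured to receive a data recovery request, where the data recovery request is used to request data recovery for the distributed database system.
A sending module 602, configured to send, based on the data recovery request, a data recovery command to a first data storage node, so that the first data storage node, as instructed by the data recovery command, performs the data operations recorded in the physical log file of the first data storage node on the backup data of the first data storage node, so as to recover the data of the distributed database system, where the first data storage node is any one of multiple data storage nodes in the distributed database system.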
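Optionally, as shown in FIG. 6, the sending module 602 includes:
A determining submodule 6021, configured to: when the data recovery request is used to request that the distributed database system be restored to a target recovery point, determine, based on the target recovery point and the transaction commit numbers recorded in the physical log files of the multiple data storage nodes in the distributed database system, a target transaction commit number that indicates that the distributed database system is at the target recovery point.
A sending submodule 6022, configured to send, to the first data storage node, a data recovery command that carries the target transaction commit number.
Optionally, the determining submodule 6021 is specifically configured to: determine, for each data storage node and based on its physical log file, the transaction commit number at which that data storage node is at the target recovery point; and determine the largest of the transaction commit numbers at which the multiple data storage nodes are at the target recovery point as the target transaction commit number.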
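The rule applied by the determining submodule, taking the largest of the per-node commit numbers at the target recovery point, can be sketched in a few lines of Python. The function name `determine_target_commit_number` and the dictionary input are assumptions for illustration. The values follow the example in the description (data storage node 02 is at 103 and the target is 104); the value 104 for data storage node 01 is inferred from that example rather than quoted.

```python
from typing import Dict


def determine_target_commit_number(per_node_commit_no: Dict[str, int]) -> int:
    """Pick the largest per-node commit number at the target recovery point."""
    if not per_node_commit_no:
        raise ValueError("at least one data storage node is required")
    return max(per_node_commit_no.values())


# Commit number at the target recovery point for each node (node01 inferred).
at_recovery_point = {"node01": 104, "node02": 103}
target = determine_target_commit_number(at_recovery_point)
print(target)
```

Choosing the maximum means that no node stops replaying before its own commit number at the target recovery point has been applied.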
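In summary, in the database management node provided in the embodiments of this application, the sending module sends, based on the data recovery request, a data recovery command to the first data storage node, so that the data storage node, as instructed by the data recovery command, performs the data operations recorded in the physical log file of the first data storage node on the backup data of the data storage node, thereby recovering the data of the distributed database system. Compared with the related art, there is no need to execute a series of logical operations according to the original logic recorded in a logical log, which simplifies the data recovery procedure and effectively increases the recovery speed of the database.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the apparatuses, modules, and submodules described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not repeated here.
An embodiment of this application further provides a distributed database system, which includes a database management node and multiple data storage nodes. The database management node is configured to implement the functions of the database management node in the data recovery method provided in the embodiments of this application. Each data storage node is configured to implement the functions of the data storage node in the data recovery method provided in the embodiments of this application. For the system block diagram of the distributed database system, refer to FIG. 1; details are not repeated here. The distributed database system may be a database system with a distributed architecture based on data sharding, for example, a MySQL Cluster database.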
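An embodiment of this application further provides a computing device. The computing device may be a server, a terminal, or the like. The foregoing database management node and/or data storage node may be deployed in the computing device. As shown in FIG. 7, the computing device 70 includes a processor 701, a communication interface 702, and a memory 703, which are connected to each other through a bus 704.
The memory 703 is configured to store computer instructions. When the processor 701 executes the computer instructions in the memory 703, the functions of the computer instructions can be implemented. For example, when the processor 701 executes the computer instructions in the memory 703, the data recovery method provided in the embodiments of this application can be implemented. For another example, when the database management node is deployed in the computing device and the processor 701 executes the computer instructions in the memory 703, the functions of the database management node in the data recovery method provided in the embodiments of this application can be implemented. For yet another example, when the data storage node is deployed in the computing device and the processor 701 executes the computer instructions in the memory 703, the functions of the data storage node in the data recovery method provided in the embodiments of this application can be implemented, such as performing step 203 to step 207.
In FIG. 7, the bus 704 may be classified into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is used in FIG. 7, but this does not mean that there is only one bus or one type of bus.
In FIG. 7, the processor 701 may be a hardware chip, which may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. Alternatively, the processor 701 may be a general-purpose processor, for example, a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP.
In FIG. 7, the memory 703 may include a volatile memory, for example, a random-access memory (RAM). It may also include a non-volatile memory, for example, a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD). It may further include a combination of the foregoing types of memory.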
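An embodiment of this application further provides a storage medium. The storage medium is a non-volatile computer-readable storage medium, and the instructions in the storage medium are used to implement the steps performed by the database management node in the data recovery method provided in the embodiments of this application, or to implement the functional modules of the database management node.
An embodiment of this application further provides a storage medium. The storage medium is a non-volatile computer-readable storage medium, and the instructions in the storage medium are used to implement the steps performed by the data storage node in the data recovery method provided in the embodiments of this application, or to implement the functional modules of the data storage node.
An embodiment of this application further provides a computer program product containing instructions. The instructions included in the computer program product are used to implement the steps performed by the database management node in the data recovery method provided in the embodiments of this application, or to implement the functional modules of the database management node. The computer program product may be stored on the storage medium.
An embodiment of this application further provides a computer program product containing instructions. The instructions included in the computer program product are used to implement the steps performed by the data storage node in the data recovery method provided in the embodiments of this application, or to implement the functional modules of the data storage node. The computer program product may be stored on the storage medium.
An embodiment of this application further provides a chip. The chip includes a programmable logic circuit and/or program instructions, and when running, the chip is used to implement the functions of the database management node in the data recovery method provided in the embodiments of this application.
An embodiment of this application further provides a chip. The chip includes a programmable logic circuit and/or program instructions, and when running, the chip is used to implement the functions of the data storage node in the data recovery method provided in the embodiments of this application.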
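In the embodiments of this application, the terms "first", "second", and "third" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance. The term "at least one" means one or more, and the term "multiple" means two or more, unless otherwise expressly specified.
The foregoing descriptions are merely optional embodiments of this application and are not intended to limit this application. Any modification, equivalent replacement, or improvement made within the concept and principles of this application shall fall within the protection scope of this application.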
Claims (16)
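- A data recovery method, wherein the method is applied to a distributed database system, the distributed database system comprises a database management node and a plurality of data storage nodes, and the method comprises: receiving, by the database management node, a data recovery request, wherein the data recovery request is used to request data recovery for the distributed database system; sending, by the database management node based on the data recovery request, a data recovery command to a first data storage node, wherein the first data storage node is any one of the plurality of data storage nodes; and performing, by the first data storage node as instructed by the data recovery command, the data operations recorded in a physical log file of the first data storage node on backup data of the first data storage node, so as to recover data of the distributed database system.
- The method according to claim 1, wherein after the first data storage node performs the data operations on the backup data of the first data storage node, the method further comprises: when the first data storage node has an unfinished distributed transaction and a second data storage node has committed the distributed transaction, committing, by the first data storage node, the distributed transaction, wherein the second data storage node is any one of the plurality of data storage nodes that processes the distributed transaction jointly with the first data storage node; or, when the first data storage node has an unfinished distributed transaction and the second data storage node has rolled back the distributed transaction, rolling back, by the first data storage node, the distributed transaction.
- The method according to claim 1 or 2, wherein the sending, by the database management node based on the data recovery request, a data recovery command to a first data storage node comprises: when the data recovery request is used to request that the distributed database system be restored to a target recovery point, determining, by the database management node based on the target recovery point and the transaction commit numbers recorded in the physical log files of the plurality of data storage nodes, a target transaction commit number that indicates that the distributed database system is at the target recovery point; and sending, by the database management node to the first data storage node, a data recovery command that carries the target transaction commit number.
- The method according to claim 3, wherein the determining, by the database management node based on the target recovery point and the transaction commit numbers recorded in the physical log files of the plurality of data storage nodes, a target transaction commit number that indicates that the distributed database system is at the target recovery point comprises: determining, by the database management node for each data storage node based on the physical log file of that data storage node, the transaction commit number at which that data storage node is at the target recovery point; and determining, by the database management node, the largest of the transaction commit numbers at which the plurality of data storage nodes are at the target recovery point as the target transaction commit number.
- The method according to claim 3, wherein the performing, by the first data storage node as instructed by the data recovery command, the data operations recorded in a physical log file of the first data storage node on backup data of the first data storage node comprises: performing on the backup data, by the first data storage node in order of the commit times of the plurality of transaction commit numbers recorded in the physical log file of the first data storage node, the data operations involved in each corresponding transaction commit number, until the transaction commit number of the data operation to be performed next is greater than the target transaction commit number.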
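- A data storage node, wherein the data storage node comprises: a receiving module, configured to receive a data recovery command sent by a database management node; and an execution module, configured to perform, as instructed by the data recovery command, the data operations recorded in a physical log file of the data storage node on backup data of the data storage node, so as to recover data of a distributed database system, wherein the data storage node is any one of a plurality of data storage nodes in the distributed database system.
- The data storage node according to claim 6, wherein the execution module is further configured to commit a distributed transaction when the data storage node has the distributed transaction unfinished and a second data storage node has committed the distributed transaction, wherein the second data storage node is any one of the plurality of data storage nodes that processes the distributed transaction jointly with the data storage node; or, the execution module is further configured to roll back the distributed transaction when the data storage node has the distributed transaction unfinished and the second data storage node has rolled back the distributed transaction.
- The data storage node according to claim 6 or 7, wherein the execution module is specifically configured to: perform on the backup data, in order of the commit times of the plurality of transaction commit numbers recorded in the physical log file of the data storage node, the data operations involved in each corresponding transaction commit number, until the transaction commit number of the data operation to be performed next is greater than a target transaction commit number, wherein the target transaction commit number indicates that the distributed database system is at a target recovery point, and the data recovery request is used to request that the distributed database system be restored to the target recovery point.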
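- A database management node, wherein the database management node comprises: a receiving module, configured to receive a data recovery request, wherein the data recovery request is used to request data recovery for a distributed database system; and a sending module, configured to send, based on the data recovery request, a data recovery command to a first data storage node, so that the first data storage node, as instructed by the data recovery command, performs the data operations recorded in a physical log file of the first data storage node on backup data of the first data storage node, wherein the first data storage node is any one of a plurality of data storage nodes in the distributed database system.
- The database management node according to claim 9, wherein the sending module comprises: a determining submodule, configured to: when the data recovery request is used to request that the distributed database system be restored to a target recovery point, determine, based on the target recovery point and the transaction commit numbers recorded in the physical log files of the plurality of data storage nodes, a target transaction commit number that indicates that the distributed database system is at the target recovery point; and a sending submodule, configured to send, to the first data storage node, a data recovery command that carries the target transaction commit number.
- The database management node according to claim 10, wherein the determining submodule is specifically configured to: determine, for each data storage node based on the physical log file of that data storage node, the transaction commit number at which that data storage node is at the target recovery point; and determine the largest of the transaction commit numbers at which the plurality of data storage nodes are at the target recovery point as the target transaction commit number.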
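- A distributed database system, wherein the system comprises the database management node and the plurality of data storage nodes according to any one of claims 1 to 5.
- A computing device, wherein the computing device comprises a processor and a memory, and the processor executes computer instructions stored in the memory, so that the computing device implements the functions of the database management node in the data recovery method according to any one of claims 1 to 5.
- A computing device, wherein the computing device comprises a processor and a memory, and the processor executes computer instructions stored in the memory, so that the computing device implements the functions of the data storage node in the data recovery method according to any one of claims 1 to 5.
- A storage medium, wherein computer instructions in the storage medium are used to implement the functions of the database management node in the data recovery method according to any one of claims 1 to 5.
- A storage medium, wherein computer instructions in the storage medium are used to implement the functions of the data storage node in the data recovery method according to any one of claims 1 to 5.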
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911099305.1A CN111124751B (zh) | 2019-11-12 | 2019-11-12 | Data recovery method and system, data storage node, and database management node |
CN201911099305.1 | 2019-11-12 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021093323A1 true WO2021093323A1 (zh) | 2021-05-20 |
Family
ID=70495367
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/096006 WO2021093323A1 (zh) | 2019-11-12 | 2020-06-14 | Data recovery method and system, data storage node, and database management node |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111124751B (zh) |
WO (1) | WO2021093323A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
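CN118503018A (zh) * | 2024-07-17 | 2024-08-16 | Hangzhou Hikvision System Technology Co., Ltd. | Data recovery method, apparatus, and system for a distributed database, and electronic device |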
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
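CN111124751B (zh) * | 2019-11-12 | 2023-11-17 | Huawei Cloud Computing Technologies Co., Ltd. | Data recovery method and system, data storage node, and database management node |
CN114090332A (zh) * | 2021-10-14 | 2022-02-25 | Alibaba Cloud Computing Co., Ltd. | Data processing method and apparatus |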
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050193035A1 (en) * | 2004-02-27 | 2005-09-01 | Microsoft Corporation | System and method for recovery units in databases |
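CN103412803A (zh) * | 2013-08-15 | 2013-11-27 | Huawei Technologies Co., Ltd. | Data recovery method and apparatus |
CN105159818A (zh) * | 2015-08-28 | 2015-12-16 | Northeastern University | Log recovery method in in-memory data management and simulation system thereof |
CN108874588A (zh) * | 2018-06-08 | 2018-11-23 | Zhengzhou Yunhai Information Technology Co., Ltd. | Database instance recovery method and apparatus |
CN111124751A (zh) * | 2019-11-12 | 2020-05-08 | Huawei Technologies Co., Ltd. | Data recovery method and system, data storage node, and database management node |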
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5845292A (en) * | 1996-12-16 | 1998-12-01 | Lucent Technologies Inc. | System and method for restoring a distributed checkpointed database |
CN106610876B (zh) * | 2015-10-23 | 2020-11-03 | ZTE Corporation | Data snapshot recovery method and apparatus |
- 2019-11-12 CN CN201911099305.1A patent/CN111124751B/zh active Active
- 2020-06-14 WO PCT/CN2020/096006 patent/WO2021093323A1/zh active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN111124751A (zh) | 2020-05-08 |
CN111124751B (zh) | 2023-11-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20887969; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 20887969; Country of ref document: EP; Kind code of ref document: A1 |