WO2023151443A1 - Synchronizing main database and standby database - Google Patents

Synchronizing main database and standby database


Publication number
WO2023151443A1
Authority
WO
WIPO (PCT)
Prior art keywords
database
standby
data
primary
databases
Prior art date
Application number
PCT/CN2023/071515
Inventor
杨传辉
Original Assignee
北京奥星贝斯科技有限公司
Application filed by 北京奥星贝斯科技有限公司
Publication of WO2023151443A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1448Management of the data involved in backup or backup restore
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2365Ensuring data consistency and integrity
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/466Transaction processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/542Event management; Broadcasting; Multicasting; Notifications

Definitions

  • the present disclosure relates to the technical field of databases, and in particular to a method and device for synchronizing active and standby databases.
  • Disasters may render the database unavailable, making it impossible to continue to provide services.
  • Related technologies can realize disaster recovery based on the active-standby synchronization mechanism. For example, when the primary database is unavailable due to a disaster, the active/standby switchover can be performed, and the standby database can continue to provide services.
  • However, the active-standby synchronous disaster recovery mechanism has shortcomings. For example, when the standby database is unavailable, the primary database cannot continue to serve. Or, data may be lost during active/standby switchover.
  • the present disclosure provides a method and device for synchronizing primary and secondary databases to solve the above problems.
  • a method for synchronizing primary and standby databases is provided. The method is applied to the primary database and includes: receiving a first transaction request, the first transaction request being used to request modification of data in the primary database; in response to the first transaction request, performing a modification operation on the data in the primary database; performing data synchronization with the first standby database according to the modification operation; if the data synchronization fails, sending a notification message to the arbitrator, the notification message being used to notify the arbitrator to delete the first standby database from the first database set, where the databases in the first database set are all standby databases whose data is synchronized with the primary database; and sending a reply to the first transaction request to the initiator of the first transaction.
  • the performing data synchronization with the first standby database according to the modification operation includes: generating a redo log according to the modification operation; and sending the redo log to the first standby database.
  • a method for synchronizing primary and standby databases is provided. The method is applied to the arbitrator and includes: receiving a notification message sent by the primary database, the notification message being used to notify the arbitrator to delete the first standby database from the first database set, where the databases in the first database set are all standby databases whose data is synchronized with the primary database; and, according to the notification message, deleting the first standby database from the first database set.
  • the method further includes: selecting a database from the first set of databases for switching to the primary database.
  • the first database set is recorded in a list.
  • the arbitrator is implemented based on an election protocol.
  • a device for synchronizing primary and standby databases is provided. The device is deployed with a primary database and includes: a first receiving unit, configured to receive a first transaction request, the first transaction request being used to request modification of the data in the primary database; an executing unit, configured to perform a modification operation on the data in the primary database in response to the first transaction request; a synchronizing unit, configured to perform data synchronization with the first standby database according to the modification operation; a first sending unit, configured to send a notification message to the arbitrator if the data synchronization fails, the notification message being used to notify the arbitrator to delete the first standby database from the first database set, where the databases in the first database set are all standby databases whose data is synchronized with the primary database; and a response unit, configured to send a response to the first transaction request to the initiator of the first transaction.
  • the synchronizing unit includes: a generating unit, configured to generate redo logs according to the modification operation; a second sending unit, configured to send redo logs to the first standby database.
  • a device for synchronizing primary and standby databases is provided. The device is deployed with an arbitrator and includes: a second receiving unit, configured to receive a notification message sent by the primary database, the notification message being used to notify the arbitrator to delete the first standby database from the first database set, where the databases in the first database set are all standby databases whose data is synchronized with the primary database; and a deletion unit, configured to delete the first standby database from the first database set according to the notification message.
  • the device further includes: a selection unit, configured to select a database from the first set of databases for switching to the primary database.
  • the first database set is recorded in a list.
  • the arbitrator is implemented based on an election protocol.
  • a computer-readable storage medium on which executable code is stored, and when the executable code is executed, the method as described in the first aspect or the second aspect can be implemented.
  • a computer program product including executable codes, and when the executable codes are executed, the method as described in the first aspect or the second aspect can be implemented.
  • the primary database may notify the arbitrator to delete the first standby database from the first database set and continue to serve.
  • the standby database in the first database set may be selected to perform active-standby switchover without causing data loss. Therefore, based on the method provided by the present disclosure, when any one of the primary database or the first standby database is unavailable, the system can still continue to serve, and the problem of data loss will not occur.
  • FIG. 1 is a schematic flowchart of a method for synchronizing primary and secondary databases provided by an embodiment of the present disclosure.
  • FIG. 2 is a schematic flowchart of another method for synchronizing primary and secondary databases provided by an embodiment of the present disclosure.
  • FIG. 3 is a schematic flowchart of another method for synchronizing primary and secondary databases provided by an embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of an apparatus for synchronizing active and standby databases provided by an embodiment of the present disclosure.
  • FIG. 5 is a schematic structural diagram of another device for synchronizing active and standby databases according to an embodiment of the present disclosure.
  • FIG. 6 is a schematic structural diagram of another device for synchronizing primary and secondary databases provided by an embodiment of the present disclosure.
  • Databases can be used to organize, store and manage data in terms of data structures.
  • the database can perform operations such as adding, deleting, modifying, or searching data in the database according to the modification transaction request.
  • Common databases include Oracle, MySQL, DB2, MongoDB, and Redis.
  • a database is an essential part of a system such as a production system.
  • Disasters may include: force majeure (such as natural disasters or terrorist attacks, etc.), heavy data load (such as high concurrent data or massive data, etc.), or network failures.
  • the related technology proposes a disaster recovery mechanism for the database, so as to realize that the database can continue to serve or restore the service after encountering a disaster, thereby improving the usability and stability of the database.
  • a system may include multiple databases. Every database in the system can synchronize the same replica, i.e. store the same data. Multiple databases can be set in different locations (such as different machines, computer rooms or cities). When a database encounters a disaster, other databases can continue to serve.
  • the system can include a primary database and one or more standby databases.
  • the primary database can read and write data, and the standby database can be used to back up the data in the primary database.
  • the primary database can also be called the main database or the production database, and the standby database can also be called the backup database or the slave database.
  • Disasters can render the primary and/or standby databases unavailable.
  • the unavailability of the primary database may include primary database failure.
  • the situation that the standby database is unavailable may include a failure of the standby database or a network failure between the primary and standby databases.
  • the master-standby synchronization scheme can include Oracle's data protection scheme DataGuard, MySQL master-standby synchronization, and election-based master-standby synchronization.
  • the redo log is a key structure in the database. Redo logs are used to record changes to the database.
  • a redo log can consist of one or more log files.
  • the master database and the standby database can be associated with corresponding redo logs. In the event of a disaster, data recovery can be performed based on redo logs.
  • the primary database can synchronize redo logs to the standby database to achieve data backup.
  • the disaster recovery method based on master-standby synchronization may include steps S101 to S105.
  • Step S101 the master database receives a transaction request.
  • step S102 the master database executes the modification operation corresponding to the transaction request.
  • step S103 the primary database synchronizes the modification operation to the standby database.
  • the primary database can write modification operations to redo log files and synchronize the redo log files to the standby database.
  • Step S104 the master database sends a response to the transaction request to the initiator of the modification transaction.
  • the synchronization between the primary database and the standby database can be achieved through the following modes: maximum protection mode (maximum protection), maximum performance mode (maximum performance) or maximum availability mode (maximum availability).
  • In the highest protection mode, the primary database can respond to the initiator of the modification transaction only when the modification operation has been successfully synchronized to the standby database. Therefore, the highest protection mode can keep the data of the primary database and the data of the standby database completely consistent. As an implementation manner, if the primary database receives a synchronization success message fed back by the standby database, the primary database may respond to the initiator to report that the modification transaction was executed successfully. Alternatively, if the primary database does not receive a synchronization success message from the standby database within a certain period of time, the primary database may not respond to the initiator of the modification transaction.
  • In the highest performance mode, the primary database can respond to the initiator of the transaction request without considering whether the modification transaction has been synchronized to the standby database.
  • the master database can report to the modification transaction initiator that the modification transaction is successfully executed after successfully writing the local redo log.
  • a background thread can asynchronously copy redo logs to the standby database.
  • In the highest availability mode, the synchronization mode is the highest protection mode under normal conditions. If the standby database is unavailable, for example because the standby database fails or the network between the primary and standby databases fails, the synchronization mode degrades to the highest performance mode.
  • the highest protection mode may be called strong synchronous mode, and the highest performance mode or highest availability mode may be called asynchronous replication mode.
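  • The reply rules of the three modes described above can be sketched as follows. This is only an illustrative sketch: the names `SyncMode` and `may_reply_success` and the boolean inputs are assumptions introduced here, not identifiers from the disclosure or from any database product.

```python
from enum import Enum


class SyncMode(Enum):
    MAX_PROTECTION = "maximum protection"
    MAX_PERFORMANCE = "maximum performance"
    MAX_AVAILABILITY = "maximum availability"


def may_reply_success(mode, local_write_ok, standby_acked, standby_available=True):
    """Return True if the primary may report success to the initiator.

    All parameters are hypothetical inputs used for illustration only.
    """
    if not local_write_ok:
        # Nothing can be reported if the local redo log write failed.
        return False
    if mode is SyncMode.MAX_PROTECTION:
        # Reply only after the standby confirmed the synchronization.
        return standby_acked
    if mode is SyncMode.MAX_PERFORMANCE:
        # Reply after the local write; replication happens in the background.
        return True
    # MAX_AVAILABILITY: like maximum protection while the standby is
    # available, degrading to maximum performance when it is not.
    return standby_acked if standby_available else True
```

  • Under these assumptions, only the maximum protection branch can withhold the reply while the standby database is unavailable, which is the availability limitation discussed for that mode.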
  • Election protocols can be implemented based on majority voting. Election protocols can be used to solve the problem of how to reach agreement on a certain value (or resolution). To implement the election protocol, at least 3 nodes need to be deployed to support the majority (more than half) and the minority.
  • the election protocol may include, for example, Paxos or Raft (that is, a simplified version of Paxos).
  • the system needs to deploy at least 3 nodes (also called replicas), and each node is used to store data, that is, the system includes at least 3 databases.
  • the system can deploy 3, 5 or more nodes. Taking the system with 3 nodes deployed as an example, each master-standby synchronization or master-standby switchover must be executed successfully on at least 2 nodes before it can be considered successful. Taking the system deployed with 5 nodes as an example, each master-standby synchronization or master-standby switchover must be executed successfully on at least 3 nodes before it can be considered successful.
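  • The majority arithmetic in the 3-node and 5-node examples above can be written out as a small sketch. The function names `majority` and `quorum_reached` are hypothetical and chosen here only for illustration.

```python
def majority(n_nodes: int) -> int:
    """Smallest number of nodes that is more than half of n_nodes."""
    if n_nodes < 3:
        # The text above states that at least 3 nodes must be deployed.
        raise ValueError("an election protocol needs at least 3 nodes")
    return n_nodes // 2 + 1


def quorum_reached(successful_nodes: int, n_nodes: int) -> bool:
    """True if a synchronization or switchover succeeded on a majority."""
    return successful_nodes >= majority(n_nodes)
```

  • For example, `majority(3)` is 2 and `majority(5)` is 3, matching the two deployments described above; the remaining 1 or 2 nodes form the minority that may be unavailable.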
  • Once the synchronization has succeeded on a majority of nodes, the successful execution of the modification transaction can be fed back to the initiator of the modification transaction.
  • For active-standby switchover, when a minority (either the primary database or the standby database) fails, a new node can be selected through an election protocol to continue providing services.
  • the disaster recovery mechanism can be evaluated by at least one of recovery point objective (RPO) and recovery time objective (RTO).
  • RPO can refer to the maximum length of time over which data may be lost.
  • RTO can refer to the maximum time required from a disaster to the return to normal service. Understandably, the shorter the RPO, the less data may be lost. The shorter the RTO, the sooner the database is up and running again.
  • In the highest protection mode, the data in the standby database is strongly consistent with the data in the primary database.
  • When the primary database fails, no data will be lost when switching between the primary and standby databases.
  • When the standby database fails, primary-standby synchronization cannot be performed, and the service needs to be suspended.
  • In the highest performance mode, when the standby database fails, the primary database can continue to serve, so high availability can be achieved.
  • When the primary database fails, data consistency between the primary database and the standby database cannot be guaranteed because the data in the primary database may not have been synchronized to the standby database, and there is a possibility of data loss when switching to the standby database.
  • the highest availability mode is a combination of the highest protection mode and the highest performance mode. Therefore, it is still impossible to determine that the data in the standby database is consistent with the data in the primary database. When the primary database fails, it is difficult to achieve lossless primary-standby switchover.
  • the active-standby mechanism based on the election protocol can continue to serve when the minority is unavailable (such as node failure or communication failure between nodes). Take the system including 3 nodes for storing data as an example, and the minority is 1 node. That is, when any one of the 3 nodes (primary database or standby database) is unavailable, the majority can continue to serve. It can be seen from this that the active and standby mechanism based on the election protocol can realize automatic lossless disaster recovery. Taking the deployment mode of three nodes in the same computer room as an example, this mode can support non-destructive disaster recovery after a database failure on one machine in the computer room.
  • this mode can support non-destructive disaster recovery after the database failure of one computer room.
  • With the highest protection mode, the highest performance mode or the highest availability mode, service can continue normally either only when the primary database is unavailable or only when the standby database is unavailable, but not in both cases. Therefore, it is difficult for the above three modes to realize automatic lossless disaster recovery.
  • the master-standby synchronization based on the election protocol can realize the lossless disaster recovery of any node (primary database or standby database), but there are many nodes to store data, and the storage cost is high. It can be seen from this that it is difficult for related technologies to achieve lossless automatic disaster recovery, or the cost of realizing lossless automatic disaster recovery is relatively high.
  • Fig. 1 is a schematic flowchart of a method for synchronizing primary and secondary databases proposed by an embodiment of the present disclosure.
  • the method shown in FIG. 1 can be implemented by the primary database, the first standby database, the arbitrator and the initiator.
  • the method shown in FIG. 1 may include step S110 to step S160.
  • Step S110 the primary database receives a first transaction request.
  • a first transaction request may be used to request modifications to data in the primary database.
  • the initiator may send at least one transaction request to the main database, and the first transaction request may be any item in the at least one transaction request.
  • the initiator of the first transaction request may be a client.
  • Step S120 in response to the first transaction request, modifying the data in the master database.
  • the modify operation corresponds to the first transaction request.
  • the modification operation may include at least one of adding, deleting, and changing data in the master database.
  • Step S130 perform data synchronization on the first standby database according to the modification operation.
  • the modification operation may be sent to the first standby database for synchronization by the first standby database.
  • step S130 may include step S131 and step S132.
  • Step S131 generating a redo log according to the modification operation.
  • the primary database can write modification operations to a local log file to generate redo logs.
  • Step S132 sending redo logs to the first standby database.
  • the first standby database may generate redo logs of the first standby database according to the received redo logs.
  • the first standby database can replicate the redo logs of the primary database to the first standby database.
  • the primary database can synchronize data with multiple standby databases.
  • the first standby database can be any one of the multiple standby databases.
  • the primary database may or may not succeed in synchronizing data to the first standby database.
  • If the synchronization succeeds, the first standby database may feed back a message, for example, a message of "successful synchronization".
  • If the synchronization fails, the first standby database may feed back a message, for example, a message of "synchronization failure".
  • In some cases, the first standby database cannot perform feedback, or the feedback information cannot be delivered to the primary database. Therefore, if the first standby database does not feed back the synchronization result within a certain period of time, the primary database may also determine that the synchronization of the data to the first standby database has failed.
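  • The three outcomes above (success message, failure message, no feedback within a period of time) can be sketched with a standard-library queue standing in for the channel from the first standby database. The function name `wait_for_sync_result`, the queue-based channel and the timeout value are assumptions made for this sketch.

```python
import queue


def wait_for_sync_result(feedback_queue: "queue.Queue[str]", timeout_s: float) -> bool:
    """Return True only if "successful synchronization" arrives in time.

    feedback_queue is a hypothetical stand-in for the feedback channel.
    """
    try:
        message = feedback_queue.get(timeout=timeout_s)
    except queue.Empty:
        # No feedback within the period: treat the synchronization as failed.
        return False
    return message == "successful synchronization"
```

  • In this sketch a missing reply and an explicit "synchronization failure" message lead to the same result, so the primary database can handle both with the same failure path.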
  • Step S140 if the data synchronization fails, the master database sends a notification message to the arbitrator.
  • the notification message is used to notify the arbitrator to delete the first standby database from the first database set.
  • Step S150 according to the notification message, the arbitrator deletes the first standby database from the first database set.
  • An arbitrator may maintain a first set of databases. All the databases in the first database set can be standby databases that are synchronized with the data of the primary database. It can be understood that the data of the standby database in the first database set is strongly consistent with the data of the primary database.
  • the arbitrator can refer to the first database set to select the standby database for the master-standby switchover, so as to avoid data loss caused by data inconsistency in the master-standby database during the master-standby switchover.
  • when the arbitrator detects that the primary database is unavailable (for example, the primary database fails), the arbitrator can automatically select a standby database in the first database set to switch over, instead of selecting a standby database that is not in the first database set.
  • No matter whether the data synchronization in step S130 is successful or not, the primary database can execute step S160.
  • Step S160 sending a response to the first transaction request to the initiator of the first transaction.
  • the primary database can reply to the initiator of the first transaction request that the modification transaction succeeded. After the initiator receives the response, it can continue to send other transaction requests.
  • the response of the primary database to the initiator may be independent of whether the data of the first standby database is successfully synchronized with the primary database. Therefore, even if data synchronization to the first standby database fails (that is, the primary database sends a notification message to the arbitrator), the initiator of the first transaction can be answered as long as the modification transaction is executed successfully.
  • the primary database can send a notification message to the arbitrator to remove the first standby database from the first database set before synchronizing data to the first standby database, so that the data in the standby databases of the first database set maintained by the arbitrator is always up to date.
  • when the modification operation fails to be synchronized to the first standby database, that is, the first standby database is unavailable (for example, the first standby database fails or the network between the primary database and the first standby database fails), the primary database may keep serving without interruption and notify the arbitrator to delete the first standby database from the first database set.
  • the arbitrator can select a standby database consistent with the data of the primary database to switch to the primary database through the first database set, without causing data loss. Therefore, based on the method provided in the present disclosure, when any one of the primary database or the first standby database is unavailable, the system can still continue to serve, and the problem of data loss will not occur.
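  • Steps S120 to S160 described above can be sketched in memory as follows. This is a simplified illustration under stated assumptions, not the implementation of the disclosure: the classes `Arbitrator`, `Standby` and `Primary`, their methods, and the use of a key/value pair as a redo record are all hypothetical.

```python
class Arbitrator:
    """Keeps the first database set: standbys known to match the primary."""

    def __init__(self, in_sync_standbys):
        self.first_database_set = set(in_sync_standbys)

    def on_notification(self, standby_name):
        # S150: delete the standby named in the notification message.
        self.first_database_set.discard(standby_name)


class Standby:
    def __init__(self, name):
        self.name = name
        self.available = True
        self.redo_log = []

    def receive_redo(self, record):
        # True models "successful synchronization", False models a failure.
        if not self.available:
            return False
        self.redo_log.append(record)
        return True


class Primary:
    def __init__(self, standby, arbitrator):
        self.data = {}
        self.redo_log = []
        self.standby = standby
        self.arbitrator = arbitrator

    def handle_transaction(self, key, value):
        self.data[key] = value                      # S120: modify data
        record = (key, value)
        self.redo_log.append(record)                # S131: generate redo log
        synced = self.standby.receive_redo(record)  # S132: send redo log
        if not synced:
            # S140: notify the arbitrator before replying to the initiator.
            self.arbitrator.on_notification(self.standby.name)
        return "success"                            # S160: reply either way
```

  • In this sketch the reply in S160 does not depend on the synchronization result, while the arbitrator's set shrinks as soon as a synchronization fails, so a later switchover would only consider standbys whose logs match the primary.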
  • the RPO of the master-slave synchronization solution provided by the present disclosure is equal to 0, and the RTO is less than one minute, that is, automatic lossless disaster recovery can be realized. That is to say, the present disclosure can not only realize the high availability of the database, but also realize that the data will not be lost during the master-standby switchover.
  • the present disclosure requires at least two nodes for data storage (a primary database and a standby database) to achieve lossless disaster recovery. Compared with master-standby synchronization based on the election protocol, which requires at least three nodes for data storage (a primary database and two standby databases), it reduces the data copies that need to be stored, thereby reducing the cost of data storage.
  • the present disclosure does not limit the manner in which the first database set is recorded.
  • the first database set may be recorded in the form of a list, that is, the standby database whose data is consistent with the primary database is recorded in a synchronization list.
  • the primary database can send a first message to the arbitrator to add the first standby database.
  • the arbitrator can add the first standby database to the first database set according to the first message. For example, after the arbitrator deletes the first standby database from the first database set, if the first standby database is restored to an available state and the data synchronization between the primary database and the first standby database is successful, the primary database can notify the arbitrator to add the first standby database to the first database set.
  • the primary database may notify the arbitrator to add the first standby database to the first database set.
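  • The re-adding of a recovered standby database can be sketched as follows. The function `try_rejoin`, the list-based redo logs, and the assumption that the standby's log is a prefix of the primary's log are all introduced here for illustration; the disclosure does not specify how the catch-up is performed.

```python
def try_rejoin(primary_log, standby_log, first_database_set,
               standby_name, standby_available):
    """Catch the standby up and, on success, add it back to the set.

    Assumes (hypothetically) that standby_log is a prefix of primary_log.
    Returns True if the standby was added to first_database_set.
    """
    if not standby_available:
        return False
    # Resend the redo records the standby missed while it was unavailable.
    standby_log.extend(primary_log[len(standby_log):])
    if standby_log != primary_log:
        return False
    # Only now is the arbitrator asked to add the standby again.
    first_database_set.add(standby_name)
    return True
```

  • The ordering is the point of the sketch: the standby is added to the set only after its log matches the primary's, so membership in the set keeps implying data consistency.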
  • the primary database may have multiple standby databases associated with it, and the first standby database may be any one of the multiple standby databases.
  • the first database set may include one or more standby databases.
  • the primary database may send notification messages for different standby databases to the arbitrator, so as to notify the arbitrator which standby database or databases to delete from the first database set.
  • the arbitrator may be a third party that is relatively independent from the primary database and the standby database. Therefore, the arbitrator may be called a third-party arbitrator.
  • When deploying the arbitrator, the arbitrator can be deployed separately from the database (including the primary database, the first standby database or other standby databases). For example, the arbitrator and the database can be deployed on different machines, in different computer rooms or in different cities. When the database suffers a disaster, an arbitrator deployed separately from the database can avoid that disaster, so that the arbitrator can still serve normally.
  • the arbitrator can be implemented through an election protocol to achieve high availability of the arbitrator.
  • the arbitrator can use Paxos or Raft election protocol.
  • the arbitrator may include at least 3 nodes (eg 3 or 5 nodes). When a few nodes of the arbitrator are unavailable (such as node failure or network failure), new nodes can be selected through the election protocol to continue to provide services, so that the arbitrator has high availability. It is understandable that in addition to the election protocol, the arbitrator can also be implemented in other ways that can achieve high availability.
  • the arbitrator can be used to maintain multiple database collections that correspond one-to-one to multiple master databases.
  • Each database set includes a standby database that is synchronized with the corresponding primary database data.
  • a company can deploy a set of globally available arbitrator services, and all the database nodes can share this set of arbitrator services.
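  • An arbitrator shared by several primary databases, with one database set per primary database, can be sketched as a mapping. The class `SharedArbitrator`, its method names and the string identifiers are hypothetical; the election protocol that would make the arbitrator itself highly available is not modelled here.

```python
from collections import defaultdict


class SharedArbitrator:
    """One database set per primary, keyed by a hypothetical primary id."""

    def __init__(self):
        self._sets = defaultdict(set)

    def add(self, primary_id, standby_id):
        self._sets[primary_id].add(standby_id)

    def remove(self, primary_id, standby_id):
        self._sets[primary_id].discard(standby_id)

    def candidates(self, primary_id):
        # Standbys that may take over for this primary, in a stable order.
        return sorted(self._sets.get(primary_id, ()))
```

  • Keeping the sets separate per primary database means that a notification from one primary database cannot change the switchover candidates of another.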
  • FIG. 2 and FIG. 3 are schematic diagrams of an active-standby synchronous disaster recovery method provided by an embodiment of the present disclosure.
  • Fig. 2 shows the master-standby synchronization method when both the master database and the first standby database are available.
  • the method shown in FIG. 2 can be executed by the client, the primary database and the first standby database.
  • the arbitrator may include 3 nodes (the nodes are represented by circles in Figure 3), and is implemented based on the election protocol.
  • the method shown in FIG. 2 may include step S210 to step S250.
  • Step S210 the client sends a transaction modification request to the master database.
  • Step S221 the master database writes a log in a local log file.
  • the local log files can be the redo log files of the primary database.
  • Step S222 synchronizing the modification operation to the first standby database. Step S221 and step S222 may be performed synchronously.
  • Step S230 in response to the master-standby synchronization operation, the first standby database writes the modification operation into the log file of the standby database.
  • step S240 the first standby database replies "successful synchronization" to the primary database.
  • step S250 the master database sends second information to the client, and the second information is used to inform the client that the modification transaction was executed successfully.
  • Fig. 3 shows a fault handling method when the first standby database is unavailable.
  • the method shown in FIG. 3 can be executed by the client, the primary database, the first standby database, and the arbitrator.
  • the arbitrator may include 3 nodes (the nodes are represented by circles in Figure 3), and is implemented based on the election protocol.
  • the method shown in FIG. 3 may include step S310 to step S340.
  • Step S310: the client sends a transaction modification request to the primary database.
  • Step S321: the primary database writes a log to a local log file.
  • The local log file can be a redo log file of the primary database.
  • Step S322: the primary database synchronizes the modification operation to the first standby database.
  • Steps S321 and S322 may be performed simultaneously.
  • Step S330: if the primary database has not received a response to the primary-standby synchronization from the first standby database within a certain period of time (that is, at least one of the first standby database and the network between the first standby database and the primary database has failed), the primary database sends a first message to the arbitrator to notify the arbitrator to remove the first standby database from the synchronization list.
  • Step S340: the primary database sends second information to the client; the second information informs the client that the modification transaction was executed successfully.
  • the arbitrator can select a database (such as a server) in the synchronization list to continue the synchronization service after detecting the failure.
  • The first standby database may be removed from the synchronization list after the primary database fails to synchronize to it. From then on, the primary database can return the second information to the client as soon as it has successfully written the local log file. If the first standby database fails afterwards, nothing further needs to be done. If the primary database fails afterwards, the arbitrator does not select the first standby database to take over as the primary database, because the first standby database is no longer in the synchronization list; data loss is thereby avoided.
  • When the arbitrator detects that the primary database has failed, it can, after a period of time, switch a standby database in the synchronization list (for example the first standby database, if the synchronization list still includes it) to be the primary database so that service continues.
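  • The fault path of FIG. 3 and the later failover decision can be sketched as below. The names `Arbitrator`, `handle_request` and `send_to_standby`, the use of a plain list as the synchronization list, and the choice of the first list entry as the new primary are all assumptions for illustration; the disclosure does not specify how a candidate is chosen or how the timeout is detected.

```python
class Arbitrator:
    """Hypothetical arbitrator holding the synchronization list for one primary."""

    def __init__(self, sync_list):
        self.sync_list = list(sync_list)

    def remove(self, standby_name):
        # Handle the first message: drop the standby from the synchronization list.
        if standby_name in self.sync_list:
            self.sync_list.remove(standby_name)

    def pick_new_primary(self):
        # After a primary failure, only a standby still in the list is eligible.
        return self.sync_list[0] if self.sync_list else None


def handle_request(operation, local_log, send_to_standby, arbitrator, standby_name):
    """Primary-side handling; send_to_standby returns False on timeout or failure."""
    local_log.append(operation)           # S321: write the local log
    if not send_to_standby(operation):    # S322 received no response in time
        arbitrator.remove(standby_name)   # S330: notify the arbitrator
    return "committed"                    # S340: reply to the client


arbitrator = Arbitrator(["standby-1"])
log = []
print(handle_request("op", log, lambda op: False, arbitrator, "standby-1"))
print(arbitrator.pick_new_primary())
```

  • In this sketch a standby that missed an operation can never be promoted, which mirrors the reasoning above: promotion candidates are taken only from the synchronization list, so a promoted standby holds every operation that was acknowledged to the client.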
  • The method embodiments provided by the present disclosure are described above with reference to FIGS. 1 to 3; the apparatus embodiments provided by the present disclosure are described below with reference to FIGS. 4 to 6.
  • FIG. 4 is a schematic structural diagram of an apparatus 400 for synchronizing active and standby databases according to an embodiment of the present disclosure.
  • Apparatus 400 may be a computing device with computing functions, such as a server.
  • A primary database is deployed on the apparatus 400.
  • The apparatus 400 may include a first receiving unit 410, an execution unit 420, a synchronization unit 430, a first sending unit 440 and a response unit 450.
  • The first receiving unit 410 may be used to receive a first transaction request, where the first transaction request is used to request modification of data in the primary database; the execution unit 420 may be used to perform, in response to the first transaction request, a modification operation on the data in the primary database; the synchronization unit 430 may be used to perform data synchronization with the first standby database according to the modification operation; the first sending unit 440 may be used to send a notification message to the arbitrator if the data synchronization fails, where the notification message is used to notify the arbitrator to delete the first standby database from the first database set, and the databases in the first database set are all standby databases whose data is synchronized with the primary database; the response unit 450 may be used to send a response to the first transaction request to the initiator of the first transaction.
  • the synchronizing unit 430 may include: a generating unit and a second sending unit.
  • the generation unit can be used to generate redo logs according to the modification operation.
  • the second sending unit may be used to send redo logs to the first standby database.
  • FIG. 5 is a schematic structural diagram of an apparatus 500 for synchronizing active and standby databases provided by an embodiment of the present disclosure.
  • Apparatus 500 may be a computing device with computing functions, such as a server.
  • An arbitrator is deployed on the apparatus 500.
  • The apparatus 500 may include a second receiving unit 510 and a deleting unit 520.
  • The second receiving unit 510 may be configured to receive a notification message sent by the primary database, where the notification message is used to notify the arbitrator to delete the first standby database from the first database set, and the databases in the first database set are all standby databases whose data is synchronized with the primary database.
  • the deleting unit 520 may be configured to delete the first standby database from the first database set according to the notification message.
  • the apparatus 500 may further include: a selection unit.
  • the selection unit may be used to select a database from the first set of databases for switching to the primary database.
  • The first database set is recorded in a list.
  • the arbitrator is implemented based on an election protocol.
  • Fig. 6 is a schematic structural diagram of an apparatus for synchronizing primary and secondary databases according to yet another embodiment of the present disclosure.
  • the apparatus 600 may be, for example, a computing device with a computing function.
  • the device 600 may be a mobile terminal or a server.
  • the apparatus 600 may include a memory 610 and a processor 620 .
  • Memory 610 may be used to store executable code.
  • the processor 620 can be used to execute the executable code stored in the memory 610, so as to realize the steps in the various methods described above.
  • the apparatus 600 may further include a network interface 630 through which data exchange between the processor 620 and external devices may be implemented.
  • All or part of the above embodiments may be implemented by software, hardware, firmware, or any combination thereof.
  • When implemented using software, they may be implemented in whole or in part in the form of a computer program product.
  • The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions according to the embodiments of the present disclosure are generated.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable devices.
  • The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
  • The available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a digital video disc (DVD)), or a semiconductor medium (such as a solid state disk (SSD)).
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • The mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Provided in the present disclosure are a method and apparatus for synchronizing a main database and a standby database. The method can be applied to a main database, and the method comprises: receiving a first transaction request, wherein the first transaction request is used for requesting the modification of data in a main database; in response to the first transaction request, executing a modification operation on the data in the main database; performing data synchronization with a first standby database according to the modification operation; if data synchronization fails, sending a notification message to an arbitrator, wherein the notification message is used for notifying the arbitrator of the deletion of the first standby database from a first database set, and databases in the first database set are all standby databases which are synchronized with the data in the main database; and sending, to an initiator of a first transaction, a response for the first transaction request.

Description

Synchronizing primary and standby databases
Technical field
The present disclosure relates to the technical field of databases, and in particular to a method and apparatus for synchronizing primary and standby databases.
Background
A disaster may make a database unavailable, so that it can no longer provide service. Related technologies implement disaster recovery with a primary-standby synchronization mechanism. For example, when the primary database is unavailable because of a disaster, a primary-standby switchover can be performed and a standby database continues to provide service. However, primary-standby synchronous disaster recovery mechanisms have a number of problems. For example, when the standby database is unavailable, the primary database cannot continue to serve; or data may be lost during a primary-standby switchover.
Summary
In view of this, the present disclosure provides a method and apparatus for synchronizing primary and standby databases to solve the above problems.
In a first aspect, a method for synchronizing primary and standby databases is provided. The method is applied to a primary database and includes: receiving a first transaction request, where the first transaction request is used to request modification of data in the primary database; in response to the first transaction request, performing a modification operation on the data in the primary database; performing data synchronization with a first standby database according to the modification operation; if the data synchronization fails, sending a notification message to an arbitrator, where the notification message is used to notify the arbitrator to delete the first standby database from a first database set, and the databases in the first database set are all standby databases whose data is synchronized with the primary database; and sending a response to the first transaction request to the initiator of the first transaction.
Optionally, performing data synchronization with the first standby database according to the modification operation includes: generating a redo log according to the modification operation; and sending the redo log to the first standby database.
In a second aspect, a method for synchronizing primary and standby databases is provided. The method is applied to an arbitrator and includes: receiving a notification message sent by a primary database, where the notification message is used to notify the arbitrator to delete a first standby database from a first database set, and the databases in the first database set are all standby databases whose data is synchronized with the primary database; and deleting the first standby database from the first database set according to the notification message.
Optionally, the method further includes: selecting, from the first database set, a database to be switched to the primary database.
Optionally, the first database set is recorded in a list.
Optionally, the arbitrator is implemented based on an election protocol.
In a third aspect, an apparatus for synchronizing primary and standby databases is provided. A primary database is deployed on the apparatus, and the apparatus includes: a first receiving unit, configured to receive a first transaction request, where the first transaction request is used to request modification of data in the primary database; an execution unit, configured to perform, in response to the first transaction request, a modification operation on the data in the primary database; a synchronization unit, configured to perform data synchronization with a first standby database according to the modification operation; a first sending unit, configured to send a notification message to an arbitrator if the data synchronization fails, where the notification message is used to notify the arbitrator to delete the first standby database from a first database set, and the databases in the first database set are all standby databases whose data is synchronized with the primary database; and a response unit, configured to send a response to the first transaction request to the initiator of the first transaction.
Optionally, the synchronization unit includes: a generating unit, configured to generate a redo log according to the modification operation; and a second sending unit, configured to send the redo log to the first standby database.
In a fourth aspect, an apparatus for synchronizing primary and standby databases is provided. An arbitrator is deployed on the apparatus, and the apparatus includes: a second receiving unit, configured to receive a notification message sent by a primary database, where the notification message is used to notify the arbitrator to delete a first standby database from a first database set, and the databases in the first database set are all standby databases whose data is synchronized with the primary database; and a deleting unit, configured to delete the first standby database from the first database set according to the notification message.
Optionally, the apparatus further includes: a selection unit, configured to select, from the first database set, a database to be switched to the primary database.
Optionally, the first database set is recorded in a list.
Optionally, the arbitrator is implemented based on an election protocol.
In a sixth aspect, a computer-readable storage medium is provided, on which executable code is stored; when the executable code is executed, the method described in the first aspect or the second aspect can be implemented.
In a seventh aspect, a computer program product is provided, including executable code; when the executable code is executed, the method described in the first aspect or the second aspect can be implemented.
If synchronizing the first operation to the first standby database fails (that is, the first standby database is unavailable), the primary database can notify the arbitrator to delete the first standby database from the first database set and then continue to serve. When the primary database is unavailable, a standby database in the first database set can be selected for the primary-standby switchover without causing data loss. Therefore, with the method provided by the present disclosure, the system can continue to serve when either the primary database or the first standby database is unavailable, and no data is lost.
Brief description of the drawings
FIG. 1 is a schematic flowchart of a method for synchronizing primary and standby databases provided by an embodiment of the present disclosure.
FIG. 2 is a schematic flowchart of another method for synchronizing primary and standby databases provided by an embodiment of the present disclosure.
FIG. 3 is a schematic flowchart of yet another method for synchronizing primary and standby databases provided by an embodiment of the present disclosure.
FIG. 4 is a schematic structural diagram of an apparatus for synchronizing primary and standby databases provided by an embodiment of the present disclosure.
FIG. 5 is a schematic structural diagram of another apparatus for synchronizing primary and standby databases provided by an embodiment of the present disclosure.
FIG. 6 is a schematic structural diagram of yet another apparatus for synchronizing primary and standby databases provided by an embodiment of the present disclosure.
Detailed description
The technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present disclosure, not all of them.
Database (DB)
A database can be used to organize, store and manage data according to a data structure. In response to a transaction request, a database can add, delete, modify or query the data it holds. Common databases include Oracle, MySQL, DB2, MongoDB and Redis. A database is an important part of a system (for example, a production system).
A database will inevitably encounter disasters and fail as a result. Disasters may include force majeure (such as natural disasters or terrorist attacks), excessive data load (such as highly concurrent or massive data), or network failures. Related technologies propose disaster recovery mechanisms for databases, so that a database can continue or resume service after a disaster, which improves its availability and stability.
As one disaster recovery mechanism, a system (for example, a production system) may include multiple databases. Each database in the system can synchronize the same replica, that is, store the same data. The databases can be placed in different locations (for example, different machines, machine rooms or cities). When one database encounters a disaster, the other databases can continue to serve.
Primary-standby synchronous disaster recovery mechanism
Under the primary-standby synchronous disaster recovery mechanism, a system may include a primary database and one or more standby databases. The primary database handles reads and writes, and a standby database backs up the data in the primary database. The primary database may also be called the main database or production database, and a standby database may also be called a backup database or slave database.
A disaster may make the primary database and/or a standby database unavailable. The primary database may be unavailable because the primary database itself has failed. A standby database may be unavailable because the standby database has failed or because the network between the primary and standby databases has failed.
When the primary database is unavailable because of a disaster, the system can switch to a standby database to continue serving and so avoid the impact of the disaster. Every operation performed on the primary database must be synchronized to the standby database to keep the data in the two databases consistent, so that at switchover the standby database that becomes the primary holds the same data as the original primary database. Primary-standby synchronization schemes include Oracle's data protection scheme DataGuard, MySQL primary-standby synchronization, and primary-standby synchronization based on an election protocol.
The redo log is a key structure in a database. It records the modifications made to the database and may consist of one or more log files. In a primary-standby synchronization scheme, the primary database and the standby database are each associated with their own redo log. When a disaster occurs, data can be recovered from the redo log. The primary database can synchronize its redo log to the standby database to back up the data.
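How a standby can rebuild its data from synchronized redo records can be sketched as below. The record layout (`lsn`, `key`, `value`) and the function `replay` are assumptions chosen for illustration; real redo logs record physical or logical changes in formats specific to each database and are not described in this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoRecord:
    """Hypothetical redo record: a log sequence number plus a key/value change."""
    lsn: int
    key: str
    value: str


def replay(records, state=None):
    """Apply records in log order to rebuild or catch up a key/value state."""
    state = dict(state or {})
    for record in sorted(records, key=lambda r: r.lsn):
        state[record.key] = record.value
    return state


received = [RedoRecord(2, "a", "2"), RedoRecord(1, "a", "1"), RedoRecord(3, "b", "x")]
print(replay(received))
```

Replaying in sequence-number order makes the result independent of the order in which records arrive, so two databases that hold the same records end up with the same data.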
As one implementation, a disaster recovery method based on primary-standby synchronization may include steps S101 to S105. Step S101: the primary database receives a transaction request. Step S102: the primary database performs the modification operation corresponding to the transaction request. Step S103: the primary database synchronizes the modification operation to the standby database; for example, the primary database can write the modification operation into a redo log file and synchronize the redo log file to the standby database. Step S104: the primary database sends a response to the transaction request to the initiator of the modification transaction. The different modes are described in detail below.
Synchronization between the primary database and the standby database can be implemented in one of the following modes: maximum protection mode, maximum performance mode, or maximum availability mode.
In maximum protection mode, the primary database may respond to the initiator of the modification transaction only after the modification operation has been successfully synchronized to the standby database. Maximum protection mode therefore keeps the data of the primary database and the data of the standby database fully consistent. As one implementation, if the primary database receives a synchronization success message from the standby database, it can report to the initiator that the modification transaction was executed successfully. If the primary database does not receive a synchronization success message from the standby database within a certain period of time, it may not respond to the initiator of the modification transaction.
In maximum performance mode, the primary database can respond to the initiator of the transaction request as soon as it has successfully completed the corresponding operation, regardless of whether the modification has been synchronized to the standby database. As one implementation, once the primary database has successfully written its local redo log, it can report to the initiator that the modification transaction was executed successfully. A background thread can copy the redo log to the standby database asynchronously.
In maximum availability mode, if the standby database is available, synchronization behaves as in maximum protection mode. If the standby database is unavailable, for example because the standby database has failed or the network between the primary and standby databases has failed, synchronization degrades to maximum performance mode.
In some embodiments, maximum protection mode may be called strong synchronous mode, and maximum performance mode or maximum availability mode may be called asynchronous replication mode.
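The difference between the three modes comes down to when the primary database may answer the initiator. The decision can be sketched as below; the enum `Mode`, the function `may_ack_client` and its boolean parameters are hypothetical names introduced only for this example and do not correspond to any real database API.

```python
from enum import Enum


class Mode(Enum):
    MAX_PROTECTION = "maximum protection"
    MAX_PERFORMANCE = "maximum performance"
    MAX_AVAILABILITY = "maximum availability"


def may_ack_client(mode, local_write_ok, standby_available, standby_acked):
    """Return True if the primary may report success to the transaction initiator."""
    if not local_write_ok:
        return False
    if mode is Mode.MAX_PERFORMANCE:
        # The standby is updated asynchronously; its state is not consulted.
        return True
    if mode is Mode.MAX_PROTECTION:
        # Without a synchronization success message there is no reply.
        return standby_acked
    # MAX_AVAILABILITY: behave like maximum protection while the standby is
    # available, otherwise degrade to maximum performance behaviour.
    return standby_acked if standby_available else True


print(may_ack_client(Mode.MAX_AVAILABILITY, True, False, False))
```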
Primary-standby synchronous disaster recovery mechanism based on an election protocol
An election protocol can be implemented with majority voting. It addresses the problem of how to reach agreement on a value (or resolution). To run an election protocol, at least 3 nodes must be deployed so that a majority (more than half) and a minority can be formed. Election protocols include, for example, Paxos and Raft (that is, a simplified version of Paxos).
In a primary-standby synchronization scheme based on an election protocol, the system needs at least 3 nodes (also called replicas), each of which stores data; that is, the system includes at least 3 databases. In practice a system can deploy 3, 5 or more nodes. With 3 nodes, each primary-standby synchronization or primary-standby switchover is considered successful only if it succeeds on at least 2 nodes. With 5 nodes, it is considered successful only if it succeeds on at least 3 nodes.
For primary-standby synchronization, once an operation has been synchronized to a majority (more than half) of the nodes, the initiator can be told that the modification transaction was executed successfully. For primary-standby switchover, when a minority of nodes (which may be the primary database or standby databases) fail, a new node can be selected through the election protocol to continue providing service.
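The majority rule above is simple arithmetic, sketched here with hypothetical helper names (`majority`, `is_committed`, `tolerated_failures`). It reproduces the figures in the text: 2 of 3 nodes, or 3 of 5 nodes.

```python
def majority(n_nodes):
    """Smallest number of nodes that is more than half of n_nodes."""
    return n_nodes // 2 + 1


def is_committed(acks, n_nodes):
    """An operation counts as successful once a majority has acknowledged it."""
    return acks >= majority(n_nodes)


def tolerated_failures(n_nodes):
    """Number of nodes that may fail while a majority can still be formed."""
    return n_nodes - majority(n_nodes)


for n in (3, 5):
    print(n, majority(n), tolerated_failures(n))
```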
Lossless automatic disaster recovery
A disaster recovery mechanism can be evaluated by at least one of the recovery point objective (RPO) and the recovery time objective (RTO). The RPO is the maximum period of time over which data may be lost. The RTO is the maximum time needed to return to normal operation after a disaster occurs. The shorter the RPO, the less data may be lost; the shorter the RTO, the longer the database is able to run normally.
If, after a failure, a system can recover automatically within a short time and without losing data, the system achieves lossless automatic disaster recovery. For example, if a system has RPO = 0 and a short RTO (for example, less than one minute), the system can be considered to achieve lossless disaster recovery.
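The criterion can be written as a small check. The function name `is_lossless_automatic` is hypothetical, and the one-minute default is only the example threshold mentioned in the text, not a fixed requirement.

```python
from datetime import timedelta


def is_lossless_automatic(rpo, rto, rto_limit=timedelta(minutes=1)):
    """True if no data can be lost (RPO of zero) and recovery is fast enough."""
    return rpo == timedelta(0) and rto < rto_limit


print(is_lossless_automatic(timedelta(0), timedelta(seconds=30)))
```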
The following analyzes, one by one, how well the primary-standby synchronization schemes described above support lossless automatic disaster recovery.
In the highest protection mode, the data in the standby database is strongly consistent with the data in the primary database, so a primary-standby switchover after a primary database failure loses no data. When the standby database fails, however, primary-standby synchronization cannot proceed and service must be suspended.
In the highest performance mode, the primary database can continue serving when the standby database fails, so high availability is achieved. When the primary database fails, however, some of its data may not yet have been synchronized to the standby database, so consistency between the standby and primary databases cannot be guaranteed and switching to the standby database may lose data.
The highest availability mode combines the highest protection mode and the highest performance mode. It therefore still cannot be determined that the data in the standby database is consistent with the data in the primary database, and a lossless primary-standby switchover is difficult to achieve when the primary database fails.
It follows that, in the highest protection, highest performance, or highest availability mode, a failure of the primary or standby database may cause service suspension or data loss. None of these three modes can readily achieve automatic lossless disaster recovery.
The election-protocol-based primary-standby mechanism can continue serving when a minority of nodes is unavailable (for example, because a node fails or communication between nodes fails). In a system with 3 data-storing nodes, the minority is 1 node: when any one of the 3 nodes (primary or standby database) is unavailable, the majority can continue serving. The election-protocol-based mechanism can therefore achieve automatic lossless disaster recovery. For example, a deployment of 3 nodes in the same machine room can recover losslessly from a database failure on one machine in that room; a deployment across three machine rooms in the same city can recover losslessly from a database failure in one machine room; and a three-location, five-center deployment can recover losslessly from a database failure in one city.
As described above, the highest protection, highest performance, and highest availability modes can keep serving normally either only when the primary database becomes unavailable or only when the standby database becomes unavailable; none can readily keep serving normally when either one of the primary and standby databases is unavailable. These three modes therefore struggle to achieve automatic lossless disaster recovery. Election-protocol-based primary-standby synchronization can recover losslessly from the failure of any node (primary or standby database), but it requires more data-storing nodes and so has a higher storage cost. In short, the related art either cannot readily achieve lossless automatic disaster recovery or achieves it at a relatively high cost.
To address the above problems, the present disclosure proposes a method for synchronizing primary and standby databases. FIG. 1 is a schematic flowchart of a method for synchronizing primary and standby databases according to an embodiment of the present disclosure. The method shown in FIG. 1 may be implemented by a primary database, a first standby database, an arbitrator, and an initiator, and may include steps S110 to S160.
Step S110: the primary database receives a first transaction request.
The first transaction request may be used to request modification of data in the primary database. The initiator may send at least one transaction request to the primary database, and the first transaction request may be any one of them. The initiator of the first transaction request may be a client.
Step S120: in response to the first transaction request, a modification operation is performed on the data in the primary database.
The modification operation corresponds to the first transaction request and may include at least one of adding, deleting, and changing data in the primary database.
Step S130: data synchronization is performed on the first standby database according to the modification operation.
As one implementation, the modification operation may be sent to the first standby database so that the first standby database can synchronize.
Data synchronization between the primary database and the first standby database may be based on redo logs. As one implementation, step S130 may include steps S131 and S132.
Step S131: a redo log is generated according to the modification operation. For example, the primary database may write the modification operation to a local log file to generate the redo log.
Step S132: the redo log is sent to the first standby database.
The first standby database may generate its own redo log from the received redo log. For example, the first standby database may copy the primary database's redo log locally.
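Steps S131 and S132 can be illustrated with an in-memory sketch. Everything here is an assumption made for illustration: the `RedoRecord` fields, the `Database` class, and the direct method call that stands in for network transfer are not specified by the disclosure.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RedoRecord:
    lsn: int              # log sequence number (assumed field)
    op: str               # "insert", "update", or "delete"
    key: str
    value: object = None


@dataclass
class Database:
    data: dict = field(default_factory=dict)
    redo_log: list = field(default_factory=list)

    def append_and_apply(self, record: RedoRecord) -> None:
        # Write the record to the local log first, then apply it to the data.
        self.redo_log.append(record)
        if record.op == "delete":
            self.data.pop(record.key, None)
        else:
            self.data[record.key] = record.value


primary, standby = Database(), Database()
records = [
    RedoRecord(1, "insert", "a", 1),
    RedoRecord(2, "update", "a", 2),
    RedoRecord(3, "insert", "b", 5),
    RedoRecord(4, "delete", "b"),
]
for record in records:
    primary.append_and_apply(record)   # S131: primary generates its redo log
    standby.append_and_apply(record)   # S132: record is shipped; standby copies it
```

Because the standby replays the same records in the same order, its log and data end up identical to the primary's.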
The primary database may synchronize data with multiple standby databases. The first standby database may be any one of the multiple standby databases.
Synchronization from the primary database to the first standby database may succeed or fail. On success, the first standby database may return a message such as "synchronization succeeded". On failure, it may return a message such as "synchronization failed". In some cases, the first standby database cannot respond at all, or its response cannot reach the primary database. Therefore, if the first standby database does not report a synchronization result within a certain time, the primary database may also determine that synchronizing the data to the first standby database has failed.
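The three outcomes described above (explicit success, explicit failure, and no reply within a time limit) can be sketched with a reply queue. The queue, the message constants, and the function name are assumptions for illustration; the disclosure does not prescribe a transport or a timeout value.

```python
import queue

SYNC_OK = "synchronization succeeded"
SYNC_FAILED = "synchronization failed"


def sync_outcome(replies: queue.Queue, timeout_s: float) -> bool:
    """Return True only if the standby reports success before the timeout."""
    try:
        reply = replies.get(timeout=timeout_s)
    except queue.Empty:
        # No reply arrived in time: treat this the same as a failed synchronization.
        return False
    return reply == SYNC_OK
```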
Step S140: if the data synchronization fails, the primary database sends a notification message to the arbitrator.
The notification message notifies the arbitrator to delete the first standby database from the first database set.
Step S150: according to the notification message, the arbitrator deletes the first standby database from the first database set.
The arbitrator may maintain the first database set. Every database in the first database set may be a standby database whose data is synchronized with the primary database. That is, the data of the standby databases in the first database set is strongly consistent with the data of the primary database.
The arbitrator can consult the first database set when selecting the standby database for a primary-standby switchover, which avoids data loss caused by inconsistency between the primary and standby databases at switchover. As one implementation, when the arbitrator detects that the primary database is unavailable (for example, because it has failed), the arbitrator may automatically select a standby database from the first database set for the switchover and does not select any standby database that is not in the set.
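The arbitrator's two duties described above, removing a standby on notification and choosing a switchover target only from the first database set, can be sketched as follows. The class, its method names, and the "pick the first remaining entry" policy are hypothetical; the disclosure does not say how a candidate is chosen among several in-sync standbys.

```python
class Arbitrator:
    """Hypothetical single-process stand-in for the arbitrator."""

    def __init__(self, in_sync_standbys):
        # The first database set: standbys known to be consistent with the primary.
        self._in_sync = list(in_sync_standbys)

    def handle_notification(self, standby_id):
        # S150: delete the standby from the first database set.
        if standby_id in self._in_sync:
            self._in_sync.remove(standby_id)

    def choose_new_primary(self):
        # Only standbys still in the set are eligible; None means no safe candidate.
        return self._in_sync[0] if self._in_sync else None


arbitrator = Arbitrator(["standby-1", "standby-2"])
arbitrator.handle_notification("standby-1")
candidate = arbitrator.choose_new_primary()
```

Here `candidate` is `"standby-2"`, because `standby-1` was removed before the switchover decision.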
Regardless of whether the data synchronization in step S130 succeeds, the primary database may perform step S160.
Step S160: a response to the first transaction request is sent to the initiator of the first transaction.
That is, the primary database can acknowledge to the initiator of the first transaction request that the modification transaction succeeded. After receiving the response, the initiator can go on to send other transaction requests. The primary database's response to the initiator may be independent of whether the first standby database synchronized successfully with the primary database. Therefore, even if synchronization to the first standby database fails (in which case the primary database has sent the notification message to the arbitrator), the initiator of the first transaction can be answered as long as the modification transaction executed successfully.
As one implementation, after synchronization between the primary database and the first standby database fails, modification operations produced by subsequent transaction requests need only execute successfully on the primary database and/or be written successfully to the primary database's log before the initiator is answered; data need not be synchronized to the first standby database afterwards. Note that, before it stops synchronizing data to the first standby database, the primary database may send the notification message so that the arbitrator removes the first standby database from the first database set, which ensures that the data in every standby database in the arbitrator-maintained first database set is always up to date.
It follows that when synchronizing a modification operation to the first standby database fails, that is, when the first standby database is unavailable (for example, the first standby database has failed or the network between it and the primary database has failed), the primary database can keep serving and notify the arbitrator to delete the first standby database from the first database set. When the primary database is unavailable, the arbitrator can use the first database set to select a standby database whose data is consistent with the primary database and switch it to primary, without losing data. With the method provided by the present disclosure, the system can therefore keep serving without data loss when either the primary database or the first standby database is unavailable. The primary-standby synchronization scheme provided by the present disclosure thus has an RPO of 0 and an RTO of less than one minute; that is, it can achieve automatic lossless disaster recovery. In other words, the present disclosure achieves both high database availability and no data loss at primary-standby switchover.
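Steps S110 to S160 can be traced end to end in a small simulation of the primary database's side. This is a sketch under stated assumptions: the `Primary` class, the callback-style `send_to_standby` and `notify_arbitrator` parameters, and the plain Python set standing in for the arbitrator's first database set are all hypothetical. The sketch also assumes the notification to the arbitrator always succeeds.

```python
class Primary:
    """Hypothetical primary-database node with a single standby."""

    def __init__(self, send_to_standby, notify_arbitrator, standby_id="standby-1"):
        self.data = {}
        self.redo_log = []
        self.send_to_standby = send_to_standby      # returns True on success
        self.notify_arbitrator = notify_arbitrator  # asks the arbitrator to drop a standby
        self.standby_id = standby_id
        self.standby_in_sync = True

    def handle_transaction(self, key, value):
        # S110/S120: receive the request and apply the modification locally.
        self.data[key] = value
        self.redo_log.append((key, value))
        # S130: synchronize, unless the standby was already dropped from the set.
        if self.standby_in_sync:
            try:
                ok = self.send_to_standby((key, value))
            except (ConnectionError, TimeoutError):
                ok = False
            if not ok:
                # S140/S150: the arbitrator removes the standby before the client is answered.
                self.notify_arbitrator(self.standby_id)
                self.standby_in_sync = False
        # S160: answer the initiator whether or not synchronization succeeded.
        return "committed"


in_sync_set = {"standby-1"}   # stands in for the arbitrator's first database set
attempts = []


def broken_link(record):
    attempts.append(record)
    raise TimeoutError("no reply from standby")


primary = Primary(broken_link, in_sync_set.discard)
first = primary.handle_transaction("x", 1)
second = primary.handle_transaction("y", 2)
```

Both transactions are acknowledged, the standby is removed from the set after the first failure, and the second transaction no longer attempts synchronization. The ordering matters: the removal happens before the acknowledgement, so a standby that missed a committed write can never be chosen at switchover.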
In addition, the present disclosure needs only 2 data-storing nodes at minimum (one primary database and one standby database) to achieve lossless disaster recovery. Compared with schemes that need at least 3 data-storing nodes (one primary database and two standby databases), such as the election-protocol-based primary-standby synchronization scheme, it stores fewer data replicas and so lowers the cost of data storage.
The present disclosure does not limit how the first database set is recorded. As one implementation, the first database set may be recorded as a list; that is, a synchronization list records the standby databases whose data is consistent with the primary database.
As one implementation, if all modification operations of the primary database have been synchronized successfully to the first standby database, the primary database may send a first message for adding the first standby database, and the arbitrator may add the first standby database to the first database set according to the first message. For example, after the arbitrator has deleted the first standby database from the first database set, if the first standby database becomes available again and data synchronization between the primary database and the first standby database succeeds, the primary database may notify the arbitrator to add the first standby database to the first database set. Alternatively, after the first standby database first registers with the primary database and data synchronization between them succeeds, the primary database may notify the arbitrator to add the first standby database to the first database set.
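The re-admission path described above can be sketched as a catch-up step followed by an add. The function name is hypothetical, logs are modeled as plain lists, and the sketch assumes the standby's log is a prefix of the primary's log; the disclosure does not specify how catch-up is performed or how writes arriving during catch-up are handled.

```python
def catch_up_and_rejoin(primary_log, standby_log, in_sync, standby_id):
    """Ship missing records, then re-add the standby only if it is fully caught up."""
    # Assumption: standby_log is a prefix of primary_log.
    missing = primary_log[len(standby_log):]
    standby_log.extend(missing)
    if standby_log == primary_log:
        # Corresponds to the "first message": the arbitrator adds the standby to the set.
        in_sync.add(standby_id)
    return len(missing)


primary_log = ["r1", "r2", "r3"]
standby_log = ["r1"]
in_sync = set()
shipped = catch_up_and_rejoin(primary_log, standby_log, in_sync, "standby-1")
```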
The primary database may have multiple standby databases associated with it, and the first standby database may be any one of them. In this case, the first database set may include one or more standby databases. The primary database may send the arbitrator notification messages for different standby databases to tell it which standby database or databases to delete from the first database set.
The arbitrator may be a third party that is relatively independent of the primary and standby databases, and may therefore be called a third-party arbitrator or third-party arbiter.
The arbitrator may be deployed separately from the databases (the primary database, the first standby database, and any other standby databases). For example, the arbitrator and the databases may be deployed on different machines, in different machine rooms, or in different cities. An arbitrator deployed separately from the databases can escape a disaster that strikes the databases, so it can keep serving normally even then.
The arbitrator may be implemented with an election protocol to make the arbitrator highly available. As one implementation, the arbitrator may use the Paxos or Raft election protocol. The arbitrator may include at least 3 nodes (for example, 3 or 5 nodes). When a minority of the arbitrator's nodes is unavailable (for example, because of a node failure or a network failure), a new node can be elected through the election protocol to continue providing service, which makes the arbitrator highly available. Besides election protocols, the arbitrator may also be implemented in other ways that achieve high availability.
Multiple systems may share one arbitrator, which lowers the arbitrator's deployment cost. That is, the arbitrator may maintain multiple database sets in one-to-one correspondence with multiple primary databases, each set including the standby databases whose data is synchronized with the corresponding primary database. For example, a company may deploy a single globally available arbitrator service that all of its database nodes share.
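A shared arbitrator that keeps one database set per primary database can be sketched as a mapping from primary identifier to set. The class name, method names, and identifiers such as `"orders-primary"` are hypothetical and used only for illustration.

```python
from collections import defaultdict


class SharedArbitrator:
    """Hypothetical arbitrator serving several primary databases at once."""

    def __init__(self):
        # One in-sync set per primary database, keyed by the primary's identifier.
        self._sets = defaultdict(set)

    def add(self, primary_id, standby_id):
        self._sets[primary_id].add(standby_id)

    def remove(self, primary_id, standby_id):
        self._sets[primary_id].discard(standby_id)

    def in_sync(self, primary_id):
        # Read-only view; unknown primaries simply have an empty set.
        return frozenset(self._sets.get(primary_id, ()))


arb = SharedArbitrator()
arb.add("orders-primary", "orders-standby")
arb.add("billing-primary", "billing-standby")
arb.remove("orders-primary", "orders-standby")
```

Removing a standby for one primary leaves the sets of the other primaries untouched.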
FIG. 2 and FIG. 3 are schematic diagrams of a primary-standby synchronization and disaster recovery method according to an embodiment of the present disclosure.
FIG. 2 shows the primary-standby synchronization method when both the primary database and the first standby database are available. The method shown in FIG. 2 may be performed by a client, the primary database, and the first standby database. The arbitrator may include 3 nodes (shown as circles in FIG. 3) and be implemented based on an election protocol.
The method shown in FIG. 2 may include steps S210 to S250.
Step S210: the client sends a modification transaction request to the primary database.
Step S221: the primary database writes a log entry to its local log file. The local log file may be the primary database's redo log file.
Step S222: the modification operation is synchronized to the first standby database. Steps S221 and S222 may be performed concurrently.
Step S230: in response to the primary-standby synchronization operation, the first standby database writes the modification operation to the standby database's log file.
Step S240: the first standby database replies "synchronization succeeded" to the primary database.
Step S250: the primary database sends second information to the client; the second information acknowledges to the client that the modification transaction executed successfully.
FIG. 3 shows the fault handling method when the first standby database is unavailable. The method shown in FIG. 3 may be performed by the client, the primary database, the first standby database, and the arbitrator. The arbitrator may include 3 nodes (shown as circles in FIG. 3) and be implemented based on an election protocol.
The method shown in FIG. 3 may include steps S310 to S340.
Step S310: the client sends a modification transaction request to the primary database.
Step S321: the primary database writes a log entry to its local log file. The local log file is the primary database's redo log file.
Step S322: the modification operation is synchronized to the first standby database. Steps S321 and S322 may be performed concurrently.
Step S330: if, after a certain time, the primary database has not received the first standby database's response to the primary-standby synchronization (that is, at least one of the first standby database and the network between it and the primary database has failed), the primary database sends first information to the arbitrator to notify it to remove the first standby database from the synchronization list.
Step S340: the primary database sends second information to the client; the second information acknowledges to the client that the modification transaction succeeded.
With the methods shown in FIG. 2 and FIG. 3, the data in every standby database in the arbitrator-maintained synchronization list is consistent with the data in the primary database. Subsequently, whether the primary database or a standby database becomes unavailable, the arbitrator can, after detecting the failure, select a database (for example, a server) from the synchronization list to continue providing service.
For example, when the first standby database fails, the first standby database may be removed from the synchronization list after the primary database fails to synchronize to it. Thereafter, the primary database can return the second information to the client as soon as it has written its local log file successfully.
Alternatively, when the network between the primary database and the first standby database fails, the first standby database may be removed from the synchronization list after the primary database fails to synchronize to it. Thereafter, the primary database can return the second information to the client as soon as it has written its local log file successfully. If the first standby database subsequently fails, no handling is needed. If the primary database subsequently fails, the arbitrator does not select the first standby database to become the primary database, because it is not in the synchronization list, and data loss is avoided.
Alternatively, when the primary database fails, the arbitrator detects the failure and, after a period of time, may switch a standby database in the synchronization list (the first standby database, if the list includes it) to primary so that service continues.
The method embodiments of the present disclosure have been described above with reference to FIG. 1 to FIG. 3; the apparatus embodiments are described below with reference to FIG. 4 to FIG. 6.
FIG. 4 is a schematic structural diagram of an apparatus 400 for synchronizing primary and standby databases according to an embodiment of the present disclosure. The apparatus 400 may be a computing device with computing capability, such as a server. The primary database is deployed on the apparatus 400. The apparatus 400 may include a first receiving unit 410, an execution unit 420, a synchronization unit 430, a first sending unit 440, and a response unit 450.
The first receiving unit 410 may be configured to receive a first transaction request, the first transaction request being used to request modification of data in the primary database. The execution unit 420 may be configured to perform, in response to the first transaction request, a modification operation on the data in the primary database. The synchronization unit 430 may be configured to perform data synchronization with a first standby database according to the modification operation. The first sending unit 440 may be configured to send a notification message to an arbitrator if the data synchronization fails, the notification message being used to notify the arbitrator to delete the first standby database from a first database set, where every database in the first database set is a standby database whose data is synchronized with the primary database. The response unit 450 may be configured to send a response to the first transaction request to the initiator of the first transaction.
Optionally, the synchronization unit 430 may include a generating unit and a second sending unit. The generating unit may be configured to generate a redo log according to the modification operation. The second sending unit may be configured to send the redo log to the first standby database.
FIG. 5 is a schematic structural diagram of an apparatus 500 for synchronizing primary and standby databases according to an embodiment of the present disclosure. The apparatus 500 may be a computing device with computing capability, such as a server. The arbitrator is deployed on the apparatus 500. The apparatus 500 may include a second receiving unit 510 and a deleting unit 520.
The second receiving unit 510 may be configured to receive a notification message sent by the primary database, the notification message being used to notify the arbitrator to delete the first standby database from a first database set, where every database in the first database set is a standby database whose data is synchronized with the primary database.
The deleting unit 520 may be configured to delete the first standby database from the first database set according to the notification message.
Optionally, the apparatus 500 may further include a selection unit. The selection unit may be configured to select, from the first database set, a database to be switched to primary database.
Optionally, the first database set is recorded as a list.
Optionally, the arbitrator is implemented based on an election protocol.
FIG. 6 is a schematic structural diagram of an apparatus for synchronizing primary and standby databases according to yet another embodiment of the present disclosure. The apparatus 600 may be, for example, a computing device with computing capability, such as a mobile terminal or a server. The apparatus 600 may include a memory 610 and a processor 620. The memory 610 may be used to store executable code. The processor 620 may be used to execute the executable code stored in the memory 610 to implement the steps of the methods described above. In some embodiments, the apparatus 600 may further include a network interface 630 through which the processor 620 exchanges data with external devices.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part as a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present disclosure are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)).
本领域普通技术人员可以意识到,结合本公开实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本公开的范围。Those skilled in the art can appreciate that the units and algorithm steps of the examples described in conjunction with the embodiments of the present disclosure can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each particular application, but such implementation should not be considered beyond the scope of the present disclosure.
In the several embodiments provided in the present disclosure, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function; in an actual implementation there may be other divisions. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
The above is only a specific implementation of the present disclosure, but the protection scope of the present disclosure is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed herein shall fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be determined by the protection scope of the claims.
The above descriptions are only preferred embodiments of the present disclosure and are not intended to limit it. Any modification, equivalent replacement, or the like made within the spirit and principles of the present disclosure shall be included within its protection scope.

Claims (13)

  1. A method for synchronizing primary and standby databases, the method being applied to a primary database, the method comprising:
    receiving a first transaction request, the first transaction request being used to request modification of data in the primary database;
    in response to the first transaction request, performing a modification operation on the data in the primary database;
    performing data synchronization with a first standby database according to the modification operation;
    if the data synchronization fails, sending a notification message to an arbitrator, the notification message being used to notify the arbitrator to delete the first standby database from a first database set, wherein every database in the first database set is a standby database whose data is synchronized with the primary database;
    sending, to the initiator of the first transaction, a reply to the first transaction request.
  2. The method according to claim 1, wherein performing data synchronization with the first standby database according to the modification operation comprises:
    generating a redo log according to the modification operation;
    sending the redo log to the first standby database.
  3. A method for synchronizing primary and standby databases, the method being applied to an arbitrator, the method comprising:
    receiving a notification message sent by a primary database, the notification message being used to notify the arbitrator to delete a first standby database from a first database set, wherein every database in the first database set is a standby database whose data is synchronized with the primary database;
    deleting the first standby database from the first database set according to the notification message.
  4. The method according to claim 3, further comprising:
    selecting, from the first database set, a database to be switched to the primary database.
  5. The method according to claim 3, wherein the first data set is recorded in a list.
  6. The method according to claim 3, wherein the arbitrator is implemented based on an election protocol.
  7. An apparatus for synchronizing primary and standby databases, the apparatus being deployed with a primary database, the apparatus comprising:
    a first receiving unit, configured to receive a first transaction request, the first transaction request being used to request modification of data in the primary database;
    an execution unit, configured to perform, in response to the first transaction request, a modification operation on the data in the primary database;
    a synchronization unit, configured to perform data synchronization with a first standby database according to the modification operation;
    a first sending unit, configured to send a notification message to an arbitrator if the data synchronization fails, the notification message being used to notify the arbitrator to delete the first standby database from a first database set, wherein every database in the first database set is a standby database whose data is synchronized with the primary database;
    a reply unit, configured to send, to the initiator of the first transaction, a reply to the first transaction request.
  8. The apparatus according to claim 7, wherein the synchronization unit comprises:
    a generating unit, configured to generate a redo log according to the modification operation; and
    a second sending unit, configured to send the redo log to the first standby database.
  9. An apparatus for synchronizing primary and standby databases, the apparatus being deployed with an arbitrator, the apparatus comprising:
    a second receiving unit, configured to receive a notification message sent by a primary database, the notification message being used to notify the arbitrator to delete a first standby database from a first database set, wherein every database in the first database set is a standby database whose data is synchronized with the primary database;
    a deleting unit, configured to delete the first standby database from the first database set according to the notification message.
  10. The apparatus according to claim 9, further comprising:
    a selecting unit, configured to select, from the first database set, a database to be switched to the primary database.
  11. The apparatus according to claim 9, wherein the first data set is recorded in a list.
  12. The apparatus according to claim 9, wherein the arbitrator is implemented based on an election protocol.
  13. An apparatus for synchronizing primary and standby databases, comprising a memory and a processor, wherein the memory stores executable code, and the processor is configured to execute the executable code to implement the method according to any one of claims 1-6.
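
As an informal illustration only (not part of the claims), the message flow recited in claims 1 to 4 can be sketched in Python as below. Every name here (`Standby`, `Arbitrator`, `Primary`, `SyncError`, `handle_transaction`, and so on) is hypothetical, the redo log is reduced to a single tuple, and network transport and the election protocol of claim 6 are not modeled. It is a minimal sketch under those assumptions, not the applicant's implementation.

```python
from dataclasses import dataclass, field


class SyncError(Exception):
    """Hypothetical error raised when a standby cannot be synchronized."""


@dataclass
class Standby:
    name: str
    reachable: bool = True
    log: list = field(default_factory=list)

    def apply_redo(self, record):
        # An unreachable standby stands in for any synchronization failure.
        if not self.reachable:
            raise SyncError(self.name)
        self.log.append(record)


@dataclass
class Arbitrator:
    # The "first database set": standbys believed to be in sync with the
    # primary, recorded here as a list of names.
    in_sync: list = field(default_factory=list)

    def notify_sync_failure(self, standby_name):
        # Claim 3: delete the reported standby from the set.
        if standby_name in self.in_sync:
            self.in_sync.remove(standby_name)

    def choose_new_primary(self):
        # Claim 4: pick a failover candidate from the set only.
        # Taking the first entry is an arbitrary policy for this sketch.
        return self.in_sync[0] if self.in_sync else None


@dataclass
class Primary:
    arbitrator: Arbitrator
    standbys: list
    data: dict = field(default_factory=dict)

    def handle_transaction(self, key, value):
        self.data[key] = value          # perform the modification
        redo = ("set", key, value)      # claim 2: generate a redo record
        for standby in self.standbys:
            try:
                standby.apply_redo(redo)  # claim 2: send the redo record
            except SyncError:
                # Claim 1: report the failed standby before replying.
                self.arbitrator.notify_sync_failure(standby.name)
        return "ok"                     # reply to the transaction initiator


s1, s2 = Standby("s1"), Standby("s2", reachable=False)
arbitrator = Arbitrator(in_sync=["s1", "s2"])
primary = Primary(arbitrator, [s1, s2])
reply = primary.handle_transaction("k", 1)
```

In this sketch the reply is returned only after the arbitrator has been told about the failed standby, so a later failover drawn from `in_sync` would not select a standby that missed the acknowledged change.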
PCT/CN2023/071515 2022-02-09 2023-01-10 Synchronizing main database and standby database WO2023151443A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210121230.8A CN114490188A (en) 2022-02-09 2022-02-09 Method and device for synchronizing main database and standby database
CN202210121230.8 2022-02-09

Publications (1)

Publication Number Publication Date
WO2023151443A1 true WO2023151443A1 (en) 2023-08-17

Family

ID=81479594

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/071515 WO2023151443A1 (en) 2022-02-09 2023-01-10 Synchronizing main database and standby database

Country Status (2)

Country Link
CN (1) CN114490188A (en)
WO (1) WO2023151443A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114490188A (en) * 2022-02-09 2022-05-13 北京奥星贝斯科技有限公司 Method and device for synchronizing main database and standby database

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7003694B1 (en) * 2002-05-22 2006-02-21 Oracle International Corporation Reliable standby database failover
CN107038192A (en) * 2016-11-17 2017-08-11 阿里巴巴集团控股有限公司 database disaster recovery method and device
CN108932338A (en) * 2018-07-11 2018-12-04 北京百度网讯科技有限公司 Data-updating method, device, equipment and medium
CN113535665A (en) * 2021-07-16 2021-10-22 北京元年科技股份有限公司 Method and device for synchronizing log files between main database and standby database
CN114490188A (en) * 2022-02-09 2022-05-13 北京奥星贝斯科技有限公司 Method and device for synchronizing main database and standby database

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019502B (en) * 2017-08-29 2023-03-21 阿里巴巴集团控股有限公司 Synchronization method between primary database and backup database, database system and device
KR102170531B1 (en) * 2018-11-13 2020-10-28 한국기업데이터 주식회사 system for greeting and announcing alerts using corporate news search

Also Published As

Publication number Publication date
CN114490188A (en) 2022-05-13

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23752223

Country of ref document: EP

Kind code of ref document: A1