WO2018014650A1 - Distributed database data synchronisation method, related apparatus and system - Google Patents

Info

Publication number
WO2018014650A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
data node
node
version
log
Prior art date
Application number
PCT/CN2017/085486
Other languages
French (fr)
Chinese (zh)
Inventor
陶维忠
吴刚
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Publication of WO2018014650A1 publication Critical patent/WO2018014650A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor

Definitions

  • the present invention relates to the field of database technologies, and in particular, to a distributed database data synchronization method, related device and system.
  • Distributed database specifically refers to the use of high-speed computer networks to connect physically dispersed multiple data storage nodes to form a logically unified database.
  • the basic idea of a distributed database is to distribute the data in the original centralized database to multiple data storage nodes connected through the network to obtain larger storage capacity and higher concurrent access.
  • the prior art provides a distributed database. As shown in FIG. 1, the same data is stored on the data node 1 and the data node 2, and both the data node 1 and the data node 2 are in an active state.
  • A data operation instruction may be sent to data node 1 or data node 2 for execution. After a data node executes the data operation instruction, the execution result is synchronized to the other data node in the form of a replication log, and the other data node performs data replication according to the replication log, thereby achieving data synchronization between data nodes 1 and 2.
  • the prior art provides a data node-based data synchronization method.
  • the method is described below with a data node 2 upgrade. The method mainly includes the following steps:
  • S1. Node 1 suspends processing of data operation commands sent by the application client, and transmission of data replication logs between node 1 and node 2 is suspended.
  • S2. Node 2 receives the upgrade instruction and performs a version upgrade operation; node 2 then receives and executes the data operation commands sent by the application client, and records a data replication log.
  • S3. Node 2 sends the upgrade command and the data replication log to node 1, and node 1 receives the data replication log and the upgrade command sent by node 2.
  • S4. Node 1 first upgrades its version according to the upgrade instruction, and then performs data replication according to the replication log. After the replication is completed, node 1 resumes receiving the data operation instructions sent by the application client, and normal data synchronization between node 1 and node 2 is restored.
  • In this method, data node 1 cannot process the data operation commands sent by the application client during the upgrade, and only data node 2 can process them, which reduces the reliability of the distributed database.
  • Embodiments of the present invention provide a data synchronization method and apparatus for improving reliability of a distributed database.
  • An embodiment of the present invention provides a distributed database data synchronization method, which is applied to a destination data node in the distributed database, where the destination data node also executes data operation instructions sent by an application client.
  • the upgrade is performed according to the upgrade instruction, and after the upgrade is completed, data replication is performed according to the cached replication log.
  • The upgrade instruction carries an instruction to modify a field of the data model of the destination data node or an instruction to add a field to the data model.
  • the source data node can normally process the data operation instruction sent by the application client, and generate a data replication log to be sent to the destination data node.
  • While the destination data node is being upgraded, it can receive the data replication log sent by the source data node.
  • the data replication log is cached first, and after the version upgrade is completed, the data replication is performed according to the cached replication log, and the synchronization between the source data node and the destination data node is completed.
  • the upgrade method provided by the embodiment of the present invention enables the source data node and the destination data node to process data operation commands sent by the application client while upgrading, thereby improving the reliability of the distributed database.
  • the destination data node obtains the version of the source data node.
  • the method further includes:
  • the destination data node generates a data replication log when the data operation instruction is executed, where the data replication log carries data that needs to be synchronized to a data node corresponding to the destination data node;
  • the data replication log is sent to the source data node to enable data synchronization between the destination data node and the source data node.
  • When the destination data node determines that the version of the source data node is lower than its own version, this indicates that it can process the replication log sent by the source data node, and the destination data node performs data replication according to the replication log, thereby implementing data synchronization from a low version to a high version.
  • the sending, by the destination data node, the data replication log to the source data node includes:
  • the replication log is placed in a local transmit queue and sent to the source data node.
  • Passing the replication log through the local send queue in this way can improve the availability of the database system.
  • an embodiment of the present invention provides a destination data node of a distributed database, including:
  • a data operation instruction processing unit configured to execute a data operation instruction sent by the application client
  • a receiving unit configured to receive a data replication log sent by the source data node, where the replication log carries data that needs to be synchronized to the destination node;
  • a log cache unit configured to obtain the version of the source data node and, when the version of the source data node is determined to be higher than its own version, cache the replication log;
  • the receiving unit is further configured to receive an upgrade instruction sent by a metadata server, where the upgrade instruction is used to instruct the destination data node to perform a version upgrade;
  • An upgrade unit configured to perform an upgrade according to the upgrade instruction
  • the log synchronization unit is configured to perform data replication according to the cached replication log after the upgrade is completed.
  • the obtaining, by the log cache unit, the version of the source data node includes:
  • the log cache unit acquires the version of the source data node according to the data replication log, where the data replication log carries the version of the source data node; or
  • the log cache unit receives a notification message sent by the source data node, where the notification message carries the version of the source data node, and obtains the version of the source data node according to the notification message.
  • the destination data node further includes:
  • a log generating unit configured to: when the data operation instruction is executed, generate a data replication log, where the data replication log carries data that needs to be synchronized to a data node corresponding to the destination data node;
  • a sending unit configured to send the data replication log to the source data node.
  • the log synchronization unit of the destination data node is further configured to perform data replication according to the replication log when determining that the version of the source data node is lower than its own version.
  • the sending, by the sending unit of the destination data node, the data replication log to the source data node includes:
  • the sending unit places the replication log in a local send queue, from which it is sent to the source data node.
  • An embodiment of the present invention provides a distributed database data synchronization method, which is applied to a destination data node in the distributed database, where the destination data node also executes data operation instructions sent by the application client.
  • When data is replicated according to the replication log, the data corresponding to the entry to be deleted is filtered out, and the entries already existing in the destination data node are processed according to default values, to achieve data synchronization between the source data node and the destination data node.
  • the method further includes:
  • The destination data node can know in advance which entries need their corresponding data deleted, so that the data can be synchronized in advance when a replication log is received or when a replication log is sent to the source data node.
  • the method further includes:
  • When the data replication log is generated according to the execution result of the data operation instruction, the data corresponding to the entry to be deleted is filtered out, and the data replication log is sent to the source data node.
  • the embodiment of the present invention provides a distributed database destination data node, where the destination data node specifically includes:
  • a data operation instruction processing unit configured to execute a data operation instruction sent by the application client
  • a receiving unit configured to receive an upgrade pre-notification instruction sent by the metadata server, where the upgrade pre-notification instruction carries the entry that needs to be deleted;
  • the receiving unit is further configured to receive a data replication log sent by the source data node, where the data replication log carries data that needs to be synchronized to the destination node;
  • a log processing unit configured to, when the replication log includes data corresponding to the entry to be deleted, filter out that data when performing data replication according to the replication log, and process the entries already existing in the destination data node according to default values, thereby implementing data synchronization between the source data node and the destination data node.
  • the destination data node further includes:
  • an upgrade unit configured to receive an upgrade command sent by the metadata server, where the upgrade command is used to delete an entry of the data node, and delete an entry of the data node according to the upgrade instruction.
  • The destination data node can know in advance which entries need their corresponding data deleted, so that the data can be synchronized in advance when a replication log is received or when a replication log is sent to the source data node.
  • the destination data node further includes:
  • a log sending unit configured to: when the data replication log is generated according to the execution result of the data operation instruction, filter the data corresponding to the entry to be deleted, and send the data replication log to the source data node.
  • the sending, by the destination data node, the data replication log to the source data node includes:
  • the replication log is placed in the local send queue and sent to the source data node.
  • an embodiment of the present invention provides a distributed database, including the destination data node according to the second aspect or the fourth aspect, and a source data node corresponding to the destination data node.
  • The distributed database provided by the embodiment of the present invention can complete data synchronization between the source data node and the destination data node while the source data node and the destination data node are being upgraded, thereby further improving the reliability of the distributed database.
  • FIG. 1 is a schematic structural diagram of a distributed database provided by the prior art
  • FIG. 2 is a schematic structural diagram of a distributed database according to Embodiment 1 of the present invention.
  • FIG. 3 is a schematic diagram of data storage of a data node according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of a data synchronization method according to Embodiment 2 of the present invention.
  • FIG. 5 is a flowchart of a data synchronization method according to Embodiment 3 of the present invention.
  • FIG. 6 is a schematic diagram of data storage of a data node according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a destination data node according to Embodiment 5 of the present invention.
  • FIG. 9 is a schematic structural diagram of a destination data node according to Embodiment 6 of the present invention.
  • FIG. 10 is a schematic structural diagram of a destination data node in a distributed database according to Embodiment 7 of the present invention.
  • FIG. 2 is a schematic structural diagram of a database system according to Embodiment 1 of the present invention.
  • the data processing system of the present invention includes an application client, a metadata server, and a data node server from top to bottom.
  • the application client is an application that uses a database, such as a billing application.
  • the application client can access the data stored in the data node server.
  • the metadata server is responsible for the distributed management capabilities of the database system.
  • the metadata server can be deployed independently or in conjunction with the data node server.
  • The metadata server can be deployed on a minicomputer, an X86 computer, or a personal computer server (PC server).
  • the application client communicates with the metadata server and the data node server respectively through the IP network, wherein the communication interface between the client and the metadata server or the data node server may be a Transmission Control Protocol (TCP) interface or User Datagram Protocol (UDP) interface.
  • The data node server includes a plurality of data nodes (also referred to as physical nodes). The data nodes may be minicomputers, X86 computers, personal computer servers (PC servers), and so on, and the data in the data nodes may be stored in a storage medium located in the storage network.
  • The data nodes read from and write to the storage network via block I/O, that is, the storage medium is read and written in blocks.
  • The storage medium may be a hard disk drive (HDD), a solid-state drive (SSD), or the like.
  • The client also deploys a driver, which caches routing information. In this way, when the client sends a data operation instruction to a physical node, the client can complete the routing decision through the cached routing information and access the corresponding physical node.
  • Figure 3 shows a schematic diagram of data storage on a data node.
  • Data node 1 and data node 2 are backups of each other. Slice 1 located on data node 1 and slice 2 located on data node 2 are primary slices (a slice may also be referred to as a virtual node), while slice 2 located on data node 1 and slice 1 located on data node 2 are backup slices.
  • Data node 1 and data node 2 may be referred to as a source data node and a destination data node, respectively. The names of the source data node and the destination data node are relative.
  • the metadata server stores the mapping relationship between the virtual node and the physical node, and the definition of the primary fragment and the backup slice of the virtual node.
  • the metadata server is also used to control data synchronization between data nodes.
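  • The mapping kept by the metadata server and the routing decision made from cached routing information can be sketched as follows. This is only an illustrative sketch: the names `SlicePlacement`, `PLACEMENT`, `slice_for_key` and `route`, the node labels, and the modulo routing rule are assumptions made for the example and are not specified in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlicePlacement:
    primary_node: str  # physical node holding the primary slice
    backup_node: str   # physical node holding the backup slice


# Mirrors the layout described for FIG. 3: slice 1 is primary on data node 1
# and slice 2 is primary on data node 2; each node backs up the other.
PLACEMENT = {
    1: SlicePlacement(primary_node="data_node_1", backup_node="data_node_2"),
    2: SlicePlacement(primary_node="data_node_2", backup_node="data_node_1"),
}


def slice_for_key(key: int, slice_count: int = 2) -> int:
    """Hypothetical routing rule: modulo on an integer key, slices numbered from 1."""
    return key % slice_count + 1


def route(key: int) -> str:
    """Return the physical node that should receive an operation on `key`."""
    return PLACEMENT[slice_for_key(key)].primary_node
```

  • A client driver holding a cached copy of `PLACEMENT` could call `route(key)` locally instead of asking the metadata server on every request.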
  • FIG. 4 is a flowchart of a data synchronization method according to Embodiment 2 of the present invention.
  • the data node 1 is a source data node
  • the data node 2 is a destination data node.
  • the data synchronization method of the database provided by this embodiment can be applied to the scenario of database upgrade, and the method mainly includes the following steps:
  • Step 301 The data node 1 receives an upgrade instruction sent by the metadata server, where the upgrade instruction is used to indicate that the source data node performs version upgrade.
  • Data nodes 1 and 2 receive, one after the other, the upgrade instruction sent by the metadata server, which notifies them to upgrade from version V1 to V2. The upgrade instruction may modify a field of the data model (for example, a table structure) or add a field to the data model. Because the two nodes receive the upgrade instruction at different times, the data models of data nodes 1 and 2 are temporarily inconsistent.
  • This embodiment takes the case in which data node 1 receives the upgrade instruction first as an example.
  • Step 302 The data node 1 performs an upgrade according to the upgrade instruction.
  • Step 303 After the upgrade succeeds, the data node 1 receives the data operation command sent by the application client, performs a data operation according to the data operation command, and generates a replication log.
  • the data node 1 can receive data operation instructions sent by the application client, such as querying, modifying, and the like of the data of the primary fragment 1 while performing the upgrade. After performing the data operation, the data node 1 generates a replication log, which carries data that needs to be synchronized to the data node corresponding to the backup slice 1 (ie, the data node 2).
  • Step 304 the data node 1 sends the generated replication log to the data node 2.
  • Data node 1 can also receive and process the data operation instructions sent by the client and generate a replication log while the versions of data node 1 and data node 2 are the same. In that case, data node 1 directly transmits the replication log to data node 2 to realize data synchronization between data node 1 and data node 2.
  • Data node 1 can also receive and process the data operation instructions sent by the client and generate a replication log when its version is higher than the version of data node 2, that is, when the data models of data node 1 and data node 2 are out of sync. In the prior art, data node 2 cannot process the replication log received from data node 1 in this case; how the embodiment of the present invention handles it is described in the following steps.
  • The replication log can be transmitted asynchronously between data node 1 and data node 2, that is, once data node 1 has put the replication log into the local send queue, local processing ends. Compared with the synchronous mode (in which processing ends only after the destination data node has successfully handled the replication log), passing the replication log asynchronously can improve the availability and partition tolerance of the database system.
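  • The asynchronous hand-off through a local send queue can be sketched as follows. This is a minimal sketch under stated assumptions: the class name `AsyncLogSender`, the `deliver` callback standing in for the network transfer to the peer node, and the dictionary-shaped log are all hypothetical and not taken from this document.

```python
import queue
import threading

_STOP = object()  # private sentinel used to stop the worker thread


class AsyncLogSender:
    """Puts replication logs on a local queue; a worker thread delivers them."""

    def __init__(self, deliver):
        self._queue = queue.Queue()
        self._deliver = deliver  # hypothetical callback: sends one log to the peer
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def send(self, log):
        """Enqueue the log and return at once: local processing ends here."""
        self._queue.put(log)

    def _drain(self):
        # Deliver logs in the order they were enqueued until the sentinel arrives.
        while True:
            log = self._queue.get()
            if log is _STOP:
                break
            self._deliver(log)

    def close(self):
        """Deliver everything already enqueued, then stop the worker."""
        self._queue.put(_STOP)
        self._worker.join()
```

  • The caller of `send` never waits for the peer to acknowledge the log, which is the difference from the synchronous mode described above.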
  • Step 305: Data node 2 receives the data replication log sent by data node 1, where the replication log carries data that needs to be synchronized to data node 2.
  • Step 306 The data node 2 acquires the version of the data node 1, and determines that the version of the data node 1 is higher than its own version, and caches the replication log.
  • the copy log may carry the version of the data node 1, and the data node 2 may obtain the version of the data node 1 from the copy log, and compare the version of the data node 1 with its own version.
  • the data node 2 can also receive the notification message sent by the data node 1, the notification message carrying the version of the source data node, and the data node 2 acquiring the version of the data node 1 according to the notification message.
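  • The two ways of obtaining the source version described above can be sketched as a single lookup. The function name `source_version` and the `"source_version"` key are assumptions made for illustration; the document does not define a message layout.

```python
from typing import Optional


def source_version(replication_log: Optional[dict] = None,
                   notification: Optional[dict] = None) -> Optional[int]:
    """Return the source node's version from the log if it carries one,
    otherwise from a notification message, otherwise None."""
    if replication_log is not None and "source_version" in replication_log:
        return replication_log["source_version"]
    if notification is not None and "source_version" in notification:
        return notification["source_version"]
    return None
```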
  • If the version of data node 1 is higher than its own version, this indicates that the replication log was generated according to the new data model; data node 2 temporarily does not process the replication log and caches it.
  • If the version of data node 1 is lower than its own version, data node 2 can process the replication log sent by data node 1, that is, synchronize data according to the replication log. If the version of data node 1 is the same as its own version, data node 2 can likewise process the replication log sent by data node 1, that is, synchronize data according to the replication log.
  • Step 307 The data node 2 receives an upgrade instruction sent by the metadata server, where the upgrade instruction is used to instruct the data node 2 to perform a version upgrade.
  • Data nodes 1 and 2 receive the upgrade instructions sent by the metadata server one after the other.
  • Step 308 the data node 2 is upgraded according to the upgrade instruction.
  • Step 309 After the upgrade is completed, the data node 2 performs data replication according to the cached replication log.
  • At this point, the version of data node 2 is the same as the version of data node 1, and data node 2 starts to process the replication log, that is, it performs data replication (also referred to as data redo) according to the replication log cached in step 306, to ensure data synchronization between data node 1 and data node 2.
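  • Steps 305 to 309 can be sketched from the destination node's point of view as follows. This is an illustrative sketch only: the class `DestinationNode`, its method names, the integer versions, and the log layout (`source_version`, `key`, `fields`) are assumptions and do not come from this document.

```python
class DestinationNode:
    """Caches replication logs from a newer source and replays them after upgrade."""

    def __init__(self, version: int):
        self.version = version
        self.data = {}         # key -> row, where a row is a dict of field -> value
        self.cached_logs = []  # logs from a higher-version source, held until upgrade

    def on_replication_log(self, log: dict) -> None:
        if log["source_version"] > self.version:
            # Generated under a newer data model: cache instead of processing.
            self.cached_logs.append(log)
        else:
            # Same or lower source version: process the log right away.
            self._apply(log)

    def on_upgrade(self, new_version: int) -> None:
        """Upgrade, then redo the cached logs in the order they arrived."""
        self.version = new_version
        pending, self.cached_logs = self.cached_logs, []
        for log in pending:
            # Re-check each log; anything still too new is cached again.
            self.on_replication_log(log)

    def _apply(self, log: dict) -> None:
        self.data.setdefault(log["key"], {}).update(log["fields"])
```

  • Replaying in arrival order keeps later writes from being overwritten by earlier ones, which is an assumption about ordering that the document does not state explicitly.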
  • the source data node (data node 1) can normally process the data operation instruction sent by the application client, and generate a data replication log to be sent to the destination data node (data node 2).
  • While the destination data node is being upgraded, it can receive the data replication log sent by the source data node.
  • the data replication log is cached first, and after the version upgrade is completed, the data replication is performed according to the cached replication log, and the synchronization between the source data node and the destination data node is completed.
  • the upgrade method provided by the embodiment of the present invention enables the source data node and the destination data node to process data operation commands sent by the application client while upgrading, thereby improving the reliability of the distributed database.
  • the data node 2 may perform the data operation instruction sent by the application client while performing the foregoing upgrade, and the data node 2 may further perform the following steps:
  • Step 310 When executing the data operation instruction, the data node 2 generates a data replication log, where the data replication log carries data that needs to be synchronized to the data node corresponding to the data node 2.
  • the data node corresponding to the data node 2 is the data node 1.
  • The data replication log carries the modified data, which needs to be synchronized to backup slice 2 on data node 1.
  • Step 311 The data node 2 sends the data replication log to the data node 1.
  • The replication log can be transmitted asynchronously between data node 2 and data node 1, that is, once data node 2 has put the replication log into the local send queue, local processing ends. Passing replication logs asynchronously can improve the availability and partition tolerance of the database system.
  • Steps 310-311 and the preceding steps 305-309 are independent of each other in timing, that is, there is no fixed order in which data node 1 and data node 2 receive data operation instructions.
  • FIG. 5 is a flowchart of a data synchronization method according to Embodiment 3 of the present invention.
  • the data node 1 is a source data node
  • the data node 2 is a destination data node.
  • the database data synchronization method provided in this embodiment is mainly described by taking the data node 2 as an example.
  • the method further includes the following steps:
  • Step 401 The data node 2 receives an upgrade pre-notification command sent by the metadata server, where the upgrade pre-notification command carries an entry that needs to be deleted.
  • the metadata server sends an upgrade pre-notification instruction to the data nodes 1 and 2 in advance before sending the upgrade instruction, where the upgrade pre-notification instruction is used to notify the data node that the data model needs to be modified in advance.
  • the notification instruction carries an entry (also referred to as a field) that needs to be deleted in the data model (for example, a table structure), indicating that the entry in the data model needs to be deleted.
  • the data node 2 (destination data node) is taken as an example for description.
  • Step 402 The data node 2 receives a data replication log sent by the data node 1, where the data replication log carries data that needs to be synchronized to the data node 2.
  • the data synchronization between the data node 1 and the data node 2 does not stop.
  • After receiving the data operation instruction sent by the application client, data node 1 performs the corresponding data operation, generates a data replication log, and synchronizes it to data node 2.
  • the data replication log carries data that needs to be synchronized to data node 2.
  • Step 403: When it is determined that the replication log includes data corresponding to the entry to be deleted, data node 2 filters out that data when performing data replication according to the replication log, and processes the entries already existing in the destination data node according to the default values.
  • The upgrade pre-notification instruction carries the entry that needs to be deleted.
  • Data node 2 may therefore determine whether the replication log includes data corresponding to the entry to be deleted. If so, when performing data replication according to the received replication log, data node 2 filters out the data corresponding to the entry to be deleted and processes the existing entries in data node 2 according to the default value; that is, the data corresponding to the entry to be deleted in data node 2 is not modified (which avoids modifying it first and then deleting it during the upgrade). This realizes data synchronization between data nodes 1 and 2 and improves the efficiency of data synchronization.
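  • The filtering in step 403 can be sketched as follows. This sketch reflects one reading of the text: values already stored for an entry scheduled for deletion are left untouched, and a row that does not yet have the entry receives the default value. The function name `apply_filtered`, the row layout, and the field `cust_name` used in the usage note are hypothetical.

```python
def apply_filtered(row: dict, log_fields: dict, to_delete: set, default=None) -> dict:
    """Return an updated copy of `row`, ignoring log data for entries in `to_delete`."""
    updated = dict(row)
    for name, value in log_fields.items():
        if name not in to_delete:
            updated[name] = value  # ordinary entry: replicate the new value
    for name in to_delete:
        # Entry scheduled for deletion: keep any stored value, else use the default.
        updated.setdefault(name, default)
    return updated
```

  • For example, `apply_filtered({"cust_id": 1, "cust_bank": "A"}, {"cust_bank": "B", "cust_name": "x"}, {"cust_bank"})` keeps `cust_bank` at `"A"` while still replicating `cust_name`.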
  • Step 404 The data node 2 receives the upgrade command sent by the metadata server, where the upgrade command is used to delete the entry of the data node, and delete the entry of the data node according to the upgrade instruction.
  • Data node 2 can execute the upgrade command sent by the metadata server while processing the replication log, that is, step 404 and steps 401-403 are independent of each other in timing, and step 404 can also be performed before step 401.
  • the metadata server sends an upgrade instruction for deleting the entry of the data node to the data nodes 1 and 2.
  • the data node 1, 2 deletes the entry in the data node 1, 2 according to the upgrade instruction.
  • Step 405 The data node 2, when generating the data replication log according to the execution result of the data operation instruction, filters the data corresponding to the entry to be deleted, and sends the data replication log to the source data node.
  • the data node 2 after executing the data operation instruction sent by the application client, the data node 2 itself generates a data replication log according to the execution result of the data operation instruction.
  • the data node 2 can filter the data corresponding to the entry to be deleted when the data replication log is generated, so that the data corresponding to the entry to be deleted is not carried in the replication log, and the transmission efficiency of the data replication log is improved.
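  • Filtering on the sending side (step 405) can be sketched as follows. The function name `build_replication_log` and the log layout are assumptions made for illustration only.

```python
def build_replication_log(source_version: int, key, changed_fields: dict,
                          to_delete: set) -> dict:
    """Build a replication log that omits data for entries scheduled for deletion."""
    return {
        "source_version": source_version,
        "key": key,
        # Dropping doomed entries here keeps them off the wire entirely.
        "fields": {name: value for name, value in changed_fields.items()
                   if name not in to_delete},
    }
```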
  • step 405 and steps 402-404 are time-independent, for example, step 405 can be performed before step 402.
  • In this embodiment, the replication log can be transmitted asynchronously between data node 1 and data node 2, that is, once data node 2 has put the replication log into the local send queue, local processing ends. Passing replication logs asynchronously can improve the availability and partition tolerance of the database system.
  • FIG. 6 is a schematic diagram of data storage of a data node according to an embodiment of the present invention.
  • The distributed database consists of four physical nodes 1, 2, 3 and 4. Each node has 3 primary shards (the parts filled with slashes in the figure) and 3 backup shards (the unfilled parts in the figure).
  • slice 3 is copied from node 1 to node 2
  • slice 6 is copied from node 2 to node 1.
  • Each shard in the distributed database consists of two table structures, Table_A and Table_B, which are currently in V1 version.
  • the table structure is as follows:
  • Table_A uses cust_id as the primary key.
  • Table_B uses product_id as the primary key.
  • FIG. 7 is a data synchronization method of a distributed database according to Embodiment 4 of the present invention.
  • the above synchronization methods mainly include:
  • Step 601 The metadata server receives a table structure upgrade instruction sent by the application client.
  • the table structure upgrade instruction includes a Table_A delete field and a Table_B add field.
  • Step 602 The metadata server sends an upgrade pre-notification command to all the data nodes, where the upgrade pre-notification command carries the entry to be deleted.
  • The upgrade pre-notification command informs the data nodes that they will be upgraded from version V1 to V2, and that the field to be deleted from Table_A (also referred to as a heterogeneous field) is cust_bank.
  • the data node 1 and the data node 2 are taken as an example for illustration.
  • Step 603: Data nodes 1 and 2 downgrade from synchronous replication to asynchronous replication mode.
  • The processing of the upgrade pre-notification command by data nodes 1 and 2 includes downgrading from synchronous replication to asynchronous replication mode.
  • Replication is downgraded so that changes to the log replication process during the upgrade do not affect or block online services, ensuring high availability of the system.
  • each data node also identifies the field Table_A.cust_bank that needs to be deleted during the upgrade.
  • Step 604 The data node 1, 2 receives the data operation instruction sent by the application client, performs a corresponding data operation, and generates a replication log.
  • Write operations performed by the online service on data nodes 1 and 2 generate replication logs.
  • the data node can directly filter out the field Table_A.cust_bank according to the identified field that needs to be deleted, and the manner of generating the replication log can be heterogeneous replication.
  • Step 605 The data node 1 asynchronously transmits the generated replication log to the data node 2.
  • the data of the primary fragment 3 of the data node 1 is modified, and the replication log generated by the data node 1 carries the data on the backup slice 3 that needs to be synchronized to the data node 2.
  • Step 606 The data node 2 performs data synchronization according to the replication log.
  • If the heterogeneous field Table_A.cust_bank is present in the replication log, data node 2 filters it out when performing data synchronization (also referred to as data redo processing), and handles the heterogeneous field Table_A.cust_bank according to the default value (null).
  • the data node 1 also receives the replication log sent by the data node 2, and performs data synchronization according to the replication log.
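The receiver-side redo processing of step 606 can be sketched as below, assuming the receiving node has not yet been upgraded, so cust_bank still exists in its local schema. The `redo` function, the dict-based log entry, and the columns `cust_id` and `cust_name` are hypothetical.

```python
def redo(store, schema, entry, pending_deletes):
    """Apply one replication log entry to the local store (data redo)."""
    table = entry["table"]
    row = {}
    for col in schema[table]:
        if (table, col) in pending_deletes:
            # Heterogeneous field: ignore whatever the log carries and
            # store the default value (null) instead.
            row[col] = None
        else:
            row[col] = entry["row"].get(col)
    store.setdefault(table, []).append(row)
    return row


schema_v1 = {"Table_A": ["cust_id", "cust_name", "cust_bank"]}
pending = {("Table_A", "cust_bank")}
store = {}

# A log entry whose sender did not filter the heterogeneous field.
incoming = {
    "version": 1,
    "table": "Table_A",
    "row": {"cust_id": 7, "cust_name": "Li", "cust_bank": "bank-x"},
}
applied = redo(store, schema_v1, incoming, pending)
```

Filtering on both the sending and receiving side means the two nodes converge on the same content for the doomed column regardless of which side applied the filter.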
  • Step 607 The metadata server sends an upgrade instruction to the data nodes 1 and 2 to instruct to perform a version upgrade.
  • The metadata server sends the upgrade instruction as a Data Definition Language (DDL) operation. The upgrade content includes deleting the field cust_bank from the Table_A table and adding the field Product_discount int to the Table_B table.
  • In this embodiment, data node 1 is taken as the node that receives the upgrade instruction first.
  • Step 608 the data node 1 first receives and processes the DDL upgrade instruction.
  • Step 609 the data node 1 sends the generated replication log to the data node 2.
  • The replication log generated when the online service writes to primary fragment 3 of data node 1 is synchronized to data node 2. This replication log matches the new version V2: it contains the added field Table_B.Product_discount and no longer contains the deleted field Table_A.cust_bank.
  • Step 610 The data node 2 performs data synchronization according to the replication log.
  • Data node 2 obtains the version V2 of data node 1 from the replication log. Because data node 2 is still at the lower version V1, which has no field Table_B.Product_discount, it cannot process a V2 replication log (it cannot recognize the log information for the added field Table_B.Product_discount). Data node 2 therefore caches this part of the replication log first.
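The caching decision of step 610 can be illustrated with a short sketch. The `DestinationNode` class, its method name, and the assumption that each log entry carries an integer `version` are hypothetical; `product_id` is an invented column.

```python
from collections import deque


class DestinationNode:
    """Hypothetical receiver that caches logs it is too old to understand."""

    def __init__(self, version):
        self.version = version
        self.cached_logs = deque()  # preserves arrival order for later replay
        self.applied = []

    def on_replication_log(self, entry):
        if entry["version"] > self.version:
            # The log was produced by a newer schema: hold it until upgraded.
            self.cached_logs.append(entry)
            return "cached"
        self.applied.append(entry)
        return "applied"


node2 = DestinationNode(version=1)
v2_entry = {
    "version": 2,
    "table": "Table_B",
    "row": {"product_id": 1, "Product_discount": 10},
}
v1_entry = {"version": 1, "table": "Table_B", "row": {"product_id": 2}}

first = node2.on_replication_log(v2_entry)
second = node2.on_replication_log(v1_entry)
```

A first-in, first-out cache is used here on the assumption that cached logs should later be replayed in the order they arrived.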
  • Step 611 the data node 2 sends the generated replication log to the data node 1.
  • Because the online service wrote to primary fragment 6 of data node 2 before data node 2 received the DDL upgrade instruction, a replication log was generated and is synchronized to data node 1 asynchronously. This replication log matches the old version V1.
  • Step 612 The data node 1 performs data synchronization according to the replication log.
  • Data node 1 can process the V1 replication log for fragment 6.
  • Data node 1 processes the fragment 6 replication log sent by data node 2 normally. The new field Table_B.Product_discount, which does not exist in version V1 of data node 2, is set to the default value null during replication processing.
  • processing is performed according to the processing logic of step 606.
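How an upgraded node might apply an older-version log in step 612 can be sketched as a projection of the log row onto the local column list. The function name and the columns `product_id` and `cust_id` are assumptions for illustration only.

```python
def project_onto_local_schema(local_columns, log_row):
    """Keep only locally known columns; missing ones get the default null."""
    return {col: log_row.get(col) for col in local_columns}


# Data node 1 is already at V2: Table_B has the new field,
# and Table_A no longer has cust_bank.
table_b_v2 = ["product_id", "Product_discount"]
table_a_v2 = ["cust_id"]

# Rows taken from V1 replication logs sent by data node 2.
row_b = project_onto_local_schema(table_b_v2, {"product_id": 3})
row_a = project_onto_local_schema(
    table_a_v2, {"cust_id": 7, "cust_bank": "bank-x"}
)
```

The same projection covers both directions of the schema change: an added field is filled with null, and a deleted field present in the log is simply dropped.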
  • Step 613 the data node 2 processes the DDL upgrade instruction.
  • Data node 2 receives the DDL upgrade instruction sent by the metadata server later than data node 1 does.
  • the data node 2 executes the DDL upgrade instruction to complete the version upgrade.
  • the Table_A table deletes the field cust_bank, and the Table_B table adds the field Product_discount int.
  • Step 614 The data node 2 performs data synchronization according to the cached replication log.
  • the upgraded data node has the version V2, and can process the previously cached replication logs to complete data synchronization between the data nodes 1 and 2.
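Steps 613 and 614 can be sketched together: the node first applies the schema change and then drains its cache. The dict-based schema, the function names, and the columns `cust_id` and `product_id` are hypothetical; the real upgrade is a DDL operation on the database, which this sketch only imitates.

```python
from collections import deque


def apply_ddl_upgrade(schema):
    """Return a V2 copy of the schema: drop cust_bank, add Product_discount."""
    upgraded = {table: list(cols) for table, cols in schema.items()}
    upgraded["Table_A"].remove("cust_bank")
    upgraded["Table_B"].append("Product_discount")
    return upgraded


def replay_cached_logs(cached_logs, schema, store):
    """Apply cached entries in arrival order using the upgraded schema."""
    while cached_logs:
        entry = cached_logs.popleft()
        columns = schema[entry["table"]]
        row = {col: entry["row"].get(col) for col in columns}
        store.append((entry["table"], row))
    return store


schema_v1 = {"Table_A": ["cust_id", "cust_bank"], "Table_B": ["product_id"]}
cached = deque([
    {"version": 2, "table": "Table_B",
     "row": {"product_id": 1, "Product_discount": 10}},
    {"version": 2, "table": "Table_A", "row": {"cust_id": 7}},
])

schema_v2 = apply_ddl_upgrade(schema_v1)
store = replay_cached_logs(cached, schema_v2, [])
```

Once the cache is empty, newly arriving V2 logs can be applied directly, because the node's version now matches the sender's.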
  • Step 615 After the upgrade is completed, the data nodes 1, 2 send a notification message of successful upgrade to the metadata server.
  • After determining that all nodes have completed the upgrade, the metadata server enters post-upgrade processing, for example, notifying each data node to restore the replication level to synchronous replication. Heterogeneous replication is also cancelled at this point, and replication logs are generated normally again.
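The completion tracking in step 615 could be sketched as follows. The `MetadataServer` class, its attribute names, and the two broadcast strings are hypothetical stand-ins for whatever messages a real metadata server would send.

```python
class MetadataServer:
    """Hypothetical tracker for upgrade-success notifications."""

    def __init__(self, node_names):
        self.waiting_for = set(node_names)
        self.broadcasts = []

    def on_upgrade_success(self, node_name):
        """Record one notification; return True when post-upgrade work starts."""
        if not self.waiting_for:
            return False  # post-upgrade processing has already been triggered
        self.waiting_for.discard(node_name)
        if self.waiting_for:
            return False  # at least one node is still upgrading
        # All nodes are upgraded: undo the temporary upgrade-time settings.
        self.broadcasts.append("restore_synchronous_replication")
        self.broadcasts.append("cancel_heterogeneous_replication")
        return True


server = MetadataServer(["data_node_1", "data_node_2"])
first = server.on_upgrade_success("data_node_1")
second = server.on_upgrade_success("data_node_2")
repeat = server.on_upgrade_success("data_node_2")
```

Waiting for every node before restoring synchronous replication matters because a synchronous sender would otherwise block on a peer that is still caching logs.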
  • FIG. 8 is a schematic structural diagram of a destination data node according to Embodiment 5 of the present invention.
  • the destination data node may be the data node 2 shown in Figure 3-5.
  • the destination data node employs general purpose computer hardware including a processor 101, a memory 102, a bus 103, an input device 104, an output device 105, and a network interface 106.
  • memory 102 can include computer storage media in the form of volatile and/or nonvolatile memory, such as read only memory and/or random access memory.
  • The memory 102 can store an operating system, application programs, other program modules, executable code, and program data.
  • The input device 104 can be used to input commands and information to the destination data node. Examples include a keyboard, a pointing device (such as a mouse, trackball, or touchpad), a microphone, a joystick, a game pad, a satellite dish, a scanner, or similar equipment. These input devices can be connected to the processor 101 via the bus 103.
  • the output device 105 can be used for the destination data node to output information.
  • The output device 105 can also be another peripheral output device, such as a speaker and/or a printing device, which can likewise be connected to the processor 101 via the bus 103.
  • The destination data node can be connected to a network through the network interface 106, for example to a local area network (LAN).
  • computer-executed instructions stored in a destination data node may be stored in a remote storage device, and are not limited to being stored locally.
  • The destination data node may perform the method steps on the destination data node side in the second, third, and fourth embodiments above, for example, steps 305-311, 401-405, 603, 606, 610, and the like.
  • For details, refer to the second, third, and fourth embodiments; they are not described again here.
  • FIG. 9 is a schematic structural diagram of a destination data node according to Embodiment 6 of the present invention.
  • the destination data node provided by the embodiment of the present invention includes:
  • the data operation instruction processing unit 710 is configured to execute a data operation instruction sent by the application client;
  • the receiving unit 720 is configured to receive a data replication log sent by the source data node, where the replication log carries data that needs to be synchronized to the destination node.
  • the log buffering unit 730 is configured to obtain the version of the source data node and, when determining that the version of the source data node is higher than its own version, cache the replication log.
  • the receiving unit 720 is further configured to receive an upgrade instruction sent by the metadata server, where the upgrade instruction is used to indicate that the destination data node performs a version upgrade.
  • the upgrading unit 740 is configured to perform an upgrade according to the upgrade instruction
  • the log synchronization unit 750 is configured to perform data replication according to the cached replication log after the upgrade is completed.
  • The destination data node provided by this embodiment of the present invention may be used in method Embodiments 2 and 4 above: the data operation instruction processing unit 710, the receiving unit 720, the log buffering unit 730, the upgrading unit 740, and the log synchronization unit 750 cooperate to complete the method steps on the data node side in Embodiments 2 and 4.
  • the destination data node provided by this embodiment has the same beneficial effects as the foregoing method embodiment when performing data synchronization.
  • The log buffering unit 730 in the destination data node obtains the version of the source data node in either of the following ways:
  • the log buffering unit 730 obtains the version of the source data node from the data replication log, where the data replication log carries the version of the source data node; or
  • the log buffering unit 730 receives a notification message sent by the source data node, where the notification message carries the version of the source data node, and obtains the version of the source data node from the notification message.
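The two alternatives for learning the source version can be sketched as one lookup. The function name, the dict-shaped log and notification, and the `version` key are assumptions; the source states only that the version is carried in one message or the other.

```python
def source_version(replication_log=None, notification=None):
    """Return the source node's version from whichever message carries it."""
    if replication_log is not None and "version" in replication_log:
        return replication_log["version"]
    if notification is not None and "version" in notification:
        return notification["version"]
    raise ValueError("source data node version is unknown")


from_log = source_version(replication_log={"version": 2, "row": {}})
from_notice = source_version(notification={"version": 2})
```

Raising an error when neither message carries a version is a design choice of this sketch; the source does not say how that case is handled.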
  • the destination data node described in FIG. 5 further includes:
  • the log generating unit 760 is configured to generate a data replication log, where the data replication log carries data that needs to be synchronized to a data node corresponding to the destination data node, when the data operation instruction is executed;
  • the sending unit 770 is configured to send the data replication log to the source data node. The sending unit 770 can send the replication log to the source data node by placing it in a local send queue.
  • the specific process of the log generation unit 760 generating the replication log and the sending unit 770 sending the replication log to the source data node may refer to the description of steps 310-311 in the foregoing method embodiment.
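Sending through a local send queue can be sketched with a queue and a background sender. The names `send_queue`, `sender_loop`, and `write`, and the list standing in for the peer node, are hypothetical; a real node would transmit over the network.

```python
import queue
import threading

send_queue = queue.Queue()
received_by_peer = []  # stand-in for the source data node


def sender_loop():
    """Drain the local send queue in the background."""
    while True:
        entry = send_queue.get()
        if entry is None:  # sentinel: stop the sender
            send_queue.task_done()
            break
        received_by_peer.append(entry)  # stand-in for a network send
        send_queue.task_done()


def write(entry):
    """Handle a client write without waiting for the peer to process the log."""
    send_queue.put(entry)
    return "committed"


worker = threading.Thread(target=sender_loop, daemon=True)
worker.start()

statuses = [write({"seq": i}) for i in range(3)]

send_queue.put(None)
send_queue.join()  # wait here only so the example can inspect the result
worker.join()
```

The write path returns as soon as the entry is queued, which is the property that keeps online services from blocking while the peer is being upgraded.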
  • the log synchronization unit 750 of the destination data node is further configured to perform data replication according to the replication log when determining that the version of the source data node is lower than its own version.
  • FIG. 10 is a schematic structural diagram of a destination data node in a distributed database according to Embodiment 7 of the present invention.
  • the destination data node specifically includes:
  • the data operation instruction processing unit 810 is configured to execute a data operation instruction sent by the application client;
  • the receiving unit 820 is configured to receive an upgrade pre-notification command sent by the metadata server, where the upgrade pre-notification command carries an entry that needs to be deleted;
  • the receiving unit 820 is further configured to receive a data replication log sent by the source data node, where the data replication log carries data that needs to be synchronized to the destination node;
  • the log processing unit 830 is configured to: when the data corresponding to the entry to be deleted is included in the replication log, filter the data corresponding to the entry to be deleted when performing data replication according to the replication log, The table items already existing in the destination data node are processed according to default values, thereby implementing data synchronization between the source data node and the destination data node.
  • The destination data node provided by this embodiment of the present invention may be used in method Embodiments 3 and 4 above: the data operation instruction processing unit 810, the receiving unit 820, and the log processing unit 830 cooperate to complete the method steps on the data node side in Embodiments 3 and 4. Compared with the destination data node in the prior art, the destination data node provided by this embodiment has the same beneficial effects as the foregoing method embodiments when performing data synchronization.
  • the destination data node provided by the embodiment of the present invention further includes:
  • the upgrading unit 840 is configured to receive an upgrade instruction sent by the metadata server, where the upgrade instruction is used to delete an entry of the data node, and to delete the entry of the data node according to the upgrade instruction.
  • Because the upgrade pre-notification instruction carrying the entry to be deleted has already been received, the destination data node knows in advance which entries' data needs to be deleted. It can therefore handle that data ahead of the upgrade when it receives a replication log or sends a replication log to the source data node, which achieves data synchronization during the upgrade.
  • the log sending unit 850 is configured to filter data corresponding to the entry to be deleted when the data replication log is generated according to the execution result of the data operation instruction, and send the data replication log to the source data node.
  • The log sending unit 850 can send the replication log to the source data node by placing it in a local send queue.
  • the destination data node is presented in the form of a functional unit.
  • A "unit" herein may refer to an application-specific integrated circuit (ASIC), a circuit, a processor and memory that execute one or more software or firmware programs, an integrated logic circuit, and/or another device that can provide the functions described above.
  • the destination data node can also take the form shown in FIG.
  • the functions implemented by the data operation instruction unit 710, the reception unit 720, the log buffer unit 730, the upgrade unit 740, and the log synchronization unit 750 can all be implemented by the processor 101 and the memory 102 in FIG.
  • For example, the data operation instruction processing unit 710 executing the data operation instruction sent by the application client may be implemented by the processor 101 executing code stored in the memory 102.
  • The processor for implementing the above data node of the present invention may be a central processing unit (CPU), a general purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure.
  • the processor may also be a combination of computing functions, for example, including one or more microprocessor combinations, a combination of a DSP and a microprocessor, and the like.
  • aspects of the present invention, or possible implementations of various aspects may be embodied as a system, method, or computer program product.
  • aspects of the invention, or possible implementations of various aspects may be in the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, etc.), or a combination of software and hardware aspects, They are collectively referred to herein as "circuits," “modules,” or “systems.”
  • aspects of the invention, or possible implementations of various aspects may take the form of a computer program product, which is a computer readable program code stored in a computer readable medium.

Abstract

In the data synchronisation method provided by the present invention, a source data node processes normally the data operating commands sent by an application client and generates a data replication log to send to a target data node. When updating, the target data node receives the data replication log sent by the source data node. When the version of the source data node is higher than the version of the target data node, the target data node first caches the replication log, and after implementing version update of the target data node version, implements data replication on the basis of the cached replication log, thus implementing synchronisation between the source data node and the target data node. Compared to the prior art, the upgrading method provided in the embodiments of the present invention enables a source data node and a target data node to process data operating commands sent by an application client whilst upgrading, improving the reliability of the distributed database.

Description

Distributed database data synchronization method, related apparatus, and system
This application claims priority to Chinese Patent Application No. 201610578730.9, filed with the Chinese Patent Office on July 20, 2016 and entitled "Distributed database data synchronization method, related apparatus, and system", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of database technologies, and in particular, to a distributed database data synchronization method, a related apparatus, and a system.
Background
In recent years, with the rapid growth of data volume, distributed database technology has developed quickly, and traditional relational databases have begun to evolve from centralized to distributed architectures. A distributed database uses a high-speed computer network to connect multiple physically dispersed data storage nodes into one logically unified database. The basic idea is to spread the data of the original centralized database across multiple data storage nodes connected through the network, in order to obtain larger storage capacity and support more concurrent access.
The prior art provides a distributed database. As shown in FIG. 1, data node 1 and data node 2 store the same data, and both are in the active state. A data operation instruction may be sent to either data node 1 or data node 2 for execution. After one data node executes the data operation instruction, it synchronizes the execution result to the other data node in the form of a replication log, and the other data node performs data replication according to the replication log, thereby achieving data synchronization between data nodes 1 and 2.
However, during data synchronization in a distributed database, there are scenarios in which the database needs to be upgraded, for example, adding fields to or deleting fields from a data model in the database. The prior art provides a data synchronization method based on data nodes. The method is described below using an upgrade of data node 2 as an example, and mainly includes the following steps:
S1. Node 1 suspends processing of data operation commands sent by the application client, and transmission of data replication logs between node 1 and node 2 is suspended.
S2. Node 2 receives the upgrade instruction and performs the version upgrade. Node 2 then receives and executes data operation commands sent by the application client and records data replication logs.
S3. After the upgrade is complete, node 2 sends the upgrade instruction and the data replication logs to node 1, and node 1 receives the data replication logs and the upgrade instruction sent by node 2.
S4. Node 1 first upgrades its version according to the upgrade instruction and then performs data replication according to the replication logs. After replication is complete, node 1 resumes receiving data operation instructions sent by the application client, and normal data synchronization between node 1 and node 2 is restored.
With the prior-art data synchronization method, while the database is being upgraded, data node 1 cannot process data operation commands sent by the application client; only data node 2 can. This reduces the reliability of the distributed database.
Summary of the Invention
Embodiments of the present invention provide a data synchronization method and apparatus that improve the reliability of a distributed database.
In a first aspect, an embodiment of the present invention provides a distributed database data synchronization method applied to a destination data node in the distributed database. While executing data operation instructions sent by an application client, the destination data node further performs the following:
receiving a data replication log sent by a source data node, where the replication log carries data that needs to be synchronized to the destination node;
obtaining the version of the source data node, and caching the replication log when determining that the version of the source data node is higher than its own version;
receiving an upgrade instruction sent by a metadata server, where the upgrade instruction is used to instruct the destination data node to perform a version upgrade;
upgrading according to the upgrade instruction and, after the upgrade is complete, performing data replication according to the cached replication log. The upgrade instruction carries an instruction to modify a field of the data model of the destination data node or to add a field to the data model.
In this embodiment, the source data node can process data operation instructions sent by the application client normally and generate data replication logs to send to the destination data node. While being upgraded, the destination data node can still receive data replication logs sent by the source data node. When the version of the source data node is higher than its own version, the destination data node first caches the data replication log; after completing its own version upgrade, it performs data replication according to the cached replication log, which completes synchronization between the source data node and the destination data node. Compared with the prior art, the upgrade method provided by this embodiment of the present invention allows both the source data node and the destination data node to process data operation commands sent by the application client during the upgrade, which improves the reliability of the distributed database.
With reference to the first aspect, in a possible implementation, the destination data node obtaining the version of the source data node specifically includes:
obtaining the version of the source data node from the data replication log, where the data replication log carries the version of the source data node; or
receiving a notification message sent by the source data node, where the notification message carries the version of the source data node, and obtaining the version of the source data node from the notification message.
With reference to the first aspect, in a possible implementation, the method further includes:
generating, by the destination data node when executing the data operation instruction, a data replication log, where the data replication log carries data that needs to be synchronized to the data node corresponding to the destination data node;
sending the data replication log to the source data node, thereby achieving data synchronization from the destination data node to the source data node.
With reference to the first aspect, in a possible implementation, when the destination data node determines that the version of the source data node is lower than its own version, the destination data node is able to process the replication log sent by the source data node, and therefore performs data replication according to the replication log, which achieves data synchronization from a lower version to a higher version.
With reference to the first aspect, in a possible implementation, the destination data node sending the data replication log to the source data node specifically includes:
placing the replication log in a local send queue for transmission to the source data node. Compared with delivering the replication log synchronously (where synchronization completes only after the peer has finished processing the replication log), this can improve the availability of the database system.
In a second aspect, an embodiment of the present invention provides a destination data node of a distributed database, including:
a data operation instruction processing unit, configured to execute a data operation instruction sent by an application client;
a receiving unit, configured to receive a data replication log sent by a source data node, where the replication log carries data that needs to be synchronized to the destination node;
a log buffering unit, configured to obtain the version of the source data node and, when determining that the version of the source data node is higher than its own version, cache the replication log;
where the receiving unit is further configured to receive an upgrade instruction sent by a metadata server, and the upgrade instruction is used to instruct the destination data node to perform a version upgrade;
an upgrading unit, configured to perform an upgrade according to the upgrade instruction;
a log synchronization unit, configured to perform data replication according to the cached replication log after the upgrade is complete.
With reference to the second aspect, in a possible implementation, the log buffering unit obtaining the version of the source data node includes:
the log buffering unit obtains the version of the source data node from the data replication log, where the data replication log carries the version of the source data node; or
the log buffering unit receives a notification message sent by the source data node, where the notification message carries the version of the source data node, and obtains the version of the source data node from the notification message.
With reference to the second aspect, in a possible implementation, the destination data node further includes:
a log generating unit, configured to generate a data replication log when the data operation instruction is executed, where the data replication log carries data that needs to be synchronized to the data node corresponding to the destination data node;
a sending unit, configured to send the data replication log to the source data node.
With reference to the second aspect, in a possible implementation, the log synchronization unit of the destination data node is further configured to perform data replication according to the replication log when determining that the version of the source data node is lower than its own version.
With reference to the second aspect, in a possible implementation, the sending unit of the destination data node sending the data replication log to the source data node specifically includes:
the sending unit places the replication log in a local send queue for transmission to the source data node.
In a third aspect, an embodiment of the present invention provides a distributed database data synchronization method applied to a destination data node in the distributed database. While executing data operation instructions sent by an application client, the destination data node further performs the following:
receiving an upgrade pre-notification instruction sent by the metadata server, where the upgrade pre-notification instruction carries an entry that needs to be deleted;
receiving a data replication log sent by a source data node, where the data replication log carries data that needs to be synchronized to the destination node;
when determining that the replication log contains data corresponding to the entry to be deleted, filtering out the data corresponding to the entry to be deleted when performing data replication according to the replication log, and processing entries that already exist in the destination data node according to default values, thereby achieving data synchronization between the source data node and the destination data node.
With reference to the third aspect, in a possible implementation, the method further includes:
receiving an upgrade instruction sent by the metadata server, where the upgrade instruction is used to delete an entry of the data node, and deleting the entry of the data node according to the upgrade instruction. In this implementation, because the upgrade pre-notification instruction carrying the entry to be deleted has already been received, the destination data node knows in advance which entries' data needs to be deleted. It can therefore handle that data ahead of the upgrade when it receives a replication log or sends a replication log to the source data node, which achieves data synchronization during the upgrade.
With reference to the third aspect, in a possible implementation, the method further includes:
when generating a data replication log according to the execution result of the data operation instruction, filtering out the data corresponding to the entry that needs to be deleted, and sending the data replication log to the source data node.
In a fourth aspect, an embodiment of the present invention provides a destination data node of a distributed database, where the destination data node specifically includes:
a data operation instruction processing unit, configured to execute a data operation instruction sent by an application client;
a receiving unit, configured to receive an upgrade pre-notification instruction sent by the metadata server, where the upgrade pre-notification instruction carries an entry to be deleted;
the receiving unit is further configured to receive a data replication log sent by the source data node, where the data replication log carries data that needs to be synchronized to the destination data node;
a log processing unit, configured to: when it is determined that the replication log contains data corresponding to the entry to be deleted, filter out that data while performing data replication according to the replication log, and handle the entries that already exist in the destination data node according to default values, thereby achieving data synchronization between the source data node and the destination data node.
With reference to the fourth aspect, in a possible implementation, the destination data node further includes:
an upgrade unit, configured to receive an upgrade instruction sent by the metadata server, where the upgrade instruction is used to delete an entry of the data node, and to delete the entry of the data node according to the upgrade instruction. In this implementation, because the upgrade pre-notification instruction has already been received and carries the entries to be deleted, the destination data node knows in advance which entries' data is to be deleted. It can therefore pre-process a replication log when receiving one, or when sending one to the source data node, which achieves data synchronization during an upgrade.
With reference to the fourth aspect, in a possible implementation, the destination data node further includes:
a log sending unit, configured to: when generating a data replication log according to the execution result of a data operation instruction, filter out the data corresponding to the entry to be deleted, and send the data replication log to the source data node.
With reference to the third aspect or the fourth aspect, in a possible implementation, the sending, by the destination data node, of the data replication log to the source data node specifically includes:
placing the replication log in the local send queue for sending to the source data node.
In a fifth aspect, an embodiment of the present invention provides a distributed database, including the destination data node according to the second aspect or the fourth aspect, and a source data node corresponding to the destination data node.
The distributed database provided by the embodiments of the present invention can complete data synchronization between the source data node and the destination data node while both nodes are being upgraded, which further improves the reliability of the distributed database.
BRIEF DESCRIPTION OF THE DRAWINGS
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Apparently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort. In the drawings:
FIG. 1 is a schematic structural diagram of a distributed database provided by the prior art;
FIG. 2 is a schematic structural diagram of a distributed database according to Embodiment 1 of the present invention;
FIG. 3 is a schematic diagram of data storage on data nodes according to an embodiment of the present invention;
FIG. 4 is a flowchart of a data synchronization method according to Embodiment 2 of the present invention;
FIG. 5 is a flowchart of a data synchronization method according to Embodiment 3 of the present invention;
FIG. 6 is a schematic diagram of data storage on data nodes according to an embodiment of the present invention;
FIG. 7 shows a data synchronization method of a distributed database according to Embodiment 4 of the present invention;
FIG. 8 is a schematic structural diagram of a destination data node according to Embodiment 5 of the present invention;
FIG. 9 is a schematic structural diagram of a destination data node according to Embodiment 6 of the present invention;
FIG. 10 is a schematic structural diagram of a destination data node in a distributed database according to Embodiment 7 of the present invention.
DETAILED DESCRIPTION
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The present invention provides a data synchronization method for a distributed database, a related apparatus, and a database system. Referring to FIG. 2, FIG. 2 is a schematic structural diagram of a database system according to Embodiment 1 of the present invention.
As shown in FIG. 2, the data processing system of the present invention includes, from top to bottom, an application client, a metadata server, and a data node server.
The application client is an application that uses the database, for example a billing application. The application client can access the data stored in the data node server.
The metadata server is responsible for the distributed management capability of the database system. The metadata server may be deployed independently or co-located with the data node server. It may be deployed on a minicomputer, an X86 computer, or a PC server.
The application client communicates with the metadata server and the data node server over an IP network. The communication interface between the client and the metadata server or the data node server may be a Transmission Control Protocol (TCP) interface or a User Datagram Protocol (UDP) interface.
The data node server includes a plurality of data nodes (also called physical nodes). A data node may be a minicomputer, an X86 computer, a PC server, or the like. The data of a data node may be stored on a storage medium located in a storage network. The data nodes read from and write to the storage network through block I/O (Block IO), that is, the storage medium is read and written in units of blocks. The storage medium may be a hard disk drive (HDD), a solid state drive (SSD), or the like.
In addition, a driver is deployed on the client, and the driver caches routing information. In this way, when the client sends a data operation instruction to a physical node, it can make the routing decision from the cached routing information and access the corresponding physical node.
Referring to FIG. 3, FIG. 3 is a schematic diagram of data storage on the data nodes. Data node 1 and data node 2 back each other up. Shard 1 on data node 1 and shard 2 on data node 2 are primary shards (a shard may also be called a virtual node); shard 2 on data node 1 and shard 1 on data node 2 are backup shards. Data node 1 and data node 2 may be called the source data node and the destination data node, respectively. The names "source data node" and "destination data node" are relative.
The metadata server stores the mapping between virtual nodes and physical nodes, and the definition of the primary shard and the backup shard of each virtual node. The metadata server is also used to control data synchronization between data nodes.
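The mapping kept by the metadata server, and the routing decision a client driver could make from a cached copy of it, can be sketched as follows. This is only an illustrative sketch: the class `ShardPlacement`, the node names, and the modulo rule in `shard_for_key` are assumptions introduced here and are not specified in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardPlacement:
    primary_node: str  # physical node holding the primary shard
    backup_node: str   # physical node holding the backup shard


# Placement matching FIG. 3: the two data nodes back each other up.
PLACEMENT = {
    1: ShardPlacement(primary_node="data_node_1", backup_node="data_node_2"),
    2: ShardPlacement(primary_node="data_node_2", backup_node="data_node_1"),
}


def shard_for_key(key: int, shard_count: int = len(PLACEMENT)) -> int:
    """Hypothetical key-to-shard rule; shard ids start at 1."""
    return key % shard_count + 1


def route(key: int) -> str:
    """Return the physical node that holds the primary shard for this key."""
    return PLACEMENT[shard_for_key(key)].primary_node
```

With this placement, a key that maps to shard 1 is routed to data node 1, and data node 2 receives the corresponding replication log for its backup copy of shard 1.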
Referring next to FIG. 4, FIG. 4 is a flowchart of a data synchronization method according to Embodiment 2 of the present invention. Data node 1 is the source data node, and data node 2 is the destination data node. The data synchronization method provided by this embodiment can be applied to a database upgrade scenario and mainly includes the following steps:
Step 301: Data node 1 receives an upgrade instruction sent by the metadata server, where the upgrade instruction instructs the source data node to perform a version upgrade.
Data nodes 1 and 2 receive, one after the other, the upgrade instruction sent by the metadata server, which tells them to upgrade from version V1 to V2. The upgrade instruction may modify fields of the data model (for example, a table structure) or add fields to the data model. Because the nodes receive the upgrade instruction at different times, the data models of data nodes 1 and 2 may be temporarily inconsistent. This embodiment is described using the case where data node 1 receives the upgrade instruction first.
Step 302: Data node 1 performs the upgrade according to the upgrade instruction.
Step 303: After the upgrade succeeds, data node 1 receives a data operation command sent by the application client, performs the data operation according to the command, and generates a replication log.
In this embodiment, data node 1 can receive data operation instructions sent by the application client while the upgrade is in progress, for example queries or modifications of the data in primary shard 1. After performing the data operation, data node 1 generates a replication log that carries the data to be synchronized to the data node holding backup shard 1 (that is, data node 2).
Step 304: Data node 1 sends the generated replication log to data node 2.
Before the upgrade starts, or before it is completed, data node 1 can also receive and process data operation instructions sent by the client and generate replication logs. At this point data node 1 and data node 2 are at the same version, so data node 1 simply sends the replication log to data node 2 to synchronize data between the two nodes.
After the upgrade is completed, data node 1 can likewise receive and process data operation instructions sent by the client and generate replication logs. At this point the version of data node 1 is higher than that of data node 2, that is, the data models of data node 1 and data node 2 are out of sync. In the prior art, data node 2 cannot process a replication log received from data node 1 in this situation. This embodiment is described using this case as an example.
In this embodiment, the replication log may be passed between data node 1 and data node 2 asynchronously: data node 1 places the replication log in its local send queue, and the operation is considered finished once local processing is done. Compared with passing the replication log synchronously (waiting until the destination data node has processed it successfully), this improves the availability and partition tolerance of the database system.
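The asynchronous hand-off described above can be sketched with a local queue drained by a background thread. This is a minimal sketch under stated assumptions: the class `AsyncLogSender` and the `deliver` callback (which stands in for the network send to the peer node) are hypothetical, and retry or failure handling is deliberately left out.

```python
import queue
import threading


class AsyncLogSender:
    """Queues replication logs locally and delivers them in the background."""

    def __init__(self, deliver):
        self._queue = queue.Queue()
        self._deliver = deliver  # hypothetical callback that sends one log to the peer
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def send(self, log_entry):
        """Enqueue the log and return at once, without waiting for the peer."""
        self._queue.put(log_entry)

    def _drain(self):
        # Deliver queued logs one by one, in the order they were enqueued.
        while True:
            entry = self._queue.get()
            try:
                self._deliver(entry)
            finally:
                self._queue.task_done()

    def flush(self):
        """Block until every queued log has been handed to deliver()."""
        self._queue.join()
```

The caller of `send` returns as soon as the log is in the queue, which is the property the text relies on: the write path on the source node does not wait for the destination node.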
Step 305: Data node 2 receives the data replication log sent by data node 1, where the replication log carries the data that needs to be synchronized to data node 2.
Step 306: Data node 2 obtains the version of data node 1 and, when it determines that the version of data node 1 is higher than its own, caches the replication log.
In this embodiment, the replication log may carry the version of data node 1, in which case data node 2 obtains the version of data node 1 from the replication log and compares it with its own version. Alternatively, data node 2 may receive a notification message sent by data node 1 that carries the version of the source data node, and obtain the version of data node 1 from that message.
In this embodiment, if the version of data node 1 is higher than the version of data node 2, the replication log was generated according to the new data model. Data node 2 therefore does not process the replication log for the time being and caches it.
In another embodiment, if the version of data node 1 is lower than the version of data node 2, that is, the data model on data node 2 is newer, data node 2 can process the replication log sent by data node 1 and synchronize data according to it. If the two versions are the same, data node 2 can likewise process the replication log sent by data node 1 and synchronize data according to it.
Step 307: Data node 2 receives an upgrade instruction sent by the metadata server, where the upgrade instruction instructs data node 2 to perform a version upgrade.
In this embodiment, data nodes 1 and 2 receive the upgrade instruction sent by the metadata server one after the other.
Step 308: Data node 2 performs the upgrade according to the upgrade instruction.
Step 309: After the upgrade is completed, data node 2 performs data replication according to the cached replication log.
In this embodiment, once data node 2 has finished upgrading, its version is the same as that of data node 1. Data node 2 then starts processing the replication log, that is, it performs data replication (also called data redo) according to the replication log cached in step 306, which keeps the data on data node 1 and data node 2 synchronized.
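Steps 305 to 309 on the destination side can be sketched as follows: logs from a newer-version source are cached, and the cache is replayed once the local upgrade has finished. The class `DestinationNode`, the log fields `source_version`, `key` and `row`, and the use of integer versions are assumptions made for this sketch only.

```python
class DestinationNode:
    """Destination data node that defers logs produced by a newer data model."""

    def __init__(self, version: int):
        self.version = version
        self.data = {}     # key -> row; stands in for the backup shard
        self.pending = []  # logs received from a source at a higher version

    def on_replication_log(self, log: dict) -> str:
        # Step 306: compare the source version carried in the log with our own.
        if log["source_version"] > self.version:
            self.pending.append(log)
            return "cached"
        self._apply(log)
        return "applied"

    def upgrade(self, new_version: int) -> None:
        # Steps 308-309: upgrade first, then redo the cached logs in arrival order.
        self.version = new_version
        still_pending = []
        for log in self.pending:
            if log["source_version"] > self.version:
                still_pending.append(log)
            else:
                self._apply(log)
        self.pending = still_pending

    def _apply(self, log: dict) -> None:
        self.data[log["key"]] = log["row"]
```

Replaying in arrival order keeps later writes to the same key from being overwritten by earlier ones when the cache is drained.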
In this embodiment, the source data node (data node 1) can process data operation instructions sent by the application client as usual, and generate data replication logs and send them to the destination data node (data node 2). While it is being upgraded, the destination data node can still receive the data replication logs sent by the source data node. When the version of the source data node is higher than its own, the destination data node first caches the data replication log; after its own version upgrade is completed, it performs data replication according to the cached log, which completes synchronization between the source data node and the destination data node. Compared with the prior art, the upgrade method provided by this embodiment allows both the source data node and the destination data node to process data operation commands sent by the application client while they are being upgraded, which improves the reliability of the distributed database.
In this embodiment, while performing the above upgrade, data node 2 can also execute data operation instructions sent by the application client. Data node 2 may further perform the following steps:
Step 310: When executing the data operation instruction, data node 2 generates a data replication log, where the data replication log carries the data that needs to be synchronized to the data node corresponding to data node 2.
The data node corresponding to data node 2 is data node 1. For example, if data in primary shard 2 on data node 2 is modified, the replication log carries the modified data, and that data needs to be synchronized to backup shard 2 on data node 1.
Step 311: Data node 2 sends the data replication log to data node 1.
In this embodiment, the replication log may be passed between data node 2 and data node 1 asynchronously: data node 2 places the replication log in its local send queue, and the operation is considered finished once local processing is done. Compared with passing the replication log synchronously, this improves the availability and partition tolerance of the database system.
It should be noted that steps 310-311 and the preceding steps 305-309 are independent of each other in timing, that is, data node 1 and data node 2 receive data operation instructions in no particular order relative to each other.
Referring to FIG. 5, FIG. 5 is a flowchart of a data synchronization method according to Embodiment 3 of the present invention. Data node 1 is the source data node, and data node 2 is the destination data node. The data synchronization method provided by this embodiment is described mainly from the perspective of data node 2. While executing data operation instructions sent by the application client, data node 2 further performs the following steps:
Step 401: Data node 2 receives an upgrade pre-notification instruction sent by the metadata server, where the upgrade pre-notification instruction carries the entry to be deleted.
In this embodiment, before sending the upgrade instruction, the metadata server first sends an upgrade pre-notification instruction to data nodes 1 and 2. The upgrade pre-notification instruction tells the data nodes in advance that the data model is going to be modified. In this embodiment it carries the entries (also called fields) of the data model (for example, a table structure) that are to be deleted. This embodiment is described using data node 2 (the destination data node) as an example.
Step 402: Data node 2 receives a data replication log sent by data node 1, where the data replication log carries the data that needs to be synchronized to data node 2.
Data synchronization between data node 1 and data node 2 does not stop. After receiving a data operation instruction sent by the application client, data node 1 performs the corresponding data operation, generates a data replication log, and sends it to data node 2. The data replication log carries the data that needs to be synchronized to data node 2.
Step 403: When it is determined that the replication log contains data corresponding to the entry to be deleted, data node 2 filters out that data while performing data replication according to the replication log, and handles the entries that already exist in the destination data node according to default values.
Specifically, because data node 2 has received the upgrade pre-notification instruction in advance, and that instruction carries the entries to be deleted, data node 2 can determine whether the replication log contains data corresponding to an entry to be deleted. If it does, data node 2 filters out that data while performing data replication according to the received replication log, and handles the entries that already exist in data node 2 according to default values. In other words, the data of the entry to be deleted is not modified on data node 2, which avoids modifying data that the subsequent upgrade would delete anyway. This achieves data synchronization between data nodes 1 and 2 and improves the efficiency of data synchronization.
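The filtering in step 403 can be sketched as a row-level merge that skips the entries announced for deletion. This is one possible reading of "handled according to default values": an existing row keeps whatever it already stores for such an entry, and a newly created row gets the default. The function name `replay_row` and the column names in the usage note are illustrative assumptions.

```python
from typing import Optional


def replay_row(stored_row: Optional[dict], incoming_row: dict,
               entries_to_delete: set, default=None) -> dict:
    """Merge a replicated row into the local row, ignoring entries to be deleted."""
    if stored_row is None:
        # New row: the entry still exists in the local data model, so fill the default.
        result = {name: default for name in entries_to_delete}
    else:
        # Existing row: start from a copy so the stored value of the entry is kept.
        result = dict(stored_row)
    for name, value in incoming_row.items():
        if name not in entries_to_delete:
            result[name] = value
    return result
```

For example, replaying `{"id": 7, "name": "Anna", "bank": "B2"}` onto a stored row whose `bank` is `"B1"`, with `bank` announced for deletion, updates `name` and leaves `bank` untouched.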
Step 404: Data node 2 receives an upgrade instruction sent by the metadata server, where the upgrade instruction is used to delete the entry of the data node, and deletes the entry of the data node according to the upgrade instruction.
In this embodiment, data node 2 can execute the upgrade instruction sent by the metadata server while it is processing replication logs.
In this embodiment, the metadata server sends to data nodes 1 and 2 an upgrade instruction for deleting the entry of the data nodes. After receiving the upgrade instruction, data nodes 1 and 2 delete the entry according to the upgrade instruction.
Step 405: When generating a data replication log according to the execution result of a data operation instruction, data node 2 filters out the data corresponding to the entry to be deleted, and sends the data replication log to the source data node.
In this embodiment, after executing a data operation instruction sent by the application client, data node 2 itself also generates a data replication log according to the execution result. When generating the data replication log, data node 2 can filter out the data corresponding to the entry to be deleted, so that the replication log does not carry that data, which makes sending the data replication log more efficient.
It should be noted that step 405 and steps 402-404 are independent of each other in timing; for example, step 405 may be performed before step 402.
Further, in this embodiment the replication log may be passed between data node 1 and data node 2 asynchronously: data node 2 places the replication log in its local send queue, and the operation is considered finished once local processing is done. Compared with passing the replication log synchronously, this improves the availability and partition tolerance of the database system.
For a more detailed understanding of the data synchronization method provided by the embodiments of the present invention, a specific application scenario is given below. FIG. 6 is a schematic diagram of data storage on data nodes according to an embodiment of the present invention.
The distributed database consists of four physical nodes 1, 2, 3, and 4. Each node holds three primary shards (the hatched parts in the figure) and three backup shards (the unfilled parts in the figure).
For example, shard 3 is replicated from node 1 to node 2, and shard 6 is replicated from node 2 to node 1. Replication between node 1 and node 2 is bidirectional.
Each shard in the distributed database contains two tables, Table_A and Table_B. The current version is V1, and the table structures are as follows:
The structure of Table_A is
{
cust_id int,
cust_Name varchar(128),
cust_bank varchar(128),
}
The primary key of Table_A is cust_id.
The structure of Table_B is
{
product_id int,
product_Name varchar(128),
product_price int,
}
The primary key of Table_B is product_id.
In the target version V2 after the upgrade, the table structures change as follows:
the field cust_bank is deleted from Table_A, and the field Product_discount int is added to Table_B.
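The V1 to V2 change can be written down as data, which makes it easy to see which fields a node at one version has and a node at the other version lacks. This is only an illustrative sketch; the dictionary layout and the helper `upgrade_schema` are assumptions, while the table and field names come from the text.

```python
V1_SCHEMA = {
    "Table_A": {"cust_id": "int", "cust_Name": "varchar(128)", "cust_bank": "varchar(128)"},
    "Table_B": {"product_id": "int", "product_Name": "varchar(128)", "product_price": "int"},
}


def upgrade_schema(schema: dict, drop: dict, add: dict) -> dict:
    """Return a new schema with the given fields dropped and added.

    drop maps table name -> list of field names to remove.
    add maps table name -> {field name: type} to append.
    """
    upgraded = {table: dict(fields) for table, fields in schema.items()}
    for table, fields in drop.items():
        for field in fields:
            del upgraded[table][field]
    for table, fields in add.items():
        upgraded[table].update(fields)
    return upgraded


V2_SCHEMA = upgrade_schema(
    V1_SCHEMA,
    drop={"Table_A": ["cust_bank"]},
    add={"Table_B": {"Product_discount": "int"}},
)
```

A node still at V1 has `cust_bank` but not `Product_discount`, while a node already at V2 has the reverse; this is the mismatch the synchronization method has to bridge during the upgrade.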
Taking the bidirectional replication between node 1 (DataNode1) and node 2 (DataNode2) as an example, the following describes data synchronization during an online upgrade from V1 to V2.
Referring to FIG. 7, FIG. 7 shows a data synchronization method of a distributed database according to Embodiment 4 of the present invention. The synchronization method mainly includes:
Step 601: The metadata server receives a table structure upgrade instruction sent by the application client.
The table structure upgrade instruction includes deleting a field from Table_A and adding a field to Table_B.
Step 602: The metadata server sends an upgrade pre-notification instruction to all data nodes, where the upgrade pre-notification instruction carries the entry to be deleted.
The upgrade pre-notification instruction informs the data nodes of the upgrade from version V1 to V2, and of the field to be deleted from Table_A on the data nodes, cust_bank (also called a heterogeneous field). This embodiment is described using data node 1 and data node 2 as an example.
Step 603: Data nodes 1 and 2 downgrade synchronous replication to asynchronous replication.
As part of processing the upgrade pre-notification instruction, data nodes 1 and 2 downgrade synchronous replication to asynchronous replication. Replication is downgraded so that changes to log replication processing during the upgrade do not affect or block online services, which ensures high availability of the system.
At the same time, each data node marks the field to be deleted during the upgrade, Table_A.cust_bank.
Step 604: Data nodes 1 and 2 receive data operation instructions sent by the application client, perform the corresponding data operations, and generate replication logs.
Online services are not interrupted at any point during the upgrade. Write operations by online services on data nodes 1 and 2 produce replication logs. At this point, based on the field marked for deletion, a data node can filter out the field Table_A.cust_bank directly when producing the replication log. Generating replication logs in this way may be called heterogeneous replication.
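Heterogeneous replication on the generating side can be sketched as follows: when a write produces a replication log, the fields marked for deletion are left out of the payload. The function `build_replication_log` and the log layout (`shard`, `table`, `row`) are assumptions introduced for this sketch; the marked field Table_A.cust_bank is taken from the text.

```python
# Fields marked for deletion when the upgrade pre-notification was processed.
FIELDS_TO_DELETE = {("Table_A", "cust_bank")}


def build_replication_log(table: str, row: dict, shard: int,
                          fields_to_delete=FIELDS_TO_DELETE) -> dict:
    """Build a replication log entry that omits fields marked for deletion."""
    payload = {
        column: value
        for column, value in row.items()
        if (table, column) not in fields_to_delete
    }
    return {"shard": shard, "table": table, "row": payload}
```

A write to Table_A that touches `cust_bank` therefore yields a log without that field, while writes to Table_B pass through unchanged.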
Step 605: Data node 1 asynchronously sends the generated replication log to data node 2.
In this embodiment, data in primary fragment 3 on data node 1 has been modified, so the replication log generated by data node 1 carries the data that needs to be synchronized to backup fragment 3 on data node 2.
Step 606: Data node 2 performs data synchronization according to the replication log.
If the replication log contains the heterogeneous field Table_A.cust_bank, data node 2 filters that field out when it performs data synchronization (also called data redo) and sets the heterogeneous field Table_A.cust_bank to its default value (null).
For fragment 6, data node 1 likewise receives the replication log sent by data node 2 and performs data synchronization according to it.
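The receiving side of step 606 can be sketched in the same spirit. The function below is a hypothetical illustration that assumes a row is a dictionary and that null is represented by Python's `None`: any value the log carries for a marked field is discarded, and the local column is set to the default instead.

```python
def redo_row(table, local_columns, log_row, pending_drops):
    """Return the row to store locally when replaying one log record."""
    stored = {}
    for column in local_columns:
        if f"{table}.{column}" in pending_drops:
            # Heterogeneous field: ignore whatever the log carries and
            # store the default value (null, i.e. None here).
            stored[column] = None
        else:
            stored[column] = log_row.get(column)
    return stored


stored = redo_row(
    table="Table_A",
    local_columns=["cust_id", "cust_bank"],  # V1 schema still has cust_bank
    log_row={"cust_id": 7, "cust_bank": "Bank X"},
    pending_drops={"Table_A.cust_bank"},
)
print(stored)
```

The same function also covers a log from which the sender has already filtered the field, because the marked column is set to the default in both cases.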
Step 607: The metadata server sends an upgrade instruction to data nodes 1 and 2, instructing them to perform the version upgrade.
In this embodiment, the metadata server sends the upgrade instruction as a Data Definition Language (DDL) operation. The upgrade deletes the field cust_bank from table Table_A and adds the field Product_discount int to table Table_B.
Data nodes 1 and 2 do not receive the upgrade instruction at the same moment. This embodiment assumes, as an example, that data node 1 receives it first.
Step 608: Data node 1 receives and processes the DDL upgrade instruction first.
Once data node 1 has executed the DDL upgrade instruction, its version V2 is temporarily newer than version V1 of data node 2.
Step 609: Data node 1 sends the generated replication log to data node 2.
During the upgrade, replication for fragment 3 is handled as follows:
Because online services are not interrupted during the upgrade, data operations continue. The replication log generated by online writes to primary fragment 3 on data node 1 is sent asynchronously to data node 2. This replication log matches the new version V2: the field Table_B.Product_discount has been added to it and the field Table_A.cust_bank has been removed.
Step 610: Data node 2 performs data synchronization according to the replication log.
Data node 2 obtains version V2 of data node 1 from the replication log. Because data node 2 is still at the lower version V1, which has no field Table_B.Product_discount, it cannot process a V2 replication log (it cannot interpret the log entries for the added field Table_B.Product_discount). Data node 2 therefore caches this part of the replication log for now.
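The decision made in step 610 reduces to a version comparison. The sketch below is illustrative only: it assumes versions are comparable integers and that each log record carries a `version` key, and the `receive_log` function and its parameters are hypothetical names.

```python
from collections import deque


def receive_log(local_version, log, cache, apply):
    """Cache logs from a newer source version; apply all others at once."""
    if log["version"] > local_version:
        cache.append(log)
        return "cached"
    apply(log)
    return "applied"


cache, applied = deque(), []
v2_log = {"version": 2, "table": "Table_B", "row": {"Product_discount": 10}}
v1_log = {"version": 1, "table": "Table_B", "row": {}}
first = receive_log(1, v2_log, cache, applied.append)
second = receive_log(1, v1_log, cache, applied.append)
print(first, second)
```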
Step 611: Data node 2 sends the generated replication log to data node 1.
During the upgrade, replication for fragment 6 is handled as follows:
Before data node 2 receives the DDL upgrade instruction, online writes to primary fragment 6 on data node 2 generate a replication log that is sent asynchronously to data node 1. This replication log matches the old version V1.
Step 612: Data node 1 performs data synchronization according to the replication log.
Because the V2 table structure on data node 1 is newer, data node 1 can process the V1 replication log for fragment 6. Data node 1 processes the fragment 6 replication log from data node 2 normally. The added field Table_B.Product_discount, which does not exist in version V1 on data node 2, is set to its default value null during replication. The heterogeneous field Table_A.cust_bank is handled with the logic of step 606.
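Applying an older log to a newer table structure, as in step 612, can be sketched as filling in defaults for columns the log does not know about. The schema dictionary and the `product_id` column are assumptions for illustration; `Product_discount` is the field named in the description, and `None` stands in for null.

```python
# Hypothetical V2 layout of Table_B: column name -> default value.
# product_id is a made-up column; Product_discount is the field added in V2.
TABLE_B_V2 = {"product_id": None, "Product_discount": None}


def apply_older_log(schema, log_row):
    """Map a row from an older-version log onto the newer local schema."""
    return {
        column: log_row.get(column, default)
        for column, default in schema.items()
    }


row = apply_older_log(TABLE_B_V2, {"product_id": 42})
print(row)
```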
Step 613: Data node 2 processes the DDL upgrade instruction.
In this embodiment, data node 2 receives the DDL upgrade instruction from the metadata server later than data node 1. Data node 2 executes the DDL upgrade instruction and completes the version upgrade. In the upgraded version V2, the field cust_bank has been deleted from table Table_A and the field Product_discount int has been added to table Table_B.
Step 614: Data node 2 performs data synchronization according to the cached replication log.
The upgraded data node 2 is at version V2, so it can process the previously cached replication log, which completes data synchronization between data nodes 1 and 2.
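Steps 613 and 614 together amount to "upgrade, then replay the held-back logs in arrival order". A minimal sketch follows, assuming integer versions and an in-memory cache; the `DestinationNode` class and the `seq` key are hypothetical and serve only to make the ordering visible.

```python
from collections import deque


class DestinationNode:
    """Hypothetical destination node that holds back logs it cannot read yet."""

    def __init__(self, version):
        self.version = version
        self.cache = deque()
        self.applied = []

    def upgrade(self, new_version):
        # Step 613: executing the DDL upgrade raises the local version.
        self.version = new_version
        # Step 614: replay cached logs in arrival order, stopping at the
        # first log that is still newer than the local version.
        while self.cache and self.cache[0]["version"] <= self.version:
            self.applied.append(self.cache.popleft())


node2 = DestinationNode(version=1)
node2.cache.extend([{"version": 2, "seq": 1}, {"version": 2, "seq": 2}])
node2.upgrade(2)
print([log["seq"] for log in node2.applied])
```

Replaying from the front of the queue keeps the cached logs in the order in which the source produced them.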
Step 615: After completing the upgrade, data nodes 1 and 2 send an upgrade-success notification message to the metadata server.
Once the metadata server determines that all nodes have completed the upgrade, it enters post-upgrade processing, for example notifying each data node to restore the replication level to synchronous replication. Heterogeneous replication is also cancelled at this point, and replication logs are generated normally again.
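The post-upgrade step on the metadata server can be sketched as a simple completion tracker. The `UpgradeCoordinator` class and the command strings it returns are hypothetical; the description only states that post-upgrade processing starts after every node has reported success.

```python
class UpgradeCoordinator:
    """Hypothetical metadata-server helper that tracks upgrade completion."""

    def __init__(self, node_names):
        self.pending = set(node_names)

    def on_upgrade_success(self, node_name):
        """Record one success notice; return the follow-up commands, if any."""
        self.pending.discard(node_name)
        if self.pending:
            return []  # at least one node is still upgrading
        return ["restore_sync_replication", "cancel_heterogeneous_replication"]


coordinator = UpgradeCoordinator(["data_node_1", "data_node_2"])
after_first = coordinator.on_upgrade_success("data_node_1")
after_second = coordinator.on_upgrade_success("data_node_2")
print(after_first, after_second)
```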
Referring to FIG. 8, FIG. 8 is a schematic structural diagram of the destination data node provided in Embodiment 5 of the present invention.
The destination data node may be data node 2 shown in FIGS. 3 to 5. The destination data node uses general-purpose computer hardware, which includes a processor 101, a memory 102, a bus 103, an input device 104, an output device 105, and a network interface 106.
Specifically, the memory 102 may include computer storage media in the form of volatile and/or non-volatile memory, such as read-only memory and/or random access memory. The memory 102 may store an operating system, application programs, other program modules, executable code, and program data.
The input device 104 may be used to enter commands and information into the destination data node. Examples of the input device 104 include a keyboard or a pointing device such as a mouse, trackball, touchpad, microphone, joystick, game pad, satellite dish, scanner, or similar device. These input devices may be connected to the processor 101 through the bus 103.
The output device 105 may be used by the destination data node to output information. Besides a monitor, the output device 105 may be another peripheral output device, such as a speaker and/or a printing device. These output devices may also be connected to the processor 101 through the bus 103.
The destination data node may be connected to a network, for example a Local Area Network (LAN), through the network interface 106. In a networked environment, the computer-executable instructions of the destination data node may be stored in a remote storage device rather than only locally.
When the processor 101 of the destination data node executes the executable code or application programs stored in the memory 102, the destination data node can perform the method steps on the destination data node side in Embodiments 2, 3, and 4 above, for example steps 305-311, 401-405, 603, 606, and 610. For the detailed procedure, see Embodiments 2, 3, and 4 above; it is not repeated here.
Referring to FIG. 9, FIG. 9 is a schematic structural diagram of the destination data node provided in Embodiment 6 of the present invention.
As shown in the figure, the destination data node provided in this embodiment of the present invention includes:
a data operation instruction processing unit 710, configured to execute data operation instructions sent by the application client;
a receiving unit 720, configured to receive a data replication log sent by the source data node, where the replication log carries data that needs to be synchronized to the destination node;
a log cache unit 730, configured to obtain the version of the source data node and, when it determines that the version of the source data node is higher than its own version, cache the replication log;
the receiving unit 720 being further configured to receive an upgrade instruction sent by the metadata server, where the upgrade instruction instructs the destination data node to perform a version upgrade;
an upgrade unit 740, configured to perform the upgrade according to the upgrade instruction; and
a log synchronization unit 750, configured to perform data replication according to the cached replication log after the upgrade is complete.
The destination data node provided in this embodiment can be used in method Embodiments 2 and 4 above. The data operation instruction processing unit 710, the receiving unit 720, the log cache unit 730, the upgrade unit 740, and the log synchronization unit 750 cooperate to carry out the method steps on the destination data node side in Embodiments 2 and 4. Compared with a destination data node in the prior art, the destination data node provided in this embodiment has the same beneficial effects as the foregoing method embodiments when it performs data synchronization.
Specifically, the log cache unit 730 in the destination data node obtains the version of the source data node in one of the following ways:
the log cache unit 730 obtains the version of the source data node from the data replication log, where the data replication log carries the version of the source data node; or
the log cache unit 730 receives a notification message sent by the source data node, where the notification message carries the version of the source data node, and obtains the version of the source data node from the notification message.
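The two ways of obtaining the source version can be combined in one small helper. The sketch below is an assumption-laden illustration: the `SourceVersionTracker` class, the `version` key in the log record, and the preference for the version carried in the log over an earlier notification are all choices made here, not details given in the description.

```python
class SourceVersionTracker:
    """Hypothetical helper for the two ways of learning a source's version."""

    def __init__(self):
        self._announced = {}

    def on_notification(self, source, version):
        # Option 2: the source data node announces its version in a message.
        self._announced[source] = version

    def version_of(self, source, log):
        # Option 1: the replication log itself carries the source version.
        if "version" in log:
            return log["version"]
        # Otherwise fall back to the last announced version, if there is one.
        return self._announced.get(source)


tracker = SourceVersionTracker()
tracker.on_notification("data_node_1", 2)
from_log = tracker.version_of("data_node_1", {"version": 1, "row": {}})
from_notice = tracker.version_of("data_node_1", {"row": {}})
print(from_log, from_notice)
```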
Further, the destination data node described in FIG. 5 also includes:
a log generation unit 760, configured to generate a data replication log when the data operation instruction is executed, where the data replication log carries data that needs to be synchronized to the data node corresponding to the destination data node; and
a sending unit 770, configured to send the data replication log to the source data node. The sending unit 770 may place the replication log in a local send queue, from which it is sent to the source data node.
In this embodiment, for the detailed process by which the log generation unit 760 generates the replication log and the sending unit 770 sends it to the source data node, see the description of steps 310-311 in the method embodiment above.
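The local send queue mentioned for the sending unit 770 can be sketched with Python's standard `queue` module. The function names and the `send` callback are hypothetical; a real node would hand the logs to its network layer instead of a list.

```python
import queue

send_queue = queue.Queue()  # local send queue (hypothetical, in-process)


def enqueue_log(log):
    """Place a generated replication log on the local send queue."""
    send_queue.put(log)


def flush(send):
    """Send every queued log in FIFO order; return how many were sent."""
    sent = 0
    while True:
        try:
            log = send_queue.get_nowait()
        except queue.Empty:
            return sent
        send(log)
        sent += 1


delivered = []
enqueue_log({"seq": 1})
enqueue_log({"seq": 2})
count = flush(delivered.append)
print(count, delivered)
```

Decoupling log generation from transmission in this way fits the asynchronous replication mode used during the upgrade, because writers never wait for the peer node.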
In this embodiment, the log synchronization unit 750 of the destination data node is further configured to perform data replication according to the replication log when it determines that the version of the source data node is lower than its own version.
Referring to FIG. 10, FIG. 10 is a schematic structural diagram of the destination data node in the distributed database provided in Embodiment 7 of the present invention.
As shown in the figure, the destination data node specifically includes:
a data operation instruction processing unit 810, configured to execute data operation instructions sent by the application client;
a receiving unit 820, configured to receive an upgrade pre-notification instruction sent by the metadata server, where the upgrade pre-notification instruction carries the table entries that need to be deleted;
the receiving unit 820 being further configured to receive a data replication log sent by the source data node, where the data replication log carries data that needs to be synchronized to the destination node; and
a log processing unit 830, configured to, when it determines that the replication log contains data corresponding to the table entries to be deleted, filter out that data while performing data replication according to the replication log and set the corresponding table entries that already exist in the destination data node to their default values, thereby achieving data synchronization between the source data node and the destination data node.
The destination data node provided in this embodiment can be used in method Embodiments 3 and 4 above. The data operation instruction processing unit 810, the receiving unit 820, and the log processing unit 830 cooperate to carry out the method steps on the destination data node side in Embodiments 3 and 4. Compared with a destination data node in the prior art, the destination data node provided in this embodiment has the same beneficial effects as the foregoing method embodiments when it performs data synchronization.
Referring further to FIG. 10, the destination data node provided in this embodiment of the present invention also includes:
an upgrade unit 840, configured to receive an upgrade instruction sent by the metadata server, where the upgrade instruction is used to delete table entries of the data node, and to delete those table entries according to the upgrade instruction. In this implementation, because the upgrade pre-notification instruction has already been received and carries the table entries that need to be deleted, the destination data node knows in advance which table entries have data that must be removed. It can therefore handle that data in advance when it receives a replication log or sends a replication log to the source data node, which achieves data synchronization during an upgrade.
a log sending unit 850, configured to filter out the data corresponding to the table entries that need to be deleted when it generates a data replication log from the execution result of a data operation instruction, and to send the data replication log to the source data node. The log sending unit 850 may place the replication log in a local send queue, from which it is sent to the source data node.
In this embodiment, for the detailed process by which the upgrade unit 840 upgrades the data node itself and the log sending unit 850 sends the replication log to the source data node, see the description of steps 404-405 in the method embodiment above.
In this embodiment, the destination data node is presented in the form of functional units. A "unit" here may be an application-specific integrated circuit (ASIC), a circuit, a processor and memory that execute one or more software or firmware programs, an integrated logic circuit, and/or another device that can provide the functions described above. In a simple embodiment, a person skilled in the art will appreciate that the destination data node may also take the form shown in FIG. 8. For example, the functions of the data operation instruction processing unit 710, the receiving unit 720, the log cache unit 730, the upgrade unit 740, and the log synchronization unit 750 can all be implemented by the processor 101 and the memory 102 in FIG. 8. For instance, the data operation instruction processing unit 710 may execute the data operation instructions sent by the application client by having the processor 101 execute code stored in the memory 102.
The processor used to implement the destination data node of the present invention may be a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of the present invention. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
A person of ordinary skill in the art will understand that aspects of the present invention, or possible implementations of those aspects, may be embodied as a system, a method, or a computer program product. Accordingly, aspects of the present invention, or possible implementations of those aspects, may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, and so on), or an embodiment combining software and hardware aspects, all of which are referred to collectively here as a "circuit", "module", or "system". In addition, aspects of the present invention, or possible implementations of those aspects, may take the form of a computer program product, which refers to computer-readable program code stored in a computer-readable medium.
The foregoing describes only specific implementations of the present invention, but the scope of protection of the present invention is not limited to them. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. The scope of protection of the present invention shall therefore be determined by the scope of protection of the claims.

Claims (10)

  1. A data synchronization method for a distributed database, applied to a destination data node in the distributed database, wherein, while executing data operation instructions sent by an application client, the destination data node further performs the following:
    receiving a data replication log sent by a source data node, wherein the replication log carries data that needs to be synchronized to the destination node;
    obtaining the version of the source data node, and caching the replication log when the version of the source data node is determined to be higher than the destination data node's own version;
    receiving an upgrade instruction sent by a metadata server, wherein the upgrade instruction instructs the destination data node to perform a version upgrade; and
    upgrading according to the upgrade instruction and, after the upgrade is complete, performing data replication according to the cached replication log.
  2. The method according to claim 1, wherein obtaining the version of the source data node comprises:
    obtaining the version of the source data node from the data replication log, wherein the data replication log carries the version of the source data node; or
    receiving a notification message sent by the source data node, wherein the notification message carries the version of the source data node, and obtaining the version of the source data node from the notification message.
  3. The method according to claim 2, wherein the method further comprises: [data log synchronized to the source data node]
    generating a data replication log when the data operation instruction is executed, wherein the data replication log carries data that needs to be synchronized to the data node corresponding to the destination data node; and
    sending the data replication log to the source data node.
  4. The method according to claim 1, further comprising: [source node version is lower]
    performing data replication according to the replication log when the version of the source data node is determined to be lower than the destination data node's own version.
  5. The method according to claim 3, wherein sending the data replication log to the source data node specifically comprises:
    placing the replication log in a local send queue, from which it is sent to the source data node.
  6. A destination data node of a distributed database, comprising:
    a data operation instruction processing unit, configured to execute data operation instructions sent by an application client;
    a receiving unit, configured to receive a data replication log sent by a source data node, wherein the replication log carries data that needs to be synchronized to the destination node;
    a log cache unit, configured to obtain the version of the source data node and to cache the replication log when the version of the source data node is determined to be higher than the destination data node's own version;
    the receiving unit being further configured to receive an upgrade instruction sent by a metadata server, wherein the upgrade instruction instructs the destination data node to perform a version upgrade;
    an upgrade unit, configured to perform the upgrade according to the upgrade instruction; and
    a log synchronization unit, configured to perform data replication according to the cached replication log after the upgrade is complete.
  7. The destination data node according to claim 6, wherein the log cache unit obtaining the version of the source data node comprises:
    the log cache unit obtaining the version of the source data node from the data replication log, wherein the data replication log carries the version of the source data node; or
    the log cache unit receiving a notification message sent by the source data node, wherein the notification message carries the version of the source data node, and obtaining the version of the source data node from the notification message.
  8. The destination data node according to claim 7, wherein the data node further comprises:
    a log generation unit, configured to generate a data replication log when the data operation instruction is executed, wherein the data replication log carries data that needs to be synchronized to the data node corresponding to the destination data node; and
    a sending unit, configured to send the data replication log to the source data node.
  9. The destination data node according to claim 6, wherein the log synchronization unit is further configured to perform data replication according to the replication log when the version of the source data node is determined to be lower than the destination data node's own version.
  10. The destination data node according to claim 8, wherein the sending unit sending the data replication log to the source data node specifically comprises:
    the sending unit placing the replication log in a local send queue, from which it is sent to the source data node.
PCT/CN2017/085486 2016-07-20 2017-05-23 Distributed database data synchronisation method, related apparatus and system WO2018014650A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610578730.9 2016-07-20
CN201610578730.9A CN107644030B (en) 2016-07-20 2016-07-20 Distributed database data synchronization method, related device and system

Publications (1)

Publication Number Publication Date
WO2018014650A1 true WO2018014650A1 (en) 2018-01-25

Family

ID=60991877

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/085486 WO2018014650A1 (en) 2016-07-20 2017-05-23 Distributed database data synchronisation method, related apparatus and system

Country Status (2)

Country Link
CN (1) CN107644030B (en)
WO (1) WO2018014650A1 (en)

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11669320B2 (en) 2016-02-12 2023-06-06 Nutanix, Inc. Self-healing virtualized file server
US11922157B2 (en) 2016-02-12 2024-03-05 Nutanix, Inc. Virtualized file server
US11947952B2 (en) 2016-02-12 2024-04-02 Nutanix, Inc. Virtualized file server disaster recovery
US11550557B2 (en) 2016-02-12 2023-01-10 Nutanix, Inc. Virtualized file server
US11550559B2 (en) 2016-02-12 2023-01-10 Nutanix, Inc. Virtualized file server rolling upgrade
US11550558B2 (en) 2016-02-12 2023-01-10 Nutanix, Inc. Virtualized file server deployment
US11537384B2 (en) 2016-02-12 2022-12-27 Nutanix, Inc. Virtualized file server distribution across clusters
US11579861B2 (en) 2016-02-12 2023-02-14 Nutanix, Inc. Virtualized file server smart data ingestion
US11544049B2 (en) 2016-02-12 2023-01-03 Nutanix, Inc. Virtualized file server disaster recovery
US11645065B2 (en) 2016-02-12 2023-05-09 Nutanix, Inc. Virtualized file server user views
US11888599B2 (en) 2016-05-20 2024-01-30 Nutanix, Inc. Scalable leadership election in a multi-processing computing environment
US11568073B2 (en) 2016-12-02 2023-01-31 Nutanix, Inc. Handling permissions for virtualized file servers
US11562034B2 (en) 2016-12-02 2023-01-24 Nutanix, Inc. Transparent referrals for distributed file servers
US11775397B2 (en) 2016-12-05 2023-10-03 Nutanix, Inc. Disaster recovery for distributed file servers, including metadata fixers
US11922203B2 (en) 2016-12-06 2024-03-05 Nutanix, Inc. Virtualized server systems and methods including scaling of file system virtual machines
US11954078B2 (en) 2016-12-06 2024-04-09 Nutanix, Inc. Cloning virtualized file servers
US11281484B2 (en) 2016-12-06 2022-03-22 Nutanix, Inc. Virtualized server systems and methods including scaling of file system virtual machines
US11770447B2 (en) 2018-10-31 2023-09-26 Nutanix, Inc. Managing high-availability file servers
US11768809B2 (en) 2020-05-08 2023-09-26 Nutanix, Inc. Managing incremental snapshots for fast leader node bring-up
US11966729B2 (en) 2022-01-20 2024-04-23 Nutanix, Inc. Virtualized file server
US11966730B2 (en) 2022-01-26 2024-04-23 Nutanix, Inc. Virtualized file server smart data ingestion
CN117149905A (en) * 2023-08-16 2023-12-01 上海沄熹科技有限公司 Time sequence data copying method and device

Also Published As

Publication number Publication date
CN107644030B (en) 2021-05-18
CN107644030A (en) 2018-01-30

Similar Documents

Publication Publication Date Title
WO2018014650A1 (en) Distributed database data synchronisation method, related apparatus and system
US10929428B1 (en) Adaptive database replication for database copies
EP3811230B1 (en) Automatic query offloading to a standby database
AU2016405587B2 (en) Splitting and moving ranges in a distributed system
US9031910B2 (en) System and method for maintaining a cluster setup
WO2019154394A1 (en) Distributed database cluster system, data synchronization method and storage medium
US8229893B2 (en) Metadata management for fixed content distributed data storage
US11481139B1 (en) Methods and systems to interface between a multi-site distributed storage system and an external mediator to efficiently process events related to continuity
US10922303B1 (en) Early detection of corrupt data partition exports
JP2020514902A (en) Synchronous replication of datasets and other managed objects to cloud-based storage systems
US9515878B2 (en) Method, medium, and system for configuring a new node in a distributed memory network
JP2023546249A (en) Transaction processing methods, devices, computer equipment and computer programs
US11647075B2 (en) Commissioning and decommissioning metadata nodes in a running distributed data storage system
WO2012039988A2 (en) System and method for managing integrity in a distributed database
CN105493474B (en) System and method for supporting partition level logging for synchronizing data in a distributed data grid
WO2012045245A1 (en) Method and system for maintaining data consistency
KR101922044B1 (en) Recovery technique of data intergrity with non-stop database server redundancy
Dwivedi et al. Analytical review on Hadoop Distributed file system
US11860892B2 (en) Offline index builds for database tables
JP6196389B2 (en) Distributed disaster recovery file synchronization server system
US11514000B2 (en) Data mesh parallel file system replication
US10346299B1 (en) Reference tracking garbage collection for geographically distributed storage system
US11789971B1 (en) Adding replicas to a multi-leader replica group for a data set
US20200349035A1 (en) Method and system for minimizing rolling database recovery downtime
KR20180126431A (en) Recovery technique of data intergrity with non-stop database server redundancy

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 17830287; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 17830287; Country of ref document: EP; Kind code of ref document: A1