CN117349370A - Method and device for dynamically modifying data synchronization packet - Google Patents

Method and device for dynamically modifying data synchronization packet

Publication number
CN117349370A
Authority
CN
China
Prior art keywords
transaction
sub
commit
transactions
active
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311204111.XA
Other languages
Chinese (zh)
Inventor
孙峰
彭青松
刘启春
陈江辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Dream Database Co ltd
Original Assignee
Wuhan Dream Database Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Dream Database Co ltd
Priority to CN202311204111.XA
Publication of CN117349370A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G06F11/1402: Saving, restoring, recovering or retrying
    • G06F11/1415: Saving, restoring, recovering or retrying at system level
    • G06F11/1438: Restarting or rejuvenating
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/466: Transaction processing
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of database synchronization and provides a method and a device for dynamically modifying data synchronization packets (groups). Active transactions are used to manage the sub-transactions for which no commit operation has yet been received, and a virtual transaction is set up to change the group to which a table belongs; the target end then warehouses the corresponding operations in the order they had in the main transaction, so that the grouping configuration of the table to be modified is changed between groups without affecting data synchronization. This solves the problem that, when the grouping configuration is modified, the table to be modified cannot be updated in the order given by the main transaction, so that its data in the target-end database becomes inconsistent with its data in the source-end database, the consistency of the data in the database system is destroyed, and the data synchronization service runs incorrectly.

Description

Method and device for dynamically modifying data synchronization packet
Technical Field
The present invention relates to the field of database synchronization technologies, and in particular, to a method and apparatus for dynamically modifying a data synchronization packet.
Background
At present, in real-time database synchronization based on log analysis, operations are often grouped according to the table on which they depend, and the operations in different groups are then synchronized in parallel, in order to safeguard synchronization performance and improve parallelism. The table that an operation updates is the table on which that operation depends. In an actual production environment, the grouping configuration and the data synchronization policy are often tailored to specific services, so as to meet service requirements or to improve the synchronization performance of individual groups. Service requirements often change over time, and the grouping configuration of the tables to be synchronized must change with them; that is, during data synchronization, the group into which operations depending on a given table to be synchronized are divided is modified.
While the data synchronization service is running, the synchronization delay of each group may differ: some groups synchronize continuously, while others lag considerably because of differences in synchronization efficiency. Modifying the grouping configuration of a table to be modified therefore involves the transaction dependencies of data synchronization between groups. A table to be modified is a table to be synchronized whose grouping configuration needs to change, obtained by comparing the grouping configuration before modification with the grouping configuration after modification. If the grouping of a table to be modified is changed several times within a short period, and the table is moved back and forth between groups with the same group number, a single main transaction involving that table may be cut into several parts, with the resulting sub-transactions scattered across several groups, or with several of them in one group. The cut sub-transactions belong to the same main transaction, but the prior art cannot execute them in the preset order of all operations in the main transaction, so the update order of the table to be synchronized cannot be guaranteed. The data of the table to be modified in the target-end database then easily becomes inconsistent with the data of that table in the source-end database, which destroys the consistency of the data in the database system and causes the data synchronization service to run incorrectly.
In view of this, overcoming the drawbacks of the prior art is a problem to be solved in the art.
Disclosure of Invention
In view of the above drawbacks of, or needs for improvement in, the prior art, the present invention provides a method and apparatus for dynamically modifying data synchronization packets, which aims to strictly observe, during data synchronization, the execution order of operations in the source-end main transaction, and to modify the grouping configuration of a table to be modified without affecting data synchronization.
The invention adopts the following technical scheme:
in a first aspect, the present invention provides a method of dynamically modifying a data synchronization packet, comprising:
suspending log receiving service, traversing all current active transactions, and judging whether operations related to a table to be modified exist or not;
when the operation exists, determining the active transaction corresponding to the operation as an active transaction to be cut, constructing a virtual transaction according to grouping configuration, and taking the virtual transaction as a sub-transaction of the active transaction to be cut; the virtual transaction is used for changing the grouping to which the table to be modified belongs; wherein the packet configuration is a packet configuration after modification based on the modification request;
restarting the log receiving service, and warehousing the active transaction to be cut.
Further, determining, when the operation exists, the active transaction corresponding to the operation as an active transaction to be cut, constructing a virtual transaction according to the grouping configuration, and taking the virtual transaction as a sub-transaction of the active transaction to be cut, the virtual transaction being used for changing the group to which the table to be modified belongs, includes:
taking each sub-transaction of the active transaction to be cut as a sub-transaction to be cut, and adding an end mark to the sub-transaction to be cut so that no further operations are divided into it afterwards;
acquiring a target group of the table to be modified according to the group configuration;
constructing a virtual transaction of the table to be modified, taking the virtual transaction as a sub-transaction belonging to the target group of the corresponding active transaction to be cut, and adding an end mark for the virtual transaction;
and constructing a modification operation for dividing the table to be modified into the target group, and adding the modification operation to the virtual transaction.
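As an illustration only, the cutting steps above can be sketched in Python as follows. The names `SubTxn`, `cut_for_table` and the `MOVE_TABLE` marker are hypothetical and do not appear in the application; the sketch also assumes that an active transaction is held as a plain list of sub-transactions and that every operation is a `(op_number, table, payload)` tuple.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SubTxn:
    group: int
    ops: list = field(default_factory=list)  # (op_number, table, payload)
    ended: bool = False                       # end mark: accepts no more operations
    virtual: bool = False
    source_group: Optional[int] = None        # original group a virtual txn depends on


def cut_for_table(active_txns, new_config, table, old_group):
    """Cut every active transaction that touches `table`; return their IDs."""
    target_group = new_config[table]
    cut_ids = []
    for txn_id, subs in active_txns.items():
        touches = any(t == table for s in subs for (_, t, _) in s.ops)
        if not touches:
            continue
        # End-mark the existing sub-transactions so later operations start new ones.
        for s in subs:
            s.ended = True
        # Virtual transaction in the target group, itself end-marked, carrying the
        # modification operation that moves the table to the target group.
        v = SubTxn(group=target_group, ended=True, virtual=True, source_group=old_group)
        v.ops.append(("MOVE_TABLE", table, target_group))
        subs.append(v)
        cut_ids.append(txn_id)
    return cut_ids


active_txns = {
    1: [SubTxn(1, ops=[(1, "B", "update")])],
    2: [SubTxn(2, ops=[(1, "C", "update")])],
}
new_config = {"A": 1, "B": 2, "C": 2}
cut = cut_for_table(active_txns, new_config, "B", old_group=1)
print(cut)
```

Only the active transaction that touches table B is cut; the other one is left unchanged and keeps accepting operations in its existing sub-transaction.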
Further, restarting the log receiving service, warehousing the active transaction to be cut includes:
receiving a commit message, taking the sub-transactions of the active transaction corresponding to the commit message as sub-transactions to be warehoused, and judging whether the active transaction is an active transaction to be cut;
when it is an active transaction to be cut, traversing all sub-transactions to be warehoused of that active transaction and finding the corresponding virtual transaction among them;
taking the commit LSN of the commit message as both the commit LSN and the wait LSN of the virtual transaction, the commit number of the virtual transaction being the operation number of the last operation in the active transaction to be cut;
adding all sub-transactions to be warehoused to the committed linked list of the corresponding target group in ascending order of the combination of their commit numbers and commit LSNs;
and warehousing all sub-transactions to be warehoused according to the committed linked list and the wait LSN.
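A minimal sketch of this commit handling is given below, under stated assumptions. The names `Sub` and `on_commit` are hypothetical, the committed linked list of each group is modelled as a Python list, and the "combination" of commit LSN and commit number is assumed to be the tuple `(commit_lsn, commit_number)`; the application does not fix a concrete representation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Sub:
    group: int
    commit_number: int
    virtual: bool = False
    commit_lsn: int = 0
    wait_lsn: int = 0


def on_commit(subs, commit_lsn, last_op_number, committed_lists):
    """Stamp the sub-transactions of a committed active transaction and queue them."""
    for s in subs:
        s.commit_lsn = commit_lsn
        if s.virtual:
            # The virtual transaction waits on the same LSN it commits at and is
            # numbered after the last operation of the active transaction.
            s.wait_lsn = commit_lsn
            s.commit_number = last_op_number
    # Assumed ordering key: (commit LSN, commit number), ascending.
    for s in sorted(subs, key=lambda s: (s.commit_lsn, s.commit_number)):
        committed_lists[s.group].append(s)


subs = [
    Sub(group=1, commit_number=5),
    Sub(group=1, commit_number=2),
    Sub(group=2, commit_number=0, virtual=True),
]
committed_lists = defaultdict(list)
on_commit(subs, commit_lsn=500, last_op_number=5, committed_lists=committed_lists)
print([s.commit_number for s in committed_lists[1]])
```

The two sub-transactions of group 1 end up in ascending commit-number order in that group's list, and the virtual transaction is queued in its target group with matching commit and wait LSNs.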
Further, the warehousing all the sub-transactions to be warehoused according to the submitted linked list and the waiting LSN comprises:
judging, in turn, whether the commit LSN of each sub-transaction to be warehoused in the committed linked list is less than or equal to the commit LSN of the corresponding target group, and whether its commit number is less than or equal to the commit number of the target group;
if so, discarding the sub-transaction to be warehoused; if not, judging whether the sub-transaction to be warehoused is a virtual transaction;
when it is a virtual transaction, judging whether the commit LSN of the original group on which the virtual transaction depends is greater than or equal to the wait LSN of the virtual transaction, and whether the commit number of the original group is greater than or equal to the commit number of the virtual transaction; if not, not warehousing any of the sub-transactions to be warehoused in the committed linked list of the target group to which the virtual transaction belongs; if so, warehousing all sub-transactions to be warehoused in the target group to which the virtual transaction belongs;
each time a sub-transaction to be warehoused has been executed, taking its commit LSN as the commit LSN of the corresponding target group and its commit number as the commit number of the target group.
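The decision made for each entry of a committed linked list can be sketched as a small function. This is an illustration under assumptions, not the implementation of the application: `GroupProgress`, `Sub` and `step` are hypothetical names, and the actual execution against the target-end database is reduced to updating the recorded progress of the group.

```python
from dataclasses import dataclass


@dataclass
class GroupProgress:
    commit_lsn: int = 0     # commit LSN of the last sub-transaction warehoused
    commit_number: int = 0  # commit number of the last sub-transaction warehoused


@dataclass
class Sub:
    group: int
    commit_lsn: int
    commit_number: int
    virtual: bool = False
    wait_lsn: int = 0
    source_group: int = 0   # original group a virtual transaction depends on


def step(sub, progress):
    """Return 'discard', 'wait' or 'apply' for one sub-transaction."""
    target = progress[sub.group]
    if sub.commit_lsn <= target.commit_lsn and sub.commit_number <= target.commit_number:
        return "discard"  # already warehoused by this group
    if sub.virtual:
        origin = progress[sub.source_group]
        caught_up = (origin.commit_lsn >= sub.wait_lsn
                     and origin.commit_number >= sub.commit_number)
        if not caught_up:
            return "wait"  # the target group is held back until the origin catches up
    # Warehousing itself is omitted; only the group's progress is recorded.
    target.commit_lsn = sub.commit_lsn
    target.commit_number = sub.commit_number
    return "apply"


progress = {1: GroupProgress(400, 7), 2: GroupProgress(300, 2)}
virtual = Sub(group=2, commit_lsn=500, commit_number=3,
              virtual=True, wait_lsn=500, source_group=1)
first = step(virtual, progress)        # original group 1 is still behind LSN 500
progress[1] = GroupProgress(500, 3)    # group 1 catches up
second = step(virtual, progress)
third = step(virtual, progress)        # a replay of the same entry
print(first, second, third)
```

In this sketch a "wait" result is meant to block the whole target group, which is how the virtual transaction constrains the execution order between the original group and the target group.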
Further, restarting the log receiving service, warehousing the active transaction to be cut includes:
continuously receiving a log, and judging the type of operation corresponding to the log;
when the operation is a DML operation, dividing the DML operation into the corresponding sub-transaction according to the log and the grouping configuration corresponding to the DML operation; when that sub-transaction already carries an end mark, it is a sub-transaction of an active transaction to be cut, so a corresponding target sub-transaction is created according to the grouping configuration and the operation number of the DML operation is used as the commit number of the target sub-transaction;
and when the operation is a commit operation, finding the corresponding active transaction to be cut according to the log corresponding to the commit operation, and warehousing the sub-transactions to be warehoused in that active transaction.
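The handling of a DML operation after the log receiving service has been restarted can be sketched as follows. The names `Sub` and `route_dml` are hypothetical, and the sketch assumes that an end-marked sub-transaction is simply skipped when looking for a place to put the operation, so that a fresh target sub-transaction is created in the group given by the modified grouping configuration.

```python
from dataclasses import dataclass, field


@dataclass
class Sub:
    group: int
    ops: list = field(default_factory=list)  # (op_number, table, payload)
    ended: bool = False
    commit_number: int = 0


def route_dml(subs, config, table, op_number, payload):
    """Append a DML operation to an open sub-transaction of its group."""
    group = config[table]
    for s in subs:
        if s.group == group and not s.ended:
            break  # reuse the open sub-transaction of this group
    else:
        # Every sub-transaction of this group carries an end mark (or none exists):
        # create a new target sub-transaction.
        s = Sub(group)
        subs.append(s)
    s.ops.append((op_number, table, payload))
    s.commit_number = op_number  # the latest operation number becomes the commit number
    return s


# State after cutting: the old sub-transaction of group 1 and the virtual
# transaction in group 2 are both end-marked.
subs = [
    Sub(1, ops=[(1, "B", "u1")], ended=True, commit_number=1),
    Sub(2, ended=True),
]
new_config = {"A": 1, "B": 2, "C": 2}
first = route_dml(subs, new_config, "B", 2, "u2")
second = route_dml(subs, new_config, "B", 3, "u3")
print(len(subs), first.group, first.commit_number)
```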
Further, when the operation is a commit operation, finding a corresponding active transaction to be cut according to a log corresponding to the commit operation, and warehousing the sub-transaction to be warehoused in the active transaction to be cut includes:
adding commit messages for all sub-transactions to be put of the active transactions to be cut;
taking the commit LSN of the commit operation as the commit LSN of all sub-transactions to be put in storage;
and adding all sub-transactions to be put into a submitted linked list of the target group, and putting all sub-transactions to be put into a put according to the submitted linked list.
Further, before suspending the log receiving service, traversing all current active transactions and judging whether there is an operation involving the table to be modified, the method includes:
receiving a log, and analyzing the log to obtain at least one operation, corresponding table information, a main transaction ID and an operation number;
judging whether a corresponding active transaction exists according to the main transaction ID, and creating the active transaction corresponding to the main transaction ID when the corresponding active transaction does not exist;
judging, according to the table information, whether an original sub-transaction corresponding to the original group exists under the active transaction, and creating the original sub-transaction when it does not exist; dividing the operation into the original sub-transaction;
taking the operation number of the last operation in each original sub-transaction as the commit number of that original sub-transaction;
and, for an active transaction that receives a commit message, adding all its original sub-transactions to the committed linked lists of the corresponding original groups in ascending order of the combination of their commit numbers and commit LSNs, and executing and warehousing all the original sub-transactions according to the committed linked lists.
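The way parsed operations are divided into active transactions and original sub-transactions can be illustrated with a short sketch. The class and function names are hypothetical, and the grouping configuration is assumed to be a mapping from table name to group number.

```python
from dataclasses import dataclass, field


@dataclass
class SubTransaction:
    group: int
    operations: list = field(default_factory=list)  # (op_number, table, payload)
    commit_number: int = 0


@dataclass
class ActiveTransaction:
    main_txn_id: int
    subs: dict = field(default_factory=dict)  # group number -> SubTransaction


def route_operation(active, config, main_txn_id, table, op_number, payload):
    """Place one parsed operation into its active transaction and sub-transaction."""
    txn = active.get(main_txn_id)
    if txn is None:
        txn = active[main_txn_id] = ActiveTransaction(main_txn_id)
    group = config[table]
    sub = txn.subs.get(group)
    if sub is None:
        sub = txn.subs[group] = SubTransaction(group)
    sub.operations.append((op_number, table, payload))
    # The operation number of the last operation is the commit number.
    sub.commit_number = op_number
    return sub


config = {"A": 1, "B": 1, "C": 2}
active = {}
route_operation(active, config, 7, "A", 1, "insert")
route_operation(active, config, 7, "C", 2, "update")
route_operation(active, config, 7, "B", 3, "delete")
print({g: s.commit_number for g, s in active[7].subs.items()})
```

Main transaction 7 ends up with one sub-transaction per group: the one in group 1 holds operations 1 and 3, the one in group 2 holds operation 2.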
Further, before restarting the log receiving service and putting the active transaction to be cut in storage, the method includes:
executing a check point action, creating a check point file, and storing transaction operation information of the current system into the check point file;
after the transaction operation information is stored, the current grouping configuration of the table to be synchronized is modified and stored according to the grouping configuration, so that the current grouping configuration is recovered when the data synchronization service is restarted each time.
Further, when the data synchronization service is restarted and recovered after a failure:
restoring the synchronized commit LSN, commit number and virtual transaction in each target group according to the checkpoint file;
when the sub-transaction is executed and put in storage after recovery, judging whether the commit LSN and the commit number corresponding to the executed sub-transaction are smaller than or equal to the commit LSN and the commit number of the last synchronized sub-transaction;
if yes, not executing the sub-transaction and warehousing; if not, executing the sub-transaction, and modifying the current grouping configuration and storing according to the modification operation in the virtual transaction.
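The checkpoint and recovery check can be sketched as follows. This is a simplified illustration: the JSON layout, the file name and the function names are assumptions, and the restoration of virtual transactions is left out. Only the comparison against the last synchronized commit LSN and commit number follows the text above.

```python
import json
import os
import tempfile


def save_checkpoint(path, progress, config):
    """Persist per-group progress and the grouping configuration."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"progress": progress, "config": config}, f)


def load_checkpoint(path):
    """Read back what save_checkpoint wrote."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def should_skip(sub, last_synced):
    """True when the sub-transaction was already warehoused before the failure."""
    return (sub["commit_lsn"] <= last_synced["commit_lsn"]
            and sub["commit_number"] <= last_synced["commit_number"])


path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
save_checkpoint(
    path,
    progress={"2": {"commit_lsn": 500, "commit_number": 3}},  # keyed by group number
    config={"A": 1, "B": 2, "C": 2},
)
state = load_checkpoint(path)
replayed = {"group": "2", "commit_lsn": 500, "commit_number": 3}
fresh = {"group": "2", "commit_lsn": 620, "commit_number": 1}
skip_replayed = should_skip(replayed, state["progress"]["2"])
skip_fresh = should_skip(fresh, state["progress"]["2"])
print(skip_replayed, skip_fresh)
```

Group numbers are stored as strings here only because JSON object keys must be strings.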
In a second aspect, the present invention further provides an apparatus for dynamically modifying a data synchronization packet, for implementing the method for dynamically modifying a data synchronization packet according to the first aspect, where the apparatus for dynamically modifying a data synchronization packet includes:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor for performing the method of dynamically modifying data synchronization packets of the first aspect.
In a third aspect, the present invention also provides a non-volatile computer storage medium storing computer-executable instructions for execution by one or more processors to perform the method of dynamically modifying data synchronization packets of the first aspect.
Unlike the prior art, the invention has at least the following beneficial effects:
Active transactions are used to manage the sub-transactions for which no commit operation has yet been received, a virtual transaction is set up to change the group to which a table belongs, and the target end warehouses the corresponding operations in the order they had in the main transaction, so that the grouping configuration of the table to be modified is changed between groups without affecting data synchronization. This solves the problem that, when the grouping configuration is modified, the table to be modified cannot be updated in the order given by the main transaction, so that its data in the target-end database becomes inconsistent with its data in the source-end database, the consistency of the data in the database system is destroyed, and the data synchronization service runs incorrectly.
Drawings
In order to more clearly illustrate the technical solution of the embodiments of the present invention, the drawings that are required to be used in the embodiments of the present invention will be briefly described below. It is evident that the drawings described below are only some embodiments of the present invention and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 is a flow chart of a method for dynamically modifying a data sync packet according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of step 20 according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of step 30 according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of step 305a according to an embodiment of the present invention;
FIG. 5 is a schematic diagram showing another embodiment of the process of step 30 of the present invention;
FIG. 6 is a schematic diagram of an architecture of an apparatus for dynamically modifying a data synchronization packet according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
In the description of the present invention, the terms "inner", "outer", "longitudinal", "transverse", "upper", "lower", "top", "bottom", etc. refer to an orientation or positional relationship based on that shown in the drawings, merely for convenience of describing the present invention and do not require that the present invention must be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
The terms "first," "second," and the like herein are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first", "a second", etc. may explicitly or implicitly include one or more such feature. In the description of the present application, unless otherwise indicated, the meaning of "a plurality" is two or more.
In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
Example 1:
In a grouped synchronization mode based on log analysis, the target database receives the log sent by the source database and parses it to obtain at least one main transaction as delimited by the source database, each main transaction containing at least one operation. The target splits and classifies each main transaction table by table to obtain sub-transactions belonging to different groups; that is, each operation in the main transaction is classified into the group corresponding to the table on which it depends. The sub-transactions in each group are then executed in parallel by multiple threads to improve data synchronization efficiency.
The grouping configuration of the relevant tables may need to be modified while the data synchronization service is running. An example of such a modification is as follows: before modification, operations depending on table A and table B are divided into a first group, and operations depending on table C are divided into a second group; after modification, operations depending on table A are divided into the first group, and operations depending on tables B and C are divided into the second group. The table to be modified is table B.
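The example above can be reproduced with a few lines of Python. The sketch assumes that a grouping configuration is a mapping from table name to group number; the function name `tables_to_modify` is hypothetical.

```python
def tables_to_modify(old_config, new_config):
    """Return {table: (old_group, new_group)} for tables whose group changed."""
    changed = {}
    for table, new_group in new_config.items():
        old_group = old_config.get(table)
        if old_group is not None and old_group != new_group:
            changed[table] = (old_group, new_group)
    return changed


old_config = {"A": 1, "B": 1, "C": 2}  # before modification
new_config = {"A": 1, "B": 2, "C": 2}  # after modification
result = tables_to_modify(old_config, new_config)
print(result)  # {'B': (1, 2)}
```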
In the prior art, when the grouping configuration of a table to be modified is changed between groups, only operations depending on that table that are newly received after the modification are divided according to the new grouping configuration, into a sub-transaction of the same main transaction in a group different from the one used before the modification; operations that were divided into a group before the modification but have not yet been warehoused are not divided again. If the grouping configuration of a table to be modified is adjusted several times within a short period (for example, operations depending on table A are divided into the first group, then the second group, and finally the third group), and the table is moved back and forth between groups with the same group number (for example, operations depending on table A are divided into the first group, then the second group, and finally the first group again), a single main transaction involving that table may be cut in the target database into several sub-transactions in different groups. The cut sub-transactions may be scattered across several groups (for example, main transaction 1 is cut into sub-transaction 1 in the first group and sub-transaction 2 in the second group), or several cut sub-transactions may exist in one group (for example, because operations depending on table A are divided into the first group, then the second group, and then the first group again, sub-transaction 1 and sub-transaction 3 exist in the first group and sub-transaction 2 exists in the second group, the sub-transactions having been created in the order 1, 2, 3).
When the operations in each group are executed in parallel and warehoused, the synchronization efficiency of sub-transactions in different groups differs, so operations that belong to different sub-transactions before and after the grouping configuration change cannot be guaranteed to execute in the preset order of all operations in the main transaction. A table to be synchronized in the target-end database should be updated in the order given by the main transaction (that is, after operation 1 updates the table, operation 2 continues to update the table as left by operation 1). After the grouping configuration of that table is modified, this sequential update can no longer be guaranteed (that is, operation 1 may end up being applied on top of the table as already updated by operation 2). The data of the table to be modified in the target-end database then becomes inconsistent with the data of that table in the source-end database, the consistency of the data in the database system is destroyed, and the data synchronization service runs incorrectly.
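A toy example makes the ordering problem concrete. The two operations below are invented for illustration: operation 1 sets a value in the table to 5 and operation 2 increments it. If the group holding operation 2 happens to be warehoused first, the target end diverges from the source end.

```python
def apply_in_order(ops, start=0):
    """Apply a list of update functions to a single value, in the given order."""
    value = start
    for op in ops:
        value = op(value)
    return value


def op1(value):
    return 5          # operation 1: set the value to 5


def op2(value):
    return value + 1  # operation 2: increment the value


source_value = apply_in_order([op1, op2])  # order inside the main transaction
target_value = apply_in_order([op2, op1])  # group with op2 is warehoused first
print(source_value, target_value)  # 6 5
```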
Dynamic modifications of the grouping configuration tend to be small in scope, and the tables whose grouping configuration needs to be modified are known to the data synchronization service. After receiving a command to modify the grouping configuration, the data synchronization service parses the command to obtain the tables to be modified. During dynamic modification, log reception is suspended only very briefly, and the data synchronization service suspends only the thread of the log receiving service; other threads (such as the execution threads) keep running. Because log reception is suspended so briefly, there are very likely to be many operations that were divided into groups before the modification but have not yet been warehoused. When these operations are warehoused in parallel by group after the grouping configuration is modified, many operations in the sub-transactions of the same source-end main transaction cannot strictly follow the execution order in the main transaction.
To solve the foregoing problem, an embodiment of the present invention provides a method for dynamically modifying a data synchronization packet, as shown in fig. 1, including:
step 10: and pausing the log receiving service, traversing all current active transactions, and judging whether an operation related to the table to be modified exists or not.
A main transaction at the source end corresponds to an active transaction at the target end. The main transaction from the source end is cut into at least one sub-transaction, an active transaction is the set of those sub-transactions, and none of the sub-transactions under an active transaction has yet received the corresponding commit operation. Modifying the grouping configuration of a table to be modified means changing the group into which that table is divided. In the method of this embodiment, the active transactions to be cut, which correspond to source-end main transactions, are determined from the tables to be modified named in the grouping configuration change. Before the grouping configuration of a table to be modified is changed, the active transactions involving that table must be cut, so they must first be found. To prevent new operations from being divided into an active transaction during this process and thereby changing it, reception of the source log must be suspended before the modification; only then can the active transactions be cut.
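Step 10 can be sketched with a pause gate on the receiver and a scan over the active transactions. The class `LogReceiver`, its methods and `find_transactions_touching` are hypothetical names; the gate is built on `threading.Event`, and the sketch assumes that each active transaction is a list of `(op_number, table)` pairs.

```python
import threading


class LogReceiver:
    """Receiver whose intake can be paused without stopping other threads."""

    def __init__(self):
        self._running = threading.Event()
        self._running.set()
        self.received = []

    def pause(self):
        self._running.clear()

    def resume(self):
        self._running.set()

    def receive(self, record, timeout=None):
        # Blocks while paused; returns False if still paused after `timeout`.
        if not self._running.wait(timeout):
            return False
        self.received.append(record)
        return True


def find_transactions_touching(active_txns, table):
    """IDs of active transactions that contain an operation on `table`."""
    return [txn_id for txn_id, ops in active_txns.items()
            if any(t == table for _, t in ops)]


receiver = LogReceiver()
accepted_before = receiver.receive("op-1", timeout=0)
receiver.pause()                                    # step 10: suspend log reception
active_txns = {1: [(1, "A"), (2, "B")], 2: [(1, "C")]}
to_cut = find_transactions_touching(active_txns, "B")
accepted_while_paused = receiver.receive("op-2", timeout=0)
receiver.resume()                                   # step 30: restart log reception
accepted_after = receiver.receive("op-2", timeout=0)
print(to_cut, receiver.received)
```

In a running service the receiver thread would call `receive` without a timeout and simply block while paused; `timeout=0` is used here only so that the example runs in a single thread.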
Step 20: when the operation exists, determining the active transaction corresponding to the operation as an active transaction to be cut, constructing a virtual transaction according to grouping configuration, and taking the virtual transaction as a sub-transaction of the active transaction to be cut; the virtual transaction is used for changing the grouping to which the table to be modified belongs.
An active transaction to be cut is an active transaction that involves the table to be modified at the time the grouping configuration is modified, and it needs to be cut; the grouping configuration here is the modified grouping configuration of the current system. The virtual transaction stores the modification operation for the grouping configuration and constrains the execution order of the groups to which the table to be modified belongs before and after the change. This ensures that the corresponding operations are warehoused in the order they had in the main transaction, and that the grouping configuration of the table to be modified is changed between groups without affecting data synchronization.
Step 30: restarting the log receiving service, and warehousing the active transaction to be cut.
The warehousing refers to executing a certain operation on the target-end database to realize data synchronization.
The method for dynamically modifying data synchronization packets according to the embodiment of the invention addresses at least the following problems: first, the data synchronization of the table to be modified before and after the grouping configuration change; second, the modification of the grouping configuration of the current system, so that operations contained in logs received afterwards are divided according to the modified grouping configuration. Since the table to be modified is known, all active transactions involving the table to be modified are looked up among the current active transactions, the sub-transactions of the groups concerned are cut, and virtual transactions are constructed to constrain the execution order of the groups to which the table belongs before and after the change.
By using active transactions to manage the sub-transactions that have not yet received a commit operation, using a virtual transaction to modify the group to which the table belongs, and warehousing the corresponding operations at the target end in the order in which they occur in the main transaction, the grouping configuration of the table to be modified is changed between groups without affecting data synchronization. This solves the problem that, when the grouping configuration is modified, the table to be modified cannot be updated in the order of the operations in the main transaction, so that the data of the table in the target-end database becomes inconsistent with the data in the source-end database, the consistency of the database system is destroyed, and the data synchronization service fails.
In the embodiment of the invention, a log analysis service is deployed at the source end. The parsed operations are numbered independently per source-end main transaction, which guarantees that the operation numbers within a main transaction increase sequentially, and the log corresponding to each operation is filled with the table information involved in the operation. A person skilled in the art can choose the specific implementation of the operation number and the table information according to the specific data synchronization scenario. The target-end data synchronization service receives the operations in the log sent by the source end, classifies each operation into the corresponding active transaction according to the main transaction ID, extracts the table information involved in the operation, locates the group to which the operation belongs through the table information, and, using the group information, constructs the sub-transaction corresponding to the main transaction under the active transaction.
In order to better illustrate the method for dynamically modifying the data synchronization packet according to the present invention, the preparation work required to be done before modifying the packet configuration according to the embodiment of the present invention is further described below, specifically, a log is received, and the log is parsed to obtain at least one operation, corresponding table information, a master transaction ID and an operation number. Judging whether a corresponding active transaction exists according to the main transaction ID, and creating the active transaction corresponding to the main transaction ID when the corresponding active transaction does not exist. Judging whether an original sub-transaction corresponding to the original group exists under the active transaction according to the table information, and creating the original sub-transaction when the original sub-transaction does not exist; the operation is divided into the original sub-transactions. Wherein, the operation number of the last operation in each original sub-transaction is taken as the commit number of the corresponding original sub-transaction.
All the original sub-transactions are then added to the committed linked list of the corresponding original group in ascending order of the combination of the commit numbers and the commit LSNs (Log Sequence Numbers) of the original sub-transactions in the active transaction, and the original sub-transactions are executed and warehoused according to the committed linked list.
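As a non-limiting illustration, the preparation work described above could be sketched in Python as follows. All names (`SubTransaction`, `ActiveTransaction`, `classify`, `committed_order`) are hypothetical and not part of the embodiment, and the sketch assumes that the "combination" of commit LSN and commit number is the tuple (commit LSN, commit number); the embodiment leaves this choice to the implementer.

```python
from dataclasses import dataclass, field


@dataclass
class SubTransaction:
    group: str
    ops: list = field(default_factory=list)
    commit_number: int = 0  # operation number of the last operation
    commit_lsn: int = 0     # 0 until a commit message is received


@dataclass
class ActiveTransaction:
    main_trx_id: int
    subs: dict = field(default_factory=dict)  # group name -> SubTransaction


def classify(active, table_to_group, main_trx_id, op_number, table):
    """Route one parsed operation into its active transaction and sub-transaction."""
    # Create the active transaction for this main transaction ID if missing.
    trx = active.setdefault(main_trx_id, ActiveTransaction(main_trx_id))
    # Locate the group through the table information.
    group = table_to_group[table]
    # Create the sub-transaction for that group if missing.
    sub = trx.subs.setdefault(group, SubTransaction(group))
    sub.ops.append((op_number, table))
    # The last operation number becomes the commit number.
    sub.commit_number = op_number
    return sub


def committed_order(subs):
    """Assumed ordering key: (commit LSN, commit number), ascending."""
    return sorted(subs, key=lambda s: (s.commit_lsn, s.commit_number))
```

In this sketch an operation on a table of a group that has no sub-transaction yet simply creates one, which mirrors the "create when it does not exist" rule of the preparation step.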
When the method for dynamically modifying the data synchronization packet in the embodiment of the invention performs log analysis at the source end, the operations in each source-end main transaction are numbered sequentially. Since a transaction is a set of operations, the operations in the same transaction either all commit or none commit. Therefore, the target-end transactions corresponding to the same source-end main transaction share the same commit LSN, and within a group they must be ordered by operation number so that the order of operations in the source-end main transaction is strictly preserved. When a sub-transaction is executed at the target end, the operation number of its last operation is used to guarantee data consistency after fault recovery. After the target end receives the log, the operations of each source-end main transaction contained in the log are grouped according to the tables on which they depend and are divided into several sub-transactions. When a single main transaction parsed from the log at the source end involves operations on multiple tables that fall into different groups at the target end, an overall transaction (i.e., an active transaction) is created to hold the operations of the main transaction. Several sub-transactions are created under the active transaction, each corresponding to one group; operations belonging to the same group are classified under the same sub-transaction, and the operation number of the last operation of each sub-transaction is recorded as the commit number of that sub-transaction.
By setting the active transaction, the main transaction of the source end is managed at the target end as a set, so that even when the operations in the main transaction are divided according to the tables they depend on and, after the grouping configuration is modified, split into several sub-transactions, the sub-transactions located in each group can be found and warehoused in the order of the operations in the main transaction.
After the grouping configuration of the table to be modified is modified, the sub-transactions of the groups involved are cut, and the parts before and after the modification each form an independent sub-transaction. The logs received by the target end are all the logs generated by the source end when executing operations, and therefore include logs of operations that are being executed and have not been committed. To keep the database consistent, synchronization is based on the main transactions committed at the source end: when the target end receives a log, operations not yet committed at the source end are only divided into active transactions and the corresponding groups, and data synchronization of those operations is carried out only when the target end receives the commit operation.
After the operation of the table to be modified is cut from one packet to another, sub-transactions belonging to the two packets form a transaction dependency. If a certain table a in the database is updated by an operation in the transaction 1, the operation in the subsequent transaction 2 can complete updating of the table a based on the updated table a, then the transaction 2 is referred to as depending on the transaction 1, or a transaction dependency exists between the transaction 1 and the transaction 2. Because operations of the same sub-transaction must be guaranteed to be performed in strict order of operation numbers of operations in the main transaction, operations cut to the target packet must wait for the sub-transaction to complete binning in the original packet before binning can begin. To solve this problem, when an active transaction (corresponding to a source-side main transaction) is cut, a virtual transaction for checking the execution of an original packet is added to a target packet corresponding to the active transaction, and the target packet checks the execution completion of the original packet of the related table by executing the virtual transaction, and only when a condition is satisfied, the sub-transaction belonging to the target packet can start to execute.
To better illustrate the method for dynamically modifying data synchronization packets according to the present invention, the following further details the step 20 of the method for dynamically modifying data synchronization packets according to the embodiment of the present invention, specifically, as shown in fig. 2, the step 20 includes:
Step 201: Take the sub-transaction of the active transaction to be cut as the sub-transaction to be cut, and add an end mark to the sub-transaction to be cut, so that subsequent operations are no longer divided into it.
The end mark is used to mark sub-transactions that have been cut. By setting an end mark, the method for dynamically modifying the data synchronization packet according to the embodiment of the invention identifies, each time the grouping configuration is modified, the sub-transactions that were created before the log receiving service was suspended. The sub-transactions involving the table to be modified are searched for among the active transactions and determined to be sub-transactions to be cut; operations received afterwards are no longer divided into these sub-transactions, and the active transactions to which they belong are determined to be the active transactions that need to be cut, i.e., the active transactions to be cut.
An example of an active transaction provided by an embodiment of the present invention is as follows:
A1
{
G1 grouping:
G1_TRX1[ Table: T1, end mark: 0, commit LSN:0, commit number: 1]
}
The target end is configured with two groups, a G1 group and a G2 group (no sub-transaction under the G2 group has been created in active transaction A1 yet, so the G2 group is not shown), and the table to be modified is initially configured in the G1 group. An end mark of 0 indicates that no end mark has been added, and an end mark of 1 indicates that an end mark has been added. A commit LSN of 0 indicates that the sub-transaction has not received a commit message, so its commit LSN is not yet set. The commit number of a sub-transaction is set whenever an operation is divided into the sub-transaction, and equals the operation number of the last operation of the sub-transaction. Upon receiving from the source end an operation with commit LSN 21, active transaction A1 with ID 1 is created from the main transaction ID; at this point T1 belongs to the G1 group, so a sub-transaction G1_TRX1 belonging to the G1 group is created on the active transaction, the operation is added to the sub-transaction, and the operation number 1 of the operation is taken as the commit number of the sub-transaction.
According to the table to be modified T1, active transaction A1 is determined to be an active transaction to be cut. The sub-transaction of the original group that involves table T1 on the active transaction to be cut A1 is taken as the sub-transaction to be cut, and an end mark is added as follows:
A1
{
G1 grouping:
G1_TRX1[ Table: T1, end mark: 1, commit LSN:0, commit number: 1]
}
Step 202: and acquiring the target group of the table to be modified according to the group configuration.
For example, according to the packet configuration modification instruction, the target packet of the table T1 to be modified is acquired as a G2 packet.
Step 203: Construct a sub-transaction of the active transaction to be cut that belongs to the target group as a virtual transaction, and add an end mark to the virtual transaction. The operation number of the last operation of the corresponding active transaction is used as the commit number of the virtual transaction, so that the operation in the virtual transaction waits until the sub-transactions of the original group have been executed up to the commit LSN and commit number of the virtual transaction, and is executed only afterwards. The specific form of the operation number can be chosen by a person of ordinary skill in the art according to the specific data synchronization scenario and is not limited here.
Step 204: and constructing a modification operation for dividing the table to be modified into the target group, and adding the modification operation to the virtual transaction.
Steps 203 and 204 are exemplified as follows. On the active transaction to be cut A1, a virtual transaction G2_TRX2 that depends on the G1 group is constructed for the target group G2 of the table to be modified T1 (G2_TRX2 can be executed and warehoused only after the G1 group has finished warehousing); the commit number of the virtual transaction is the operation number of the last operation of the active transaction to be cut A1. If, after all the sub-transactions in group G1 have been warehoused, the virtual transaction G2_TRX2 can be successfully warehoused on the basis of the target-end database as updated by them, G2_TRX2 is said to depend on group G1. Since the active transaction to be cut has not yet received the commit message, the wait LSN is set to 0, and the modification operation that changes the grouping configuration of the table to be modified T1 from the G1 group to the G2 group is saved in the virtual transaction, as follows:
A1
{
G1 grouping:
G1_TRX1[ Table: T1, end mark: 1, commit LSN:0, commit number: 1]
G2 grouping:
G2_TRX2[ depends on G1, wait LSN:0, end mark: 1, commit LSN:0, commit number: 1]
}
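Steps 201 to 204 can be illustrated with the following Python sketch, which reproduces the A1 example above. It is only a sketch under assumptions: the `SubTrx` record, the `cut_active_transaction` function and the `("MOVE_TABLE", ...)` tuple used to represent the modification operation are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SubTrx:
    name: str
    group: str
    table: Optional[str] = None
    end_mark: int = 0
    commit_lsn: int = 0
    commit_number: int = 0
    depends_on: Optional[str] = None  # set only for virtual transactions
    wait_lsn: int = 0
    ops: list = field(default_factory=list)


def cut_active_transaction(subs, table, target_group, last_op_number, virtual_name):
    """Cut one active transaction (given as its list of sub-transactions)."""
    # Step 201: sub-transactions touching the table and still open are cut.
    to_cut = [s for s in subs
              if s.table == table and s.end_mark == 0 and s.depends_on is None]
    if not to_cut:
        return None  # this active transaction is not involved
    for s in to_cut:
        s.end_mark = 1
    original_group = to_cut[-1].group
    # Steps 202-203: virtual transaction in the target group, already end-marked,
    # with the last operation number of the active transaction as commit number.
    virtual = SubTrx(name=virtual_name, group=target_group, end_mark=1,
                     commit_number=last_op_number, depends_on=original_group)
    # Step 204: store the regrouping operation (hypothetical representation).
    virtual.ops.append(("MOVE_TABLE", table, original_group, target_group))
    subs.append(virtual)
    return virtual
```

The wait LSN and commit LSN of the virtual transaction stay at 0 here because, as in the example, no commit message has been received yet.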
When the target end adjusts the grouping configuration of the table to be modified, all active transactions are searched for those that involve the table to be modified, by checking whether their sub-transactions contain an operation on the table. If so, the active transaction is one that needs to be cut: an end mark is set on its sub-transaction in the original group, and the operation number of the last operation is recorded. The sub-transaction of the active transaction in the target group of the table is also looked up; if such a sub-transaction exists, it is also a transaction that needs to be cut. Active transactions that are not involved need no processing. The modification of the grouping configuration is then completed, and a virtual transaction is inserted into the target group that depends on the table to be modified; the virtual transaction is used to modify the group to which the table to be modified belongs. If a sub-transaction in the target group updates the table to be modified, the target group is said to depend on the table to be modified. After the grouping configuration of the table is modified, the sub-transactions of the groups involved are cut, and the parts before and after the modification each form an independent sub-transaction. Because an execution dependency exists between the two sub-transactions, a virtual transaction is placed between them to serialize the execution order of the groups involved, which ensures that the sub-transaction from before the modification is executed before the sub-transaction from after the modification.
The invention constructs the execution-order dependency between groups by forming sub-transactions of different stages (before and after the modification of the grouping configuration) when the group of a table is modified, and by inserting a virtual transaction for groups whose sub-transactions depend on one another. If, after all the sub-transactions in group G1 have been executed and warehoused, a sub-transaction in group G2 can be successfully executed and warehoused on the basis of the target-end database as updated by them, group G2 is said to depend on group G1, and a dependency relationship exists between group G1 and group G2.
After all related sub-transactions of the active transactions have been cut and all related active transactions have been modified, the modification of the grouping configuration information is complete.
Before step 30, the method further comprises:
executing a checkpoint action, creating a checkpoint file, and storing the transaction operation information of the current system in the checkpoint file. After the transaction operation information is stored, the current grouping configuration of the tables to be synchronized is modified according to the grouping configuration and saved, so that it can be restored each time the data synchronization service is restarted. Once the current grouping configuration has been modified and saved, the current system uses it to divide operations and construct groups after the log receiving service is restarted. The checkpoint action records the transaction state and the current grouping configuration of the current system, which facilitates recovery if the data synchronization service fails. When the modified target group or the original group again receives an operation from the same source-end main transaction, a sub-transaction is created at the target end to classify and warehouse the operation.
The execution thread of the target-end data synchronization service is responsible for polling the sub-transactions corresponding to the main transaction submitted by the source end in each packet and warehousing all the sub-transactions corresponding to the main transaction. To better illustrate the method for dynamically modifying data synchronization packets according to the present invention, step 30 of the method for dynamically modifying data synchronization packets according to an embodiment of the present invention is further detailed, specifically, as shown in fig. 3, the step 30 includes:
Step 301a: Receive a commit message, take the sub-transactions of the active transaction corresponding to the commit message as sub-transactions to be warehoused, and judge whether the active transaction is an active transaction to be cut. An active transaction whose corresponding sub-transaction has had an end mark added is an active transaction to be cut. When it is not an active transaction to be cut, all the sub-transactions to be warehoused are added to the committed linked list of the corresponding target group in ascending order of the combination of their commit numbers and commit LSNs, and are then warehoused.
Step 302a: and when the transaction is an active transaction to be cut, traversing all sub transactions to be put in the database of the active transaction to be cut, and finding corresponding virtual transactions from the sub transactions to be put in the database.
Step 303a: Take the commit LSN of the commit message as both the commit LSN and the wait LSN of the virtual transaction; the commit number of the virtual transaction is the operation number of the last operation in the active transaction to be cut. The commit number is set when the virtual transaction is constructed; the wait LSN is the commit LSN of the group that the virtual transaction needs to wait for. With the wait LSN set, the virtual transaction (the operation that modifies the grouping configuration) can begin executing only after the sub-transactions in the waited-for group whose commit LSN is less than or equal to the wait LSN have been synchronized.
Step 304a: and adding all the sub-transactions to be put into the submitted linked list of the corresponding target group according to the sequence from small to large of the combination of the submitted numbers of all the sub-transactions to be put and the submitted LSNs of all the sub-transactions to be put. The operation of the same source-side main transaction is the same in commit LSN, and different in commit number, and a person of ordinary skill in the art can self-assign a combination mode of the commit LSN and the commit number according to a specific data synchronization scene. Wherein each packet corresponds to a committed linked list.
Step 305a: and warehousing all the sub-transactions to be warehoused according to the submitted linked list and the waiting LSN.
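Steps 301a to 304a can be sketched in Python as follows. This is an illustration only: the function name `on_commit` and the dictionary keys are hypothetical, the combination of commit LSN and commit number is assumed to be the tuple (commit LSN, commit number), and appending to the lists assumes that commit messages arrive in increasing commit LSN order.

```python
def on_commit(sub_trxs, commit_lsn, committed_lists):
    """Stamp a committed active transaction and queue its sub-transactions.

    sub_trxs: list of dicts with keys name, group, commit_number and, for
    virtual transactions only, depends_on (hypothetical field names).
    committed_lists: dict mapping group name -> committed list of that group.
    """
    for trx in sub_trxs:
        # Every sub-transaction to be warehoused gets the commit LSN.
        trx["commit_lsn"] = commit_lsn
        # Step 303a: a virtual transaction also gets it as its wait LSN.
        if trx.get("depends_on") is not None:
            trx["wait_lsn"] = commit_lsn
    # Step 304a: ascending (commit LSN, commit number), one list per group.
    ordered = sorted(sub_trxs, key=lambda t: (t["commit_lsn"], t["commit_number"]))
    for trx in ordered:
        committed_lists.setdefault(trx["group"], []).append(trx)
    return committed_lists
```

An active transaction that was never cut contains no virtual transaction, so the same function covers the simpler branch of step 301a.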
To keep the database consistent, the operations to be synchronized are distinguished, before synchronization, according to the tables on which they depend. This avoids database errors that would arise during parallel execution if a sub-transaction were warehoused before the sub-transactions in the group on which its own group depends.
To better illustrate the method of dynamically modifying data synchronization packets of the present invention, step 305a of the method of dynamically modifying data synchronization packets of an embodiment of the present invention is further detailed, specifically, as shown in fig. 4, the step 305a includes:
Step 3051: Judge in turn whether the commit LSN of the sub-transaction to be warehoused in the committed linked list is less than or equal to the commit LSN of the corresponding target group, and whether the commit number of the sub-transaction to be warehoused is less than or equal to the commit number of the target group.
For example, active transaction A1, which has received the commit message, is as follows:
A1
{
G1 grouping:
G1_TRX1[ Table: T1, end mark: 1, commit LSN:25, commit number: 1]
G1_TRX4[ depends on G2, wait LSN:25, end mark: 1, commit LSN:25, commit number: 2]
G1_TRX5[ Table: T1, end mark: 0, commit LSN:25, commit number: 4]
G2 grouping:
G2_TRX2[ depends on G1, wait LSN:25, end mark: 1, commit LSN:25, commit number: 1]
G2_TRX3[ Table: T1, end mark: 1, commit LSN:25, commit number: 2]
}
The execution thread polls the G1 group and the G2 group and, from the committed linked list of each group, executes the sub-transactions in ascending order of the combination of commit LSN and commit number.
Step 3052: if yes, discarding the sub-transaction to be put in storage; if not, judging whether the sub-transaction to be put in storage is a virtual transaction.
For example, the execution thread takes the sub-transaction to be warehoused G1_TRX1 from the G1 group. At this point the G1 group has not yet warehoused any sub-transaction, so both its commit LSN and its commit number are 0. The commit LSN 25 of G1_TRX1 is not less than or equal to 0, and its commit number 1 is not less than or equal to 0, so the sub-transaction is not discarded and, not being a virtual transaction, is warehoused.
Step 3053: when the virtual transaction is a virtual transaction, sequentially judging whether the commit LSN corresponding to the original packet relied by the virtual transaction is larger than or equal to the waiting LSN of the virtual transaction, and judging whether the commit number corresponding to the original packet is larger than or equal to the commit number of the virtual transaction; if not, not warehousing all sub-transactions to be warehoused in the target group to which the virtual transaction belongs in the submitted linked list; and if so, warehousing all sub-transactions to be warehoused in the target group to which the virtual transaction belongs. Wherein, when the transaction is a virtual transaction, the execution of the sub-transaction in the original group corresponding to the modification is dependent; when not a virtual transaction, it may be executed directly, independent of the execution of other packets.
For example, the execution thread takes the sub-transaction to be warehoused G1_TRX4 from the G1 group. G1_TRX4 is a virtual transaction that depends on the execution of the G2 group. The commit LSN 25 of the G2 group equals the wait LSN 25 of the virtual transaction G1_TRX4, so that condition is satisfied; but the commit number 0 of the G2 group (the G2 group has not yet warehoused any sub-transaction) is not greater than or equal to the commit number 2 of G1_TRX4, so that condition is not satisfied. G1_TRX4 therefore cannot be executed: the sub-transactions to be warehoused of the G1 group on the current committed linked list are skipped, and the sub-transactions to be warehoused of the G2 group are taken instead.
The execution thread takes the sub-transaction to be warehoused G2_TRX2 from the G2 group, which depends on the execution of the G1 group. At this point the G1 group has executed G1_TRX1, so the commit LSN and commit number of the G1 group are the commit LSN 25 and commit number 1 of G1_TRX1. The wait LSN 25 of G2_TRX2 equals the commit LSN 25 of the G1 group, and the commit number 1 of G2_TRX2 equals the commit number 1 of the G1 group, so the condition is met and G2_TRX2 is warehoused.
Step 3054: after executing the sub-transaction to be put into the warehouse each time, taking the commit LSN of the sub-transaction to be put into the warehouse as the commit LSN of the corresponding target group, and taking the commit number of the sub-transaction to be put into the warehouse as the commit number of the target group. When executing the sub-transaction to be put in storage, each group also needs to record the commit LSN and commit number of the sub-transaction to be put in storage, which are completed by execution, and is used for filtering the executed sub-transaction to be put in storage after faults, so as to ensure the consistency of data.
For example, the execution thread takes the G1_TRX1 transaction of the G1 group and warehouses it, then records the commit LSN 25 and commit number 1 of the completed transaction as the commit LSN and commit number of the G1 group.
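The polling logic of steps 3051 to 3054 can be illustrated with the following single-threaded Python sketch. The names `run`, `committed`, `progress` and `apply` are hypothetical, and the sketch assumes that a sub-transaction is discarded only when both its commit LSN and its commit number are less than or equal to those recorded for its group, as the wording of step 3051 suggests.

```python
def run(committed, progress, apply):
    """Poll the groups and warehouse what is allowed to run.

    committed: group -> list of sub-transaction dicts, already ordered by
               (commit LSN, commit number).
    progress:  group -> (commit LSN, commit number) of the last warehoused
               sub-transaction; (0, 0) if nothing was warehoused yet.
    apply:     callback that performs the actual warehousing.
    Returns the names of the warehoused sub-transactions in execution order.
    """
    order = []
    progressed = True
    while progressed:
        progressed = False
        for group, queue in committed.items():
            while queue:
                trx = queue[0]
                g_lsn, g_num = progress[group]
                # Steps 3051-3052: already executed (e.g. before a failure).
                if trx["commit_lsn"] <= g_lsn and trx["commit_number"] <= g_num:
                    queue.pop(0)
                    progressed = True
                    continue
                dep = trx.get("depends_on")
                if dep is not None:
                    # Step 3053: a virtual transaction waits for the other group.
                    d_lsn, d_num = progress[dep]
                    if not (d_lsn >= trx["wait_lsn"]
                            and d_num >= trx["commit_number"]):
                        break  # skip this group, poll the next one
                apply(trx)
                # Step 3054: record the group's progress.
                progress[group] = (trx["commit_lsn"], trx["commit_number"])
                queue.pop(0)
                order.append(trx["name"])
                progressed = True
    return order
```

Applied to the committed A1 example above with both groups starting at (0, 0), this sketch warehouses G1_TRX1, then G2_TRX2 and G2_TRX3, and only then G1_TRX4 and G1_TRX5; rerunning it with the recorded progress discards every sub-transaction, which is the filtering behaviour relied on after a failure.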
By setting the virtual transaction, when the sub-transaction to be put in the warehouse related to the table to be modified is submitted, the execution sequence of the sub-transaction formed after cutting and the execution sequence of the dependency relationship executed between the original group and the target group are ensured. By setting the waiting LSN and the submitting LSN of the virtual transaction and setting the submitting number of the virtual transaction as the operation number corresponding to the last operation of the active transaction, the data synchronization service is ensured not to generate operation errors caused by the transaction dependency relationship of the sub-transaction in the process of executing the warehouse entry, and the consistency of the synchronous data is ensured to be recovered when the data synchronization service fails for recovery.
The target end continues to receive the log sent by the source end, generates or classifies the operations in the log into corresponding active transactions, and carries out corresponding processing according to the operation types. To better illustrate the method for dynamically modifying data synchronization packets according to the present invention, step 30 of the method for dynamically modifying data synchronization packets according to an embodiment of the present invention is further detailed, specifically, as shown in fig. 5, the step 30 includes:
Step 301b: Continue receiving the log and judge the type of the operation corresponding to the log. DML (Data Manipulation Language) operations that belong to the same main transaction at the source end and are split into different groups for parallel execution often have dependency conflicts. DDL (Data Definition Language) operations generally have no dependency conflicts and do not require the method for dynamically modifying the data synchronization packet of the embodiment of the invention. After a DDL operation is received, it is divided into the corresponding active transaction according to the table it depends on; after the corresponding commit message is received, it is added to the committed linked list according to the commit LSN and warehoused. When the operation is a rollback operation, the corresponding active transaction and all sub-transactions under it are released. A rollback operation at the source end undoes all the executed operations of the main transaction, so, to keep the source-end and target-end data consistent during data synchronization, all target-end operations related to that source-end main transaction must be rolled back as well.
Step 302b: when the operation is a DML operation, dividing the DML operation into corresponding sub-transactions according to the log and grouping configuration corresponding to the DML operation; when the sub-transaction has added an end mark, the sub-transaction is a sub-transaction of an active transaction to be cut, a corresponding target sub-transaction is created according to the grouping configuration, and the operation number of the DML operation is used as the commit number of the target sub-transaction.
For example, upon receiving a DML operation with commit LSN 22, active transaction A1 with ID 1 is located through the main transaction ID. The sub-transactions corresponding to the table to be modified T1 (the original sub-transaction G1_TRX1 and the target sub-transaction G2_TRX3) are looked up in active transaction A1. The original sub-transaction G1_TRX1 has had an end mark added, i.e., it is the sub-transaction that was cut when the grouping configuration was last modified, and according to the grouping configuration the table to be modified T1 now belongs to the G2 group. A target sub-transaction G2_TRX3 belonging to the G2 group is therefore created on the active transaction, the DML operation is added to it, and the operation number of the operation is taken as the commit number of the target sub-transaction, as follows:
A1
{
G1 grouping:
G1_TRX1[ Table: T1, end mark: 1, commit LSN:0, commit number: 1]
G2 grouping:
G2_TRX2[ depends on G1, wait LSN:0, end mark: 1, commit LSN:0, commit number: 1]
G2_TRX3[ Table: T1, end mark: 0, commit LSN:0, commit number: 2]
}
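The routing rule of step 302b can be sketched in Python as follows. The function name `route_dml` and the dictionary layout are hypothetical; the sketch assumes that an open sub-transaction is one whose end mark is 0 and that at most one open sub-transaction exists per group within an active transaction.

```python
def route_dml(sub_trxs, table_to_group, table, op_number, new_name):
    """Place one DML operation into the active transaction's sub-transactions.

    sub_trxs: list of sub-transaction dicts of one active transaction.
    table_to_group: the current (already modified) grouping configuration.
    new_name: name to give a newly created target sub-transaction.
    """
    group = table_to_group[table]
    for trx in sub_trxs:
        # End-marked sub-transactions (cut ones and virtual ones) are closed.
        if trx["group"] == group and trx["end_mark"] == 0:
            trx["ops"].append(op_number)
            trx["commit_number"] = op_number
            return trx
    # No open sub-transaction in the current group: create the target one.
    target = {"name": new_name, "group": group, "table": table,
              "end_mark": 0, "commit_lsn": 0,
              "commit_number": op_number, "ops": [op_number]}
    sub_trxs.append(target)
    return target
```

With the A1 state shown above before G2_TRX3 exists and the configuration mapping T1 to G2, an operation numbered 2 creates G2_TRX3 with commit number 2, while G1_TRX1 keeps commit number 1 because it carries an end mark.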
Upon receiving a DML operation with commit LSN 24, active transaction A1 with ID 1 is located through the main transaction ID. At this point T1 belongs to the G1 group, and a sub-transaction G1_TRX5 belonging to the G1 group already exists on the active transaction with end mark 0, i.e., no end mark has been added, so the DML operation is appended to the sub-transaction G1_TRX5.
A1
{
G1 grouping:
G1_TRX1[ Table: T1, end mark: 1, commit LSN:0, commit number: 1]
G1_TRX4[ depends on G2, wait LSN:0, end mark: 1, commit LSN:0, commit number: 2]
G1_TRX5[ Table: T1, end mark: 0, commit LSN:0, commit number: 4]
G2 grouping:
G2_TRX2[ depends on G1, wait LSN:0, end mark: 1, commit LSN:0, commit number: 1]
G2_TRX3[ Table: T1, end mark: 1, commit LSN:0, commit number: 2]
}
Step 303b: when the operation is a commit operation, the corresponding active transaction to be cut is found according to the log corresponding to the commit operation, and the sub-transactions to be warehoused in that active transaction are warehoused. The active transaction to be cut is determined from the main transaction ID in the log corresponding to the commit operation.
A commit message is added to all sub-transactions to be warehoused of the active transaction to be cut; the commit LSN of the commit operation is taken as the commit LSN of all sub-transactions to be warehoused; and all sub-transactions to be warehoused are added to the committed linked list of their target group and warehoused according to that list.
When the method for dynamically modifying the data synchronization packet in the embodiment of the invention continues to receive logs and partitions the corresponding operations, the group is determined from the table information parsed from the log corresponding to each operation. The operation belongs either to an ordinary active transaction or to an active transaction to be cut. If the table identified by the table information is not the table to be modified, the corresponding active transaction does not need to be cut, and operations received before and after the grouping configuration modification continue to be partitioned into the existing sub-transactions of that active transaction.
For example, when a commit operation with commit LSN 25 is received, active transaction A1 (ID 1) is located by the main transaction ID. All sub-transactions of A1 are traversed and their commit LSN is set to that of the current commit operation; for each sub-transaction that is a virtual transaction, the wait LSN for the group it depends on is also set to 25. Each sub-transaction is then added to the committed linked list of its group:
A1
{
G1 grouping:
G1_TRX1[ Table: T1, end tag: 1, commit LSN:25, commit number: 1]
G1_TRX4[ rely on G2, wait for LSN:25, end tag: 1, commit LSN:25, commit number: 2]
G1_TRX5[ Table: T1, end tag: 0, commit LSN:25, commit number: 4]
G2 grouping:
G2_TRX2[ rely on G1, wait for LSN:25, end tag: 1, commit LSN:25, commit number: 1]
G2_TRX3[ Table: T1, end tag: 1, commit LSN:25, commit number: 2]
}
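The commit handling shown in the listing above can be sketched as follows. This is a rough illustration under stated assumptions: `SubTx` and `on_commit` are hypothetical names, and a plain Python list sorted by (commit LSN, commit number) stands in for the committed linked list of each group.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SubTx:
    """Hypothetical sub-transaction record; depends_on marks a virtual one."""
    name: str
    group: str
    commit_number: int
    depends_on: Optional[str] = None
    wait_lsn: int = 0
    commit_lsn: int = 0


def on_commit(sub_txs: List[SubTx], commit_lsn: int,
              committed: Dict[str, List[SubTx]]) -> None:
    """Stamp every sub-transaction with the commit LSN and queue it
    in the committed list of its group."""
    for st in sub_txs:
        st.commit_lsn = commit_lsn
        if st.depends_on is not None:
            # A virtual transaction waits for the group it depends on.
            st.wait_lsn = commit_lsn
        committed.setdefault(st.group, []).append(st)
    for queue in committed.values():
        queue.sort(key=lambda s: (s.commit_lsn, s.commit_number))


a1 = [
    SubTx("G1_TRX1", "G1", commit_number=1),
    SubTx("G2_TRX2", "G2", commit_number=1, depends_on="G1"),
    SubTx("G2_TRX3", "G2", commit_number=2),
    SubTx("G1_TRX4", "G1", commit_number=2, depends_on="G2"),
    SubTx("G1_TRX5", "G1", commit_number=4),
]
committed: Dict[str, List[SubTx]] = {}
on_commit(a1, commit_lsn=25, committed=committed)
for group, queue in committed.items():
    print(group, [s.name for s in queue])
```

Sorting on the (commit LSN, commit number) pair keeps the sub-transactions of one main transaction in operation order within each group, which is what the execution stage relies on.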
The content of the commit message is not shown here; its specific implementation is determined by a person of ordinary skill in the art according to the specific data synchronization scenario and is not limited here.
Notably, when the data synchronization service restarts and recovers after a failure:
the synchronized commit LSN, commit number, and virtual transactions of each target group are restored from the checkpoint file. When sub-transactions are executed and warehoused after recovery, it is judged whether the commit LSN and commit number of the sub-transaction are both less than or equal to the commit LSN and commit number of the last synchronized sub-transaction. If so, the sub-transaction is neither executed nor warehoused; if not, the sub-transaction is executed, and for a virtual transaction the current grouping configuration is modified and saved according to the modification operation it carries. Filtering out sub-transactions whose commit LSN and commit number are both less than or equal to those of the target group ensures data consistency after failure recovery.
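The recovery filter described above can be sketched as a component-wise comparison. The names `Progress` and `already_synchronized` are hypothetical, and the values below are illustrative only.

```python
from typing import NamedTuple


class Progress(NamedTuple):
    """(commit LSN, commit number) of a sub-transaction or of a group."""
    commit_lsn: int
    commit_number: int


def already_synchronized(sub_tx: Progress, group: Progress) -> bool:
    """True when both values of the sub-transaction are less than or equal
    to the values restored for its group, so it must be skipped."""
    return (sub_tx.commit_lsn <= group.commit_lsn
            and sub_tx.commit_number <= group.commit_number)


# Illustrative state: the group was synchronized up to LSN 25, number 2.
restored_g1 = Progress(commit_lsn=25, commit_number=2)
pending = {
    "G1_TRX1": Progress(25, 1),
    "G1_TRX4": Progress(25, 2),
    "G1_TRX5": Progress(25, 4),
}
to_replay = [name for name, p in pending.items()
             if not already_synchronized(p, restored_g1)]
print(to_replay)
```

Only the sub-transaction whose commit number exceeds the restored value is replayed, so work completed before the failure is not applied twice.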
Example 2:
On the basis of embodiment 1 above, this embodiment of the present invention provides a specific example of the method for dynamically modifying a data synchronization packet, so that the whole synchronization process can be better understood. The log sequence shown in the following table is used as an example:
The source database has a table T1 (ID INT). A transaction at the source end operates on table T1 and generates the following log sequence; after receiving it, the receiving thread numbers the operations as follows:
Main transaction ID | Operation | Commit LSN | Operation number
1 | INSERT INTO T1(ID) VALUES(100); | 21 | 1
1 | UPDATE T1 SET ID=200 WHERE ID=100; | 22 | 2
1 | UPDATE T1 SET ID=300 WHERE ID=200; | 23 | 3
1 | UPDATE T1 SET ID=400 WHERE ID=300; | 24 | 4
1 | COMMIT | 25 | 5
The target synchronization service is configured with two groups, G1 and G2. Initially, the table to be modified, T1, is configured in group G1; T1 is then moved to group G2 and later changed back from group G2 to group G1. The process is as follows:
When the operation with commit LSN 21 is received, an active transaction A1 with ID 1 is created. At this point T1 belongs to group G1, so a sub-transaction G1_TRX1 belonging to group G1 is created on the active transaction and the operation is added to it, as follows:
A1
{
G1 grouping:
G1_TRX1[ Table: T1, end tag: 0, commit LSN:0, commit number: 1]
}
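The state notation used in these listings can be modelled with two small record types. This is only an illustrative sketch: the class and field names are hypothetical, and naming the active transaction "A" plus its main transaction ID is an assumption made to reproduce the listing.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SubTransaction:
    name: str                         # e.g. "G1_TRX1"
    group: str                        # group the sub-transaction belongs to
    table: Optional[str] = None       # set for ordinary sub-transactions
    depends_on: Optional[str] = None  # set for virtual transactions only
    wait_lsn: int = 0
    end_tag: int = 0                  # 1 = cut, accepts no more operations
    commit_lsn: int = 0               # 0 until the main transaction commits
    commit_number: int = 0            # number of the last operation added


@dataclass
class ActiveTransaction:
    main_id: int
    sub_transactions: List[SubTransaction] = field(default_factory=list)

    def dump(self) -> str:
        """Render the transaction in the listing format of this example."""
        lines = [f"A{self.main_id}", "{"]
        for g in sorted({s.group for s in self.sub_transactions}):
            lines.append(f"{g} grouping:")
            for s in self.sub_transactions:
                if s.group != g:
                    continue
                if s.depends_on is not None:
                    head = f"rely on {s.depends_on}, wait for LSN:{s.wait_lsn}"
                else:
                    head = f"Table: {s.table}"
                lines.append(
                    f"{s.name}[ {head}, end tag: {s.end_tag}, "
                    f"commit LSN:{s.commit_lsn}, "
                    f"commit number: {s.commit_number}]")
        lines.append("}")
        return "\n".join(lines)


# State after the operation with commit LSN 21 has been received.
a1 = ActiveTransaction(main_id=1)
a1.sub_transactions.append(
    SubTransaction(name="G1_TRX1", group="G1", table="T1", commit_number=1))
print(a1.dump())
```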
A modification operation is prepared that moves T1 from the original group G1 to the target group G2.
Reception of the source log is suspended, the active transactions are traversed, and the active transaction A1 involving T1 is found. An end tag is added to its sub-transaction G1_TRX1.
On active transaction A1, a virtual transaction G2_TRX2, in which G2 depends on G1, is constructed for the target group of table T1, and the modification operation that changes the group of T1 from G1 to G2 is saved in the virtual transaction.
A1
{
G1 grouping:
G1_TRX1[ Table: T1, end tag: 1, commit LSN:0, commit number: 1]
G2 grouping:
G2_TRX2[ rely on G1, wait for LSN:0, end tag: 1, commit LSN:0, commit number: 1]
}
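The cut performed on active transaction A1 can be sketched as follows. The names `SubTx` and `cut_active_transaction` are hypothetical, as is the free-text `modification` field. Based on the listings in this example, the sketch assumes the virtual transaction receives the operation number of the last operation received so far as its commit number.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SubTx:
    name: str
    group: str
    table: Optional[str] = None
    depends_on: Optional[str] = None    # set only for virtual transactions
    end_tag: int = 0
    commit_number: int = 0
    modification: Optional[str] = None  # regrouping carried by a virtual tx


def cut_active_transaction(sub_txs: List[SubTx], table: str,
                           original_group: str, target_group: str,
                           next_seq: int) -> SubTx:
    """Close the open sub-transactions of the table and append a virtual
    transaction in the target group that depends on the original group."""
    for st in sub_txs:
        if st.table == table and st.end_tag == 0:
            st.end_tag = 1
    last_op_number = max((st.commit_number for st in sub_txs), default=0)
    virtual = SubTx(
        name=f"{target_group}_TRX{next_seq}",
        group=target_group,
        depends_on=original_group,
        end_tag=1,
        commit_number=last_op_number,
        modification=f"move {table} from {original_group} to {target_group}",
    )
    sub_txs.append(virtual)
    return virtual


subs = [SubTx("G1_TRX1", "G1", table="T1", commit_number=1)]
v1 = cut_active_transaction(subs, "T1", "G1", "G2", next_seq=2)
print(v1.name, v1.depends_on, v1.commit_number, subs[0].end_tag)

# Later regrouping back to G1, after G2_TRX3 received operation number 2.
subs.append(SubTx("G2_TRX3", "G2", table="T1", commit_number=2))
v2 = cut_active_transaction(subs, "T1", "G2", "G1", next_seq=4)
print(v2.name, v2.depends_on, v2.commit_number)
```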
A checkpoint is performed on the transaction information in the current system and the modified grouping configuration, and the states of all current transactions are saved to a checkpoint file.
Reception of the source log is resumed.
When the operation with commit LSN 22 is received, active transaction A1 (ID 1) is located by the main transaction ID. At this point T1 belongs to group G2, so a sub-transaction G2_TRX3 belonging to group G2 is created on the active transaction and the operation is added to it.
A1
{
G1 grouping:
G1_TRX1[ Table: T1, end tag: 1, commit LSN:0, commit number: 1]
G2 grouping:
G2_TRX2[ rely on G1, wait for LSN:0, end tag: 1, commit LSN:0, commit number: 1]
G2_TRX3[ Table: T1, end tag: 0, commit LSN:0, commit number: 2]
}
A modification operation is prepared that moves T1 from the original group G2 to the target group G1.
Reception of the source log is suspended, the active transactions are traversed, and the active transaction A1 involving T1 is found. An end tag is added to its open sub-transaction G2_TRX3.
A1
{
G1 grouping:
G1_TRX1[ Table: T1, end tag: 1, commit LSN:0, commit number: 1]
G2 grouping:
G2_TRX2[ rely on G1, wait for LSN:0, end tag: 1, commit LSN:0, commit number: 1]
G2_TRX3[ Table: T1, end tag: 1, commit LSN:0, commit number: 2]
}
On active transaction A1, a virtual transaction G1_TRX4, in which G1 depends on G2, is constructed for the target group of table T1, and the modification operation that changes the group of T1 from G2 to G1 is saved in the virtual transaction.
A1
{
G1 grouping:
G1_TRX1[ Table: T1, end tag: 1, commit LSN:0, commit number: 1]
G1_TRX4[ rely on G2, wait for LSN:0, end tag: 1, commit LSN:0, commit number: 2]
G2 grouping:
G2_TRX2[ rely on G1, wait for LSN:0, end tag: 1, commit LSN:0, commit number: 1]
G2_TRX3[ Table: T1, end tag: 1, commit LSN:0, commit number: 2]
}
A checkpoint is performed on the transaction information in the current system and the modified grouping configuration, and the current system state is saved.
Reception of the source log is resumed.
When the operation with commit LSN 23 is received, active transaction A1 (ID 1) is located by the main transaction ID. At this point T1 belongs to group G1, so a sub-transaction G1_TRX5 belonging to group G1 is created on the active transaction and the operation is added to it.
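The checkpoint taken before log reception resumes could be persisted along the following lines. The source does not specify a file format, so the JSON layout, the field names, and the write-then-rename approach are all assumptions made for illustration.

```python
import json
import os
import tempfile


def write_checkpoint(path, group_config, group_progress, active_transactions):
    """Save the grouping configuration, per-group progress, and active
    transaction state; the JSON layout is hypothetical."""
    state = {
        "group_config": group_config,
        "group_progress": group_progress,
        "active_transactions": active_transactions,
    }
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # Rename last so that a crash never leaves a half-written checkpoint.
    os.replace(tmp_path, path)


def read_checkpoint(path):
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as d:
    ckpt = os.path.join(d, "sync.ckpt")
    write_checkpoint(
        ckpt,
        group_config={"T1": "G1"},
        group_progress={"G1": {"commit_lsn": 0, "commit_number": 0},
                        "G2": {"commit_lsn": 0, "commit_number": 0}},
        active_transactions={"1": ["G1_TRX1", "G2_TRX2",
                                   "G2_TRX3", "G1_TRX4"]},
    )
    restored = read_checkpoint(ckpt)
print(restored["group_config"])
```

Saving the grouping configuration together with the transaction state means a restart sees the configuration that matches the virtual transactions it restores.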
A1
{
G1 grouping:
G1_TRX1[ Table: T1, end tag: 1, commit LSN:0, commit number: 1]
G1_TRX4[ rely on G2, wait for LSN:0, end tag: 1, commit LSN:0, commit number: 2]
G1_TRX5[ Table: T1, end tag: 0, commit LSN:0, commit number: 3]
G2 grouping:
G2_TRX2[ rely on G1, wait for LSN:0, end tag: 1, commit LSN:0, commit number: 1]
G2_TRX3[ Table: T1, end tag: 1, commit LSN:0, commit number: 2]
}
When the operation with commit LSN 24 is received, active transaction A1 (ID 1) is located by the main transaction ID. At this point T1 belongs to group G1, and the operation is added to the open sub-transaction G1_TRX5.
A1
{
G1 grouping:
G1_TRX1[ Table: T1, end tag: 1, commit LSN:0, commit number: 1]
G1_TRX4[ rely on G2, wait for LSN:0, end tag: 1, commit LSN:0, commit number: 2]
G1_TRX5[ Table: T1, end tag: 0, commit LSN:0, commit number: 4]
G2 grouping:
G2_TRX2[ rely on G1, wait for LSN:0, end tag: 1, commit LSN:0, commit number: 1]
G2_TRX3[ Table: T1, end tag: 1, commit LSN:0, commit number: 2]
}
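When the commit operation with commit LSN 25 is received, active transaction A1 (ID 1) is located by the main transaction ID. The commit LSN of all its sub-transactions is set to 25, the wait LSN of the virtual transactions G1_TRX4 and G2_TRX2 is set to 25, and the sub-transactions are added to the committed linked lists of their groups.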
A1
{
G1 grouping:
G1_TRX1[ Table: T1, end tag: 1, commit LSN:25, commit number: 1]
G1_TRX4[ rely on G2, wait for LSN:25, end tag: 1, commit LSN:25, commit number: 2]
G1_TRX5[ Table: T1, end tag: 0, commit LSN:25, commit number: 4]
G2 grouping:
G2_TRX2[ rely on G1, wait for LSN:25, end tag: 1, commit LSN:25, commit number: 1]
G2_TRX3[ Table: T1, end tag: 1, commit LSN:25, commit number: 2]
}
The execution thread polls groups G1 and G2 and, from each group's committed linked list, executes sub-transactions in ascending order of the combination of commit LSN and commit number.
The execution thread fetches sub-transaction G1_TRX1 of group G1, executes and warehouses it, and records commit LSN 25 and commit number 1 as those of the last completed transaction.
The execution thread fetches G1_TRX4 of group G1. Because G1_TRX4 is a virtual transaction that depends on group G2, and group G2 has not yet reached its wait LSN and commit number, the thread skips group G1 (including G1_TRX4) and fetches the transactions of group G2.
The execution thread fetches G2_TRX2 of group G2 and warehouses it.
The execution thread fetches G2_TRX3 of group G2, warehouses it, and records commit LSN 25 and commit number 2 as those of the last completed transaction.
The pending sub-transactions of group G2 have now been executed, so the execution thread executes G1_TRX4 of group G1.
If the program is aborted at this point, the commit LSN of group G1 is 25 and its commit number is 2; the commit LSN of group G2 is 25 and its commit number is 2.
When the target-end synchronization service is restarted, sub-transaction G1_TRX5 of group G1 is executed after failure recovery.
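The polling order described above can be reproduced with a small simulation. It is a sketch under stated assumptions rather than the actual execution thread: `SubTx` and `run` are hypothetical names, plain lists stand in for the committed linked lists, and "executing" a sub-transaction just records its name.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class SubTx:
    name: str
    group: str
    commit_lsn: int
    commit_number: int
    depends_on: Optional[str] = None   # set only for virtual transactions
    wait_lsn: int = 0


def run(committed: Dict[str, List[SubTx]],
        progress: Dict[str, Tuple[int, int]]) -> List[str]:
    """Poll the groups until no group can advance; return execution order."""
    executed: List[str] = []
    advanced = True
    while advanced:
        advanced = False
        for group, queue in committed.items():
            while queue:
                st = queue[0]
                lsn, num = progress[group]
                if st.commit_lsn <= lsn and st.commit_number <= num:
                    queue.pop(0)        # already synchronized: discard
                    advanced = True
                    continue
                if st.depends_on is not None:
                    dep_lsn, dep_num = progress[st.depends_on]
                    if dep_lsn < st.wait_lsn or dep_num < st.commit_number:
                        break           # dependency not reached: skip group
                queue.pop(0)
                executed.append(st.name)
                progress[group] = (st.commit_lsn, st.commit_number)
                advanced = True
    return executed


committed = {
    "G1": [SubTx("G1_TRX1", "G1", 25, 1),
           SubTx("G1_TRX4", "G1", 25, 2, depends_on="G2", wait_lsn=25),
           SubTx("G1_TRX5", "G1", 25, 4)],
    "G2": [SubTx("G2_TRX2", "G2", 25, 1, depends_on="G1", wait_lsn=25),
           SubTx("G2_TRX3", "G2", 25, 2)],
}
progress = {"G1": (0, 0), "G2": (0, 0)}
order = run(committed, progress)
print(order)
print(progress)
```

The virtual transaction G1_TRX4 holds group G1 back until group G2 has reached (25, 2), so the operations on T1 are applied in their original order even though they are spread over two groups.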
Example 3:
Fig. 6 is a schematic diagram of an apparatus for dynamically modifying data synchronization packets according to an embodiment of the present invention. The apparatus of the present embodiment includes one or more processors 31 and a memory 32. In Fig. 6, one processor 31 is taken as an example.
The processor 31 and the memory 32 may be connected by a bus or by other means; a bus connection is taken as an example in Fig. 6.
As a non-volatile computer-readable storage medium, the memory 32 can store non-volatile software programs and non-volatile computer-executable programs, such as those implementing the method of dynamically modifying data synchronization packets in embodiment 1. The processor 31 performs the method of dynamically modifying data synchronization packets by running the non-volatile software programs and instructions stored in the memory 32.
The memory 32 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, memory 32 may optionally include memory located remotely from processor 31, which may be connected to processor 31 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The program instructions/modules are stored in the memory 32 and when executed by the one or more processors 31 perform the method of dynamically modifying data synchronization packets in embodiment 1 described above, for example, performing the steps shown in fig. 1-5 described above.
It should be noted that, because the content of information interaction and execution process between modules and units in the above-mentioned device and system is based on the same concept as the processing method embodiment of the present invention, specific content may be referred to the description in the method embodiment of the present invention, and will not be repeated here.
Those of ordinary skill in the art will appreciate that all or part of the steps of the methods in the embodiments may be implemented by a program that instructs the associated hardware. The program may be stored in a computer-readable storage medium, which may include read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and the like.
The foregoing describes only preferred embodiments of the invention and is not intended to limit it; any modifications, equivalents, and improvements made within the spirit and principles of the invention are intended to be covered.

Claims (10)

1. A method of dynamically modifying a data synchronization packet, comprising:
suspending a log receiving service, traversing all current active transactions, and judging whether there is an operation involving a table to be modified;
when the operation exists, determining the active transaction corresponding to the operation as an active transaction to be cut, constructing a virtual transaction according to a grouping configuration, and taking the virtual transaction as a sub-transaction of the active transaction to be cut; the virtual transaction is used for changing the group to which the table to be modified belongs; wherein the grouping configuration is the grouping configuration after modification based on a modification request;
restarting the log receiving service, and warehousing the active transaction to be cut.
2. The method of dynamically modifying a data synchronization packet according to claim 1, wherein, when the operation exists, determining the active transaction corresponding to the operation as an active transaction to be cut, constructing a virtual transaction according to the grouping configuration, and taking the virtual transaction as a sub-transaction of the active transaction to be cut, the virtual transaction being used for changing the group to which the table to be modified belongs, comprises:
taking the sub-transaction of the active transaction to be cut as a sub-transaction to be cut, and adding an end mark to the sub-transaction to be cut, so that subsequent operations are no longer divided into the sub-transaction to be cut;
acquiring a target group of the table to be modified according to the grouping configuration;
constructing a virtual transaction of the table to be modified, taking the virtual transaction as a sub-transaction, belonging to the target group, of the corresponding active transaction to be cut, and adding an end mark to the virtual transaction;
and constructing a modification operation for dividing the table to be modified into the target group, and adding the modification operation to the virtual transaction.
3. The method of dynamically modifying a data synchronization packet according to claim 2, wherein restarting the log receiving service and warehousing the active transaction to be cut comprises:
receiving a commit message, taking the sub-transactions of the active transaction corresponding to the commit message as sub-transactions to be warehoused, and judging whether the active transaction is an active transaction to be cut;
when the active transaction is an active transaction to be cut, traversing all sub-transactions to be warehoused of the active transaction to be cut, and finding the corresponding virtual transaction among the sub-transactions to be warehoused;
taking the commit LSN of the commit message as the commit LSN and the waiting LSN of the virtual transaction; the commit number of the virtual transaction is the operation number of the last operation in the active transaction to be cut;
adding all sub-transactions to be warehoused to the committed linked list of the corresponding target group in ascending order of the combination of the commit numbers and the commit LSNs of all sub-transactions to be warehoused;
and warehousing all the sub-transactions to be warehoused according to the committed linked list and the waiting LSN.
4. The method of dynamically modifying a data synchronization packet according to claim 3, wherein warehousing all the sub-transactions to be warehoused according to the committed linked list and the waiting LSN comprises:
sequentially judging whether the commit LSN of a sub-transaction to be warehoused in the committed linked list is less than or equal to the commit LSN of the corresponding target group, and judging whether the commit number of the sub-transaction to be warehoused is less than or equal to the commit number of the target group;
if yes, discarding the sub-transaction to be warehoused; if not, judging whether the sub-transaction to be warehoused is a virtual transaction;
when the sub-transaction to be warehoused is a virtual transaction, sequentially judging whether the commit LSN corresponding to the original group on which the virtual transaction depends is greater than or equal to the waiting LSN of the virtual transaction, and judging whether the commit number corresponding to the original group is greater than or equal to the commit number of the virtual transaction; if not, not warehousing any sub-transaction to be warehoused of the target group to which the virtual transaction belongs in the committed linked list; if yes, warehousing all sub-transactions to be warehoused of the target group to which the virtual transaction belongs;
each time a sub-transaction to be warehoused has been executed, taking the commit LSN of the sub-transaction to be warehoused as the commit LSN of the corresponding target group, and taking the commit number of the sub-transaction to be warehoused as the commit number of the target group.
5. The method of dynamically modifying a data synchronization packet according to claim 3, wherein restarting the log receiving service and warehousing the active transaction to be cut comprises:
continuing to receive a log, and judging the type of the operation corresponding to the log;
when the operation is a DML operation, dividing the DML operation into a corresponding sub-transaction according to the log corresponding to the DML operation and the grouping configuration; when the sub-transaction has had an end mark added, the sub-transaction is a sub-transaction of an active transaction to be cut, a corresponding target sub-transaction is created according to the grouping configuration, and the operation number of the DML operation is used as the commit number of the target sub-transaction;
and when the operation is a commit operation, finding the corresponding active transaction to be cut according to the log corresponding to the commit operation, and warehousing the sub-transactions to be warehoused in the active transaction to be cut.
6. The method of dynamically modifying a data synchronization packet according to claim 5, wherein, when the operation is a commit operation, finding the corresponding active transaction to be cut according to the log corresponding to the commit operation and warehousing the sub-transactions to be warehoused in the active transaction to be cut comprises:
adding a commit message to all sub-transactions to be warehoused of the active transaction to be cut;
taking the commit LSN of the commit operation as the commit LSN of all sub-transactions to be warehoused;
and adding all sub-transactions to be warehoused to the committed linked list of the target group, and warehousing all sub-transactions to be warehoused according to the committed linked list.
7. The method of dynamically modifying a data synchronization packet according to claim 1, wherein, before suspending the log receiving service, traversing all current active transactions, and judging whether there is an operation involving a table to be modified, the method comprises:
receiving a log, and parsing the log to obtain at least one operation and the corresponding table information, main transaction ID, and operation number;
judging whether a corresponding active transaction exists according to the main transaction ID, and creating the active transaction corresponding to the main transaction ID when the corresponding active transaction does not exist;
judging, according to the table information, whether an original sub-transaction corresponding to the original group exists under the active transaction, and creating the original sub-transaction when it does not exist; dividing the operation into the original sub-transaction;
using the operation number of the last operation in each original sub-transaction as the commit number of the corresponding original sub-transaction;
and, for an active transaction that receives a commit message, adding all the original sub-transactions to the committed linked list of the corresponding original group in ascending order of the combination of the commit numbers and the commit LSNs of all the original sub-transactions in the active transaction, and executing and warehousing all the original sub-transactions according to the committed linked list.
8. The method of dynamically modifying a data synchronization packet according to claim 1, wherein, before restarting the log receiving service and warehousing the active transaction to be cut, the method comprises:
executing a checkpoint action, creating a checkpoint file, and storing the transaction operation information of the current system in the checkpoint file;
after the transaction operation information is stored, modifying and saving the current grouping configuration of the table to be synchronized according to the grouping configuration, so that the current grouping configuration is restored each time the data synchronization service is restarted.
9. The method of dynamically modifying a data synchronization packet according to any one of claims 1-8, wherein, when the data synchronization service restarts and recovers after a failure:
the synchronized commit LSN, commit number, and virtual transaction of each target group are restored according to the checkpoint file;
when a sub-transaction is executed and warehoused after recovery, it is judged whether the commit LSN and the commit number of the executed sub-transaction are less than or equal to the commit LSN and the commit number of the last synchronized sub-transaction;
if yes, the sub-transaction is not executed or warehoused; if not, the sub-transaction is executed, and the current grouping configuration is modified and saved according to the modification operation in the virtual transaction.
10. An apparatus for dynamically modifying a data synchronization packet, comprising at least one processor and a memory, the at least one processor and the memory being connected via a data bus, the memory storing instructions executable by the at least one processor, the instructions, when executed by the at least one processor, being used to perform the method of dynamically modifying a data synchronization packet according to any one of claims 1-9.
CN202311204111.XA 2023-09-15 2023-09-15 Method and device for dynamically modifying data synchronization packet Pending CN117349370A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311204111.XA CN117349370A (en) 2023-09-15 2023-09-15 Method and device for dynamically modifying data synchronization packet

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311204111.XA CN117349370A (en) 2023-09-15 2023-09-15 Method and device for dynamically modifying data synchronization packet

Publications (1)

Publication Number Publication Date
CN117349370A true CN117349370A (en) 2024-01-05

Family

ID=89370155

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311204111.XA Pending CN117349370A (en) 2023-09-15 2023-09-15 Method and device for dynamically modifying data synchronization packet

Country Status (1)

Country Link
CN (1) CN117349370A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination