CN117349369A - Data synchronization method and device for packet execution

Info

Publication number
CN117349369A
Authority
CN
China
Prior art keywords
transaction
sub
current
packet
transactions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311197235.XA
Other languages
Chinese (zh)
Inventor
梅纲
吴鑫
高东升
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Dream Database Co ltd
Original Assignee
Wuhan Dream Database Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Dream Database Co ltd
Priority to CN202311197235.XA
Publication of CN117349369A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/466 Transaction processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of database synchronization and provides a data synchronization method and device for packet execution. By introducing multi-version grouping management, the invention allows the grouping rules to be adjusted at any time during data synchronization while still guaranteeing the consistency of the synchronized data and splitting the execution granularity of transactions, which improves the convenience and fault tolerance of the synchronization program and the efficiency of data synchronization. By allowing an execution priority to be set for each packet in the grouping rules, the invention adapts data synchronization more closely to service requirements and allows the synchronization order to be specified manually without causing running errors in the synchronization service.

Description

Data synchronization method and device for packet execution
Technical Field
The present invention relates to the field of database synchronization technologies, and in particular, to a method and an apparatus for packet execution data synchronization.
Background
Real-time data synchronization is issued and executed in units of transactions, and the minimum granularity of concurrent execution is the transaction. Data synchronization is the process of copying changed data from the source database to the destination database in as short a time as possible. A transaction is a sequence of database operations that access and possibly modify various data items; the operations are either all executed or none are executed, so the transaction is an indivisible unit of work consisting of all database operations performed between its beginning and its end. When the log is parsed, transaction information and transaction operation records are cached per transaction; transaction operations are the database operations that a transaction contains. When parsing reaches the commit operation of a transaction, the information of the transaction to be committed is issued to the execution end, which traverses all transaction operations of that transaction in the cache and executes and warehouses them. Warehousing means executing a transaction operation in the destination database so as to achieve data synchronization.
In the prior art, data synchronization groups the transaction operations in a transaction by table, but does not consider grouping adjustment or execution priority. The grouping rules cannot be modified once data synchronization has started, and the dependencies among transactions are complex, so convenience and fault tolerance are low and running errors of the synchronization service are easily caused. In addition, dependent transactions are executed strictly in order of their SCN (System Change Number), which increases waiting time and lowers synchronization efficiency. Suppose a transaction operation in transaction 1 updates table a in the database, and a transaction operation in the subsequent transaction 2 then updates table a on the basis of that update. The operation in transaction 2 can only be executed successfully after the operation in transaction 1 has been executed, so a dependency exists between the two transaction operations. Because a transaction is an indivisible unit of work, its transaction operations cannot be executed independently, so transaction 2 can only be executed successfully after transaction 1 has been executed; that is, transaction 2 depends on transaction 1. Dependencies may exist between different transactions, and they affect the order in which the transactions are executed. The SCN is a number that increases sequentially in a database system and is used to determine the order of operations precisely.
Because transactions are generally defined by service requirements, a single transaction often contains transaction operations that update several tables, which makes the dependencies among transactions complex. When running errors of the synchronization service must be avoided, execution can only be scheduled at the granularity of whole transactions, which cannot be split: the transaction operations that update some of the tables in a transaction cannot be executed independently, and the transaction operations that update a particular table cannot be executed with priority, which further reduces synchronization efficiency. For example, suppose the transaction operation in transaction 1 that updates table a is independent, that is, it has no dependency on transaction operations in other transactions. Because this operation is part of transaction 1, it cannot be executed on its own and is bound by the preset execution order of the operations in transaction 1, so operations in transaction 1 and in other transactions that do not depend on each other still wait for each other, wasting execution time. Moreover, since transaction 1 may contain other transaction operations besides the one that updates table a, the operation that updates table a cannot be executed with priority, and synchronization efficiency is low whenever transaction 1 and other transactions have to wait for each other.
In view of this, overcoming the drawbacks of the prior art is a problem to be solved in the art.
Disclosure of Invention
In view of the above defects or improvement demands of the prior art, the invention provides a data synchronization method and device for packet execution. By dynamically adjusting the grouping rules, supporting the setting of execution priorities, and executing and warehousing the sub-transactions of each packet in parallel, the invention reduces the time transactions spend waiting to execute and thereby addresses the low efficiency of data synchronization while the synchronization service keeps operating normally.
The invention adopts the following technical scheme:
In a first aspect, the present invention provides a data synchronization method for packet execution, including:
acquiring a grouping rule, wherein each current packet corresponds to at least one table and a priority is set for each current packet; when the grouping rule is changed, the changed grouping rule is used as the current version of the grouping rule, and each version of the grouping rule has a corresponding default grouping;
receiving a log, parsing the log, dividing each transaction operation into the corresponding current grouping or the default grouping according to the current version of the grouping rule, and generating at least one current sub-transaction, wherein the transaction operations of a main transaction are grouped according to the tables on which they depend;
when the current version of the grouping rule changes before the main transaction is committed, classifying the current grouping as a historical grouping and the current sub-transaction as a historical sub-transaction;
when the commit operation of the main transaction is parsed, constructing at least one sub-transaction message and at least one grouping queue from all historical sub-transactions and all current sub-transactions of the main transaction, issuing the sub-transaction messages of the historical sub-transactions to the grouping queues of the historical groupings, and issuing the sub-transaction messages of the current sub-transactions to the grouping queues of the current groupings;
pulling the sub-transaction messages from the grouping queues of each version in order of priority, obtaining the sub-transactions to be warehoused from the sub-transaction messages, and executing and warehousing them.
In a second aspect, the present invention further provides a packet-implemented data synchronization device, configured to implement the packet-implemented data synchronization method according to the first aspect, where the packet-implemented data synchronization device includes:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor for performing the packet-performed data synchronization method of the first aspect.
In a third aspect, the present invention also provides a non-volatile computer storage medium storing computer executable instructions for execution by one or more processors to perform the packet-executed data synchronization method of the first aspect.
Unlike the prior art, the invention has at least the following beneficial effects:
By introducing multi-version grouping management, the invention allows the grouping rules to be adjusted at any time during data synchronization while still guaranteeing the consistency of the synchronized data and splitting the execution granularity of transactions, which improves the convenience and fault tolerance of the synchronization program and the efficiency of data synchronization. By allowing an execution priority to be set for each packet in the grouping rules, the invention adapts data synchronization more closely to service requirements and allows the synchronization order to be specified manually without causing running errors in the synchronization service.
Furthermore, the transaction conflict detection logic greatly reduces unnecessary waiting during transaction execution and improves the efficiency of parallel transaction execution, which further improves data synchronization efficiency and widens the application scenarios of the table-grouped execution mode.
Drawings
In order to more clearly illustrate the technical solution of the embodiments of the present invention, the drawings that are required to be used in the embodiments of the present invention will be briefly described below. It is evident that the drawings described below are only some embodiments of the present invention and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 is a general flow chart of a data synchronization method for packet execution according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of step 20 according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of step 40 according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of step 50 according to an embodiment of the present invention;
FIG. 5 is another schematic flow chart of step 50 according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a multi-version transaction execution sequence according to an embodiment of the present invention;
FIG. 7 is a schematic flow chart of transaction conflict detection according to an embodiment of the present invention;
FIG. 8 is a schematic flow chart of fault or exception handling in step 50 according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of the architecture of a data synchronization device for packet execution according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
In the description of the present invention, the terms "inner", "outer", "longitudinal", "transverse", "upper", "lower", "top", "bottom", etc. refer to an orientation or positional relationship based on that shown in the drawings, merely for convenience of describing the present invention and do not require that the present invention must be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
The terms "first," "second," and the like herein are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first", "a second", etc. may explicitly or implicitly include one or more such feature. In the description of the present application, unless otherwise indicated, the meaning of "a plurality" is two or more.
In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
Example 1:
In the packet synchronization mode based on log parsing, a destination database receives a log containing a plurality of source-end main transactions, each of which contains at least one transaction operation. Each main transaction is split according to the tables on which it depends, its parts are assigned to the corresponding packets, and the transactions in each packet are executed in parallel by multiple threads to improve synchronization efficiency.
The data synchronization modes in the prior art are as follows:
The first data synchronization method is the traditional one. When the log of a transaction operation is parsed, the transaction ID and the corresponding transaction operation are cached as a key-value pair. When a commit operation is parsed, the transaction ID is issued to a transaction commit queue. The execution end concurrently takes transactions to be committed out of the commit queue and traverses the transaction operation objects in the cache by transaction ID for execution. When a transaction is executed and warehoused, if another transaction to be committed has a commit SCN smaller than that of the currently executing transaction, the execution end checks whether a transaction dependency exists; if it does, the transaction with the smaller commit SCN is executed and warehoused first. A queue is a linear, first-in first-out data structure that allows insertion only at one end and deletion only at the other, and forbids direct access to any data other than at the two ends.
The second data synchronization method executes by table grouping, but does not consider grouping adjustment or execution priority. When the log of a transaction operation is parsed, the corresponding group number is looked up by the table name in the log, a sub-transaction is constructed from the transaction ID and the group number, and the relationship between the main transaction and its sub-transactions is maintained. When a transaction is committed, all sub-transaction IDs are fetched from this relationship by main transaction ID, and each sub-transaction ID is issued to the corresponding packet queue. The group execution thread of the execution end traverses the transaction operations in the cache by sub-transaction ID and executes and warehouses them. When transactions within a group are executed and warehoused, transaction dependencies are taken into account: a sub-transaction that others depend on must be executed and warehoused first. Transactions in different groups have no dependencies on each other, so no concurrency control between them is needed.
In the first method, the execution granularity of a transaction cannot be split, so the operations on some of the tables cannot be executed independently and a designated table cannot be executed with priority. Operations on several tables are combined in one transaction, and in practice a single transaction often contains many transaction operations that do not depend on each other; these cannot be executed concurrently, so synchronization efficiency is low. In the second method, transactions are grouped by table, but the grouping rule cannot be modified once the data synchronization service has started, so convenience and fault tolerance are low, and some tables cannot be executed and warehoused with priority, which does not meet the requirements of real production environments. In both methods, to avoid running errors of the synchronization service, transactions with dependencies are executed strictly in the order of their commit SCNs at the source end. A database in a production environment often contains tens to thousands of tables, and several databases are sometimes synchronized at once, so a large number of transactions have dependencies. When a transaction that could have been executed in parallel has to wait, every transaction that depends on it is delayed as well, so too much time is wasted on mutual waiting and synchronization efficiency is low.
In order to solve the foregoing problems, Embodiment 1 of the present invention provides a data synchronization method for packet execution, as shown in FIG. 1, including:
Step 10: acquiring a grouping rule, wherein each current packet corresponds to at least one table and a priority is set for each current packet; when the grouping rule is changed, the changed grouping rule is used as the current version of the grouping rule, and each version of the grouping rule has a corresponding default grouping.
Before data synchronization, the user needs to configure the grouping rule. In the grouping rule of the data synchronization method of this embodiment, each packet corresponds to at least one table, and the mapping relationship between packets and tables is maintained. In an alternative embodiment, the packet ID of each packet and the table full name of each table are cached as key-value pairs, so that data synchronization is performed according to the mapping relationship between packets and tables (that is, the grouping rule).
The grouping rule is managed in multiple versions. At the destination end of data synchronization, the version number of the initial grouping rule is set to 1; at that point only the initial version exists, and it is the current version. Each time the user modifies the grouping rule, the version number of the modified rule is set to the version number of the previous current version plus 1, and the modified rule becomes the current version. Each time the user configures the grouping rule, a group number is assigned to each grouping of that version; group numbers start from 0 and increase by 1. A priority for the execution order of each grouping is also set. In an alternative embodiment, priorities are numbered from 1 in increments of 1, a smaller number means a higher priority, and two or more groupings may share the same priority if the data synchronization scenario requires it. The packet ID of each grouping of each version is "version number-group number-priority". Every version has a default packet whose group number is 0; for example, in the initial version of the grouping rule, a default packet with priority 3 has the packet ID "1-0-3".
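The version and packet ID scheme described above can be illustrated with a minimal Python sketch. The class and method names (GroupingRuleVersion, GroupingRuleStore, configure) and the table names are hypothetical and do not come from the patent; only the numbering rules (versions from 1, default group 0, ID format "version number-group number-priority") follow the text.

```python
from dataclasses import dataclass, field


@dataclass
class GroupingRuleVersion:
    """One version of the grouping rule (illustrative structure)."""
    version: int
    # group number -> (priority, set of table full names); group 0 is the default group
    groups: dict = field(default_factory=dict)

    def group_id(self, group_no):
        """Build the packet ID 'version number-group number-priority'."""
        priority, _tables = self.groups[group_no]
        return f"{self.version}-{group_no}-{priority}"


class GroupingRuleStore:
    """Keeps every rule version; the newest one is the current version."""

    def __init__(self):
        self.versions = {}
        self.current_version = 0  # becomes 1 on the first configuration

    def configure(self, groups, default_priority):
        """groups: list of (priority, tables). Returns the new current version."""
        self.current_version += 1
        rule = GroupingRuleVersion(self.current_version)
        rule.groups[0] = (default_priority, set())  # default group of this version
        for group_no, (priority, tables) in enumerate(groups, start=1):
            rule.groups[group_no] = (priority, set(tables))
        self.versions[rule.version] = rule
        return rule


store = GroupingRuleStore()
v1 = store.configure([(1, ["HR.EMP"]), (2, ["HR.DEPT"])], default_priority=3)
print(v1.group_id(0))  # packet ID of the default group of the initial version
```

Older versions stay in the store after a reconfiguration, which is what allows sub-transactions created under a previous rule to keep referring to their historical groupings.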
Step 20: receiving a log, parsing the log, dividing each transaction operation into the corresponding current grouping or the default grouping according to the current version of the grouping rule, and generating at least one current sub-transaction, wherein the transaction operations of a main transaction are grouped according to the tables on which they depend.
The data synchronization service at the destination end includes a log parsing service. During data synchronization, after the destination end receives a log from the source end, it parses the log, that is, restores the binary log data into at least one transaction operation record. According to the current version of the grouping rule and the table on which each transaction operation depends, the destination end assigns each transaction operation of a source-end main transaction to the corresponding grouping and to the sub-transaction of that main transaction within the grouping. If the main transaction does not yet have a sub-transaction for that table's grouping, the corresponding sub-transaction is generated.
Step 30: when the current version of the grouping rule changes before the main transaction is committed, the current grouping is classified as a historical grouping, and the current sub-transaction is classified as a historical sub-transaction.
After the user modifies the grouping rule, the destination end continues to receive the log and process it accordingly. After a historical sub-transaction has been executed and warehoused, the sub-transaction IDs in the cache are traversed to check whether any of them still carries the version number of that historical sub-transaction; when none does, the grouping rule of that historical version is cleared from the cache and the memory it occupied is released. The grouping rules of historical versions are the grouping rules of all versions other than the current version.
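The cleanup of historical rule versions can be sketched as follows. This is only an illustration under stated assumptions: the function name release_unused_versions is hypothetical, the rule cache is modelled as a plain dict keyed by version number, and sub-transaction IDs are assumed to have the form "main transaction number-version number-group number" used elsewhere in this embodiment.

```python
def release_unused_versions(rule_versions, current_version, cached_sub_txn_ids):
    """Drop historical rule versions that no cached sub-transaction still uses.

    rule_versions: dict mapping version number -> grouping rule object
    cached_sub_txn_ids: iterable of 'main-version-group' strings still in the cache
    """
    # Collect the version numbers that are still referenced by cached sub-transactions.
    in_use = {int(sub_id.rsplit("-", 2)[1]) for sub_id in cached_sub_txn_ids}
    for version in list(rule_versions):
        # The current version is always kept; a historical one only while referenced.
        if version != current_version and version not in in_use:
            del rule_versions[version]
    return rule_versions


rules = {1: "rule v1", 2: "rule v2", 3: "rule v3"}
release_unused_versions(rules, current_version=3, cached_sub_txn_ids=["1009-2-1", "1010-3-0"])
print(sorted(rules))  # version 1 is no longer referenced and has been released
```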
Step 40: when the commit operation of the main transaction is parsed, at least one sub-transaction message and at least one grouping queue are constructed from all historical sub-transactions and all current sub-transactions of the main transaction, the sub-transaction messages of the historical sub-transactions are issued to the grouping queues of the historical groupings, and the sub-transaction messages of the current sub-transactions are issued to the grouping queues of the current groupings.
When the commit operation of the main transaction is parsed, the main transaction has been committed at the source end. Because a transaction is an indivisible unit of work, that is, the transaction operations of the main transaction at the source end are either all executed or none are executed, data synchronization of the main transaction can start at this point. Each packet of each version corresponds to one packet queue.
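A minimal sketch of issuing sub-transaction messages on commit is given below. It assumes, for illustration only, that a sub-transaction message is simply the sub-transaction ID, that IDs have the form "main transaction number-version number-group number", and that queues are keyed by (version, group number); the function name publish_on_commit is hypothetical.

```python
from collections import deque


def publish_on_commit(main_txn_id, sub_txn_ids_by_main, group_queues):
    """Issue one message per sub-transaction of a committed main transaction.

    sub_txn_ids_by_main: main transaction ID -> list of its sub-transaction IDs
    group_queues: (version, group number) -> deque of sub-transaction messages
    """
    published = []
    for sub_txn_id in sub_txn_ids_by_main.get(main_txn_id, []):
        # The last two fields of the ID are the rule version and the group number.
        _main, version, group_no = sub_txn_id.rsplit("-", 2)
        key = (int(version), int(group_no))
        # Create the queue on first use, then append in commit order (FIFO).
        group_queues.setdefault(key, deque()).append(sub_txn_id)
        published.append(key)
    return published


# Main transaction 1001 has one historical (version 1) and one current (version 2) sub-transaction.
relations = {"1001": ["1001-1-2", "1001-2-0"]}
queues = {}
publish_on_commit("1001", relations, queues)
print(sorted(queues))
```

Because the version number is part of the queue key, a historical sub-transaction lands in the queue of its historical grouping even after the rule has changed.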
Step 50: pulling the sub-transaction messages from the grouping queues of each version in order of priority, obtaining the sub-transactions to be warehoused from the sub-transaction messages, and executing and warehousing them.
The sub-transactions of each version are stored in their grouping queues, which are queue data structures, so that the sub-transactions are added to the execution queues in the order in which the transaction operations of the main transaction were committed at the source end, and are then executed and warehoused.
To illustrate the data synchronization method for packet execution of this embodiment more fully, step 20 is further refined. Specifically, as shown in FIG. 2, step 20 includes:
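Priority-ordered pulling can be sketched as below. The queue key is assumed to be the packet ID "version number-group number-priority", and a smaller priority number means a higher priority, as described for step 10. How queues of different versions are interleaved is not specified in this passage; draining older versions first is an assumption made only for this sketch, and the function name pull_in_priority_order is hypothetical.

```python
from collections import deque


def pull_in_priority_order(group_queues):
    """Yield (packet ID, message) pairs, higher-priority queues first.

    group_queues: dict mapping 'version-group-priority' -> deque of messages
    """
    def sort_key(packet_id):
        version, group_no, priority = (int(part) for part in packet_id.split("-"))
        # Assumption: older versions first, then ascending priority number.
        return (version, priority, group_no)

    for packet_id in sorted(group_queues, key=sort_key):
        queue = group_queues[packet_id]
        while queue:
            # FIFO within a queue preserves the source-end commit order.
            yield packet_id, queue.popleft()


queues = {
    "1-1-2": deque(["1001-1-1"]),
    "1-2-1": deque(["1001-1-2", "1002-1-2"]),
    "1-0-3": deque(["1001-1-0"]),
}
for packet_id, message in pull_in_priority_order(queues):
    print(packet_id, message)
```

A real executor would run the sub-transactions of a packet on worker threads and apply conflict detection; the sketch only shows the ordering of pulls.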
Step 201: according to the current version of the grouping rule, acquiring a first mapping relation between the table full name of each table and the group number of its current grouping. The table full name uniquely identifies a table; in an ORACLE database, for example, tables with the same name may exist under different schemas or databases. In an alternative embodiment, the first mapping relation is cached as key-value pairs in a HashMap (a hash-table-based implementation of the Map interface).
Step 202: receiving the log, analyzing the log to obtain at least one transaction operation, obtaining the table full name of a table on which each transaction operation depends, and judging whether a corresponding group number exists in the first mapping relation according to the table full name; if so, dividing the transaction operation into corresponding current groups; if not, the transaction operation is divided into a default grouping of the current version.
If a table is not associated with any group in the latest grouping rule, the table is assigned to the default grouping. In a real data synchronization scenario, many source-end tables need to be synchronized, but a user who sets a grouping rule for specific service requirements does not need to define a grouping for every source-end table; the grouping rule often involves only a few tables, and the transaction operations on tables not covered by the rule are assigned to the default grouping. Setting a default packet for each version guarantees that every packet is globally unique at the destination end.
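The lookup in steps 201 and 202 amounts to a hash-map lookup with a fallback to the default group. A minimal sketch follows, using a Python dict in place of the HashMap mentioned above; the table names and the function name resolve_group_no are hypothetical.

```python
DEFAULT_GROUP_NO = 0  # group number reserved for the default grouping of each version

# First mapping relation of the current version: table full name -> group number.
first_mapping = {
    "SALES.ORDERS": 1,
    "SALES.ORDER_ITEMS": 1,
    "HR.EMPLOYEES": 2,
}


def resolve_group_no(table_full_name, mapping):
    """Return the group number for a table, or the default group if it is unmapped."""
    return mapping.get(table_full_name, DEFAULT_GROUP_NO)


print(resolve_group_no("SALES.ORDERS", first_mapping))  # a table named in the rule
print(resolve_group_no("AUDIT.EVENTS", first_mapping))  # a table the rule does not mention
```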
Step 203: constructing a sub-transaction ID of a current sub-transaction according to the main transaction of the transaction operation, the version number of the grouping rule of the current version and the group number; searching whether a current sub-transaction corresponding to the sub-transaction ID exists or not; if not, generating the current sub-transaction, and adding the transaction operation to the current sub-transaction; if so, the transaction operation is added to the current sub-transaction.
Each time a log record is parsed, the table full name of the table on which its transaction operation depends is looked up in the first mapping relation of the latest grouping rule to find the packet ID (that is, "version number-group number-priority") and obtain the corresponding group number; the transaction operation is thereby assigned to the grouping with that group number. It is then determined whether a sub-transaction of the corresponding main transaction already exists in that grouping. If not, a sub-transaction is constructed and the transaction operation is added to it; the sub-transaction ID is composed of the main transaction number (that is, the main transaction ID) of the source-end main transaction, the version number of the latest grouping rule, and the group number (that is, "main transaction number-version number-group number"). If so, the transaction operation is added to the existing sub-transaction.
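Step 203 is a get-or-create operation keyed by the sub-transaction ID. The following sketch assumes the sub-transaction cache is a dict from sub-transaction ID to a list of transaction operations; the function names and the sample operations are hypothetical.

```python
def make_sub_txn_id(main_txn_id, version, group_no):
    """Compose the ID 'main transaction number-version number-group number'."""
    return f"{main_txn_id}-{version}-{group_no}"


def add_to_sub_txn(sub_txns, main_txn_id, version, group_no, operation):
    """Add an operation to its sub-transaction, creating the sub-transaction if needed.

    Returns the sub-transaction ID and whether a new sub-transaction was generated.
    """
    sub_txn_id = make_sub_txn_id(main_txn_id, version, group_no)
    created = sub_txn_id not in sub_txns
    sub_txns.setdefault(sub_txn_id, []).append(operation)
    return sub_txn_id, created


sub_txns = {}
add_to_sub_txn(sub_txns, "1001", 2, 1, "UPDATE a ...")
add_to_sub_txn(sub_txns, "1001", 2, 1, "INSERT INTO a ...")
add_to_sub_txn(sub_txns, "1001", 2, 0, "DELETE FROM z ...")
print(sub_txns)
```

Operations of the same main transaction that fall into different groups end up in different sub-transactions, which is what lets them be executed independently.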
In each grouping of the above versions, there are sub-transactions of the same version belonging to different main transactions, and the tables on which these sub-transactions depend are tables specified by the corresponding grouping in the grouping rule of the corresponding version. Each sub-transaction described above contains only transaction operations that belong to the same main transaction.
After step 20 is completed, the operation number of each transaction operation is obtained from the number of transaction operations contained in the current sub-transaction. A mapping relation between the sub-transaction ID and that number is constructed, together with a mapping relation between the operation number and the log of the corresponding transaction operation, so that the log of a transaction operation can be obtained from the sub-transaction ID during subsequent execution and putting in storage.
In an alternative embodiment, the mapping relation between the sub-transaction ID and the number of transaction operations is cached as key-value pairs, with the sub-transaction ID as the key and the transaction header information, which contains that number, as the value. A key-value pair is created each time a sub-transaction is generated, and each time a transaction operation is assigned to a group the corresponding key-value pair is updated by increasing its transaction operation count by 1. The transaction header information stores information about the corresponding transaction, so that the transaction and its transaction operations can be looked up from it; a person of ordinary skill in the art can determine what the transaction header information should contain according to the actual usage scenario.
In an alternative embodiment, the mapping relation between sub-transactions and the logs of transaction operations is also cached as key-value pairs, so that the executable details of a transaction operation can be found from the sub-transaction ID. The operation number of a transaction operation is obtained from its position in the corresponding grouping queue; "sub-transaction ID-operation number" is used as the key and the log of the transaction operation as the value. A key-value pair is created each time a sub-transaction is generated and updated each time a transaction operation is assigned to a group. A person of ordinary skill in the art can also choose how to record the sub-transaction ID and the logs of transaction operations according to the actual usage scenario.
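The two key-value caches described above can be sketched with plain dictionaries. The names `header_cache`, `log_cache`, `cache_operation` and `logs_of`, and the `op_count` field of the header, are hypothetical; a real deployment might use an external key-value store instead.

```python
# Sub-transaction ID -> transaction header information (holds the count).
header_cache = {}
# "sub-transaction ID-operation number" -> log of the transaction operation.
log_cache = {}


def cache_operation(sub_tx_id, log):
    """Record one transaction operation and return its operation number."""
    # The header is created the first time the sub-transaction is seen.
    header = header_cache.setdefault(sub_tx_id, {"op_count": 0})
    header["op_count"] += 1
    op_no = header["op_count"]
    log_cache[f"{sub_tx_id}-{op_no}"] = log
    return op_no


def logs_of(sub_tx_id):
    """Return the logs of a sub-transaction in operation-number order."""
    count = header_cache[sub_tx_id]["op_count"]
    return [log_cache[f"{sub_tx_id}-{n}"] for n in range(1, count + 1)]
```

Because the header carries the operation count, the executor can rebuild every log key from the sub-transaction ID alone, which is what allows the queue messages to stay lightweight.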
Step 204: constructing a second mapping relation between the main transaction ID of the main transaction and the sub transaction ID; the second mapping is updated each time a current sub-transaction is generated.
While logs are being parsed, the mapping relation between the main transaction ID and the sub-transaction IDs must be maintained. Each time a sub-transaction is generated, the corresponding second mapping relation is constructed; in an alternative embodiment, it is constructed by adding a key-value pair. A person of ordinary skill in the art can choose how to maintain the mapping relation between the main transaction ID and the sub-transaction IDs, including how it is accessed and cleaned up, according to the actual usage scenario.
The logs received by the destination are all the logs generated by the source while executing transaction operations, and therefore include logs of transaction operations that are still executing and not yet committed. To keep the database consistent, data synchronization is based on what has been committed at the source: the destination executes and puts in storage only those transaction operations that have been committed at the source and not yet executed at the destination.
In an alternative embodiment, the log parsing service is deployed at the source. The parsed transaction operations are numbered independently per transaction, the operation numbers are guaranteed to increase sequentially within a transaction, and the full name of the table on which each transaction operation depends is filled into its log. The data synchronization service at the destination receives a log sent by the source, assigns the transaction operation to an active transaction according to its transaction ID, extracts the table full name, and assigns the transaction operation to the corresponding group according to that name. Under the active transaction it processes sub-transactions as in step 20, adding the transaction operation to the corresponding sub-transaction and generating that sub-transaction if it does not yet exist; it records the number of the last transaction operation of each sub-transaction and uses it as the commit number of the sub-transaction. When the source parses more than one table on which a single main transaction depends and the destination assigns those tables to different groups, an overall active transaction is created. It holds the sub-transactions of a main transaction whose commit operation has not yet been received, manages the sub-transactions divided from that main transaction as a set, and is used to classify the transaction operations of the main transaction.
Notably, step 30 covers two cases. In the extreme case, at the moment the grouping rule changes, either all transaction operations of the main transaction have already been executed and put in storage, or no log of any transaction operation of the main transaction has been received yet; the same main transaction is then never divided first by the grouping rule of a historical version and then by that of the current version. After the grouping rule changes, only the transaction operations in subsequently received logs are divided, and they are divided under the new version of the grouping rule.
In the other case, the grouping rule changes while the main transaction has not yet been committed at the source. The transaction operations of the main transaction that have already been divided remain unchanged, and their version can be determined from the sub-transaction ID, while the transaction operations in subsequently received logs are divided according to the grouping rule of the current version. That is, when the user modifies the grouping rule before the main transaction has been committed at the source, the transaction operations of the main transaction are cut into at least two versions. In an alternative embodiment, the methods for cutting the main transaction in this case include a dynamic modification method and a static modification method.
First, in the embodiment of the present invention a change of the grouping rule is expressed as a modification of the mapping between groups and tables. For example, in the grouping rule of the historical version, transaction operations that depend on table T1 or table T2 are assigned to group 1 and those that depend on table T3 to group 2; in the grouping rule of the current version, transaction operations that depend on table T1 are assigned to group 1 and those that depend on table T2 or table T3 to group 2. When the grouping rule changes, the destination compares the grouping rule of the historical version with that of the current version and takes the tables whose group has changed as the synchronization tables to be modified; in the example above, the synchronization table to be modified is T2. To prevent active transactions from being modified by transaction operations in newly received logs while they are being cut, the destination must pause receiving source logs before the modification; only then can the active transactions be cut.
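Finding the synchronization tables to be modified amounts to comparing the table-to-group mappings of the two rule versions. The sketch below reproduces the T1/T2/T3 example; the function name `tables_to_modify` and the dictionary representation of a grouping rule are assumptions made for illustration.

```python
def tables_to_modify(old_rule, new_rule):
    """Return the tables whose group differs between two rule versions.

    Each rule maps a table full name to its group number. A table present
    in only one of the rules also counts as changed.
    """
    tables = set(old_rule) | set(new_rule)
    return sorted(t for t in tables if old_rule.get(t) != new_rule.get(t))


# Historical version: T1 and T2 in group 1, T3 in group 2.
history_rule = {"T1": 1, "T2": 1, "T3": 2}
# Current version: T1 in group 1, T2 and T3 in group 2.
current_rule = {"T1": 1, "T2": 2, "T3": 2}
changed = tables_to_modify(history_rule, current_rule)
```

Here `changed` contains only T2, matching the example in the text.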
Secondly, when the grouping rule at the destination changes, the dynamic modification method cuts the active transactions that involve the synchronization table to be modified, constructs a virtual transaction to hold the grouping-rule modification operation, and constrains the execution order between the groups of the synchronization table to be modified before and after the change, specifically as follows:
All active transactions that depend on the synchronization table to be modified are searched for among the current active transactions; the sub-transactions of the affected groups in these active transactions are cut and virtual transactions are constructed, so that the execution order between the groups before and after the change can be constrained. First, the sub-transactions of the historical group of the synchronization table to be modified are located in each active transaction; if such a sub-transaction contains a transaction operation that depends on the synchronization table to be modified, the active transaction is classified as one that requires transaction cutting. Then the sub-transaction of the current group of the synchronization table to be modified is looked for in the active transaction; if it exists, the active transaction requires transaction cutting, and if it does not, no processing is needed. Active transactions that are not involved need no processing. When an active transaction contains a sub-transaction in the historical group of the synchronization table to be modified, that sub-transaction is marked with an end mark.
When an active transaction contains a sub-transaction in the current-version group of the synchronization table to be modified, that sub-transaction is marked with an end mark, and a virtual transaction is constructed for the current group of the synchronization table to be modified according to the last operation number of the active transaction. The transaction operation of the virtual transaction is to wait until the sub-transactions of the historical group have been executed up to the commit SCN and commit number of the current transaction; the commit number of the virtual transaction is set to the last operation number of the active transaction, and the grouping-rule modification operation for the synchronization table to be modified is added to the virtual transaction. All sub-transactions of the affected active transactions are cut in this way, which completes the modification of all affected active transactions. A checkpoint is then taken of the transaction state formed after the modification, and the related information of the transaction operations is saved to the corresponding checkpoint file, so that after a fault of the synchronization service the system can be restored to its running state before the fault. Finally, the group configuration of the synchronization table to be modified is modified and saved.
The static modification method differs from the dynamic modification method as follows:
A checkpoint is taken; after the group to which each sub-transaction of all current active transactions belongs and the table information of those sub-transactions have been saved, the synchronization service exits. The group configuration of the synchronization table to be modified is then modified. The synchronization service is started and recovers the active transactions from the last checkpoint; what is recovered comprises the active transactions and their sub-transactions, and each sub-transaction carries the group it was assigned to when the synchronization service last stopped and the full names of the tables on which it depends. The sub-transactions without an end mark in each active transaction are traversed and assigned to groups under the modified, current-version grouping rule according to the full names of the tables they depend on. If the group thus obtained is the same as the group saved in the checkpoint, no change is needed; if it differs, the sub-transaction is marked with an end mark, the last operation number of the active transaction is obtained, and a virtual transaction is constructed for the current group of the synchronization table to be modified, in the same way as in the dynamic modification method. All sub-transactions of the affected active transactions are cut in this way, which completes the modification of all affected active transactions. A checkpoint is then taken of the transaction state formed after the modification, and the related information of the transaction operations is saved to the corresponding checkpoint file, so that after a fault of the synchronization service the system can be restored to its running state before the fault.
Finally, both the static modification method and the dynamic modification method perform the following operations:
The start-up of the destination synchronization service is completed and new logs continue to be received. After the commit message of the main transaction is received, the sub-transactions of the cut active transaction are traversed and the virtual transaction among them is obtained; the commit SCN of the main transaction's commit operation is written into the virtual transaction's operation of waiting for the sub-transactions of the pre-modification group to execute. Consequently, the sub-transaction in the current group can execute only after every sub-transaction in the historical group whose commit SCN is less than or equal to that commit SCN, and whose commit number is less than or equal to the commit number of the sub-transaction currently to be committed, has been synchronized.
When the commit operation of the main transaction is parsed, data synchronization of the main transaction starts. Specifically, as shown in fig. 3, step 40 includes:
Step 401: when the commit operation is parsed, obtain all sub-transaction IDs of the main transaction from the main transaction ID and the second mapping relation.
Step 402: construct the corresponding sub-transaction messages from all the sub-transaction IDs.
Step 403: generate the packet queues of the corresponding historical groups and of the corresponding current groups according to the version numbers and group numbers in the sub-transaction IDs. The sub-transaction ID is published to the corresponding packet queue as a lightweight message. Because the packet queues are generated only after the commit operation of the main transaction has been parsed, the memory occupied by data synchronization is reduced.
Step 404: send the sub-transaction messages of the historical sub-transactions to the packet queues of the historical groups, and send the sub-transaction messages of the current sub-transactions to the packet queues of the current groups.
In other words, all sub-transaction IDs are fetched from the second mapping relation according to the main transaction ID, and sub-transaction message objects are constructed. The sub-transaction messages of each version are published to the corresponding packet queues, which are distinguished by the version number in the sub-transaction ID: messages of historical sub-transactions enter the packet queues of the historical groups, and messages of current sub-transactions enter the packet queues of the current groups.
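Steps 401 to 404 can be sketched as a small dispatch routine. This is an illustration under stated assumptions: the function name `dispatch_on_commit`, the message fields, and the use of a `(version, group number)` tuple as the queue key are hypothetical choices, not details given in the text.

```python
from collections import defaultdict, deque


def dispatch_on_commit(main_tx_id, commit_scn, second_mapping, queues):
    """Publish one lightweight message per sub-transaction of a main transaction."""
    # Step 401: look up all sub-transaction IDs of the committed main transaction.
    for sub_tx_id in second_mapping.get(main_tx_id, []):
        # The sub-transaction ID is "main transaction number-version-group".
        _main, version, group_no = sub_tx_id.rsplit("-", 2)
        # Step 402: the message carries only the ID and the commit SCN.
        message = {"sub_tx_id": sub_tx_id, "commit_scn": commit_scn}
        # Steps 403 and 404: the queue is created on first use and is
        # selected by version number and group number.
        queues[(int(version), int(group_no))].append(message)


second_mapping = {1001: ["1001-1-1", "1001-1-2", "1001-2-1"]}
queues = defaultdict(deque)
dispatch_on_commit(1001, 95, second_mapping, queues)
```

Using `defaultdict(deque)` mirrors the point made in step 403: no queue exists for a group until a committed sub-transaction needs one.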
After the sub-transaction messages have been published to the packet queues, the sub-transactions to be put in storage are pulled from the packet queues. Provided that no operation error of the synchronization service is caused, they are executed concurrently according to the version of the grouping rule and the group priority, and the transaction operations of each sub-transaction to be put in storage are executed in the order in which they were committed at the source. Specifically, as shown in fig. 4, step 50 includes:
Step 501a: traverse the packet queues of all versions in order from the smallest version to the largest.
Step 502a: when the packet queues of the current version contain sub-transactions to be put in storage, traverse the packet queues of the current version in order of priority from high to low.
Step 503a: when the packet queue of the current priority contains sub-transactions to be put in storage, pull the sub-transaction messages from that packet queue in sequence, obtain the transaction operations of the corresponding sub-transactions to be put in storage from the messages, and execute them.
Step 504a: when packet queues of the same priority exist, sort the sub-transactions to be put in storage in those packet queues by commit SCN and process the packet queues in order of commit SCN from small to large.
After the sub-transaction messages of a main transaction whose commit operation has been parsed are published to the packet queues, the execution thread of the packet-executed data synchronization method in the embodiment of the invention pulls the transactions to be committed from the queues in order of packet queue version number and packet queue priority, traverses the transaction operations in the cache according to the sub-transaction ID, executes them, and commits. After the commit operation of a sub-transaction to be put in storage has been executed, the transaction cleaning thread is notified to clean the cache related to that sub-transaction. If the packet queues of the historical groups are empty, the packet queues of the current groups are pulled. Within the same version, the packet queue with the highest priority is pulled first, and when it is empty the queue one priority level lower is pulled. If packet queues of the same priority exist, the commit SCNs of the sub-transactions in those queues are compared and the sub-transaction with the smaller commit SCN is processed first.
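The pull order of steps 501a to 504a reduces to a three-part sort key: version, then priority, then commit SCN. The sketch below is a single-threaded simplification; `pull_next`, the `priority_of` table and the message layout are hypothetical, and a smaller priority number is assumed to mean a higher priority.

```python
from collections import deque


def pull_next(queues, priority_of):
    """Pop the next sub-transaction message, or return None if all queues are empty.

    queues:      {(version, group number): deque of messages}
    priority_of: {(version, group number): priority number}
    """
    non_empty = [key for key, queue in queues.items() if queue]
    if not non_empty:
        return None
    key = min(
        non_empty,
        key=lambda k: (
            k[0],                         # smallest version first
            priority_of[k],               # then highest priority
            queues[k][0]["commit_scn"],   # then smallest commit SCN
        ),
    )
    return queues[key].popleft()


queues = {
    (1, 1): deque([{"id": "a", "commit_scn": 10}]),
    (1, 2): deque([{"id": "b", "commit_scn": 5}]),
    (1, 3): deque([{"id": "d", "commit_scn": 3}]),
    (2, 1): deque([{"id": "c", "commit_scn": 1}]),
}
priority_of = {(1, 1): 1, (1, 2): 2, (1, 3): 2, (2, 1): 1}
order = []
while (message := pull_next(queues, priority_of)) is not None:
    order.append(message["id"])
```

In this run, version 1 is drained before version 2 even though the version 2 message has the smallest commit SCN, and the two priority-2 queues of version 1 are ordered by commit SCN.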
When a sub-transaction to be put in storage is executed, it must be determined whether it depends on the result of executing other sub-transactions. Specifically, as shown in fig. 5, step 50 further includes:
Step 501b: add the sub-transaction to be put in storage to the execution queue.
Step 502b: before executing the sub-transaction to be put in storage, when the execution queue contains another sub-transaction to be put in storage whose commit SCN is smaller than that of the sub-transaction to be put in storage, and the two have the same version but different groups, perform transaction conflict detection on them to determine whether a transaction dependence exists.
Step 503b: if a transaction dependence exists, execute the sub-transactions to be put in storage sequentially, in order of version from small to large.
Step 504b: if no transaction dependence exists, execute the other sub-transaction to be put in storage and the sub-transaction to be put in storage in parallel.
When sub-transactions are put in storage, those of the same version but different groups need not be considered, because no transaction dependence exists between different groups. For sub-transactions to be put in storage of different versions that belong to the same main transaction, it must be considered whether the current-version sub-transaction has a transaction dependence on the historical-version sub-transaction; if so, the historical-version sub-transaction must be executed first.
For example, as shown in fig. 6, the grouping rule is adjusted while main transaction 1001 is being synchronized, changing from version 1 to version 2. The transaction operations of main transaction 1001 are therefore divided into version 1 sub-transactions and version 2 sub-transactions: sub-transactions 1001-1-1 and 1001-1-2 under the version 1 grouping rule, and sub-transactions 1001-2-1 and 1001-2-2 under the version 2 grouping rule. Sub-transactions 1001-1-1 and 1001-1-2 belong to different groups of the same version, so transaction dependence between them need not be considered; likewise, sub-transactions 1001-2-1 and 1001-2-2 belong to different groups of the same version, so transaction dependence between them need not be considered. The version number of sub-transactions 1001-1-1 and 1001-1-2 is smaller than that of sub-transactions 1001-2-1 and 1001-2-2, so sub-transactions 1001-2-1 and 1001-2-2 must wait until sub-transactions 1001-1-1 and 1001-1-2 have finished executing before they execute.
Among sub-transactions to be put in storage of the same version and the same group, and among sub-transactions to be put in storage of different versions, one with a larger commit SCN depends on those with smaller commit SCNs. If every sub-transaction with a larger commit SCN had to wait until all those with smaller commit SCNs had finished, execution would be strictly sequential and efficiency would suffer. The packet-executed data synchronization method in the embodiment of the invention therefore performs transaction conflict detection on the sub-transactions to be put in storage in the execution queue. If no transaction conflict exists between two transactions, they can be executed in parallel.
In the default group of each version, sub-transactions to be put in storage that are found free of transaction conflicts are executed in parallel. When a large number of tables are synchronized, business requirements usually place only a few tables under explicit grouping rules at a time, and most tables fall into the default group; the embodiment of the invention therefore greatly improves the efficiency of the whole data synchronization by improving the synchronization efficiency of the default group.
To better illustrate the packet-executed data synchronization method of the present invention, its transaction conflict detection is further refined. Specifically, as shown in fig. 7, transaction conflict detection includes:
Step 5021: traverse all the other sub-transactions to be put in storage in the execution queue in sequence, taking each as the transaction to be detected and taking the sub-transaction to be put in storage as the current transaction.
Step 5022: when transaction conflict detection is based on rowid, if the log of the transaction to be detected and the log of the current transaction contain the same rowid, execute the transaction operations of the current transaction only after those of the transaction to be detected have completed.
Step 5023: when transaction conflict detection is based on transaction dependence, if the table on which the transaction to be detected depends differs from the table on which the current transaction depends, no transaction dependence exists; if the two tables are the same, whether a transaction dependence exists is determined by the types of the transaction operations involved.
When the transaction to be detected and the current transaction both contain only insert operations on the table, no transaction dependence exists; when the table has a primary key and the transaction to be detected contains only delete operations on the table, no transaction dependence exists.
Transaction conflict detection includes insert/delete/update detection on the same table and rowid conflict detection. A rowid is a pseudo-column that uniquely identifies a row in a table. It is the internal address of the row data in the physical table and contains two addresses: one points to the data file that stores the block containing the row, and the other is the address of the row within that data block, which locates the data row directly. In Oracle, each record has a rowid that is unique across the database and determines which data file, block and row the record is in. In insert/delete/update detection on the same table, if the table has no primary key, a single delete operation may delete multiple records in the table.
The packet-executed data synchronization method in the embodiment of the invention provides an example of transaction conflict detection, as follows:
The other transactions in the execution queue whose commit SCN is smaller than that of the current transaction are taken out and recorded as the set of transactions to be detected. The transactions in this set are traversed in sequence, and for each it is detected whether the operations of the current transaction conflict with the operations of the transaction to be detected. Assume that the current transaction is T1, the transaction to be detected is T2, and the table on which the current transaction operates is TableA.
When transaction conflict detection is based on insert/delete/update operations on the same table: if T2 contains no transaction operation on TableA, no transaction conflict exists. If the transaction operation of T1 is an insert operation on TableA and T2 contains only insert operations on TableA, no transaction conflict exists. If TableA has a primary key and T2 contains only delete operations on TableA, no transaction conflict exists.
When transaction conflict detection is based on rowid: if an operation on the same rowid is detected in transaction T2, transaction T1 can be executed only after transaction T2 has been committed.
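The two detection modes in the T1/T2 example can be sketched as predicate functions. The `Op` record and the function names are hypothetical, and the same-table check implements only the three no-conflict rules stated above, treating every other combination conservatively as a conflict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    table: str
    kind: str    # "insert", "update" or "delete"
    rowid: str


def conflicts_by_table(current_op, detected_ops, has_primary_key):
    """Same-table insert/delete/update detection for one operation of T1."""
    same_table = [op for op in detected_ops if op.table == current_op.table]
    if not same_table:
        return False  # T2 does not touch the table at all
    if current_op.kind == "insert" and all(op.kind == "insert" for op in same_table):
        return False  # insert against inserts only
    if has_primary_key and all(op.kind == "delete" for op in same_table):
        return False  # keyed table, T2 only deletes
    return True       # anything else is assumed to conflict


def conflicts_by_rowid(current_ops, detected_ops):
    """rowid detection: T1 must wait if T2 touches any of the same rowids."""
    detected_rowids = {op.rowid for op in detected_ops}
    return any(op.rowid in detected_rowids for op in current_ops)
```

A caller would run one of these checks against every transaction in the set to be detected and schedule T1 in parallel only when none reports a conflict.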
To better illustrate the packet-executed data synchronization method of the embodiment of the present invention, step 50 is further refined for the case where the data synchronization service is restarted. Specifically, as shown in fig. 8, step 50 further includes:
Step 501c: create a checkpoint file, and each time a sub-transaction to be put in storage has been executed, write its sub-transaction ID to the checkpoint file.
Step 502c: each time a new version of the grouping rule is acquired, write the sub-transaction IDs of the sub-transactions to be put in storage in the current system to the checkpoint file.
On recovery from a fault, the executed sub-transactions to be put in storage are loaded according to the sub-transaction IDs in the checkpoint file. When a sub-transaction to be put in storage is about to be executed, it is determined whether its commit SCN is less than or equal to the commit SCN of the executed sub-transactions to be put in storage; if so, the sub-transaction to be put in storage is discarded directly; if not, it is executed.
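The discard rule applied on recovery can be sketched as a filter over the pending sub-transactions. This is a simplified illustration: the function name `recover` is hypothetical, the checkpoint file is replaced by an in-memory value, and `executed_commit_scn` is assumed to be the largest commit SCN among the executed sub-transactions recorded in the checkpoint.

```python
def recover(pending, executed_commit_scn):
    """Split pending sub-transactions into those to execute and those to discard.

    pending: list of (sub-transaction ID, commit SCN) in arrival order.
    """
    to_execute, discarded = [], []
    for sub_tx_id, commit_scn in pending:
        if commit_scn <= executed_commit_scn:
            # Already put in storage before the fault, so skip it.
            discarded.append(sub_tx_id)
        else:
            to_execute.append(sub_tx_id)
    return to_execute, discarded


pending = [("1-1-1", 5), ("2-1-1", 7), ("3-1-2", 9)]
to_execute, discarded = recover(pending, executed_commit_scn=7)
```

Comparing with "less than or equal" rather than "less than" is what prevents the last executed sub-transaction from being applied twice after a restart.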
When a runtime exception (that is, an exception inheriting from RuntimeException) is thrown at the source during data synchronization, the transaction rolls back automatically so that the consistency and integrity of the transaction operations within the transaction are maintained. When a rollback operation of the main transaction is parsed, the logs of all corresponding historical sub-transactions and all corresponding current sub-transactions are cleared from the cache. When a partial rollback operation of the main transaction is parsed, the logs in all corresponding historical sub-transactions and current sub-transactions whose commit SCN is greater than or equal to that of the partial rollback operation are cleared.
In all the operations above, a person of ordinary skill in the art can use either the commit LSN (log sequence number) or the commit SCN as the unified criterion for ordering data synchronization across the whole database, depending on the specific production environment.
It should be noted that the invention takes data synchronization as the example for its improvement of multi-version table-grouped transaction execution; in actual production, the packet-executed data synchronization method of the embodiment of the invention can also be applied to data migration and to some large-transaction processing scenarios.
The following briefly describes how the packet-executed data synchronization method of the present invention is used in different applicable scenarios. Business scenarios such as data synchronization, data migration and large-transaction processing can all be optimized with the idea of grouped execution. When transaction processing hits a performance bottleneck, transaction operations can be split by table into groups and the sub-transactions of different groups executed concurrently to improve execution efficiency. If the business scenario requires some tables to be executed first, group priorities can be set and those tables placed in a higher-priority group; the execution thread pulls transactions from higher-priority packet queues first, which ensures that they are executed first. If the table grouping rule changes midway in the business scenario, the multi-version group management mechanism is introduced to support adjusting the grouping rule. For the data synchronization case, refer to the embodiments of the present invention. Data migration often involves many tables and large single tables; grouping the tables, placing large tables and partitioned tables in groups of their own, and giving those groups higher priority makes full use of system resources and improves migration efficiency.
When a transaction is executed and put in storage, if it contains too many transaction operations and takes too long to process, its transaction operations can be grouped: operations on associated tables are placed in the same group, and operations on unassociated tables are divided among several groups, so that the large transaction is split into small transactions that execute concurrently, reducing its processing time and improving efficiency.
Example 2:
On the basis of Embodiment 1 above, this embodiment provides a specific example of dividing packets and executing transactions in the data synchronization method for packet execution, so that the synchronization process can be better understood. The log sequence shown in the following table is taken as an example:
For convenience of description, this embodiment uses the table name as the full table name. Transaction operations with commit SCNs 8 to 94 and from 100 onward are omitted; none of them belongs to the main transaction whose main transaction ID is 1. When the division of transaction operations is described, the transaction operations with commit SCNs 8 to 94 and from 100 onward, and those of main transaction IDs 2 and 3, are omitted.
The destination acquires the latest grouping rule, namely grouping rule version 1: tables A, B and C correspond to packet 1, whose priority is 1; table D corresponds to packet 2, whose priority is 2; table E corresponds to packet 3, whose priority is 3. The packet ID of packet 1 is 1-1-1, the packet ID of packet 2 is 1-2-2, and the packet ID of packet 3 is 1-3-3. The smaller the priority number, the higher the priority.
The destination receives the log sequence, divides the transaction operations into the corresponding current packets according to the latest grouping rule, and generates at least one current sub-transaction.
Transaction operations involving tables A, B and C are divided into packet 1-1-1, those involving table D into packet 1-2-2, and those involving table E into packet 1-3-3.
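The lookup from table to packet can be sketched as follows. The reading of a packet ID as version, group number, and priority joined by hyphens is inferred from the IDs listed above (1-1-1, 1-2-2, 1-3-3); the dictionary layout and the function names `packet_id` and `packet_for_table` are assumptions made for illustration.

```python
# Grouping rule version 1 from the text: table -> group number, group -> priority.
RULE_V1 = {
    "version": 1,
    "table_to_group": {"A": 1, "B": 1, "C": 1, "D": 2, "E": 3},
    "group_priority": {1: 1, 2: 2, 3: 3},  # smaller number = higher priority
}


def packet_id(rule, group):
    """Build 'version-group-priority', matching IDs such as 1-2-2."""
    return f"{rule['version']}-{group}-{rule['group_priority'][group]}"


def packet_for_table(rule, table):
    """Look up the packet ID that a table's operations are divided into."""
    return packet_id(rule, rule["table_to_group"][table])


packets = {table: packet_for_table(RULE_V1, table) for table in "ABCDE"}
```

A table missing from the rule would raise `KeyError` in this sketch; the default packet that the claims describe for such tables is not modeled here.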
For the transaction operation with commit SCN 1, the log involves table A; group number 1 is found, and a sub-transaction with sub-transaction ID 1-1-1 is generated;
for the transaction operation with commit SCN 2, the log involves table B; group number 1 is found, and a sub-transaction with sub-transaction ID 2-1-1 is generated;
for the transaction operation with commit SCN 3, the log involves table C; group number 1 is found, and a sub-transaction with sub-transaction ID 3-1-1 is generated;
for the transaction operation with commit SCN 4, the log involves table B; group number 1 is found, and the operation is divided into the sub-transaction with sub-transaction ID 1-1-1;
for the transaction operation with commit SCN 5, the log involves table D; group number 2 is found, and a sub-transaction with sub-transaction ID 3-1-2 is generated;
for the transaction operation with commit SCN 6, the log involves table A; group number 1 is found, and the operation is divided into the sub-transaction with sub-transaction ID 1-1-1;
for the transaction operation with commit SCN 7, the log involves table D; group number 2 is found, and a sub-transaction with sub-transaction ID 2-1-2 is generated;
for the transaction operation with commit SCN 95, the log involves table A; group number 1 is found, and the operation is divided into the sub-transaction with sub-transaction ID 1-1-1.
Key-value pairs mapping each sub-transaction ID to the number of transaction operations in that sub-transaction are cached, as shown in the following table:
Key      Value
1-1-1    4
2-1-1    1
3-1-1    1
3-1-2    1
2-1-2    1
Key-value pairs mapping each sub-transaction ID plus the sequence number of the transaction operation to the log of that transaction operation are cached, as shown in the following table:
Key        Value
1-1-1-1    Log with commit SCN 1
2-1-1-1    Log with commit SCN 2
3-1-1-1    Log with commit SCN 3
1-1-1-2    Log with commit SCN 4
3-1-2-1    Log with commit SCN 5
1-1-1-3    Log with commit SCN 6
2-1-2-1    Log with commit SCN 7
1-1-1-4    Log with commit SCN 95
Key-value pairs mapping each main transaction ID to its sub-transaction IDs are cached, as shown in the following table:
Key    Value
1      1-1-1
2      2-1-1
3      3-1-1
3      3-1-2
2      2-1-2
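The division step and the three caches can be reproduced with a short sketch. Several things here are assumptions rather than statements from the source: the main transaction ID of each log entry is inferred from the sub-transaction IDs listed above, the operation with commit SCN 7 is assumed to touch table D (the only table mapped to group 2 under version 1), and the names `divide`, `op_count`, `op_log` and `main_to_subs` are hypothetical.

```python
from collections import defaultdict

TABLE_TO_GROUP_V1 = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 3}

# (commit SCN, main transaction ID, table); main IDs inferred, SCN 7 assumed on D.
LOG = [(1, 1, "A"), (2, 2, "B"), (3, 3, "C"), (4, 1, "B"),
       (5, 3, "D"), (6, 1, "A"), (7, 2, "D"), (95, 1, "A")]


def divide(log, table_to_group, version):
    op_count = {}                     # sub-transaction ID -> number of operations
    op_log = {}                       # "<sub-transaction ID>-<n>" -> log entry
    main_to_subs = defaultdict(list)  # main transaction ID -> sub-transaction IDs
    for scn, main_id, table in log:
        sub_id = f"{main_id}-{version}-{table_to_group[table]}"
        if sub_id not in op_count:    # first operation: generate the sub-transaction
            op_count[sub_id] = 0
            main_to_subs[main_id].append(sub_id)
        op_count[sub_id] += 1
        op_log[f"{sub_id}-{op_count[sub_id]}"] = f"log with commit SCN {scn}"
    return op_count, op_log, dict(main_to_subs)


op_count, op_log, main_to_subs = divide(LOG, TABLE_TO_GROUP_V1, version=1)
```

Under these assumptions the sketch yields the same keys and values as the three tables above, with the two rows for main transaction IDs 2 and 3 collected into lists.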
Before the log with commit SCN 96 is parsed, a new version of the grouping rule is obtained, namely grouping rule version 2: tables A and B correspond to packet 1, whose priority is 3; table C corresponds to packet 2, whose priority is 2; tables D and E correspond to packet 3, whose priority is 1. The packet ID of packet 1 is 2-1-3, the packet ID of packet 2 is 2-2-2, and the packet ID of packet 3 is 2-3-1. Relative to grouping rule version 1, grouping rule version 2 moves table C to packet 2 and table D to packet 3, changes the priority of packet 1 to 3, and changes the priority of packet 3 to 1.
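The difference between the two rule versions can be checked mechanically. The sketch below keeps both versions side by side, as a multi-version rule store might; the `RULES` layout and the function name `diff_rules` are assumptions for illustration only.

```python
RULES = {
    1: {"table_to_group": {"A": 1, "B": 1, "C": 1, "D": 2, "E": 3},
        "group_priority": {1: 1, 2: 2, 3: 3}},
    2: {"table_to_group": {"A": 1, "B": 1, "C": 2, "D": 3, "E": 3},
        "group_priority": {1: 3, 2: 2, 3: 1}},
}


def diff_rules(old, new):
    """Return tables whose group changed and groups whose priority changed.

    Each value is an (old, new) pair; None marks an entry absent from `old`.
    """
    moved = {table: (old["table_to_group"].get(table), group)
             for table, group in new["table_to_group"].items()
             if old["table_to_group"].get(table) != group}
    reprioritized = {group: (old["group_priority"].get(group), prio)
                     for group, prio in new["group_priority"].items()
                     if old["group_priority"].get(group) != prio}
    return moved, reprioritized


moved, reprioritized = diff_rules(RULES[1], RULES[2])
```

Keeping version 1 available after version 2 arrives is what lets sub-transactions generated under the old rule still be routed to their historical packet queues.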
For the transaction operation with commit SCN 96, the log involves table E; group number 3 is found, and a sub-transaction with sub-transaction ID 1-2-3 is generated;
for the transaction operation with commit SCN 97, the log involves table E; group number 3 is found, and the operation is divided into the sub-transaction with sub-transaction ID 1-2-3;
for the transaction operation with commit SCN 98, the log involves table A; group number 1 is found, and a sub-transaction with sub-transaction ID 1-2-1 is generated;
for the transaction operation with commit SCN 99, the log involves table E; group number 3 is found, and the operation is divided into the sub-transaction with sub-transaction ID 1-2-3.
Key-value pairs mapping each sub-transaction ID to the number of transaction operations in that sub-transaction are cached, as shown in the following table:
Key      Value
1-2-1    1
1-2-3    3
Key-value pairs mapping each sub-transaction ID plus the sequence number of the transaction operation to the log of that transaction operation are cached, as shown in the following table:
Key        Value
1-2-3-1    Log with commit SCN 96
1-2-3-2    Log with commit SCN 97
1-2-1-1    Log with commit SCN 98
1-2-3-3    Log with commit SCN 99
Key-value pairs mapping each main transaction ID to its sub-transaction IDs are cached, as shown in the following table:
Key    Value
1      1-2-1
1      1-2-3
The commit SCN of a sub-transaction is the SCN of the last transaction operation that the sub-transaction contains.
After the transaction operations have been divided, the commit operation of the source-side main transaction with main transaction ID 1 is received, and all sub-transaction IDs belonging to main transaction ID 1 are taken from the cache: 1-1-1, 1-2-1 and 1-2-3. Packet queue 1 of version 1 and packet queues 1 and 3 of version 2 are constructed. A sub-transaction message containing sub-transaction ID 1-1-1 is generated for packet queue 1 of version 1; a sub-transaction message containing sub-transaction ID 1-2-1 is generated for packet queue 1 of version 2; and a sub-transaction message containing sub-transaction ID 1-2-3 is generated for packet queue 3 of version 2. Each sub-transaction message is then issued to its packet queue.
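The commit handling just described can be sketched as follows. The version and group number are read back out of each sub-transaction ID, which assumes the ID layout main transaction, version, group seen in this example; the message shape (a dictionary with a `sub_transaction_id` field) and the function name `on_commit` are hypothetical.

```python
from collections import defaultdict, deque

# Cached mapping built during division: main transaction ID -> sub-transaction IDs.
MAIN_TO_SUBS = {1: ["1-1-1", "1-2-1", "1-2-3"]}


def on_commit(main_id, main_to_subs, queues):
    """Issue one message per sub-transaction to its (version, group) queue."""
    for sub_id in main_to_subs.pop(main_id, []):
        _, version, group = (int(part) for part in sub_id.split("-"))
        # Accessing the defaultdict creates the packet queue on first use.
        queues[(version, group)].append({"sub_transaction_id": sub_id})
    return queues


queues = on_commit(1, dict(MAIN_TO_SUBS), defaultdict(deque))
```

Keying the queues by `(version, group)` keeps the historical packet queue of version 1 separate from the current packet queues of version 2.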
The execution thread executes the sub-transactions of main transaction ID 1. Sub-transaction 1-1-1 is first pulled from packet queue 1 of version 1 and added to the execution queue. Because in version 2 the priority of packet 1 is 3 and the priority of packet 3 is 1, sub-transaction 1-2-3 is added to the execution queue first, followed by sub-transaction 1-2-1.
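The pull order can be sketched as a sort over the packet queues: ascending version first, then priority within a version. The data layout and the function name `pull_order` are assumptions; the tie-break by commit SCN that the claims describe for queues of equal priority is left out for brevity.

```python
PRIORITY = {1: {1: 1, 2: 2, 3: 3},   # version 1: group -> priority
            2: {1: 3, 2: 2, 3: 1}}   # version 2: group -> priority

# Packet queues keyed by (version, group); each holds sub-transaction IDs.
QUEUES = {(1, 1): ["1-1-1"], (2, 1): ["1-2-1"], (2, 3): ["1-2-3"]}


def pull_order(queues, priority):
    """Drain queues by ascending version, then by priority (1 is highest)."""
    keys = sorted(queues, key=lambda k: (k[0], priority[k[0]][k[1]]))
    return [sub_id for key in keys for sub_id in queues[key]]


execution_queue = pull_order(QUEUES, PRIORITY)
```

With the example data this places 1-1-1 first, then 1-2-3 ahead of 1-2-1, matching the order described above.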
The execution thread pulls sub-transaction 1-1-1 from the execution queue. The commit SCN of sub-transaction 1-1-1 is 95, that of sub-transaction 1-2-3 is 99, and that of sub-transaction 1-2-1 is 98; since no sub-transaction in the execution queue has a commit SCN smaller than 95, sub-transaction 1-1-1 is executed.
The execution thread pulls sub-transaction 1-2-3 from the execution queue. The commit SCN of sub-transaction 1-2-1 in the execution queue is smaller than that of sub-transaction 1-2-3, but the two have the same version and different groups, so no transaction conflict detection is needed between them. The commit SCN of the executing sub-transaction 1-1-1 is smaller than that of sub-transaction 1-2-3, so transaction conflict detection is performed.
Transaction conflict detection is performed based on transaction dependency: the transaction operations in sub-transaction 1-1-1 depend on tables A and B, and those in sub-transaction 1-2-3 depend on table E. Sub-transactions 1-1-1 and 1-2-3 therefore have no transaction dependency and are executed in parallel.
The execution thread pulls sub-transaction 1-2-1 from the execution queue. The commit SCN of the executing sub-transaction 1-1-1 is smaller than that of sub-transaction 1-2-1, so transaction conflict detection is performed. Detection is again based on transaction dependency: the transaction operations in sub-transaction 1-1-1 depend on tables A and B, and the transaction operation in sub-transaction 1-2-1 depends on table A. Because both depend on table A, the judgment continues according to the types of the transaction operations involved. In this detection, sub-transaction 1-2-1 is the current transaction and sub-transaction 1-1-1 is the transaction to be detected; a transaction dependency is found between them, so sub-transaction 1-1-1 is executed first and sub-transaction 1-2-1 starts executing only after sub-transaction 1-1-1 has finished.
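The decisions in this walkthrough can be traced with a small table-level check. This is a sketch under assumptions: the class `SubTxn`, the function `check`, and the returned labels are hypothetical, and the final judgment by operation type is left as a label because the example does not list the operation types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubTxn:
    sub_id: str        # "<main transaction>-<version>-<group>"
    commit_scn: int    # SCN of the last operation in the sub-transaction
    tables: frozenset  # tables the operations depend on

    @property
    def version_group(self):
        _, version, group = self.sub_id.split("-")
        return version, group


def check(current, other):
    """Classify `other` against the `current` sub-transaction."""
    if other.commit_scn >= current.commit_scn:
        return "not earlier"             # only earlier commit SCNs are examined
    (cur_ver, cur_grp), (oth_ver, oth_grp) = current.version_group, other.version_group
    if cur_ver == oth_ver and cur_grp != oth_grp:
        return "skip detection"          # same version, different groups
    if current.tables.isdisjoint(other.tables):
        return "parallel"                # no shared table, no dependency
    return "check operation types"       # shared table: decide by operation type


t111 = SubTxn("1-1-1", 95, frozenset({"A", "B"}))
t123 = SubTxn("1-2-3", 99, frozenset({"E"}))
t121 = SubTxn("1-2-1", 98, frozenset({"A"}))
```

Calling `check(t123, t121)`, `check(t123, t111)` and `check(t121, t111)` follows the three comparisons made in the text, in that order.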
Example 3:
Fig. 9 is a schematic architecture diagram of a data synchronization device for packet execution according to an embodiment of the present invention. The data synchronization device for packet execution of this embodiment includes one or more processors 31 and a memory 32. In Fig. 9, one processor 31 is taken as an example.
The processor 31 and the memory 32 may be connected by a bus or in another manner; Fig. 9 takes a bus connection as an example.
The memory 32, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs and non-volatile computer-executable programs, such as the data synchronization method for packet execution in Embodiment 1. The processor 31 performs the data synchronization method for packet execution by running the non-volatile software programs and instructions stored in the memory 32.
The memory 32 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, memory 32 may optionally include memory located remotely from processor 31, which may be connected to processor 31 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The program instructions/modules are stored in the memory 32 and, when executed by the one or more processors 31, perform the data synchronization method for packet execution of Embodiment 1 above, for example the steps shown in Figs. 1-5 and 7-8.
It should be noted that, because the content of information interaction and execution process between modules and units in the above-mentioned device and system is based on the same concept as the processing method embodiment of the present invention, specific content may be referred to the description in the method embodiment of the present invention, and will not be repeated here.
Those of ordinary skill in the art will appreciate that all or some of the steps in the methods of the embodiments may be implemented by a program that instructs the associated hardware. The program may be stored on a computer-readable storage medium, which may include read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and the like.
The foregoing describes only preferred embodiments of the invention and is not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (10)

1. A method of packet-implemented data synchronization, comprising:
acquiring a grouping rule; wherein, each current packet corresponds to at least one table, and the priority of each current packet is set; when the grouping rule is changed, the changed grouping rule is used as a current version grouping rule, and each version grouping rule corresponds to a default grouping;
receiving a log, parsing the log, dividing the transaction operations into the corresponding current packets or the default packet according to the grouping rule of the current version, and generating at least one current sub-transaction; wherein the transaction operations of the main transaction are grouped according to the tables on which they depend;
when the grouping rule of the current version changes before the main transaction is committed, attributing the current packet to a history packet and attributing the current sub-transaction to a history sub-transaction;
when the commit operation of the main transaction is parsed, respectively constructing at least one sub-transaction message and at least one packet queue according to all historical sub-transactions and all current sub-transactions of the main transaction, issuing the sub-transaction message of the historical sub-transaction to the packet queue of the history packet, and issuing the sub-transaction message of the current sub-transaction to the packet queue of the current packet;
and sequentially pulling the sub-transaction messages in the packet queues corresponding to each version according to the priority, acquiring the sub-transaction to be put in storage according to the sub-transaction messages, and executing the sub-transaction to be put in storage.
2. The method of claim 1, wherein the receiving the log, parsing the log, dividing the transaction operation into corresponding current packets or default packets according to current version of the packet rules, and generating at least one current sub-transaction comprises:
acquiring a first mapping relation between the table full name of the table and the group number of the current group according to the grouping rule of the current version;
receiving the log, parsing the log to obtain at least one transaction operation, obtaining the table full name of the table on which each transaction operation depends, and judging, according to the table full name, whether a corresponding group number exists in the first mapping relation; if so, dividing the transaction operation into the corresponding current packet; if not, dividing the transaction operation into the default packet of the current version;
constructing a sub-transaction ID of a current sub-transaction according to the main transaction of the transaction operation, the version number of the grouping rule of the current version, and the group number; searching whether a current sub-transaction corresponding to the sub-transaction ID exists; if not, generating the current sub-transaction and adding the transaction operation to the current sub-transaction; if so, adding the transaction operation to the current sub-transaction;
constructing a second mapping relation between the main transaction ID of the main transaction and the sub-transaction ID; the second mapping relation is updated each time a current sub-transaction is generated.
3. The method for synchronizing data executed by a packet according to claim 2, wherein, when the commit operation of the main transaction is parsed, respectively constructing at least one sub-transaction message and at least one packet queue according to all historical sub-transactions and all current sub-transactions of the main transaction, issuing the sub-transaction message of the historical sub-transaction to the packet queue of the history packet, and issuing the sub-transaction message of the current sub-transaction to the packet queue of the current packet comprises:
when the commit operation is parsed, acquiring all sub-transaction IDs corresponding to the main transaction according to the main transaction ID and the second mapping relation;
constructing corresponding sub-transaction messages according to all the sub-transaction IDs;
generating a packet queue of the corresponding history packet and a packet queue of the corresponding current packet according to the version numbers and the group numbers of all the sub-transaction IDs;
and issuing the sub-transaction message of the historical sub-transaction to the packet queue of the history packet, and issuing the sub-transaction message of the current sub-transaction to the packet queue of the current packet.
4. The method for synchronizing data executed by packets according to claim 3, wherein sequentially pulling the sub-transaction messages in the packet queues corresponding to each version according to the priority, obtaining a sub-transaction to be binned according to the sub-transaction messages, and executing the sub-transaction to be binned comprises:
traversing the packet queues of all versions in order from the smallest version to the largest version;
when a sub-transaction to be put in storage exists in the packet queues of the current version, traversing the packet queues of the current version in order of priority from high to low;
when sub-transactions to be put in storage exist in the packet queue of the current priority, sequentially pulling sub-transaction messages from the corresponding packet queue, obtaining the transaction operations of the corresponding sub-transactions to be put in storage according to the corresponding sub-transaction messages, and executing the transaction operations;
when packet queues with the same priority exist, sorting by the commit SCNs of the sub-transactions to be put in storage in those packet queues, and processing the packet queues in order of commit SCN from small to large.
5. The method for synchronizing data executed by packets according to claim 4, wherein sequentially pulling the sub-transaction messages in the packet queues corresponding to each version according to the priority, obtaining a sub-transaction to be binned according to the sub-transaction messages, and executing the sub-transaction to be binned further comprises:
creating a checkpoint file, and writing the sub-transaction ID of a sub-transaction to be put in storage into the checkpoint file each time that sub-transaction has been executed;
and each time a new version of the grouping rule is acquired, writing the sub-transaction IDs of the sub-transactions to be put in storage that are currently in the system into the checkpoint file.
6. The method for synchronizing data executed by packets according to claim 1, wherein sequentially pulling the sub-transaction messages in the packet queues corresponding to each version according to the priority, obtaining a sub-transaction to be binned according to the sub-transaction messages, and executing the sub-transaction to be binned further comprises:
adding the sub-transaction to be put into an execution queue;
before executing the sub-transaction to be put in storage, when the commit SCN of another sub-transaction to be put in storage in the execution queue is smaller than the commit SCN of the sub-transaction to be put in storage, if the version of the other sub-transaction to be put in storage is the same as the version of the sub-transaction to be put in storage and the groups are different, carrying out transaction conflict detection on the other sub-transaction to be put in storage and the sub-transaction to be put in storage, and judging whether a transaction dependency exists;
if a transaction dependency exists, sequentially executing the sub-transactions to be put in storage in order from the smallest version to the largest version;
and if no transaction dependency exists, executing the other sub-transaction to be put in storage and the sub-transaction to be put in storage in parallel.
7. The packet-implemented data synchronization method of claim 6, wherein the transaction conflict detection comprises:
traversing all other sub-transactions to be put in the execution queue in sequence, taking the other sub-transactions to be put in as transactions to be detected, and taking the sub-transactions to be put in as current transactions;
when transaction conflict detection is carried out based on rowid, executing the transaction operation of the current transaction after the transaction operation of the transaction to be detected is completed when the same rowid is contained in the log of the transaction to be detected and the log of the current transaction;
when the transaction conflict detection is carried out based on the transaction dependence, when the table on which the transaction to be detected depends is different from the table on which the current transaction depends, no transaction dependence exists; when the table on which the transaction to be detected depends is the same as the table on which the current transaction depends, the existence or nonexistence of the transaction dependence is selectively determined according to the type of the transaction operation therein.
8. The packet-implemented data synchronization method of claim 7, wherein selectively determining the presence or absence of a transaction dependency based on a type of transaction operation therein when the table on which the transaction to be detected depends is the same as the table on which the current transaction depends comprises:
when the transaction to be detected and the current transaction have only insert operation on the table, no transaction dependence exists;
when the table has a primary key and the transaction to be detected has only a delete operation for the table, no transaction dependency exists.
9. The method of data synchronization performed by a packet according to any one of claims 1-8, wherein said attributing the current packet to a history packet and the current sub-transaction to a history sub-transaction when the current version of the packet rule changes before the main transaction is not committed comprises:
when recovering from a fault, loading the executed sub-transactions to be put in storage according to the sub-transaction IDs in the checkpoint file; when executing a sub-transaction to be put in storage, judging whether the commit SCN of the sub-transaction to be put in storage is smaller than or equal to the commit SCN of the executed sub-transaction to be put in storage; if yes, directly discarding the sub-transaction to be put in storage; if not, executing the sub-transaction to be put in storage;
when the rollback operation of the main transaction is parsed, clearing from the cache the logs corresponding to all corresponding historical sub-transactions and all corresponding current sub-transactions; and when a partial rollback operation of the main transaction is parsed, clearing the logs, in all corresponding historical sub-transactions and all corresponding current sub-transactions, whose commit SCN is greater than or equal to the commit SCN of the partial rollback operation.
10. A data synchronization device for packet execution, comprising at least one processor and a memory, said at least one processor and memory being connected by a data bus, said memory storing instructions executable by said at least one processor, said instructions, when executed by said processor, performing the data synchronization method for packet execution as claimed in any one of claims 1-9.
CN202311197235.XA 2023-09-15 2023-09-15 Data synchronization method and device for packet execution Pending CN117349369A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311197235.XA CN117349369A (en) 2023-09-15 2023-09-15 Data synchronization method and device for packet execution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311197235.XA CN117349369A (en) 2023-09-15 2023-09-15 Data synchronization method and device for packet execution

Publications (1)

Publication Number Publication Date
CN117349369A true CN117349369A (en) 2024-01-05

Family

ID=89354899

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311197235.XA Pending CN117349369A (en) 2023-09-15 2023-09-15 Data synchronization method and device for packet execution

Country Status (1)

Country Link
CN (1) CN117349369A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination