CN112148436A - Decentralized TCC (Try-Confirm-Cancel) transaction management method, device, equipment and system - Google Patents

Decentralized TCC (Try-Confirm-Cancel) transaction management method, device, equipment and system

Info

Publication number
CN112148436A
CN112148436A (application CN202011010261.3A)
Authority
CN
China
Prior art keywords
transaction
tcc
node
state
execution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011010261.3A
Other languages
Chinese (zh)
Other versions
CN112148436B (en)
Inventor
林斌
施建安
庄一波
赵友平
孙志伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Yilianzhong Yihui Technology Co ltd
Original Assignee
Xiamen Yilianzhong Yihui Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Yilianzhong Yihui Technology Co ltd filed Critical Xiamen Yilianzhong Yihui Technology Co ltd
Priority to CN202011010261.3A priority Critical patent/CN112148436B/en
Publication of CN112148436A publication Critical patent/CN112148436A/en
Application granted granted Critical
Publication of CN112148436B publication Critical patent/CN112148436B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/466: Transaction processing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20: Information retrieval of structured data, e.g. relational data
    • G06F 16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20: Information retrieval of structured data, e.g. relational data
    • G06F 16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F 16/283: Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Abstract

The invention provides a decentralized TCC transaction management method, device, node, and system. The method comprises the following steps: receiving a service execution request initiated by a user, and starting a TCC transaction according to the service execution request; generating a first try method according to the service execution request, wherein the first try method invokes a second try method on at least one called participating node; acquiring the execution result of the second try method from each called participating node; generating the current flag state of the TCC transaction according to the execution results; and coordinating the TCC transaction into the completion phase, where a confirmation operation or a rollback operation is performed based on the flag state. The invention avoids the single-point security and performance problems caused by a centralized node.

Description

Decentralized TCC (Try-Confirm-Cancel) transaction management method, device, equipment and system
Technical Field
The invention relates to the technical field of computers, in particular to a decentralized TCC (Try-Confirm-Cancel) transaction management method, device, equipment and system.
Background
Under a microservice architecture, each microservice manages its own data source, and a complete business flow often requires multiple data sources to interact. To guarantee the integrity and consistency of business data in this situation, distributed transactions are required. The distributed transaction mechanisms most used in practice are the long-transaction scheme based on the Saga model, the eventually consistent scheme based on the TCC model, and the strongly consistent scheme based on the XA protocol.
The eventually consistent scheme based on the TCC model mainly requires splitting one business action into two steps: 1) try to perform the business action; 2) if the first step succeeds, the second step executes a confirm action to persist the result of the first step; if the first step fails, the second step executes a cancel action to roll back the content of the first step.
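A minimal Java sketch of this two-step split; the interface, class, and amounts are illustrative assumptions, not taken from the patent:

```java
public class TccDemo {
    interface TccParticipant {
        boolean tryExecute(String xid); // step 1: attempt the business action
        void confirm(String xid);       // step 2a: persist the attempted result
        void cancel(String xid);        // step 2b: roll the attempt back
    }

    // Toy participant: the try phase freezes 30 units, confirm makes the
    // deduction final, cancel returns the frozen amount to the balance.
    static class AccountParticipant implements TccParticipant {
        int balance = 100;
        int frozen = 0;

        public boolean tryExecute(String xid) {
            if (balance < 30) return false;
            balance -= 30;
            frozen += 30;
            return true;
        }
        public void confirm(String xid) { frozen -= 30; }
        public void cancel(String xid)  { balance += 30; frozen -= 30; }
    }
}
```

Because the try phase only reserves resources instead of locking them, other requests can proceed concurrently, which is the performance advantage discussed below.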
Services in the internet field emphasize performance, and the TCC scheme does not lock resources during the whole execution, so its performance is better. The TCC scheme can be implemented in two modes: manual and automatic. In the manual mode, the whole TCC process (try, confirm, rollback, and so on) is implemented by hand in the business code, which requires comparatively large implementation effort. In the automatic mode, the whole TCC process is controlled by a framework.
A TCC framework must implement a TCC transaction manager. During the business process, all participating nodes in the same TCC transaction register themselves with the TCC transaction manager, so that after the try phase ends, the TCC transaction manager decides whether to execute a confirm action or a rollback action according to the success or failure of the try phase.
Several TCC frameworks exist today, mainly tcc-transaction, Hmily, and EasyTransaction. All three are centralized frameworks, that is, the transaction manager runs as a central node. Once the transaction manager node fails, all business fails: no new TCC transactions can be registered, and existing TCC transactions cannot continue to execute because their coordinator, the transaction manager, is lost.
Disclosure of Invention
In view of the above, the present invention provides a decentralized TCC transaction management method, apparatus, device, and system, which adopt a decentralized transaction manager mechanism to avoid the single-point-of-failure problem of the transaction manager node in the prior art.
The embodiment of the invention provides a decentralized TCC transaction management method, which comprises the following steps:
receiving a service execution request initiated by a user, and starting a TCC transaction according to the service execution request;
generating a first try method according to the service execution request; wherein the first try method invokes a second try method on at least one called participating node;
acquiring the execution result of the second try method from each called participating node;
generating the current flag state of the TCC transaction according to the execution results;
coordinating the TCC transaction into the completion phase, and performing a confirmation operation or a rollback operation based on the flag state.
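The derivation of the flag state from the try outcomes in the steps above can be sketched as follows; the class, enum, and method names are assumptions for illustration:

```java
import java.util.List;
import java.util.function.BooleanSupplier;

// Sketch of the flag-state rule: the transaction is marked for commit only
// when every called try method reported success, otherwise for rollback.
public class FlagState {
    public enum Mark { COMMIT, ROLLBACK }

    public static Mark mark(List<BooleanSupplier> tryOutcomes) {
        for (BooleanSupplier outcome : tryOutcomes) {
            if (!outcome.getAsBoolean()) return Mark.ROLLBACK;
        }
        return Mark.COMMIT;
    }
}
```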
Preferably, the TCC transaction has the following properties:
xid of the TCC transaction itself;
the current flag state of the TCC transaction; wherein the flag states of the TCC transaction include an initial state, a commit/rollback state, and an end state; when all try methods have executed successfully, the flag state is recorded as the commit state; when at least one try method has failed, the flag state is marked as the rollback state;
a list of the try methods under the TCC transaction scope;
the current try method under the TCC transaction scope; wherein each try method to be executed must be added to the try-method list of the TCC transaction object and written into the log before it is executed;
a list of the remote participants under the TCC transaction scope; wherein a remote call instruction may only be sent to a remote participant after it has been added to the remote-participant list.
Preferably, the xid comprises:
a globally unique ID for uniquely identifying the TCC transaction;
a branch ID identifying a local transaction participating in the TCC transaction; wherein, on any node, a local transaction participating in the TCC transaction is itself a branch of the global TCC transaction; when the branch ID of an xid is null, the xid marks the TCC transaction itself.
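The two-part xid just described can be sketched as a small value class; the field and factory names are assumptions, not identifiers from the patent:

```java
import java.util.UUID;

// Sketch of the xid structure: a mandatory globally unique id plus an
// optional branch id; a null branch id marks the TCC transaction itself.
public final class Xid {
    final String globalId;  // globally unique, never null
    final String branchId;  // one local-transaction branch; null for the global transaction

    Xid(String globalId, String branchId) {
        if (globalId == null) throw new IllegalArgumentException("globalId must not be null");
        this.globalId = globalId;
        this.branchId = branchId;
    }

    static Xid newGlobal() { return new Xid(UUID.randomUUID().toString(), null); }

    Xid newBranch(String branchId) { return new Xid(this.globalId, branchId); }

    boolean isGlobal() { return branchId == null; }
}
```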
Preferably, the method further comprises the following steps:
when a remote participating node is called, a TCC execution instruction is sent to the remote participating node; wherein the TCC execution instruction comprises the xid, an identifier of the node, and a globally unique traceId for the current call, used to prevent concurrent contention; after receiving the TCC execution instruction, the remote participating node judges whether a TCC transaction object corresponding to the xid in the instruction exists in its local TCC transaction warehouse; if it exists, the remote participating node directly extracts and executes the corresponding second try method; if it does not exist, the node's transaction manager builds a participant-side TCC transaction object and executes the second try method, and then returns the execution result of the second try method;
sending a commit instruction to the remote participating node based on the execution result of the second try method; after receiving the instruction, the remote participating node updates the TCC transaction state in CAS (compare-and-swap) mode; the thread whose update succeeds continues to execute the subsequent completion-phase flow and returns a success response, while a thread whose update fails returns a success response once the log write has finished.
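The CAS-based state transition just described can be sketched with an atomic reference: many threads may deliver the commit instruction, but only the one that wins the compare-and-set runs the completion phase. The state names are assumptions:

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch of the participant-side CAS update on receipt of a commit instruction.
public class ParticipantState {
    enum State { INITIAL, COMMITTING, ENDED }

    final AtomicReference<State> state = new AtomicReference<>(State.INITIAL);

    // Returns true only for the single thread that should execute the
    // completion phase; every losing thread just acknowledges.
    boolean tryEnterCommit() {
        return state.compareAndSet(State.INITIAL, State.COMMITTING);
    }
}
```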
Preferably, the system further comprises a log, the log comprising a database log used to confirm, on restart after downtime, whether a try-phase method or a completion-phase method was executed; wherein:
the database log stores the xid held in the try-method object, or the completion-phase xid held in the confirmation method or rollback method.
Preferably, the log further comprises a log file used to rebuild TCC transactions on restart after downtime;
during the TCC transaction, related critical operations are written into the log file in their order of occurrence, and each critical operation is executed only after it has been written to the log file, so that after downtime and restart the transaction can be recovered from the information recorded in the log file.
Preferably, upon transaction recovery from the log file:
when the TCC transaction is judged to be in the initial state, checking whether the first try method succeeded;
if yes, updating the state of the TCC transaction to the commit/rollback state, and putting the TCC transaction into an asynchronous thread pool to execute the corresponding completion-phase flow;
if not, querying whether the log file has a record of the xid of the first try method;
if it does, marking the first try method as successful, otherwise marking it as failed;
updating the state of the TCC transaction according to the updated status of the first try method, and then executing the completion-phase flow;
when the TCC transaction is in the commit/rollback state, first traversing the confirmation/rollback method list and querying whether the completion-phase xid exists in the log file, so as to determine whether the corresponding confirmation/rollback method was completed; after this determination, putting the TCC transaction object into the asynchronous thread pool and continuing the commit or rollback flow.
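The recovery rule above for a transaction found in the initial state can be condensed to one decision; the log is modelled here as a set of logged xids, and all names are assumptions:

```java
import java.util.Set;

// Sketch of initial-state recovery: a logged xid for the first try method
// proves its local transaction committed, so the transaction moves toward
// commit; otherwise the try is treated as failed and the transaction rolls back.
public class Recovery {
    public static String recover(String firstTryXid, Set<String> loggedXids) {
        return loggedXids.contains(firstTryXid) ? "COMMIT" : "ROLLBACK";
    }
}
```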
Preferably, the completion-phase flow is executed in a multi-threaded manner, which specifically includes:
sending the corresponding instructions to all remote participating nodes and, at the same time, calling the completion-phase method corresponding to the local first try method, each on its own thread; wherein the main thread is notified through a CountDownLatch, either by waiting on it or by asynchronous notification; during the multi-threaded execution, the completion flags in the local and remote confirmation/rollback methods are decorated with a preset keyword.
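A sketch of that multi-threaded completion phase: each local or remote completion call runs on its own thread, flips a completion flag, and counts down a latch the main thread waits on. The `volatile` modifier is my assumption for the "preset keyword" mentioned above, and all names are illustrative:

```java
import java.util.concurrent.CountDownLatch;

public class CompletionPhase {
    static class CompletionCall implements Runnable {
        volatile boolean done = false;      // completion flag, visible across threads
        final CountDownLatch latch;
        CompletionCall(CountDownLatch latch) { this.latch = latch; }
        public void run() {
            // ... perform the local or remote confirm/rollback here ...
            done = true;
            latch.countDown();
        }
    }

    // Returns true when every completion call has finished.
    static boolean runAll(int participants) {
        CountDownLatch latch = new CountDownLatch(participants);
        CompletionCall[] calls = new CompletionCall[participants];
        for (int i = 0; i < participants; i++) {
            calls[i] = new CompletionCall(latch);
            new Thread(calls[i]).start();
        }
        try {
            latch.await();                  // main thread blocks until all calls finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        for (CompletionCall c : calls) if (!c.done) return false;
        return true;
    }
}
```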
Preferably, for an abnormal transaction placed into the transaction warehouse, the method further comprises:
taking the abnormal transaction out of the abnormal-transaction set; wherein the local participating node and the remote participating nodes currently executing are marked with an identification number, which multiple threads contend for;
before preparing to execute the completion phase, first judging whether the current identification number is the first number;
if not, discarding the abnormal transaction;
if it is the first number, counting the currently required operand and updating it in CAS mode, obtaining the execution right when the update succeeds; wherein the operand is the sum of the number of unfinished local participating nodes and the number of remote participating nodes;
placing a monitor in the execution method of each local and remote participating node, which decrements the operand when execution finishes;
for the thread that observes the operand reach 0, acquiring the completion status of the local and remote participating nodes in the current TCC transaction;
if the local and remote participating nodes have all completed, updating the state of the TCC transaction, writing the log file, and clearing the log;
if some local or remote participating node has not completed, putting the TCC transaction back into the abnormal-transaction set of the transaction warehouse for the next attempt.
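The operand bookkeeping above can be sketched with an atomic counter: each finished completion call decrements it, and the thread that observes zero performs the final completion check. The class and method names are assumptions:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of retry bookkeeping: the operand counts outstanding local and
// remote completions for one abnormal transaction.
public class RetryBookkeeping {
    final AtomicInteger operand;

    RetryBookkeeping(int unfinishedLocal, int unfinishedRemote) {
        this.operand = new AtomicInteger(unfinishedLocal + unfinishedRemote);
    }

    // Called by the monitor when one participant's completion method returns.
    // Returns true only for the thread that finished last (operand reached 0).
    boolean complete() {
        return operand.decrementAndGet() == 0;
    }
}
```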
The embodiment of the invention also provides a decentralized TCC transaction management device, which comprises:
a TCC transaction starting unit, configured to receive a service execution request initiated by a user, and start a TCC transaction according to the service execution request;
an attempt method generation unit, configured to generate a first try method according to the service execution request; wherein the first try method invokes a second try method on at least one remotely called participating node;
an execution result obtaining unit, configured to obtain the execution result of the second try method from each called participating node;
a marking unit, configured to generate the current flag state of the TCC transaction according to the execution results;
and a coordination unit, configured to coordinate the TCC transaction into the completion phase and execute a confirmation operation or a rollback operation according to the flag state.
The embodiment of the present invention further provides a decentralized TCC transaction management node, which includes a memory and a processor, where a computer program is stored in the memory, and the computer program can be executed by the processor to implement the decentralized TCC transaction management method.
The embodiment of the invention also provides a decentralized TCC transaction management system, comprising the above decentralized TCC transaction management node and a plurality of participating nodes as participants.
In the above embodiments, the state changes of the whole TCC transaction involve no central node in the coordination of the process; any node that first initiates the TCC transaction can act as the management node, which is in effect the transaction manager node. This design makes the management of the whole TCC transaction more fault-tolerant and elastic. Since any node can become the management node, i.e. the manager node, the single-point security and performance problems caused by a centralized node are avoided; because the transaction manager's responsibilities are shared by the whole service cluster, there are naturally no performance or security bottlenecks.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a flowchart illustrating a method for decentralized TCC transaction management according to a first embodiment of the present invention.
Fig. 2 is a schematic diagram of a call between nodes according to a first embodiment of the present invention.
FIG. 3 is a state change diagram of a TCC transaction.
Fig. 4 is a flow chart illustrating the success of the trial method.
Fig. 5 is a flow chart illustrating a failure of the trial method.
FIG. 6 is a process diagram for joining a remote participating node to a TCC transaction.
Fig. 7 is a flow chart of a remote participating node executing TCC execution instructions.
Fig. 8 is a flow chart of a remote participating node executing a TCC commit instruction.
Fig. 9 is a flow diagram of a remote participating node executing a rollback instruction.
Fig. 10 is a schematic structural diagram of a decentralized TCC transaction management device according to a second embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments will be described clearly and completely with reference to the accompanying drawings; it is obvious that the described embodiments are some, but not all, embodiments of the present invention. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention as claimed, but merely represents selected embodiments of the invention. All other embodiments obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention fall within the scope of the present invention.
Referring to fig. 1, a first embodiment of the present invention provides a decentralized TCC transaction management method, which can be executed by a decentralized TCC transaction management node (hereinafter referred to as management node) to at least implement the following steps:
s101, receiving a service execution request initiated by a user, and starting a TCC transaction according to the service execution request.
In this embodiment, the management node may be a computing device with data processing capability and communication capability, such as a notebook computer, a desktop computer, a mobile terminal, a PDA, a workstation, and the like, in which corresponding service software is installed, and performs service processing by executing the service software.
S102, generating a first try method according to the service execution request; wherein the first try method invokes a second try method on at least one called participating node.
In this embodiment, when a user wants to execute a service, he or she may issue a service execution request through the management node, and upon receiving the request the management node starts the corresponding TCC transaction.
It should be noted that in this embodiment the management node is not a fixed, specific node: a user may initiate a TCC transaction through any service node, and the node that initiates the TCC transaction is referred to as the management node.
For example, as shown in fig. 2, the service nodes in this embodiment include an A node, a B node, a C node, and a D node. While no TCC transaction is in progress, these service nodes are all peers. If a user then starts a TCC transaction through the A node, and the first try method of the A node remotely calls the second try method B(try) of the B node and the second try method D(try) of the D node, the A node becomes the management node that coordinates the execution of the whole TCC transaction, while the B node and the D node become participating nodes. Note that, likewise, when the B node or the D node executes its own second try method, it may also invoke the try method of a downstream node; for example, the second try method B(try) of the B node invokes the second try method C(try) of the C node, so the C node is also a participating node, with the B node as its upstream node. It will be appreciated that the C node may in turn invoke the try methods of its own downstream nodes; every node invoked during the execution of the TCC transaction is referred to as a participating node.
S103, obtaining the execution result of the second try method from each called participating node.
S104, generating the current flag state of the TCC transaction according to the execution results.
S105, coordinating the TCC transaction into the completion phase, and executing the confirmation operation or the rollback operation according to the flag state.
In this embodiment, after being called, each participating node executes its own second try method and reports the execution result of that method to its upstream node.
Taking fig. 2 as an example, the upstream node of the B node and the D node is the A node, so they send the execution results of their second try methods to the A node; the upstream node of the C node is the B node, so the C node sends the execution result of the second try method C(try) to the B node, and the B node then consolidates it and passes it on to the A node.
As shown in fig. 3, after acquiring the execution results of every node, the A node can generate the current flag state of the TCC transaction according to those results, e.g. the commit state or the rollback state. The A node then coordinates the TCC transaction into the completion phase to execute either the confirmation method or the rollback method. After entering the completion phase, both the confirmation method and the rollback method must eventually succeed; if an execution fails, it is retried until it succeeds. Therefore, once management node A has finished executing its own completion-phase method and has successfully sent the instruction to execute the completion-phase method to its downstream nodes, the TCC distributed transaction is considered complete. All participating nodes must be able to receive the completion-phase instruction and eventually execute it successfully.
In one implementation:
if the attempted methods of the four nodes are successfully executed, the node A as the management node marks the current flag state of the TCC transaction as a commit state and carries out the completion phase. Since all nodes 'attempted methods are performed successfully during the attempt phase, the completion phase is the execution of all nodes' validation methods. Firstly, the confirmation method Acommit of the A node as a management node is executed firstly, and then the A node sends out an instruction for executing the confirmation method and transmits the instruction to the B node of the direct downstream node thereof in a remote mode. After receiving the instruction of executing the confirmation method in the completion phase, the node B executes the local Bcomp and continues to transmit the instruction of executing the confirmation method to the direct downstream node C. After receiving the instruction, the C node also executes the local confirmation method Ccommit. Then, the node a sends an execution confirmation method instruction to the node D, and the node D executes the local confirmation method Dcommit.
In another implementation:
If a node fails while executing its try method, for example the D node fails while executing its second try method D(try), the data state of the D node remains in the initial state, since every try method runs under the protection of a local transaction. But the data states of the B node and the C node are now in an intermediate state, and both need to be rolled back to the initial state. Because the calls are synchronous, the failure of the second try method D(try) is sensed by the upstream node, i.e. the A node, and the A node coordinates the TCC transaction into the completion phase and executes the rollback method. The A node issues a rollback instruction to the B node. After receiving it, the B node executes its local rollback method and forwards the rollback instruction to its downstream node C. After receiving the B node's rollback instruction, the C node also executes its rollback method. Thus the A node and the D node, whose try methods ran under the protection of local transactions, have unaffected data states despite the failure, while the B node and the C node roll their data states back from the intermediate state to the initial state. The whole TCC transaction is therefore rolled back, and the node data of the whole service remains consistent.
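The downstream propagation pattern in the two walkthroughs above (each node executes its own completion-phase method first, then forwards the same instruction to its direct downstream participants) can be sketched as follows; the class and field names are assumptions:

```java
import java.util.ArrayList;
import java.util.List;

public class Propagation {
    static class Node {
        final String name;
        final List<Node> downstream = new ArrayList<>();
        final List<String> trace;               // shared record of execution order

        Node(String name, List<String> trace) { this.name = name; this.trace = trace; }

        void propagate(String instruction) {
            trace.add(name + ":" + instruction); // execute the local method first
            for (Node n : downstream) n.propagate(instruction); // then forward downstream
        }
    }
}
```

With the topology of fig. 2 (A calls B and D, B calls C), issuing `a.propagate("commit")` visits the nodes in the order A, B, C, D, matching the walkthrough.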
In summary, the state changes of the whole TCC transaction involve no central node in the coordination of the process: any node that first initiates the TCC transaction can be the management node, or coordination node, which is in effect the transaction manager node. This design makes the management of the whole TCC transaction more fault-tolerant and elastic; since any node can become the management node, i.e. the manager node, the single-point security and performance problems caused by a centralized node are avoided. Because the transaction manager's responsibilities are shared by the whole service cluster, there are naturally no performance or security bottlenecks.
Some preferred embodiments or specific implementations of the present invention are described further below to facilitate an understanding of the present invention.
On the basis of the above-described embodiment, in a specific implementation of the present invention, the TCC transaction has the following attributes:
(1) xid of the TCC transaction itself
In this embodiment, a TCC transaction uses an xid as its unique identifier, and the xid is divided into two parts:
a globally unique ID (globalId) for uniquely identifying a TCC transaction, which must not be empty;
a branch ID (branchId) that identifies a local transaction participating in the TCC transaction.
On any node, a local transaction participating in a TCC transaction is itself a branch of the global TCC transaction. When the branchId is empty, the xid marks the TCC transaction itself.
The transactions involved in this embodiment are of two types: TCC transactions and local transactions. A local transaction is a database transaction on a single node, normally started and committed via the JDBC interface. In this embodiment the TCC transaction is defined by a class, while the local transaction is only conceptual and has no defining class. The xid assigned to a local transaction is actually held and used by the corresponding try-method invocation object (TccInvoke).
In this embodiment, the identity of a TCC transaction is known as soon as it is created: a TCC transaction created on the originating management node has the identity of coordinator, while a TCC transaction created by a remote invocation instruction has the identity of participant.
Likewise, the xid of a TCC transaction is fixed at creation. For a coordinator, a new xid value is created; for a participant, the xid value comes from, and is identical to, the coordinator's xid.
In this embodiment, the try-phase boundary of the TCC transaction coincides with the first local transaction initiated within the TCC transaction. When a try method is to be executed and no TCC transaction exists in the context, a TCC transaction is started, and its identity is coordinator. A try method must be executed within a local transaction. When the try method completes, the try phase of the TCC transaction is complete, and at the same moment the first local transaction participating in the TCC transaction, i.e. the local transaction corresponding to that try method, commits or rolls back. Hence the try-phase boundary of a TCC transaction is the same as that of the first local transaction in the TCC transaction scope.
(2) The current flag state of the TCC transaction.
As shown in fig. 3, it can be seen from the above description that a service node participating in a TCC transaction has three distinct states: the initial state, the intermediate state, and the end state. Indeed, in any scheme that adopts final consistency, the nodes participating in the service pass through these three states.
For the TCC transaction itself, there are three corresponding states:
Initial: the TCC transaction enters this state as soon as it is created. In this state, any service node (whether the management node or a participating node) may execute try methods.
Marked as committed/rolled back: this is the intermediate state. Entering it means that the try methods of all participating service nodes executed successfully, or that at least one of them failed. The TCC transaction is marked as committed or as rolled back according to the outcome of the participating nodes' try methods. The management node may enter this state on its own, whereas a participating node enters it only on instruction from its superior node. Once in this state, the management node coordinates each node to execute the corresponding method for the current state: the confirm method if marked as committed, the rollback method if marked as rolled back.
Ended: when the completion-phase methods (confirm or rollback) of both the local and the remote calls involved in the TCC transaction have all been executed, the TCC transaction enters this state. The resources of a TCC transaction in the ended state can be reclaimed, for example its memory and log space.
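The three states above and their legal transitions can be sketched as a small state machine. This is an illustrative sketch only, not part of the original embodiment; the enum and method names are assumptions.

```java
// Hypothetical sketch of the three TCC transaction states described above
// and the transitions allowed between them.
public enum TccState {
    INITIAL,          // just created; try methods may run
    MARKED_COMMIT,    // all try methods succeeded; confirm methods will run
    MARKED_ROLLBACK,  // some try method failed; rollback methods will run
    ENDED;            // completion phase done; resources may be reclaimed

    /** Whether a transition from this state to {@code next} is legal. */
    public boolean canTransitionTo(TccState next) {
        switch (this) {
            case INITIAL:
                return next == MARKED_COMMIT || next == MARKED_ROLLBACK;
            case MARKED_COMMIT:
            case MARKED_ROLLBACK:
                return next == ENDED;
            default: // ENDED is terminal
                return false;
        }
    }
}
```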
In this embodiment, in every state of the TCC transaction, the relevant critical operations are written to a log in the order in which they occur; each critical operation is executed only after it has been written to the transaction log, so that after a crash and restart the transaction can be recovered from the information recorded in the log.
A TCC transaction is a global concept; for each participating service node, the node itself holds a branch of the TCC transaction, which can be regarded as the local part of the global transaction. The TCC transaction branch on each node likewise moves from the initial state after creation, to the marked committed/rolled back state after its try methods have executed (on a remote participating node, after it receives the instruction from its superior node), and to the ended state after the confirm/rollback methods have executed.
For a TCC transaction, data consistency must be guaranteed on every service node. Data consistency is considered from two perspectives:
Consistency of data across all participants of the service.
Consistency of data on a single node under abnormal conditions.
From the perspective of all participants of the service, this embodiment guarantees data consistency chiefly on the principle of eventual consistency. Under normal operation, every service node participating in the TCC transaction goes through the three state changes, and once all nodes reach the final state, data consistency is necessarily achieved.
Consistency of data on a single node is guaranteed in this embodiment mainly by logging. Any operation involving a TCC transaction is first written to the log, and the operation is executed only after the log write succeeds. Any operation that changes the state of the TCC transaction (besides the three states above, this includes adding a new participant and executing a new try, confirm, or rollback method) must likewise be logged before it is executed. Furthermore, all TCC operations are performed within the scope of local transactions, since a TCC transaction is itself composed of multiple local transactions. To determine whether a local transaction committed, an xid is assigned to it when it starts; this xid identifies the local transaction as a branch of the global TCC transaction. The log entry is written after the xid is assigned, and the xid is written to the log before the local transaction commits, so the local transaction itself guarantees that the business operation and the log operation are consistent. After a crash, whether the business operation was successfully committed can therefore be inferred from the information in the log.
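The "assign xid, write log, then commit" rule above can be sketched as follows. This is a minimal illustration with an in-memory list standing in for the log table; all names are assumptions, not the embodiment's actual API.

```java
import java.util.List;
import java.util.UUID;

// Hypothetical sketch: an xid is assigned when the local transaction starts
// and is appended to the log inside that same local transaction, so the log
// entry and the business data commit or roll back together.
public class LocalBranch {
    public final String xid = UUID.randomUUID().toString();
    private final List<String> log; // stands in for the transaction log table

    public LocalBranch(List<String> log) { this.log = log; }

    /** Runs business work and the log write inside one (simulated) local transaction. */
    public void commit(Runnable businessWork) {
        businessWork.run();
        log.add("COMMITTED " + xid); // written before the local commit completes
        // real code would now issue the database COMMIT
    }

    /** After a crash, presence of the xid in the log tells us the branch committed. */
    public static boolean didCommit(List<String> log, String xid) {
        return log.contains("COMMITTED " + xid);
    }
}
```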
Because every critical operation is executed only after it is logged, a node that crashes and restarts can recover the state of its TCC transactions by reading the log and replaying the critical actions. The TCC transactions on the node are restored to their pre-crash state and continue executing until they reach the ended state, at which point all nodes are data-consistent.
The TCC log records only key information, mainly: the current state of the TCC transaction, the list of local transactions participating in the TCC transaction, the local try methods participating in the TCC transaction, and the local completion-phase methods participating in the TCC transaction whose execution has been attempted. The operations involved in the TCC transaction are written sequentially, so crash recovery is based on replaying the log.
Specifically, after a crash and restart, all essential information of the TCC transaction object can be recovered simply by reading the log entries in order; non-essential information can be inferred from the essential information, so that all information of the TCC transaction is finally restored to its pre-crash state.
Every local transaction on a node (usually one local transaction per node, occasionally several, each being a branch of the global TCC transaction) registers itself with the TCC transaction object before it starts, so after crash recovery the list of local transactions in the TCC transaction object can be fully rebuilt by reading the log file.
A crash may occur during the try phase of a TCC transaction as well as during the completion phase. If the TCC transaction state read from the log is the initial state, the crash occurred during the try phase, since the state had not yet migrated to marked committed/rolled back. By querying the log for a given xid record, one can judge whether the corresponding local transaction committed successfully. Since the crash means the try phase of the TCC transaction has ended, it must first be determined to which intermediate state the TCC transaction should move from the initial state. Whether the TCC transaction should be marked as committed or as rolled back is decided by whether the first local transaction on the coordinating node committed (the first local transaction on the coordinating node is the one that actually opened the TCC global transaction, so whether it committed reflects whether the whole try phase of the TCC transaction succeeded). The completion-phase methods of all participating nodes are then executed.
If at recovery time the TCC transaction was already marked as committed/rolled back, the completion methods for that state must continue on each node. It must first be determined which local completion methods have already executed. Each try method is assigned a completion-phase xid in advance, before its corresponding completion-phase branch executes. At recovery, whether the completion-phase method of a try method has executed is judged by checking whether its completion-phase xid appears in the log: if it does, the method is marked as executed; if not, it can be retried. In this way a TCC transaction that had not finished executing is driven to completion.
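The recovery decision described in the two paragraphs above can be condensed into one function. This is an illustrative sketch under assumed names, not the embodiment's actual recovery code.

```java
import java.util.Set;

// Hypothetical sketch of the recovery decision after a restart: if the logged
// state is still INITIAL, the crash happened in the try phase and the outcome
// is inferred from whether the coordinator's first local transaction committed;
// otherwise the completion phase simply resumes and retries.
public class Recovery {
    public enum Decision { MARK_COMMIT, MARK_ROLLBACK, RESUME_COMPLETION }

    public static Decision decide(String loggedState,
                                  Set<String> committedXids,
                                  String firstLocalXid) {
        if (!"INITIAL".equals(loggedState)) {
            // Already marked committed/rolled back: keep retrying completion methods.
            return Decision.RESUME_COMPLETION;
        }
        // Crash happened during the try phase: the first local transaction decides.
        return committedXids.contains(firstLocalXid)
                ? Decision.MARK_COMMIT
                : Decision.MARK_ROLLBACK;
    }
}
```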
In this embodiment, when the TCC transaction enters the marked committed/rolled back state, it enters the completion phase. To guarantee data consistency, the completion phase must eventually succeed regardless of which branch is taken, so if a completion-phase method fails it must be executed again. Completion-phase methods are wrapped in local transactions, so repeated execution after a failure causes no local data problems and they can be retried safely. In theory the completion-phase methods must eventually succeed, but if they still fail after many retries, manual intervention is required.
Before a participating node executes a try method, the framework logs the try method, recording its arguments in the log and also keeping them in memory. When the TCC transaction enters the completion phase and the confirm/rollback methods must run, the framework obtains from the metadata how the confirm or rollback method corresponding to a try method is to be invoked. The confirm and rollback methods take the same arguments as the try method, so the framework can invoke the confirm or rollback method with the arguments the try method used.
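The argument-replay mechanism just described can be sketched as a small metadata-driven invoker. The record and handler-map structure here is an assumption for illustration; the embodiment does not specify these types.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of metadata-driven completion: the framework records the
// try method's arguments, then later calls the registered confirm handler with
// the very same arguments, as the text requires.
public class CompletionInvoker {
    public static class TryRecord {
        public final String tryName;
        public final Map<String, Object> args;
        public TryRecord(String tryName, Map<String, Object> args) {
            this.tryName = tryName;
            this.args = args;
        }
    }

    private final Map<String, Function<Map<String, Object>, String>> confirmHandlers;

    public CompletionInvoker(Map<String, Function<Map<String, Object>, String>> handlers) {
        this.confirmHandlers = handlers;
    }

    /** Looks up the confirm handler registered for the try method and replays its args. */
    public String confirm(TryRecord record) {
        return confirmHandlers.get(record.tryName).apply(record.args);
    }
}
```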
(3) List of attempted methods under the TCC transaction scope.
In this embodiment, a try-method call is a concrete execution of a try method. Before the try method executes, its information must first be registered in the TCC transaction's try-method list. Specifically, a try-method object is generated and registered with the TCC transaction. Registration consists of:
setting the TCC transaction's current try-method object as the predecessor of the new try-method object, and setting the new object as the current try method;
adding the try-method object to the TCC transaction's try-method list;
writing the try-method object's information to the log.
At the same time, from the implementation point of view, the try-method object should be stored in the current try-method pointer of the TCC transaction object, which facilitates the related processing when the local transaction commits or rolls back.
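The three registration actions above can be sketched directly. All class names here are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of registering a try-method object: link the predecessor,
// move the current pointer, append to the list, and log the registration.
public class TccTransaction {
    public static class TryMethod {
        public final String name;
        public TryMethod predecessor;
        public TryMethod(String name) { this.name = name; }
    }

    public final List<TryMethod> tryMethods = new ArrayList<>();
    public final List<String> log = new ArrayList<>();
    public TryMethod current; // the "current try method" pointer

    public void register(TryMethod m) {
        m.predecessor = current;       // current method becomes the predecessor
        current = m;                   // new method becomes the current pointer
        tryMethods.add(m);             // append to the transaction's try-method list
        log.add("REGISTER " + m.name); // persist the registration before execution
    }
}
```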
In this embodiment, the first try method in a TCC transaction carries special significance, because its success or failure determines whether this branch of the TCC transaction succeeded or failed. In the log file, the first try method's object is the first one written. There is, however, a simpler way to identify it: the object that has no predecessor try method is the first try method of the TCC transaction branch.
(4) The current try method under the TCC transaction scope; every try method to be executed must be added to the try-method list of the TCC transaction object and written to the log before execution.
In this embodiment, given the responsibilities of a try method, its object should include the following attributes:
the operation object to which the try method corresponds;
the arguments with which the try method is executed;
the xid of the local transaction in which the try method executes;
the completion-phase xid assigned to the try method, which is written to the log file and committed to the log table when its corresponding completion-phase method executes.
As described above, during the execution of one try method another try method may be invoked, and when the inner try method finishes, execution returns to the call site of the outer try method and continues. To capture this relationship, each try-method object holds a predecessor pointer to the try method that invoked it. A try-method object also holds an xid value: the xid of the local transaction in which the try method executed. A local transaction is a logical concept with no actual class definition or object, so the xid value is determined by the following rules:
when the try method has no predecessor try-method object, a new xid is created;
when the try method has a predecessor at creation time, then if the transaction propagation attribute of the method containing the current try method is to join the existing transaction, the try method reuses the xid of its predecessor's object; if the propagation attribute is to open a new transaction, a new xid is created for the try method.
When a local transaction commits, all try methods sharing the same xid value have finished executing: the current try method has executed, and if the xid of its predecessor's object is the same as its own, the predecessor executed in the same local transaction and must also have completed. Applying this check recursively finds the objects of all try methods in the same local transaction. When this local transaction commits, all try methods with the same xid must be marked as completed.
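The recursive same-xid walk just described can be sketched as a loop over the predecessor chain. The structure is an illustrative assumption.

```java
// Hypothetical sketch of the commit walk: starting from the current try method,
// every predecessor sharing the same xid is marked completed, and the first
// node with a different xid becomes the new "current" pointer.
public class CommitWalk {
    public static class TryMethod {
        public final String xid;
        public final TryMethod predecessor;
        public boolean committed;
        public TryMethod(String xid, TryMethod predecessor) {
            this.xid = xid;
            this.predecessor = predecessor;
        }
    }

    /** Marks every try method in the same local transaction (same xid) completed
     *  and returns the first predecessor with a different xid (possibly null). */
    public static TryMethod markCommitted(TryMethod current) {
        TryMethod node = current;
        while (node != null && node.xid.equals(current.xid)) {
            node.committed = true;
            node = node.predecessor;
        }
        return node;
    }
}
```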
Before the local transaction commits, for the sake of crash recovery, the xid of the local transaction (i.e., the xid of the TCC try methods under the local transaction's scope) must be written into the log table within the same transaction, so that a single transaction guarantees consistent writes of the log data and the business-table data.
If a local transaction commits successfully and no uncommitted try-method object remains, the TCC transaction branch has succeeded on this node; if the node is the coordinating node, the TCC transaction can be considered to have succeeded in the try phase, be marked as committed, and move into the completion phase. If the local transaction commit fails, the try methods with the same xid are marked as failed. The error then propagates outward until the initial try method on the coordinating node finally fails, whereupon the TCC transaction is marked as rolled back, enters the completion phase, and the corresponding rollback methods are executed.
For try-method commit, as shown in FIG. 4: when a local transaction commits and that local transaction is under the scope of a TCC transaction, the TCC transaction's current try-method object is fetched and set to the committed state; its predecessor is then visited, and if the predecessor's xid equals that of the current try-method object, it too is set to committed. This is repeated until a predecessor's xid differs from that of the current try-method object, and that predecessor is then set as the TCC transaction's current try-method object.
Try-method failure, shown in fig. 5, follows processing logic similar to try-method commit; the two differ only in direction. When a local transaction rolls back because of an exception, and the local transaction is under the scope of a TCC transaction, the TCC transaction's current try method is fetched and marked as failed. Its predecessor is visited, and if the predecessor's xid equals that of the failed try method, the predecessor is also marked as failed. This is repeated until no predecessor is found or a predecessor's xid does not match the current try method; the current try-method pointer of the TCC transaction is then set to that predecessor.
In this embodiment, when all try methods under the scope of the TCC transaction have executed, that is, when the current try method of the TCC transaction is empty, the try phase of the TCC transaction is complete. At this point the TCC transaction should be marked as committed or rolled back, according to the outcome of the try phase, to guide the completion phase.
The decision is simple: the first try method registered with the TCC transaction is special under the transaction's scope, because it initiated the entire TCC transaction. If that try method succeeded, the TCC transaction is marked as committed; if it failed, the TCC transaction is marked as rolled back.
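This decision rule can be sketched by walking back to the predecessor-less try method and reading its flag. The class is an illustrative assumption.

```java
// Hypothetical sketch: the try method with no predecessor is the first one in
// the branch, and its success or failure alone decides the mark for the whole
// TCC transaction.
public class MarkDecision {
    public static class TryMethod {
        public final TryMethod predecessor;
        public final boolean succeeded;
        public TryMethod(TryMethod predecessor, boolean succeeded) {
            this.predecessor = predecessor;
            this.succeeded = succeeded;
        }
    }

    public static String mark(TryMethod any) {
        TryMethod first = any;
        while (first.predecessor != null) {
            first = first.predecessor; // walk back to the initiating try method
        }
        return first.succeeded ? "MARKED_COMMIT" : "MARKED_ROLLBACK";
    }
}
```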
The state change of the TCC transaction is a critical operation, so the updated TCC transaction state must be written to the log. Once this state is entered, the TCC transaction must execute the completion-phase methods until its state becomes ended. If a completion-phase method throws an exception, it is retried repeatedly until it succeeds.
In this embodiment, the method executed in the completion phase depends on the TCC transaction's state, committed or rolled back: when the state is committed, the confirm method is executed; when the state is rolled back, the rollback method is executed.
Confirmation method/TccCommit
When a TCC transaction enters the completion phase marked as committed, the corresponding confirm method must be executed for every try method registered in the TCC transaction.
The confirm methods are triggered by the framework, and each should first be wrapped in a local transaction. Since Spring is the runtime container in most Java development scenarios, this requirement can be met simply by annotating the confirm method with @Transactional.
As with the try method, when the local transaction is to commit after the confirm method finishes, the completion-phase xid of the local transaction must be written to the log first, and only then may the local transaction commit. The value of the completion-phase xid is the one assigned to the corresponding try method. Functionally, a confirm method should have the following attributes:
its arguments, which are the same as those of the corresponding try method;
the completion-phase xid, whose value is assigned by the corresponding try-method object;
the method object, as defined in the TCC operation metadata.
Rollback method/TccRollback
When a TCC transaction enters the completion phase marked as rolled back, the corresponding rollback method must be executed for every try method registered in the TCC transaction.
Like the confirm method, the rollback method is triggered by the framework and should be wrapped in a local transaction.
The rollback method and the confirm method follow essentially the same logic, differing only in direction: both are triggered by the framework, both must be wrapped in local transactions, and both must write the completion-phase xid when the local transaction commits. The attributes a rollback method should possess are the same as those of a confirm method.
The one difference is this: if the TCC transaction enters the completion phase marked as committed, all confirm methods must be executed; whereas if it enters the completion phase marked as rolled back, whether a rollback method is executed depends on whether the corresponding try method executed successfully. If the try method failed to execute, there is no intermediate data to roll back, so the corresponding rollback method need not run. To unify the logic of the confirm and rollback paths, a rollback method that need not run can simply be treated as having executed successfully.
(5) The list of remote participants under the TCC transaction scope; a remote participant may be sent a remote call instruction only after it has been added to the remote-participant list.
Since this is a distributed transaction, there are necessarily remote participants (e.g., nodes B, C, and D). A remote participant is an abstract description of, or proxy for, a remote resource. It should expose commit and rollback interfaces through which the corresponding instructions are sent to the remote party during the completion phase.
Because the same remote participant may be called many times within the scope of one try method, the remote participant should, to save resources, implement an isSame interface that determines whether a participant subsequently added to the same TCC transaction is the same as an earlier one; if so, it need not be added again, avoiding duplicate objects.
Specifically, as shown in FIG. 6, if a remote call must be made while executing a try method, the node providing the remote service is considered a remote participant of the TCC transaction. The remote participant must therefore be registered in the TCC transaction's remote-participant list before it is invoked, and registration first checks whether the participant already exists.
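The isSame-based deduplicating registration can be sketched as follows; the interface and class names are illustrative assumptions, not the embodiment's actual API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of remote-participant registration with isSame-based
// deduplication: a participant is added only if no equivalent one exists.
public class RemoteRegistry {
    public interface RemoteParticipant {
        boolean isSame(RemoteParticipant other);
    }

    /** A remote participant identified by its endpoint URL. */
    public static class Endpoint implements RemoteParticipant {
        public final String url;
        public Endpoint(String url) { this.url = url; }
        public boolean isSame(RemoteParticipant other) {
            return other instanceof Endpoint && ((Endpoint) other).url.equals(url);
        }
    }

    private final List<RemoteParticipant> participants = new ArrayList<>();

    /** Adds the participant only if no equivalent one is already registered. */
    public boolean register(RemoteParticipant p) {
        for (RemoteParticipant existing : participants) {
            if (existing.isSame(p)) return false; // duplicate: skip
        }
        participants.add(p);
        return true;
    }

    public int size() { return participants.size(); }
}
```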
The remote participant's commit and rollback methods are implemented by sending a commit or rollback instruction to the remote node via a remote call (RPC or HTTP API), carrying the xid of the TCC transaction as a parameter.
As shown in FIG. 7, when the RPC method actually executes, it must carry the information of the TCC execution instruction, and when the completion phase of the TCC transaction executes, commit/rollback instructions must be sent to the remote participants. The information every remote TCC execution instruction must carry is:
1. The xid of the TCC transaction.
2. The identifier of the propagating node.
When a remote participant receives a TCC execution instruction, before invoking the business try method it must first fetch from its TCC transaction repository the TCC transaction object corresponding to the xid in the instruction. If none exists, the transaction manager must create a participant TCC transaction object. Once the TCC transaction is established, try-method calls proceed exactly as on the management node (or coordinator), with no distinction.
Many RPC frameworks automatically retry on failure, which can put the service provider in a concurrent race when processing TCC execution instructions. To handle this contention, the TCC execution instruction carries one additional parameter: the traceId of the call. The traceId is globally unique and assigned afresh for each remote call, and the service provider can use it for anti-replay control.
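The instruction payload and the traceId-based anti-replay check can be sketched as follows; field and class names are illustrative assumptions.

```java
// Hypothetical sketch of the payload carried by every remote TCC instruction:
// the transaction xid, the propagating node's identifier, and a per-call
// traceId that the receiver uses for anti-replay control under RPC retries.
public class TccInstruction {
    public final String xid;      // global TCC transaction id
    public final String fromNode; // identifier of the propagating node
    public final String traceId;  // globally unique per remote call

    public TccInstruction(String xid, String fromNode, String traceId) {
        this.xid = xid;
        this.fromNode = fromNode;
        this.traceId = traceId;
    }
}

class ReplayGuard {
    private final java.util.Set<String> seen = new java.util.HashSet<>();

    /** Returns true the first time a traceId is seen; false on retried deliveries. */
    public synchronized boolean firstDelivery(TccInstruction in) {
        return seen.add(in.traceId);
    }
}
```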
A remote call under the TCC transaction scope cannot be executed asynchronously, because the superior node needs the result of the call (whether it raised an exception) to decide whether to execute subsequent steps and, in the completion phase, whether to take the confirm branch or the cancel branch.
Accordingly, the completion-phase methods begin once the try phase of the TCC transaction on the management node is finished. If the management node's try-phase methods executed successfully, its TCC transaction is marked as committed, and commit instructions are sent to all remote participants.
When a remote participant receives the commit instruction, it must update its TCC transaction state to marked as committed. Given that the RPC framework may retry, the TCC transaction state must be changed using CAS: only the thread that successfully updates the state via CAS may continue the subsequent flow and return a success response to the caller. A thread whose CAS fails and finds that the transaction state has already been updated cannot immediately return success.
The reason is that after the TCC transaction object's state is updated in memory, a log must be written before the subsequent business actions execute. Suppose there are two threads, A and B. Thread A's CAS succeeds and thread B's fails; on retry, B sees that the state has changed and returns a success response to the caller. Thread A then crashes before it has written the log, so the operation actually failed, and both the superior node and the local node now misjudge whether the method executed. The TCC transaction therefore needs an attribute (call it intermediateLogged) indicating whether the log write has completed after the update to the intermediate state; a retried thread may return a response to the caller only after the log write completes. In summary, the flow of executing a commit instruction on a remote participant node is shown in FIG. 8.
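The CAS-plus-intermediateLogged protocol above can be condensed into a small sketch. This is an illustration under assumed names, not the embodiment's actual classes.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of the CAS-guarded transition: only the thread whose CAS
// wins writes the log and sets intermediateLogged; a losing (retried) thread
// may acknowledge the caller only once that flag shows the log is durable.
public class CasTransition {
    public final AtomicReference<String> state = new AtomicReference<>("INITIAL");
    public volatile boolean intermediateLogged = false;

    /** Returns true if this caller may answer "success" to the superior node. */
    public boolean markCommitted(Runnable writeLog) {
        if (state.compareAndSet("INITIAL", "MARKED_COMMIT")) {
            writeLog.run();             // persist the new state before acknowledging
            intermediateLogged = true;
            return true;
        }
        // CAS lost: acknowledge only after the winner has durably logged the state.
        return intermediateLogged;
    }
}
```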
The execution and flow logic of the rollback instruction is very similar to that of the commit instruction, but a rollback instruction must handle some situations a commit instruction never encounters, so its overall flow is somewhat more complicated.
First, the remote participant may receive a duplicate rollback instruction. This happens when, after the superior node issues a rollback instruction, the remote participant's success response fails to reach the superior node because of a network fault or the like; the superior node then reissues the rollback instruction on retry, while the TCC transaction is already executing the rollback branch.
For this case, the flow of the TCC execution instruction handles the problem: the second rollback instruction to arrive does not join the CAS contention; it checks the TCC transaction state and the value of the intermediateLogged flag and then directly returns a success response to the superior node.
A second possibility is that the remote participant node receives a rollback instruction while it is still executing a try-phase method. This occurs when the participant node loses its connection during the try phase, so the superior node receives no success response; the management node then deems it necessary to roll back the entire TCC transaction and may issue a rollback instruction to the remote participant node while that node is still executing its try-phase method.
In this case, the completion-phase methods must not start while the try phase is still in progress. Whether the try phase is complete has a simple criterion: whether the flag of the first try method in the TCC transaction shows success or failure (i.e., anything other than the initial state). Only once the try phase is definitely complete may the rollback branch proceed; otherwise it must wait for the try phase to end.
Taking both cases into account, the final flow of the TCC rollback instruction is shown in FIG. 9.
Log design
In this embodiment, the log takes two forms: database logs and log files. The two carry different responsibilities. The database log serves, at crash restart, to confirm whether a try-phase or completion-phase method was executed; the log file serves, at crash restart, to reconstruct the TCC transactions.
Because the database log can confirm at restart whether a try-phase or completion-phase method executed, the business methods themselves need not implement idempotency. What the database log stores is therefore the xid from the try-method object, or the completion-phase xid from the confirm or rollback method. Its design is shown in table 1:
TABLE 1
[Table 1 is provided as an image in the original publication and is not reproduced here.]
The log file carries the responsibility for recovery at crash restart, so the critical operations in the life cycle of a TCC transaction object must be recorded in it. Log entries in the log file share an overall format, shown in Table 2:
TABLE 2
[Table 2 is provided as an image in the original publication and is not reproduced here.]
The value of schema is registered in advance. The allowed value is as follows:
schema:1
The format of the body is as follows:
TABLE 3
[Table 3 is provided as an image in the original publication and is not reproduced here.]
Registering a local TCC call:
TABLE 4
[Table 4 is provided as an image in the original publication and is not reproduced here.]
The value of entry 4 is easy to obtain. For try methods in the same local transaction, the values of entries 1 and 2 are identical and therefore cannot identify a particular try method. The value of entry 3 differs for every try method, so besides serving as the branch ID of the completion-phase xid, it can identify the object of a particular try method. Entry 4 uses this value as the pointer to the predecessor.
Registering a remote resource:
TABLE 5
[Table 5 is provided as an image in the original publication and is not reproduced here.]
Updating the transaction state:
TABLE 6
[Table 6 is provided as an image in the original publication and is not reproduced here.]
Log archiving
This embodiment adopts a general log-archiving scheme in which archiving is triggered during normal operation: sparse log entries are compacted into dense data, allowing log file space to be reused. The most important part of log archiving is the format of the archived log, shown in table 7:
TABLE 7
[Table 7 is provided as an image in the original publication and is not reproduced here.]
In this embodiment, a log based on the above design supports transaction recovery and exception handling after a crash.
Specifically, the main tasks of the completion phase are:
1. Sending the corresponding instructions to all remote participants.
2. Invoking the completion-phase methods corresponding to the local try methods.
Both tasks traverse a list and operate on each element, so they can be parallelized across multiple threads, with the main thread notified either by waiting on a CountDownLatch or by asynchronous notification.
Since multiple threads are involved, the completion flag in the confirm/rollback object should be declared with the volatile keyword, and likewise the completion flag in the remote participant.
Using CountDownLatch blocks the main thread, so asynchronous notification reduces blocking. Whether a task is local or remote, it issues its notification after its flow finishes and then traverses the completion status of all local and remote completion-phase calls; the last thread to finish sees the fully completed view. That thread performs the final update of the TCC transaction state to completed, writes the transaction's information to the cleanup log, and leaves the corresponding log-table records to be cleaned up asynchronously later.
If the completion phase fails, the TCC transaction is placed in the transaction repository and retried by a timed-task thread. For a single local participant or remote resource, the completion phase must not execute concurrently, or data errors result; hence a precondition for retrying the completion phase of a TCC transaction is that no local participant or remote resource is currently executing. To enforce this, a counter marks how many local participants and remote resources are currently executing. Since the counter is contended by multiple threads, the TCC transaction object can extend the AtomicInteger class: before the completion phase starts, it first checks whether the counter holds the idle value (e.g., -1); if not, the attempt is abandoned. The counter is then set, via a CAS update, to the number of operands currently required (the local participants plus the remote participants that have not yet succeeded); the thread whose update succeeds acquires the right to execute. Listeners are installed in the execution methods of the local and remote participants, and each decrements the counter by one when it finishes. The thread that observes the counter reach 0 checks the completion status of the local participants and remote resources in the TCC transaction: if all have completed, it updates the transaction state, writes the log file, and clears the log; if some remain incomplete, the transaction is placed back in the transaction repository's abnormal-transaction set for the next attempt.
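The counter protocol just described can be sketched as follows. This is an illustrative guard (composition instead of inheriting AtomicInteger, for clarity); all names are assumptions.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the completion-phase guard: the counter idles at -1;
// a retry thread may begin only if it can CAS -1 to the number of outstanding
// participants, and each participant listener decrements the counter on finish.
public class CompletionGuard {
    private final AtomicInteger pending = new AtomicInteger(-1);

    /** Try to acquire the right to run the completion phase for {@code n} participants. */
    public boolean tryStart(int n) {
        return pending.compareAndSet(-1, n);
    }

    /** Called by each participant listener on finish; returns true for the last one. */
    public boolean participantDone() {
        return pending.decrementAndGet() == 0;
    }

    /** Reset to idle so a later timed-task retry may run again. */
    public void reset() {
        pending.set(-1);
    }
}
```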
Since the completion phase can be retried without side effects once the TCC transaction state has been updated and marked as commit/rollback, a participant node can return a response as soon as the log file is written after receiving the commit/rollback instruction from its superior node. Executing the completion phase on an asynchronous task thread then reduces the latency of the whole link.
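The early-acknowledgement pattern might look like this minimal sketch (hypothetical names; an in-memory list stands in for the durable log): the participant appends the commit record, schedules the idempotent confirm work on an asynchronous executor, and responds before that work runs:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class EarlyAckSketch {
    // Stand-in for the participant's durable log file.
    static final List<String> log = new CopyOnWriteArrayList<>();

    // On a commit instruction: append the log record, hand the idempotent
    // confirm work to the async executor, and acknowledge immediately.
    static String onCommitInstruction(ExecutorService worker, String xid) {
        log.add("COMMIT " + xid);                         // durable log write
        worker.submit(() -> log.add("CONFIRMED " + xid)); // completion off-path
        return "OK";                                      // reply before confirm runs
    }

    // Runs one instruction end to end and reports whether the deferred
    // completion phase eventually executed.
    static boolean demo() throws InterruptedException {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        String ack = onCommitInstruction(worker, "tx-1");
        worker.shutdown();
        worker.awaitTermination(5, TimeUnit.SECONDS);
        return "OK".equals(ack) && log.contains("CONFIRMED tx-1");
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```

A crash after the log write but before the confirm runs is safe under this scheme precisely because the completion phase is retryable: recovery replays the logged commit record.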
Referring to fig. 10, a decentralized TCC transaction management device according to a second embodiment of the present invention includes:
a TCC transaction starting unit 210, configured to receive a service execution request initiated by a user and start a TCC transaction according to the service execution request;
a try method generating unit 220, configured to generate a first try method according to the service execution request; wherein the first try method includes a remote invocation of at least one participating node and a second try method of each invoked participating node;
an execution condition obtaining unit 230, configured to obtain the execution condition of each invoked participating node for the second try method;
a flag unit 240, configured to generate the current flag state of the TCC transaction according to the execution condition;
a coordination unit 250, configured to coordinate the TCC transaction into the completion phase and to perform a confirmation operation or a rollback operation according to the flag state.
The third embodiment of the present invention further provides a decentralized TCC transaction management node, which includes a memory and a processor, wherein the memory stores a computer program, and the computer program can be executed by the processor to implement the decentralized TCC transaction management method.
The fourth embodiment of the present invention also provides a decentralized TCC transaction management system, which comprises the decentralized TCC transaction management node as the coordinator and a plurality of participating nodes as participants.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention; various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (10)

1. A method for decentralized TCC transaction management, comprising:
receiving a service execution request initiated by a user, and starting a TCC transaction according to the service execution request;
generating a first try method according to the service execution request; wherein the first try method comprises a second try method of at least one invoked participating node;
acquiring the execution condition of each invoked participating node for the second try method;
generating the current flag state of the TCC transaction according to the execution condition;
coordinating the TCC transaction to enter a completion phase and performing a confirmation operation or a rollback operation based on the flag state.
2. The decentralized TCC transaction management method according to claim 1, wherein said TCC transaction has the following attributes:
the xid of the TCC transaction itself; said xid comprising: a globally unique ID for uniquely identifying the TCC transaction, and a branch ID identifying a local transaction participating in the TCC transaction; wherein the local transaction of any node participating in the TCC transaction is a branch of the global TCC transaction; when the branch ID of an xid is null, the xid marks the TCC transaction itself;
the current flag state of the TCC transaction; wherein the flag states of the TCC transaction include an initial state, a commit/rollback state, and an end state; when all try methods have executed successfully, the flag state is recorded as the commit state; when the execution of at least one try method has failed, the flag state is recorded as the rollback state;
a list of try methods under the TCC transaction scope;
the current try method under the TCC transaction scope; wherein each try method to be executed must be added to the try method list of the TCC transaction object and written into the log before it is executed;
a list of remote participants under the TCC transaction scope; wherein a remote invocation instruction may be sent to a remote participant only after the participant has been added to the remote participant list.
3. The decentralized TCC transaction management method according to claim 2, further comprising:
when a remote participating node is invoked, sending a TCC execution instruction to the remote participating node; wherein the TCC execution instruction comprises the xid, an identifier of the node, and a globally unique traceId for this invocation, used to prevent concurrent contention; after receiving the TCC execution instruction, the remote participating node judges whether a TCC transaction object corresponding to the xid in the instruction exists in its local TCC transaction repository; if so, it directly extracts and executes the corresponding second try method; if not, its transaction manager builds a TCC transaction object for this participant and executes the second try method, thereby returning the execution condition of the second try method;
sending a commit instruction to the remote participating node based on the execution condition of the second try method; after receiving the commit instruction, the remote participating node updates the TCC transaction state in CAS mode; the thread that updates successfully continues to execute the flow of the subsequent completion phase and returns a success response; a thread whose update fails returns a success response once the log write has finished.
4. The decentralized TCC transaction management method according to claim 1, further comprising a log, said log comprising a database log and a log file, the database log being used to confirm, on restart after downtime, whether a try phase method or a completion phase method was executed; wherein:
the content stored in the database log is the xid in the try method object, or the completion phase xid in the confirmation method or rollback method;
the log file is used to reconstruct TCC transactions on restart after downtime;
during the processing of the TCC transaction, related key operations are written into the log file in their order of occurrence, and each key operation is executed only after it has been written into the log file, so that after a restart from downtime the transaction can be recovered from the information recorded in the log file.
5. The decentralized TCC transaction management method according to claim 4, wherein, upon transaction recovery through said log file:
when the TCC transaction is judged to be in the initial state, checking whether the first try method succeeded;
if so, updating the state of the TCC transaction to the commit/rollback state, and placing the TCC transaction into an asynchronous thread pool to execute the flow of the corresponding completion phase;
if not, querying whether the log file has a record of the xid of the first try method;
if it does, updating the first try method as successful, otherwise updating it as failed;
updating the state of the TCC transaction according to the updated state of the first try method, and then executing the flow of the completion phase;
when the TCC transaction is in the commit/rollback state, first traversing the confirmation method/rollback method list and querying whether a completion phase xid exists in the log file, so as to update whether the corresponding confirmation method/rollback method has completed; after this confirmation, placing the TCC transaction object into an asynchronous thread pool to continue the commit or rollback process.
6. The decentralized TCC transaction management method according to claim 5, wherein the flow of the completion phase is executed in a multithreaded manner, specifically:
sending corresponding instructions to all remote participating nodes while simultaneously invoking, in a multithreaded manner, the completion phase method corresponding to the local first try method; wherein the main thread either waits on a CountDownLatch or is informed by asynchronous notification; during the multithreaded processing, the completion flags in the local and remote confirmation/rollback methods are modified with a preset keyword.
7. The decentralized TCC transaction management method according to claim 5, further comprising, for an abnormal transaction placed in the transaction repository:
taking the abnormal transaction out of the abnormal transaction set; wherein the local participating nodes and remote participating nodes currently executing are marked with an identification number; the identification number is contended by multiple threads;
before preparing to execute the completion phase, first judging whether the current identification number is the initial number;
if not, discarding the abnormal transaction;
if it is the initial number, counting the operand currently required, updating it in CAS mode, and obtaining the execution right if the update succeeds; the operand is the sum of the numbers of local participating nodes and remote participating nodes that have not yet succeeded;
placing a listener in the execution methods of the local and remote participating nodes, which decrements the operand when execution finishes;
for the thread that observes the operand reach 0, acquiring the completion conditions of the local and remote participating nodes in the current TCC transaction;
if the local and remote participating nodes have all completed, updating the state of the TCC transaction, writing the log file, and clearing the log;
if any local or remote participating node has not completed, placing the TCC transaction into the abnormal transaction set of the transaction repository again for the next attempt.
8. A decentralized TCC transaction management apparatus, comprising:
a TCC transaction starting unit, configured to receive a service execution request initiated by a user and start a TCC transaction according to the service execution request;
a try method generating unit, configured to generate a first try method according to the service execution request; wherein the first try method includes a remote invocation of at least one participating node and a second try method of each invoked participating node;
an execution condition obtaining unit, configured to obtain the execution condition of each invoked participating node for the second try method;
a flag unit, configured to generate the current flag state of the TCC transaction according to the execution condition;
and a coordination unit, configured to coordinate the TCC transaction into the completion phase and to perform a confirmation operation or a rollback operation according to the flag state.
9. A decentralized TCC transaction management node, comprising a memory and a processor, said memory having stored thereon a computer program executable by said processor for implementing the decentralized TCC transaction management method according to any one of claims 1 to 7.
10. A decentralized TCC transaction management system, comprising a decentralized TCC transaction management node according to claim 9 and a plurality of participating nodes as participants.
CN202011010261.3A 2020-09-23 2020-09-23 Decentralised TCC transaction management method, device, equipment and system Active CN112148436B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011010261.3A CN112148436B (en) 2020-09-23 2020-09-23 Decentralised TCC transaction management method, device, equipment and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011010261.3A CN112148436B (en) 2020-09-23 2020-09-23 Decentralised TCC transaction management method, device, equipment and system

Publications (2)

Publication Number Publication Date
CN112148436A true CN112148436A (en) 2020-12-29
CN112148436B CN112148436B (en) 2023-06-20

Family

ID=73896204

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011010261.3A Active CN112148436B (en) 2020-09-23 2020-09-23 Decentralised TCC transaction management method, device, equipment and system

Country Status (1)

Country Link
CN (1) CN112148436B (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5923833A (en) * 1996-03-19 1999-07-13 International Business Machines Corporation Restart and recovery of OMG-compliant transaction systems
CN106325978A (en) * 2015-06-19 2017-01-11 阿里巴巴集团控股有限公司 Distributed transaction processing method and apparatus
CN109885382A (en) * 2019-01-16 2019-06-14 深圳壹账通智能科技有限公司 The system of cross-system distributed transaction processing method and distributing real time system
CN109933412A (en) * 2019-01-28 2019-06-25 武汉慧联无限科技有限公司 Distributed transaction processing method based on distributed message middleware
CN110647385A (en) * 2019-08-23 2020-01-03 南京万米信息技术有限公司 Distributed transaction execution method, device and system


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112925614A (en) * 2021-03-04 2021-06-08 杭州网易云音乐科技有限公司 Distributed transaction processing method, device, medium and equipment
CN113535362A (en) * 2021-07-26 2021-10-22 北京计算机技术及应用研究所 Distributed scheduling system architecture and micro-service workflow scheduling method
CN115495205A (en) * 2022-11-01 2022-12-20 武汉大数据产业发展有限公司 Method and device for realizing data consistency based on distributed transaction lock
CN115495205B (en) * 2022-11-01 2023-03-14 武汉大数据产业发展有限公司 Method and device for realizing data consistency based on distributed transaction lock

Also Published As

Publication number Publication date
CN112148436B (en) 2023-06-20

Similar Documents

Publication Publication Date Title
CN108459919B (en) Distributed transaction processing method and device
CN112148436B (en) Decentralised TCC transaction management method, device, equipment and system
US10250693B2 (en) Idempotence for database transactions
US8799213B2 (en) Combining capture and apply in a distributed information sharing system
US5329626A (en) System for distributed computation processing includes dynamic assignment of predicates to define interdependencies
US8924346B2 (en) Idempotence for database transactions
US6665814B2 (en) Method and apparatus for providing serialization support for a computer system
US6529932B1 (en) Method and system for distributed transaction processing with asynchronous message delivery
JP3672208B2 (en) Hierarchical transaction processing method
US8725882B2 (en) Masking database outages from clients and applications
US5923833A (en) Restart and recovery of OMG-compliant transaction systems
EP1062569B1 (en) Isolation levels and compensating transactions in an information system
US6266698B1 (en) Logging of transaction branch information for implementing presumed nothing and other protocols
US8185499B2 (en) System and method for transactional session management
KR102072726B1 (en) Systems and methods for supporting inline delegation of middle-tier transaction logs to database
CN110196856B (en) Distributed data reading method and device
US20110246822A1 (en) Transaction participant registration with caveats
US6381617B1 (en) Multiple database client transparency system and method therefor
US11681683B2 (en) Transaction compensation for single phase resources
US20090300022A1 (en) Recording distributed transactions using probabalistic data structures
CN108845866B (en) Method and apparatus for processing distributed transactions
US6389431B1 (en) Message-efficient client transparency system and method therefor
US7970737B2 (en) Recovery administration of global transaction participants
CN111209142A (en) Cross-database transaction management method, device, equipment and storage medium
WO1993018454A1 (en) Distributed transaction processing system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 361000 one of 504, No. 18, guanri Road, phase II, software park, Xiamen, Fujian

Applicant after: XIAMEN YILIANZHONG YIHUI TECHNOLOGY CO.,LTD.

Address before: Room 504, No.18, guanri Road, phase II, software park, Xiamen City, Fujian Province, 361000

Applicant before: XIAMEN YILIANZHONG YIHUI TECHNOLOGY CO.,LTD.

GR01 Patent grant