CN103164219A - Distributed transaction processing system using multi-type replica in decentralized schema - Google Patents

Publication number
CN103164219A
Authority
CN
China
Prior art keywords
submodule
copy
write
transaction
subtransaction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2013100058578A
Other languages
Chinese (zh)
Other versions
CN103164219B (en)
Inventor
石宣化
金海
吴松
朱陈云海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN201310005857.8A
Publication of CN103164219A
Application granted
Publication of CN103164219B
Expired - Fee Related
Anticipated expiration

Abstract

The invention discloses a distributed transaction processing system using multi-type replicas in a decentralized schema. The system comprises a transaction interface module, a transaction processing module and a transaction storage module. The transaction interface module comprises an external interface sub-module and a transaction preprocessing sub-module. The transaction processing module comprises a multi-type replica sub-module, a read transaction processing sub-module, a replica group transaction state sub-module, a read request distribution sub-module, a replica repair sub-module, a multiversion concurrency control (MVCC) read sub-module, a local write transaction processing sub-module, a local write transaction Paxos replica consistency sub-module, a local write transaction commit sub-module, a global write transaction processing sub-module, a primary sub-transaction Paxos replica consistency sub-module, a secondary sub-transaction Paxos replica consistency sub-module and a global write transaction commit sub-module. The system addresses the following problems of existing systems: they are restricted to particular read/write environments, read/write availability cannot be configured according to application requirements, and global transactions depend on locking, which is costly.

Description

Distributed transaction processing system using multi-type replicas in a decentralized architecture
Technical field
The invention belongs to the technical field of distributed storage and, more specifically, relates to a distributed transaction processing system that uses multi-type replicas in a decentralized architecture.
Background art
With the development of Internet technology, the volume of data on the Internet is growing at an astonishing rate, and how to store and process data at this scale has become a research direction of the big data era. Decentralized NoSQL is a class of massive data storage systems with the following characteristics: high read/write performance, no single point of failure, high availability, and high scalability. The Cassandra system, for example, uses a column-family-oriented storage model to obtain high read/write performance, a decentralized architecture to avoid single points of failure and obtain high availability, and consistent hashing to obtain high scalability.
Existing distributed transaction processing systems in decentralized architectures include the following. Megastore is a system that Google built on top of Bigtable. It uses a special data model, EntityGroups, and relies on additional system modules, the Coordinator and replication servers, to guarantee consistency. Its commit algorithm is a variant of the Paxos algorithm and is used to keep synchronous replicas consistent across multiple data centers. However, this approach uses fixed read and write success counts, so availability cannot be tuned for different applications, and its global transactions use expensive two-phase commit, which can block. Scalaris is a distributed transaction system on a Chord# ring that uses a symmetric replication strategy and an improved Paxos atomic commit protocol; it needs three phases to complete one transaction and likewise cannot tune availability for different applications. There is also a prototype system from academic research that studies distributed transactions in a P-Ring environment and proposes an MVCC algorithm, LSTP. That work focuses on read-heavy environments, in which read-only transactions are neither aborted nor blocked, but it is not suited to write-heavy environments, so its application scenarios are rather limited.
In summary, existing distributed transaction processing systems have the following shortcomings: 1. They are restricted to particular read/write environments, and read/write availability cannot be configured according to application requirements. 2. Global transactions depend on locking, which is costly.
Summary of the invention
In view of the defects of the prior art, the object of the invention is to provide a distributed transaction processing system using multi-type replicas in a decentralized architecture. It aims to solve the problems of existing systems: they are restricted to particular read/write environments, read/write availability cannot be configured according to application requirements, and global transactions depend on locking, which is costly.
To achieve the above object, the invention provides a distributed transaction processing system using multi-type replicas in a decentralized architecture, comprising a transaction interface module, a transaction processing module, and a transaction storage module. The transaction interface module comprises an external interface submodule and a transaction preprocessing submodule. The transaction processing module comprises a multi-type replica submodule, a read transaction processing submodule, a replica group transaction status submodule, a read request distribution submodule, a replica repair submodule, an MVCC read submodule, a local write transaction processing submodule, a local write transaction Paxos replica consistency submodule, a local write transaction commit submodule, a global write transaction processing submodule, a primary sub-transaction Paxos replica consistency submodule, a secondary sub-transaction Paxos replica consistency submodule, and a global write transaction commit submodule.
The external interface submodule receives a transaction request from a client and sends it to the transaction preprocessing submodule. The transaction preprocessing submodule determines whether the request is a read transaction request or a write transaction request. A read transaction request is sent to the read transaction processing submodule. For a write transaction request, the submodule further determines whether it is a local write transaction request or a global write transaction request; a local write transaction request is sent to the local write transaction processing submodule, and a global write transaction request is sent to the global write transaction processing submodule.
The read transaction processing submodule obtains from the multi-type replica submodule, for each read instruction contained in the read transaction request, the addresses of the corresponding physical replicas and the required number of responses; sends the read transaction request, the physical replica addresses, and the required number of responses to the replica group transaction status submodule; and controls timeout and retry for the whole read transaction processing flow. The replica group transaction status submodule reads the transaction execution state of the corresponding replica group according to the physical replica addresses, in order to obtain the maximum committed-state log number and the maximum committed-state log timestamp for each read instruction. The read request distribution submodule determines whether a read instruction in the read transaction request can be executed locally. If it can, the read instruction, the maximum committed-state log number, and the maximum committed-state log timestamp are sent to the local replica repair submodule; otherwise they are sent to any one of the physical replicas corresponding to the read instruction. The replica repair submodule brings the node on which it resides up to date with the maximum committed-state log number obtained by the replica group transaction status submodule. The MVCC read submodule reads data from the transaction storage module according to the read instruction and the maximum committed-state log timestamp and returns the data to the read transaction processing submodule. The read transaction processing submodule also sends the data to the external interface submodule, which in turn sends the data to the client.
The local write transaction processing submodule obtains from the multi-type replica submodule the addresses of the witness replicas and physical replicas corresponding to the write instructions contained in the local write transaction request, together with the required number of physical replica responses; sends the local write transaction request, the witness and physical replica addresses, and the required number of physical replica responses to the local write transaction Paxos replica consistency submodule; and controls timeout and retry for the whole local write transaction processing flow. The local write transaction Paxos replica consistency submodule establishes an agreed log value on the witness replicas of the witness replica group, sends this log value to the local write transaction commit submodule, and marks the log entry as a local transaction. The local write transaction commit submodule commits the log value to the corresponding physical replicas according to the physical replica addresses and the required number of physical replica responses, and returns the success result to the local write transaction processing submodule. The local write transaction processing submodule also sends the success result to the external interface submodule, which in turn sends it to the client.
The global write transaction processing submodule obtains from the multi-type replica submodule, for each write instruction contained in the global write transaction request, the addresses of the corresponding witness replicas and physical replicas and the required number of physical replica responses; sends the global write transaction request, the witness and physical replica addresses for each write instruction, and the required number of physical replica responses to the primary sub-transaction Paxos replica consistency submodule; and controls timeout and retry for the whole global write transaction processing flow. The primary sub-transaction Paxos replica consistency submodule establishes an agreed log value on the witness replicas of the primary sub-transaction's witness replica group, sends this log value and the position information of the primary sub-transaction to the secondary sub-transaction Paxos replica consistency submodule, and marks the log entry as a global transaction. The secondary sub-transaction Paxos replica consistency submodule establishes, for every secondary sub-transaction, an agreed log value on the witness replicas of that sub-transaction's witness replica group; adds the position information of the primary sub-transaction and the global transaction mark to the log entries of all secondary sub-transactions; and sends the log value of the primary sub-transaction and the log values of all secondary sub-transactions to the global write transaction commit submodule. The global write transaction commit submodule commits the log value of the primary sub-transaction to the corresponding physical replicas according to the physical replica addresses of the primary sub-transaction and the required number of physical replica responses, and returns the success result to the global write transaction processing submodule. After that commit succeeds, it commits the log value of each secondary sub-transaction to the corresponding physical replicas according to that sub-transaction's physical replica addresses and required number of physical replica responses. The global write transaction processing submodule also sends the success result to the external interface submodule, which in turn sends it to the client.
The transaction preprocessing submodule determines the request type by reading the OPER field of the transaction request: a value of 0 denotes a read transaction request and a value of 1 denotes a write transaction request. For a write transaction request, it applies the consistent hash function to the key of each write operation contained in the request and determines the type of the write transaction request from the results. If the keys of all write operations map to the same node, the request is a local write transaction request; otherwise it is a global write transaction request.
The replica repair submodule obtains the agreed value for every log entry numbered below that log number and decides whether each entry should be committed. If the transaction type recorded in the log entry is a local transaction, the entry is considered committable once it has reached agreement among the witness replicas; otherwise a no-op is committed. If the transaction type recorded in the log entry is a global transaction, then in addition to checking whether agreement has been reached among the witness replicas, the submodule checks whether the recorded primary sub-transaction has been committed. The entry is considered committable only when the primary sub-transaction has been committed; otherwise a no-op is committed. Finally, all transactions that need to be committed are executed until they reach the completed state.
The local write transaction Paxos replica consistency submodule uses the Paxos algorithm to try to reach agreement on the log value at the same log position of every witness replica. This log value consists of the operations of the write instructions in the local write transaction request plus a timestamp, and the timestamp is greater than the maximum committed-state log timestamp that the witness replica group had before this run of the Paxos algorithm.
A global write transaction comprises two or more local write transactions, all of which carry the global transaction mark. One of them is designated the primary sub-transaction and serves as the commit point. The other local write transactions are designated secondary sub-transactions and record the position information of the primary sub-transaction, which is used for replica repair.
The primary sub-transaction Paxos replica consistency submodule uses the Paxos algorithm to try to reach agreement on the log value at the same log position of every witness replica of the primary sub-transaction. This log value consists of the operations of the write instructions in the primary sub-transaction plus a timestamp, and the timestamp is greater than the maximum committed-state log timestamp that the witness replica group had before this run of the Paxos algorithm.
The secondary sub-transaction Paxos replica consistency submodule uses the Paxos algorithm to try to reach, for each secondary sub-transaction, agreement on the log value at the same log position of every witness replica of that sub-transaction. This log value consists of the operations of the write instructions in the secondary sub-transaction plus a timestamp, and the timestamp is greater than the maximum committed-state log timestamp that the witness replica group had before this run of the Paxos algorithm.
Compared with the prior art, the above technical scheme conceived by the invention has the following beneficial effects:
(1) Separated read and write nodes
Because the multi-type replica submodule, the local write transaction Paxos replica consistency submodule, the local write transaction commit submodule, the primary sub-transaction Paxos replica consistency submodule, the secondary sub-transaction Paxos replica consistency submodule, and the global write transaction commit submodule are adopted, the nodes that process the log are separated from the nodes that process the data, which improves configurability.
(2) Configurable read/write availability level for distributed transactions
Because the multi-type replica submodule is adopted, the number of physical replicas required for reads and for writes can be set, so the read/write availability level can be tuned while consistency is still guaranteed.
(3) Lock-free global transactions
Because the primary sub-transaction Paxos replica consistency submodule, the secondary sub-transaction Paxos replica consistency submodule, and the global write transaction commit submodule are adopted, the primary sub-transaction can be used as the commit point, which avoids locking.
(4) Strong scalability
Because consistent hashing is adopted in the decentralized architecture, the system is highly scalable. When the data scale grows, the system can easily be scaled out by adding nodes: a new node only needs a token value to be set and can then join the server cluster by itself. Overall performance grows nearly linearly with scale.
(5) High reliability
The replication mechanism in the system provides data reliability: the same data can be stored on multiple nodes at the same time, so data is not lost when a node fails. Through configuration, replication can also be raised to the data-center level, providing disaster-level fault tolerance.
Brief description of the drawings
Fig. 1 is the interconnection topology of the distributed transaction processing system based on multi-type replicas in a decentralized architecture.
Fig. 2 is the architecture diagram of the distributed transaction processing system based on multi-type replicas in a decentralized architecture.
Detailed description of the embodiments
To make the object, technical scheme, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only serve to explain the invention and are not intended to limit it.
As shown in Fig. 1, the distributed transaction processing system using multi-type replicas in a decentralized architecture is applied in a distributed system that comprises a client and multiple server-side nodes A, B, and C, and is deployed on the nodes themselves. The client responds to user requests and interacts with nodes A, B, and C to submit transactions and obtain results. Nodes A, B, and C store data, respond to read and write transactions, and so on, and are interconnected by a high-speed network. The distributed system uses a distributed hash table as its bottom layer and processes data in key-value form. Each node maps a key to a token with the consistent hash function and determines the storage node of the key-value pair from this token; each node is responsible for storing the key-value pairs in a certain range.
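The key-to-token-to-node mapping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `TokenRing`, the use of MD5, the 32-bit token space, and the token values are all hypothetical, since the text only specifies "a consistent hash function" and "a token".

```python
import bisect
import hashlib


class TokenRing:
    """Toy consistent-hash ring: a node owns the key range that ends at its token."""

    def __init__(self, node_tokens):
        # node_tokens: dict of node name -> integer token (hypothetical values).
        self._ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def token_for(key, space=2**32):
        # MD5 is an assumption; any stable hash would serve for this sketch.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % space

    def node_for(self, key):
        token = self.token_for(key)
        # First node whose token is >= the key's token, wrapping around the ring.
        index = bisect.bisect_left(self._tokens, token) % len(self._ring)
        return self._ring[index][1]


# Three nodes, as in Fig. 1, spread evenly over the token space.
ring = TokenRing({"A": 2**32 // 3, "B": 2 * 2**32 // 3, "C": 2**32 - 1})
owner = ring.node_for("user:42")  # one of "A", "B", "C"
```

Adding a node only inserts one more token into the sorted ring, which is why the text can say that a new node needs nothing beyond a token value to join.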
As shown in Fig. 2, the distributed transaction processing system using multi-type replicas in a decentralized architecture comprises a transaction interface module 1, a transaction processing module 2, and a transaction storage module 3.
The transaction interface module 1 comprises an external interface submodule 101 and a transaction preprocessing submodule 102.
The transaction processing module 2 comprises a multi-type replica submodule 201, a read transaction processing submodule 202, a replica group transaction status submodule 203, a read request distribution submodule 204, a replica repair submodule 205, an MVCC read submodule 206, a local write transaction processing submodule 207, a local write transaction Paxos replica consistency submodule 208, a local write transaction commit submodule 209, a global write transaction processing submodule 210, a primary sub-transaction Paxos replica consistency submodule 211, a secondary sub-transaction Paxos replica consistency submodule 212, and a global write transaction commit submodule 213.
The external interface submodule 101 receives a transaction request from the client and sends it to the transaction preprocessing submodule 102.
The transaction preprocessing submodule 102 determines whether the transaction request is a read transaction request or a write transaction request. A read transaction request is sent to the read transaction processing submodule 202. For a write transaction request, the submodule further determines whether it is a local write transaction request or a global write transaction request; a local write transaction request is sent to the local write transaction processing submodule 207, and a global write transaction request is sent to the global write transaction processing submodule 210. Specifically, the type is determined by reading the OPER field of the transaction request: a value of 0 denotes a read transaction request and a value of 1 denotes a write transaction request. For a write transaction request, the consistent hash function is applied to the key of each write operation contained in the request, and the type of the write transaction request is determined from the results. If the keys of all write operations map to the same node, the request is a local write transaction request; otherwise it is a global write transaction request.
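The classification rule above is small enough to sketch directly. The request layout (a dict with an `OPER` field and a list of operations under `ops`) and the `node_for` callback are assumptions made for illustration; only the `OPER` values 0 and 1 and the "all keys map to the same node" rule come from the text.

```python
from enum import Enum


class TxnKind(Enum):
    READ = "read"
    LOCAL_WRITE = "local_write"
    GLOBAL_WRITE = "global_write"


def classify(request, node_for):
    """Classify a request dict with 'OPER' (0 = read, 1 = write) and 'ops'.

    node_for is any function mapping a key to its owning node, for example a
    consistent-hash lookup; it is passed in so the sketch stays self-contained.
    """
    if request["OPER"] == 0:
        return TxnKind.READ
    # A write is local only if every written key maps to the same node.
    owners = {node_for(op["key"]) for op in request["ops"]}
    return TxnKind.LOCAL_WRITE if len(owners) == 1 else TxnKind.GLOBAL_WRITE


# Stand-in ownership function (hypothetical): keys below "m" live on node A.
def toy_owner(key):
    return "A" if key < "m" else "B"


kind = classify({"OPER": 1, "ops": [{"key": "apple"}, {"key": "zebra"}]}, toy_owner)
# kind is TxnKind.GLOBAL_WRITE because the two keys map to different nodes.
```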
The read transaction processing submodule 202 obtains from the multi-type replica submodule 201, for each read instruction contained in the read transaction request, the addresses of the corresponding physical replicas and the required number of responses R (R is a positive integer); sends the read transaction request, the physical replica addresses, and the required number of responses to the replica group transaction status submodule 203; and controls timeout and retry for the whole read transaction processing flow.
The replica group transaction status submodule 203 reads the transaction execution state of the corresponding replica group according to the physical replica addresses, in order to obtain the maximum committed-state log number and the maximum committed-state log timestamp for each read instruction. Specifically, it maintains the consistency of the logs on the nodes of the replica group. Each log entry has a monotonically increasing number and timestamp and is in one of several states: pending, committed, or completed. Each node records the maximum completed-state log number, the maximum committed-state log number, and the maximum committed-state log timestamp known to it, and the result obtained is the maximum over at least R successful responses.
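Taking the maximum over at least R successful responses can be sketched as below. The `ReplicaState` record and the convention that a failed or timed-out replica is represented by `None` are assumptions for illustration; raising `TimeoutError` stands in for the timeout-and-retry path that the read transaction processing submodule controls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaState:
    max_completed_no: int   # highest log number in the completed state
    max_committed_no: int   # highest log number in the committed state
    max_committed_ts: int   # timestamp of the newest committed entry


def group_committed_state(responses, required):
    """Combine per-replica states; None marks a replica that did not answer."""
    ok = [r for r in responses if r is not None]
    if len(ok) < required:
        raise TimeoutError(f"only {len(ok)} of {required} required responses")
    return (
        max(r.max_committed_no for r in ok),
        max(r.max_committed_ts for r in ok),
    )


# Two of three replicas answer; with R = 2 the read can proceed.
states = [ReplicaState(3, 5, 100), None, ReplicaState(4, 7, 140)]
max_no, max_ts = group_committed_state(states, required=2)
```

Raising R makes reads stricter (more replicas must answer), while lowering it trades that strictness for availability, which is the tuning knob the summary refers to.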
The read request distribution submodule 204 determines whether a read instruction in the read transaction request can be executed locally. If it can, the read instruction, the maximum committed-state log number, and the maximum committed-state log timestamp are sent to the local replica repair submodule 205; otherwise they are sent to any one of the physical replicas corresponding to the read instruction.
The replica repair submodule 205 brings the node on which it resides up to date with the maximum committed-state log number obtained by the replica group transaction status submodule 203. Specifically, it obtains the agreed value for every log entry numbered below that log number and decides whether each entry should be committed. If the transaction type recorded in the log entry is a local transaction, the entry is considered committable once it has reached agreement among the witness replicas; otherwise a no-op is committed. If the transaction type recorded in the log entry is a global transaction, then in addition to checking whether agreement has been reached among the witness replicas, the submodule checks whether the recorded primary sub-transaction has been committed. The entry is considered committable only when the primary sub-transaction has been committed; otherwise a no-op is committed. Finally, all transactions that need to be committed are executed until they reach the completed state.
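The per-entry decision made during replica repair can be written as a small pure function. The entry layout (a dict with `kind` and `value`) and the `NOOP` marker are hypothetical; the two inputs `agreed_in_witnesses` and `primary_committed` stand for checks that a real system would perform against the witness replicas and the primary sub-transaction's log.

```python
NOOP = "NOOP"  # hypothetical marker for the empty operation committed in place of an entry


def resolve_entry(entry, agreed_in_witnesses, primary_committed=False):
    """Return the value to commit for one log entry during repair.

    entry: dict with 'kind' ("local" or "global") and 'value'.
    agreed_in_witnesses: whether the witness replicas reached agreement on it.
    primary_committed: for global entries, whether the primary sub-transaction
    recorded in the entry has been committed.
    """
    if not agreed_in_witnesses:
        return NOOP
    if entry["kind"] == "local":
        return entry["value"]
    # A global entry is committable only once its commit point, the primary
    # sub-transaction, has been committed.
    return entry["value"] if primary_committed else NOOP


local = {"kind": "local", "value": "put k1=v1"}
secondary = {"kind": "global", "value": "put k2=v2"}
results = [
    resolve_entry(local, agreed_in_witnesses=True),
    resolve_entry(secondary, agreed_in_witnesses=True, primary_committed=False),
    resolve_entry(secondary, agreed_in_witnesses=True, primary_committed=True),
]
```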
The MVCC read submodule 206 reads data from the transaction storage module 3 according to the read instruction and the maximum committed-state log timestamp, and returns the data to the read transaction processing submodule 202.
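A timestamp-based MVCC read of this kind returns, for a key, the newest version whose timestamp does not exceed the snapshot timestamp. The sketch below is an in-memory stand-in for the transaction storage module; the class name `VersionedStore` and its layout are assumptions, not details from the text.

```python
import bisect


class VersionedStore:
    """Keeps every version of a key, ordered by commit timestamp."""

    def __init__(self):
        self._versions = {}  # key -> (sorted timestamps, values in the same order)

    def write(self, key, timestamp, value):
        stamps, values = self._versions.setdefault(key, ([], []))
        index = bisect.bisect_right(stamps, timestamp)
        stamps.insert(index, timestamp)
        values.insert(index, value)

    def read(self, key, snapshot_ts):
        """Newest value with timestamp <= snapshot_ts, or None if there is none."""
        stamps, values = self._versions.get(key, ([], []))
        index = bisect.bisect_right(stamps, snapshot_ts)
        return values[index - 1] if index else None


store = VersionedStore()
store.write("x", 10, "a")
store.write("x", 20, "b")
value = store.read("x", snapshot_ts=15)  # sees the version written at 10
```

Because readers pick a version by timestamp instead of waiting for writers, a read at the maximum committed-state timestamp never blocks on an in-flight write.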
The read transaction processing submodule 202 also sends the data to the external interface submodule 101.
The external interface submodule 101 also sends the data to the client.
The local write transaction processing submodule 207 obtains from the multi-type replica submodule 201 the addresses of the witness replicas and physical replicas corresponding to the write instructions contained in the local write transaction request, together with the required number of physical replica responses W (W is a positive integer); sends the local write transaction request, the witness and physical replica addresses, and the required number of physical replica responses to the local write transaction Paxos replica consistency submodule 208; and controls timeout and retry for the whole local write transaction processing flow.
The local write transaction Paxos replica consistency submodule 208 establishes an agreed log value on the witness replicas of the witness replica group, sends this log value to the local write transaction commit submodule 209, and marks the log entry as a local transaction. Specifically, it uses the Paxos algorithm to try to reach agreement on the log value at the same log position of every witness replica. This log value consists of the operations of the write instructions in the local write transaction request plus a timestamp, and the timestamp is greater than the maximum committed-state log timestamp that the witness replica group had before this run of the Paxos algorithm.
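Agreement on one log position can be illustrated with textbook single-decree Paxos, run synchronously in one process. This is a generic sketch, not the variant used by the invention: the names `Acceptor`, `make_log_value`, and `propose` are hypothetical, each `Acceptor` stands for one witness replica's state for a single log position, and networking, failures, and retries are left out.

```python
class Acceptor:
    """One witness replica's Paxos state for a single log position."""

    def __init__(self):
        self.promised = -1     # highest ballot promised so far
        self.accepted = None   # (ballot, value) last accepted, if any

    def prepare(self, ballot):
        if ballot > self.promised:
            self.promised = ballot
            return True, self.accepted
        return False, None

    def accept(self, ballot, value):
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return True
        return False


def make_log_value(write_ops, max_committed_ts, clock_ts):
    # The timestamp must exceed the group's maximum committed-state timestamp.
    return {"ops": list(write_ops), "ts": max(clock_ts, max_committed_ts + 1)}


def propose(acceptors, ballot, value):
    """Return the value chosen for this log position, or None without a majority."""
    majority = len(acceptors) // 2 + 1
    replies = [a.prepare(ballot) for a in acceptors]
    granted = [prev for ok, prev in replies if ok]
    if len(granted) < majority:
        return None
    prior = [p for p in granted if p is not None]
    if prior:
        # A value may already be chosen; adopt the one with the highest ballot.
        value = max(prior, key=lambda p: p[0])[1]
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= majority else None


witnesses = [Acceptor() for _ in range(3)]
first = make_log_value([("put", "k", "v")], max_committed_ts=100, clock_ts=90)
chosen = propose(witnesses, ballot=1, value=first)
```

A later proposer with a higher ballot and a different value still obtains `first`, which is the property that lets every witness replica hold the same value at the same log position.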
The local write transaction commit submodule 209 commits the log value to the corresponding physical replicas according to the physical replica addresses and the required number of physical replica responses W, and returns the success result to the local write transaction processing submodule 207.
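Committing an agreed log value to physical replicas and counting acknowledgements against W might look like the sketch below. The `PhysicalReplica` class, its `apply` method, and the use of `ConnectionError` to model an unreachable node are all assumptions for illustration.

```python
class PhysicalReplica:
    """In-memory stand-in for a node that stores the actual data."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.log = []

    def apply(self, log_value):
        if not self.reachable:
            raise ConnectionError(f"{self.name} is unreachable")
        self.log.append(log_value)


def commit_to_replicas(replicas, log_value, required_acks):
    """Send the log value to every replica; succeed once W of them acknowledge."""
    acks = 0
    for replica in replicas:
        try:
            replica.apply(log_value)
            acks += 1
        except ConnectionError:
            continue  # a failed replica is expected to catch up through repair
    return acks >= required_acks


replicas = [PhysicalReplica("A"), PhysicalReplica("B", reachable=False), PhysicalReplica("C")]
committed = commit_to_replicas(replicas, {"ops": ["put k=v"], "ts": 101}, required_acks=2)
```

With W = 2 the write succeeds despite one unreachable replica; with W = 3 the same call would report failure.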
The part is write issued transaction submodule 207 and is also sent to external interface submodule 101 for processing successful result.
External interface submodule 101 also is used for processing successful result and sends to client.
Global write transaction management submodule 212 be used for from polymorphic type copy submodule 201 obtain the address of witness copy corresponding to each write command that these global write transactions requests comprise and hard copy and needs hard copy response quantity V(wherein V be positive integer), and the response quantity of the address of the witness copy that global write transactions requests, each write command is corresponding and hard copy and the hard copy that needs sends to main subtransaction Paxos copy consistency submodule 211, and controls the overtime retry of whole global write transactions requests treatment scheme.Particularly, the global write affairs comprise that two or more parts writes affairs, all can be coupled with the global transaction mark, one of them can be designated as main subtransaction, and be used as and submit to point to use, other parts are write affairs and are designated as secondary subtransaction, will record the positional information of main subtransaction, are used for the copy reparation.
Main subtransaction Paxos copy consistency submodule 211 is for consistent daily record value on witness copy in the witness replica group that main subtransaction is set, the positional information of this daily record value and main subtransaction is sent to secondary subtransaction submits submodule 214 to, and add the global transaction mark for this journal entry; Particularly, to use the Paxos algorithm, trial reaches the consistent of daily record value on the same daily record position of each witness copy of main subtransaction, this daily record value adds timestamp for the operation of write command in this main subtransaction, and before this timestamp is carried out greater than this Paxos algorithm, the maximum of this witness replica group is submitted attitude daily record timestamp to.
Secondary subtransaction Paxos copy consistency submodule 212 is for consistent daily record value on witness copy in the witness replica group that all secondary subtransactions is arranged this secondary subtransaction, add positional information and the global transaction mark of main subtransaction for the journal entry of all secondary subtransactions, and the daily record value of main subtransaction and the daily record value of all secondary subtransactions are sent to main subtransaction submission submodule 215; Particularly, to use the Paxos algorithm, to each secondary subtransaction, trial reaches the consistent of daily record value on the same daily record position of each witness copy of secondary subtransaction, this daily record value for this reason in secondary subtransaction the operation of write command add timestamp, before this timestamp is carried out greater than this Paxos algorithm, the maximum of this witness replica group is submitted attitude daily record timestamp to.
The global write transaction commit submodule 213 commits the log value of the main subtransaction to the corresponding hard replicas according to the addresses of the main subtransaction's hard replicas and the required number V of hard-replica responses, and returns the successful result to the global write transaction management submodule 212. After this succeeds, for each secondary subtransaction it commits that subtransaction's log value to the corresponding hard replicas according to the addresses of that subtransaction's hard replicas and the required number V of hard-replica responses.
The global write transaction management submodule 212 also sends the successful result to the external interface submodule 101.
The external interface submodule 101 also sends the successful result to the client.
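The commit order of a global write transaction (main subtransaction first, as the commit point, then each secondary subtransaction) can be sketched in Python as follows. The `send` callback, the dictionary fields, and the synchronous loop are hypothetical; the sketch assumes a commit counts as successful once at least V hard replicas (hard copies) acknowledge it.

```python
def commit_to_hard_replicas(addresses, log_value, required_acks, send):
    """Send log_value to every hard replica; succeed if at least V acknowledge."""
    acks = sum(1 for addr in addresses if send(addr, log_value))
    return acks >= required_acks


def commit_global_write(main, secondaries, send):
    """Commit the main subtransaction first, then the secondary ones.

    `main` and each item of `secondaries` are dicts with the hypothetical
    keys "addresses", "log_value" and "v" (required number of responses).
    """
    # The main subtransaction is the commit point: if it does not reach
    # V acknowledgements, nothing is sent for the secondary subtransactions.
    if not commit_to_hard_replicas(main["addresses"], main["log_value"],
                                   main["v"], send):
        return False

    # The transaction now counts as committed; secondary log values follow.
    for sub in secondaries:
        commit_to_hard_replicas(sub["addresses"], sub["log_value"],
                                sub["v"], send)
    return True
```

Because the outcome depends only on the main subtransaction, a secondary commit that falls short here would be left to replica repair, which checks the recorded main subtransaction.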
Example:
To verify the feasibility and effectiveness of the invention, the distributed transaction processing system using multi-type replicas in a decentralized architecture was configured and tested in a real environment.
The basic hardware and software configuration of the servers is shown in Table 1:
[Table 1: server hardware and software configuration; supplied as image BDA00002714392800111 in the original publication]
The present invention effectively combines multi-type replicas with distributed transaction processing in a decentralized architecture. The decentralized architecture provides strong scalability; the replica mechanism improves the reliability and recoverability of data and raises data availability; and the system provides strongly consistent distributed transactions. By using multi-type replicas, the system separates the physical nodes that serve reads from those that serve writes, which effectively reduces the impact of node failures on the availability of read transactions and write transactions. This is significant for distributed transactions whose read and write availability must be configured to suit different application scenarios, and it gives the system considerable application potential.
Those skilled in the art will readily understand that the above is only a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (7)

1. A distributed transaction processing system using multi-type replicas in a decentralized architecture, comprising a transaction interface module, a transaction processing module, and a transaction storage module, wherein the transaction interface module comprises an external interface submodule and a transaction preprocessing submodule, and the transaction processing module comprises a multi-type replica submodule, a read transaction processing module, a replica group transaction state submodule, a read request distribution submodule, a replica repair submodule, an MVCC read submodule, a local write transaction processing submodule, a local write transaction Paxos replica consistency submodule, a local write transaction commit submodule, a global write transaction processing submodule, a main subtransaction Paxos replica consistency submodule, a secondary subtransaction Paxos replica consistency submodule, and a global write transaction commit submodule, characterized in that:
the external interface submodule is configured to receive a transaction request from a client and to send the transaction request to the transaction preprocessing submodule;
the transaction preprocessing submodule is configured to determine whether the transaction request is a read transaction request or a write transaction request; if it is a read transaction request, to send it to the read transaction processing submodule; if it is a write transaction request, to further determine whether it is a local write transaction request or a global write transaction request, sending a local write transaction request to the local write transaction processing submodule and a global write transaction request to the global write transaction processing submodule;
the read transaction processing submodule is configured to obtain from the multi-type replica submodule the addresses of the hard replicas corresponding to each read instruction contained in the read transaction request and the required number of responses, to send the read transaction request, the hard-replica addresses, and the required number of responses to the replica group transaction state submodule, and to control timeout retries for the entire read transaction request processing flow;
the replica group transaction state submodule is configured to read, according to the hard-replica addresses, the transaction execution state in the corresponding replica group, so as to obtain the maximum committed-state log number and the maximum committed-state log timestamp corresponding to each read instruction;
the read request distribution submodule is configured to determine whether a read instruction in the read transaction request can be executed locally; if so, to send the read instruction, the maximum committed-state log number, and the maximum committed-state log timestamp to the local replica repair submodule; otherwise, to send them to any one of the hard replicas corresponding to the read instruction;
the replica repair submodule is configured to bring the node on which it resides up to date to the maximum committed-state log number obtained by the replica group transaction state submodule;
the MVCC read submodule is configured to read data from the transaction storage module according to the read instruction and the maximum committed-state log timestamp, and to return the data to the read transaction processing module;
the read transaction processing module is further configured to send the data to the external interface submodule;
the external interface submodule is further configured to send the data to the client;
the local write transaction processing submodule is configured to obtain from the multi-type replica submodule the addresses of the witness replicas and hard replicas corresponding to the write instructions contained in the local write transaction request and the required number of hard-replica responses, to send the local write transaction request, the addresses of the witness replicas and hard replicas, and the required number of hard-replica responses to the local write transaction Paxos replica consistency submodule, and to control timeout retries for the entire local write transaction request processing flow;
the local write transaction Paxos replica consistency submodule is configured to establish a consistent log value on the witness replicas in the witness replica group, to send this log value to the local write transaction commit submodule, and to add a local transaction mark to the log entry;
the local write transaction commit submodule is configured to commit the log value to the corresponding hard replicas according to the hard-replica addresses and the required number of hard-replica responses, and to return the successful result to the local write transaction processing submodule;
the local write transaction processing submodule is further configured to send the successful result to the external interface submodule;
the external interface submodule is further configured to send the successful result to the client;
the global write transaction management submodule is configured to obtain from the multi-type replica submodule, for each write instruction contained in the global write transaction request, the addresses of the corresponding witness replicas and hard replicas and the required number of hard-replica responses, to send the global write transaction request, these addresses, and the required number of hard-replica responses to the main subtransaction Paxos replica consistency submodule, and to control timeout retries for the entire global write transaction request processing flow;
the main subtransaction Paxos replica consistency submodule is configured to establish a consistent log value on the witness replicas in the witness replica group of the main subtransaction, to send this log value and the position information of the main subtransaction to the secondary subtransaction commit submodule, and to add a global transaction mark to the log entry;
the secondary subtransaction Paxos replica consistency submodule is configured to establish, for every secondary subtransaction, a consistent log value on the witness replicas in that secondary subtransaction's witness replica group, to add the position information of the main subtransaction and a global transaction mark to the log entries of all secondary subtransactions, and to send the log value of the main subtransaction and the log values of all secondary subtransactions to the main subtransaction commit submodule;
the global write transaction commit submodule is configured to commit the log value of the main subtransaction to the corresponding hard replicas according to the addresses of the main subtransaction's hard replicas and the required number of hard-replica responses, to return the successful result to the global write transaction management submodule, and, after this succeeds, to commit the log value of each secondary subtransaction to the corresponding hard replicas according to the addresses of that secondary subtransaction's hard replicas and the required number of hard-replica responses;
the global write transaction management submodule is further configured to send the successful result to the external interface submodule;
the external interface submodule is further configured to send the successful result to the client.
2. The distributed transaction processing system according to claim 1, characterized in that the transaction preprocessing submodule determines the type of a transaction request by reading the OPER field of the request, the value of which indicates either a read transaction request or a write transaction request; for a write transaction request, it applies a consistent hash function to the key of each write operation contained in the request and determines the type of the write transaction request from the results: if the keys of all write operations map to the same node, the request is a local write transaction request; otherwise it is a global write transaction request.
3. The distributed transaction processing system according to claim 1, characterized in that the replica repair submodule obtains the agreed value of every log entry whose number is less than that log number and determines whether each entry needs to be committed: if the transaction type recorded in the log entry is a local transaction, the entry is considered committable once it has reached agreement on the witness replicas, and otherwise a no-op is committed; if the transaction type recorded in the log entry is a global transaction, then in addition to checking whether agreement has been reached on the witness replicas, it also checks whether the recorded main subtransaction has been committed, the entry being considered committable only when the main subtransaction has been committed, and otherwise a no-op is committed; finally, all transactions that need to be committed are executed to the completed state.
4. The distributed transaction processing system according to claim 1, characterized in that the local write transaction Paxos replica consistency submodule uses the Paxos algorithm to try to reach agreement on the log value at the same log position on every witness replica, the log value consisting of the operations of the write instructions in the local write transaction request plus a timestamp, this timestamp being greater than the maximum committed-state log timestamp that the witness replica group had before this run of the Paxos algorithm.
5. The distributed transaction processing system according to claim 1, characterized in that a global write transaction comprises two or more local write transactions, each of which is tagged with a global transaction mark; one of them is designated the main subtransaction and serves as the commit point, and the other local write transactions are designated secondary subtransactions, each of which records the position information of the main subtransaction for use in replica repair.
6. The distributed transaction processing system according to claim 1, characterized in that the main subtransaction Paxos replica consistency submodule uses the Paxos algorithm to try to reach agreement on the log value at the same log position on every witness replica of the main subtransaction, the log value consisting of the operations of the write instructions in the main subtransaction plus a timestamp, this timestamp being greater than the maximum committed-state log timestamp that the witness replica group had before this run of the Paxos algorithm.
7. The distributed transaction processing system according to claim 1, characterized in that the secondary subtransaction Paxos replica consistency submodule uses the Paxos algorithm, for each secondary subtransaction, to try to reach agreement on the log value at the same log position on every witness replica of that secondary subtransaction, the log value consisting of the operations of the write instructions in the secondary subtransaction plus a timestamp, this timestamp being greater than the maximum committed-state log timestamp that the witness replica group had before this run of the Paxos algorithm.
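For readers who want a concrete picture of the repair rule in claim 3, the following Python sketch shows the commit-or-no-op decision for a single log entry. It is an illustration only and not part of the claims: the dictionary layout, the `main_committed` callback, and the string results are hypothetical.

```python
COMMIT = "commit"
NOOP = "noop"


def repair_decision(entry, agreed_on_witnesses, main_committed=None):
    """Decide what replica repair applies for one log entry.

    entry               -- dict with a "type" of "local" or "global" (assumed layout)
    agreed_on_witnesses -- True if the entry reached agreement on the witness replicas
    main_committed      -- callable(entry) -> bool, telling whether the main
                           subtransaction recorded in a global entry has committed
    """
    if not agreed_on_witnesses:
        # No agreed value on the witness replicas: fill the slot with a no-op.
        return NOOP
    if entry["type"] == "local":
        # A local transaction is committable as soon as agreement is reached.
        return COMMIT
    # A global transaction additionally depends on its main subtransaction.
    return COMMIT if main_committed(entry) else NOOP
```

A repair pass would apply this decision to every entry below the target log number in order and then execute the committable ones to the completed state.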
CN201310005857.8A 2013-01-08 2013-01-08 Distributed transaction processing system using multi-type replica in decentralized schema Expired - Fee Related CN103164219B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310005857.8A CN103164219B (en) 2013-01-08 2013-01-08 Distributed transaction processing system using multi-type replica in decentralized schema

Publications (2)

Publication Number Publication Date
CN103164219A true CN103164219A (en) 2013-06-19
CN103164219B CN103164219B (en) 2015-09-23

Family

ID=48587340

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310005857.8A Expired - Fee Related CN103164219B (en) 2013-01-08 2013-01-08 Distributed transaction processing system using multi-type replica in decentralized schema

Country Status (1)

Country Link
CN (1) CN103164219B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20150923