CN107832121B - Concurrency control method applied to distributed serial long transactions - Google Patents

Info

Publication number
CN107832121B
Authority
CN
China
Prior art keywords
current
timestamp
time stamp
data
distributed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711085717.0A
Other languages
Chinese (zh)
Other versions
CN107832121A (en)
Inventor
王宏志
赵志强
王刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hit Big Data Harbin Intelligent Technology Co ltd
Original Assignee
Hit Big Data Harbin Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hit Big Data Harbin Intelligent Technology Co ltd
Priority to CN201711085717.0A
Publication of CN107832121A
Application granted
Publication of CN107832121B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/466 Transaction processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1438 Restarting or rejuvenating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process

Abstract

The invention relates to a concurrency control method for distributed serial long transactions. Timestamp information and the data of each data version are stored separately, so that during transaction execution only the timestamp storage area is read and compared with the maximum timestamp sequence, without reading the version data itself. In addition, the maximum timestamp sequence is stored in a distributed manner: when the timestamp of the thread executing the current transaction request is compared with the maximum timestamp, only the segment of the maximum timestamp sequence whose ID matches that of the current timestamp needs to be read, rather than the whole sequence, which improves the execution efficiency of distributed transactions.

Description

Concurrency control method applied to distributed serial long transactions
Technical Field
The invention relates to the technical field of computer application, in particular to a concurrency control method applied to distributed serial long transactions.
Background
Distributed transaction processing is widely used in finance, transportation, insurance, electronic commerce and other fields. Serial long transactions, also called long-running transactions, are database transactions that take a relatively long time to complete, typically an hour, a day or even longer. Concurrency control is one of the core technologies of transaction processing: the database schedules concurrent transactions so that they do not interfere with one another and leave the data inconsistent. The main concurrency control strategies for distributed transactions are locking and timestamps. Under the timestamp strategy, the system assigns each transaction a unique timestamp when it is created, and a transaction that starts later receives a larger timestamp. When a transaction conflicts with a transaction that has a larger timestamp, it is terminated, so timestamp-based concurrency control cannot deadlock.
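The timestamp strategy described above can be illustrated with a minimal sketch. The names `TimestampOracle` and `transaction_to_abort` are hypothetical helpers introduced only for illustration, and a simple in-process counter stands in for whatever clock a real system would use.

```python
import itertools


class TimestampOracle:
    """Hands out unique, strictly increasing timestamps (hypothetical helper)."""

    def __init__(self):
        self._counter = itertools.count(1)

    def begin(self):
        # A transaction that starts later always receives a larger timestamp.
        return next(self._counter)


def transaction_to_abort(ts_a, ts_b):
    """On a conflict, the transaction with the smaller timestamp is terminated."""
    return min(ts_a, ts_b)


oracle = TimestampOracle()
t_early = oracle.begin()
t_late = oracle.begin()
victim = transaction_to_abort(t_early, t_late)
```

Because every conflict is resolved immediately by terminating one side, no transaction ever waits on another, which is why this style of scheme cannot deadlock.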
The prior art provides a timestamp management method that introduces a maximum timestamp vector $T_R = \langle t_1, t_2, t_3, \ldots, t_n \rangle$, where the element $t_i$ is the latest (that is, the largest) timestamp corresponding to thread $i$ of the transaction, $i$ indexes the threads, and every timestamp $t_i$ in $T_R$ is unique.
During transaction processing, the transaction version number, timestamp and modification flag bit are stored separately, as a prefix, in a header buffer (Header-buffer), and the data corresponding to each version is stored in a data buffer (Data-buffer). To verify whether the data of the current version has changed, only the header buffer needs to be accessed to compare the timestamp and version number. If the current version has changed, the data corresponding to the current version number is accessed next; if it has not changed, the data corresponding to the previous version number is accessed. There is no need to query and compare the data of all versions. This greatly reduces the amount of data queried during transaction processing and improves efficiency without affecting the accuracy of the query.
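A minimal sketch of the header-buffer and data-buffer split follows. The field names, the in-memory list and dict, and the helper `read_latest` are assumptions made for illustration; the source does not specify a layout.

```python
from dataclasses import dataclass


@dataclass
class Header:
    version: int     # version number
    timestamp: int   # timestamp of the write that produced this version
    modified: bool   # modification flag bit


# Header buffer: small prefix entries, one per version (assumed layout).
header_buffer = [
    Header(version=1, timestamp=100, modified=False),
    Header(version=2, timestamp=140, modified=True),
]
# Data buffer: the potentially large payload of each version.
data_buffer = {1: b"payload-v1", 2: b"payload-v2"}


def read_latest(headers, data):
    """Decide from headers alone which version to use, then fetch one payload."""
    current = headers[-1]
    if current.modified or len(headers) == 1:
        chosen = current.version        # current version changed: read it
    else:
        chosen = headers[-2].version    # unchanged: use the previous version
    return chosen, data[chosen]


version, payload = read_latest(header_buffer, data_buffer)
```

Only one entry of `data_buffer` is touched per call, regardless of how many versions exist.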
This prior-art method, which uses a timestamp sequence for concurrency control and manages information such as the timestamp and version number in a dedicated storage area, cannot deadlock and effectively improves transaction processing efficiency. For serial long transactions, however, the amount of timestamp data generated is itself very large, so managing timestamps separately does not by itself solve the problem of long running time, and further improvement is needed.
Therefore, in view of the shortcomings of the prior art, there is a need to provide a more efficient concurrency control method applied to distributed serial long transactions.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide, in view of the above defects of the prior art, a concurrency control method applied to distributed serial long transaction processing. The method comprises the following steps:
Step one: create a maximum timestamp sequence $T_R = \langle t_1, t_2, t_3, \ldots, t_n \rangle$, where the element $t_i$ is the latest timestamp corresponding to thread $i$ during transaction execution; timestamps are assigned by the machine in the chronological order in which the instructions occur;
Step two: use a distributed hash algorithm (DHT) to partition the timestamps $t_i$ in the maximum timestamp sequence, so that the timestamp data is uniformly distributed across the nodes;
Step three: compare the timestamp corresponding to the current execution request with the maximum timestamp sequence $T_R$ distributed across the nodes, and update the data version according to the comparison result.
Further, in step two, partitioning the timestamps $t_i$ in the maximum timestamp sequence with the distributed hash algorithm (DHT) so that the timestamp data is uniformly distributed across the nodes specifically comprises: the distributed system has N machines, each with storage range M; a hash function determines the ID of each hash bucket, and the maximum timestamp data is uniformly distributed to the N machines according to the corresponding ID. The hash function used takes the remainder of $i$.
Further, in step three, comparing the timestamp corresponding to the current execution request with the maximum timestamp sequence $T_R$ distributed across multiple machines and updating the data version according to the comparison result specifically comprises: let the old data version be $Q_{old}$, the current data version be $Q_{current}$, and the timestamp of thread $i$ executing the current transaction request be $t_i(Q_{current})$. When $t_i(Q_{current}) \le t_i$ ($t_i \in T_R$), operation $i$ is rolled back and the data version remains $Q_{old}$; when $t_i(Q_{current}) > t_i$ ($t_i \in T_R$), operation $i$ continues to execute, the data version is updated to $Q_{current}$, and correspondingly $t_i$ in $T_R$ is set to $t_i(Q_{current})$.
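The comparison rule of step three can be summarized as a single case distinction, using only the symbols defined above (the arrow denotes assignment):

$$
\text{operation } i:\quad
\begin{cases}
\text{roll back, keep } Q_{old}, & \text{if } t_i(Q_{current}) \le t_i,\\[4pt]
\text{continue, adopt } Q_{current},\ \ t_i \leftarrow t_i(Q_{current}), & \text{if } t_i(Q_{current}) > t_i,
\end{cases}
\qquad t_i \in T_R .
$$

Since $t_i$ only ever increases under this rule, it remains the largest timestamp seen for thread $i$.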
Further, remote direct memory access (RDMA) technology is used to send and receive data and/or instructions between the nodes.
Further, the present invention also provides a processor for executing the method of any one of the above.
The concurrency control method applied to distributed serial long-transaction processing provided by the embodiment of the invention applies the idea of distributed hash to management of the timestamp, further divides a large amount of timestamp data generated by the long transaction, and improves the processing efficiency of the distributed serial long transaction.
Drawings
FIG. 1 is a flowchart of a concurrency control method for distributed serial long transaction processing according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of storing a timestamp and a data version pointed to by the timestamp according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
Example one
As shown in fig. 1, an embodiment of the present invention provides a concurrency control method applied to distributed serial long transaction processing, where the method includes the following steps:
Step one: create a maximum timestamp sequence $T_R = \langle t_1, t_2, t_3, \ldots, t_n \rangle$, where the element $t_i$ is the latest timestamp corresponding to thread $i$ during transaction execution. Timestamps are assigned by the machine in the chronological order in which the instructions occur, and for a distributed transaction every $t_i$ is unique.
Step two: use a distributed hash algorithm (DHT) to partition the timestamps $t_i$ in the maximum timestamp sequence so that the timestamp data is uniformly distributed across the nodes, realizing distributed storage of the timestamps. The specific method is as follows:
Assume the distributed system has N machines, each with storage range M. A hash function determines the ID of each hash bucket, and the maximum timestamp data is uniformly distributed to the N machines according to the corresponding IDs. The hash function used in this scheme takes the remainder of $i$.
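The partitioning in step two might look like the sketch below. The source says only that the hash function takes the remainder of $i$; taking it modulo the number of machines N is an assumption here, as are the names `bucket_id` and `partition` and the sample timestamp values.

```python
def bucket_id(i, n_machines):
    """Hash bucket ID for thread index i (assumed: remainder of i modulo N)."""
    return i % n_machines


def partition(max_timestamps, n_machines):
    """Split the maximum timestamp sequence T_R into one segment per machine."""
    segments = [dict() for _ in range(n_machines)]
    for i, t_i in max_timestamps.items():
        segments[bucket_id(i, n_machines)][i] = t_i
    return segments


# T_R for threads 1..8, with made-up timestamp values.
t_r = {i: 1000 + i for i in range(1, 9)}
segments = partition(t_r, 3)
sizes = [len(s) for s in segments]
```

With consecutive thread indices, segment sizes differ by at most one, which is the sense in which the timestamp data is spread uniformly.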
Step three: compare the timestamp corresponding to the current execution request with the maximum timestamp sequence $T_R$ distributed across the nodes. Let the old data version be $Q_{old}$, the current data version be $Q_{current}$, and the timestamp of thread $i$ executing the current transaction request be $t_i(Q_{current})$. When $t_i(Q_{current}) \le t_i$ ($t_i \in T_R$), operation $i$ is rolled back and the data version remains $Q_{old}$; when $t_i(Q_{current}) > t_i$ ($t_i \in T_R$), operation $i$ continues to execute, the data version is updated to $Q_{current}$, and correspondingly $t_i$ in $T_R$ is set to $t_i(Q_{current})$.
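The rule of step three is sketched below as a single function. The function name `apply_request`, the dictionaries holding $T_R$ and the data versions, and the string return values are illustrative assumptions.

```python
def apply_request(t_r, versions, i, t_request, q_current):
    """Apply the step-three rule for thread i.

    t_r       -- maps thread index i to its maximum timestamp t_i
    versions  -- maps thread index i to the data version currently in effect
    t_request -- t_i(Q_current), the timestamp of the current request
    """
    if t_request <= t_r[i]:
        # Stale request: roll back, the data version stays Q_old.
        return "rollback"
    # Newer request: keep executing, adopt Q_current and advance t_i.
    versions[i] = q_current
    t_r[i] = t_request
    return "continue"


t_r = {1: 50}
versions = {1: "Q_old"}
first = apply_request(t_r, versions, 1, 60, "Q_current")
second = apply_request(t_r, versions, 1, 60, "Q_newer")
```

The second call carries a timestamp equal to the stored maximum, so it falls under the "less than or equal" branch and is rolled back.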
In the first embodiment of the present invention, the timestamp and the data version it points to are stored separately, as shown in fig. 2. When the timestamp $t_i(Q_{current})$ of thread $i$ executing the current transaction request is read in step three, only the buffer storing the timestamps is accessed, and the data corresponding to the timestamp is not read.
In the first embodiment of the present invention, the maximum timestamp vector is stored in a distributed manner. In step three, after the timestamp $t_i(Q_{current})$ of thread $i$ executing the current transaction request is read, the ID corresponding to $t_i(Q_{current})$ is first calculated with the hash function and compared with the ID of each machine in the distributed system. Then only the $t_i$ stored on the machine with the matching ID are read and compared with it, rather than all $t_i$ of the entire long transaction, i.e., the whole $T_R$ sequence.
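The segment lookup described above can be sketched as follows. Nodes are modeled as in-process objects, the ID is assumed to be the thread index $i$ modulo the number of nodes, and the `reads` counter exists only to show which segments are touched; none of these details come from the source.

```python
class Node:
    """One machine holding a segment of the maximum timestamp sequence."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.segment = {}   # thread index i -> maximum timestamp t_i
        self.reads = 0      # how many times this segment was read

    def read_segment(self):
        self.reads += 1
        return self.segment


def check(nodes, i, t_request):
    """Compare t_i(Q_current) against the one segment whose ID matches."""
    node = nodes[i % len(nodes)]     # assumed hash: remainder of i
    segment = node.read_segment()    # other nodes are never contacted
    if t_request <= segment[i]:
        return False                 # roll back
    segment[i] = t_request           # continue and advance t_i
    return True


nodes = [Node(k) for k in range(3)]
for i in range(1, 7):
    nodes[i % 3].segment[i] = 100

accepted = check(nodes, 4, 150)
stale = check(nodes, 4, 120)
reads_per_node = [n.reads for n in nodes]
```

Both requests for thread 4 read only node 1, so the cost of a check depends on the size of one segment rather than on the length of the whole $T_R$ sequence.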
In the first embodiment of the present invention, data and instructions are sent and received between the nodes using Remote Direct Memory Access (RDMA) technology. RDMA allows one computer to transfer data over the network directly into the memory of another computer without involving either operating system. This eliminates external memory copies and context switches, freeing memory bus bandwidth and CPU cycles for the application, which reduces bandwidth demand and processor overhead, significantly lowers latency, and helps the algorithm further reduce processing time.
The technical scheme of the concurrency control method applied to the distributed serial long transactions provided by the embodiment of the invention is that the timestamp information and the data of each data version are respectively stored, and in the transaction execution process, only the timestamp storage area is read and compared with the maximum timestamp sequence, but the data of the version is not read. By doing so, reading of too much data that is useless for judging whether to execute or rollback operation is avoided, and efficiency is improved.
Meanwhile, the maximum timestamp sequence is stored in a distributed mode, when the timestamp of the current transaction request execution thread is compared with the maximum timestamp, the whole maximum timestamp sequence does not need to be read, only the maximum timestamp sequence segment with the same ID as the current timestamp is needed to be read, and the execution efficiency of the distributed transaction is further improved.
Compared with the prior art, the technical scheme of the concurrency control method applied to the distributed serial long transaction provided by the embodiment of the invention avoids the problems of long running time and low efficiency caused by excessive timestamps generated by the distributed serial long transaction. Meanwhile, the maximum timestamp sequence is uniformly distributed on each node by using the distributed hash, so that the load balance of the computer is realized.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (3)

1. A concurrency control method for distributed serial long transaction processing, the method comprising:
step one: creating a maximum timestamp sequence $T_R = \langle t_1, t_2, t_3, \ldots, t_n \rangle$, wherein the element $t_i$ represents the latest timestamp corresponding to thread $i$ during transaction execution, the timestamp being assigned by the machine in the chronological order in which the instructions occur; the timestamp and the data version pointed to by the timestamp are stored separately;
step two: partitioning the timestamps $t_i$ in the maximum timestamp sequence with a distributed hash algorithm so that the timestamp data is uniformly distributed across the nodes;
step three: comparing the timestamp corresponding to the current execution request with the maximum timestamp sequence $T_R$ distributed across the nodes, and updating the data version according to the comparison result;
in step two, partitioning the timestamps $t_i$ in the maximum timestamp sequence with the distributed hash algorithm so that the timestamp data is uniformly distributed across the nodes specifically comprises: the distributed system has N machines, each with storage range M; a hash function determines the ID of each hash bucket, and the maximum timestamp data is uniformly distributed to the N machines according to the corresponding ID, the hash function used taking the remainder of $i$;
in step three, comparing the timestamp corresponding to the current execution request with the maximum timestamp sequence $T_R$ distributed across a plurality of machines and updating the data version according to the comparison result specifically comprises: the old data version is $Q_{old}$, the current data version is $Q_{current}$, and the timestamp of thread $i$ executing the current transaction request is $t_i(Q_{current})$; when $t_i(Q_{current}) \le t_i$ ($t_i \in T_R$), operation $i$ is rolled back and the data version remains $Q_{old}$; when $t_i(Q_{current}) > t_i$ ($t_i \in T_R$), operation $i$ continues to execute, the data version is updated to $Q_{current}$, and correspondingly $t_i$ in $T_R$ is set to $t_i(Q_{current})$; specifically, after the timestamp $t_i(Q_{current})$ of thread $i$ executing the current transaction request is read, the ID corresponding to $t_i(Q_{current})$ is first calculated with the hash function and compared with the ID of each machine in the distributed system, and then all $t_i$ stored on the machine with the matching ID are read and compared with it.
2. The method of claim 1, wherein: data and/or instructions are sent and received between nodes using remote direct memory access (RDMA) technology.
3. A processor, characterized in that: for performing the method of any one of claims 1-2.
CN201711085717.0A 2017-11-07 2017-11-07 Concurrency control method applied to distributed serial long transactions Active CN107832121B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711085717.0A CN107832121B (en) 2017-11-07 2017-11-07 Concurrency control method applied to distributed serial long transactions

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711085717.0A CN107832121B (en) 2017-11-07 2017-11-07 Concurrency control method applied to distributed serial long transactions

Publications (2)

Publication Number Publication Date
CN107832121A (en) 2018-03-23
CN107832121B (en) 2020-11-03

Family

ID=61654723

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711085717.0A Active CN107832121B (en) 2017-11-07 2017-11-07 Concurrency control method applied to distributed serial long transactions

Country Status (1)

Country Link
CN (1) CN107832121B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109388646B (en) * 2018-10-16 2021-03-02 新华三大数据技术有限公司 Data processing method and device
CN110018884B (en) * 2019-03-19 2023-06-06 创新先进技术有限公司 Distributed transaction processing method, coordination device, database and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103995691A (en) * 2014-05-21 2014-08-20 中国人民解放军国防科学技术大学 Service state consistency maintenance method based on transactions
CN104765840A (en) * 2015-04-16 2015-07-08 成都睿峰科技有限公司 Big data distributed storage method and device
CN105959235A (en) * 2016-07-21 2016-09-21 中国工商银行股份有限公司 Distributed data processing system and method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9665625B2 (en) * 2014-06-25 2017-05-30 International Business Machines Corporation Maximizing the information content of system logs

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
一种改进的基于时间戳的分布式事务并发控制策略;李清霞;《东莞理工学院学报》;20130630;第20卷(第3期);第58-59页第1.1、1.2节 *

Also Published As

Publication number Publication date
CN107832121A (en) 2018-03-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant